User Experience Insights
Unleashing the power of experience data by bringing Machine Learning and Text Mining to our Future of Work Continuous Listening Landscape
Survey programs such as SAP’s #Unfiltered are a fantastic way to better understand employee views and create insights for the business from thousands of feedback responses. For example, in April this year more than 80,000 of SAP’s employees provided feedback on a variety of topics, ranging from engagement to health and well-being to learning and development opportunities. Traditionally, a lot of survey data is collected on items phrased as questions or statements that are then rated on so-called Likert scales. The great advantage of Likert items is that they are easy to analyze: the responses are captured on a numerical scale, typically 1-5, and with those numerical indices we can do a lot, ranging from simple descriptive statistics (e.g., averages or percentage favorable) to advanced analytics (e.g., regression models or forecasts). But sometimes, this way of approaching experience data isn’t enough!
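To make the simple end of that analytic range concrete, here is a minimal sketch of the two descriptive statistics mentioned above, the average rating and the percentage favorable (conventionally the share of 4s and 5s on a 1-5 scale). The ratings shown are hypothetical, not actual survey data.

```python
def likert_descriptives(responses):
    """Return the mean rating and the percentage favorable (4s and 5s)."""
    mean = sum(responses) / len(responses)
    pct_favorable = 100 * sum(1 for r in responses if r >= 4) / len(responses)
    return mean, pct_favorable

# Hypothetical ratings for one Likert item on a 1-5 scale.
ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]
mean, fav = likert_descriptives(ratings)
print(f"Mean: {mean:.2f}, favorable: {fav:.0f}%")  # Mean: 3.90, favorable: 70%
```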
The need for discovery
Relying solely on Likert-type data gathering can lead to a very narrow and skewed view of employee opinions, ideas, and sentiments, because it assumes we have perfect top-down knowledge of the range of topics worth inquiring about, so that all that is left to do is convert agreement on each topic into a number between 1 and 5. But we live in fast and tumultuous times, and sometimes the “how much do you agree” approach of Likert items isn’t enough to understand the topics occupying the minds of the business; sometimes we need to go into discovery mode instead! We can do this the same way we would in a conversation: we ask an open-ended question. This approach, typically called qualitative (in contrast to our quantitative Likert items), can help us discover a lot more about the views of employees, but it brings its own drawbacks. For example, in the April #Unfiltered we received almost 60,000 comments in response to the two open questions we asked, making it impossible for our small team to manually extract insights from each comment.
Introducing text mining to our continuous listening strategy – a little case study
So, this is where text mining comes into the picture. The world of data science has increasingly embraced the fact that not all data come in nice rectangular frames of numbers but are often quite verbose. Fortunately, the field of natural language processing, which sits at the interface of linguistics, computer science, and machine learning, has opened several avenues to process such data in a manageable fashion. Here we’ll take a quick look at one such analysis we did.
Asking our employees about their hybrid work needs
With the recent SARS-CoV-2 pandemic the world saw an unprecedented move to remote work. SAP took a particularly proactive approach to this challenge and implemented a holistic framework to enable hybrid work outside pandemic times as well: the Pledge to Flex program. In summer this year, a time when most countries had abandoned social distancing measures and regulatory prescriptions, we ran a Future of Work Pulse to ask our employees how hybrid work is going. An obvious concern in this situation is the question of resources. Do employees have everything they need to work in the hybrid world? Well, we asked that with the Likert-rated item “I have all of the resources that I need to successfully transition into a hybrid working set-up.” But what do we do about people who say they don’t have the necessary resources? In normal conversation you would simply ask: “What resources are you currently lacking?” And that’s exactly what we did.
How text mining helped us discover the hot topics
When asking that question, we got hundreds of responses from employees across the globe. We used a machine learning method called Latent Dirichlet Allocation (LDA) for the open comment analysis. LDA is a generative statistical model commonly employed in topic modelling and natural language processing, which aims to discover topics within a collection of documents. It treats each document (in this case, each open comment submitted in response to our question of what resources are lacking) as a mixture of topics, and each topic as a mixture of individual words. Documents may therefore overlap each other in terms of content, mirroring typical use of natural language. Our analysis suggested that the answers fall into three broad categories:
- Guidance on Pledge to Flex implementation
- Equipment and workspaces
- Internal travel and bringing teams together
Making sense of the discoveries
Obviously, it still takes a good old human brain to make sense of the initial output and then gain real depth in understanding. The model provides the user with the most common terms associated with each topic (though the same term may pop up in different topics, e.g., the term “hybrid”). But one particularly cool feature of LDA is that it estimates the probability that a particular comment was “generated” by a particular topic. This means it is possible to sort the comments by their probability of belonging to a topic, look at the top comments for each topic, and thereby make more sense of what the LDA believes is happening; basically, an easy way to get the comments most closely related to a topic. Alternatively, we can look at which topics appear most frequently, or see how positive or negative the emotional tone of the comments within each topic is.
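That sorting step is straightforward once the document-topic probabilities are in hand. The sketch below uses a small hypothetical document-topic matrix of the kind LDA estimates (row i holds comment i’s probability of belonging to each topic) rather than a fitted model.

```python
# Hypothetical comments and their LDA-style document-topic probabilities.
comments = ["comment A", "comment B", "comment C", "comment D"]
doc_topic = [
    [0.80, 0.10, 0.10],
    [0.20, 0.70, 0.10],
    [0.05, 0.15, 0.80],
    [0.60, 0.30, 0.10],
]

def top_comments(topic, n=2):
    """Return the n comments most strongly associated with a topic."""
    ranked = sorted(range(len(comments)),
                    key=lambda i: doc_topic[i][topic], reverse=True)
    return [comments[i] for i in ranked[:n]]

print(top_comments(topic=0))  # → ['comment A', 'comment D']
```

The same matrix also supports the other views mentioned above: counting how often each topic is the most probable one gives topic frequencies, and joining in a sentiment score per comment gives the emotional tone per topic.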
Bringing it together
Text mining and topic modelling methods can be a great way to make sense of open survey comments more quickly. They don’t take away all the work, and ultimately the comments need to be read carefully to really understand what’s going on, but text mining provides a great entry point into understanding employee needs and arriving at a bird’s-eye view more quickly and more objectively than assigning topic categories entirely by hand. Most importantly, it allows for more open questions in surveys, knowing that even if comments number in the thousands, we can do them justice. We can discover what drives employees, what worries they have, and what actions we can take based on those insights to best support them. We can ask them the same way you would ask a colleague, but in the Future of Work we can listen to the responses of thousands of colleagues at the same time.
Thanks a lot for sharing your insights, David - great to read! I totally love the "discovery mode" - there is so much about the Future (of Work) that we do not know about yet. Do you think the qualitative approach in combination with the LDA analysis method as you described can be helpful to detect "internal trends", e.g. what our employees expect from SAP as an employer of choice now and in the future?
Thank you for the positive feedback and insightful question! Funny enough, the current #Unfiltered run contained the item "What do you think sets our company apart from others, and how can we build on these strengths?", which kinda taps into the latent thing your question also taps into, but maybe in a slightly more global way. We closed the survey yesterday morning and I ended up starting the LDA yesterday afternoon, pretty excited to dive into this a bit further over the coming days. Trends over time are an interesting topic, with the obvious idea to just ask the same question again and again and see how responses change. But sometimes the questions also change, especially at the crazy pace technology is accelerating in the tech space (like, who knows how long until LDA retires and gets largely replaced with some LLM-ish thing). So many options 😀