Personal Insights
Martin Kauschinger

Scientific study published about the SAP Community: Leveraging machine learning to mine software requirements of third-party developers

Dear SAP Community,

 

Over the past few years, my colleagues and I at the Technical University of Munich have extensively studied how large software companies such as SAP can generate information benefits from their online communities.

Relying on data from the SAP Community, we recently published a paper in which we show how machine learning models can be used to automatically detect feature requests of third-party developers. Based on a manually labeled data set of 1,500 questions, our classifier reached a high accuracy of 82%. This means that the classifier assigned the correct label (feature request or not) to 82% of the questions it was evaluated on. To us, this is a big step towards data-driven requirements engineering.
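
As a reminder of how an accuracy figure like this is computed for a binary classifier, here is a minimal Python sketch. The counts are hypothetical and chosen only so that accuracy comes out at 82%; they are not taken from the paper, and the names `ConfusionCounts`, `accuracy` and `recall` are illustrative. Accuracy counts correct decisions on both classes, whereas recall looks only at the actual feature requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary 'feature request' classifier."""
    true_positives: int   # feature requests predicted as feature requests
    false_positives: int  # other questions predicted as feature requests
    true_negatives: int   # other questions predicted as other questions
    false_negatives: int  # feature requests predicted as other questions

    @property
    def total(self) -> int:
        return (self.true_positives + self.false_positives
                + self.true_negatives + self.false_negatives)


def accuracy(c: ConfusionCounts) -> float:
    """Share of ALL questions that received the correct label."""
    return (c.true_positives + c.true_negatives) / c.total


def recall(c: ConfusionCounts) -> float:
    """Share of actual feature requests that the classifier found."""
    return c.true_positives / (c.true_positives + c.false_negatives)


# Hypothetical counts for 100 test questions (NOT from the paper).
example = ConfusionCounts(true_positives=30, false_positives=10,
                          true_negatives=52, false_negatives=8)

print(f"accuracy = {accuracy(example):.2f}")  # accuracy = 0.82
print(f"recall   = {recall(example):.2f}")    # recall   = 0.79
```

With these made-up numbers the two measures differ, which is why it helps to state which one a reported percentage refers to.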

With this post, I would like to thank all contributors of the SAP Community. Without you, this piece of research would not exist.

The full article is available for free at ResearchGate: https://www.researchgate.net/publication/364165571_Detecting_Feature_Requests_of_Third-Party_Developers_through_Machine_Learning_A_Case_Study_of_the_SAP_Community

If you have any questions about our study, feel free to reach out to me. I am happy to discuss.

 

Best regards

Martin


      3 Comments
      Peter Baumann

      Hi Martin Kauschinger!

      Thank you for your contribution bringing insights about SAP Community to SAP Community! I'm really surprised that, out of 2.6 million questions, only 1,500 were left for training.

      I have some questions about your study:

      • SAP product names change from time to time, and sometimes they even come back, as we have seen with SAP Build. Acronyms are also not always used in the same way. Did you have ways to handle that?
      • If I understand correctly, you basically classified whether or not a question is a possible feature request. This means you can assign that label to new questions, but what kind of feature it is would still need to be extracted afterwards. Right?
      • Do you see practical applications for SAP and the SAP Community? For example, recommending an Influencer portal entry for a new question that has not been resolved?

      Thanks,

      Peter

      Martin Kauschinger
      Blog Post Author

      Hi Peter Baumann,

      Thank you for your interest. Here are my thoughts on your questions:

      • Yes, we only used 1,500 questions for the labeling. The reason is that we had to label the questions manually, as we adopted a supervised machine learning approach. Moreover, we had to get three labels per question to ensure scientific rigor and to obtain a consensus label. Of course, one could increase the number of labeled questions, and I am sure that the results would be similar. We see our study as a proof of concept, i.e., is the automatic detection of feature requests possible?
      • We addressed varying terminology for SAP products by limiting ourselves to questions from a particular time period, in our case 2016 to 2020. Moreover, our labelers were all SAP experts, and we thus assume that they were aware of varying terminology.
      • Yes, you are correct: our classifier assesses whether or not a question contains a feature request and makes a binary decision. In so doing, we are able to predict feature requests in newly posted questions. But, as you said, the classifier does not predict what kind of feature it is. This could be sorted out by a requirements engineer or by future research.
      • To be honest, I see great potential for using this classifier in SAP's product development and requirements engineering. With the help of the classifier and some qualitative analysis by a requirements engineer, it is relatively easy to assess what kind of features the developer, and ultimately the end user or customer, wants. Coming back to your question: your suggested use case ("recommending an entry in the customer influence portal") is definitely supported by our classifier. I am also happy to discuss the application of the classifier further.
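
The step of turning three labels per question into one consensus label can be sketched as follows. This is a minimal illustration that assumes a simple majority vote over binary labels; the post does not say which consensus procedure was actually used, and the function name `consensus_label` and the sample data are hypothetical.

```python
from typing import Sequence


def consensus_label(labels: Sequence[bool]) -> bool:
    """Majority vote over an odd number of binary labels.

    True means 'contains a feature request'. Majority voting is an
    assumption here, not necessarily the procedure used in the study.
    """
    if len(labels) == 0 or len(labels) % 2 == 0:
        raise ValueError("need an odd, non-zero number of labels")
    votes_for = sum(1 for label in labels if label)
    return votes_for > len(labels) // 2


# Hypothetical labels from three labelers per question (illustrative only).
ratings = {
    "question_a": [True, True, False],
    "question_b": [False, False, False],
    "question_c": [False, True, False],
}

consensus = {qid: consensus_label(votes) for qid, votes in ratings.items()}
print(consensus)
```

Using an odd number of labelers guarantees that a binary majority vote never ends in a tie.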

      Please let me know if you have further questions. I am happy to help.

      Best

      Martin

      Peter Baumann

      Thank you for your detailed answer! No further questions. It is very interesting to hear about all of this, as it is a topic on which all active SAP Community users are domain experts of a sort.

       

      Best regards,

      Peter