How We Added Computer Vision to AI Chatbots
This post was originally published on the SAP Conversational AI blog
This is a story of opportunity. This is one of those jaw-dropping moments when you look at the person in front of you and say, “Oh my, now that’s an idea!”
It was while working for one of our clients that we faced a challenge we hadn’t anticipated: taps are complicated.
The Plumberbot Issue
The client is a company specializing in home improvement. Their bot had to be very specific about many different objects, such as radiators, pipes, air conditioners, or, well, taps. Because of the highly precise nature of the client’s service and our general lack of plumbing experience, the entire development team was confused when building the conversation flow.
“Wait, so, if it’s a thermostatic mixer tap, we’re not going to have the same conversation path as a simple water tap, right?”
“What makes a tap thermostatic? Aren’t they all?”
After several mistakes and a general sense of confusion, the team called Google Images to the rescue. Different pictures made everything clear, but the team kept wondering.
“People using the chatbot are going to be just like us. How can we ask them this question when we don’t even know ourselves?”
“Well, why don’t we give them the option of sending the bot a picture to determine what kind of tap they have?”
The Computer Vision Epiphany
At that time, SAP Conversational AI had never touched image recognition or computer vision. And while it was an excellent idea, it was too ambitious for our deadline. We simply did not have the time, resources or capabilities to implement this at such short notice.
Or did we?
A few of us didn’t sleep too well that night. It was a relief in the morning when Paul, our lead R&D engineer, came in and announced that detecting entities in an image was actually possible: people had done it before, and it could even be integrated into SAP Conversational AI!
So we did. The biggest part of the job was to list and compare all the available computer vision APIs: what they detect, how they handle large datasets, and what they send back. The process was pretty straightforward and we quickly came up with a favorite API. One week later, it was integrated into the platform and we were able to show a first demo to our client, who was thrilled.
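For illustration only, here is a minimal Python sketch of the idea, not the actual integration. It assumes the vision API sends back a list of labels with confidence scores; the label names, the path names, the 0.6 threshold, and the `pick_conversation_path` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """One label returned by a (hypothetical) vision API for a picture."""
    name: str
    score: float  # confidence between 0 and 1


# Hypothetical mapping from a detected object to a conversation path.
PATHS = {
    "thermostatic mixer tap": "thermostatic_tap_flow",
    "water tap": "simple_tap_flow",
    "radiator": "radiator_flow",
}

# Used when nothing recognizable is detected with enough confidence.
FALLBACK_PATH = "ask_user_to_describe"


def pick_conversation_path(labels, min_score=0.6):
    """Return the path for the most confident known label, else the fallback."""
    known = [
        label for label in labels
        if label.name.lower() in PATHS and label.score >= min_score
    ]
    if not known:
        return FALLBACK_PATH
    best = max(known, key=lambda label: label.score)
    return PATHS[best.name.lower()]


if __name__ == "__main__":
    detected = [Label("Thermostatic mixer tap", 0.91), Label("water tap", 0.72)]
    print(pick_conversation_path(detected))
```

The fallback matters as much as the happy path: if the picture is blurry or shows something the bot does not know, the conversation can simply go back to asking the user questions.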
With a team of curious and crafty developers, the tendency around here is to understand and develop our own technology from scratch. While that remains the core of our vision, providing efficient and cutting-edge chatbots is also part of our business strategy, and not to be overlooked.
Conversation is, after all, the new interface!
A New Mindset
It’s that mindset that helps our team overcome challenges. Something new? Something odd? Something that seems really, really hard to do? Discuss, challenge, discover, rebuild, analyze, and before you know it, it might actually be getting done. It’s so refreshing to be working with people who don’t take “that’s too hard” for an answer.
We all sat together the week after, reflecting.
“Well, that was fast.”
“How can we top that next week?”
To be continued…