Bernard Rummel

5 Challenges to Your Machine Learning Project

Machine learning (ML) projects are often technology-driven. This is perfectly OK when new technologies open new horizons. However, when it comes to actual product development, it’s time to think about people. There will always be someone interacting with the new technology. If you don’t want to get surprised by user needs discovered too late, it’s a good idea to think ahead about what those needs might be – and to research and validate them with real people!

This post presents five challenges to address in your project, along with an example of how to use these challenges to improve your product concept. Each challenge is designed to generate questions and ideas for product solutions and further research. Got you interested? If so, let’s go – and enjoy the ride!

A Simple Idea

Meet Carl. Carl is a customer of ACME Utilities, a popular energy and water provider in his area. ACME is an innovative company that is currently heavily investing in smart meters and Internet of Things (IoT) services to create a modern and effortless service experience for customers. Carl, however, is skeptical about this new IoT stuff. Equipping his old home with new smart meters is out of the question.

Well, here comes ACME with an idea Carl actually finds rather appealing! They say you can simply take a picture of your gas, water, or electricity meter with your smartphone, and email it to the company – they’ll take care of the rest.

ACME says they use an image recognition algorithm to figure out the meter type and ID number, and collect the reading right from the picture. The data is then automatically assigned to your bill. Sounds cool, doesn’t it? Carl doesn’t have to bother with installing any new equipment, and he doesn’t need to worry that his neighbor’s teenage son might hijack his old heater for bitcoin mining. All he needs to do is take a picture and send it to an email address! That’s as easy as sharing pictures with his grandkids!

A simple, cool scenario, isn’t it? Let’s challenge it with some questions and see what we can learn.

Challenge 1: Who is in charge?

Machine learning re-distributes work in innovative ways, making life easier for humans. However, when workload is taken away from people, this doesn’t mean they are off the hook of responsibility and accountability. Suppose Carl, after sending in a picture, receives a bill that doesn’t match what he paid in previous years. Who is responsible for making sure the meter ID is correct and the numbers are adding up all right? If they aren’t, what is supposed to happen? Who, eventually, will have to pay and how much?

In Carl’s case, he certainly is responsible for sending in the right picture – neither an old one, nor a photoshopped one. Correctly reading the numbers then would be ACME’s responsibility. But wait – what if the picture Carl sent is difficult to read? What if a misreading is due to a smudge on the lens of Carl’s smartphone? Validating a reading is almost trivial; you just need someone to look at the meter and check for signs of tampering. Nevertheless, the process must be defined, and for a good customer experience, it’d better be designed as well.

In more complex scenarios, procedural and legal questions can pile up quickly to mind-boggling complexity, in particular if multiple processes or even organizations are involved. Think of autonomous driving, for instance: the scenario is easily described, but responsibility questions and legal issues are far from resolved. Resolving them involves considering not only the normal operation scenario, but also cases of failure, disuse, and even misuse. Someone, in many cases the system designer, will be held accountable if a system fails or is employed for unwanted purposes. It is rather likely that unexpected security and privacy issues will pop up here once you start seriously investigating.

Challenge 2: Risks & Stakes

It is unlikely that ACME would go broke over a dispute about Carl’s power bill, but once courts and lawyers get involved, it might become rather painful for Carl. The risk and costs of misreading a meter depend a lot on the overall process design, in particular the handling of failures and disputes. The higher the risks and stakes are, the more you need to deeply understand the respective reliability of human vs. machine decisions, and balance this against the efficiency of machine-based reasoning. The most reliable process would probably be to have an ACME representative take the meter reading and have Carl counter-sign it, but this would be inefficient. Fully automated image processing will probably only be reliable enough if the process is supervised by human operators, and backed with a well-designed dispute handling process.

Mind that risks and stakes can strongly depend on context factors. Consider face recognition: greeting someone with the wrong name is merely embarrassing; misidentifying a potential terrorist can be disastrous. Note that the situation can change instantly, and dramatically, when systems get dis- or misused, so those risks need to be considered as well. In our example, a prankster might spoof Carl’s email address and send in a photoshopped image (a variant of the pizza ordering prank). Not very likely, probably not very costly, but obviously, some security precautions will be necessary.

Challenge 3: Are those in charge also in control?

How can you make sure that those who will be held responsible and accountable can actually fulfill their obligations? What decisions do they have to make? What will be the time frame to make those decisions? What information can you provide to support this? Will this information need to be explained? What can they do to correct a wrong decision?

The answers to those questions tie directly into user interface and interaction design. In our little scenario, Carl is responsible for sending truthful information to ACME, but can he check on the outcome of the image processing algorithm? He can’t, unless the system lets him double-check.

Since Carl is certainly not in charge of ACME’s image processing algorithm, some ACME employee will need to monitor the process and step in if necessary. Say hello to Ben (also featured in a different post on working with intelligent systems)! Ben is working at ACME’s headquarters. His job is to verify suspicious meter readings. Every day, several thousand meter readings come in. Ben surely doesn’t want to check on all the readings, so how can he tell which ones are worth looking at? What would be most informative – an image recognition reliability score? Or rather the image itself? Or something completely different: the amount of the bill, a critical difference from Carl’s payment history? Once Ben has decided to look into a specific case, how would he proceed to analyze it? And what’s next?
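
One way to explore these questions is to prototype a simple triage rule and see which readings it would put in front of Ben. The following is a minimal sketch, not ACME's actual process: the Reading fields, the needs_review function, and both thresholds are assumptions made up for illustration. It flags a reading if the recognition score is low or if the value strays far from the customer's typical consumption.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Reading:
    customer: str
    value: float       # consumption derived from the recognized meter reading
    confidence: float  # hypothetical recognition score between 0 and 1


def needs_review(reading, history, min_confidence=0.9, max_deviation=0.5):
    """Return the reasons (possibly none) why a human should look at a reading.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    reasons = []
    if reading.confidence < min_confidence:
        reasons.append("low recognition confidence")
    if history:
        typical = median(history)
        # Relative deviation from the customer's typical consumption.
        if typical > 0 and abs(reading.value - typical) / typical > max_deviation:
            reasons.append("deviates from consumption history")
    return reasons


carl = Reading("Carl", value=5200.0, confidence=0.97)
print(needs_review(carl, history=[3100.0, 2950.0, 3300.0]))
```

Returning the reasons rather than a plain yes/no is a deliberate choice in this sketch: it gives Ben a first hint about why a case landed on his desk.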

Challenge 4: What level of automation is appropriate?

Initially, our scenario idea was that Carl simply sends a picture, and a computer at ACME does the rest. Let’s reconsider. We already found that Carl does not have an obligation, but certainly an interest, in checking on whether the image he is about to send can be correctly processed by ACME’s algorithm. Let’s look more closely at the flow of events. Carl walks down into the basement of his house and takes a picture of the meter. His internet connection down there isn’t that good. To check whether the picture can be processed, Carl would have to walk up to his living room where connectivity is better, and in case the image doesn’t work out as expected, walk down again to take a better picture. Sounds cumbersome, doesn’t it?

Well, if the image processing algorithm was on Carl’s smartphone, he might check results right in his basement, and simply try again. The system might show him what it recognized and ask whether this is correct. Even if the recognition quality was not perfect, taking a second picture would certainly beat having to walk up the stairs.

Thinking of it, why don’t we simply give Carl a UI to punch in the numbers himself? He certainly is able to do that, and this approach would give him a maximum of control. Hmmmm… he wouldn’t even need a phone for that, would he? A stamped postcard and a pen would actually do…

What are we doing here? Let’s take a step back. We’re looking here at three different levels of automation. Initially, we envisioned the system to do everything. It would be cool if Carl could just send a picture and forget about it, but can we pull that off? In order to guarantee a satisfactory experience, the system would need to be pretty accurate and reliable. Next, we thought of a suggest-approval pattern: the phone-based algorithm shows a result to Carl, and he approves or rejects it. Last but not least, we considered fully manual operation. Probably, the optimal solution will be somewhere along this spectrum, but where exactly?
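
To make the three levels tangible, here is a minimal sketch of how they might differ in code. Everything in it is hypothetical: the AutomationLevel names, the capture_reading function, and the three callbacks that stand in for the recognition algorithm and the user interface. In this sketch, a rejected suggestion falls back to manual entry; letting Carl retake the picture would be an equally valid choice.

```python
from enum import Enum


class AutomationLevel(Enum):
    FULL = "full"                # system decides, Carl is not asked
    SUGGEST_APPROVE = "suggest"  # system proposes, Carl approves or rejects
    MANUAL = "manual"            # Carl types the number himself


def capture_reading(level, recognize, ask_user_to_confirm, ask_user_to_type):
    """Return a meter reading obtained at the given automation level.

    recognize, ask_user_to_confirm and ask_user_to_type are hypothetical
    callbacks standing in for the recognition algorithm and the UI.
    """
    if level is AutomationLevel.MANUAL:
        return ask_user_to_type()
    suggestion = recognize()
    if level is AutomationLevel.FULL:
        return suggestion
    # Suggest-approval: the user has the last word on the suggestion.
    if ask_user_to_confirm(suggestion):
        return suggestion
    return ask_user_to_type()


# Carl rejects a misread digit and corrects it by hand.
reading = capture_reading(
    AutomationLevel.SUGGEST_APPROVE,
    recognize=lambda: "01234",
    ask_user_to_confirm=lambda suggestion: False,
    ask_user_to_type=lambda: "01284",
)
print(reading)
```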

With the increasing capability of AI and ML systems, it is easy to overlook that full automation is not always the most sensible goal. Aiming at lower levels of automation doesn’t mean a lack of technological ambition. Instead, this strategy may give you earlier and easier access to markets and profitable business cases, while mitigating or even eliminating complex responsibility issues. Also, implementing partially automated systems in practice will give you first-hand experience in side effects, risks, and yet unobserved opportunities.

Consider autonomous driving, a scenario various players are expecting to become feasible within the next decade. While this may very well be, numerous technical, legal, and business issues still need to be resolved. On the other hand, there are already profitable products available that operate at lower levels of automation. Navigation systems are ubiquitous these days: they don’t drive, but merely suggest where to go. On a higher automation level, you have systems like cruise control or brake assistants. Such systems can range from simply switching an assistant on (rain sensors controlling the windshield wiper) to close system-user cooperation (in electronic stability systems, system and driver cooperate in a closed loop to coordinate driver intent and execution by the system). All those systems have brought up important real-world operational experience that initially no one had thought of: the importance of circumnavigating undocumented roadblocks, reliability of sensors in case of snow, user acceptance issues in acceleration control, etc. For autonomous driving, those experiences are critical.

Challenge 5: Designing Take-Over

In our little scenario, we have already identified two situations when having a human in the process would have distinct advantages:

  • In Carl’s basement, given the lighting conditions, Carl’s eyes might be more reliable than an image processing algorithm on his phone.
  • At ACME’s headquarters, the decision needs to be made which meter readings are so suspicious that they should be forwarded to a human processor.

Both are troubleshooting cases, when human operators need to take over an otherwise automated function because it is no longer working reliably. Now troubleshooting cases don’t have to become emergencies. Often, situations when a system reaches a boundary of its operational space can be anticipated, so there is a good chance to design the take-over process. To do so, you’ll need to:

  1. identify situations requiring take-over,
  2. identify which system functions would need to be taken over,
  3. determine information needs of the human(s) taking over,
  4. determine proper information channels and protocols to initiate and…
  5. …perform the take-over.

Let’s look at the headquarters scenario. The situation is clear: it is the job of Ben, our operator persona, to check on suspicious meter readings. Yet we need to consider how to identify cases that need attention, and how to distribute the workload. Low-quality meter pictures might be in one worklist; in another one Ben might have cases of excessive consumption, in yet another one suspected fraud (indications of photoshopped pictures or tampered-with meters).
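
The worklist idea can be sketched in a few lines. This is only an illustration under stated assumptions: the route_to_worklists function, the flag names on each case, and the severity order are all invented here, and how a case gets its flags in the first place is left open. A case that raises several flags is placed in the most severe worklist only.

```python
from collections import defaultdict


def route_to_worklists(cases):
    """Group flagged cases into worklists, one per kind of problem.

    Each case is a dict with hypothetical boolean flags; unflagged cases
    are not routed anywhere.
    """
    # Ordered from most to least severe; the order is an assumption.
    rules = [
        ("suspected_fraud", "suspected fraud"),
        ("excessive_consumption", "excessive consumption"),
        ("low_quality_picture", "low-quality picture"),
    ]
    worklists = defaultdict(list)
    for case in cases:
        for flag, worklist in rules:
            if case.get(flag):
                worklists[worklist].append(case["id"])
                break  # first matching rule wins
    return dict(worklists)


cases = [
    {"id": 1, "low_quality_picture": True},
    {"id": 2, "excessive_consumption": True, "low_quality_picture": True},
    {"id": 3, "suspected_fraud": True},
    {"id": 4},  # nothing suspicious, no human attention needed
]
print(route_to_worklists(cases))
```

Separate worklists also make it easier to hand some of them to a different person or team later on.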

One obvious function to take over is to check pictures, to let Ben read the numbers if he can. If the meter ID is unreadable, he might simply retrieve it from Carl’s master data, because it’s rather likely the hardware didn’t change – oops, we found an information need! If Ben has Carl’s meter ID right next to the picture, this will make his life easier!

The channels and protocols question is also interesting: low-quality pictures, excessive consumption, and suspected fraud pose problems on different time scales, and require different levels of expertise. Is Ben the right person to handle them all? Which cases will he need to delegate, and how?

Already, several UI ideas take shape: Ben will need a piece of UI to enter the correct numbers for Carl’s bill. To check on excessive consumption, he’ll need Carl’s consumption history. In case of suspected fraud, Ben also needs access to the appropriate communication channels to act on his findings, e.g. to forward relevant information to an in-house lawyer.


Let’s wrap up and see what we learned. Thinking about responsibilities, we found that handling misread images is a critical part of the process. Considering this, together with risks and stakes, we discovered an important process role (Ben) that was initially not in the scenario. When thinking about Ben’s and Carl’s information needs, we discovered a range of design alternatives, some at lower automation levels. Eventually, when diving deeper into the various situations when Ben would need to get active, we identified important information needs, some of which are even outside the field of image recognition.

The overall purpose of this exercise is to open up the design space, generate alternative ideas, and research questions whose answers may let you make informed decisions between alternatives. It would be interesting to apply it to your project idea, wouldn’t it? Let us hear how it worked out, and how you enjoyed the ride!
