Today I would like to talk about a common dilemma facing many law enforcement agencies around the world, a dilemma rooted in the accelerated digitization that has taken place.
On the bright side, the digitization of basically every aspect of life provides a vast amount of information. Nowadays there is hardly any investigation without a sizable volume of digital data that needs to be processed and analyzed to gather evidence. Just think about cyber crimes or financial crimes.
On the flip side, digitization is drowning our cops in data, and many agencies have realized that they need to act. Take, for example, a forensic scientist who is tasked with the collection, preservation, and analysis of digital evidence during the course of an investigation.
While there are numerous tools available for forensic scientists – to extract data from smartphones, e-mail, social media, notebooks, PCs, etc. – they face a set of key challenges these days:
- Too much data in too many formats: The volume of collected data is constantly increasing and becoming a major challenge. It places a strain on limited resources such as forensic scientists, and a lack of technical support hinders investigations.
- Too many & too complicated tools to work with: Extracting and processing data from confiscated notebooks, digitized files, smartphones, the internet, etc. requires a vast number of forensic tools, each specialized for its respective data source. For example, it is not unlikely that a forensic scientist will have 140 different forensic tools, many of which require expert knowledge to run.
- Too hard to understand the results: Many of the existing tools extract data from their specific data sources very well, but the results are not well presented (e.g. in tabular form or on ‘old’ UIs with the look and feel of Windows 3.1). Because the tools are designed for forensic experts, the results are very hard for an investigator or prosecutor to understand. The experts have received dedicated multi-level training to interpret them, but even they still miss many hints. And sadly, options for visualization are limited. For example, you might be able to visualize data on a map, but only data about a single entity.
- Too many data silos: There is not much that really helps to correlate data from one tool with data from other tools.
- Too time consuming: Forensic processing currently takes too long. It may well be that you can ingest only 1 TB of data per week into your tool, while a typical case can easily involve 10 TB, so ingestion alone would take around ten weeks. As a result, many confiscated items never undergo forensic analysis at all.
How to address these challenges?
Growing cyber crime, coupled with rising safety concerns, has increased the need for digital forensics solutions. Digital forensics technology plays an integral role in preventing internet-related crimes and misuse of company data. SAP, with its unrivaled Big Data and Analytics technologies, is in a unique position to offer digital forensics software.
SAP HANA Vora / Apache Spark can help to process, correlate, and aggregate the data extracted by the wide range of forensic tools described above, and to visualize the results on a common UI without media disruption. This would make it much easier for investigators and the forensic team to interpret the data.
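To make the idea of correlating and aggregating data across tools a bit more concrete, here is a minimal sketch in plain Python. It is not SAP HANA Vora or Apache Spark code; it only illustrates, on a toy scale, the kind of join-and-count step such a platform would run over far larger volumes. The tool names, field names, and records are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical exports from two different forensic tools.
# All field names and values are made up for this example.
phone_extract = [
    {"source": "phone_tool", "number": "+49 170 1234567", "event": "call"},
    {"source": "phone_tool", "number": "+49 170 1234567", "event": "sms"},
    {"source": "phone_tool", "number": "+49 151 7654321", "event": "call"},
]
mail_extract = [
    {"source": "mail_tool", "number": "+491701234567", "event": "mail_signature"},
    {"source": "mail_tool", "number": "+49 30 5550000", "event": "mail_signature"},
]


def normalize_number(raw):
    """Keep digits only so differently formatted numbers compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())


def correlate(*extracts):
    """Group records from all tools by normalized phone number."""
    by_number = defaultdict(list)
    for extract in extracts:
        for record in extract:
            by_number[normalize_number(record["number"])].append(record)
    return by_number


def cross_tool_hits(by_number):
    """Return numbers seen by more than one tool, with a count per tool."""
    hits = {}
    for number, records in by_number.items():
        counts = defaultdict(int)
        for record in records:
            counts[record["source"]] += 1
        if len(counts) > 1:
            hits[number] = dict(counts)
    return hits


hits = cross_tool_hits(correlate(phone_extract, mail_extract))
for number, counts in hits.items():
    print(number, counts)
```

The point of the sketch is the shape of the work: normalize a shared identifier, bring the silos together, and surface only the entities that appear in more than one source. On a real case, the same steps would have to run over terabytes of data, which is where a distributed engine comes in.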
Stay tuned for further updates in this area from SAP!
But for now, watch the following video to better understand HANA Vora.
Megan Meany on SAP HANA Vora: http://news.sap.com/spin-megan-meany-episode-41/
Global Solution Manager for SAP Public Security and Future Cities in EMEA
Industry Business Solutions