In October of last year, the Melbourne Institute released findings from a working paper series on Intergenerational Disadvantage, which show that young people are almost twice as likely to need social assistance if their parents received benefits (http://melbourneinstitute.unimelb.edu.au/__data/assets/pdf_file/0003/2514135/wp2017n28.pdf). While this conclusion will be unsurprising to those with experience in Social Protection, this is a landmark study in terms of both the data analyzed and the analytics approach applied. Indeed, at the time the study was initiated, the technologies that would ultimately facilitate the analysis did not yet exist! Now, with emerging predictive analytics, machine learning and real-time computing technologies, there is an unprecedented opportunity for data-driven policy and program insights to deliver better social and economic outcomes.
Part of what makes the Melbourne Institute’s findings unique is that the analysis provides irrefutable evidence of intergenerational disadvantage in Australia, on the basis that the study has been conducted against a full dataset, not just a sample. The Department of Social Services’ Transgenerational Data Set (TDS) provides access to the records of 124,285 Australians born between October 1987 and March 1988, and 98% of these subjects could be matched to their primary carers, which allowed them to be included in the study. A longitudinal analysis is being conducted over 18 years and has already been applied across 126 million fortnightly social assistance payments, with transaction data currently available for these young Australians through to age 26. This is an excellent use case for big data analytics within a Public Sector context.
Big data analytics enables us to challenge preconceptions that might have been formed through having only a limited view of the data. In the case of the Melbourne Institute’s study, the data argues against the notion of a widespread welfare culture in which values are shaped and disadvantage becomes increasingly entrenched. Rather, the data shows that disadvantage caused by circumstance (e.g. disability) is much harder to overcome than that caused by personal choice.
Big data analytics also has the potential to provide new insights across datasets, enabling us to develop a more complete understanding of people and their circumstances. Again, in the case of the Melbourne Institute’s study, the data shows a strong cross-program correlation across the spectrum of social benefits. This is particularly pronounced in the case of parental mental health disability, which is identified as having a broad range of consequences for young people who take on the burden of caring for their parents.
Now, three emerging technologies have the potential to shape how we consume big data and how we might apply new insights to deliver better social and economic outcomes:
Predictive Analytics is a form of advanced analytics that uses both new and historical data to forecast future activity, behavior and trends (http://searchbusinessanalytics.techtarget.com/definition/predictive-analytics). It encompasses a range of statistical techniques used to predict the probability of certain outcomes for individuals, based on patterns observed in historical data for people with a similar profile. These predictions can be applied to inform and influence policy decisions, such as identifying high-risk cases in a child protection scenario and prompting early intervention to prevent child abuse and neglect.
Machine Learning is a type of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes without being explicitly programmed (http://whatis.techtarget.com/definition/machine-learning). It extends predictive analytics through the computational exploration of correlations between sample inputs and known outputs, which can be used to refine predictive models over time. This approach can be applied to optimize service plans by uncovering hidden patterns in the data and proposing interventions with the highest probability of delivering the desired outcomes in a given circumstance.
Real-time Computing is the use of, or the capacity to use, data and related resources as soon as the data enters the system (http://searchcrm.techtarget.com/definition/real-time-analytics). It enables analytical techniques to be applied at the point of service, in the operational system, thereby reducing the lag time traditionally associated with data warehousing. This capability is key to making analytics proactive rather than reactive so that, for example, new customers can be segmented by risk profile at the time of intake.
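To make the relationship between these three technologies more concrete, the following is a minimal sketch in Python. It is illustrative only and is not the method used by the Melbourne Institute or by any agency mentioned in this article. The data is synthetic, the two features (`benefit_share` and `arrears_share`), the outcome rule and the segment thresholds are all hypothetical assumptions, and a simple logistic regression stands in for whatever model a real project would choose. A model is fitted to historical records, then applied to score and segment a new customer at the moment of intake.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit a logistic regression model by batch gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss with respect to the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def risk_score(model, x):
    """Predicted probability of the adverse outcome for one record."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def segment(score, high=0.7, low=0.3):
    """Assign a risk segment; the thresholds are hypothetical."""
    if score >= high:
        return "high"
    if score <= low:
        return "low"
    return "medium"


# Synthetic "historical" records: [benefit_share, arrears_share], both 0..1.
# The outcome rule is an assumption made purely to generate example data.
rng = random.Random(0)
history = [[rng.random(), rng.random()] for _ in range(200)]
outcomes = [1 if a + b > 1.0 else 0 for a, b in history]

# Predictive analytics / machine learning: learn from historical data.
model = train_logistic(history, outcomes)

# Real-time computing: score a new customer at the point of intake.
new_customer = [0.9, 0.8]
score = risk_score(model, new_customer)
print(f"risk score {score:.2f}, segment: {segment(score)}")
```

In a production setting the model would be retrained as new outcomes are recorded, which is where machine learning refines the predictions over time, and the scoring step would run inside the operational system rather than as a separate batch job.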
One US State Government agency is a leading adopter of emerging technologies through its Management and Performance Hub (MPH). The MPH is a real-time computing platform, including predictive analytics capabilities, based on SAP technologies. In 2014-15, the State successfully piloted the MPH to statistically quantify the importance of risk factors driving its persistently high infant mortality rate. The pilot project applied predictive analytics to 9 billion rows of data across 15 data sets to establish correlations and causal relationships among previously unknown risk factors, enabling the State to identify subpopulations with underlying drivers of infant mortality. As a result, the agency secured an additional budget appropriation of $13.5 million for new programs targeting early intervention for high-risk cohorts. Having established an enterprise-wide data analytics asset, the State has recently applied the MPH to combating the opioid epidemic and improving traffic safety.
In another great example, an Australian Government agency has recently completed an 8-week trial of machine learning technology with SAP. The Government’s strategic objective is to reduce citizen debt propensity by enabling earlier notice and targeted interventions through root cause analysis. The purpose of the trial was to apply machine learning to a vast array of data in order to provide early indicators of customers who may not have the capacity to pay compulsory contributions. In the trial, 187 million transaction records were analyzed across 97,000 customers, and the prototype achieved a 71% debtor prediction rate after 4 weeks of training. Implementing this capability on a real-time computing platform will enable the agency to build dynamic risk profiles that can inform evidence-based decision-making and customer segmentation. The agency is continuing to work with SAP to refine the machine learning model, and it intends to apply this capability across its business to reduce total liabilities.
Emerging technologies have the potential to substantially automate and significantly improve the business processes associated with needs analysis, risk assessment, service planning, service delivery and outcome tracking. In some cases, as has been demonstrated by the Melbourne Institute’s study into Intergenerational Disadvantage, we already have the data and the ability to analyze it retrospectively. The opportunity is to leverage this and apply today’s emerging technologies, as demonstrated by the Management and Performance Hub in the US and the Australian Government’s machine learning prototype, to deliver better social and economic outcomes for future generations.
In the coming months, the SAP Institute for Digital Government (SIDG) will publish a series of articles exploring the use cases and benefits of emerging technologies for data-driven policy and practice within a Public Sector context. We are keen to interview Government agencies that have undertaken pilot projects or developed prototypes using predictive analytics, machine learning and real-time computing technologies. We have a particular interest in how agencies are overcoming the challenges associated with data access and cross-agency collaboration, and in the success or failure of particular implementation strategies. Ultimately, we would like to develop a set of industry-informed guidelines and templates for applying these technologies. To get involved in this exciting initiative, please contact Ryan van Leent at firstname.lastname@example.org