Isolating Customer Data: When is it a Better Fit for the Cloud?
For all that global business has embraced cloud computing—welcoming its low cost, low barrier to entry and reduced IT burden—not everything about this architectural sea change is working out perfectly. Limitations in security and customization are a growing source of discontent for many current and prospective cloud customers.
The need to serve similar data for many customers usually shows up in hosted environments. And for the majority of providers the default approach to satisfy that requirement is an amalgamated, “multi-tenant” customer database.
Unfortunately, multi-tenant database architecture is inherently less data-safe than an isolated, per-customer approach. Concerns over trust prevent many organizations from adopting hosted applications. Not every organization is willing or able to accept the risk of letting its data reside in a shared repository. Beyond that, a shared database limits the way that providers can service their customers, forcing them to leave revenue on the table.
Is it time to rethink our default approach to data architecture in the cloud?
If it’s been a while since you’ve performed an objective assessment of your data isolation strategy, the answer is probably yes. Too many organizations settled on an architectural standard early on and stuck with it, paying little attention to subtle shifts in customer demand as the marketplace matured. Worse, they’ve often based this important decision primarily on what’s convenient for the IT department, not what’s most beneficial for the business at large.
The Case for Isolation
New ideas are steadily emerging about how best to service customers through the cloud, and these are challenging the status quo. Robust data security and governance policy are two issues that have long been incompatible with shared databases. But businesses are also beginning to demand flexible service options and data schema customizations that cloud providers often simply can’t accommodate.
Still, there are reasons why ISVs tend to prefer a multi-tenant database for their cloud applications. Foremost among these is reducing the management burden, an issue which will be discussed in detail later in this paper.
After management, data analysis is an oft-cited advantage of the shared database approach. With all the customer data in one large database, the application provider has a hassle-free way to analyze the data extensively for interrelationships. The data is immediately available for a broad range of business analytics purposes.
A shared database will also minimize hardware costs and make it easier to maximize the utilization of your server resources. Finally, for those ISVs that have been using multi-tenant architectures for years, there will be the additional benefit of familiarity and comfort, which can result in faster development time.
Isolated data architecture, on the other hand, can’t provide the same level of simplicity that an amalgamated database can for data analysis. Data analysis is still possible, but it requires the additional step of creating an automated routine to extract the data you need from each database, anonymize it and aggregate it for your analysis. It’s an inconvenience to be sure, but not a disastrous one.
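The extract, anonymize and aggregate routine described above might look something like the following minimal sketch. It assumes each customer database is a SQLite file with an `orders` table; the table, the column names and the helper functions are all hypothetical and chosen only for illustration. The salted hash is strictly speaking pseudonymization, so a real deployment would need to judge whether that is strong enough for its data.

```python
import hashlib
import sqlite3


def pseudonym(value, salt):
    """Replace an identifying value with a short salted hash."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def extract_anonymized(conn, tenant_name, salt):
    """Pull order rows from one tenant database with identifiers hashed."""
    tenant_key = pseudonym(tenant_name, salt)
    rows = conn.execute("SELECT customer_email, amount FROM orders").fetchall()
    return [(tenant_key, pseudonym(email, salt), amount) for email, amount in rows]


def aggregate(tenant_conns, salt):
    """Load anonymized rows from every tenant database into one analytics store."""
    analytics = sqlite3.connect(":memory:")
    analytics.execute(
        "CREATE TABLE facts (tenant_key TEXT, customer_key TEXT, amount REAL)"
    )
    for name, conn in tenant_conns.items():
        analytics.executemany(
            "INSERT INTO facts VALUES (?, ?, ?)",
            extract_anonymized(conn, name, salt),
        )
    analytics.commit()
    return analytics


def make_demo_tenant(rows):
    """Build a throwaway in-memory tenant database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_email TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn
```

Once the routine has run, analysts query only the `facts` table, so the per-customer databases stay isolated while cross-customer analysis remains possible.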
More importantly, isolated data architecture has some notable advantages in the win column.
- eliminates risk of data leakage
- simplifies audit process compliance
- makes it possible to comply with laws related to physical data storage boundaries
- introduces flexibility to customize databases on a customer-by-customer basis
- creates opportunities to provide premium services
In the following paragraphs we’ll look at each of these benefits in more detail.
In shared databases, massive and damaging data leaks can be caused by surprisingly trivial software bugs. By isolating each customer’s data you virtually eliminate the risk of exposing one customer’s data to a competitor. Data stored in separate databases cannot be accidentally leaked from one customer to another by a faulty query. Even with an application-level error, the worst that can happen is that a company’s own information is leaked within that company. For many companies, especially large enterprises, a sense of security, control and ownership over the data is mandatory.
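To illustrate how trivial such a bug can be, the sketch below runs the same faulty query against a shared database and against an isolated one. The `invoices` table, the tenant names and the functions are hypothetical, and SQLite stands in for whatever database a provider actually uses.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory invoices table (hypothetical schema) with the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoices (tenant TEXT, total REAL)")
    conn.executemany("INSERT INTO invoices VALUES (?, ?)", rows)
    return conn


def buggy_list_invoices(conn):
    # Bug: the tenant filter (WHERE tenant = ?) was left out of the query.
    return conn.execute(
        "SELECT tenant, total FROM invoices ORDER BY tenant"
    ).fetchall()


# Shared database: rows for every customer live in the same table.
shared = make_db([("acme", 100.0), ("globex", 250.0)])
# Isolated database: this one holds only acme's rows.
acme_only = make_db([("acme", 100.0)])

shared_result = buggy_list_invoices(shared)       # includes globex's invoice
isolated_result = buggy_list_invoices(acme_only)  # contains acme's rows only
```

In the shared case the missing filter returns another customer’s rows; in the isolated case the same mistake can only return rows that already belong to the customer asking.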
Many industries today are subject to rigorous standards (such as HIPAA and SOX for the healthcare and finance industries, respectively), and these standards often require a formal audit. The regulations themselves are rarely specific about how to architect data stores, but when it comes time to do the auditing, a shared database can make it difficult or impossible to present the data in a way that complies with auditors’ demands. In some industries, isolation of data is mandatory, and customers will need to be able to show proof of their compliance.
Occasionally, customers are subject to strict limitations on where their data can be stored. These limitations can be internal or part of a regulatory requirement, but when ISVs can’t guarantee that data will reside within the physical boundaries mandated by these policies, the sale is lost. In these cases, only a disparate, standalone database will satisfy the requirements. A globally distributed data center—the most common architectural choice for large multi-tenant hosted apps—is a non-starter.
When each customer has a separate data store, the possibilities for customization and premium services are virtually limitless. For ISVs who may be looking for new ways to monetize their existing assets, premium services can be an exceptional source of accretive revenue. The types of premium services you might offer will vary based on the application and the customer, but they can include:
- Charging usage fees, by the month or the hour
- Providing an option to query the database directly for reporting purposes
- Providing an option for local backups, for added peace of mind
- Adding custom tables to one customer’s database without forcing the change on others
- Adding custom indexes without negatively impacting other customers’ performance
When considered across the whole spectrum of pros and cons, it quickly becomes clear that the benefits of isolation far outweigh the costs. Or do they? All of these disparate databases have to be managed.
The Management Dilemma
The benefit list for single-tenant data architecture is undoubtedly long. But for ISVs, many of whom are managing thousands of very similar databases, the entire debate can boil down to one issue: Management. It’s a universal truth that managing one thing is easier than managing many, and databases are no exception.
Sadly, with little or no tooling in existence to help providers manage the task of deploying changes to some databases but not others, for many ISVs the business advantages of isolated data architecture are moot. The cost to hire sufficient DBA expertise to manage all of the databases effectively would be extraordinary. The management burden trumps all other rationale.
Interestingly, not all ISVs agree with that logic. Familiarity seems to play a big role in how different data architects view the significance of the management dilemma. In a 2011 survey of North American and European ISVs conducted by Sybase, respondents who said that they already were in the habit of managing many disparate customer databases reported that single-tenant architectures eased management issues rather than making them more complex.
Regardless, the widespread use of multi-tenant databases means an uphill battle for hosted application providers as they adapt to new market pressure for more secure and flexible alternatives. Solving the management dilemma, by providing significant tools and automation for common processes such as provisioning new databases, would tilt the scale drastically in the opposite direction.
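As a rough idea of what such automation could involve, the sketch below provisions databases from a baseline schema and applies a named change to a chosen subset of them. Everything in it is an assumption made for illustration: the schema, the `schema_changes` tracking table and the function names are hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3

# Hypothetical baseline schema shared by every newly provisioned customer database.
BASELINE_SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
CREATE TABLE schema_changes (name TEXT PRIMARY KEY);
"""


def provision(path=":memory:"):
    """Create a new customer database from the baseline schema."""
    conn = sqlite3.connect(path)
    conn.executescript(BASELINE_SCHEMA)
    return conn


def apply_change(fleet, targets, name, ddl):
    """Apply one named DDL change to the selected tenants only.

    Tenants that already recorded the change are skipped, so the
    routine can be re-run safely. Returns the tenants it changed.
    """
    applied = []
    for tenant in targets:
        conn = fleet[tenant]
        already = conn.execute(
            "SELECT 1 FROM schema_changes WHERE name = ?", (name,)
        ).fetchone()
        if already:
            continue
        with conn:  # commits on success, rolls back on error
            conn.execute(ddl)
            conn.execute("INSERT INTO schema_changes VALUES (?)", (name,))
        applied.append(tenant)
    return applied
```

Recording each change inside the database it was applied to means the fleet can be inspected later to see which customers have which customizations, for example a custom index added for one customer but not the others.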
Such automation is hardly a pipe dream. Self-managing and self-tuning databases have existed for decades in niche and embedded environments. A more broadly applicable database solution that makes managing many databases as easy as managing one appears to be close at hand. ISVs that are strategically prepared for the opportunity will reap a competitive advantage in the cloud marketplace of the future.