This insightful post was originally written by Glenn Paulley and posted to sybase.com in June of 2008.
Recently, there was a discussion about whether distributed data management (versus centralized data management) is a necessary condition for growth. Perhaps, but “growth” makes the issue too complicated – I think a more relevant question is:
Is it necessary for an organization’s data to be truly distributed to drive gains in productivity?
In my experience, productivity improvements are often quoted but rarely materialize, which is why I am often skeptical about productivity claims (and I think most managers are too). The quoted gains rarely if ever live up to expectations, and I think there’s a really good reason why.
The issue is latency. If you think about it, every business process deals with queues. We send email, thus queuing the work, and await the response. We context-switch from task to task, each time queuing another piece of work. We play telephone tag trying to ask a colleague a question. Queuing applies equally well to both business and software development. Fred Brooks’ experiences at IBM developing OS/360 led him to write The Mythical Man-Month, which describes how adding more people to a project lowers the productivity of each (there are more queues to manage).
Queuing is also why remote management – think off-shoring – rarely offers the anticipated cost savings, because productivity is so poor: the latency of each item in the queue increases. If we can remove latency, productivity improves. That’s the most important reason why Ivan’s use of the IvanAnywhere robot has been so successful. Today, however, off-shoring, or geographically dispersed plants, is de rigueur for virtually any major corporation, but it is exceedingly difficult to do it well. Both Google and Microsoft are examples of companies that have fought the urge to expand their core development to other geographical centers, and I think with good reason.
However, if operational autonomy enters the picture, then things are very different: autonomy coupled with the elimination or reduction of process latency offers significant advantages, to the point where it amounts to what’s occasionally known as a paradigm shift. So I think it’s not really whether or not data management is distributed that improves productivity, but rather that distributed data management is the enabler for reducing latency through the availability of data at any point in the process.
Which, of course, is the whole story behind SQL Anywhere.