former_member181923
I was at the IEEE Database Conference at the Houston Doubletree in 1994 when Gio Wiederhold (who was then still deciding which IT projects (D)ARPA should or should not fund) made the following request of the audience at a plenary session. He asked them to stop submitting proposals to "extend SQL". He said that although SQL was the worst of all possible outcomes - neither machine-friendly nor user-friendly - the community had decided to make it THE standard for data retrieval, so what was really needed was bright proposals for new data-retrieval paradigms, not extensions to a bad paradigm.

Well, of course no one listened - for a very simple reason. As someone once said: if you tell folks they've made a mistake, they will not be angry with you; but if you tell them they're fundamentally misconceived, they will never forgive you.

Nonetheless, I'm here to say that:

1) SQL is fundamentally misconceived;
2) it is fundamentally misconceived because the relational paradigm on which it is based is fundamentally misconceived;
3) the relational paradigm is itself fundamentally misconceived because it is data-centric rather than index-centric;
4) SAP can and SHOULD use ADABAS to develop a new index-centric data-retrieval paradigm that would better support its Enterprise SOA aspirations.

To understand what I mean by index-centric, one need only recognize the obvious fact that you can't have an index on a field in a relational table unless the field is physically filled in the rows of the table. But why should this be the case? (Please! - in this day of RAID arrays and mirroring and parallelism - don't talk to me about back-up and recovery issues as if this were 1986 instead of 2006.) And in fact, one key developer of a highly-esteemed pre-relational database asked this very question and decided that there was no good reason to place this limitation on indices.
So he (Bill Mann) decided to make the Model 204 database really scream by permitting the addition of "invisible fields" to records in any Model 204 file. (Bill's resume currently reads like a history of IT from 1960 to 2010 - he was recruited out of MIT to work for Merrill at Lincoln Labs, then worked at BBN on projects for which he still holds patents, then for CCA on Model 204, then for Lotus ... and is now one of Michael Stonebraker's key staff members working on the new Vertica database.)

As Bill now recalls somewhat ruefully, "invisible field" was exactly the wrong name for his construct - which was nothing more than an index without a physical base. The name scared customers away from a very simple and powerful idea, one which can be used to develop an entirely new set of tricks in the repertoire of data magicians.

Suppose you need to know all of the records in a file (rows in a table) that have a certain field-name:value pair. Basically, all you need (for any given field-name:value pair) is a set containing:

a) record numbers
or
b) 1's in a bit-map (at least in a DB like ADABAS, which, like Model 204, is all the better for the fact that it still implements IFAM indices)

What you DON'T need (unless you hold stock in Oracle or IBM and are therefore a true believer in Codd's legacy) is for the value of the field to be physically present in a field of a record of a file ("column" of a "row" of a "table"). And in fact, it is a distinct (though cynical) possibility that the relational paradigm originally forbade this notion for the same reason that it originally forbade multiply occurring fields in the same row - it was developed by a company that was very, very interested in selling strings of 33x0's and their controllers, so the more space required, the merrier.
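For anyone who'd rather see the idea than read about it, here is a minimal Python sketch of an index without a physical base. To be clear, this is my own toy illustration and NOT how Model 204 or ADABAS actually implement anything: the class `IndexCentricFile` and its methods `add_record`, `tag_invisible`, and `find` are hypothetical names, and a plain Python integer stands in for the bit-map.

```python
from collections import defaultdict


class IndexCentricFile:
    """Toy file: stored records plus a bit-map index keyed by (field, value).

    Hypothetical illustration only; not the Model 204 or ADABAS design.
    """

    def __init__(self):
        self.records = []               # record number -> physically stored fields
        self.index = defaultdict(int)   # (field, value) -> bit-map held in an int

    def add_record(self, **fields):
        """Store a record and index every field that is physically present."""
        recno = len(self.records)
        self.records.append(dict(fields))
        for field, value in fields.items():
            self.index[(field, value)] |= 1 << recno
        return recno

    def tag_invisible(self, recno, field, value):
        """Index a field:value pair for a record WITHOUT storing it in the record."""
        if not 0 <= recno < len(self.records):
            raise IndexError("no such record number")
        self.index[(field, value)] |= 1 << recno

    def find(self, **criteria):
        """Return the record numbers matching every field=value pair (bit-map AND)."""
        bitmap = (1 << len(self.records)) - 1   # start with all records selected
        for pair in criteria.items():
            bitmap &= self.index.get(pair, 0)
        return [n for n in range(len(self.records)) if (bitmap >> n) & 1]


f = IndexCentricFile()
a = f.add_record(customer="ACME", region="EMEA")
b = f.add_record(customer="Globex", region="EMEA")
c = f.add_record(customer="Initech", region="APJ")

# "credit_risk" exists only in the index, never in the stored records.
f.tag_invisible(a, "credit_risk", "high")
f.tag_invisible(c, "credit_risk", "high")

print(f.find(credit_risk="high"))                  # [0, 2]
print(f.find(region="EMEA", credit_risk="high"))   # [0]
print("credit_risk" in f.records[a])               # False
```

Notice that the query on `credit_risk` works exactly like the query on `region`, even though no record carries a `credit_risk` field. The flip side, in this sketch at least, is that the value cannot be read back from the record, and the index cannot be rebuilt from the stored data if it is lost.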
Now back in the days before David Patterson's predictions about cheap disk space came true beyond his wildest imagination, one could argue that the last thing a CIO wanted was a construct which encouraged folks to add even MORE indices to a file (table). Particularly CIOs who were already ticked at the fact that a non-relational database engineered like Model 204 made it possible to add as many regular (physically-based) b-tree or inverted indices to a file as you liked, with no degradation in performance.

But David Patterson's wildest dreams about cheap disk space have come true, and as noted earlier, RAID arrays, mirroring, and parallelism now provide the infrastructure needed to:

a) update as many indices as you want in real-time or damn close to it;
b) successfully recover non-physically realized indices in case of failure (an admittedly real issue back in 1986).

So, to sum up, what SAP should seriously consider doing is re-implementing Bill Mann's old notion of "invisible field" inside ADABAS.

If anyone can't immediately see how the availability of such "invisible" indices would completely change the current OLAP paradigm for the better - just let me know and I'll be happy to devote more than one blogpost here to the topic - particularly to how and why an index-centric data-retrieval paradigm could help support SAP's Enterprise SOA aspirations in a number of ways.

And BTW - if you think you see a common thread running through my blogposts here (except for the first two on metadata repositories), you're entirely correct. It's simply embarrassing to see a company as intelligent and intelligently-run as SAP having to rely on fundamentally misconceived relational products such as DB2 and Oracle, and now that SAP has its own database, it no longer really has to.