Bookshelves, Milk Trucks and Databases
I spend quite a considerable amount of my spare time reading. Mostly I read fiction, science-fiction and fantasy, but sometimes I read crime novels and non-fiction. I have in the past swallowed whole books in a single sitting and, when done, I find it hard to part with these literary beasts. I really enjoy the smell of a new book and how, as you move through the tale, each page shapes your imagination with a bend or crease of the book’s spine. For me it is one of life’s simple pleasures, and while I am heavily involved in technology, I just don’t see how an eBook reader could possibly compare. As you can imagine, this has resulted in a great many books lying around in different nooks and crannies of my home.
With this in mind, and under a slight amount of pressure from my better half, I decided to buy a bookshelf and get to organising my mass of published pulp. Strolling through the display aisles of IKEA’s storage department, I pondered. How much space do I need to store my current book collection? How much space will I need in the future? How will I organise these books to make them easily accessible? What if I add more books or run out of space? What if I am not the only one reading from my collection? I suppose these questions, amongst others, need consideration when storing and organising any information. I remember my Uncle describing a similar issue he faced when starting his first business.
During the early ’80s my Uncle bought a milk float. With no intent of just giving milk away, he spent a few days canvassing for customers. With 21 names and addresses jotted down on a list, one cold morning he began delivering his first batch of Cleopatra’s bubble bath. As you can well imagine, finding 1 of 21 names on a piece of paper was quite a simple task. Problems began to occur over the next 3 months, when my Uncle became the most popular milkman in his local area. His single page of 21 names grew to 415, and simply scanning down the page was no longer an option. One evening my Uncle set himself a challenge: organise.
My Uncle bought an address book and his initial approach was to organise his customers by name. It was in some way a solution to his problem, but it left him driving from one side of the area to the other. Although John Smith and Paul Taylor lived next to each other in my Uncle’s book, they lived nowhere near each other in the real world. A different approach was needed. My Uncle bought a new address book and at the top of each page wrote the street name. He then listed each street number down the page. For his customers, he populated the row with their order details; unpopulated rows were addresses that could be canvassed. Hey presto, my Uncle had indexed his data, and in turn sped up his delivery time.
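For the programmers among you, here is a rough sketch of what my Uncle did, written in Python. It is only an illustration under my own assumptions: the street names, house numbers and orders are made up, as are the function names `find_by_scan` and `build_street_index`. The first function is the single sheet of paper, the second is the new address book with one page per street.

```python
from collections import defaultdict

# Hypothetical customer records: (name, street, house number, order).
customers = [
    ("John Smith", "Oak Road", 12, "2 pints"),
    ("Paul Taylor", "Mill Lane", 3, "1 pint"),
    ("Mary Byrne", "Oak Road", 14, "3 pints"),
]

def find_by_scan(records, street):
    """The single-page approach: read every row to find one street."""
    return [r for r in records if r[1] == street]

def build_street_index(records):
    """The new address book: one 'page' per street, sorted by house number."""
    index = defaultdict(list)
    for record in records:
        index[record[1]].append(record)
    for page in index.values():
        page.sort(key=lambda r: r[2])
    return index

street_index = build_street_index(customers)

# Turning straight to the right page, rather than scanning the whole list.
oak_road = street_index["Oak Road"]
```

The scan has to look at every customer each time, whereas the index is built once and then each street is a single lookup. The cost, as with my Uncle buying and filling in a second book, is the extra work of building and maintaining the index.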
Over the next few months my Uncle faced another problem: his service was too good. As new customers flocked in, indexing was no longer enough; he would need to buy another milk float and hire another milkman.
Initially this worked out quite well. They split the area geographically in two: my Uncle would work from 4am to 10am and his associate, Fred, from 10am to 4pm. At 10am my Uncle would hand Fred the address book so he could do his area. There was only one book, so they could not use it concurrently. One day my Uncle suggested that Fred bring the address book home and make a copy. It would mean they had an address book each, overcoming the concurrency problem, and they could take turns with shifts, vacations and cover periods of illness. They were, in simple terms, creating a replicated, distributed version of the address book.
Fred, like any new employee, was working avidly to demonstrate his value. Every day when his shift ended, Fred would pick a street from his area of the address book and knock on each door. He would canvass new customers and offer existing customers a chance to change their daily deliveries. This was a major success, and when Fred brought it to my Uncle’s attention, my Uncle gave him the next day off.
Getting up at 4am, my Uncle began his milk round. He hurried through his deliveries, realising he also had Fred’s area to do. At 3pm he was finished. He was both delighted and surprised at how quickly he had finished. The day was a Friday, which normally was quite slow, as customers got a double weekend delivery. When Monday morning came, Fred faced many angry faces. Some existing customers had got their old deliveries and newer customers had got no deliveries at all. Because Fred had only updated his own address book, my Uncle’s address book was left in an old, outdated state. The two books were no longer consistent. How would they overcome this new problem?
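The same mishap can be sketched in a few lines of Python. Again, this is only an illustration: the addresses and orders are invented, and `out_of_date` is a hypothetical helper of mine. Each book is a dictionary, and Fred writes his changes only to his own copy.

```python
# Hypothetical orders keyed by (street, house number).
uncles_book = {("Oak Road", 12): "2 pints", ("Mill Lane", 3): "1 pint"}

# Fred takes the book home and makes his copy.
freds_book = dict(uncles_book)

# Fred canvasses after his shift and updates only his own copy.
freds_book[("Oak Road", 12)] = "3 pints"   # existing customer changes order
freds_book[("Oak Road", 14)] = "1 pint"    # brand new customer

def out_of_date(stale_book, fresh_book):
    """Addresses where the stale book is missing or disagrees with the fresh one."""
    return sorted(
        address
        for address, order in fresh_book.items()
        if stale_book.get(address) != order
    )

# Delivering from my Uncle's book gives the old order, or nothing at all.
wrong_deliveries = out_of_date(uncles_book, freds_book)
```

Anything in `wrong_deliveries` is an angry face on Monday morning: number 12 gets the old order and number 14 gets no milk at all.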
Fred’s first suggestion was that at the end of every week they would meet and make changes to their respective address books. The problem here was that any changed deliveries or new orders would not be updated until the end of the week. So they would still always be slightly out of date and inconsistent, a major problem in the event of absenteeism due to illness. My Uncle suggested that Fred should take both books at the end of his shift and update them. But if for any reason Fred was unable to drop the book back on time, my Uncle’s book would be unavailable. A final suggestion was for my Uncle and Fred to log their changes on a separate piece of paper. Every morning they would check these logs, and although it added an administrative overhead, on balance it was an acceptable compromise.
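That last compromise, the separate piece of paper, might look something like the following in Python. Treat it as a minimal sketch rather than how any real database does it: the data is invented, and `record_change` and `apply_log` are hypothetical names of my own.

```python
# Hypothetical orders keyed by (street, house number).
uncles_book = {("Oak Road", 12): "2 pints", ("Mill Lane", 3): "1 pint"}
freds_book = dict(uncles_book)

# Fred's separate piece of paper.
freds_log = []

def record_change(book, log, address, order):
    """Update your own book and jot the same change down on the log."""
    book[address] = order
    log.append((address, order))

def apply_log(book, log):
    """Morning routine: replay the other person's changes, in order."""
    for address, order in log:
        book[address] = order

# Fred's evening of canvassing.
record_change(freds_book, freds_log, ("Oak Road", 12), "3 pints")
record_change(freds_book, freds_log, ("Oak Road", 14), "1 pint")

# Next morning my Uncle reads Fred's log before setting off.
apply_log(uncles_book, freds_log)
freds_log.clear()
```

Between Fred writing an entry and my Uncle reading it the next morning, the books still disagree, so this narrows the window of inconsistency rather than removing it. The sketch also ignores what happens if both of them change the same address on the same day, which they would have had to settle between themselves.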
My Uncle eventually sold his milk round, but had his business kept growing, I assume these minor inconsistencies would have been the least of his worries. In modern databases we suffer a lot of the same issues experienced by my Uncle; however, we have now scaled from hundreds of records to hundreds of millions. We have systems that are being read and updated by thousands of users every few minutes. Now more than ever, the trade-off between consistency, concurrency and availability is paramount to enterprise success. These hugely scaled databases can never be perfect, but as long as the user is unaware of these imperfections, job done.
Now what colour bookshelf should I get?