Monday morning thoughts: a cloud native smell
Continuing on from my earlier random thoughts about what cloud native means to me, I was musing this morning on the nature of the web, and specifically URIs – or rather their specialisation that we see most commonly – URLs.
The importance of stable URLs
There’s a well-known article by Tim Berners-Lee (TBL) entitled “Cool URIs don’t change” which I’d encourage you to go and read at some stage over a cup of coffee. It’s from the World Wide Web Consortium (W3C). There are many implications of what he says in this article, which I’ll leave you to ponder. There’s also one particular observation that’s worth sharing now – the URL of the article hasn’t changed since its publication 20 years ago. Now that’s what I call setting a good example!
In thinking about the content of this article, I’m reminded of the excellent set of productivity tools from Google – known these days as G Suite. I say “known these days”, as they’ve been referred to previously as “Google Apps”, and – perhaps colloquially – as “Google Docs”. But while the suite’s name might have changed, the fundamental design that underpins this awesome set of tools has not.
Because Google get the web, they build their tools “of the web”, rather than merely “on the web”. That means, for example, that the URL of a Google spreadsheet, or a Google document, or a Google Form, or whatever – is unique and permanent. It doesn’t change. Even if you change the name of that “document” (we should really say “of that resource”), the URL remains the same. And that brings a whole set of web-based by-products that we take for granted. For example, I can jump to a document I’ve been working on recently in literally a handful of keystrokes, without having to think of where I stored it or what the URL might be, or what navigation path I’d have to take if I first had to find some sort of “root” resource. This is all I have to do:
- Cmd-T to get a new browser tab and have my cursor placed in the omnibar
- then three or four characters to identify the working title of the document
- finally a down-arrow or two and Enter to request that resource
Bam. In and editing.
More than can be said for, ahem, certain other online productivity / collaboration suites – merely change the title of the document you’re writing and the URL changes! Ouch! What use is that if I want to share the working draft resource with you?
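To make the difference concrete, here's a minimal sketch of the idea, and only a sketch: I have no knowledge of how Google actually implements this. The class `DocumentStore`, its method names and the `docs.example.com` host are all hypothetical. The point it illustrates is simply that the URL is built from an opaque identifier minted once at creation, while the title is just a mutable attribute of the resource.

```python
import secrets


class DocumentStore:
    """Toy in-memory store (hypothetical): a document's URL depends only on its opaque id."""

    BASE_URL = "https://docs.example.com/d/"  # made-up host, for illustration only

    def __init__(self):
        self._titles = {}  # opaque id -> current title

    def create(self, title):
        # Mint a random, URL-safe identifier that carries no meaning of its own.
        doc_id = secrets.token_urlsafe(16)
        self._titles[doc_id] = title
        return doc_id

    def rename(self, doc_id, new_title):
        # Renaming touches the title only; the identifier is never regenerated.
        if doc_id not in self._titles:
            raise KeyError(doc_id)
        self._titles[doc_id] = new_title

    def title_of(self, doc_id):
        return self._titles[doc_id]

    def url_for(self, doc_id):
        # The title plays no part in the URL, so a rename cannot break a shared link.
        return self.BASE_URL + doc_id


store = DocumentStore()
draft = store.create("Working draft")
shared_link = store.url_for(draft)
store.rename(draft, "Final report")
print(store.url_for(draft) == shared_link)
```

A suite that derived the URL from the title instead would hand out a link that stops working the moment the title changes, which is exactly the "Ouch!" above.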
Anyway, let me tear myself away from this becoming another post entirely, and look at a typical G Suite URL to help me get to my next point. Here’s an example, for a spreadsheet (I’ve changed the URL slightly for security reasons):
The structure of the URL is quite simple, and most of it is the unique code (1nT4…42) that identifies the individual online spreadsheet resource.
The Opacity Axiom
So that brings me to a parallel thought relating to cloud native, and something that (to me) is a “smell” (as in something that just gives me a subtle hint about something – not necessarily negative).
When I first started out exploring the SAP Cloud Platform (SCP), I noticed that a lot of the URLs had similar opaque identifiers in them. For example, if I created a temporary trial account, or a temporary member within a trial account, or if I added a subaccount or was given access to a new global account, whether in the Neo or (now) the Cloud Foundry context … each time, I saw unique, opaque identifiers.
Here’s another couple of (modified) examples:
Even services within the SCP had parts of their URLs that I couldn’t control (or initially understand) – provider account identifiers, for example. And they looked a little bit ugly to me, initially.
But then I remembered another W3C article, from even earlier in the web’s history (1996) – Universal Resource Identifiers – Axioms of Web Architecture. This is also a great article and worth a read. Of particular interest to us here in that article is the section on “The Opacity Axiom”, which states:
“The only thing you can use an identifier for is to refer to an object. When you are not dereferencing, you should not look at the contents of the URI string to gain other information.”
This axiom somewhat goes against the grain of what I like to think, but is actually crucial. First from the point of view of the side-effect of trying to infer structure from a URL, but more importantly from the perspective of where we are today, in the cloud native context of resources being spun up, created, instantiated, conjured up … and then after their utility has been spent, being deleted, destroyed, disappeared*.
(*yes I know I’m using this verb transitively, but there you go. Talking of unusual words and unusual usages, did you notice TBL using the word “disillusion” in the “Cool URIs Don’t Change” article – also as a transitive verb? That use has been waning since the early C20, but still wonderful.)
Resources that are ultimately ephemeral, such as those spun up on cloud platforms like SAP’s and Google’s, need to be born and then die, and in that intervening period they need an identifier that is as anonymous as it is unique.
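As a rough illustration of that lifecycle, here's a small sketch under stated assumptions: `ResourceRegistry` and its methods are hypothetical names of my own, not anything from SCP or Google, and a random UUID simply stands in for whatever scheme a real platform uses. The identifier is unique, says nothing about what it points to, and is only ever used for dereferencing, in the spirit of the Opacity Axiom.

```python
import uuid


class ResourceRegistry:
    """Toy registry (hypothetical) of short-lived resources keyed by opaque ids."""

    def __init__(self):
        self._live = {}  # opaque id -> resource description

    def spin_up(self, kind):
        # A random UUID: unique for practical purposes, and deliberately
        # anonymous, since nothing about the resource is encoded in it.
        resource_id = uuid.uuid4().hex
        self._live[resource_id] = {"kind": kind}
        return resource_id

    def dereference(self, resource_id):
        # The only legitimate use of the identifier: look the resource up.
        # Returns None once the resource has been destroyed.
        return self._live.get(resource_id)

    def destroy(self, resource_id):
        # The identifier dies with the resource and is never handed out again.
        self._live.pop(resource_id, None)


registry = ResourceRegistry()
first = registry.spin_up("subaccount")
second = registry.spin_up("subaccount")
print(first != second)                        # same kind, distinct identifiers
print(registry.dereference(first)["kind"])    # meaning comes from lookup, not from parsing the id
registry.destroy(first)
print(registry.dereference(first))            # gone once its utility has been spent
```

Because nothing is encoded in the identifier, there's nothing for a consumer to infer from it, and nothing that goes stale when the resource is renamed, moved or replaced.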
The cloud native smell
And it’s the very presence of these superficially ugly but essentially throwaway identifiers in URLs (after all, URLs *are* URIs, aren’t they?) that to me forms a subtle hint, a signpost, a smell, that what we’re dealing with is something that is cloud native. Resources, services, VMs, clusters, subaccounts – they’re created and destroyed all the time, not just in a web environment but in on-premise contexts and sometimes within proprietary architectures.
The fact that resources – and I’m using the word “resource” while thinking about how that word is used in Representational State Transfer (REST) – need identifiers in the context of the web (and yes, “cloud” doesn’t mean “just web (HTTP)”, but our interface to the cloud is predominantly via that protocol) means that the increasing occurrence of URLs with long strings of opaque characters often triggers a thought in my head that we’re moving further towards the age of cloud native.
What are your thoughts?
This post was brought to you by Finca Buenos Aires coffee and some happy memories from the early days of the web.
Read more posts in this series here: Monday morning thoughts.
Putting the R in URL! Transient resources...
Thanks DJ for this post!
This is - as is all of what you write - a delightful read. Wayyyyy back in the last millennium I remember working at a company when the Web was young and arguing with the Webmaster about URLs and whether they should never change or not (I was on the misguided side of the argument). It strikes me as amazing how many of those early themes -- from 20+ years ago -- we're replaying. As if "building a Web site" is a brand-new thing.
And now I shall retreat before I start to wax poetic about things like the <blink> tag or frames...
It is astonishing to think how the web has both changed so much on the one hand, but on the other hand it’s stayed the same. The design, curation and care of its technical underpinnings, making it into the most massive and scalable distributed web service in the world, has been so important to its survival and growth. Sometimes we forget those in organisations such as the W3C and the various working groups, whose efforts often go unnoticed, but without whose care we wouldn’t have the platform we have today.
I wish I could get our intranet to maintain stable URLs. Over the course of the last nearly twenty years, my group has published lots of documentation for our end users, but our web team has changed platforms for hosting our internal website several times during that same timeframe, including moving from internal hosting to external hosting and then switching hosting vendors, and undergoing a major site redesign. The links to our documents have broken several times, and each time it has been a major PITA to fix things. Of course, it's not like we're trying to maintain SEO or host documents for the public to consume on that intranet, but....
As you can guess, it's a hard problem, and one that doesn't go away. Hindsight is always useful too, but it's not even that - it's finding the balance between up-front planning of the resource address space, and organic growth and use for content or direction that wasn't even a twinkle in the eye.
Always great to read your thoughts. And also thanks for pointing to the W3C article on URIs. Nice to see references to cgi-bin 😀 Reminded me of my college dorm where I was hacking away at some perl code with cgi-bin. Stable URIs are worthwhile to strive for, but very few organizations have been able to achieve them. Reasons for not doing so could be many: commercial, technical or just poor understanding.
Just like what Matt Fraser said above, on a personal level I have maintained my personal blog pnarula.com for a long time. It has gone thru different backends like Movable Type, WordPress and now Hugo/Gitlab. Every change of backend introduced minor URI changes purely because a particular backend didn't support that style. We have seen similar with SAP Community. I have learned to live with it and marvel at Google's indexing that I can still find what I am looking for with one search. (BTW I am probably the only reader of my blog)
I would point to another scenario which is more related to how machines should refer to each other. At a higher level we have pets vs cattle, where on the pets side we had fixed IPs and ports etc., while on the cattle side they are ephemeral. When we go a bit deeper and get into containers and microservices - the whole service discovery pattern is there because we don't/can't use fixed URIs.
Since I have gotten into the coffee game in the last few weeks, and you mentioned your brew, I will be a copycat and mention that these random thoughts were brewed with Colombia Las Cochitas from Sweet Maria's.
Thanks for a great reply and contribution. Yes, as I mentioned to Matt, it is a tough prospect, and like you, I also have experienced (or caused, I guess) URLs to change for my posts. So we know first hand what it's like as a producer as well as a consumer. Blogging software wise, I've moved from Blogspot, through Blosxom, Movable Type, WordPress and now Ghost, with the main URL differences being in the way the date of each post is represented. But also I've moved hosts (sometimes hosting at home, sometimes on a shared colocated server, now on a VPS) with different backends.
Your reference to pets vs cattle is spot on. Actually that metaphor was probably rattling around the back of my mind, as I'd only just been introduced to it by Former Member in a recent Twitter conversation.
You're right about service discovery, but it does send a small shudder down my spine to hear that phrase - it brings back memories of SOAP and UDDI - that super successful service discovery mechanism of the past (tongue firmly in cheek here). I'd like to use the HATEOAS reference here, but I know it's a lot more complex than that 🙂
ps nice addition to the coffee conversation too!