Monday morning thoughts: upload / download in a cloud native world
This weekend I was exploring some Cloud Foundry features on the SAP Cloud Platform (SCP) and came across a pattern that is pretty much everywhere – not only within the SAP space but far beyond too. It got me thinking of whether this pattern will remain, or be replaced (or at least augmented) by something more – in my mind – cloud native.
The upload / download pattern
This pattern is the upload / download pattern for transferring artifacts “up to” the cloud or “down from” the cloud. An example (the one I came across this weekend) is shown in the screenshot here:
It’s clear that this pattern is pervasive, I see it in all sorts of situations. And I’m wondering if I’m thinking in too extreme a manner when I say that the assumed presence of a workstation-local filesystem (from which artifacts can be uploaded to the cloud, or to which they can be downloaded from the cloud) is something from which we’ll be able to free ourselves.
But first, do we want to free ourselves from that?
It’s not a secret that I’m an old mainframe dinosaur, cutting my work computing teeth on IBM big iron at the first company I worked for after finishing my degree. Moreover, during my degree I used various minicomputers, such as Vaxen*, while moonlighting in the computer lab and earning money programming for insurance companies in my spare time. Even before that, my first exposure to computing was at school where we had a PDP-11 with paper terminals.
*yes, that’s the colloquial plural for VAX machines.
So my view on what a computer is has been strongly shaped by my experiences sitting at workstations that looked like this:
One obvious common factor was that these terminals were known as “dumb”, in that they had no real local processing power of their own. Which also implies no local filesystem.
The new mainframe
More recently I’ve been adopting the excellent Chrome OS, on various devices including the Chrome Pixelbook, as my go-to platform, as it’s the nearest I’ve got to the mainframe model that is my ideal scenario – even tweeting about my experiences and thoughts with the #webterminals hashtag.
While I use my work laptop (running macOS) during the week, at the weekend I switch to what I consider the future of computing, which is my Chrome OS device, and which, for consistency, I’m going to refer to as my “web terminal” for the rest of this post. Yes, this particular web terminal has some local computing power, of course (it’s Linux underneath) but to me, storage of anything other than ephemeral files such as screenshots seems like a warning “smell”.
So when I came across the UI at the top of this post, on my web terminal, I paused. I’d been developing the app I wanted to deploy not locally on the web terminal, but on a virtual server, provided to me (for free) in the form of the Google Cloud Shell:
So when it came to deploying this app to the cloud, there was a mismatch – my app’s artifacts were also in the cloud, on a “virtual local” machine, rather than on the web terminal itself. There was no “down” to browse to, nor upload from. Well, of course, there is with this Chrome OS based web terminal – generally, the facility to upload local files via a web browser to a server is indeed a marvellous one, but in this case, I deliberately had no local files.
A pragmatic but short-term solution
So this got me thinking. At least, from a practical, short-term perspective, I was able to download files from the Google Cloud Shell to the storage area on my web terminal, and then browse to that storage area and upload them:
Quite a roundabout journey for the artifacts themselves.
But over a cup of coffee, it got me thinking. Will there be alternative solutions for true cloud native computing? That’s my interpretation of “cloud native”, by the way – where we progress (i.e. return positively, rather than regress) to the original mainframe era.*
*That’s extreme thinking, I know, and deliberately so, but there you have it.
Will there be facilities to say “go and find this file on this virtual machine”, exposed perhaps via a short-lived HTTP connection that’s spun up for that very purpose and then destroyed again?
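To make that idea a little more concrete, here’s a minimal sketch of what such a short-lived facility might look like: an HTTP server that serves a single file to exactly one request and then tears itself down. This is purely illustrative – the function name, file path, and port are my own assumptions, not an existing cloud facility.

```python
# Sketch: serve one file over HTTP exactly once, then shut the server down.
# Everything here (serve_once, the port) is a hypothetical illustration.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def serve_once(path: str, port: int = 8000) -> HTTPServer:
    """Serve the file at `path` to the first GET request, then stop."""
    with open(path, "rb") as f:
        payload = f.read()

    class OneShotHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)
            # Tear the server down once this single response has been sent;
            # shutdown() must run in a different thread to the serve loop.
            threading.Thread(target=self.server.shutdown).start()

        def log_message(self, *args):
            pass  # keep the sketch quiet

    server = HTTPServer(("127.0.0.1", port), OneShotHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A deployment target could then pull the artifact from that address while the endpoint exists – and after the first fetch, there’s simply nothing there any more.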
Of course, we have this facility already in slightly different circumstances, in the form of git-based deployments. Look at how, for example, we can move artifacts from design time in the SAP Web IDE to runtime on the SCP, via git.
Perhaps that’s the answer longer term too – provide two alternatives for deployment sources – a file upload facility as we saw in the first screenshot at the top of this post, but also a source code control endpoint – perhaps a special “one time use” endpoint with an opaque GUID. This is not something that exists today of course – we have to have the itch to scratch, to see it built.
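Sketching the “one time use endpoint with an opaque GUID” idea in a few lines of Python might look something like this. To be clear, the registry class, the base URL, and the whole mechanism are hypothetical – this is what the itch might look like once scratched, not anything a platform offers today.

```python
# Sketch: mint an opaque, single-use pull URL per deployment.
# The class, base URL and behaviour are illustrative assumptions only.
import uuid


class OneTimeSourceRegistry:
    """Maps opaque tokens to source locations; each token is single-use."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")
        self._sources = {}

    def mint(self, source_location: str) -> str:
        """Return a one-time pull URL for the given source location."""
        token = uuid.uuid4().hex  # opaque: reveals nothing about the source
        self._sources[token] = source_location
        return f"{self.base_url}/src/{token}"

    def redeem(self, token: str) -> str:
        """Resolve a token to its source exactly once, then destroy it."""
        # pop() removes the entry, so a second redeem raises KeyError:
        # the token is spent.
        return self._sources.pop(token)
```

The deploying platform would redeem the GUID once to fetch the sources (from a git repository, a Cloud Shell home directory, wherever), after which the endpoint is gone for good.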
I’m not sure what the answer is. I’m not even sure this is a significant and widespread problem today. It just seems to me that any sort of upload / download involving a workstation-local filesystem doesn’t feel right long term.
I know folks like their local computing power, and who am I to deny them that? What are your thoughts? I’d love to hear them.
This post was brought to you by birdsong in the early dawn, and a nice cup of tea.
Read more posts in this series here: Monday morning thoughts.
Often I've got the same feeling. Not that I'm at the same level of webworking as you, but my lil' Lenovo laptop is already on ChromeOS (Cloudready).
In the Cloud (call it whatever you want: Host, Internet, Web...) we shouldn't need to talk about files anymore. Everything should simply be a resource, or rather the URI to that resource.
Google is already going in the right direction. In every Google app we have the possibility to select a resource on Google Drive.
Leaving now for a cloud training (ServiceNow), cheers Uwe
Nice to see I'm not the only one headed in this direction 🙂
(And yes, Neverware's Cloudready is a great way to turn a workstation into a, well "cloud ready" device, esp. for older hardware that would otherwise be gathering dust).
You're absolutely right - the idea that artifacts should simply be resources (referred to via a URI) is spot on. And yes, the Google Drive idea of being able to select a resource rather than a "workstation local" file is a great example of this.
Enjoy your cloud training!
I agree with both, and Google Drive or OneDrive should be that thing, or some more basic storage concept like an S3 bucket. I think the debate then moves to whether we want a PaaS or an IaaS version of a cloudy place where my files reside.
Yes, I think that the cloud storage question adds a layer to the question (or the answer) - as it's fragmented there isn't a single standard that could obviously emerge "with just a few tweaks". But the fact that we *are* using cloud storage is great and puts us in a good position.
To be honest, I don’t know where the majority of my data (personal or work) physically resides. I map filesystems to local drives at home and dropbox and S3; At work, there’s data that ultimately resides on AWS instances as EBS storage (or S3 Glacier or EFS …), or in VMs backed by real disk in one or more of our data centres or other cloud vendors (and I’m not even thinking about how the customers store their data)…
I just use a file explorer representation when it’s the most convenient way to view these files and “data objects”.
At a more prosaic level, over the years my music collection has got a bit fragmented – I’m currently consolidating it on one music service, but at the moment, that service is splitting the tracks from some of my albums into multiple “folders” – e.g. it thinks I have three albums called “Briefcase Full of Blues”, each one containing a subset of the tracks of the entire album. The root cause is that I (and the music services and file stores) treat each track as a separate object rather than treating the album as an object.
Even now, Google Drive or Dropbox or an S3 browser, even S3 itself, are “just” representations of a workspace. In general, I (can’t speak for anyone else) have got used to that workspace being represented in something like a Windows file explorer, with a hierarchy – for example folder “Artist” contains multiple “Albums”, which contain multiple “tracks”.
To use your example, are you transferring ONE git file from a central repository to your workspace, or are you transferring a whole project? At a technical level it’s a series of the former, even if you’re just synchronising one change, but conceptually, you are synchronising one big object (the git project).
However, to understand the project or album enough to change it (or just to understand it more deeply without changing it, rather than just consume it), you DO need to know the underlying technical structure. For example, imagine playing the tracks of an ELP album without knowing the correct order. Same music, totally different experience!
my 2 cents …
Your 2 cents definitely appreciated, Martin! Those are great thoughts, which has got me thinking about how I organise(d) my music (I use Spotify predominantly now so I don't even think of organisation, much less storage - but that's a story for another time I guess).
I like how you draw the distinction between the different layers - actual storage solution, storage exposure, and the view on that exposed storage (e.g. Explorer style windows).
To answer your question - I guess it varies - sometimes it will be a single file, other times it will be a ZIP file (still a single file technically I guess) of multiple files and folders. In fact the screenshot at the top of this article has *both* those variations.
Then again, if we move to something more cloud native, perhaps the concept of (or need for) a ZIP file goes away - one could see that as a result of having to upload (or download) a collection of related artifacts when starting from the position of a filesystem. Perhaps the non-upload/download solution means that we don't have to think in those terms any more ...
John Patterson just commented on Twitter:
"Simplify, Standardize, Automate should be the mantra eg 'cf push myapp.git'"
I like this, and it made me think of ngrok, that awesomely useful and insanely easy HTTP tunnel mechanism. With a single command, one can instantly fire up an HTTP tunnel from any machine to make a protocol/port available there through a public interface.
I'm not saying ngrok itself is a solution here, but the way it's so easy to fire something up and continue with whatever you're trying to achieve is exactly the experience I think of when I read the "simplify, standardise and automate" mantra in John's comment.
If you've not tried ngrok, have a look and see. Depending on your circumstances, it may change your (development) life!
Made me remember when ngrok saved my life when trying to access a SAP system installed in a VM. Really easy to use for a rookie like me (:
Nice blog, by the way.
Thanks Christian. Yes, ngrok is pretty darn useful!