In this post, I think about mainframes and a message documentation aspect from the mainframe world that, while originally proprietary, is a big plus for operators and developers alike and something I’d love to see return.

On my route into Manchester for Saturday’s run with the Mikkeller Running Club Manchester chapter I listened to an InfoQ interview: Adrian Cockcroft on Microservices, Terraservices [sic] and Serverless Computing. From 2016, it’s a great interview conducted by Wesley Reisz, and contains much to ponder even today.

Teraservices and mainframes

One of the subjects covered was “teraservices”, starting around 17 minutes into the audio recording. I didn’t really know what they were until Adrian explained; basically, they’re services at the opposite end of the scale to microservices. You know the sort of thing:

smaller <-- pico nano micro milli ~~ kilo mega giga tera --> larger

The irony of the misspelling in the InfoQ title makes me smile – the word “terra” (rather than tera) means ground, or earth, in Latin, which couldn’t be further from where these services have their home – in the cloud.

Teraservices run in memory sizes far greater than microservices. Adrian made the comparison between microservices running in 100 megabytes of RAM and teraservices running in 2 terabytes of RAM, on machines available by the hour on Amazon Web Services.

I’d like to think that the advent of in-memory data storage and processing with HANA and the corresponding demand for machines with super large memory footprints has pushed the industry to the point where terabyte-sized machines are available as commodities in the cloud today. Adrian and Wesley laughed about things turning full circle (towards the mainframe era) and running a whole bunch of services on one “great hulking machine”. Making effective use of such a machine would involve using the internal memory as a communications backplane between all the containers running in that machine, a backplane much faster than any network-based backplane would allow.

And yes, it does remind me* of the mainframe era in general. But more specifically, making best use of specialised hardware for giant workloads is what mainframes are all about.

*Of course, I’ve written about the mainframe era before, when thinking about web terminals and the cloud in general in an earlier post in this Monday morning thoughts series: “upload / download in a cloud native world”, and way back beyond then too: “Mainframes and the cloud – everything old is new again”.

I read somewhere that the computing industry* cycle is at the decade scale – in other words, ideas and concepts come around and are (re)invented every ten years or so. I think the cycle is a little longer than that, but a cycle does exist. On the one hand, mainframes never really went away – they’re still handling a large part of the world’s transactional processing in specific sectors such as the banking and airline industries. On the other hand, the conscious shift away from mainframes in enterprise computing in the early ’90s is now perhaps being reflected, like an echoing musical phrase that comes later in the piece, by a move to the cloud … perhaps it’s fair to say that the general idea of mainframes is back, albeit in a different guise.

*and not just the computing industry – companies today are looking at “insourcing”, where a decade ago they were falling over themselves to outsource a lot of their processes.

 

The IBM documentation machine

After leaving university in 1987 I started work at Esso Petroleum in London, and joined the Database Support Group. Esso was an IBM mainframe shop (I’d *just missed* the punched card era, which was a shame, on reflection) and in the data centre, sixty miles away out in Abingdon, Oxfordshire, there were IBM 3090 series mainframes running versions of MVS – first MVS/XA and then MVS/ESA. While the machines were remote, the documentation was local, in the form of huge printed tomes that hung from their spines on a rack system, in a special documentation room.

I have visceral memories of spending time in that room, selecting the right tome, carefully unhooking it from the rack (they were usually hundreds of pages each and quite heavy), opening it up on the desk and finding, invariably, exactly what I was looking for. Then getting sidetracked and spending more time than my initial investigations required, looking through related documentation and discovering new worlds of related information.

 

Message identification and documentation

What I remember the most though is how thoroughly messages were documented. Not just that, but how the messages were actually organised and catalogued. Messages were issued on consoles, and in batch job log output, for example, and each message was prefixed with the message ID, identifying the subsystem, the actual message number and the severity. Over time, I found myself being able to just glance at the “pattern” of messages in a job log, and discern, without much further investigation, what had happened.

Even more impressive – and useful – than just the consistent identification of messages was the actual message documentation itself. Taking a printout of a job log, say, to the documentation room, I’d reach for one of the MVS system messages volumes and look up the documentation for the message identification specified. I’d find not just a rehashing of the message line, in a different format, but a decent and full description, thoughtfully and accurately written, and where there was relevant action required, an explanation of what we needed to do.

My memory might be rose-tinted today, but I honestly can’t think of an area of IBM big iron computing in that facility that was a mystery. Everything that happened was described in messages, every message was identified, and every identified message was documented in great detail.

I’ve recently started carving out a little bit of time again for Hercules, the open source System/370, ESA/390, and z/Architecture emulator (I’ve played around with Hercules before: see the post Turnkey MVS 3.8J on Hercules S/370 Screenshot from 2005).

More specifically, I’ve been looking at Juergen Winkelmann’s TK4- turnkey distribution of MVS 3.8j (a version of MVS that is now in the public domain). I powered up an emulated 3033 IBM mainframe and watched the Initial Program Load (IPL) sequence on the console before logging in.

Messages in an unattended console of an emulated IBM 3033 mainframe, and a connected 3270 session

 

Here’s a really simple example of what I mean about message identification and documentation. Looking at the first two messages at the top of the display in the screenshot, we have:

IEF236I ALLOC. FOR JES2 JES2
IEF237I 00E  ALLOCATED TO PRINTER1

Both message identifiers show that the messages belong to the IEF family, which is for events relating to IPL, Job Entry Subsystem (JES) and scheduler services, and more (see the z/OS Message Directory for more information). We’re pointed to the system messages volume 8 (IEF-IGD), the older equivalent of which would have been hanging in that documentation room all those decades ago.
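
To make the structure of those identifiers concrete, here is a minimal sketch in Python that splits a console line into its parts. It assumes a simplified layout (a three-letter component prefix, a three- to five-digit number, then a one-letter type code), and the meanings attached to the type codes are my recollection of the common ones rather than an authoritative list. The names `parse_message`, `ConsoleMessage` and `TYPE_CODES` are invented for this illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed, simplified layout: 3-letter prefix, 3-5 digit number, 1-letter
# type code, then the message text. Real message IDs have more variants
# (JES2 "$HASP" messages, for example) that this pattern does not cover.
MESSAGE_LINE = re.compile(
    r"^(?P<prefix>[A-Z]{3})(?P<number>\d{3,5})(?P<type>[A-Z])\s+(?P<text>.*)$"
)

# Assumed meanings of the most common type codes; consult the message
# manuals for the definitive list.
TYPE_CODES = {
    "I": "information",
    "A": "immediate action required",
    "D": "decision required",
    "E": "eventual action required",
    "W": "wait state",
}


class ConsoleMessage(NamedTuple):
    prefix: str      # component family, e.g. IEF
    number: int      # message number within that family
    type_code: str   # trailing letter of the identifier
    meaning: str     # assumed meaning of the type code
    text: str        # the rest of the line


def parse_message(line: str) -> Optional[ConsoleMessage]:
    """Split a console line into its identifier parts, or return None."""
    match = MESSAGE_LINE.match(line.strip())
    if match is None:
        return None
    type_code = match["type"]
    return ConsoleMessage(
        prefix=match["prefix"],
        number=int(match["number"]),
        type_code=type_code,
        meaning=TYPE_CODES.get(type_code, "unknown"),
        text=match["text"],
    )


if __name__ == "__main__":
    for line in (
        "IEF236I ALLOC. FOR JES2 JES2",
        "IEF237I 00E  ALLOCATED TO PRINTER1",
    ):
        print(parse_message(line))
```

With the identifier split out like this, the prefix tells you which volume to reach for, the number gives the entry within it, and the trailing letter hints at whether anyone needs to do anything.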

This is the page from the volume that contains documentation for IEF236I:

Just look at the richness and precise nature of the information on that page. That volume (IEF-IGD) is almost 1000 pages in length, by the way.

You can imagine that with some patience, a desire to master the very explicit skill of navigating & using the IBM documentation, and some nous, there were few problems that occurred that you couldn’t fathom. Everything was there, if you only took the time to find it and read it.

 

A proprietary plus

One of the reasons why this was possible, of course, was because IBM dominated the market, and was large enough to cause customers to buy wall-to-wall IBM hardware and software. “No-one got fired for buying IBM” was the phrase. Their dominance and size & breadth of offerings made it possible to produce such a wonderfully rich layer of information, the likes of which I have never seen since, not even close.

The proprietary nature of that era meant that everything that one did as an operator, systems programmer or developer was within the context of what IBM produced. The word “proprietary” has now taken on negative tones, and there are good reasons for that. But I’d posit that one big plus from the proprietary circumstances in that age was that this was documentation done right. It shows that it’s possible to build complex systems and subsystems and make them understandable at an operations level. One could argue, as well, that message documentation was more important when the source code wasn’t available. But the availability of source code only helps a small percentage of those working with systems like this, and doesn’t excuse a lack of good documentation.

Today’s open source world is wonderful and a major reason why we’ve progressed so much in terms of computing, research and enterprise. The nature of open source means, indirectly, that the kind of documentation that existed for the complex array of subsystems of an IBM facility is unlikely to be produced for the systems that we’re building today. We take best of breed products, tools and processes and glue them together, and it’s more or less guaranteed that the operator surface area is so disparate and disconnected that a uniform — dare I say ivory tower — approach to consistent documentation, even a consistent way to output and identify error messages, is not going to happen without some sort of massive concerted effort.
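
For what it’s worth, the convention itself is not hard to sketch; the hard part is getting everyone to agree on it. Below is a minimal illustration in Python of a message catalogue in which every message has an identifier, an explanation and an operator action, and in which a message that was never documented cannot be emitted. Everything in it is hypothetical: the `ORD` prefix, the two entries, the imaginary order service, and the `emit` and `explain` helpers are made up for this example and are not taken from any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    message_id: str    # component prefix, number and type code, e.g. ORD002E
    template: str      # text shown to the operator
    explanation: str   # what happened, in full sentences
    action: str        # what the operator should do next


# Hypothetical entries for an imaginary order service.
CATALOG = {
    entry.message_id: entry
    for entry in (
        CatalogEntry(
            "ORD001I",
            "ORDER {order} ACCEPTED",
            "The order passed validation and was queued for processing.",
            "None.",
        ),
        CatalogEntry(
            "ORD002E",
            "ORDER {order} REJECTED, STOCK SERVICE UNAVAILABLE",
            "The stock service did not answer, so the order could not be checked.",
            "Check that the stock service is running, then resubmit the order.",
        ),
    )
}


def emit(message_id: str, **fields: str) -> str:
    """Format a catalogued message; a KeyError means it was never documented."""
    entry = CATALOG[message_id]
    return f"{entry.message_id} {entry.template.format(**fields)}"


def explain(message_id: str) -> str:
    """Return the documentation for a message, as a manual page would."""
    entry = CATALOG[message_id]
    return (
        f"{entry.message_id}\n"
        f"  Explanation: {entry.explanation}\n"
        f"  Action: {entry.action}"
    )


if __name__ == "__main__":
    print(emit("ORD002E", order="4711"))
    print(explain("ORD002E"))
```

The design choice that matters here is that the code path for issuing a message and the documentation for it share one source, so the two cannot drift apart.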

One other thing that only occurred to me this morning while writing this – the messages being output on the console of my emulated 3033 mainframe from decades ago are documented in modern PDF-based files for IBM’s z/OS series. They are the same messages, and are still relevant today. Now if that’s not solid consistency I don’t know what is.

I yearn for the days when messages were consistent, easy to identify and read, and were documented to within an inch of their life. It’s one aspect of the first mainframe era that I’d love to see revived in the second mainframe era. What’s not to like?

 

This post was brought to you by today’s quiet run in the dawn light that rose on a new week, and by the happy nostalgia that looking through old IBM documentation (available at bitsavers) brings.

 

Read more posts in this series here: Monday morning thoughts.

12 Comments

  1. Martin English

    Hi Dj,

    I started out my professional coding on an ICL 2980 and remember being amazed at the idea of it coming with an ICL 1900 emulator. Until I found out it was ‘just’ a real ICL 1900 “hanging off the side like a bag” (one of my favourite quotes from The Soul of a New Machine).

    Even though IBM mainframes were not my first systems, I soon came to love that documentation… You didn’t need to know anything about a particular sub-system, so long as you knew how to use the documentation. Once I got into systems programming, I used to have my own well thumbed and read copy of the Principles of Operations documentation – I doubt a printed copy of the latest version would fit in my brief case 🙂

    While there was room for innovation in the PCM market (I worked on a Fujitsu system running their version of MVS/XA, that had a COBOL compiler with syntax for bit-wise operations), one of the things I noticed at the time was interoperability standards. For example, every vendor had to be able to connect over VTAM to IBM systems (I’m old enough that, at one stage, TCP/IP was an experimental protocol). The commercial imperative drove almost everyone to use it, so it became a de facto standard.

    Of course, the standard was defined by what IBM wanted, not necessarily what the customer wanted or needed, which is why wider industry standards became necessary, but because of IBM’s level of documentation you knew what the target was. And of course, back then IBM was open (though not free) source; you could look at the code, there was documentation on what control blocks meant what, and so you could debug the IBM code at the OS level (for example, CICS was successful mainly due to a customer rewrite of the code).

    The downside of all this was the cost, and IBM’s control and lack of responsiveness, which brings us back to why companies like DEC and Digital and others came into being (do read The Soul of a New Machine if you’re at all interested in the history of computing). And, yes, you’ll recognise some of the similarities between then and now, as well!

     

    hth

    1. DJ Adams
      Post author

      What a great comment, thanks Martin! I haven’t read The Soul of a New Machine yet, but it’s been on my list for a while. This may well have prompted me to bump it up to first place.

      I too remember VTAM, SNA generally and LU6.2 specifically, especially in conjunction with SAP’s CPI-C protocol. Heady days for sure!

      The fact that you remember and refer to the Principles of Operations and can still link to a (modern) version of it on IBM’s site really makes me smile, and underlines one of the points I made in the post. It really was a special era.

  2. Lars Breddemann

    I yearn for the days when messages were consistent, easy to identify and read, and were documented to within an inch of their life. It’s one aspect of the first mainframe era that I’d love to see revived in the second mainframe era. What’s not to like?

    Great documentation always went far beyond listing and naming facts about systems and actually taught the reader something. If one wants to learn why contemporary systems don’t come with such documentation, s/he just needs to look into how important the quality of documentation is for selling those systems. That’s true for the mainframe vs. cloud-subscription as well as for the plane vs. car buying situation. Documentation is a hard, high-effort and ongoing commitment – one that doesn’t pay off for a barely-cover-the-costs product that needs to be sold millions of times.

    Another aspect of the “good old” systems documentation is that it exclusively covered the system it sought to document. While that seems the obvious thing to do, it’s equally obvious today that no system stands by itself anymore. Not considering the impact of systems on users, partners, customers, competitors and society is one of the grave sins of information technology today and so far.
    I’m eager to see if IT is ever going to get a grip on this or if we just wait until we will be regulated by laws (and thereby push responsibility away to the lawmakers, similar to what banks are doing).

    DJ, thanks for the nice blog post!

    1. DJ Adams
      Post author

      Thanks Lars, food for thought indeed – pushing the responsibility to the lawmakers, now that certainly seems a double edged sword!

      You’re right about the interplay between today’s systems – no system is an island, etc. The composability of systems and products is great, but comes at a cost that you explain very well.

      Actually the idea of systems composability makes me think, of course, of the Unix philosophy and what Thompson and Ritchie tried to achieve (very successfully) with the idea of small pieces loosely joined (to repurpose the title of Weinberger’s book).

    2. Martin English

      Perhaps we need a Freedom of Information act for documentation, or is it all just some conspiracy to keep you coming back to the software vendor rather than fixing the bugs yourself ?  I note without comment that DJ’s twitter name starts with a Q….

       

      PS it’s a JOKE, people

      PPS (and I feel so weird that I thought some people may need that PS)

  3. Andy Bondarev

    Hi Mr. Adams,

     

    Totally agree, good programming and good systems are not just the ones that do everything right when everything is right, but also handle exceptional situations properly and are able to log what exactly has happened. So what has changed, and where, in the paradigm of detailed error messages and proper exception handling? Is it the way IT is taught at university, or is it due to pressures and tight deadlines pushing towards ‘signing off the scripts’, i.e. getting from A to B in the most common scenario? I believe it is a bit of both. On top of that there is a tendency to integrate quite a few systems from different vendors within any big programme, and quite a few messages get lost in transit. Even within the same solution/box there are often different layers of basis, business logic, DB, etc. which do not always pass on the errors properly. In the end it becomes quite hard to find anyone responsible for ‘This program has performed an illegal operation and will be shut down’ – this is how it started, isn’t it? ))

     

    1. DJ Adams
      Post author

      Thanks for the comment Andy – I think it’s a combination of a lot of factors, many of which you list here. The pressure to get things done quickly is the cause of many quality problems, not least documentation. The integration of different systems and subsystems is another big one; without the central coordinated “force” that must have existed to some degree in the IBM mainframe world, there’s little chance of “spontaneous documentation coordination”.

      The reference to the “illegal operation” message – I’m guessing it’s a Windows thing – reminds me that the advent of GUIs in operating systems was perhaps also a catalyst in this – it was more likely that we’d want to investigate system messages that were issued and stayed around on the screen or on printouts, than ones shown in a temporary dialog box.

      1. Andy Bondarev

        If only, if only… Let us open up a bit and start dreaming. What if there was a coding license, just as there are driving licenses? Get penalty points for sub-optimal performance, lack of comments, lack of documentation, and, quite importantly, poor exception handling? 12 penalty points and the license is taken until one re-sits the exam on best practices. Total fugitives get banned for years… Yes, all major vendors do give courses and certificates, but how many actually follow up on (and enforce!) whether what they teach is really implemented?

          1. Andy Bondarev

            Haha, no one is perfect. We all make mistakes, we should all be given a chance to correct them. The idea is to keep deliberate speeding coders and aggressive coders from the industry. I am sure you are (and were) none of these :))

  4. Gregory Misiorek

    Hi DJ,

    nice trip down memory lane. around the same time or maybe a few years later, i was trying my hand at a dumb terminal when in business school, but alas during the times of IBM fearing for their life, Excel became the tool of the trade and nobody would want to practice CLI commands with me when client server PC was all the rage. in any event, i also worked at IBM in the early 2000s and again it was all SAP and no mainframes as we lived the early internet (ITS) and the sunset of the client server architectures. now that we have linux foundation fabric project it may be worth taking another look at the grand old dame, and of all places HPI is now offering their first mainframe class in English (German version was last year): https://open.hpi.de/courses/mainframes2018 and who knows maybe we can meet more than one aficionado there?

    thx, gm

    1. DJ Adams
      Post author

      Hey there Gregory, thanks for the comment. Amazing, a mainframe course … enrolled! Yes, unfortunately I also remember the days when client-server came along and the Office malaise started to take hold. Ah well, I guess not every memory has to be a happy one.

      I like the reference to a mainframe as a “grand old dame”. Something very regal and awe-inspiring, somehow 🙂

