This is a searchable description of the content of a live stream recording, specifically "Episode 13 - Stdio-ifying the CSV filter mechanism" in the "Hands-on SAP dev with qmacro" series. There are links directly to specific highlights in the video recording. For links to annotations of other episodes, please see the "Catch the replays" section of the series blog post.
This episode, titled "Stdio-ifying the CSV filter mechanism", was streamed live on Wed 20 Mar 2019 and is approximately one hour in length. The stream recording is
available on YouTube.
Below is a brief synopsis, and links to specific highlights - use these links to jump directly to particular places of interest in the recording, based on 'hh:mm:ss' style timestamps.
Brief synopsis
In the previous episode (Ep.12) we looked at a simple CSV filter utility. In this episode we improve it by giving it the ability to read from STDIN and write to STDOUT so it plays nicely in pipeline contexts. Then we'll be ready to use it to finalise our data for our CAP-based "Northbreeze" service.
Links to specific highlights
00:03:45: A brief glimpse behind the scenes, as it were, where I produce the annotation blog posts for the video recordings that get uploaded to the
SAP Developers YouTube playlist. I write in
Markdown in Vim, which is a pretty nice combination of reliable and comfortable tech.
I use a Vim macro while writing the content, which we have a look at, starting from a simple template, and converting the HH:MM:SS timestamps to a Markdown-annotated link. I don't have this macro in my standard
.vimrc
, as it's only really relevant for this particular task, so I have it in a "project-local"
.vimrc
which is possible via this configuration:
set exrc
(see my
dotvim repo for more details).
This is what the macro looks like:
let @t = '0Ypihttps://www.youtube.com/watch?v=VIDEOID&list=PL6RpkC85SLQAIntm7MkNk78ysDm3Ua8t0&t=wwrhwrmAs0ys$)A:k0ys$]Jx0ys$*.'
(note the escape characters just after the
t=
,
As
and
A:
bits).
00:10:30: Based on
speri's question, we have a quick go at creating a simple macro in Vim.
00:14:30: A recap of the Unix STDIO philosophy, thinking about the three streams STDIN, STDOUT and STDERR, and the power that brings. There's a parallel between that and the dotchains that we're writing in JS with the Axios library.
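A minimal illustration (not the episode's code) of how the three standard streams appear in Node.js, mapping onto the Unix file descriptors 0 (STDIN), 1 (STDOUT) and 2 (STDERR):

```javascript
// STDOUT (fd 1): the real output, suitable for the next stage in a pipeline
process.stdout.write('real output\n')

// STDERR (fd 2): diagnostics and verbose messages, kept out of the pipeline
process.stderr.write('>> diagnostics\n')

// STDIN (fd 0): available as a readable stream on process.stdin
const stdinIsStream = typeof process.stdin.on === 'function'
```

Because STDOUT and STDERR are separate streams, a downstream consumer (via `|` or `>`) sees only the real output, while diagnostics remain visible on the terminal.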
00:15:55: Talking about Node.js debugging, based on a comment from
nabheet.madan3.
00:17:04: Reminding ourselves of what we have in the project directory
csvf
, including taking a quick look inside the project's
package.json
. This shows us that we have two dependencies, one on
@sap/cds
and the other on the
command-line-args
package.
00:17:40: A feature in the terminal file manager
Ranger that I discovered accidentally is the "zoom" feature with
i
, which I show here.
00:20:18: Setting out our new intentions by modifying the option definitions in the constant
optionDefinitions
. Remembering how it works right now with a simple example:
=> ./cli.js -v -i tmp/Suppliers.csv --fields supplierID city
>> Processing tmp/Suppliers.csv
>> Filtering to supplierID,city
>> Written to _out.csv
What we actually want to be able to do now is something more like this:
=> cat tmp/Suppliers.csv | ./cli.js --fields supplierID city > file.csv 2> error.log
using the pipe operator (
|
), the redirect operator (
>
) and the redirect-to-stderr operator (
2>
).
Anyway, we end up with options that look like this (from the help text):
Options:
-i, --input Input CSV file (reads from STDIN if not specified)
-o, --output Output CSV file (defaults to STDOUT)
-f, --fields List of fields to output (space separated)
-h, --help Shows this help
-v, --verbose Talkative mode
Examples:
csvf -i data.csv -f supplierID companyName city -o smaller.csv
csvf --fields supplierID city < input.csv > output.csv
cat input.csv | csvf -f supplierID city | less
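A sketch of what the revised option definitions might look like, in the shape that the command-line-args package expects (the names and aliases match the help text above; the episode's actual code may differ slightly):

```javascript
// Option definitions in the shape expected by the command-line-args package.
// 'multiple: true' lets --fields take a space-separated list of values.
const optionDefinitions = [
  { name: 'input',   alias: 'i', type: String },                 // reads from STDIN if omitted
  { name: 'output',  alias: 'o', type: String },                 // defaults to STDOUT
  { name: 'fields',  alias: 'f', type: String, multiple: true }, // fields to keep
  { name: 'help',    alias: 'h', type: Boolean },
  { name: 'verbose', alias: 'v', type: Boolean },
]
```

The key change from the previous episode is that `input` and `output` are now both optional, falling back to the standard streams.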
00:28:45: I fix the call to
process.exit()
by changing the value passed from a 1 to a 0, denoting success rather than failure (!).
00:29:10: Looking at the NPM module
get-stdin, which allows us to read from STDIN and which we'll be using. It's promise-based, too!
00:29:50: We install
get-stdin
with:
=> npm install --save get-stdin
... write a simple test script to try it out:
const getStdin = require('get-stdin')
getStdin()
.then(xs => console.log("GOT:", xs))
... and try it out:
=> echo "hello" | node stdin.js
GOT: hello
and also:
=> node stdin.js < /etc/hosts
GOT: ...
00:32:00: Someprogrammingdude points out that a future version of JS may indeed get a pipeline operator, via a TC39 proposal - this is great. See
https://github.com/tc39/proposal-pipeline-operator for more details. This information causes me to think of a talk on the F# language that I saw at
Manchester Lambda Lounge which in turn reminded me of the Elm language.
At this point I dug out the content from a talk I gave at Lambda Lounge a couple of years ago:
Discovering the beauty of pattern matching and recursion where I look at these two language features in different languages, including Elm, Elixir, Haskell, JavaScript and Clojure. This talk was slide based but the slides are mostly full of code, so that's OK, right?
🙂
00:35:08: Starting to refactor the code a little bit, turning the main part of the script into a new function
process
- this lays the groundwork for a really odd error which we'll come across a bit further on!
00:37:20: Now we can refactor the final part which looks at the value of
options.input
and we use
util.promisify
to make things a little nicer - see the post
Node.js: util.promisify() for more detail.
00:47:58: So we end up with this:
if (options.input) {
  readFileAsync(options.input, { encoding: 'utf8' })
    .then(process)
    .then(console.log)
}
Notice how clean and solid this is; there's nothing really "moving" in here, not even any explicit arguments passed to the functions in the
then()
calls, nothing that can go awry.
Notice also, however, a glaring oversight that we'll discover shortly!
00:49:40: We note that right now, while we can write to STDOUT already (with
console.log
), both our verbose output (prefixed with ">>") and also the real data output goes to the same place, which we see in a test run. We fix this in the
log
function by changing the
console.log
call to
process.stderr.write
.
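A sketch of the revised log function: verbose messages now go to STDERR so that the real CSV data on STDOUT stays clean in a pipeline (the `options.verbose` flag is an assumption based on the episode's description):

```javascript
// Stand-in for the parsed CLI options (assumed shape)
const options = { verbose: true }

// Verbose messages go to STDERR, leaving STDOUT for the real data
const log = x => {
  if (options.verbose) process.stderr.write(`>> ${x}\n`)
}

log('Filtering to supplierID,city') // appears on STDERR, not STDOUT
```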
00:51:10: Bang! When we try the new script out, we get a nice error: "Cannot read property 'write' of undefined". What the heck? Anyway, after a few brief moments of head scratching, we realise that this is because I inadvertently
redefined a major feature of the Node.js runtime, by creating our own function called
process
, clobbering the standard
process
object which has, amongst other things, the
stderr
property (to which we're referring with
process.stderr.write
).
What a fool!
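The bug can be reproduced in miniature (this is illustrative, not the episode's code): declaring our own `process` shadows Node's global `process` object, so `process.stderr` is suddenly undefined.

```javascript
const demo = () => {
  // Our own "process" function shadows the global process object in this scope
  const process = x => x.toUpperCase()
  try {
    // process.stderr is now undefined, so .write throws a TypeError
    process.stderr.write('>> hello\n')
  } catch (e) {
    return e instanceof TypeError
  }
  return false
}
```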
00:57:30: Renaming our function from
process
to something else (
filtercsv
) fixes the problem
🙂
We can now finish the refactoring of the script, which we do, including writing an
output
function in a way that we can partially apply it, too.
01:01:50: Trying the refactored script out, with STDIN and appending to the STDERR output using the append (
>>
) operator (as opposed to the (over)write operator (
>
)), and trying the write-to-file option too. All seems to work as intended. Great!
Phew!