This is a searchable description of the content of a live stream recording, specifically "Episode 13 - Stdio-ifying the CSV filter mechanism" in the "Hands-on SAP dev with qmacro" series. There are links directly to specific highlights in the video recording. For links to annotations of other episodes, please see the "Catch the replays" section of the series blog post.
This episode, titled "Stdio-ifying the CSV filter mechanism", was streamed live on Wed 20 Mar 2019 and is approximately one hour in length. The stream recording is
available on YouTube.
Below is a brief synopsis, and links to specific highlights - use these links to jump directly to particular places of interest in the recording, based on 'hh:mm:ss' style timestamps.
Brief synopsis
In the previous episode (Ep.12) we looked at a simple CSV filter utility. In this episode we improve it by giving it the ability to read from STDIN and write to STDOUT so it plays nicely in pipeline contexts. Then we'll be ready to use it to finalise our data for our CAP-based "Northbreeze" service.
Links to specific highlights
00:03:45: A brief glimpse behind the scenes, as it were, where I produce the annotation blog posts for the video recordings that get uploaded to the
SAP Developers YouTube playlist. I write in
Markdown in Vim, which is a pretty nice combination of reliable and comfortable tech.
I use a Vim macro while writing the content, which we have a look at, starting from a simple template, and converting the HH:MM:SS timestamps to a Markdown-annotated link. I don't have this macro in my standard
.vimrc
, as it's only really relevant for this particular task, so I have it in a "project-local"
.vimrc
which is possible via this configuration:
set exrc
(see my
dotvim repo for more details).
This is what the macro looks like:
let @t = '0Ypihttps://www.youtube.com/watch?v=VIDEOID&list=PL6RpkC85SLQAIntm7MkNk78ysDm3Ua8t0&t=wwrhwrmAs0ys$)A:k0ys$]Jx0ys$*.'
(note the escape characters just after the
t=
,
As
and
A:
bits).
00:10:30: Based on
speri's question, we have a quick go at creating a simple macro in Vim.
00:14:30: A recap of the Unix STDIO philosophy, thinking about the three streams STDIN, STDOUT and STDERR, and the power that brings. There's a parallel between that and the dotchains that we're writing in JS with the Axios library.
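A minimal illustration (not the episode's code) of how the three standard streams appear in Node.js, mapping onto the Unix file descriptors 0 (STDIN), 1 (STDOUT) and 2 (STDERR):

```javascript
// STDOUT (fd 1): the real output, suitable for the next stage in a pipeline
process.stdout.write('real output\n')

// STDERR (fd 2): diagnostics and verbose messages, kept out of the pipeline
process.stderr.write('>> diagnostics\n')

// STDIN (fd 0): available as a readable stream on process.stdin
const stdinIsStream = typeof process.stdin.on === 'function'
```

Because STDOUT and STDERR are separate streams, a downstream consumer (via `|` or `>`) sees only the real output, while diagnostics remain visible on the terminal.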
00:15:55: Talking about Node.js debugging, based on a comment from
nabheet.madan3.
00:17:04: Reminding ourselves of what we have in the project directory
csvf
, including taking a quick look inside the project's
package.json
. This shows us that we have two dependencies, one on
@sap/cds
and the other on the
command-line-args
package.
00:17:40: A feature in the terminal file manager
Ranger that I discovered accidentally is the "zoom" feature with
i
, which I show here.
00:20:18: Setting out our new intentions by modifying the option definitions in the constant
optionDefinitions
. Remembering how it works right now with a simple example:
=> ./cli.js -v -i tmp/Suppliers.csv --fields supplierID city
>> Processing tmp/Suppliers.csv
>> Filtering to supplierID,city
>> Written to _out.csv
What we actually want to be able to do now is something more like this:
=> cat tmp/Suppliers.csv | ./cli.js --fields supplierID city > file.csv 2> error.log
using the pipe operator (
|
), the redirect operator (
>
) and the redirect-to-stderr operator (
2>
).
Anyway, we end up with options that look like this (from the help text):
Options:
-i, --input Input CSV file (reads from STDIN if not specified)
-o, --output Output CSV file (defaults to STDOUT)
-f, --fields List of fields to output (space separated)
-h, --help Shows this help
-v, --verbose Talkative mode
Examples:
csvf -i data.csv -f supplierID companyName city -o smaller.csv
csvf --fields supplierID city < input.csv > output.csv
cat input.csv | csvf -f supplierID city | less
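A sketch of what the revised option definitions might look like, in the shape that the command-line-args package expects (the names and aliases match the help text above; the episode's actual code may differ slightly):

```javascript
// Option definitions in the shape expected by the command-line-args package.
// 'multiple: true' lets --fields take a space-separated list of values.
const optionDefinitions = [
  { name: 'input',   alias: 'i', type: String },                 // reads from STDIN if omitted
  { name: 'output',  alias: 'o', type: String },                 // defaults to STDOUT
  { name: 'fields',  alias: 'f', type: String, multiple: true }, // fields to keep
  { name: 'help',    alias: 'h', type: Boolean },
  { name: 'verbose', alias: 'v', type: Boolean },
]
```

The key change from the previous episode is that `input` and `output` are now both optional, falling back to the standard streams.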
00:28:45: I fix the call to
process.exit()
by changing the value passed from a 1 to a 0, denoting success rather than failure (!).
00:29:10: Looking at the NPM module
get-stdin, which allows us to read from STDIN and which we'll be using. It's promise-based, too!
00:29:50: We install
get-stdin
with:
=> npm install --save get-stdin
... write a simple test script to try it out:
const getStdin = require('get-stdin')
getStdin()
.then(xs => console.log("GOT:", xs))
... and try it out:
=> echo "hello" | node stdin.js
GOT: hello
and also:
=> node stdin.js < /etc/hosts
GOT: ...
00:32:00: Someprogrammingdude points out that a future version of JS may indeed get a pipeline operator, via a TC39 proposal - this is great. See
https://github.com/tc39/proposal-pipeline-operator for more details. This information causes me to think of a talk on the F# language that I saw at
Manchester Lambda Lounge which in turn reminded me of the Elm language.
At this point I dug out the content from a talk I gave at Lambda Lounge a couple of years ago:
Discovering the beauty of pattern matching and recursion where I look at these two language features in different languages, including Elm, Elixir, Haskell, JavaScript and Clojure. This talk was slide based but the slides are mostly full of code, so that's OK, right?
🙂
00:35:08: Starting to refactor the code a little bit, turning the main part of the script into a new function
process
- this lays the groundwork for a really odd error which we'll come across a bit further on!
00:37:20: Now we can refactor the final part which looks at the value of
options.input
and we use
util.promisify
to make things a little nicer - see the post
Node.js: util.promisify() for more detail.
00:47:58: So we end up with this:
if (options.input) {
  readFileAsync(options.input, { encoding: 'utf8' })
    .then(process)
    .then(console.log)
}
Notice how clean and solid this is; there's nothing really "moving" in here, not even any explicit arguments passed to the functions in the
then()
calls, nothing that can go awry.
Notice also, however, a glaring oversight that we'll discover shortly!
00:49:40: We note that right now, while we can write to STDOUT already (with
console.log
), both our verbose output (prefixed with ">>") and also the real data output goes to the same place, which we see in a test run. We fix this in the
log
function by changing the
console.log
call to
process.stderr.write
.
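A sketch of the revised log function: verbose messages now go to STDERR so that the real CSV data on STDOUT stays clean in a pipeline (the `options.verbose` flag is an assumption based on the episode's description):

```javascript
// Stand-in for the parsed CLI options (assumed shape)
const options = { verbose: true }

// Verbose messages go to STDERR, leaving STDOUT for the real data
const log = x => {
  if (options.verbose) process.stderr.write(`>> ${x}\n`)
}

log('Filtering to supplierID,city') // appears on STDERR, not STDOUT
```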
00:51:10: Bang! When we try the new script out, we get a nice error: "Cannot read property 'write' of undefined". What the heck? Anyway, after a few brief moments of head scratching, we realise that this is because I inadvertently
redefined a major feature of the Node.js runtime, by creating our own function called
process
, clobbering the standard
process
object which has, amongst other things, the
stderr
property (to which we're referring with
process.stderr.write
).
What a fool!
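The bug can be reproduced in miniature (this is illustrative, not the episode's code): declaring our own `process` shadows Node's global `process` object, so `process.stderr` is suddenly undefined.

```javascript
const demo = () => {
  // Our own "process" function shadows the global process object in this scope
  const process = x => x.toUpperCase()
  try {
    // process.stderr is now undefined, so .write throws a TypeError
    process.stderr.write('>> hello\n')
  } catch (e) {
    return e instanceof TypeError
  }
  return false
}
```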
00:57:30: Renaming our function from
process
to something else (
filtercsv
) fixes the problem
🙂
We can now finish the refactoring of the script, which we do, including writing an
output
function in a way that we can partially apply it, too.
01:01:50: Trying the refactored script out, with STDIN and appending to the STDERR output using the append (
>>
) operator (as opposed to the (over)write operator (
>
)), and trying the write-to-file option too. All seems to work as intended. Great!
Phew!