(originally published at www.sybase.com in August 2010; moved to SCN in December 2012)
Many SAP Sybase ASE users are familiar with the concept of using Unix/Linux named pipes (FIFOs) in combination with BCP in order to speed up transferring data between ASE servers. These named pipes are the things you create with the Unix/Linux
Guess what? Despite this having been used by many Sybase users forever, using BCP with FIFOs has formally always been unsupported by Sybase.
Yes, that’s right.
That is, until recently. Here’s the story.
But first, a bit of context. A common scenario for using BCP+FIFOs involves BCP’ing a table from server A to server B: when the table is large, you have to wait until the table is fully BCP’d out to a file before you can start BCP’ing it into the target server. By BCP’ing into a named pipe instead) of a file, you can immediately start BCP’ing into the target server, thus saving time. This works because a named pipe is a FIFO, to which one user can write and another can read at the same time, and the reader reads the data in the order it was written. In older years, using this approach was necessary when the operating system could not support files greater than 2Gb, and BCP’ing out a large table would have exceeded this file size.
Given that this is a pretty standard way of working, and named pipes are a standard Unix mechanism, you’d expect this to be supported, right? So did I.
Earlier this year, a customer asked me why it was that Sybase did not support BCP+FIFO. Having been using FIFOs with BCP for as long as I can remember, I told the customer he was mistaken and Sybase surely supported this combination. Not willing to give in, the customer then took me to the Utility Guide manual (scroll down to the “Usage” section), and there it said, unmistakably:
You cannot use named pipes to copy files in or out That was quite a surprise for me: I had been using this for so many years! I promised the customer to look into it, and made a mental note not to assume again that a customer is wrong without checking the facts first (a principle I try to hold high, but sometimes you let things slip…)
My first thought was that in some recent ASE version this restriction had somehow crept into the documentation. So I dug up the oldest version of the Utility Guide manual I could find, which appeared to be for 11.0, dating back to 1995. To my astonishment, that old version of the document included exactly the same restriction. Apparently I had missed something over the years…
Puzzled, I then embarked on what came to be a journey past many Sybase collegues across various engineering teams. Almost everyone in Sybase whom I asked about the reason for this restriction were unaware of it, and instead, were pretty certain that BCP+FIFO was fully supported. At least I wasn’t the only one who had missed something, which made me feel a little better. However, nobody felt happy to simply remove the restriction from the documentation without a full understanding of what this was all about. It didn’t exactly help that the documentation provided no details on the technical reasons for this restriction.
On the other hand, some other Sybase collegues immediately agreed that this restriction was correct — because it mentiones “named pipes”. While for most people a “named pipe” is a Unix/Linux FIFO, some folks primarily think of “named pipes” in Windows. However, Windows named pipes is really a client-server transport protocol, and has nothing to do with the FIFO mechanism (and indeed, has nothing to do with BCP as such). While the documented restriction didn’t provide any clarification on what a “named pipe” really meant in this context, we decided we focus on the Unix/Linux FIFO since that was most relevant to customers.
At this point, the options seemed to be:
- the documented restriction was simply incorrect, and BCP+FIFO was supported. But in that case, why was the restriction put in the docs?
- the restriction stemmed from an issue in some old ASE version which had been fixed long ago, but the documentation had never been updated
- the restriction was real — but in what way?
We had to figure out which was the right answer. Eventually I found someone who had once heard that there could be issues with text/image columns when using BCP+FIFO. But a quick test showed everything to work fine, so this meant that if the restriction was real, it was probably something more complex. Eventually some engineers started digging into source code in search for clarity.
After some time, it was discovered that BCP sometimes performs an lseek() system call, which would work fine on regular files, but could potentially fail on named pipes. As far as the engineer was concerned, that answered the question, but obviously this explanation would be of little use for customers.
After further searching and discussion, it was determined that this lseek() system call would occur only under a very specific set of conditions, all of which must apply:
- it has to be a BCP-in, i.e. from a named pipe into a database table; BCP-out to a named pipe is unaffected
- the table being copied into has one or more text or image columns (or similarly stored data, like off-row Java classes)
- the BCP-in operation uses a BCP format file
- the BCP format file specifies a non-zero prefix length for a text/image column
- the actual data size of the text/image column being copied is greater than the maximum size copied for such columns by BCP (this size is 32Kb by default and can be overridden by the -T flag)
If all of the above are true, then it could happen that BCP would fail to correctly copy data into the table from the named pipe. So that’s what it was all about…
This restriction is not likely to have a wide impact. Chances are that most ASE users have never used a BCP format file in the first place, let alone with a specification of non-zero prefix lengths (I must admit I had no idea what a “prefix length” was; see the BCP documentation for details if you’re really interested). It seems therefore safe to conclude that the original restriction was formulated too broadly (to the point of overkill, I’d say), and that BCP+FIFO are perfectly safe for use in the vast majority of cases.
The question remains why such a very specific situation was covered with such a sweeping restriction. I guess we’ll never know the reason since the origins date back so many years. It just proves once more that writing documentation is hard, especially if you want it to be correct, useful and informational.
Anyway, you can rest assured now that BCP is safe to use with FIFOs — the Unix/Linux FIFO mechanisms, that is.
A technote with a short summary of the above has been published at http://www.sybase.com/detail?id=1080140.