RFC. An acronym to make the steeliest ABAP detective shudder in his debug boots. I was sitting in my second-story office, catching up on the latest local developments after a fascinating trip to the pseudo-desert of Phoenix, Arizona, where SAP’s technical education conference set up camp for a few days, when I got the call. “RFCs gone missing.”
“Gone missing? – did you call Missing Processes?”
Yes, well, they called that mob, then they called me. If there’s one thing that’s harder to track down than a vanished remote function call, it’s one that started the vanishing act months ago. It turned out we’d been getting the signs, you know the ones – CPIC error – for a long time, but no one gave it much thought when it was just a lonely batch job in the middle of the night. But then it started happening to continental operatives, and more often. We’ve got the logs, they said, but not the ones they carved the Maltese Falcon out of.
“What’s the reward likely to be for straightening out this crooked RFC deal?” I asked.
“Practically nothing” was the quick answer, “though you’ll earn our undying admiration.”
“Perfect!” I said, “I’ll get right on it. Can you leave the graven images of this miscreant? I get 10 SCN points up front and 1 point per day for expenses.”
“All we’ve got are these leftover Google Wave invites that came in the email.”
“Fine, just leave them with my secretary, sorry, administrative officer, on your way out.”
Sunset deepened red and orange outside my cubicle window as I pondered my next move. Should I shadow the server or the client? What kind of agent was this anyway, to work only part of the time? I knew that time was not on my side, since a lowlife process hanging around the dispatcher and work controls could gum up people’s lives, making even the most mundane business seem tedious and seedy.
I started by poking around the trash cans outside the biggest application servers in town, checking out the classified ads in the dw logs. And I didn’t go in the front lobby through the SAP GUI like a high-class security officer; I went straight to the basement with the raw logs and eggshells, places like “dev_w1” and “dev_rfc8.”
A bunch of these clues are red herrings, or MacGuffins as Hitchcock called them. When I saw:
CPIC-CALL: 'ThSAPCMRCV' : cmRc=20 thRc=223
I had to look up what return code 223 meant. “Network read error” was what the SCN forums and a few other choice places had to say. Not extremely helpful, but a promising lead since this is a network case.
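For any gumshoe who would rather not squint at every trace line by hand, here is a minimal sketch of how the return codes might be pulled out of a CPIC-CALL line. The function name, the regular expression, and the lookup table are my own assumptions for illustration; the only code meaning in the table is the one reported above, and real trace lines may be laid out differently.

```python
import re

# Only the meaning reported in this story; anything else stays "unknown".
KNOWN_THRC = {223: "Network read error"}

# Assumed line layout, based on the single trace line quoted above.
CPIC_PATTERN = re.compile(
    r"CPIC-CALL:\s*\W?(?P<call>\w+)\W?\s*:\s*"
    r"cmRc=(?P<cmrc>\d+)\s+thRc=(?P<thrc>\d+)"
)


def parse_cpic_line(line):
    """Return the call name and return codes from one trace line, or None."""
    match = CPIC_PATTERN.search(line)
    if match is None:
        return None
    thrc = int(match.group("thrc"))
    return {
        "call": match.group("call"),
        "cmRc": int(match.group("cmrc")),
        "thRc": thrc,
        "meaning": KNOWN_THRC.get(thrc, "unknown"),
    }


if __name__ == "__main__":
    print(parse_cpic_line("CPIC-CALL: 'ThSAPCMRCV' : cmRc=20 thRc=223"))
```

Run over a whole dev_w or dev_rfc file, something like this would at least let you count which codes turn up most often before deciding which ones are the MacGuffins.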
The other place I looked was in the switching yard of the Socket and Memory Railroad. If any place might harbor a fugitive transient process, it would either be by the side of those tracks, or maybe in the great internet harbor docks of the Web Application Server Bay.
As I could not be assured of spotting a fleeting socket gap, I enlisted the help of a couple of stool bots, who were perfectly willing to sit in a railside cafe, drinking BitWise coffee and smoking EtherNet cigars, watching for suspicious Unicode characters. I instructed them to look out for a socket junkie who plugged in, then nodded out and forgot to disconnect. All the well-behaved, clean-shaven processes would drop off the map as soon as they got their fill of network data and posted their state on the dispatcher’s electronic clipboards.
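The stool bots' standing orders could be sketched roughly as follows. This is not the actual script from the case; it assumes you have already saved several `netstat -an` snapshots as text, that each TCP line ends with the two addresses followed by the state (as on Linux and Windows; other platforms format things differently), and the function names and threshold are made up for illustration.

```python
from collections import Counter


def established(snapshot_text):
    """Return the (local, foreign) address pairs in ESTABLISHED state."""
    connections = set()
    for line in snapshot_text.splitlines():
        fields = line.split()
        if (
            len(fields) >= 4
            and fields[0].lower().startswith("tcp")
            and fields[-1] == "ESTABLISHED"
        ):
            # Assumed layout: ... local-address foreign-address state
            connections.add((fields[-3], fields[-2]))
    return connections


def lingering(snapshots, min_sightings):
    """Return connections seen ESTABLISHED in at least min_sightings snapshots."""
    sightings = Counter()
    for text in snapshots:
        sightings.update(established(text))
    return sorted(pair for pair, count in sightings.items() if count >= min_sightings)
```

Feed it snapshots taken a few minutes apart and anything it returns is a connection that plugged in and stayed put. Whether that makes it a suspect or just a long-running honest citizen is still a judgment call.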
After a few days of taking the electronic pulse of the netstat underworld, I took a look at the product. What does this mean, I wondered?
Not much, I realized. I was spending a lot of time cooling my heels, and only finding out what I already knew: “network connections come and go.” There wasn’t anything here shining out like a giant searchlight, saying “Come and get me, network coppers! You won’t take me alive!” It was all run-of-the-mill, logical units of work, or people unplugging their terminals to catch the next streetcar home. Nothing to indicate my missing process was here.
I called the customer back, and told them I needed to take a fresh look at the case. The customer was actually glad to hear from me for a change, even though I had none of the usual suspects to parade in front of them.
What I learned was that the process faults, while unpredictable, occurred fairly often, and were noted in the custom application files. It mostly happened to automated scripts, but sometimes to live users.
What I found was perfectly normal behavior, and then, a bunch of errors. It was as if the process just couldn’t take it any longer and went all google-eyed.
N040124,67,WB43,0,80,80,0,0,80,GetWIPQuantity: Error - Connect to SAP gateway failed
After gleaning what I could from the voluminous traces, which wasn’t much of substance, I went over the plan with the customer. We would look for patterns in timing, whether by day of week, hour of day, or minute of the hour. It might not pan out, but there had to be a good reason this worked sometimes but not all of the time. ABAP Detective work isn’t all fun and glamor. In fact, it isn’t any fun and very little glamor.
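The timing hunt could start with something as plain as the sketch below. It assumes the failure times have already been pulled out of the custom application files as `datetime` values; how to read a timestamp out of those records is not shown here, and the function name is my own invention.

```python
from collections import Counter
from datetime import datetime


def bucket_failures(timestamps):
    """Count failures by day of week (0 = Monday), hour of day, and minute of hour."""
    by_weekday = Counter()
    by_hour = Counter()
    by_minute = Counter()
    for ts in timestamps:
        by_weekday[ts.weekday()] += 1
        by_hour[ts.hour] += 1
        by_minute[ts.minute] += 1
    return {"weekday": by_weekday, "hour": by_hour, "minute": by_minute}


if __name__ == "__main__":
    # Made-up sample times, purely to show the shape of the result.
    sample = [
        datetime(2024, 1, 1, 2, 0),
        datetime(2024, 1, 2, 2, 0),
        datetime(2024, 1, 3, 14, 30),
    ]
    buckets = bucket_failures(sample)
    print(buckets["hour"].most_common(3))
```

If one hour or one minute stands head and shoulders above the rest, that points toward a scheduled job or a timeout; a flat spread says the culprit keeps no regular hours.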
We’d look into altering the custom application so that it would provide more details at the time it got lost, including how many peers were logged in, and whatever else we could think of.
And oh, yeah, I’d post the story on SCN. Maybe another detective in another agency had seen this grifter before.
After that, I sat back to drink a strong cup of java, with a hint of cough syrup, pondering what to look at next.
CONTINUED IN CHAPTER TWO . . .