
In the last installment, the ABAP detective had received many tips on tracking the whereabouts of the missing Remote Function Call process.  What motivated the tipsters remains to be seen.  Some are detectives in their own schema, some are looking for answers to similar questions, and some are just naturally helpful.  A lot of the leads pointed toward the server, even though the evidence pointed to the client being untrustworthy.  After reading the evidence again, I decided it was time to set a tail.

My first thought was to catch the process red-handed, keeping such close tabs on the fork that I could be a tine.  I asked for a repeat code performance, this time with blanks in the return function call, but my request for extra cycles would need to wait for court proceedings.  With that avenue closed for now, I was left with casing the scene of the crime, hoping to catch a glimpse of the elements that triggered the process to stall.  I knew the job had paperwork, log files showing each detail, and figured that might yield more clues.  Before, I had merely looked at the error type, without seeing any pattern other than the time of day or day of week.  I needed to get closer to the action and see what else was there.

I took a few log files apart, looking for new clues.  My first attempt to set a tail by spotting a newly created process and viewing the log as it was being created turned into a short taxi ride.  I knew the processes restarted every 15 minutes, but by the time I spotted a new one, it was all over.  That gave me the idea, though, to look at the times more exactly.  I knew Windows, like other global operations, had punch clocks for every file. Maybe the created or modified times would mean something.

Example:

12/18/2009  04:18 PM            55,098 case-file-2009161849.txt

Created:  Today, December 18, 2009, 4:18:49 PM
Modified: Today, December 18, 2009, 4:18:57 PM

It looked like the file was created in less than 10 seconds.  It didn’t dawn on me at first, but the name included that time.  Was the file in on the caper at the beginning, or was it an accessory after the fast?  Knowing that might tell me where to look next.  As I looked closer at each case file, I realized there were errors at different places.  I needed to lay them out for a better look.

In my detecting arsenal of pattern matching tools, one of the trustiest is grep.  It’s been in service for several decades, and just gives the facts, man.

$ grep -i time case-file-2009161849.txt
TIME        Fri Dec 18 16:18:39 2009
TIME        Fri Dec 18 16:18:39 2009
TIME        Fri Dec 18 16:18:39 2009
TIME        Fri Dec 18 16:18:40 2009
[…]
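
Counting those stamps per file gives a rough size of each incident, assuming each logged failure carries exactly one TIME line (the excerpts suggest it, but it isn't proven):

$ grep -ic time case-file-*.txt

With more than one file on the docket, grep prints a name:count pair for each, so the heavy hitters stand out at a glance.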

I focused on the chronology of the crime, seeing the repetitions.  Like coffee perking to a rich blackness, I distilled these pearls further.  This time, I pulled another tool out of my bag, the unique spotter.  Somehow the last two letters had worn off from years on the job.

$ grep -i time case-file-2009161849.txt | uniq -c

      3 TIME        Fri Dec 18 16:18:39 2009
      7 TIME        Fri Dec 18 16:18:40 2009
      4 TIME        Fri Dec 18 16:18:41 2009

 

I began to understand this crime happened quickly, and seemed almost non-stop.  I drew a few of them into graphs to see if I could tell how the crime wave grew.  I had the idea that it started small and got worse near the end, but after examining the results I wasn't so sure.  Most of the hits were near the end, but not all.  The other conclusion was the high frequency of errors per second: over 20 in a few cases.
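
To put numbers on the worst offenders, the same pipeline can be sorted by count; a sketch against the case file above, with sort -rn putting the busiest seconds on top:

$ grep -i time case-file-2009161849.txt | uniq -c | sort -rn | head -5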

While I was puzzling out the sequence of events, trying to track backwards to the first pebble in the avalanche, I noticed that the times in the log files were before the file was opened on the operating system.  Different coders use different methods; this one seems to be storing all the evidence in memory, like a chipmunk, then writing it all out at once.  The danger with this, of course, is that evidence could be tainted by being filtered or sorted before writing, meaning I needed to be cautious in drawing conclusions.  No code room judge is going to buy a case so neatly gift-wrapped. Each file that I looked at was the same, with the writing time ranging from almost no time to over 10 seconds.  That seemed strange, in these days of GHz buses, but maybe there were some hidden round trips I hadn’t stumbled across yet.
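
One quick way to bound that write delay is to pull the first and last stamps out of a file and set them next to the created and modified times from the directory listing; the gap between the last internal stamp and the modified time is the part spent emptying the chipmunk's cheeks:

$ grep -i time case-file-2009161849.txt | head -1
$ grep -i time case-file-2009161849.txt | tail -1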

I put the case aside to return to other work, hoping my subconscious, or another operative, would shed light on the root cause.  After looking at one of the charts a couple days later, I noticed that the clock skipped backwards near the end (outlined in red and blue).  I could understand seconds going by with no errors, but it is impossible for the same second to occur more than, say, once.  I suspect even more now that the alleged single-thread process is actually a gang of them, with the loot being passed back and forth.  If one of them wrote a debug letter and made it on the books before another wrote its own confession, that would explain the time warp.  The threads are starting to unravel.  I really need to get the source code motion filing acted on.
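
One way to flag the time warp without squinting at charts: every entry in a given case file shares the same date (each file covers about a quarter hour), so a lexical comparison of the clock column (field 5 of the TIME lines above) is enough to catch a stamp running backwards.  A sketch, not a conviction:

$ grep -i time case-file-2009161849.txt | awk '$5 < prev { print "backwards: " $0 } { prev = $5 }'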

 

Figure 1: [image]

Figure 2: [image]

Figure 3: [image]

Figure 4: [image]

 

Next steps

While there is no lead suspect in this case, the head of the Source Code Bureau told me there’s a new development (so to speak) that will help us to infiltrate this syndicate.  There’s a new client box online; they’re going to move the suspect process onto it, watching to see if it goes bad by itself, or if it is being led astray by the neighbors.  With the holidays approaching, we might not see the results of this test case until early in 2010.

 

In flight

 

Joe H. suggested looking at RZ12 and SM51 for more RFC process details.  My initial take is that nothing jumps up and waves a white flag of surrender, but I'll need to take a look at one or two suspicious characters there.
