Jim Spath

The ABAP Detective Takes The Summer Heat

Previously, my detective work into measuring ambient environmental data led me to the clues presented in the winter-time post “The ABAP Detective Takes The-Heat … Cleanse/”, and then to the springtime clock-change post “The ABAP Detective Gets Their Clock Cleaned.”

As the sunrise got earlier and the days became longer, solstice kicked in. When I checked a few gauges, I thought the reported temperature values were too high. I could glance at a wall thermometer or HVAC (heating/ventilating/air conditioning) controls and see a discrepancy. Any time the numbers don’t look right, it’s like the spider sense kicks in. And then the hypotheses begin, even before enough data to draw logical conclusions has been captured.

My first take was to pull out the old spiral notebook and note data manually, as Columbo might do, with pencil and paper. A little of that goes a long way, as any hacker will try to automate a repetitive task after the 3rd or 4th manual effort. Looking around at available reference gauges, nothing nearby was automated, including, especially, the house thermostat. Ripping that device out to get a smarter unit would be an expense and a possible commitment to a third party big tech host, and I decided that could wait.

Digging into the morgue (old clippings, not bodies) I found references to federal sites that publish environmental data, particularly weather related. Links below. To cut to the chase, I set up a recurring job to pull in current weather conditions, then load them into the same monitoring system as the CPU and environment gauges “in house”.

Here’s an example of the supplied parameters and values:

Baltimore / Martin, MD, United States (KMTN) 39-20N 076-25W
Jun 27, 2022 - 04:55 PM EDT / 2022.06.27 2055 UTC
Wind: from the WNW (300 degrees) at 9 MPH (8 KT):0
Visibility: 10 mile(s):0
Sky conditions: clear
Temperature: 80 F (27 C)
Dew Point: 66 F (19 C)
Relative Humidity: 61%
Pressure (altimeter): 29.96 in. Hg (1014 hPa)
ob: KMTN 272055Z 30008KT 10SM CLR 27/19 A2996
cycle: 21
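
Most of the report above is plain `Label: value` text, so capturing it takes little logic. Here is a minimal sketch in Python, assuming the report text has already been downloaded. The function name `parse_observation` and the dictionary keys are made up for illustration and are not part of the feed or of the job described here.

```python
import re

SAMPLE = """\
Baltimore / Martin, MD, United States (KMTN) 39-20N 076-25W
Jun 27, 2022 - 04:55 PM EDT / 2022.06.27 2055 UTC
Wind: from the WNW (300 degrees) at 9 MPH (8 KT):0
Visibility: 10 mile(s):0
Sky conditions: clear
Temperature: 80 F (27 C)
Dew Point: 66 F (19 C)
Relative Humidity: 61%
Pressure (altimeter): 29.96 in. Hg (1014 hPa)
ob: KMTN 272055Z 30008KT 10SM CLR 27/19 A2996
cycle: 21
"""

# Matches values such as "80 F (27 C)", including negative or decimal readings.
TEMP_RE = re.compile(r"(-?\d+(?:\.\d+)?) F \((-?\d+(?:\.\d+)?) C\)")


def parse_observation(text):
    """Return a dict of the fields worth archiving from one decoded report."""
    fields = {}
    for line in text.splitlines():
        label, sep, value = line.partition(": ")
        if sep:  # the station and timestamp header lines have no "label: " part
            fields[label] = value.strip()

    result = {}
    for label, key in (("Temperature", "temp"), ("Dew Point", "dew_point")):
        match = TEMP_RE.search(fields.get(label, ""))
        if match:  # verify before capturing; skip anything malformed
            result[key + "_f"] = float(match.group(1))
            result[key + "_c"] = float(match.group(2))
    humidity = fields.get("Relative Humidity", "")
    if humidity.endswith("%"):
        result["humidity_pct"] = int(humidity.rstrip("%"))
    return result


obs = parse_observation(SAMPLE)
print(obs)
```

Keeping both the F and C values in the result mirrors the choice to archive both units.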

For a future refinement I’d probably clean up the logic to verify anything being captured and experiment with more frequent updates. It looks like the feed site is updated hourly, and that was the easiest to schedule. There are most likely XML or JSON feeds as well. I maximized the capture by including both Fahrenheit and Celsius in the local archives, as the data volume is minimal at 24 samples per day. The charts below show degrees F since that’s the custom here. Hopefully this won’t ruin the plot for non-Americans (not Un-Americans).

Once I had enough records to view the difference between a nearby ambient environmental record and the home temperature sensors, I could determine whether the previously calculated factor needed to be adjusted. That factor masks out the effect of the monitoring system’s CPU heat emissions on the too-close sensor device (see link below for a curve-fitting topic).

 

External and Internal Temperature Readings

You can tell the ambient reading (in green, and below the other 2 values) is from the government data as the values are integers; the average is system-calculated. To make the diagnostic more challenging, the delta between inside and outside is not fixed, even with the known state of not using air-conditioning, as the interior lags behind both daytime heating and nighttime cooling. However, I can glance at the other inside thermometers for a sanity check (oh, it’s 85 outside but 83 inside, for example). The chart above shows a 5-10 degree F spread (roughly 3-6 C). So I came up with a working hypothesis for how much to adjust the previously set offset.
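
Converting a temperature spread is not the same as converting a reading: the 32-degree offset cancels out, so only the 5/9 scale factor applies. Here is a quick check of the figures in Python, with helper names chosen just for this sketch.

```python
def delta_f_to_c(delta_f):
    """Convert a temperature difference (not an absolute reading) from F to C."""
    return delta_f * 5.0 / 9.0


def reading_f_to_c(temp_f):
    """Convert an absolute Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


for spread in (5, 10):
    print(f"{spread} F spread = {delta_f_to_c(spread):.1f} C")
print(f"80 F reading = {reading_f_to_c(80):.0f} C")
```

The spreads come out at 2.8 C and 5.6 C, and the 80 F reading rounds to the 27 C shown in the feed sample.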

Good thing I waited and looked further though. I had spotted a runaway system process more than once, which looked like this, on the sensor node:

 

  PID USER  PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND   
12244 user  39  19  480524  34460    672 R  99.7   7.9  20644:16 logrotate 
26807 user  20   0   11352   2900   2488 R   1.0   0.7   0:00.09 top       
   11 user  20   0       0      0      0 S   0.3   0.0  13:33.69 process
22895 user  20   0       0      0      0 I   0.3   0.0   0:01.44 process

Why is logrotate still running? That is still an open case, but the recognition allowed me to make progress in this investigation. Maybe high CPU usage, even on just one core, could generate some heat.
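
Spotting this kind of suspect does not have to depend on someone glancing at `top`. Below is a minimal sketch in Python that scans captured `top` text for processes that are both busy and long-running. The thresholds and function names are assumptions for illustration, and it assumes the TIME+ column reads as minutes:seconds, as it appears to in the listing above.

```python
TOP_LINES = """\
  PID USER  PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
12244 user  39  19  480524  34460    672 R  99.7   7.9  20644:16 logrotate
26807 user  20   0   11352   2900   2488 R   1.0   0.7   0:00.09 top
   11 user  20   0       0      0      0 S   0.3   0.0  13:33.69 process
22895 user  20   0       0      0      0 I   0.3   0.0   0:01.44 process
"""


def cpu_minutes(time_plus):
    """Convert a TIME+ value (minutes:seconds) to minutes of CPU time."""
    minutes, seconds = time_plus.split(":")
    return int(minutes) + float(seconds) / 60.0


def find_runaways(text, min_cpu_pct=90.0, min_minutes=60.0):
    """Return (pid, command, cpu_pct, minutes) for busy, long-running processes."""
    suspects = []
    for line in text.splitlines():
        cols = line.split()
        if len(cols) < 12 or not cols[0].isdigit():
            continue  # header or blank line
        pid, cpu_pct = int(cols[0]), float(cols[8])
        minutes = cpu_minutes(cols[10])
        if cpu_pct >= min_cpu_pct and minutes >= min_minutes:
            suspects.append((pid, cols[11], cpu_pct, minutes))
    return suspects


for pid, command, cpu_pct, minutes in find_runaways(TOP_LINES):
    print(f"suspect: {command} (pid {pid}) at {cpu_pct}% CPU, {minutes / 60:.0f} CPU-hours")
```

Requiring both conditions keeps a short burst of legitimate work, such as a housekeeping job, off the suspect list.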

Zoomed in temperature changes

Looks highly suspicious. Timing just after midnight is rookie scheduler work. Always randomize starting times; use prime numbers. Distribute for niceness.
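
One way to act on that advice is sketched below in Python. A hash of the job name stands in for a random draw, so each job gets a start time that stays the same between runs but avoids the top of the hour. The hour window, the job name and the command path are hypothetical.

```python
import hashlib

# Prime minutes, so jobs never pile up on :00, :15, :30 or :45.
PRIME_MINUTES = [7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59]


def pick_start(job_name, hours=(1, 2, 3, 4)):
    """Derive a stable (hour, minute) start for a daily job from its name."""
    digest = hashlib.sha256(job_name.encode("utf-8")).digest()
    hour = hours[digest[0] % len(hours)]
    minute = PRIME_MINUTES[digest[1] % len(PRIME_MINUTES)]
    return hour, minute


def cron_line(job_name, command):
    """Format a standard five-field crontab entry for a daily run."""
    hour, minute = pick_start(job_name)
    return f"{minute} {hour} * * * {command}"


# Hypothetical job name and command path, for illustration only.
print(cron_line("rotate-logs", "/usr/local/bin/rotate-logs"))
```

Spreading the hours as well as the minutes keeps several nightly jobs from competing for the same CPU at once.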

I looked at other systems and parameters to see if the issue was widespread and have not found a conspiracy. One node showed a spike in one CPU metric.

CPU system time – reference

 

Zooming in:

 

CPU system time – zoomed in

 

This looked like a normal “housekeeping” job impact, where file systems were scanned, data digested and indexed, and maybe some uploads/downloads of audit bits. Not a concern since the background level returned to the “before” range.

MORE DATA

 

When I killed the runaway process, I could see the system CPU return to the baseline (just before 2:40 PM local time):

System CPU with a twist

 

I thought this case was wrapped up when I viewed the temperature reading subsequent to the process cleanup (by 2:56 PM):

Temperatures begin to coalesce

 

Then I looked at the internal and external values once more.

 

Temperature values, with a spike

 

Oh dear, invalid testimony. Clean that data spike out of the archives (harder in the summer since 89 degrees is possible, where in the winter it would have been a dead (heat) giveaway).
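
Because a summer spike can still be a plausible temperature, a fixed ceiling will not catch it; comparing each sample to its neighbors can. The sketch below uses a simple local-median check in Python. This is a swapped-in technique for illustration, not necessarily how the archive was actually cleaned, and the readings, window and threshold are made up.

```python
from statistics import median


def find_spikes(values, window=5, max_jump=5.0):
    """Return indexes whose value differs from the local median by more than max_jump."""
    half = window // 2
    spikes = []
    for i, value in enumerate(values):
        neighbours = values[max(0, i - half): i + half + 1]
        if abs(value - median(neighbours)) > max_jump:
            spikes.append(i)
    return spikes


# Made-up readings in degrees F, for illustration only.
readings = [78.2, 78.4, 78.3, 89.0, 78.5, 78.6, 78.4]
bad = find_spikes(readings)
cleaned = [v for i, v in enumerate(readings) if i not in bad]
print(bad, cleaned)
```

A genuine hot afternoon raises the neighboring samples too, so it passes the check, while a lone 89 among readings in the high 70s does not.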

Fixed it:

Temperature ranges, no spike

 

Both sources stayed in sync from the process cleanup until the following midnight. As expected, on warm days the interior temperature drops more slowly than the external one. For my purposes, the values to be recorded have stabilized. Or had. The problem recurred!

 

Process-caused heat increase

As mentioned earlier, I don’t yet know why this process keeps running, nor the best solution, given the futility of a fix that lasts less than 24 hours.

 

Lessons learned

Calibrate. Check your reference sources. Eat your Wheaties.

Calibrate again.

 

 

 

Next steps

Clean this puppy up. Pull the freely available external data feeds more often than the current hourly schedule.

 

References

These are U.S. sites except for wmo.int, Adafruit and PkgSrc. I assume you can find local weather stations with little effort in most places.

 

noaa.gov

weather.gov

https://www.aviationweather.gov/metar

https://weather.gov/xml/current_obs/

https://www.ncei.noaa.gov/products/world-weather-records

public.wmo.int/en

https://learn.adafruit.com/calibrating-sensors/multi-point-curve-fitting

https://pkgsrc.se/sysutils/logrotate

 

https://tgftp.nws.noaa.gov/weather/current/

https://tgftp.nws.noaa.gov/ftpmail.txt

Example: 
To obtain the Atlantic high seas Forecast, WMO header FZNT01 KWBC,
AWIPS header HSFAT1


Send an e-mail to:      NWS.FTPMail.OPS@noaa.gov
Subject Line:           Put anything you like
Body:                   open
                        cd data
                        cd raw
                        cd fz
                        get fznt01.kwbc.hsf.at1.txt
                        quit
