The ABAP Detective Takes The Summer Heat
Previously, my detective work into measuring ambient environmental data led me to the clues presented in the winter-time post “The ABAP Detective Takes The-Heat … Cleanse/”, and then to the spring time-change post “The ABAP Detective Gets Their Clock Cleaned.”
As the sunrise got earlier and the days became longer, solstice kicked in. When I checked a few gauges, I thought the reported temperature values were too high. I could glance at a wall thermometer or HVAC (heating/ventilating/air conditioning) controls and see a discrepancy. Any time the numbers don’t look right, it’s like the spider sense kicks in. And then the hypotheses begin, even before enough data to draw logical conclusions has been captured.
My first take was to pull out the old spiral notebook and note data manually, as Columbo might do, with pencil and paper. A little of that goes a long way, as any hacker will try to automate a repetitive task after the 3rd or 4th manual effort. Looking around at available reference gauges, nothing nearby was automated, including, especially, the house thermostat. Ripping that device out to get a smarter unit would be an expense and a possible commitment to a third party big tech host, and I decided that could wait.
Digging into the morgue (old clippings, not bodies) I found references to federal sites that publish environmental data, particularly weather related. Links below. To cut to the chase, I set up a recurring job to pull in current weather conditions, then load them into the same monitoring system as the CPU and environment gauges “in house”.
Here’s an example of the supplied parameters and values:
Baltimore / Martin, MD, United States (KMTN) 39-20N 076-25W
Jun 27, 2022 - 04:55 PM EDT / 2022.06.27 2055 UTC
Wind: from the WNW (300 degrees) at 9 MPH (8 KT):0
Visibility: 10 mile(s):0
Sky conditions: clear
Temperature: 80 F (27 C)
Dew Point: 66 F (19 C)
Relative Humidity: 61%
Pressure (altimeter): 29.96 in. Hg (1014 hPa)
ob: KMTN 272055Z 30008KT 10SM CLR 27/19 A2996
cycle: 21
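For anyone who wants to follow the same trail, here is a minimal Python sketch of pulling the numbers out of a report like the one above. It is not the job I actually run; the function name, the dictionary keys, and the regular expressions are assumptions based only on this one sample, and it assumes the feed delivers one field per line.

```python
import re

# Sample copied from the report above, one field per line (an assumption
# about how the feed is laid out).
SAMPLE = """\
Baltimore / Martin, MD, United States (KMTN) 39-20N 076-25W
Jun 27, 2022 - 04:55 PM EDT / 2022.06.27 2055 UTC
Wind: from the WNW (300 degrees) at 9 MPH (8 KT):0
Visibility: 10 mile(s):0
Sky conditions: clear
Temperature: 80 F (27 C)
Dew Point: 66 F (19 C)
Relative Humidity: 61%
Pressure (altimeter): 29.96 in. Hg (1014 hPa)
ob: KMTN 272055Z 30008KT 10SM CLR 27/19 A2996
cycle: 21
"""

# Matches lines such as "Temperature: 80 F (27 C)"; negative values allowed.
TEMP_RE = re.compile(
    r"^(Temperature|Dew Point):\s*(-?\d+(?:\.\d+)?) F \((-?\d+(?:\.\d+)?) C\)"
)
HUMIDITY_RE = re.compile(r"^Relative Humidity:\s*(\d+)%")


def parse_report(text):
    """Return the numeric fields found; fields that are missing are left out."""
    out = {}
    for line in text.splitlines():
        m = TEMP_RE.match(line)
        if m:
            key = m.group(1).lower().replace(" ", "_")  # hypothetical key names
            out[key + "_f"] = float(m.group(2))
            out[key + "_c"] = float(m.group(3))
            continue
        m = HUMIDITY_RE.match(line)
        if m:
            out["humidity_pct"] = int(m.group(1))
    return out


reading = parse_report(SAMPLE)
print(reading)
```

Leaving missing fields out of the result, rather than guessing a default, makes it easier to spot an hour where the feed came back short.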
For a future refinement I’d probably clean up the logic to verify anything being captured and experiment with more frequent updates. It looks like the feed site is updated hourly, and that was the easiest to schedule. There are most likely XML or JSON feeds as well. I maximized the capture by including both Fahrenheit and Celsius in the local archives, as the data volume is minimal at 24 samples per day. The charts below show degrees F since that’s the custom here. Hopefully this won’t ruin the plot for non-Americans (not Un-Americans).
Once I had enough records to compare a nearby ambient environmental record with the home temperature sensors, I could determine whether the previously calculated factor, which masks out the effect of the monitoring system’s CPU heat on the too-close sensor device, needed to be adjusted (see link below for a curve-fitting topic).
You can tell the ambient reading (in green, and below the other 2 values) is from the government data as the values are integers; average is system calculated. To make the diagnostic more challenging, the delta between inside and outside is not fixed, even with the known state of not using air-conditioning, since there’s a lag in daytime heating just as there is in nighttime cooling. However, I can glance at the other inside thermometers for a sanity check (oh, it’s 85 outside but 83 inside, for example). The chart above shows a 5-10 degree F spread (roughly 3-6 C?). So I came up with a working hypothesis for how much to adjust the previously set offset.
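To settle the question mark on that Celsius guess: a temperature difference converts with the 5/9 scale factor only, without the 32-degree shift used for absolute readings. A small Python sketch, with function names of my own choosing:

```python
def delta_f_to_c(delta_f):
    """Convert a temperature *difference* from Fahrenheit to Celsius degrees."""
    return delta_f * 5.0 / 9.0


def f_to_c(temp_f):
    """Convert an absolute Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# The 5-10 degree F spread read off the chart.
for spread_f in (5, 10):
    print(f"{spread_f} F spread is about {delta_f_to_c(spread_f):.1f} C")

# Cross-check against the sample report, which lists 80 F as 27 C.
print(round(f_to_c(80)))
```

That works out to roughly 2.8 to 5.6 C, so the 3-6 C estimate holds up.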
Good thing I waited and looked further though. I had spotted a runaway system process more than once, which looked like this, on the sensor node:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
12244 user      39  19  480524  34460    672 R  99.7   7.9  20644:16 logrotate
26807 user      20   0   11352   2900   2488 R   1.0   0.7   0:00.09 top
   11 user      20   0       0      0      0 S   0.3   0.0  13:33.69 process
22895 user      20   0       0      0      0 I   0.3   0.0   0:01.44 process
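Spotting this by eye works once or twice; after that it is a candidate for automation. Below is a hedged Python sketch that scans a captured listing like the one above and flags anything over a CPU threshold. The function name and the 90% threshold are assumptions for illustration, and it expects the column header row to be the first line of the text.

```python
TOP_OUTPUT = """\
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
12244 user      39  19  480524  34460    672 R  99.7   7.9  20644:16 logrotate
26807 user      20   0   11352   2900   2488 R   1.0   0.7   0:00.09 top
   11 user      20   0       0      0      0 S   0.3   0.0  13:33.69 process
22895 user      20   0       0      0      0 I   0.3   0.0   0:01.44 process
"""


def find_runaways(top_text, cpu_threshold=90.0):
    """Return (pid, command, %cpu, time+) for rows at or above the threshold."""
    lines = top_text.strip().splitlines()
    header = lines[0].split()
    pid_col = header.index("PID")
    cpu_col = header.index("%CPU")
    time_col = header.index("TIME+")
    cmd_col = header.index("COMMAND")
    hits = []
    for line in lines[1:]:
        # Limit the split so a command containing spaces stays in one field.
        fields = line.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # skip blank or truncated rows
        cpu = float(fields[cpu_col])
        if cpu >= cpu_threshold:
            hits.append((int(fields[pid_col]), fields[cmd_col], cpu, fields[time_col]))
    return hits


suspects = find_runaways(TOP_OUTPUT)
print(suspects)
```

The TIME+ value is passed through as text rather than converted, since its layout changes as the accumulated time grows.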
Why is logrotate still running? That is still an open case, but the recognition allowed me to make progress in this investigation. Maybe high CPU usage, even on just one core, could generate some heat.
Looks highly suspicious. Timing just after midnight is rookie scheduler work. Always randomize starting times; use prime numbers. Distribute for niceness.
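One way to act on that advice, sketched in Python: derive a stable start minute for each job from a hash of the host and job name, choose it from the prime minutes of the hour, and run the command under `nice`. The helper names, the host and job labels, and the script path are all hypothetical; only the five-field cron layout and `nice -n` are standard.

```python
import hashlib

# Prime minutes within an hour: 2, 3, 5, ... 59.
PRIME_MINUTES = [
    m for m in range(2, 60)
    if all(m % d for d in range(2, int(m ** 0.5) + 1))
]


def staggered_minute(job_name, host_name):
    """Pick a stable, pseudo-random prime minute for a job on a given host."""
    digest = hashlib.sha256(f"{host_name}:{job_name}".encode()).digest()
    return PRIME_MINUTES[int.from_bytes(digest[:4], "big") % len(PRIME_MINUTES)]


def cron_line(job_name, host_name, command, hour="*"):
    """Build a crontab entry that runs the command at low priority."""
    minute = staggered_minute(job_name, host_name)
    return f"{minute} {hour} * * * nice -n 19 {command}"


# Hypothetical job, host, and script path, for illustration only.
line = cron_line("weather-fetch", "sensor-node", "/usr/local/bin/fetch_weather.sh")
print(line)
```

Hashing instead of calling a random number generator keeps the schedule the same each time the crontab is regenerated, while still spreading jobs on different hosts away from the top of the hour.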
I looked at other systems and parameters to see if the issue was widespread and have not found a conspiracy. One node showed a spike in one CPU metric.
This looked like a normal “housekeeping” job impact, where file systems were scanned, data digested and indexed, and maybe some uploads/downloads of audit bits. Not a concern since the background level returned to the “before” range.
When I killed the runaway process, I could see the system CPU return to the baseline (just before 2:40 PM local time):
I thought this case was wrapped up when I viewed the temperature reading subsequent to the process cleanup (by 2:56 PM):
Then I looked at the internal and external values once more.
Oh dear, invalid testimony. Clean that data spike out of the archives (harder in the summer since 89 degrees is possible, where in the winter it would have been a dead (heat) giveaway).
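Because a summer spike can land inside the plausible range, a fixed upper limit will not catch it; comparing each sample with its neighbors has a better chance. Here is a minimal Python sketch of that idea using a running median. The sample readings, window size, and jump limit are made up for illustration and are not taken from my archives.

```python
from statistics import median


def remove_spikes(values, window=5, max_jump=4.0):
    """Replace samples that differ from the local median by more than max_jump."""
    half = window // 2
    cleaned = list(values)
    for i, v in enumerate(values):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        local = median(values[lo:hi])  # median of the neighborhood, sample included
        if abs(v - local) > max_jump:
            cleaned[i] = local
    return cleaned


# Hypothetical hourly readings in degrees F with one suspicious sample.
readings_f = [82.0, 82.5, 83.0, 89.0, 83.0, 82.5, 82.0]
cleaned_f = remove_spikes(readings_f)
print(cleaned_f)
```

The median is used rather than the mean so the spike itself does not drag the reference value upward. Whether to overwrite the sample or merely flag it is a separate decision; this sketch overwrites.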
Both sources stayed in synch from the process cleanup until the following midnight. As expected, on warm days the interior temperature drop is slower than external. For my purposes, the values to be recorded have stabilized. Or had. The problem recurred!
As mentioned earlier, I don’t know why this is running yet, nor the best solution given the futility of a fix lasting under 24 hours max.
Calibrate. Check your reference sources. Eat your wheaties.
Clean this puppy up. Increase the sampling frequency of the freely available external data feeds, currently hourly.
These are U.S. sites except for wmo.int, Adafruit and PkgSrc. I assume you can find local weather stations with little effort in most places.
Example: To obtain the Atlantic high seas Forecast, WMO header FZNT01 KWBC, AWIPS header HSFAT1
Send an e-mail to: NWS.FTPMail.OPS@noaa.gov
Subject Line: Put anything you like
Body:
open
cd data
cd raw
cd fz
get fznt01.kwbc.hsf.at1.txt
quit