
Re: smart fans



On 9/3/21 3:05 PM, Emanuel Berg wrote:
David Christensen wrote:

Feeding the raw data into LibreOffice Calc was problematic

<snip>

OK, changed.


Your data format still has issues.  This is what LibreOffice Calc wants:

time,governor,processes,CPU_temperature,system_load,CPU_fan_speed,core1_freq,core2_freq,core3_freq,core4_freq
1630072652,conservative,157,33.25,0,0,1685.535,1416.164,3029.758,1415.421
1630072712,conservative,157,33.25,0,0,1537.245,1402.528,3118.173,1286.09
1630072772,conservative,157,33.25,0.06,0,1426.077,1356.078,1358.613,3049.18


Note:

1.  No extraneous characters anywhere -- e.g. spaces.

2.  No spaces in column names; use underscores and abbreviations.

3.  One column name for each and every column.

4.  Number of columns is same in every row.

5.  Commas (field separator) between every row item.

6.  Newlines (record separator) at the end of every row.
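
One way to check a file against the notes above before importing it is a small awk filter. This is only a sketch: the name "check_csv" is made up here, and it assumes plain unquoted fields with the column names on the first line.

```shell
# check_csv FILE: report lines that break the format rules above.
# Hypothetical helper; assumes unquoted fields and a header on line 1.
check_csv() {
    awk -F, '
        NR == 1    { want = NF }   # the header row sets the column count
        / /        { printf "line %d: contains a space\n", NR; bad = 1 }
        NF != want { printf "line %d: %d fields, expected %d\n", NR, NF, want; bad = 1 }
        END        { exit bad + 0 }   # non-zero exit status if any line failed
    ' "$1"
}
```

Calling it as "check_csv FILE && echo OK" prints one message per offending line and only prints OK when every row has the same number of fields as the header and no spaces.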


The system *must* be otherwise idle before and during the
test. The data indicates it was not.

I didn't use the computer and there was no download or
anything explicit run by me in the background, so I don't know
what more to do in that regard?


Then the "cpu-stats" function must be causing the spikes in the "core*_freq" values, by the several child processes it creates in rapid succession. Get rid of the useless CPU_fan_speed call. Insert a 1 second sleep ahead of each of the remaining data point collection calls. Adjust the iteration counts and the sleep times in the calling loops to compensate.


<snip>

There is supposed to be a cool-down delay at the start of
each governor sub-run. My WAG was 3 minutes. Your test did 2
minutes (off by 1 error).

Weird, if you mean this part of the pseudocode

   loop 3 times
     sleep 60 seconds
     print statistics
   endloop

then that's this part of the zsh code

   repeat 3 {
     sleep 60
     cpu-stats
   }

so I don't know why that isn't 3 minutes ... ?


I was off-by-one -- there is no "print statistics" at the top of the script, which means the first data row comes out after 60 seconds. Please add a 10 second sleep and a call to "cpu-stats" prior to the governor loop.


But, a fundamental reality is the Nyquist–Shannon sampling theorem:

    https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem


Each of the data points your script is collecting has some signal processing behind it; notably a sampling period. Please RTFM and see if you can determine the periods of the CPU temperature sensor readings and the core frequency readings, and if you can adjust those periods so that they are longer than your data collection period.


It might be best to rework the loops so that the data collection period is constant throughout the entire test.
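
A minimal sketch of such a loop, assuming a "date" that supports +%s (GNU and BSD both do). The name "sample_loop" is made up here; "cpu-stats" would be passed in as the command. It takes the first reading immediately and schedules each later reading against the clock, so the time the command itself takes does not stretch the period (unless the command runs longer than the period).

```shell
# sample_loop COUNT PERIOD CMD...: run CMD COUNT times, PERIOD seconds apart.
# Hypothetical helper; the first run happens immediately.
sample_loop() {
    count=$1 period=$2
    shift 2
    next=$(date +%s)              # time at which the next sample is due
    i=0
    while [ "$i" -lt "$count" ]; do
        now=$(date +%s)
        if [ "$next" -gt "$now" ]; then
            sleep "$((next - now))"   # wait out the rest of the period
        fi
        "$@"                      # take the sample
        next=$((next + period))
        i=$((i + 1))
    done
}
```

With this, "sample_loop 4 60 cpu-stats" would print rows at roughly 0, 60, 120, and 180 seconds. The timing is only as good as one-second clock resolution allows.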


After each loading cycle, the load processes must be killed.

OK.


Looking at the "processes", "system_load", and "core*_freq" fields, the overall test data indicates that the "conservative" sub-run loaded the machine to 100% and that the loading processes never were killed. The second and subsequent governor sub-runs just added more load. Debug the code related to "pids" and kill.
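
For comparison, here is a sketch of the usual start/record/kill pattern. The names "start_load" and "stop_load" are made up, and "sleep 300" only stands in for the real load command, which I have not seen. It is written for sh/bash: by default zsh does not split an unquoted $pids into words, so there you would need ${=pids} or an array. That may be worth checking in your script.

```shell
# Start background load processes, remember their PIDs, and stop them all.
# Hypothetical helpers; "sleep 300" is a placeholder for the real load.
pids=""

start_load() {
    n=$1
    while [ "$n" -gt 0 ]; do
        sleep 300 &
        pids="$pids $!"           # $! is the PID of the last background job
        n=$((n - 1))
    done
}

stop_load() {
    for pid in $pids; do          # relies on sh/bash word splitting
        kill "$pid" 2>/dev/null || true   # ignore processes already gone
    done
    wait $pids 2>/dev/null || true        # reap them before moving on
    pids=""
}
```

Calling "stop_load" at the end of each governor sub-run, and then checking the "processes" column, should show the count dropping back to the idle value before the next sub-run starts.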


<snip>

Have you isolated and identified the primary noise
source(s)? Have you quantified them (e.g. sound
measurement)?

No?


If the loudest sound source is the PSU or GPU fan, then tuning the CPU and/or chassis fans isn't going to help.


<snip>


David

