Re: CPU temp
From: Floyd L. Davidson (floyd_at_barrow.com)
Date: 10/29/04
- Next message: Kamus of Kadizhar: "Re: how to join the gnome session from windows through VNC?"
- Previous message: Tobias Skytte: "Re: What Can Linux do that Windows can not FOR ME?"
- In reply to: Jean-David Beyer: "Re: CPU temp"
- Next in thread: Jean-David Beyer: "Re: CPU temp -- LONG"
- Reply: Jean-David Beyer: "Re: CPU temp -- LONG"
Date: Fri, 29 Oct 2004 10:41:46 -0800
Jean-David Beyer <jdbeyer@exit109.com> wrote:
>I started using lm_sensors a few months ago on my new (March
>2004) machine with two 3.06GHz Xeon processors (the ones with
>1Megabyte L3 caches).
>
>The box it is in has a lot of fans. In the front is an air
>filter and there are three intake fans, the 25mm thick
>ones. These are all constant speed fans. The main one is 120mm,
>the second one is 80mm, and I added a 40mm (only size that would
>fit) that blows right at the top CPU fan intake. None of these
>have tachometers that the lm-sensors package can read (though
>the BIOS can read some of them). lm-sensors reads the CPU fans
>and temperatures, and "System" temperature.
If the BIOS can read them, so can lm_sensors. Of course the
trick to that is figuring out how! (Below is first a generic
discussion, followed by comments specific to your described
hardware.)
First, look in /sys/bus/{i2c | isa}/drivers, to find essentially
a list of drivers which have been correctly loaded to supply
information that lm_sensors can read. (Assuming a 2.6 kernel
and the sysfs pseudo filesystem is mounted.)
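
As a rough illustration of that first step, here is a minimal
Python sketch. It assumes the 2.6 sysfs layout described above;
the function name list_sensor_drivers and its sysfs_root
parameter are invented for this example and are not part of
lm_sensors.

```python
import os

def list_sensor_drivers(sysfs_root="/sys", buses=("i2c", "isa")):
    """Return {bus: sorted driver names} for each bus that has a drivers directory."""
    found = {}
    for bus in buses:
        drivers_dir = os.path.join(sysfs_root, "bus", bus, "drivers")
        # A bus with no drivers directory is simply skipped.
        if os.path.isdir(drivers_dir):
            found[bus] = sorted(os.listdir(drivers_dir))
    return found

if __name__ == "__main__":
    for bus, drivers in list_sensor_drivers().items():
        print(bus + ": " + (", ".join(drivers) or "(no drivers loaded)"))
```

An empty result, or a bus with no entries, would suggest that no
monitoring driver has been loaded for that bus yet.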
At least one driver must be loaded to provide a connection to
at least one hardware monitor chip on the motherboard. Then
/etc/sensors.conf is configured to use the data made available
by the driver(s). So two questions come up:
1) What hardware monitoring chips are available, and
2) How to configure /etc/sensors.conf to see them.
Neither of those questions is necessarily as straightforward as
it may seem!
For example, if that motherboard uses any of the Winbond chips,
you might have the w83781d module loaded, and there will be a
w83781d directory in /sys/bus/i2c/drivers. And in that
directory (if and only if the module is correctly configured
when it is loaded) will be symlinks to one or more other
directories. These will be named according to the i2c bus
address, as different functions on the same chip may use
different addresses. There might be multiple different chips
too!
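
To make that concrete, the following Python sketch walks one
driver directory and follows each symlink to the directory
holding the data files. It is only a sketch under the sysfs
layout described above: the helper name list_bound_devices is
invented here, and the driver path in the example is a
placeholder to be replaced with whatever exists on the machine.

```python
import os

def list_bound_devices(driver_dir):
    """Return {symlink name: resolved target} for each symlink in driver_dir."""
    devices = {}
    for name in sorted(os.listdir(driver_dir)):
        path = os.path.join(driver_dir, name)
        # Only symlinks point at device directories; skip anything else.
        if os.path.islink(path):
            devices[name] = os.path.realpath(path)
    return devices

if __name__ == "__main__":
    # Placeholder path: substitute a driver directory present on your system.
    driver_dir = "/sys/bus/i2c/drivers/w83781d"
    if os.path.isdir(driver_dir):
        for name, target in list_bound_devices(driver_dir).items():
            print(name, "->", target)
            if os.path.isdir(target):
                print("  data files:", ", ".join(sorted(os.listdir(target))))
```

If the driver directory exists but the function returns nothing,
the module loaded without finding a chip it could attach to.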
The problem then is to correlate the data files under each
directory with a specific measurement on the motherboard. The
*only* accurate way that can be done is with information from
the board manufacturer. But of course a few educated guesses
can be useful too!
Hence, if a given setup for lm_sensors is only measuring two
fans and two temperatures, but there are data files for four of
each and the BIOS is reporting four... the right configuration
for modules and for /etc/sensors.conf will provide more
information.
That of course assumes the process the lm_sensors package
recommends for discovery has worked, and that the correct
drivers are loaded and able to detect all of the hardware
monitoring chips. That is *not* guaranteed to happen though! I
have a couple of Tyan S2462 SMP boards using AMD Athlon chips.
The documentation clearly states that it has a W83782D hardware
monitor chip *and* a W83627HF Super I/O chip that can also do
hardware monitoring. The BIOS reads all sorts of things, yet
lm_sensors does not normally detect the W83627HF chip! (When
the board first came out there was no information available
indicating which data was what, and everyone using lm_sensors
had the temperature probes labeled wrong. Later Tyan released
the information and made life easier.)
It turns out that if the BIOS is used to look at the hardware
monitor during the boot process the W83627HF hardware monitor
sections of the chip are enabled and initialized, and then
lm_sensors can see it. But a power cycle will make it disappear
again.
I had to write a utility to initialize the W83627HF chip
at boot time.
http://web.newsguy.com/floyd_davidson/code/sensors/w83627hf/
That allows lm_sensors to report everything the BIOS does (and
more) on a Tyan S2462 motherboard.
Of course I've also discovered that there are differences
between an early version of the S2462 and a later version.
A couple of the voltage probes provide different raw data
for the same voltage, and the ability to control fan speed
is not there for some fans on the early board.
>I have had two "temperature events". At one point, the
>temperature of the "System" slowly went up about 5 degrees
>C. Since the room temperature had gotten up to about 90F, I
[interesting descriptions snipped]
I found that notable because of the 5 degrees. The normal range
of variation can apparently be nearly that much, just from the
week-to-week variations of the probes and of the chip/circuit
used to monitor them.
What I'm doing is plotting graphs of all voltages and
temperatures. Under the directory "tellerstats" in the
lm_sensors distribution is an example using gnuplot to generate
a web page. I've modified that and enhanced it significantly,
and the results, plus a link to the scripts that produced the
graphs, are shown at this URL:
http://web.newsguy.com/floyd_davidson/sensors/
I'm generating 20-some graphs every 10 minutes using 2-minute
sampling. Each graph shows 48 hours of history.
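
For a sense of scale, 48 hours at one sample every 2 minutes is
1440 points per graph. The Python sketch below shows that
arithmetic with a fixed-length rolling history. It is not the
gnuplot-based scripts linked above, only an illustration; the
names make_history and record are invented for it.

```python
from collections import deque

SAMPLE_INTERVAL_MIN = 2   # one reading every 2 minutes
HISTORY_HOURS = 48        # each graph covers 48 hours

# 48 h * 60 min/h / 2 min per sample = 1440 samples per graph
SAMPLES_PER_GRAPH = HISTORY_HOURS * 60 // SAMPLE_INTERVAL_MIN

def make_history():
    """Return an empty buffer that keeps only the newest 48 hours of samples."""
    return deque(maxlen=SAMPLES_PER_GRAPH)

def record(history, timestamp, value):
    """Append one (timestamp, value) sample; the oldest is dropped when full."""
    history.append((timestamp, value))
```

Each graph would get one such buffer, and every 10 minutes the
plotting pass would draw whatever the buffers hold at that point.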
>Fri Oct 29 11:00:00 EDT 2004
>w83627hf-isa-0290
>VCore 1: +1.45 V (min = +1.36 V, max = +1.47 V)
>VCore 2: +3.31 V (min = +3.13 V, max = +3.45 V)
>+3.3V: +3.23 V (min = +3.20 V, max = +3.45 V)
>+5V: +4.97 V (min = +4.84 V, max = +5.24 V)
>+12V: +11.97 V (min = +11.48 V, max = +12.58 V)
>-12V: -11.83 V (min = -13.06 V, max = -11.41 V)
>V5SB: +5.43 V (min = +4.84 V, max = +5.24 V)
>VBat: +3.23 V (min = +2.40 V, max = +3.60 V)
>CPU0 fan: 2909 RPM (min = 1500 RPM, div = 2)
>CPU1 fan: 2250 RPM (min = 1500 RPM, div = 2)
>System: +40C (limit = +45C, hysteresis = +42C) sensor = thermistor
>CPU0: +52.0C (limit = +60C, hysteresis = +58C) sensor = thermistor
>CPU1: +51.0C (limit = +60C, hysteresis = +58C) sensor = thermistor
>vid: +1.400 V
That looks pretty good. I'd recommend changing the fan divisor
to 8 though. You'll get better resolution, particularly at
lower speeds.
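
Checking each reading against its limits by eye is easy to get
wrong, so here is a small Python sketch that does it for voltage
lines in the format quoted above. The regular expression is an
assumption based only on that quoted output, and the function
name out_of_range is invented for the example; other versions of
the sensors program may print a different layout.

```python
import re

# Matches lines such as "+5V: +4.97 V (min = +4.84 V, max = +5.24 V)".
NUM = r"[+-]?\d+(?:\.\d+)?"
VOLT_LINE = re.compile(
    r"^(?P<label>[^:]+):\s*(?P<value>" + NUM + r")\s*V\s*"
    r"\(min\s*=\s*(?P<lo>" + NUM + r")\s*V,\s*"
    r"max\s*=\s*(?P<hi>" + NUM + r")\s*V\)"
)

def out_of_range(text):
    """Return [(label, value, lo, hi)] for voltage readings outside min..max."""
    flagged = []
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()   # tolerate quoted ">" prefixes
        m = VOLT_LINE.match(line)
        if not m:
            continue                      # fans, temperatures, headers, etc.
        value, lo, hi = (float(m.group(k)) for k in ("value", "lo", "hi"))
        if value < lo or value > hi:
            flagged.append((m.group("label").strip(), value, lo, hi))
    return flagged

SAMPLE = """\
VCore 1: +1.45 V (min = +1.36 V, max = +1.47 V)
+5V: +4.97 V (min = +4.84 V, max = +5.24 V)
-12V: -11.83 V (min = -13.06 V, max = -11.41 V)
V5SB: +5.43 V (min = +4.84 V, max = +5.24 V)
VBat: +3.23 V (min = +2.40 V, max = +3.60 V)
"""

if __name__ == "__main__":
    for label, value, lo, hi in out_of_range(SAMPLE):
        print(label, value, "is outside", lo, "to", hi)
```

Run on the quoted readings, this flags only V5SB (+5.43 V against
a maximum of +5.24 V). The output alone does not say whether that
reflects the hardware or just the limits that were configured.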
It appears that you are loading the w83627hf module, and thus
using the LPC (ISA) bus to access the W83627HF chip. Is that
necessary, or is it also connected to the i2c bus? The w83781d
module will provide W83627HF data via the i2c bus. I would
suspect that if your BIOS is seeing more than lm_sensors you
might find that there is either a second W83627HF Super I/O chip
or more likely an additional dedicated hardware monitor chip
like the W83782D (i.e., the same arrangement Tyan boards use).
In any case, accessing via the i2c bus is probably preferred,
even though accessing it can be more complicated (especially
if there are multiple chips to access).
>The processors do not seem to go over 55C even when the fan
>speeds double. What I think interesting is that the design of
>the fans is "wrong"; the temperature sensing thermistor is at
>the fan intake. It seems to me it would be smarter, though
>technically difficult, to measure the air temperature at the
>output of the processor wind tunnel, giving the fan some idea
>how hot the processors were. But they seem to do the right
>thing, and do it quite well, as they are.
I assume you are talking about the "System" temperature, as
opposed to the CPU temps. The fan speed should be controlled by
the CPU temperature. The exhaust temperature is of little
value...
>Another thing I notice is that the power required by modern
>chips (and by "modern" I mean Pentium IIIs and later) varies
>depending on the processing load. If I run 4 instances of
>setiathome on the two hyperthreaded Xeon processors, the power
>required is noticeably more (say 30% or more) than when I do not
>and the machine is essentially idling. And the temperature and
>processor fan speeds do go up.
Interesting! I'm not set up to monitor power, and maybe that
would be useful to add. Right now I have the onboard hardware
monitoring chips, plus a Crystalfontz CFA-633 temperature monitor
and fan controller device (a *great* toy!).
I had a related but totally different experience a couple days
ago when a friend asked me to take a look at his PC, which was
freezing up after just a couple minutes of operation. It has
a 1.7 GHz (his description, I didn't check) Celeron. It turned
out the bracket to hold the CPU heatsink/fan assembly in place
had one broken retainer, so the heatsink was lifted up in one
corner and not making physical contact with the CPU.
It was interesting to watch (I would never dare do this with my
own CPUs, but since it had been turned on and left on several
times for many minutes, I did exactly that with the BIOS
hardware monitor screen in place). The first indication that it
was too hot (other than being 104C... :-) was that the keyboard
ceased to work. The CAPS LOCK, SCROLL and NUMLOCK keys would
not toggle the LEDs on and off. At that point it stopped
getting warmer too.
We took the board out, removed the mounting bracket for the
heatsink, drilled a couple holes in it and mounted a new "catch"
for the heatsink by bending a 1" brad to fit the holes and be in
the right place. A bit of super glue to make sure it stayed in
place, and it was back in business.
-- 
Floyd L. Davidson <http://web.newsguy.com/floyd_davidson>
Ukpeagvik (Barrow, Alaska) floyd@barrow.com