Re: Annoying system logging problem...

From: P.T. Breuer (ptb_at_oboe.it.uc3m.es)
Date: 08/07/04


Date: Sat, 7 Aug 2004 17:54:01 +0200

Fredderic <ciredderf@sumirpi.is_backwards_at.com.au> wrote:
> On Fri, 06 Aug 2004 19:54:42 +0200, P.T. Breuer wrote:
>
> >> > Fredderic <ciredderf@sumirpi.is_backwards_at.com.au> tried to express:
> >> >> Usually sometime after the system's been up a couple hours, the system
> >> >> logging seems to get all jammed up.
> >> >> The next program that tries to write to the system log simply freezes.
> > Meaning what? What is the process state recorded in the process table?
>
> Processes are sleeping... I can't remember exactly what sleep state
> though.

It's consistent with waiting in kernel for more memory.

>
> > What syscall is it in?
>
> How can I check the syscall? Would running su under strace tell me?

The wchan column of the ps output is what I wanted.
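
A minimal sketch of that check, assuming the procps ps found on most Linux
systems. The helper name blocked_only is made up for illustration; it keeps
the header plus any process whose state starts with D (uninterruptible
sleep), so the wchan column shows where in the kernel each one is waiting.

```shell
# Keep the header line plus any process whose STAT field begins with D
# (uninterruptible sleep, normally a wait inside the kernel).
# Reads `ps -eo pid,stat,wchan:30,comm` style output on stdin.
blocked_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Typical use (commented out so the snippet has no side effects):
#   ps -eo pid,stat,wchan:30,comm | blocked_only
```

Processes in plain S state will not show up with this filter; dropping the
filter and reading the whole ps listing is the fallback.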

> > And what state is the syslog program itself in.
>
> I've never actually thought to check... But I will.

Please do.

> Well, /var/log/messages, actually... Everything goes to messages. Any
> program trying to use the "standard" syslog facilities stalls, but the

Please be specific! I cannot guess what you mean.

> file's still perfectly writable.

> >> 1) The logging functions just fine for a few hours, sometimes a day or
> > What do you mean by "the logging"?
>
> "THE LOGGING"... How many system logging daemons are there? :)

I don't know if you mean syslogd or writing direct to a file. Or klogd,
come to that.

> >> two. I have yet to figure out any kind of pattern, because I don't
> >> notice that it's happened until I notice various daemons have died, or
> >> until I
> > Daemons have died? What do you mean? Are they in the process table or
> > not?
>
> Stalled, as I mentioned previously in the post (or was it the previous

Stalled how? What state?

> post?). From the point of view of using it, it seems pretty dead.
>
>
> >> try to su to root for some reason.
> > ??
>
> Run the su command, it just stops. Can't Ctrl-C, suspend with Ctrl-Z.
> Didn't try SIGSTOP, though SIGKILL worked, I think. Though don't quote me
> on that last one...

You'll have to strace it.
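
A minimal sketch of tracing it, assuming strace is installed and a root
shell is already open before the jam (a setuid program such as su loses its
privileges when traced by an ordinary user). The helper name last_syscall,
the trace path /tmp/su.trace and the account someuser are made up for
illustration.

```shell
# Show the last line of a strace output file. For a hung process this
# is usually the syscall that never returned.
last_syscall() {
    tail -n 1 "$1"
}

# Typical use (commented out so the snippet has no side effects):
#   strace -f -tt -o /tmp/su.trace su - someuser
#   last_syscall /tmp/su.trace
```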

> >> 2) When it crams up, I can't log in as root, or anyone else. Shutdown
> > Sure you can.
>
> No, actually, I can't. Consoles won't log in as any user, "su" and "sudo"
> both stall, "login" isn't any better.

Login is trying to log, as you said, and is stalled against that.
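
One way to confirm that the logging path is what stalls, sketched under the
assumption that timeout(1) from coreutils and logger(1) are available. The
function name probe and the tag logprobe are made up for illustration. A
process stuck in uninterruptible sleep may not react to the signal that
timeout sends, in which case the probe itself will hang, which is also an
answer.

```shell
# Run a command with a time limit and report how it ended.
# Prints "ok" on success, "stalled" if the limit expired (timeout(1)
# exits with status 124 in that case), and "failed" otherwise.
probe() {
    limit=$1
    shift
    rc=0
    timeout "$limit" "$@" >/dev/null 2>&1 || rc=$?
    case $rc in
        0)   echo ok ;;
        124) echo stalled ;;
        *)   echo failed ;;
    esac
}

# Typical use (commented out so the snippet writes nothing to the log):
#   probe 5 logger -t logprobe "syslog probe"
```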

> >> fails, as does anything else that uses the syslog facility. An XTerm
> >> or a
> > There is a syslog syscall. Is that what you mean?
>
> Probably. I've been poking around some more since I wrote that post, and
> looked into syslog() and family... I'm guessing it starts there. I might
> roll a quick test program.
>
>
> >> console window makes absolutely no difference.
> > Difference to what?
>
> To being able to do diddly squat with any process that tries to use the
> syslog facilities.
>
>
> >> 3) I have tried just about every signal available from my usual
> >> non-root account. They all either can't be sent to a root process from
> >> a non-root user, or don't do anything useful.
> >> 4) It is fixed using SysRq-S (Emergency Sync). Now that sounds just a
> > So it's your kernel. Syncing sends all buffers to disk. You are not
> > aging buffers to disk, and your ram is full. You should be running the
> > bdflush daemon (or modern equivalent), and you aren't.
>
> : ps aux | grep flush
> root 5224 0.0 0.0 0 0 ? S Aug07 0:00 [pdflush]
> root 6244 0.0 0.0 0 0 ? S Aug07 0:00 [pdflush]

This looks bad! How can pdflush be swapped out? Oh well. I suppose it's
normal.

> Oh, and memory isn't full, I can even start other programs. Once, I

Don't presume - print the stats.
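
A minimal sketch of printing them, assuming a 2.6 kernel whose /proc/meminfo
carries the Dirty and Writeback fields. The function name mem_stats is made
up for illustration. A large Dirty figure means a lot of data is still
waiting to be written to disk.

```shell
# Pull the figures relevant to write-back out of /proc/meminfo-format
# text: free memory, page cache, dirty data, and data being written.
mem_stats() {
    grep -E '^(MemFree|Cached|Dirty|Writeback):' "${1:-/proc/meminfo}"
}

# Typical use (commented out so the snippet has no side effects):
#   mem_stats
```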

> > Anyway, it sounds as though you should also be interested in man
> > syslog.conf(5). You are writing sync to files, when clearly you want to
> > be writing async, since you have decided not to sync your files to disk
> > in general! At a guess I would say that you are running on a portable
> > with bdflush disabled, large age to disk time for buffers, and disks
> > spun down.
>
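
A minimal sketch of finding the sync-written files, assuming a
sysklogd-style /etc/syslog.conf, where a "-" in front of the file name
omits the sync after every message. The function name sync_targets is made
up for illustration.

```shell
# List the paths that a sysklogd-style syslog.conf writes to without
# a leading "-", i.e. the ones synced after every message.
# Comment lines are skipped; the action is taken to be the last field.
sync_targets() {
    awk '!/^[ \t]*#/ && NF >= 2 && $NF ~ /^\// { print $NF }' "${1:-/etc/syslog.conf}"
}

# Typical use (commented out so the snippet has no side effects):
#   sync_targets
```

Changing a reported /var/log/messages entry to -/var/log/messages and
making syslogd reread its configuration would switch that file to async
writes.
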
> Hmmm..... Could be something in this new pdflush thing... Man doesn't

It's a kernel thread, I suppose. Nothing to do with man.

> know squat about pdflush... Now how do I configure this thing.....?

Through the kernel /proc interface. Look for "*flush*", I think.
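
A minimal sketch of looking at them. On 2.4 kernels the file is
/proc/sys/vm/bdflush; on 2.6 kernels with pdflush the knobs should be the
dirty_* files in /proc/sys/vm (dirty_ratio, dirty_background_ratio,
dirty_expire_centisecs, dirty_writeback_centisecs). The function name
show_flush_knobs is made up for illustration.

```shell
# Print each write-back tunable found in a vm sysctl directory as
# "name = value". Only readable files named dirty_* are shown.
show_flush_knobs() {
    dir=${1:-/proc/sys/vm}
    for f in "$dir"/dirty_*; do
        [ -r "$f" ] || continue
        printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Typical use (commented out so the snippet has no side effects):
#   show_flush_knobs
```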

>
> The only one of my disks that ever gets to spin down is the one with the
> swap partition on it.

Aaaaargh.

> It usually makes me wait a few seconds when I
> unblank the screen (since all the apps have to scramble to redraw
> themselves).
>
>
> Anyhow... I'll take a look at a few of those things you mentioned, and
> see what pops up.

Peter


