Re: Repost: Bug with select?

From: Eli Barzilay (eli_at_barzilay.org)
Date: 07/26/03

  • Next message: Eli Barzilay: "Re: Repost: Bug with select?"
    Date:	Sat, 26 Jul 2003 10:25:17 -0400
    To: Marco Roeland <marco.roeland@xs4all.nl>, Ben Greear <greearb@candelatech.com>
    
    

    On Jul 25, Marco Roeland wrote:
    > > len = select(fd + 1, NULL, &writefds, NULL, NULL);
    >
    > A select with no timeout, so it will immediately return.

    The man page says:

           timeout is an upper bound on the amount of time elapsed
           before select returns. It may be zero, causing select to
           return immediately. (This is useful for polling.) If timeout
           is NULL (no timeout), select can block indefinitely.

    But I did (obviously) try adding one just in case -- the problem does
    not go away.
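
    The three timeout cases in the man page excerpt above (NULL, zero,
    and a finite upper bound) can be sketched as one small helper. This
    is only an illustration: wait_writable and its millisecond argument
    are hypothetical names, not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper (not from the original post).
 * timeout_ms < 0  : NULL timeout, select() may block indefinitely
 * timeout_ms == 0 : zero timeout, select() polls and returns at once
 * timeout_ms > 0  : upper bound on the time spent waiting
 * Returns 1 if fd is reported writable, 0 on timeout, -1 on error. */
int wait_writable(int fd, long timeout_ms)
{
    fd_set writefds;
    struct timeval tv;
    struct timeval *tvp = NULL;   /* NULL means "no timeout" */
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);   /* fd_set cannot hold larger fds */

    if (timeout_ms >= 0) {
        /* Filled in on every call: Linux select() may overwrite the
         * struct with the time that was left. */
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    FD_ZERO(&writefds);
    FD_SET(fd, &writefds);

    n = select(fd + 1, NULL, &writefds, NULL, tvp);
    if (n < 0)
        return -1;
    return FD_ISSET(fd, &writefds) ? 1 : 0;
}
```

    Calling wait_writable(1, -1) corresponds to the select call quoted
    above; wait_writable(1, 0) is the polling form.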

    > > if (!FD_ISSET(fd,&writefds)) exit(0);
    >
    > This might be what Solaris does differently, by _not_ including '1'
    > in the returned descriptors? Linux will say (rightly) that a
    > following call will not block, which is something very different
    > than 'will not fail'!

    I only added that while tracing the problem, after reading somewhere
    that FD_ISSET must be used... It never had any effect: the program
    never exits there, and it still busy-spins on Linux while running
    fine on Solaris.

    > > len = write(fd, "hi\n", 3);
    >
    > You don't check the exit status here, but when you press Ctrl-C
    > (stdout blocked) it will indicate an error here (exit status -1)
    > with errno set to EAGAIN, meaning you should try again, which is the
    > appropriate result for a non-blocking descriptor or socket
    > here. Anyway, the call "succeeds" and we loop back into the
    > while(1), indeed as you say creating a busy loop. No surprises
    > there I'd say.

    Uh, that's just a stripped down example -- in the original the
    returned value is checked and the write is retried if the result is
    EINTR. The problem is that AFAICT, select should wait until the fd is
    writable, but then write fails with EAGAIN, only to have the next
    select succeed as if there were no problem.

    > > }
    > > fcntl(fd, F_SETFL, flags);
    > > }
    >
    > You might start by checking for EAGAIN as result of the write, and
    > then reacting according to your needs (waiting a while or exiting
    > the program or whatever).

    Yeah, when the problem occurs, write will result in an EAGAIN, but
    the next select should block until writing is ok.
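
    The pattern being described (retry on EINTR, and on EAGAIN go back
    to select until the descriptor is writable) might look like the
    sketch below. The name write_fully is hypothetical, and the sketch
    assumes select blocks while the descriptor cannot accept data; in
    the terminal case discussed in this thread that assumption fails on
    Linux, so this loop would spin there as well.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes to a (possibly non-blocking) fd.
 * EINTR  -> retry the write immediately.
 * EAGAIN -> wait in select() (NULL timeout) until fd is writable, then retry.
 * Returns len on success, -1 on any other error (errno is left set). */
ssize_t write_fully(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    assert(fd >= 0 && fd < FD_SETSIZE);

    while (done < len) {
        fd_set writefds;
        ssize_t n = write(fd, buf + done, len - done);

        if (n >= 0) {                 /* full or partial write */
            done += (size_t)n;
            continue;
        }
        if (errno == EINTR)           /* interrupted by a signal */
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                /* real error */

        /* The descriptor is full: wait until select() reports it writable. */
        FD_ZERO(&writefds);
        FD_SET(fd, &writefds);
        if (select(fd + 1, NULL, &writefds, NULL, NULL) < 0 && errno != EINTR)
            return -1;
    }
    return (ssize_t)done;
}
```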

    When I played with this just now I saw another strange thing -- when
    there is a timeout in place, FD_ISSET *will* return 0 after some
    output was done (probably when it's waiting for output). So I thought
    that might be a good place to put a sleep, but the problem is that 0
    is not returned when the output is stopped.

    This is the program:
    ======================================================================
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <sys/select.h>
    #include <sys/time.h>
    int main() {
      int flags, fd, len; fd_set writefds;
      struct timeval timeout; timeout.tv_sec = 1; timeout.tv_usec = 0;
      fd = 1;
      flags = fcntl(fd, F_GETFL, 0);
      fcntl(fd, F_SETFL, flags | O_NONBLOCK);
      while (1) {
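        /* Linux select() may modify timeout, so reset it before each call */
        timeout.tv_sec = 1; timeout.tv_usec = 0;
        FD_ZERO(&writefds);
        FD_SET(fd, &writefds);
        len = select(fd + 1, NULL, &writefds, NULL, &timeout);
        if (len<0) exit(1);
        while (!FD_ISSET(fd,&writefds)) {
          sleep(1);
          timeout.tv_sec = 1; timeout.tv_usec = 0;
          FD_ZERO(&writefds);
          FD_SET(fd, &writefds);
          len = select(fd + 1, NULL, &writefds, NULL, &timeout);
          if (len<0) exit(1);
        }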
        }
        do {
          len = write(fd, "hi\n", 3);
        } while ((len == -1) && (errno == EINTR));
        if (len<0 && errno!=EAGAIN) exit(2); /* any error other than EAGAIN is fatal */
        /* if (len<0 && errno==EAGAIN) exit(3); */
      }
      fcntl(fd, F_SETFL, flags);
    }
    ======================================================================

    On Jul 25, Ben Greear wrote:
    > I thought select is supposed to tell you when you can read/write at
    > least something without failing. Otherwise it would be worthless
    > when doing non-blocking IO because you can both read and write w/out
    > blocking at all times.

    That was the point I was trying to make.
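
    For comparison, the expected behaviour can be observed on a pipe
    instead of a terminal. The sketch below uses hypothetical helper
    names and assumes Linux pipe semantics: it fills a non-blocking pipe
    until write fails with EAGAIN, at which point a zero-timeout select
    should stop reporting the write end as writable, and report it
    writable again once the pipe has been drained.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Poll fd for writability without blocking: 1 writable, 0 not, -1 error. */
int writable_now(int fd)
{
    fd_set writefds;
    struct timeval zero = { 0, 0 };   /* zero timeout: return immediately */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&writefds);
    FD_SET(fd, &writefds);
    if (select(fd + 1, NULL, &writefds, NULL, &zero) < 0)
        return -1;
    return FD_ISSET(fd, &writefds) ? 1 : 0;
}

/* Add O_NONBLOCK to the flags of fd, as the program in this post does. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Write to a non-blocking fd until EAGAIN; returns bytes written or -1. */
long fill_until_eagain(int fd)
{
    char chunk[512];
    long total = 0;

    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t n = write(fd, chunk, sizeof chunk);
        if (n > 0) { total += n; continue; }
        if (n < 0 && errno == EINTR) continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) return total;
        return -1;
    }
}

/* Read from a non-blocking fd until EAGAIN or EOF; returns bytes read or -1. */
long drain_until_eagain(int fd)
{
    char chunk[512];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, chunk, sizeof chunk);
        if (n > 0) { total += n; continue; }
        if (n == 0) return total;                      /* EOF */
        if (errno == EINTR) continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) return total;
        return -1;
    }
}
```

    With a pipe from pipe(2) and both ends made non-blocking,
    writable_now on the write end is expected to go from 1 to 0 after
    fill_until_eagain and back to 1 after drain_until_eagain on the
    read end.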

    On Jul 26, Marco Roeland wrote:
    > My 'analysis' was indeed based on experience with sockets, where you
    > don't get the busy spin. It's indeed a bit baffling why select keeps
    > insisting that fd 1 is writable. A quick test on kernel versions
    > 2.2.12-20, 2.4.20 and 2.6.0-test1 all give the same results, so I
    > suppose select itself is doing its expected duty, and that in that
    > case the special underlying mechanics of stdout require special
    > mechanics to find out if it's blocked?! Beats me, but that's pretty
    > easy... ;-)

    This doesn't solve the problem, though, and the code will look
    ugly with special cases for terminal output.
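
    To make the objection concrete, a terminal special case might look
    something like the sketch below. Everything in it is an assumption
    for illustration: the function name retry_delay_usec and the 10 ms
    value are invented here, and nothing in this thread establishes that
    such a delay is the right workaround.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical special case: how long (in microseconds) to sleep before
 * retrying a write that failed with error code err.
 * Non-terminals return 0, meaning "just go back to select()".
 * Terminals return an arbitrary 10 ms back-off to avoid the busy spin. */
unsigned int retry_delay_usec(int fd, int err)
{
    assert(fd >= 0);

    if (err != EAGAIN && err != EWOULDBLOCK)
        return 0;                     /* not a "try again" condition */
    if (isatty(fd))
        return 10000u;                /* arbitrary value, an assumption */
    return 0;
}
```

    A caller would pass the result to usleep before its next select,
    which is exactly the kind of extra branch complained about above.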

    -- 
              ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                      http://www.barzilay.org/                 Maze is Life!
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    
