Re: close fd while select/poll/epoll



On Mon, 4 Feb 2008 19:50:28 -0800 (PST) David Schwartz <davids@xxxxxxxxxxxxx> wrote:
| On Feb 4, 7:29 pm, phil-news-nos...@xxxxxxxx wrote:
|
|> The scenario is possible to create.
|
| It is not.
|
|> A test scenario does not have to
|> be perfect by being sure the system timing can't allow the thread that
|> does the closing of the descriptor until after the select/poll.
|
| It has to *know* that it has succeeded. If it does not know that it
| has gotten into a particular state, then the standards do not require
| that it see the behavior for that state even if it "just happens to
| be" in that state.

Under the standard I have in mind, the test program would be able to
know whether it succeeded or failed. It would get EBADF if the
descriptor is already closed by the time poll() enters the kernel code
that atomically checks the descriptor list. If the thread does get into
the poll() waiting state, a close() called by another thread can set the
error flag and wake poll() up, giving the caller POLLERR for that
descriptor. The calling thread knows which case occurred from which
error comes back from poll(). The thread doing the close() can go away
and never needs to know.
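
For comparison, here is a minimal sketch of what poll() reports today
for the first half of that scenario, a descriptor that is already closed
before the call. The function name poll_closed_fd is made up for
illustration. As far as I know, POSIX poll() flags an invalid descriptor
with POLLNVAL in revents rather than failing the whole call with EBADF
(select() is the one that fails with EBADF), so a standard along the
lines above would probably have to build on POLLNVAL.

```c
#include <poll.h>
#include <unistd.h>

/* Poll a descriptor that was closed before the call.
 * Returns the revents bits, or -1 if the setup or poll() itself failed.
 * Single-threaded on purpose: nothing can reuse the fd number between
 * close() and poll(). */
int poll_closed_fd(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    close(p[0]);                      /* read end is gone before poll() */

    struct pollfd pfd = { .fd = p[0], .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, 0);         /* timeout 0: just check and return */

    close(p[1]);
    if (n < 0)
        return -1;
    return pfd.revents;
}
```

A caller would test the result with (poll_closed_fd() & POLLNVAL).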


| The standards never require one behavior if a conforming application
| cannot tell the difference.

Tell the difference between what? That one behaviour happened rather
than another?


|> You can
|> have the select/poll wait for a very long time in a test case that never
|> gets data. You can have the closer thread also wait a very long time to
|> minimize the statistical chance of the closer doing the close before the
|> thread that does the select/poll actually getting into the kernel action
|> to do that. In the very extreme cases where it takes the waiter thread
|> minutes to get into select/poll due to weird system performance, you can
|> simply dismiss that test run as "not what we wanted to test". It will
|> result in an error in select/poll as if waiting on a descriptor that is
|> not open. So the program can even detect if it failed to carry out the
|> intended test.
|
| You're missing the point. The point is not that you may or may not get
| the behavior. The point is, if your code is not *guaranteed* to get
| the behavior, nothing requires that it see what happens with that
| behavior, even if it coincidentally does get into the state you want
| to test.
|
| The standards only say what compliant programs must see when they can
| tell the difference. If the program cannot tell the difference between
| two states, then anything permitted by the standard for one state is
| also permitted for the other.

I'm sure you could come up with some standard under which a program
would not be able to see the behaviour. For example, we could require
that when a descriptor being waited on gets closed, the computer must
electrocute all dogs within a 10-meter radius. I doubt the program
would be able to detect that. Obviously this is an absurd example and
is intended as such. But I think the behaviour under _my_ idea of such
a standard would be easily detectable, which is enough to make that
idea a viable candidate (other ideas might work just as well).

Maybe we should pick such a hypothetical standard as a basis to talk
about this in a less vaguely abstract way.


|> So we can easily have a test program that will easily be able to carry
|> out the test in virtually all instances of being run, and readily detect
|> when it failed to do so (and possibly even just try again up to some
|> number of times).
|
| The problem is, since the test program will never know for sure that
| it is testing this case, the platform is never required to show the
| test program what it will do in that case. It is only if the program
| *knows* it is in state A that the platform is required to give it the
| behavior for state A.

I don't agree that it will never know for sure. I believe the test
program will know for sure that it succeeded in the vast majority of
runs, and know that it failed in the very few runs where it could fail.


| For example, suppose we allocate more memory than the system has. We
| (humans) know the platform is out of memory but the program cannot
| know that it is out of memory. So even if POSIX says something has to
| happen in this case, the 'as-if' rule allows the platform to not do
| that. The program has no way to know (and hence no *right* to know)
| that it is out of memory.

The malloc() call will return NULL. How is that so hard to define?

Of course, what matters here is not whether the system has enough
memory for _something_, but whether it has enough memory for the
process asking for some. It is a standard defining how an interface
behaves. The definition of malloc() is not about whether there is any
memory anywhere, but about whether some memory can be made available
to the calling process. If not, the caller gets NULL.
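
A small sketch of that interface-level view, with a helper name
(request_granted) made up for illustration. It only reports what
malloc() told the calling process. One caveat I am aware of: on systems
that overcommit memory (the Linux default, as I understand it), a
non-NULL return does not prove the pages are actually backed, so the
sketch uses a request of SIZE_MAX bytes, which no implementation can
satisfy, to get the NULL case reliably.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if malloc(n) handed back a pointer, 0 if it returned NULL.
 * The pointer is volatile so the compiler cannot optimize the
 * malloc()/free() pair away and assume success. */
int request_granted(size_t n)
{
    void *volatile p = malloc(n);
    if (p == NULL)
        return 0;
    free(p);
    return 1;
}
```

With this, request_granted(SIZE_MAX) reports the failure through the
return value alone, and request_granted(64) is the ordinary success
case.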



|> So we _can_ test what the system _would_ do in this case. So that is no
|> impediment to defining what it _should_ or _must_ do when it happens.
|
| Right, but we cannot tell what it should or must do, because it only
| has to do that for a program that *knows* it is in that situation.

It's a perfectly definable situation.

|
|> | As for 'epoll_wait', it's perfectly legal. If that's the last
|> | reference to the file, it will no longer be waited on. One of the
|> | advantages of 'epoll' is that you don't need the synchronization you
|> | need with 'poll' and 'select'.
|>
|> I agree that epoll_wait is a better way to do things. But I do not see
|> it as being the exclusive way to deal with this kind of error event.
|
| This is undefined behavior. It has caused security problems in the
| past.

Then maybe it needs to become a defined behavior.
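
On the epoll point quoted above, a minimal Linux-only sketch of the
"last reference" behaviour, using a socketpair so that one end can be
made readable. The function name epoll_close_demo is invented here.
This is single-threaded, so it shows only that a closed descriptor
drops out of the epoll set, not what happens when the close races with
a thread already blocked in epoll_wait().

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Stores the number of ready events seen before and after closing the
 * watched descriptor. Returns 0 on success, -1 if the setup failed. */
int epoll_close_demo(int *before, int *after)
{
    int sv[2];
    struct epoll_event ev = { .events = EPOLLIN }, out;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    int ep = epoll_create1(0);
    if (ep < 0) {
        close(sv[0]); close(sv[1]);
        return -1;
    }

    ev.data.fd = sv[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) != 0 ||
        write(sv[1], "x", 1) != 1) {      /* make sv[0] readable */
        close(sv[0]); close(sv[1]); close(ep);
        return -1;
    }

    *before = epoll_wait(ep, &out, 1, 0); /* sv[0] is registered and readable */
    close(sv[0]);                         /* only descriptor for that file */
    *after = epoll_wait(ep, &out, 1, 0);  /* entry should be gone now */

    close(sv[1]);
    close(ep);
    return 0;
}
```

As I read epoll(7), the entry is removed only when every descriptor
referring to the same open file description has been closed, so a
dup()'d copy of sv[0] left open would keep the entry in the set.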

--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-04-2234@xxxxxxxx |
|------------------------------------/-------------------------------------|