Re: is select() being misused by the majority of programmers?

noone@xxxxxxx wrote:
> I've come across something that doesn't make much sense to me. The
> standard Linux manpage for select() defines the first parameter as the
> "maximum file descriptor in the set, plus one"...This never made any sense
> to me because it would cause one to assume that all file descriptors in a
> contiguous range will be polled, and thus negates the need for even
> filling a descriptor set prior to calling select.

What you are missing is that there can be thousands of file
descriptors. They are allocated in the bottom range: whenever a socket
or file is opened, the lowest available descriptor is given to it. This
means that the high numbers are never used in most processes. It would
be a waste of cycles in the kernel to loop over empty portions of the
fd_set bitmasks. This is why there is an argument which allows the
process to specify the highest number (plus one).

That doesn't mean that all of the descriptors below that value are in
the set. It just establishes a boundary on the range of fd_set that is
in use.

For instance, suppose you have a process which only wants to monitor
sockets for input. It does not care about STDIN_FILENO, STDOUT_FILENO
and STDERR_FILENO. It has five sockets and no other open files. So
these sockets are probably 3, 4, 5, 6, and 7. These are the
descriptors that are added to the set. The descriptors 0, 1 and 2 are
left out. The value that is specified for the nfds parameter is 8. It
means: ``monitor the entries from 0 to 7; do not bother looking at any
of the hundreds of entries in fd_set which are higher than 7''. Of
course, this does not imply that 0, 1 and 2 are monitored; they have
not been added to the set. Their entries are still checked, but ignored
when they are found to be zero in all three sets.

> I've since found other documentation stating that in select(n,&inset...) n
> is really the number of entries in the set inset.

Which proves that if you insist on misunderstanding something badly
enough, and misinterpret enough documentation, you can often manage to
confirm your misunderstanding.

That documentation isn't incorrect, provided you don't interpret the
``entries'' of a set as its elements, but rather as the places that hold
elements, and which can be empty.

> [...] explicitly set before calling select. Passing the length of the set makes
> infinitely more sense.

The only thing that makes sense to pass to select is what the operating
system requires. You can jump up and down yearning for that value to
have different semantics, but it's not going to happen.

> All the programmers and examples I've seen use the first (and more likely
> incorrect) semantics for the select() parameter.

Sure, everybody is wrong, yet networking software runs somehow.

The nfds argument specifies the range of descriptors to be tested. The
first nfds descriptors shall be checked in each set; that is, the
descriptors from zero through nfds-1 in the descriptor sets shall be
examined.

What part of this isn't clear?