Anyone know why the linux select() function is broken?
baumann.Pan_at_gmail.com
Date: 03/29/05
- Next message: baumann.Pan_at_gmail.com: "Re: Anyone know why the linux select() function is broken?"
- Previous message: Bryan Batten: "Re: how to write() a double"
- Next in thread: baumann.Pan_at_gmail.com: "Re: Anyone know why the linux select() function is broken?"
- Reply: baumann.Pan_at_gmail.com: "Re: Anyone know why the linux select() function is broken?"
- Reply: David Schwartz: "Re: Anyone know why the linux select() function is broken?"
- Reply: Alvin Beach: "Re: Anyone know why the linux select() function is broken?"
Date: 28 Mar 2005 17:53:12 -0800
My problem is that once select() has timed out on a socket, it times
out on the next call as well.

1. I send a packet to LAN server A (192.168.0.10:3233); select() works
   fine.
2. I send a packet to a nonexistent LAN server B (no host on the LAN has
   this IP; port 2323); select() times out, which is correct in this
   case.
3. I send a packet to server A again. EtherPeek captures the response
   packet, but select() still times out.

Why? How can I resolve it? Thanks for any advice.

I also searched the newsgroups and found that someone had asked the
same question, but the thread gave no solution. Here it is:

Newsgroups: comp.os.linux.development.apps
From: stev...@Radix.Net (Steve McWilliams)
Date: 2000/01/08
Subject: Anyone know why the linux select() function is broken?
I posted this the other day in the thread discussing select timeout
problems but got no response. Since that thread has since degenerated
into absurdity, I'd like to try again.

The problem I have isolated is that if a UDP socket is opened, and a
packet is sent for which there is no receiver, a subsequent select call
on the socket erroneously times out immediately. If there is a receiver,
however, the select call times out correctly. This bug only manifests
itself under Linux, not Solaris or NT.

Below is the test code to provoke the problem. Thanks in advance for
any ideas.
Steve
--
/*
 * file: main.c
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>   /* select() and the FD_* macros on POSIX.1-2001 systems */
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define REM_PORT 11430
#define REM_IP_ADDR "192.1.1.80"

int fd;
struct sockaddr_in loc_addr;
struct sockaddr_in rem_addr;

int socket_open(void);
int socket_send(void);
int socket_recv(void);
int socket_select(void);

int main(int argc, char *argv[])
{
    printf("configured for address %s, port %d\n", REM_IP_ADDR,
           REM_PORT);
    if (socket_open() < 0)
        return 1;
    if (socket_send() < 0)
        return 1;
    if (socket_select() < 0)
        return 1;
    if (socket_recv() < 0)
        return 1;
    return 0;
}

int socket_open(void)
{
    printf("opening socket ...\n");
    if ((fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0)
    {
        perror("unable to get local socket");
        return -1;
    }
    /* sin_port stays 0, so the kernel picks an ephemeral local port. */
    memset(&loc_addr, 0, sizeof(loc_addr));
    loc_addr.sin_family = AF_INET;
    loc_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&loc_addr, sizeof(loc_addr)) < 0)
    {
        perror("cannot bind local socket address");
        return -1;
    }
    memset(&rem_addr, 0, sizeof(rem_addr));
    rem_addr.sin_family = AF_INET;
    rem_addr.sin_addr.s_addr = inet_addr(REM_IP_ADDR);
    rem_addr.sin_port = htons(REM_PORT);
    return fd;
}

int socket_send(void)
{
    printf("sending message ...\n");
    /* 6 bytes: "hello" plus its terminating NUL. */
    if (sendto(fd, "hello", 6, 0, (struct sockaddr *)&rem_addr,
               sizeof(rem_addr)) != 6)
    {
        perror("send failed");
        return -1;
    }
    printf("sent message\n");
    return 0;
}
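
int socket_recv(void)
{
    /* recvfrom() reads len on entry, so it must hold the buffer size. */
    socklen_t len = sizeof(rem_addr);
    char buffer[256];
    ssize_t n;

    printf("receiving message ...\n");
    /* Leave room for a terminating NUL in case the peer does not send one. */
    n = recvfrom(fd, buffer, sizeof(buffer) - 1, 0,
                 (struct sockaddr *)&rem_addr, &len);
    if (n <= 0)
    {
        perror("recv failed");
        return -1;
    }
    buffer[n] = '\0';
    printf("received %s\n", buffer);
    return 0;
}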

int socket_select(void)
{
    int ret;
    fd_set read_set;
    struct timeval timeout = { 10, 0 };

    /* tv_sec is a time_t; cast it so it matches the %ld conversion. */
    printf("selecting socket (%ld second timeout) ...\n",
           (long)timeout.tv_sec);
    FD_ZERO(&read_set);
    FD_SET(fd, &read_set);
    if ((ret = select(fd + 1, &read_set, NULL, NULL, &timeout)) < 0)
    {
        perror("select failed");
        return -1;
    }
    else if (ret == 0)
    {
        fprintf(stderr, "select timeout\n");
        return -1;
    }
    printf("select returned %d, bit is %d\n", ret,
           FD_ISSET(fd, &read_set));
    return 0;
}

Newsgroups: comp.os.linux.development.apps
From: David Schwartz <dav...@webmaster.com>
Date: 2000/01/08
Subject: Re: Anyone know why the linux select() function is broken?
Steve McWilliams wrote:
> I posted this the other day in the thread discussing select timeout
> problems but got no response. Since that thread has since degenerated
> into absurdity, I'd like to try again.
>
> The problem I have isolated is that if a UDP socket is opened, and a
> packet is sent for which there is no receiver, a subsequent select
> call on the socket erroneously times out immediately. If there is a
> receiver, however, the select call times out correctly. This bug only
> manifests itself under Linux, not Solaris or NT.
>
> Below is the test code to provoke the problem. Thanks in advance for
> any ideas.

The behavior you are seeing is perfectly logical. A socket should
select for read when there is an error on it. This is consistent with
TCP behavior.

DS
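
One way to see what David describes, sketched under the assumption that the kernel has recorded the error on the socket: once select() marks the descriptor readable, ask for the pending error with getsockopt() and SO_ERROR. The helper name pending_socket_error is made up for illustration and is not part of Steve's test program.

```c
#include <sys/socket.h>

/* Hypothetical helper: returns the error pending on fd as an errno value,
 * 0 if nothing is pending, or -1 if getsockopt() itself fails.
 * On Linux, reading SO_ERROR also clears the pending error. */
int pending_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```

If this returns a nonzero value such as ECONNREFUSED right after select() reported the socket readable, the wake-up was caused by the error rather than by an incoming datagram.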

Newsgroups: comp.os.linux.development.apps
From: f91-...@nada.kth.se (Mattias Engdegård)
Date: 2000/01/09
Subject: Re: Anyone know why the linux select() function is broken?
In <8590c1$ra...@saltmine.radix.net> stev...@Radix.Net (Steve McWilliams) writes:
>The problem I have isolated is that if a UDP socket is opened, and a
>packet is sent for which there is no receiver, a subsequent select
>call on the socket erroneously times out immediately. If there is a
>receiver, however, the select call times out correctly. This bug only
>manifests itself under Linux, not Solaris or NT.

According to the linux udp(7) man page:

    All fatal errors will be passed to the user as an error
    return even when the socket is not connected. This
    behaviour differs from many other BSD socket implementations
    which don't pass any errors unless the socket is
    connected. Linux's behaviour is mandated by RFC1122.
    For compatibility with legacy code it is possible to set
    the SO_BSDCOMPAT SOL_SOCKET option to receive remote
    errors only when the socket has been connected (except for
    EPROTO and EMSGSIZE). It is better to fix the code to
    handle errors properly than to enable this option.
    Locally generated errors are always passed.
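
A minimal sketch of how such an error shows up as an error return. It deliberately uses a connected UDP socket on the loopback interface, the case in which the quoted page says other implementations report errors too; whether an unconnected socket behaves the same way depends on the kernel version, so this is not a reproduction of Steve's exact test. The name probe_udp_port and the one-second receive timeout are assumptions chosen for illustration.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: sends one datagram to 127.0.0.1:port on a connected
 * UDP socket, then tries to read a reply. Returns 0 if a reply arrived,
 * otherwise the errno value of the first call that failed. */
int probe_udp_port(unsigned short port)
{
    struct sockaddr_in addr;
    struct timeval tv = { 1, 0 };   /* keep recv() from blocking forever */
    char buf[64];
    int result = 0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* The calls run left to right and stop at the first failure. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0 ||
        connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        send(fd, "hello", 6, 0) < 0 ||
        recv(fd, buf, sizeof(buf), 0) < 0)
        result = errno;             /* save errno before close() can change it */

    close(fd);
    return result;
}
```

When nothing listens on the port, the ICMP "port unreachable" reply typically surfaces as ECONNREFUSED from recv(); if no ICMP reply comes back, the call ends with EAGAIN once the receive timeout expires.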

Newsgroups: comp.os.linux.development.apps
From: stev...@Radix.Net (Steve McWilliams)
Date: 2000/01/09
Subject: Re: Anyone know why the linux select() function is broken?
f91-...@nada.kth.se (Mattias Engdegård) writes:
>In <8590c1$ra...@saltmine.radix.net> stev...@Radix.Net (Steve McWilliams) writes:
>>The problem I have isolated is that if a UDP socket is opened, and a
>>packet is sent for which there is no receiver, a subsequent select
>>call on the socket erroneously times out immediately. If there is a
>>receiver, however, the select call times out correctly. This bug only
>>manifests itself under Linux, not Solaris or NT.
>According to the linux udp(7) man page:
>    All fatal errors will be passed to the user as an error
>    return even when the socket is not connected. This
>    behaviour differs from many other BSD socket implementations
>    which don't pass any errors unless the socket is
>    connected. Linux's behaviour is mandated by RFC1122.
>    For compatibility with legacy code it is possible to set
>    the SO_BSDCOMPAT SOL_SOCKET option to receive remote
>    errors only when the socket has been connected (except for
>    EPROTO and EMSGSIZE). It is better to fix the code to
>    handle errors properly than to enable this option.
>    Locally generated errors are always passed.

Hmm. That's certainly a new one on me. I assumed that since it's a
connectionless protocol, sending to an address where there may not be a
receiver present was not considered an error.

Thanks,
Steve

Newsgroups: comp.os.linux.development.apps
From: a...@news-server.san.rr.com ()
Date: 2000/01/09
Subject: Re: Anyone know why the linux select() function is broken?
On 9 Jan 2000 10:55:44 -0500, Steve McWilliams <stev...@Radix.Net> wrote:
>f91-...@nada.kth.se (Mattias Engdegård) writes:
>>In <8590c1$ra...@saltmine.radix.net> stev...@Radix.Net (Steve McWilliams) writes:
>>>The problem I have isolated is that if a UDP socket is opened, and a
>>>packet is sent for which there is no receiver, a subsequent select
>>>call on the socket erroneously times out immediately. If there is a
>>>receiver, however, the select call times out correctly. This bug
>>>only manifests itself under Linux, not Solaris or NT.
>>According to the linux udp(7) man page:
>>    All fatal errors will be passed to the user as an error
>>    return even when the socket is not connected. This
>>    behaviour differs from many other BSD socket implementations
>>    which don't pass any errors unless the socket is
>>    connected. Linux's behaviour is mandated by RFC1122.
>>    For compatibility with legacy code it is possible to set
>>    the SO_BSDCOMPAT SOL_SOCKET option to receive remote
>>    errors only when the socket has been connected (except for
>>    EPROTO and EMSGSIZE). It is better to fix the code to
>>    handle errors properly than to enable this option.
>>    Locally generated errors are always passed.
>Hmm. That's certainly a new one on me. I assumed that since it's a
>connectionless protocol, sending to an address where there may not be
>a receiver present was not considered an error.
>Thanks,
>Steve

So how do you tell there is no receiver?
My first guess would be a failed ARP lookup when the address is local,
or any ICMP message rejecting the packet.
But there are still plenty of cases where you can't tell if there
is a receiver or not.

Newsgroups: comp.os.linux.development.apps
From: David Schwartz <dav...@webmaster.com>
Date: 2000/01/09
Subject: Re: Anyone know why the linux select() function is broken?
a...@news-server.san.rr.com wrote:
> But there are still plenty of cases where you can't tell if there
> is a receiver or not.

Exactly. The absence of an error is not proof of reception. However,
if there is an error, the operating system will tell the application
about it.

I strongly recommend that such errors be ignored, however. Honoring
them makes it too easy for spoofed error packets to break 'connections'
using your protocol layered over UDP.

DS

Newsgroups: comp.os.linux.development.apps
From: e...@ftel.net (Rick Ellis)
Date: 2000/01/12
Subject: Re: Anyone know why the linux select() function is broken?
In article <8590c1$ra...@saltmine.radix.net>,
Steve McWilliams <stev...@Radix.Net> wrote:
>The problem I have isolated is that if a UDP socket is opened, and a
>packet is sent for which there is no receiver, a subsequent select
>call on the socket erroneously times out immediately. If there is a
>receiver, however, the select call times out correctly. This bug only
>manifests itself under Linux, not Solaris or NT.

The select "times out" because there is an error to be reported. The
error is from the previous send being rejected.
--
http://www.fnet.net/~ellis/photo/linux.html