Descriptor passed w/SCM_RIGHTS is invalid

From: Jo (JoJoTwilligo_at_hotmail.com)
Date: 09/30/04


Date: 29 Sep 2004 20:42:43 -0700

I've been banging my head for a week now over this issue, and now I'm
stuck dead. I have an open socket in a parent process, and I'm using a
pipe created by socketpair() to pass this socket to a child process.
For some bizarre reason, even though I can write to the socket in the
parent, write()s to the socket in the child fail, and errno is set to
EBADF, even though lsof gives the descriptor in the child as a valid
read/write socket.

The child has this piece:
int desNewSocket = GetNewSocket();
fprintf(a_logs_descriptor, "the child's (%d) socket descriptor: %d\n",
getpid(), desNewSocket);
if (write(desNewSocket, "hey man", 8) == -1 && errno == EBADF)
    fprintf(a_logs_descriptor, "but it won't write\n");

So the log reports:
the child's (5555) socket descriptor: 4
but it won't write

Now if I do a lsof after this descriptor has been read, I see that
process 5555 has 4u under FD. So it should be able to write to it,
right? Furthermore, process 5554 (the parent) has 20u open to the same
socket (lsof reports that they both have the same NAME). The parent
CAN write to descriptor 20 without any problem. In case you are
wondering if descriptor 20 in the parent is interfering with
descriptor 4 in the child, don't. I've tried various different ways of
dealing with 20, by closing it right away, or waiting for the child to
use the socket before the parent closes it, etc. BTW, the client
program to which this socket is connected doesn't seem to experience a
peer reset until BOTH descriptors are closed.

With lsof showing the process having a valid descriptor, I can't
imagine what the problem could be, but I suppose you'll want to see my
code.

The part before and just after the fork() is pretty simple:
int desAncils[2];
socketpair(AF_UNIX, SOCK_STREAM, 0, desAncils);

pid_t nPid = fork();
switch (nPid) {
        case 0: { //child
                close(desAncils[1]);
                desAncil = desAncils[0];//desAncil is a global
                break;
        }
        case -1: assert(false);
        default: { //parent
                close(desAncils[0]);
                desAncil = desAncils[1];
        }
}
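
The setup above does not check the return value of socketpair(). A minimal sketch of the same socketpair()/fork() split with error checking might look like this. spawn_with_channel() and ping_child() are hypothetical names introduced for illustration only, and the child here just sends one byte back so the channel can be verified.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child body for this sketch: send one byte to the parent. */
void ping_child(int channel)
{
    char c = 'x';

    if (write(channel, &c, 1) != 1)
        _exit(1);
}

/* Hypothetical helper (not from the original post).
 * Creates a socketpair, forks, and runs child_fn(channel) in the child,
 * which then exits. In the parent it stores the parent's end of the pair
 * in *parent_end and returns the child's pid, or -1 on failure. */
pid_t spawn_with_channel(int *parent_end, void (*child_fn)(int))
{
    int sv[2];
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                 /* child keeps sv[0] */
        close(sv[1]);
        child_fn(sv[0]);
        close(sv[0]);
        _exit(0);
    }
    close(sv[0]);                   /* parent keeps sv[1] */
    *parent_end = sv[1];
    return pid;
}
```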

I've researched routines for sending ancillary messages like crazy,
and I've wasted a lot of time on a lot of crap, but here's what I've
ended up w/:
void SendNewSocket(int desChild, int desNewSocket) {
        int sendfd = desNewSocket;
        int nbytes = 100;
        char ptr[nbytes];

        struct msghdr msg;
        struct iovec iov[1];

        union {
                struct cmsghdr cm;
                char control[CMSG_SPACE(sizeof(int))];
        } control_un;
        struct cmsghdr *cmptr;

        /* needs <string.h>; clear everything so no stale stack bytes go out */
        memset(&msg, 0, sizeof(msg));
        memset(&control_un, 0, sizeof(control_un));
        memset(ptr, 0, sizeof(ptr));

        msg.msg_control = control_un.control;
        msg.msg_controllen = sizeof(control_un.control);

        cmptr = CMSG_FIRSTHDR(&msg);
        cmptr->cmsg_len = CMSG_LEN(sizeof(int));
        cmptr->cmsg_level = SOL_SOCKET;
        cmptr->cmsg_type = SCM_RIGHTS;
        *((int *) CMSG_DATA(cmptr)) = sendfd;

        msg.msg_name = NULL;
        msg.msg_namelen = 0;

        iov[0].iov_base = ptr;
        iov[0].iov_len = nbytes;
        msg.msg_iov = iov;
        msg.msg_iovlen = 1;

        /* sendmsg() returns the number of payload bytes sent: nbytes, not 1 */
        ssize_t sent = sendmsg(desChild, &msg, 0);
        assert(sent == nbytes);
}

Now, here's the code in the child that does the receiving:
int GetNewSocket() {
        int nbytes = 100;
        char ptr[nbytes];

        struct msghdr msg;
        struct iovec iov[1];
        ssize_t nread;  /* byte count, kept separate from the descriptor */
        int recvfd;

        union {
                struct cmsghdr cm;
                char control[CMSG_SPACE(sizeof(int))];
        } control_un;
        struct cmsghdr *cmptr;

        memset(&msg, 0, sizeof(msg));   /* needs <string.h> */
        msg.msg_control = control_un.control;
        msg.msg_controllen = sizeof(control_un.control);
        msg.msg_name = NULL;
        msg.msg_namelen = 0;

        iov[0].iov_base = ptr;
        iov[0].iov_len = nbytes;
        msg.msg_iov = iov;
        msg.msg_iovlen = 1;

        //desAncil is a global, don't forget
        if ( (nread = recvmsg(desAncil, &msg, 0)) <= 0)
                return -1; /* error or EOF; returning 0 would look like fd 0 */

        if ( (cmptr = CMSG_FIRSTHDR(&msg)) != NULL &&
                        cmptr->cmsg_len == CMSG_LEN(sizeof(int))) {
                assert(cmptr->cmsg_level == SOL_SOCKET &&
                                cmptr->cmsg_type == SCM_RIGHTS);
                recvfd = *((int *) CMSG_DATA(cmptr));
        } else
                recvfd = -1; /* descriptor was not passed */

        return recvfd;
}

So that should do it, right? Remember, the desNewSocket parameter for
SendNewSocket() (which usually turns out to be 20) can be written to
in the parent, so the socket should be good. The descriptor that
appears out of GetNewSocket() (usually 4) is called a "bad descriptor"
upon writing, even though lsof shows it to be valid.
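
For comparison with the two routines above, here is a minimal, self-contained round trip using SCM_RIGHTS. It is a sketch under stated assumptions, not a diagnosis of the failure: send_fd(), recv_fd(), demo_pass_fd() and union fd_ctl are hypothetical names, the payload is a single byte rather than 100, and the descriptor being passed is the write end of a pipe (instead of a connected socket) so the example needs no network peer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control buffer sized for exactly one descriptor, aligned for cmsghdr. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send fd over channel with a 1-byte payload. */
int send_fd(int channel, int fd)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmsg;
    union fd_ctl ctl;
    char byte = 0;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    memcpy(CMSG_DATA(cmsg), &fd, sizeof fd);

    return sendmsg(channel, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor; returns it, or -1. */
int recv_fd(int channel)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cmsg;
    union fd_ctl ctl;
    char byte;
    int fd;

    memset(&msg, 0, sizeof msg);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(channel, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
    return fd;
}

/* Parent passes a pipe's write end to a child, which writes "hey man"
 * through the received descriptor. Returns 0 if the parent reads it back. */
int demo_pass_fd(void)
{
    int sv[2], pfd[2], status, ok;
    char buf[8];
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1 || pipe(pfd) == -1)
        return -1;
    if ((pid = fork()) == -1)
        return -1;
    if (pid == 0) {
        int fd;
        close(sv[1]); close(pfd[0]); close(pfd[1]); /* drop inherited copies */
        fd = recv_fd(sv[0]);
        _exit(fd >= 0 && write(fd, "hey man", 8) == 8 ? 0 : 1);
    }
    close(sv[0]);
    ok = send_fd(sv[1], pfd[1]) == 0;
    close(pfd[1]);
    close(sv[1]);
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        ok = 0;
    if (ok && read(pfd[0], buf, sizeof buf) != 8)
        ok = 0;
    close(pfd[0]);
    return ok && memcmp(buf, "hey man", 8) == 0 ? 0 : -1;
}
```

The child closes its inherited copies of the pipe before receiving, so the write can only succeed through the passed descriptor. Calling demo_pass_fd() from main() and checking for 0 gives a baseline to compare against the failing program.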


