Syscalls called at eip:0xffffe428 sys_no:119(sigreturn)

From: Min Lee (abraxsus_at_yonsei.ac.kr.NOSPAM)
Date: 07/27/04


Date: Tue, 27 Jul 2004 19:53:05 +0900

Hello, folks.
I'm currently working on RH9 with Linux 2.6.7, and
I inserted my code into the syscall_call section of entry.S to catch syscalls,
much as strace does.
Here is a fragment of the output.

Jul 27 14:09:35 calanos kernel: pid:2388 myeip:400104e4 sys_no:5 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:4001040b sys_no:197 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:40010ccd sys_no:90 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:4001051d sys_no:6 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:400104e4 sys_no:5 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:40010564 sys_no:3 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:4001040b sys_no:197 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:40010ccd sys_no:90 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:40010ccd sys_no:90 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:4001051d sys_no:6 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:40010d54 sys_no:125 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2388 myeip:40010d11 sys_no:91 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2150 myeip:ffffe428 sys_no:119 mypr:1
Jul 27 14:09:35 calanos kernel: pid:2312 myeip:ffffe428 sys_no:119 mypr:2
Jul 27 14:09:35 calanos kernel: pid:2150 myeip:ffffe428 sys_no:119 mypr:0
Jul 27 14:09:35 calanos kernel: pid:2150 myeip:ffffe428 sys_no:119 mypr:2
Jul 27 14:09:35 calanos kernel: pid:2150 myeip:ffffe428 sys_no:119 mypr:0
Jul 27 14:09:36 calanos last message repeated 8 times

The command I issued was "curl http://xxx". strace shows that this
workload makes some send/receive syscalls. myeip is the saved %eip,
obtained via OLDEIP(%ebp) in the system call handler. sys_no is the system
call number. mypr:1 indicates that the call reached inet_msgsend, and mypr:2
indicates that it reached inet_msgrecv.
(So I think they are send/receive syscalls even though their sys_no is 119,
which is sigreturn.)
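
As a rough illustration of how the fields described above can be read, here is a minimal Python sketch that decodes one log line. The names `decode`, `I386_SYSCALLS` and `KERNEL_BASE` are hypothetical helpers, not part of the kernel patch; the syscall table is a partial i386 table covering only the numbers seen in this log, and the 0xC0000000 user/kernel boundary is an assumption (the default 3GB/1GB split on 32-bit x86).

```python
import re

# Partial i386 syscall table: only the numbers that appear in the log above.
I386_SYSCALLS = {
    3: "read", 5: "open", 6: "close", 90: "mmap", 91: "munmap",
    119: "sigreturn", 125: "mprotect", 197: "fstat64",
}

# Assumption: default 3GB/1GB split, so user addresses end below 0xC0000000.
KERNEL_BASE = 0xC0000000

# Matches the "pid:... myeip:... sys_no:... mypr:..." part of a log line.
LINE_RE = re.compile(
    r"pid:(?P<pid>\d+) myeip:(?P<eip>[0-9a-f]+) "
    r"sys_no:(?P<nr>\d+) mypr:(?P<pr>\d+)"
)


def decode(line):
    """Return a dict describing one log line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    eip = int(m.group("eip"), 16)
    nr = int(m.group("nr"))
    return {
        "pid": int(m.group("pid")),
        "eip": eip,
        "name": I386_SYSCALLS.get(nr, "unknown(%d)" % nr),
        "mypr": int(m.group("pr")),
        # True when the saved %eip lies at or above the assumed boundary.
        "above_kernel_base": eip >= KERNEL_BASE,
    }


if __name__ == "__main__":
    sample = [
        "Jul 27 14:09:35 calanos kernel: pid:2388 myeip:400104e4 sys_no:5 mypr:0",
        "Jul 27 14:09:35 calanos kernel: pid:2150 myeip:ffffe428 sys_no:119 mypr:1",
    ]
    for line in sample:
        print(decode(line))
```

Grouping the decoded entries by `above_kernel_base` separates the 400xxxxx lines from the ffffe428 lines discussed below.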

The top half of the output (myeip:400xxxxx) seems quite reasonable, and
it matches the strace output.
However, the bottom half (myeip:ffffe428) has been perplexing me. :-(

Could you explain this situation to me, please?

First, why is the %eip at which the syscall was issued ffffe428, which is in
kernel space?
(I found a partial answer, quoted below, but I don't fully understand it.)

Second, why is the system call number always 119, which is sigreturn?
They seem to be different system calls from one another.

Third, why are the pids all different?
(2150, 2312, ... one of them was the bash shell and the other was something
from X, if my memory serves me well.)
I don't think they are processes separate from the "curl http://xxx" command
I issued, because these were the only syscalls that called inet_msgsend and
inet_msgrecv, and they always appear like this across several tests.
None of the "myeip:400xxxxx" syscalls called inet_msgsend/inet_msgrecv.

I found a helpful article for question 1 (quoted below).
I also found that the address range 0xffffe000 - 0xffffe400 (1KB) is
readable from user mode. It looks like a hole seen from user mode. :-(
So my last question is: why did they choose this weird scheme?

---
> I use ptrace(PTRACE_SYSCALL,...), and when the program traps I use
> PTRACE_GETREGS to get the EIP.  In RedHat 7.2, this EIP is the address
> in the program from where the syscall occurs.
>
> In RedHat 9, the EIP is always 0xFFFFE002.  It's apparently up in kernel
> space, if I try to read the memory with PTRACE_PEEKDATA, the call fails
> with EIO.  Can someone please explain what this address means?  And how
> do I recover the original address in the program where the syscall is
> called from?
The kernel dynamically assigns that page, and tells the process where it is
using AT_SYSINFO on the stack just after execve(); see /usr/include/elf.h.
At offset +0 is a subroutine which performs a syscall.  Today it may be
 int $0x80
 ret
or
 sysenter    # 0x0f 0x34
 ret
'sysenter' usually is faster if the hardware supports it.
So the return address for that 'ret' (i.e., the top four bytes on the
stack) might be the address you desire.  But probably you want the
target address for the _following_ 'ret', which depends on how
the wrapper code that translates the C-language calling sequence (with
arguments on the stack) into the kernel calling sequence (with
arguments in registers) uses the stack to store saved registers, etc.
Such usage is unique to each syscall wrapper.
---
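
To make the AT_SYSINFO mechanism in the quoted reply a bit more concrete, here is a minimal Python sketch that parses an auxiliary vector (a sequence of key/value word pairs terminated by AT_NULL) and looks up AT_SYSINFO. The constants 32 and 33 are the AT_SYSINFO and AT_SYSINFO_EHDR values from /usr/include/elf.h; `parse_auxv` is a hypothetical helper name, and little-endian layout is assumed (true on x86). Reading `/proc/self/auxv` shows the values for the Python process itself, not for a traced program, and on non-i386 kernels AT_SYSINFO may simply be absent.

```python
import os
import struct

# Keys from /usr/include/elf.h.
AT_NULL = 0
AT_SYSINFO = 32
AT_SYSINFO_EHDR = 33


def parse_auxv(data, word_size):
    """Parse raw auxv bytes into a {key: value} dict.

    word_size is 4 for a 32-bit process and 8 for a 64-bit one.
    Assumes little-endian words, as on x86.
    """
    fmt = "<II" if word_size == 4 else "<QQ"
    step = struct.calcsize(fmt)
    entries = {}
    for off in range(0, len(data) - step + 1, step):
        key, val = struct.unpack_from(fmt, data, off)
        if key == AT_NULL:  # terminator entry
            break
        entries[key] = val
    return entries


if __name__ == "__main__":
    path = "/proc/self/auxv"
    if os.path.exists(path):
        with open(path, "rb") as f:
            aux = parse_auxv(f.read(), struct.calcsize("P"))
        for name, key in (("AT_SYSINFO", AT_SYSINFO),
                          ("AT_SYSINFO_EHDR", AT_SYSINFO_EHDR)):
            val = aux.get(key)
            print(name, hex(val) if val is not None else "not present")
```

A tracer written along these lines could compare a trapped %eip against the page that AT_SYSINFO points into before deciding to look further up the stack for the caller's address.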

