Re: Big problems with glibc - how to debug?

From: Kendall Bennett (KendallB_at_scitechsoft.com)
Date: 03/22/04


Date: Mon, 22 Mar 2004 20:53:20 GMT

John Reiser wrote:

>> Basically something is screwy with __libc_enable_async_cancel(), such
>> that after a few open() calls we get a crash inside libc in this
>> function (at least it appears that way).
>
> Show the evidence: backtraces, etc. Is the code clean with respect to
> memory usage errors (malloc/free, and no use of uninitialized values)?
> Use Insure++, memcheck [valgrind], or Purify to find out.

I believe the code is clean with respect to memory accesses, and this
happens really, really early in the loading sequence for the
application. We have done a few mallocs, but no memory has yet been freed.

How do I get a back trace that I can post here? I am not all that
familiar with the back trace mechanism in GDB.
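
For reference, one common way to capture a backtrace without driving GDB interactively is sketched below. It assumes a gdb new enough to accept -ex on the command line (older releases need a command file passed with -x instead), and the helper name capture_backtrace is made up for illustration:

```shell
# Run a program under gdb non-interactively and print a backtrace
# at the point where it stops (for example, on SIGSEGV).
# capture_backtrace is a hypothetical helper name, not a gdb feature.
capture_backtrace() {
    # -batch : exit once the listed commands have run
    # -ex    : execute one gdb command ("run", then "bt")
    # --args : everything after this is the program and its arguments
    gdb -batch -ex run -ex bt --args "$@"
}
```

Something like `capture_backtrace /path/to/my_app > backtrace.txt 2>&1` would then leave the trace in a file that can be pasted into a post.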

>> Also the other strange thing is that this appears to be related to 2.6
>> kernel services back ported to the Red Hat 2.4 kernels. If we set the
>> LD_ASSUME_KERNEL=2.4.19, the problem goes away!
>
> This points to nptl being involved. Show the output of two cases:
> ldd /path/to/my_app
> LD_ASSUME_KERNEL=2.4.19 ldd /path/to/my_app

Interesting! With LD_ASSUME_KERNEL set, my app resolves to
/lib/i686/libc.so. Without it (the default case) my app is linked
against /lib/tls/libc.so. This differs from Red Hat 8, where everything
is always linked to /lib/i686/libc.so.
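
A rough sketch of that comparison as a small shell helper, for anyone who wants to reproduce it. The function names libc_path and show_libc_choice are made up for illustration, and /path/to/my_app is the placeholder from the quoted message:

```shell
# Read ldd output on stdin and print the path resolved for libc.
# Matches lines of the form "libc.so... => /some/path (address)".
libc_path() {
    awk '$1 ~ /^libc\.so/ && $2 == "=>" { print $3 }'
}

# Print the libc chosen in the default case and with the override.
# The second ldd may fail on systems whose libraries require a newer
# kernel ABI than 2.4.19; its errors are discarded here.
show_libc_choice() {
    app=$1
    printf 'default:          %s\n' "$(ldd "$app" | libc_path)"
    printf 'LD_ASSUME_KERNEL: %s\n' \
        "$(LD_ASSUME_KERNEL=2.4.19 ldd "$app" 2>/dev/null | libc_path)"
}
```

Running `show_libc_choice /path/to/my_app` should print the two paths one above the other, which makes the /lib/tls versus /lib/i686 difference easy to spot.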

I assume then that the /lib/tls/libc.so library is the threaded version,
and that for some reason that one is always used by default on newer Red
Hat installs. Now I just need to find out how to get a debug version of
this that I can debug with source code ;-)

>> Any ideas?
>
> Almost always, the best way to debug a thread problem is to construct
> a short testcase that triggers the problem quickly.

Actually what is weird is that we are not using threads and the problem
still occurs. We removed all code that linked to pthreads before and
instead use clone() directly (our threads don't use the runtime library
- they just process input devices and post data to a shared memory
area). That fixed the problem on Red Hat 8, but now Red Hat 9 has issues
even with clone(). On Friday I removed all the clone calls completely
and the problem still shows up, which is weird.

> The culprit will
> be obvious, and frequently the fix will be, too. So start a binary
> search: remove code from my_app until the problem goes away, then put
> things back just until the problem reappears, and continue until you
> have a one-line change in source code which triggers the problem.
> Then reduce that to a one-page test program, and post it.

At this point I have a stripped down app that really does nothing much
and I am still seeing the problem. Ideally what I would like is to find
some way to install a version of glibc onto my machine that I can debug
at the source level with GDB, so I can trace back in the source code the
cause of the problem inside the debugger.

What do I need to do in order to get symbols/source installed with GDB
for debugging glibc? Red Hat has debug packages that I installed once
before, but GDB would never see any source information.
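
As a starting point, here is a sketch of how one might check whether GDB is picking up library symbols and point it at a source tree. It only uses standard gdb commands (directory, info sharedlibrary, bt). The helper name gdb_with_glibc_source and the source directory argument are assumptions, since where the glibc sources end up depends on the debug package:

```shell
# Hypothetical helper: run the program under gdb with an extra source
# search directory, then report symbol status for each shared library.
#   $1    = directory holding the glibc sources (an assumption; it
#           depends on how the debug package lays things out)
#   $2... = program and its arguments
gdb_with_glibc_source() {
    srcdir=$1
    shift
    # "directory" adds to gdb's source search path.
    # "info sharedlibrary" lists each loaded library and whether its
    # symbols were read, which shows if the debug info was found.
    gdb -batch \
        -ex "directory $srcdir" \
        -ex run \
        -ex "info sharedlibrary" \
        -ex bt \
        --args "$@"
}
```

If the "Syms Read" column for libc says No, the debug information is not being found, and adding a source directory will not help until that is sorted out.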

-- 
Kendall Bennett
Chief Executive Officer
SciTech Software, Inc.
Phone: (530) 894 8400
http://www.scitechsoft.com
~ SciTech SNAP - The future of device driver technology!

