gdb help: debugging a segfault in boost::shared_ptr



I am working on a multithreaded application that contains a database
connection pool which is using shared_ptr to pass connections around.

After recent changes I've been getting random segfaults in the
shared_ptr code that handles ref counting. The gdb session below helps
explain the context.

I acknowledge that this is most likely my own screwup, but since I am
unable to get much meaningful information from gdb, I'm starting to run
out of ideas.

The following is what I have discovered by research:
- shared_ptr is thread safe (from boost docs)
- sp_counted_base uses lock-free algorithms for refcounting (from
header)
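
For context on the first point, as far as I can tell from the Boost docs the guarantee is the same as for built-in types: different shared_ptr instances may be written from different threads even when they share one reference count, but a single instance that one thread writes while another reads or writes it still needs a lock. Below is a minimal sketch of the case that is covered. It uses std::shared_ptr and std::thread instead of the Boost types so that it is self-contained (an assumption that the same rules apply), and Adapter and copy_from_distinct_instances are hypothetical names, not code from my application.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for the adapter type used in the pool.
struct Adapter { int id = 0; };

// Every worker copies from its OWN shared_ptr instance. Only the shared
// reference count is touched concurrently, which is the case the atomic
// increment/decrement is meant to handle.
long copy_from_distinct_instances(int threads, int copies_per_thread)
{
    assert(threads >= 0 && copies_per_thread >= 0);

    std::shared_ptr<Adapter> original = std::make_shared<Adapter>();
    std::vector<std::thread> workers;

    for (int t = 0; t < threads; ++t) {
        // This copy is made on the launching thread, before the worker runs.
        std::shared_ptr<Adapter> mine = original;
        workers.emplace_back([mine, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                std::shared_ptr<Adapter> local = mine;  // add a reference
                (void)local;                            // released at scope exit
            }
        });
    }

    for (std::thread& w : workers) {
        w.join();
    }

    // All per-thread copies are gone once the workers have been joined.
    return original.use_count();
}
```

The case that is not covered would be several threads copying from, or assigning to, the same shared_ptr object without holding a mutex.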

The questions I currently have are:
- How can I get more details about the segfault? I.e., which instruction
or memory address is involved?
- Why can't I access *pw?
- How can atomic_increment segfault? Is it possible that gdb's stack
trace is wrong?

An answer to any of these questions would be greatly appreciated.

[START GDB session]

(gdb) run

[...]

TEST: GetEntry
======================================================================

Entry: 000000000000576(539)
Get adapter

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1311896656 (LWP 29331)]
0xb7c556db in boost::detail::atomic_increment (pw=0x8006d73) at
sp_counted_base_gcc_x86.hpp:66
66 );
(gdb) bt
#0 0xb7c556db in boost::detail::atomic_increment (pw=0x8006d73) at
sp_counted_base_gcc_x86.hpp:66
#1 0xb7c55700 in boost::detail::sp_counted_base::add_ref_copy
(this=0x8006d6f) at sp_counted_base_gcc_x86.hpp:133
#2 0xb7c55746 in shared_count (this=0xb1cdff9c, r=@0xb455140c) at
shared_count.hpp:170
#3 0xb7c559dc in shared_ptr (this=0xb1cdff98) at shared_ptr.hpp:106
#4 0xb7c55290 in DatabaseAdapterPool::GetAdapter (this=0xb7d00f20,
dataSource=@0xb1ce00f8) at DatabaseAdapterPool.cc:188
#5 0xb7c3b0a3 in IDatabaseAdapter::GetAdapter (dataSource=@0xb1ce00f8)
at DatabaseAdapter.cc:31
#6 0xb7c4dffd in Instance::GetEntry (this=0x8056c90,
formName=@0xb1ce02c8, entryId=@0xb1ce02c4, fields=@0xb1ce02ac,
values=@0xb1ce02a0) at Instance.cc:85
#7 0xb7c3e42e in ARDBCGetEntry (object=0x8056c90, tableName=0xb453e284
"orcl", vendorFieldList=0xb1ce039c, transId=0,
entryIdList=0xb1ce0394, idList=0xb1ce038c, fieldList=0xb1ce0384,
status=0xb1ce037c) at syscomardbc.cc:386
#8 0x0804c7ee in ardbctest::testGetEntry (this=0x80cec80) at
ardbctest.h:379
#9 0x0804cddc in
boost::detail::function::void_function_obj_invoker0<ardbctest,
void>::invoke (function_obj_ptr=
{obj_ptr = 0x80cec80, const_obj_ptr = 0x80cec80, func_ptr =
0x80cec80, data = "\200"}) at ardbctest.h:217
#10 0x0804f7a1 in boost::function0<void,
std::allocator<boost::function_base> >::operator() (this=0xb1ce0434)
at function_template.hpp:576
#11 0x0804e5f6 in thread_proxy (param=0xbf96bd0c) at
.../src/thread.cpp:113
#12 0xb7f3a341 in start_thread () from
/lib/tls/i686/cmov/libpthread.so.0
#13 0xb7dcd4ee in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) l -
56 {
57 //atomic_exchange_and_add( pw, 1 );
58
59 __asm__
60 (
61 "lock\n\t"
62 "incl %0":
63 "=m"( *pw ): // output (%0)
64 "m"( *pw ): // input (%1)
65 "cc" // clobbers
(gdb) l
66 );
67 }
68
69 inline int atomic_conditional_increment( int * pw )
70 {
71 // int rv = *pw;
72 // if( rv != 0 ) ++*pw;
73 // return rv;
74
75 int rv, tmp;
(gdb) print pw
$12 = (int *) 0x8006d73
(gdb) print *pw
Cannot access memory at address 0x8006d73
(gdb) f 4
#4 0xb7c55290 in DatabaseAdapterPool::GetAdapter (this=0xb7d00f20,
dataSource=@0xb1ce00f8) at DatabaseAdapterPool.cc:188
188 shared_ptr<IDatabaseAdapter> a = pool.GetAdapter();
(gdb) l
183
184 boost::mutex::scoped_lock lock( pool_mutex );
185
186 cout << "Finding available adapter. DataSource: " << dataSource.name << " (" << &dataSource << ")" << endl;
187
188 shared_ptr<IDatabaseAdapter> a = pool.GetAdapter();
189
190 if ( a.get() == NULL )
191 a = CreateAdapter( dataSource );
192

[END GDB session]

NOTE: pool.GetAdapter() returns either a shared_ptr to an available
adapter, or an empty shared_ptr.
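
To make that contract concrete, here is a minimal sketch of a pool whose GetAdapter() behaves as described. It is not the real pool code: AdapterPool, Adapter, Add and the in_use flag are hypothetical, and std::shared_ptr and std::mutex stand in for the Boost types so that the sketch is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for IDatabaseAdapter.
struct Adapter { bool in_use = false; };

// Hypothetical pool; the real implementation is not shown in this post.
class AdapterPool {
public:
    void Add(std::shared_ptr<Adapter> a)
    {
        assert(a && "pool entries must be non-null");
        std::lock_guard<std::mutex> lock(mutex_);
        adapters_.push_back(std::move(a));
    }

    // Returns a shared_ptr to an available adapter, or an empty
    // shared_ptr when every adapter is already in use.
    std::shared_ptr<Adapter> GetAdapter()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        for (std::size_t i = 0; i < adapters_.size(); ++i) {
            if (!adapters_[i]->in_use) {
                adapters_[i]->in_use = true;
                return adapters_[i];  // the copy is made while the lock is held
            }
        }
        return std::shared_ptr<Adapter>();
    }

private:
    std::mutex mutex_;
    std::vector<std::shared_ptr<Adapter>> adapters_;
};
```

In this sketch the caller checks the result the same way line 190 of the listing does: an empty result means a new adapter has to be created.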
