Discussion:
[Wt-interest] Possible data races using wt-3.3.4-rc1 with boost 1_57_0
Joe VanAndel
2015-02-19 00:14:49 UTC
I’m using valgrind 3.10.1 to look for races in my Wt application.

When I run valgrind’s helgrind thread tool on the Wt example ‘blog’
application, it warns me of several races.

/usr/local/bin/valgrind --tool=helgrind --verbose \
--suppressions=/home/vanandel/valgrind/wt_valgrind_suppressions \
--suppressions=/home/vanandel/valgrind/boost_thread.sup \
--gen-suppressions=yes \
--vgdb=yes --vgdb-error=0 \
../../build_boost_1_57_0/examples/blog/blog.wt --docroot . \
--http-address 0.0.0.0 --http-port 8081

FYI: I get similar warnings from helgrind if I compile against boost
1_56_0.

I’ve attached two of the races that helgrind detects.
wt_boost_race1 <http://pastebin.com/831vMuK5>


The following race condition is quite interesting:
It appears that a http::server::TcpConnection object is being destroyed as
a result of a boost::shared_ptr being destroyed, while another thread is
still using the TcpConnection object.

Since boost::shared_ptr is supposed to be thread safe, it isn’t obvious how
this could happen.

wt_boost_TcpConnec <http://pastebin.com/zErMzYvL>

I’d be happy to post more examples of race conditions detected by helgrind.






Joe VanAndel
NCAR/EOL
Koen Deforche
2015-02-19 20:22:10 UTC
Hey,
Post by Joe VanAndel
I’ve attached two of the races that helgrind detects.
wt_boost_race1 <http://pastebin.com/831vMuK5>
This one would suggest that boost::lexical_cast<> is not thread-safe
because of locale()? That can't be right?
Post by Joe VanAndel
It appears that a http::server::TcpConnection object is being destroyed as
a result of a boost::shared_ptr being destroyed, while another thread is
still using the TcpConnection object.
Since boost::shared_ptr is supposed to be thread safe, it isn’t obvious
how this could happen.
wt_boost_TcpConnec
 <http://pastebin.com/zErMzYvL>
Also this does not make sense: shared_ptr guarantees that an object will
not be destroyed as long as other references exist?

Koen
Stefan Ruppert
2015-02-20 08:26:59 UTC
Hello all,
Post by Koen Deforche
Hey,
Post by Joe VanAndel
I’ve attached two of the races that helgrind detects.
wt_boost_race1 <http://pastebin.com/831vMuK5>
Post by Koen Deforche
This one would suggest that boost::lexical_cast<> is not thread-safe
because of locale()? That can't be right?

In my experience, helgrind sometimes reports false positives. However, I
don't know the locale() code; if it uses global variables, this data
race can happen. Note that no mutex is used at all!

Most likely these data races can be avoided if global data is
initialized in the main thread before any other thread is started! Maybe
call lexical_cast() in main() once?
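
The warm-up idea could look roughly like the sketch below. It uses only the standard library rather than boost::lexical_cast, on the assumption that the state in question lives in the stream and locale machinery that a stream-based conversion touches. The helper names (to_text, warm_up_conversions, convert_in_threads) are made up for illustration and are not part of Wt or Boost.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: converts through a stream, which consults the
// global std::locale when the stream is constructed.
std::string to_text(int value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Hypothetical helper: exercise the conversion path once while the
// program is still single-threaded, so that any lazily initialized
// global state is set up before worker threads exist.
void warm_up_conversions() {
    (void)std::locale();  // touch the global locale
    (void)to_text(0);     // run one full conversion
}

// Runs the same conversion concurrently. Each thread writes only its own
// slot of the result vector, so the sketch itself shares no mutable data.
std::vector<std::string> convert_in_threads(int count) {
    warm_up_conversions();  // assumption: called before any thread starts
    std::vector<std::string> results(count);
    std::vector<std::thread> workers;
    for (int i = 0; i < count; ++i) {
        workers.emplace_back([&results, i] { results[i] = to_text(i); });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return results;
}
```

In a real application the call to warm_up_conversions() would sit at the top of main(), before the server starts its thread pool. Whether this actually silences the helgrind report depends on where the flagged access really is, so treat it as an experiment, not a fix.
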
Post by Joe VanAndel
It appears that a http::server::TcpConnection object is being
destroyed as a result of a boost::shared_ptr being destroyed, while
another thread is still using the TcpConnection object.
Since boost::shared_ptr is supposed to be thread safe, it isn’t
obvious how this could happen.
wt_boost_TcpConnec… <http://pastebin.com/zErMzYvL>
Post by Koen Deforche
Also this does not make sense: shared_ptr guarantees that an object will
not be destroyed as long as other references exist?

Helgrind data race reports should not be read as two accesses happening
at the same time. The report should be interpreted as follows: at some
time memory at 0x51414C6 was read from thread #3, and at a later time
memory at 0x51414C6 was written by thread #10 ***without using a mutex***!!!
It does not say anything about the lifetime of objects.

However, the write occurs within the destructor, so no other thread should
'see' this change in memory. I would therefore suppress this warning.
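
To make the read-then-destructor-write pattern concrete, here is a small sketch using std::shared_ptr instead of boost::shared_ptr. Connection is a hypothetical stand-in and not Wt's TcpConnection. Each worker reads through its own copy of the pointer without any mutex, and whichever owner releases the last reference runs the destructor. The ordering between the reads and the destructor comes from the atomic reference count, not from a lock, which may be why a lock-oriented checker flags it.

```cpp
#include <atomic>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a connection object.
struct Connection {
    explicit Connection(std::atomic<int>& destroyed) : destroyed_(destroyed) {}
    ~Connection() { id_ = 0; destroyed_.fetch_add(1); }  // write inside the dtor
    int id() const { return id_; }                       // read without a mutex
    int id_ = 42;
    std::atomic<int>& destroyed_;
};

// Returns {successful reads, destructor runs} after all threads have joined.
std::pair<int, int> run_shared_owner_demo(int workers) {
    std::atomic<int> destroyed{0};
    std::atomic<int> reads_ok{0};
    std::vector<std::thread> threads;
    {
        auto conn = std::make_shared<Connection>(destroyed);
        for (int i = 0; i < workers; ++i) {
            // The lambda captures its own shared_ptr copy, so every worker
            // is an owner for as long as it runs.
            threads.emplace_back([conn, &reads_ok] {
                if (conn->id() == 42) reads_ok.fetch_add(1);
            });
        }
        // The creating thread drops its reference here, possibly while
        // workers are still reading.
    }
    for (auto& t : threads) {
        t.join();
    }
    return {reads_ok.load(), destroyed.load()};
}
```

With N workers the function should report N successful reads and exactly one destructor run, whichever thread happens to release last.
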

It is also interesting that it occurs in compiler-generated code!?
I don't see any destructor in the TcpConnection class!?

Regards,
Stefan
