Thursday, February 16, 2012

shared_from_this causing Exception tr1::bad_weak_ptr

I've been having a rotten week, with my boost::asio program keep giving me a segmentation fault... and it is not even doing the real work yet. It crashes when a client disconnects. The error message is:

   Exception: tr1::bad_weak_ptr

My code has now been littered with debug lines, lots of them showing usage counts of the shared_ptr in the hope of tracking down at what stage it is going wrong:

   std::cout << "this.use_count=" << shared_from_this().use_count() << "\n";

If that is unfamiliar, my class is defined like this:
   class client_session :
     public boost::enable_shared_from_this< client_session >{ ... }

This allows me to pass around shared pointers to this, from inside the class being pointed at, and is one of the essential tools you need to do anything useful with boost::asio.

I've been reading tutorials, studying other people's code, and progressively adding more shared pointers around objects that I am sure do not really need it. Nothing would shake it.

My code is using cross-references: the client connection object stores a vector of references to the data sources it uses, and each data source stores a vector of references to the clients subscribed to it. When I say reference I mean it holds a smart pointer instance. It is not that complicated but surely the problem must be in that cross-referencing? So, in desperation I deleted the entire data source class, and all that subscribing and unsubscribing code. Eh? It still crashes.

But then I noticed this code:
    std::cout << "In client_session destructor (this.use_count="
        << shared_from_this().use_count() << ")\n";
    unsubscribe_from_all(); //'cos we'll no longer be valid after this

I knew (!!) the problem was not in the destructor, but had to (!!) be before that point, because that first debug line was never reached. If you're already laughing at me, have a healthy helping of kudos. Yes, it was that call to shared_from_this() causing the crash! I was reaching the destructor, but crashing before it could print my debug line.

You see, in C++, an object does not really exist until the end of the constructor, and does not really exist when you enter the destructor. You must not use shared_from_this() in the destructor, or in any function called from the destructor. And when I thought again about what unsubscribe_from_all() (which was also calling shared_from_this()) does, I realized the destructor could not ever be called if any data sources still have a reference to us. So that call is not needed. The destructor code became:

    std::cout << "In client_session constructor.\n";

...and the crashes went away.

There is something very, very annoying knowing the bug I've been chasing for *two solid days* was in the debug code I added to track down the bug.