Bug #8386

OpenSSL thread safety

Added by Dirkjan Bussink about 2 years ago. Updated about 2 months ago.

[ruby-core:54900]
Status:Closed
Priority:Normal
Assignee:Martin Bosslet
ruby -v:trunk Backport:1.9.3: UNKNOWN, 2.0.0: UNKNOWN

Description

As some might know, Rubinius uses a lot of MRI's C extensions that are part of stdlib and ships them as well. Rubinius however does not have a GIL, so thread safety issues in C extensions are much more important.

This patch fixes a thread safety issue in the OpenSSL extension, to allow for it being used in a concurrent scenario. This follows from the documentation of OpenSSL:

  1. Is OpenSSL thread-safe?

Yes (with limitations: an SSL connection may not concurrently be used by multiple threads). On Windows and many Unix systems, OpenSSL automatically uses the multi-threaded versions of the standard libraries. If your platform is not one of these, consult the INSTALL file.

Multi-threaded applications must provide two callback functions to OpenSSL by calling CRYPTO_set_locking_callback() and CRYPTO_set_id_callback(), for all versions of OpenSSL up to and including 0.9.8[abc...]. As of version 1.0.0, CRYPTO_set_id_callback() and associated APIs are deprecated by CRYPTO_THREADID_set_callback() and friends. This is described in the threads(3) manpage.

This patch adds using the callback and thread_id functions so this works properly with Rubinius. This issue is not just theoretical, I've been able to reproduce crashes when using OpenSSL in different threads that have been fixed by this change. Rubinius already incorporated the change, but preferable I would not have to maintain a custom fork with these changes, but be able to use the MRI version without changes. I've already discussed the change with Martin and he saw no reason for objecting this change.

openssl_thread_safety.patch Magnifier (2.21 KB) Dirkjan Bussink, 05/10/2013 08:12 PM

backtrace.txt Magnifier (7.78 KB) Kazuki Tsujimoto, 07/06/2013 11:22 PM

Associated revisions

Revision 41806
Added by emboss almost 2 years ago

  • ext/openssl/ossl.c: Provide CRYPTO_set_locking_callback() and CRYPTO_set_id_callback() callback functions ossl_thread_id and ossl_lock_callback to ensure the OpenSSL extension is usable in multi-threaded environments. [Bug #8386]

Thanks, Dirkjan Bussink, for the patch!

Revision 41806
Added by emboss almost 2 years ago

  • ext/openssl/ossl.c: Provide CRYPTO_set_locking_callback() and CRYPTO_set_id_callback() callback functions ossl_thread_id and ossl_lock_callback to ensure the OpenSSL extension is usable in multi-threaded environments. [Bug #8386]

Thanks, Dirkjan Bussink, for the patch!

Revision 42135
Added by Koichi Sasada almost 2 years ago

  • ext/openssl/ossl.c: use system native (system provided) thread locking APIs added by last commit. This patch fixes [Bug #8386]. "rb_mutex_*" APIs control only "Ruby" threads. Not for native threads.

Revision 42135
Added by Koichi Sasada almost 2 years ago

  • ext/openssl/ossl.c: use system native (system provided) thread locking APIs added by last commit. This patch fixes [Bug #8386]. "rb_mutex_*" APIs control only "Ruby" threads. Not for native threads.

Revision 42159
Added by Koichi Sasada almost 2 years ago

  • ext/openssl/ossl.c: support additional three thread synchronization functions. [ruby-trunk - Bug #8386]

Revision 42159
Added by Koichi Sasada almost 2 years ago

  • ext/openssl/ossl.c: support additional three thread synchronization functions. [ruby-trunk - Bug #8386]

History

#2 Updated by Anonymous almost 2 years ago

  • Status changed from Open to Closed

This issue was solved with changeset r41806.
Dirkjan, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • ext/openssl/ossl.c: Provide CRYPTO_set_locking_callback() and CRYPTO_set_id_callback() callback functions ossl_thread_id and ossl_lock_callback to ensure the OpenSSL extension is usable in multi-threaded environments. [Bug #8386]

Thanks, Dirkjan Bussink, for the patch!

#3 Updated by Kazuki Tsujimoto almost 2 years ago

  • File backtrace.txtMagnifier added
  • Status changed from Closed to Assigned

=begin
Seems r41806 introduced a test failure under Ubuntu 13.04(x86_64):

$ while :; do make TESTS='openssl -j2' test-all || cat; done
[snip]
/home/k_tsj/work/ruby/lib/test/unit/parallel.rb:37: [BUG] vm_call0_cfunc_with_frame: cfp consistency error
ruby 2.1.0dev (2013-07-06 trunk 41806) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0004 p:---- s:0010 e:000009 CFUNC :read
c:0003 p:---- s:0008 e:000007 CFUNC :new
c:0002 p:0041 s:0005 e:000004 BLOCK /home/k_tsj/work/ruby/lib/test/unit/parallel.rb:37 [FINISH]
c:0001 p:---- s:0002 e:000001 TOP [FINISH]

/home/k_tsj/work/ruby/lib/test/unit/parallel.rb:37:in block in _run_suite'
/home/k_tsj/work/ruby/lib/test/unit/parallel.rb:37:in
new'
/home/k_tsj/work/ruby/lib/test/unit/parallel.rb:37:in `read'

I've attached the backtrace log.

Martin, can you check it?
=end

#4 Updated by Martin Bosslet almost 2 years ago

ktsj (Kazuki Tsujimoto) wrote:

I've attached the backtrace log.

Martin, can you check it?

Sure, thanks for the heads up!

#5 Updated by Martin Bosslet almost 2 years ago

Ok, this is gonna be fun... The error occurs in a test that triggers RSA key generation, which was implemented to run with 'rb_thread_call_without_gvl'. The OpenSSL callbacks for thread safety make use of rb_mutex_lock/rb_mutex_unlock. My initial guess is that there's a problem because the calling code runs without the GVL, but I need to analyze further to make a more educated guess.

It would help a lot if I had something isolated that reproduces the error (at least with high probability).

@ktsj Does the error occur every time you run the tests in parallel? Or only sporadically?

#6 Updated by Kazuki Tsujimoto almost 2 years ago

MartinBosslet (Martin Bosslet) wrote:

@ktsj Does the error occur every time you run the tests in parallel? Or only sporadically?

The latter (it is about 30-40% reproducible on my environment).

#7 Updated by Martin Bosslet almost 2 years ago

ktsj (Kazuki Tsujimoto) wrote:

MartinBosslet (Martin Bosslet) wrote:

@ktsj Does the error occur every time you run the tests in parallel? Or only sporadically?

The latter (it is about 30-40% reproducible on my environment).

Thanks! Ok, that seems good enough. I'll try to extract something in isolation from that.

#8 Updated by Koichi Sasada almost 2 years ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r42135.
Dirkjan, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • ext/openssl/ossl.c: use system native (system provided) thread locking APIs added by last commit. This patch fixes [Bug #8386]. "rb_mutex_*" APIs control only "Ruby" threads. Not for native threads.

#9 Updated by Koichi Sasada almost 2 years ago

Maybe I fixed it.
Please check it.

#10 Updated by Yui NARUSE almost 2 years ago

ko1 (Koichi Sasada) wrote:

Maybe I fixed it.
Please check it.

I think dyn_lock_function and others also should be provided.
http://www.openssl.org/docs/crypto/threads.html

#11 Updated by Martin Bosslet almost 2 years ago

@ko1: Thank you so much for that! I think that functionality was exactly what was needed. The error happens when RSA keys are created. The code runs without the GVL in effect, and I believe the error was resulting from the fact that a different native thread was running.

@naruse: You're right, and I also believe it would make sense to distinguish between more "modes" in the callback. I'll try to give it another try over the weekend. But please, if you have something (planned) already, let me know and we could try to combine our efforts!

#12 Updated by Koichi Sasada almost 2 years ago

(2013/07/24 19:37), MartinBosslet (Martin Bosslet) wrote:

@naruse: You're right, and I also believe it would make sense to distinguish between more "modes" in the callback. I'll try to give it another try over the weekend. But please, if you have something (planned) already, let me know and we could try to combine our efforts!

I did. Please check it.

--
// SASADA Koichi at atdot dot net

#13 Updated by Torben Knerr about 2 months ago

Might #11033 be a regression of this?

Also available in: Atom PDF