https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112017-04-28T08:08:59ZRuby Issue Tracking SystemRuby master - Feature #13517: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bithttps://bugs.ruby-lang.org/issues/13517?journal_id=645282017-04-28T08:08:59Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p><a href="mailto:normalperson@yhbt.net" class="email">normalperson@yhbt.net</a> wrote:</p>
<blockquote>
<p><a href="https://bugs.ruby-lang.org/issues/13517" class="external">https://bugs.ruby-lang.org/issues/13517</a></p>
</blockquote>
<p>For who care about 32-bit, single-core x86, here are my<br>
Pentium M (Centrino) @ 1.6GHz numbers:</p>
<p>Size reduction of Mutex on 32-bit is 112 => 40 bytes</p>
<p>minimum results in each 3 measurements.<br>
Execution time (sec)<br>
name trunk built<br>
loop_whileloop2 0.554 0.554<br>
vm2_mutex* 3.136 2.217<br>
vm_thread_mutex1 2.783 2.186<br>
vm_thread_mutex2 2.907 2.174<br>
vm_thread_mutex3 9.740 2.586</p>
<p>Speedup ratio: compare with the result of `trunk' (greater is better)<br>
name built<br>
loop_whileloop2 0.999<br>
vm2_mutex* 1.414<br>
vm_thread_mutex1 1.273<br>
vm_thread_mutex2 1.337<br>
vm_thread_mutex3 3.766</p>
<p>In the future, I think the cond_waiting flag can be moved into<br>
a FL_USER flag, too.</p>
<p>But I also want to try similar changes to avoid Array usage in<br>
Queue, SizedQueue, and ConditionVariable classes and rely on<br>
ccan/list + stack for waiters. I will convert from T_STRUCT to<br>
T_DATA.</p> Ruby master - Feature #13517: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bithttps://bugs.ruby-lang.org/issues/13517?journal_id=646352017-05-02T09:12:38Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p><a href="mailto:normalperson@yhbt.net" class="email">normalperson@yhbt.net</a> wrote:</p>
<blockquote>
<p>Feature <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bit (Closed)" href="https://bugs.ruby-lang.org/issues/13517">#13517</a>: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bit<br>
<a href="https://bugs.ruby-lang.org/issues/13517" class="external">https://bugs.ruby-lang.org/issues/13517</a></p>
</blockquote>
<p>Any comment? I would like to commit this, soon.</p>
<p>Thanks.</p> Ruby master - Feature #13517: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bithttps://bugs.ruby-lang.org/issues/13517?journal_id=646362017-05-02T09:21:25Zko1 (Koichi Sasada)
<ul></ul><p>At a glance, it seems nice.<br>
But I need to time to check deeply.<br>
I'll check with 'Misc <a class="issue tracker-2 status-1 priority-4 priority-default" title="Feature: [PATCH] thread_pthread.c (native_sleep): preserve old unblock function (Open)" href="https://bugs.ruby-lang.org/issues/13514">#13514</a>'.</p>
<p>Please wait these days. In Japan, now we have holiday week. I'll check on these days.</p>
<p>Thanks,<br>
Koichi</p> Ruby master - Feature #13517: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bithttps://bugs.ruby-lang.org/issues/13517?journal_id=646882017-05-07T19:08:35Zko1 (Koichi Sasada)
<ul></ul><p>sorry for late response.<br>
I have no objection about this patch. thank you.</p>
<p>one question.</p>
<pre><code> list_for_each_safe(&mutex->waitq, cur, next, node) {
list_del_init(&cur->node);
switch (cur->th->state) {
case THREAD_KILLED:
continue;
case THREAD_STOPPED:
case THREAD_RUNNABLE:
case THREAD_STOPPED_FOREVER:
rb_threadptr_interrupt(cur->th);
goto found;
}
}
</code></pre>
<p><code>rb_mutex_lock()</code> set <code>th->status</code> as <code>THREAD_STOPPED_FOREVER</code> before<br>
native sleep, but the above code quoted from <code>rb_mutex_unlock_th()</code>.</p>
<p>What kind of situation do you assume when the thread status is other<br>
than <code>THREAD_STOPPED_FOREVER</code>?</p>
<p>Thanks,<br>
Koichi</p>
<p>--<br>
// SASADA Koichi at atdot dot net</p> Ruby master - Feature #13517: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bithttps://bugs.ruby-lang.org/issues/13517?journal_id=646892017-05-07T23:11:42Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>SASADA Koichi <a href="mailto:ko1@atdot.net" class="email">ko1@atdot.net</a> wrote:</p>
<blockquote>
<p>sorry for late response.<br>
I have no objection about this patch. thank you.</p>
<p>one question.</p>
<pre><code> list_for_each_safe(&mutex->waitq, cur, next, node) {
list_del_init(&cur->node);
switch (cur->th->state) {
</code></pre>
</blockquote>
<p>Oops, that should be status, not state:</p>
<pre><code> switch (cur->th->status) {
</code></pre>
<blockquote>
<pre><code> case THREAD_KILLED:
continue;
case THREAD_STOPPED:
case THREAD_RUNNABLE:
case THREAD_STOPPED_FOREVER:
rb_threadptr_interrupt(cur->th);
goto found;
}
</code></pre>
<p>}</p>
<pre><code>`rb_mutex_lock()` set `th->status` as `THREAD_STOPPED_FOREVER` before
native sleep, but the above code quoted from `rb_mutex_unlock_th()`.
What kind of situation do you assume when the thread status is other
than `THREAD_STOPPED_FOREVER`?
</code></pre>
</blockquote>
<p>Back to your original question. THREAD_RUNNABLE is possible<br>
if somebody uses Thread#run:</p>
<p>require 'thread'<br>
m = Mutex.new<br>
th = Thread.new do<br>
sleep 0.1 # wait for main thread to get lock<br>
m.synchronize do<br>
sleep<br>
end<br>
end</p>
<p>m.synchronize do<br>
sleep 0.2 # wait for th to block on m.synchronize<br>
th.run<br>
end</p>
<p>I am not sure about other statuses. Maybe exit/GC can trigger<br>
THREAD_KILLED, the mutex_free->rb_mutex_unlock_th call chain<br>
looks like it might due to GC ordering. Anyways, I will add<br>
comments here when I commit.</p>
<blockquote>
<p>Thanks,<br>
Koichi</p>
</blockquote>
<p>Thank you for the review!</p> Ruby master - Feature #13517: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bithttps://bugs.ruby-lang.org/issues/13517?journal_id=646902017-05-07T23:41:15Zko1 (Koichi Sasada)
<ul></ul><p>On 2017/05/08 8:08, Eric Wong wrote:</p>
<blockquote>
<p>Back to your original question. THREAD_RUNNABLE is possible<br>
if somebody uses Thread#run:</p>
<pre><code>require 'thread'
m = Mutex.new
th = Thread.new do
sleep 0.1 # wait for main thread to get lock
m.synchronize do
</code></pre>
<p>sleep<br>
end<br>
end</p>
<pre><code>m.synchronize do
sleep 0.2 # wait for th to block on m.synchronize
th.run
end
</code></pre>
</blockquote>
<p>I also confirm that this code set <code>THREAD_RUNNABLE</code>. However, <code>th</code> waits<br>
locking forever, current <code>Thread#run</code> should be bug. mmmm. But not so<br>
serious because it is only small period (maybe as you know). We should<br>
modify later.</p>
<blockquote>
<p>I am not sure about other statuses. Maybe exit/GC can trigger<br>
THREAD_KILLED, the mutex_free->rb_mutex_unlock_th call chain<br>
looks like it might due to GC ordering. Anyways, I will add<br>
comments here when I commit.</p>
</blockquote>
<p>I think adding rb_bug() guard is good to know the flow of such situation.</p>
<p>--<br>
// SASADA Koichi at atdot dot net</p> Ruby master - Feature #13517: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bithttps://bugs.ruby-lang.org/issues/13517?journal_id=646912017-05-08T00:18:59ZAnonymous
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset trunk|r58604.</p>
<hr>
<p>reduce rb_mutex_t size from 160 to 80 bytes on 64-bit</p>
<p>Instead of relying on a native condition variable and mutex for<br>
every Ruby Mutex object, use a doubly linked-list to implement a<br>
waiter queue in the Mutex. The immediate benefit of this is<br>
reducing the size of every Mutex object, as some projects have<br>
many objects requiring synchronization.</p>
<p>In the future, this technique using a linked-list and on-stack<br>
list node (struct mutex_waiter) should allow us to easily<br>
transition to M:N threading model, as we can avoid the native<br>
thread dependency to implement Mutex.</p>
<p>We already do something similar for autoload in variable.c,<br>
and this was inspired by the Linux kernel wait queue (as<br>
ccan/list is inspired by the Linux kernel linked-list).</p>
<p>Finaly, there are big performance improvements for Mutex<br>
benchmarks, especially in contended cases:</p>
<p>measure target: real</p>
<table>
<thead>
<tr>
<th>name</th>
<th align="right">trunk</th>
<th align="right">built</th>
</tr>
</thead>
<tbody>
<tr>
<td>loop_whileloop2</td>
<td align="right">0.149</td>
<td align="right">0.148</td>
</tr>
<tr>
<td>vm2_mutex*</td>
<td align="right">0.893</td>
<td align="right">0.651</td>
</tr>
<tr>
<td>vm_thread_mutex1</td>
<td align="right">0.809</td>
<td align="right">0.624</td>
</tr>
<tr>
<td>vm_thread_mutex2</td>
<td align="right">2.608</td>
<td align="right">0.628</td>
</tr>
<tr>
<td>vm_thread_mutex3</td>
<td align="right">28.227</td>
<td align="right">0.881</td>
</tr>
</tbody>
</table>
<p>Speedup ratio: compare with the result of `trunk' (greater is better)</p>
<table>
<thead>
<tr>
<th>name</th>
<th align="right">built</th>
</tr>
</thead>
<tbody>
<tr>
<td>loop_whileloop2</td>
<td align="right">1.002</td>
</tr>
<tr>
<td>vm2_mutex*</td>
<td align="right">1.372</td>
</tr>
<tr>
<td>vm_thread_mutex1</td>
<td align="right">1.297</td>
</tr>
<tr>
<td>vm_thread_mutex2</td>
<td align="right">4.149</td>
</tr>
<tr>
<td>vm_thread_mutex3</td>
<td align="right">32.044</td>
</tr>
</tbody>
</table>
<p>Tested on AMD FX-8320 8-core at 3.5GHz</p>
<ul>
<li>thread_sync.c (struct mutex_waiter): new on-stack struct<br>
(struct rb_mutex_struct): remove native lock/cond, use ccan/list<br>
(rb_mutex_num_waiting): new function for debug_deadlock_check<br>
(mutex_free): remove native_<em><em>destroy<br>
(mutex_alloc): initialize waitq, remove native</em></em><em>initialize<br>
(rb_mutex_trylock): remove native_mutex</em>{lock,unlock}<br>
(lock_func): remove<br>
(lock_interrupt): remove<br>
(rb_mutex_lock): rewrite waiting path to use native_sleep + ccan/list<br>
(rb_mutex_unlock_th): rewrite to wake up from native_sleep<br>
using rb_threadptr_interrupt<br>
(rb_mutex_abandon_all): empty waitq</li>
<li>thread.c (debug_deadlock_check): update for new struct<br>
(rb_check_deadlock): ditto<br>
<a href="/issues/13517">[ruby-core:80913]</a> [Feature <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: [PATCH] reduce rb_mutex_t size from 160 to 80 bytes on 64-bit (Closed)" href="https://bugs.ruby-lang.org/issues/13517">#13517</a>]</li>
</ul>