Project

General

Profile

Actions

Misc #10278

closed

[RFC] st.c: use ccan linked list

Added by normalperson (Eric Wong) over 9 years ago. Updated about 6 years ago.


Description

Mainly posting this for documentation purposes because it seems like
an obvious thing to try given we have ccan/list nowadays.

Having shorter code along branchless insert/delete, and using a common
linked-list API is very appealing.

On the other hand, benchmark results are a mixed bag:

http://80x24.org/bmlog-20140922-032221.13002

Also, I may have introduced new bugs the tests didn't catch.
The st_foreach* functions get a bit strange when dealing with
packed-to-unpacked transitions while iterating.

Great thing: bighash is faster (as expected) because of branchless
linked list insertion. However, the major speedup in bighash probably
isn't too important, most hashes are small and users never notice.

vm2_bighash*	1.222

Also, we could introduce rb_hash_new_with_size() for use insns.def
(newhash) if people really care about the static bighash case (I don't
think many do).

Real regressions, iteration seems more complex because loop conditions
are more complex :<

hash_keys	0.978
hash_values	0.941

However, hash_keys/values regressions are pretty small.

Things that worry me:

vm1_attr_ivar*	0.736
vm1_attr_ivar_set*	0.845

WTF? I reran the attr_ivar tests, and the numbers got slightly better:

 ["vm1_attr_ivar",
  [[1.851297842,
    1.549076322,
    1.623306027,
    1.956916541,
    1.533218607,
    1.554089054,
    1.702590516,
    1.789863782,
    1.711815018,
    1.851260599],
   [1.825423191,
    1.824934062,
    1.542471471,
    1.868502091,
    1.79106375,
    1.884568825,
    1.850712387,
    1.797538962,
    2.165696827,
    1.866671482]]],
 ["vm1_attr_ivar_set",
  [[1.926496052,
    2.04742421,
    2.025571131,
    2.047656291,
    2.043747069,
    2.099586827,
    1.953769267,
    2.017580504,
    2.440432603,
    2.111254634],
   [2.365839125,
    2.076282818,
    2.112784977,
    2.118754445,
    2.091752673,
    2.161164561,
    2.107439445,
    2.128147747,
    2.945295069,
    2.131679632]]]]

Elapsed time: 91.963235593 (sec)
-----------------------------------------------------------
benchmark results:
minimum results in each 10 measurements.
Execution time (sec)
name	orig	stll
loop_whileloop	0.672	0.670
vm1_attr_ivar*	0.861	0.872
vm1_attr_ivar_set*	1.255	1.406

Speedup ratio: compare with the result of `orig' (greater is better)
name	stll
loop_whileloop	1.002
vm1_attr_ivar*	0.987
vm1_attr_ivar_set*	0.892

Note: these tests do not even hit st, and even if they did, these are
tiny tables which are packed so the linked-list implementation has no
impact (especially not on lookup tests)

So yeah, probably something messy with the CPU caches.
I always benchmark with the performance CPU governor, and the
rerun ivar numbers are run with CPU pinned to a single core.
CPU: AMD FX-8320 Maybe I can access my other systems later.


Files

0001-st.c-use-ccan-linked-list.patch (13.1 KB) 0001-st.c-use-ccan-linked-list.patch normalperson (Eric Wong), 09/22/2014 05:32 AM
0001-st.c-use-ccan-linked-list-v2.patch (16.8 KB) 0001-st.c-use-ccan-linked-list-v2.patch normalperson (Eric Wong), 06/24/2015 08:20 AM

Related issues 1 (0 open1 closed)

Related to Ruby master - Feature #10321: [PATCH] test st_foreach{,_check} for packed-to-unpack changeClosednormalperson (Eric Wong)10/03/2014Actions
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0