Feature #13725

[PATCH] Hash#[]= deduplicates string keys if (and only if) fstring exists

Added by normalperson (Eric Wong) almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-core:81942]

Description

Hash#[]= deduplicates string keys if (and only if) fstring exists

In typical applications, hash entries are read after being
written. Blindly writing to hashes that are never read
makes little sense. So, for any hash that is read from, an
fstring entry for the key should already exist.

We no longer blindly create fstrings when code sets random
hash keys, which prevents the performance regression seen
in the reverted r43870.

Regarding https://bugs.ruby-lang.org/issues/9188, this has
minimal impact on bm_so_k_nucleotide, where hash keys are set
and not reused: performance is within 1-2% of existing cases.

  • hash.c: #include gc.h for rb_objspace_garbage_object_p
    (hash_aset_str): do read-only check of fstring table and reuse
    fstring if it exists and is still alive (not garbage)
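
The effect described above can be observed from Ruby code. The following is only a sketch, not part of the patch: `observe_key` is a hypothetical helper, and `String#-@` is assumed to deduplicate (Ruby 2.5 and later) so that an fstring for the key already exists and stays alive. The frozen-copy behavior of Hash#[]= is long-standing; whether the stored key is the same object as the interned one depends on the Ruby version, so the last line prints the result rather than assuming it.

```ruby
# Store a value under a fresh, mutable copy of `name` and report what
# Hash#[]= actually kept as the key. `observe_key` is a hypothetical helper.
def observe_key(name, value = 1)
  original = String.new(name)   # mutable string, not interned
  hash = {}
  hash[original] = value
  stored = hash.keys.first
  {
    original: original,
    stored: stored,
    frozen: stored.frozen?,            # Hash#[]= freezes its copy of the key
    copied: !stored.equal?(original),  # the caller's string is left alone
  }
end

# Keep a reference so an interned copy exists and is not garbage.
# Assumption: String#-@ deduplicates, as it does on Ruby 2.5 and later.
interned = -String.new("user_id")

result = observe_key("user_id")
puts "stored key frozen?    #{result[:frozen]}"
puts "stored key is a copy? #{result[:copied]}"
# Version-dependent: reusing the interned object is what this change adds.
puts "reused interned key?  #{result[:stored].equal?(interned)}"
```

If the last line reports `false`, the interpreter predates this change or applies a different deduplication policy.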

Related issues

Related to Ruby trunk - Bug #13857: frozen string literal: can freeze same string into two unique frozen strings (Rejected)

Associated revisions

Revision e205304a
Added by normal almost 2 years ago

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59304 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 59304
Added by normalperson (Eric Wong) almost 2 years ago

History

Updated by normalperson (Eric Wong) almost 2 years ago

I forgot to mention: this might make the proposed [Feature #13721]
("net/imap: dedupe attr keys in Net::IMAP::FetchData") obsolete,
along with the need to make similar changes down the line:
https://bugs.ruby-lang.org/issues/13721
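
To illustrate the kind of per-library change this comment refers to, here is a rough sketch. It is not Net::IMAP's code: `build_rows` and `distinct_key_objects` are hypothetical helpers, and `String#-@` is assumed to deduplicate (Ruby 2.5 and later). It compares explicit deduplication of each key against plain Hash#[]=.

```ruby
# Build one small hash per record, keyed by a freshly allocated string,
# much as a parser would. `build_rows` is a hypothetical helper.
def build_rows(count, manual_dedupe:)
  Array.new(count) do |i|
    key = String.new("UID")       # a new String object for every record
    key = -key if manual_dedupe   # explicit dedupe via String#-@
    row = {}
    row[key] = i
    row
  end
end

# Count how many distinct String objects serve as keys across all rows.
def distinct_key_objects(rows)
  rows.flat_map(&:keys).map(&:object_id).uniq.size
end

manual = distinct_key_objects(build_rows(100, manual_dedupe: true))
plain  = distinct_key_objects(build_rows(100, manual_dedupe: false))
puts "distinct key objects with manual dedupe: #{manual}"
puts "distinct key objects with plain []=:     #{plain}"
```

With manual deduplication every row shares one key object. When plain Hash#[]= reports the same count, the explicit `-key` step is redundant on that interpreter, which is the sense in which the other proposal might become obsolete.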

#2

Updated by Anonymous almost 2 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r59304.


#3

Updated by nobu (Nobuyoshi Nakada) over 1 year ago

  • Related to Bug #13857: frozen string literal: can freeze same string into two unique frozen strings added
