Feature #10200

Symbol list/count API with Symbol GC

Added by Koichi Sasada 11 months ago. Updated 11 months ago.

[ruby-core:64732]
Status: Closed
Priority: Normal
Assignee: Yukihiro Matsumoto

Description

Abstract

We need to consider the specification of the "Symbol.all_symbols" method because of Symbol GC.

Background

Symbol.all_symbols returns an array that includes all symbols in this Ruby interpreter process.

"a#{1+2}b".to_sym
p Symbol.all_symbols.last #=> :a3b (the order of this array is implementation dependent)

However, Ruby 2.2 will introduce symbol GC.
With symbol GC, dynamically created symbols can be collected like this:

"a#{1+2}b".to_sym
p Symbol.all_symbols.last #=> :a3b
GC.start
p Symbol.all_symbols.last #=> :$-a  <- :a3b is collected
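
Because the order of the array is implementation dependent, membership is a more robust check than `.last`. The following is a minimal sketch, not part of the original report: the `gc_demo_` names and the variables are hypothetical. Whether an unreferenced symbol really disappears after GC.start depends on the implementation and on GC timing, so the sketch only checks the case that should always hold, namely that symbols which are still referenced stay listed.

```ruby
# Build the names at run time so that the symbols are created dynamically.
names = (1..100).map { |i| "gc_demo_#{i}_sym" }

# Keep references to the symbols so that they cannot be collected.
held = names.map(&:to_sym)

GC.start

# Referenced symbols must survive GC and remain in Symbol.all_symbols.
missing = held - Symbol.all_symbols
p missing.empty?
```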

The Symbol class has another API, Symbol.find(), which gets a symbol from a corresponding string object, like this:

str = "a#{1+2}b"
str.to_sym
p Symbol.all_symbols.last #=> :a3b
p Symbol.find(str)        #=> :a3b
GC.start
p Symbol.all_symbols.last #=> :$-a  <- :a3b is collected
p Symbol.find(str)        #=> nil
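
For readers on a Ruby build that does not provide Symbol.find, the lookup can be approximated in plain Ruby on top of Symbol.all_symbols. This is only a sketch under that assumption: `find_symbol` is a hypothetical helper, not an existing API, and it scans the whole symbol list linearly.

```ruby
# Hypothetical helper: return the existing symbol named `name`, or nil.
# Unlike String#to_sym, it never creates a new symbol.
def find_symbol(name)
  Symbol.all_symbols.find { |sym| sym.to_s == name }
end

str = "find_demo_#{1 + 2}_sym"
before = find_symbol(str)  # looked up before any symbol with this name is created
sym = str.to_sym           # create the symbol and keep a reference to it
after = find_symbol(str)   # looked up again while `sym` keeps it alive

p before
p after
```

Since the second lookup happens while `sym` is still referenced, it does not depend on GC timing.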

Symbol GC separates all symbols into two types (because of implementation details):

  • (1) Collectable symbols
  • (2) Uncollectable symbols (we cannot free them even if there are no references to these symbols)

Now, Symbol.all_symbols returns (1) + (2).

The Symbol.all_symbols and Symbol.find methods assume that all symbols are immortal (that is, that only (2) exists). However, this assumption no longer holds, because (1) has been added.

Symbol.count is proposed to count (2) symbols.
Currently, we don't have any way to count (2), because Symbol.all_symbols.size returns the number of (1) and (2) symbols together.

Discussion

There may be several possibilities:

(a) No change (Symbol.all_symbols and Symbol.find handle (1) + (2) symbols)
(b) Symbol.all_symbols and Symbol.find handle only (2) symbols
(c) Add a new parameter to Symbol.all_symbols and Symbol.find to specify (2) or (1) + (2).

(b) and (c) are reasonable given the recent usage of these APIs, which is to find immortal objects.
However, Symbol GC reduces the danger of DoS attacks that use a huge number of immortal objects.

Thoughts?

History

#1 Updated by Koichi Sasada 11 months ago

  • Description updated (diff)

#2 Updated by Akira Tanaka 11 months ago

I feel ObjectSpace.count_objects can be extended to return the number of symbols.

#3 Updated by Koichi Sasada 11 months ago

Akira Tanaka wrote:

I feel ObjectSpace.count_objects can be extended to return the number of symbols.

Make a new method? Or return a hash object with new types like T_SYMBOL_MORTAL and T_SYMBOL_IMMORTAL?

Currently, ObjectSpace.count_objects returns the number of (1) + (2) - [number of statically created symbols (immediate symbol values)].
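
The relation above can be sketched as follows. This is an illustration under assumptions, not a specification: it assumes CRuby 2.2 or later, where symbols created at run time are heap objects reported under the `:T_SYMBOL` key, and the `count_demo_` names are hypothetical. The difference printed at the end is only a rough estimate of the statically created symbols, because count_objects may also count heap symbols that are garbage but not yet swept.

```ruby
# Keep references so that the dynamically created symbols stay alive.
held = (1..50).map { |i| "count_demo_#{i}_sym".to_sym }

# Symbols stored as heap objects; the key is absent when the count is zero.
heap_symbols = ObjectSpace.count_objects.fetch(:T_SYMBOL, 0)

# All symbols known to the interpreter: (1) + (2).
all_symbols = Symbol.all_symbols.size

puts "heap symbol objects: #{heap_symbols}"
puts "all symbols:         #{all_symbols}"
puts "rough static count:  #{all_symbols - heap_symbols}"
```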

#4 Updated by Akira Tanaka 11 months ago

New hash entries.

I'm not sure that "T_" prefix is appropriate here, though.

#5 Updated by Yukihiro Matsumoto 11 months ago

This request is a bit vague.

As a result of the developers' meeting on 2014-09-04, we will:

  • keep Symbol.all_symbols as it is.
  • remove Symbol.find(name).

Matz.

#6 Updated by Yukihiro Matsumoto 11 months ago

  • Status changed from Open to Closed
