Ruby Issue Tracking System: Issues https://bugs.ruby-lang.org/ 2012-01-07T06:51:03Z (Redmine)
Ruby master - Feature #5856 (Rejected): Feature: Raise any object https://bugs.ruby-lang.org/issues/5856 2012-01-07T06:51:03Z kstephens (Kurt Stephens)
<p>Feature: Raise any object</p>
<p>= Proposal</p>
<p>The ability to raise any object that conforms to the protocol of Exception.</p>
<p>= Problem</p>
<ul>
<li>The Exception subclass hierarchy is well-established.</li>
<li>CRuby does not allow any object that behaves as an Exception to be raised; it must be a subclass of Exception.</li>
<li>3rd-party code often rescues Exception; e.g. for error recovery, retry and/or logging.</li>
<li>Users need the ability to raise objects that would not normally be rescued by <em>any</em> code;<br>
e.g.: hard timeouts or custom signal handlers in an application.</li>
</ul>
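<p>The restriction described above can be observed in current CRuby. The following is a minimal sketch; the class name DuckException is made up for illustration and only mimics part of the Exception protocol.</p>

```ruby
# A plain object that mimics part of the Exception protocol but does not
# inherit from Exception. The class name is hypothetical.
class DuckException
  attr_reader :message, :backtrace

  def initialize(message = "duck")
    @message = message
  end

  # Kernel#raise calls .exception on its first argument when it responds to it.
  def self.exception(*args)
    new(*args)
  end

  def set_backtrace(backtrace)
    @backtrace = backtrace
  end
end

outcome =
  begin
    raise DuckException, "not a real exception"
  rescue TypeError => e
    # CRuby rejects the object because it is not a kind of Exception.
    e.class
  end

puts outcome
```

<p>Even though the object answers the same messages as an Exception, raise converts the attempt into a TypeError.</p>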
<p>= Solution</p>
<ul>
<li>ruby/eval.c: Remove make_exception() assertion rb_obj_is_kind_of(mesg, rb_mRaiseable).</li>
</ul>
<p>= Implementation</p>
<ul>
<li>See attached patch or <a href="https://github.com/kstephens/ruby/tree/trunk-raise-any" class="external">https://github.com/kstephens/ruby/tree/trunk-raise-any</a>
</li>
</ul>
<p>= Example</p>
<ul>
<li>See test/ruby/test_raise_any.rb</li>
</ul>
<p>= See also</p>
<ul>
<li><a href="https://bugs.ruby-lang.org/issues/5818" class="external">https://bugs.ruby-lang.org/issues/5818</a></li>
</ul> Ruby master - Feature #5818 (Rejected): Feature: Raiseable https://bugs.ruby-lang.org/issues/5818 2011-12-28T08:17:24Z kstephens (Kurt Stephens)
<p>= Proposal</p>
<p>The ability to raise any object that is a Raiseable.</p>
<p>= Problem</p>
<ul>
<li>The Exception subclass hierarchy is well-established.</li>
<li>CRuby does not allow any object that behaves as an Exception to be raised; it must be a subclass of Exception.</li>
<li>3rd-party code often rescues Exception; e.g. for error recovery, retry and/or logging.</li>
<li>Users need the ability to raise objects that would not normally be rescued by <em>any</em> code;<br>
e.g.: hard timeouts or custom signal handlers in an application.</li>
</ul>
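<p>A small sketch of the problem as it exists today: even an error class that subclasses Exception directly (and so escapes a bare rescue) is still caught by code that rescues Exception. HardTimeout and third_party_retry are hypothetical names used only for this illustration.</p>

```ruby
# Hypothetical "hard timeout" signal. Subclassing Exception directly keeps it
# out of bare `rescue` (StandardError) handlers, but not `rescue Exception`.
class HardTimeout < Exception; end

# Stand-in for third-party code that logs and swallows every error.
def third_party_retry
  yield
  [:completed, nil]
rescue Exception => e
  [:swallowed, e.class]
end

result = third_party_retry { raise HardTimeout, "deadline exceeded" }
p result
```

<p>The timeout never reaches the application-level handler, which is the situation the proposal wants to make avoidable.</p>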
<p>= Solution</p>
<ul>
<li>A "Raiseable" module implements all of the methods currently defined in Exception.</li>
<li>Exception class includes Raiseable module.</li>
<li>ruby/eval.c: make_exception() asserts rb_obj_is_kind_of(mesg, rb_mRaiseable),<br>
instead of rb_obj_is_kind_of(mesg, rb_cException).</li>
<li>Users should avoid "rescue Raiseable" in usual circumstances.</li>
</ul>
<p>= Other Ideas not implemented here:</p>
<ul>
<li>Remove the obj_is_kind_of(mesg, rb_mRaiseable) restriction to allow pure duck-typing.</li>
<li>Clean up the ivar names (@bt, @mesg) and method names (set_backtrace).</li>
</ul>
<p>= Example</p>
<pre><code>raiseable = Class.new do
  include Raiseable
  def self.exception *args; new *args; end
end

begin
  raise raiseable, "this must be handled"
  assert(false)
rescue Exception
  assert(false)
rescue Raiseable
  assert(true)
end
</code></pre> Ruby master - Feature #5494 (Rejected): Proposal: Improved Finalizer Semantics https://bugs.ruby-lang.org/issues/5494 2011-10-28T12:20:36Z kstephens (Kurt Stephens)
<p>Proposal: Improved Finalizer Semantics:</p>
<p>ObjectSpace.define_finalizer(object, proc):<br>
** proc should have a single parameter, the object to be finalized, not its id.</p>
<p>When an object with a finalizer is no longer referenced (sweepable):</p>
<ul>
<li>The object is reconsidered to be REFERENCED until next GC.</li>
<li>Its finalizer proc(s) are called, only once, with the object as the sole argument.</li>
<li>Subsequently, the finalizer procs are removed from the object.</li>
<li>The object's memory will <em>NOT</em> be reclaimed yet, nor will its C free_proc be called,<br>
since calling the finalizer proc effectively creates new (temporary) references to the object.</li>
<li>If the finalizer did <em>NOT</em> create any additional, long-term references to the object,<br>
the object's memory and low-level C resources will be reclaimed in the next GC.</li>
</ul>
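<p>For contrast, here is a minimal sketch of the existing protocol that the proposal would replace. The Resource class and the RELEASED log are hypothetical; only ObjectSpace.define_finalizer is part of Ruby.</p>

```ruby
# Records [object_id, fd] pairs as finalizers run; purely for illustration.
RELEASED = []

class Resource
  def initialize(fd)
    @fd = fd
    # The proc must not capture `self`, or the object could never be collected.
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(fd))
  end

  # Built in a class method so the proc closes over fd only, not the instance.
  def self.finalizer_for(fd)
    proc { |object_id| RELEASED << [object_id, fd] }
  end
end

Resource.new(3)
# With the current API the finalizer receives the object's id, not the object.
# When it runs depends on the garbage collector, so nothing is asserted here.
```

<p>Because the finalizer only sees an id, any state it needs (here the file descriptor) has to be copied into the closure ahead of time. Under the proposed semantics the proc would receive the object itself.</p>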
<p>This is a simpler protocol:</p>
<ul>
<li>It removes the need for _id2ref in the finalizer procs.</li>
<li>Prevents other complications: such as GC being reinvoked within a finalizer.</li>
<li>Finalizers are invoked with the same "urgency" as before.</li>
</ul>
<p>The downside:</p>
<ul>
<li>Objects with finalizers actually live for more than one GC cycle, if they are unreferenced.</li>
<li>This is probably acceptable since the resources the finalizers "clean-up"<br>
(eg.: file descriptors in a File object) are generally more scarce than the objects holding them.</li>
</ul> Ruby master - Feature #5394 (Rejected): Anonymous Symbols, Anonymous Methods https://bugs.ruby-lang.org/issues/5394 2011-10-04T04:11:16Z kstephens (Kurt Stephens)
<p>Proposal for Anonymous Symbols and Anonymous Methods</p>
<p>Anonymous Methods (AnonMeths) can be used for complex orthogonal behaviors that dispatch by receiver class without patching core or other sensitive classes in a globally visible manner.<br>
AnonMeths are located by Anonymous Symbols (AnonSyms).<br>
AnonSyms do not have parseable names, and can only be referenced by value, limiting namespace problems and promoting encapsulation.<br>
AnonMeths are GCed once the AnonSym bound to them is GCed.<br>
AnonMeths would not appear in Object#methods, thus will not confuse introspection.</p>
<p>Assume:</p>
<pre><code>
Symbol.new() => # # an AnonSymbol that can never be parsed in ruby code.
anon_sym = Symbol.new() # an AnonSym in a variable that can be closed-over or passed by value.
</code></pre>
<p>Optional Supporting Syntax:</p>
<pre><code>
a.*anon_sym(args...) # equiv. to a.send(anon_sym, args...)
class A
def *anon_sym(args...); body...; end
end
</code></pre>
<p>equiv. to:</p>
<pre><code>
class A
define_method(anon_sym) {| args... | body... }
end
</code></pre>
<p>AnonSyms are not added directly to a Module's internal symbol-to-method table.<br>
Instead, each AnonSym has an internal module-to-method table that is GCed when the AnonSym is GCed.</p>
<pre><code>
rcvr.send(anon_sym, ...)
</code></pre>
<p>will use anon_sym's module-to-method table to locate a method based on the receiver's usual module lookup chain.</p>
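<p>The lookup described above can be approximated in plain Ruby today with an explicit selector object that owns its own module-to-method table. This is only a sketch of the idea: AnonSelector, define and send_to are hypothetical names, and the proposed Symbol.new and *sel syntax do not exist.</p>

```ruby
# Emulates an anonymous selector: methods live in the selector's own table,
# keyed by module, instead of in each module's method table.
class AnonSelector
  def initialize
    @methods = {} # Module => Proc
  end

  # Bind a method body for receivers that are a kind of +mod+.
  def define(mod, &body)
    @methods[mod] = body
    self
  end

  # Walk the receiver's ancestor chain and run the first matching body
  # with +receiver+ as self.
  def send_to(receiver, *args)
    owner = receiver.class.ancestors.find { |mod| @methods.key?(mod) }
    raise NoMethodError, "no anonymous method for #{receiver.class}" unless owner
    receiver.instance_exec(*args, &@methods[owner])
  end
end

sel = AnonSelector.new
sel.define(Array)  { |visitor| each { |elem| sel.send_to(elem, visitor) } }
sel.define(Object) { |visitor| visitor.call(self) }

visited = []
sel.send_to([1, [2, 3]], ->(obj) { visited << obj })
p visited
```

<p>Nothing is added to Array or Object, and the table becomes garbage once sel is unreachable, which mirrors the encapsulation and GC behavior the proposal asks for.</p>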
<p>Example Application:</p>
<p>Typical visitor pattern that pollutes Array and Object method namespaces:</p>
<pre><code>
class Array; def visit(visitor); each { | elem | elem.visit(visitor); } end; end
class Object; def visit(visitor); visitor.something(self); end; end
</code></pre>
<p>Functional alternative using "case ...; when ...":</p>
<pre><code>
def visit(obj, visitor)
case obj
when Array
obj.each { | elem | visit(elem, visitor) }
else
visitor.something(obj)
end
end
</code></pre>
<p>AnonMeth version:</p>
<pre><code>
def visit(obj, visitor)
sel = Symbol.new
class Array; def *sel(visitor); each { | elem | elem.*sel(visitor) }; end; end
class Object; def *sel(visitor); visitor.something(self); end; end
obj.*sel(visitor)
end
</code></pre>
<p>Imagine that visit() needs dynamic hooks to visit different types:</p>
<pre><code>
def visit(obj, visitor)
sel = Symbol.new
class Array; def *sel(visitor); each { | elem | elem.*sel(visitor) }; end; end
class Object; def *sel(visitor); visitor.something(self); end; end
add_visit_methods!(sel)
obj.*sel(visitor)
end
def add_visit_methods!(sel)
class Hash; def *sel(visitor); each { | k, v | v.*sel(visitor) }; end; end
...
end
</code></pre>
<p>The AnonSym send "rcvr.*sel(...)" dispatches, like a normal method send, directly to the appropriate AnonMeth for "*sel".<br>
visit() can be extended dynamically by adding more AnonMeths bound to "*sel".<br>
The functional "case ...; when..." version is difficult to extend and maintain and is likely to not perform as well as anon messages.<br>
This is similar in style to Scheme letrecs, but is object-oriented.</p>
<p>This idea could be extended to Anonymous Ivars to resolve other namespacing and encapsulation issues for mixins that require state.</p>
<p>-- Kurt Stephens</p> Ruby master - Feature #5392 (Closed): Symbol GC https://bugs.ruby-lang.org/issues/5392 2011-10-04T01:41:13Z kstephens (Kurt Stephens)
<p>I looked more into Symbol GC. The biggest problem is IDs are not VALUEs. My outburst at RubyConf was based on my stupid assumption that they were -- I was trying to attack the problem using WeakRefs.</p>
<p>If IDs were VALUEs and Symbols were allocated like any other Object, the existing GC mark and root machinery (including C stack root scans), would take care of it, with an additional sweep of the global_symbol lookup tables.</p>
<p>However, the remaining issue is IDs stored in globals. No matter what, IDs stored in C globals will need to be rb_gc_register_address(VALUE*) roots -- this means CRuby API/contract changes.</p>
<p>Adding a standalone ID mark table and a rb_gc_mark_id() function will not fix the problem of lone IDs on the C stack.</p>
<p>What was the original reason to distinguish Symbol IDs from Object VALUEs, besides making lexer tokens simple to map?<br>
Would changing IDs to be allocated VALUE objects simplify internals anyway? This change could also allow Anonymous Symbols and Anonymous Methods.</p>
<p>-- Kurt Stephens</p> Ruby master - Feature #5106 (Rejected): Is MurmurHash overkill? https://bugs.ruby-lang.org/issues/5106 2011-07-27T12:54:32Z kstephens (Kurt Stephens)
<p>st.c implements MurmurHash to compute hash table indexes (#hash).</p>
<p>Simpler hash functions may be appropriate for hash tables, esp. small tables.</p>
<p>Is there a particular reason this hash function was chosen? Is MurmurHash typically used for check-summing purposes?</p>
<p>Anybody positively averse to changing it? If so I won't bother. Otherwise I might take a crack at it.</p>
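<p>To give a sense of what a simpler function might look like, here is a sketch of 32-bit FNV-1a used to pick a bucket. FNV-1a is named only as an example of a small hash function; the post does not propose a specific replacement, and bucket_index is a hypothetical helper.</p>

```ruby
# 32-bit FNV-1a parameters.
FNV_OFFSET_BASIS = 0x811c9dc5
FNV_PRIME        = 0x01000193

# XOR each byte into the hash, then multiply by the prime, keeping 32 bits.
def fnv1a_32(string)
  string.each_byte.reduce(FNV_OFFSET_BASIS) do |hash, byte|
    ((hash ^ byte) * FNV_PRIME) & 0xffffffff
  end
end

# Map a key to a slot in a table with +table_size+ buckets.
def bucket_index(key, table_size)
  fnv1a_32(key) % table_size
end

puts bucket_index("ruby", 16)
```

<p>Whether a function like this distributes keys well enough for st.c is exactly the open question; the sketch only shows how little code is involved.</p>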
<p>-- KAS</p> Ruby master - Feature #5033 (Closed): PATCH: 1.9: gc_mark_children: Avoid gc_mark() tail recursio... https://bugs.ruby-lang.org/issues/5033 2011-07-16T16:45:23Z kstephens (Kurt Stephens)
<p>Minor GC improvement.</p>
<p>Avoid recursing into gc_mark() when "goto again;" is sufficient.</p>
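<p>The general technique, sketched in Ruby rather than C: recurse for every child except the last one, and handle the last child by looping, which plays the role of "goto again;". Node and mark are hypothetical names and do not correspond to the structures in gc.c.</p>

```ruby
# Toy object graph node: a list of children and a mark bit.
Node = Struct.new(:children, :marked)

def mark(node)
  # Each pass of the loop marks one node; already-marked nodes end the walk.
  while node && !node.marked
    node.marked = true
    *rest, last = node.children
    rest.each { |child| mark(child) } # genuine recursion for all but the last child
    node = last                       # loop instead of making a tail call
  end
end

# A long singly linked chain: every node has exactly one child.
chain = nil
100_000.times { chain = Node.new(chain ? [chain] : [], false) }
mark(chain)
```

<p>For list-like structures the stack depth stays constant, whereas a version that recursed on every child would use one frame per node.</p>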
<p>-- KAS</p> Ruby master - Feature #4990 (Closed): Proposal: Internal GC/memory subsystem API https://bugs.ruby-lang.org/issues/4990 2011-07-08T05:50:11Z kstephens (Kurt Stephens)
<p>There is significant interest in improving/altering the performance, behavior and features of MRI's GC in 1.8 and 1.9 series.</p>
<p>Proposal: MRI should support an internal GC API -- to separate MRI core from its current GC implementation,<br>
and provide hooks for additional features:</p>
<ol>
<li>Interfaces between MRI internals and any GC/allocator implementation:</li>
</ol>
<ul>
<li>stock MRI GC</li>
<li>malloc() without free() to support valgrind testing (or short-lived programs)</li>
<li>variants of stock MRI GC (<a href="http://engineering.twitter.com/2011/03/building-faster-ruby-garbage-collector.html" class="external">http://engineering.twitter.com/2011/03/building-faster-ruby-garbage-collector.html</a> and REE)</li>
<li>BDW (<a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/" class="external">http://www.hpl.hp.com/personal/Hans_Boehm/gc/</a>)</li>
<li>other collectors (<a href="https://github.com/kstephens/smal" class="external">https://github.com/kstephens/smal</a>)</li>
</ul>
<ol start="2">
<li>
<p>Support selecting GC implementations at run-time or compile time.</p>
</li>
<li>
<p>Support malloc() replacements, at run-time and/or compile time, such as:</p>
</li>
</ol>
<ul>
<li>tcmalloc</li>
<li>jemalloc</li>
</ul>
<ol start="4">
<li>Support callback hooks in allocation and GC phases to orthogonally add features, such as:</li>
</ol>
<ul>
<li>performant/correct WeakReferences and ReferenceQueues (<a href="http://redmine.ruby-lang.org/issues/4168" class="external">http://redmine.ruby-lang.org/issues/4168</a>).</li>
<li>allocation tracing/debugging.</li>
<li>instance caching (e.g.: Floats)</li>
<li>computational caching.</li>
<li>cache invalidation.</li>
<li>metrics collection.</li>
</ul>
<ol start="5">
<li>Interfaces to common features of alternate GCs:</li>
</ol>
<ul>
<li>finalization</li>
<li>weak references</li>
<li>atomic allocations (e.g.: string or binary data)</li>
<li>mostly read-only/static allocations (e.g.: code, global bindings)</li>
</ul>
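<p>As a rough illustration of the callback-hook idea in item 4, the sketch below models a registry of per-phase callbacks in Ruby. It is purely hypothetical: GcHooks, its phase names and its methods are invented here and are not taken from MRI or from the prototype patch.</p>

```ruby
# Hypothetical registry; none of these names come from MRI or the linked patch.
class GcHooks
  PHASES = %i[before_mark after_mark before_sweep after_sweep].freeze

  def initialize
    @callbacks = Hash.new { |hash, phase| hash[phase] = [] }
  end

  # Register a callback for one GC phase.
  def on(phase, &callback)
    raise ArgumentError, "unknown phase: #{phase}" unless PHASES.include?(phase)
    @callbacks[phase] << callback
    self
  end

  # Called by the (imaginary) collector when it enters or leaves a phase.
  def run(phase, *args)
    @callbacks[phase].each { |callback| callback.call(*args) }
  end
end

hooks = GcHooks.new
freed_total = 0
hooks.on(:after_sweep) { |freed| freed_total += freed } # metrics collection
hooks.run(:after_sweep, 12)
hooks.run(:after_sweep, 30)
puts freed_total
```

<p>Features such as metrics collection or cache invalidation would then be added by registering callbacks rather than by editing the collector.</p>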
<p>A prototype GC phase callback API for 1.8, REE and 1.9 is here:</p>
<p><a href="https://github.com/kstephens/ref/tree/master-mri-gc_api/patch" class="external">https://github.com/kstephens/ref/tree/master-mri-gc_api/patch</a></p>
<p>This GC API should be supported on both 1.8 and 1.9 code lines.</p> Backport187 - Backport #4493 (Closed): Patch: MRI 1.8.7: syck: fix buffer overflow when parsing Y... https://bugs.ruby-lang.org/issues/4493 2011-03-11T02:39:20Z kstephens (Kurt Stephens)
<p>=begin<br>
Certain sequences of tokens will cause syck.c to store a NULL string terminator outside the allocated p->buffer when parsing from a large YAML string, causing memory corruption leading to SEGV faults.</p>
<p>The problem was discovered by completely disabling MRI's GC, by changing gc.c:rb_newobj() to call xalloc() directly and returning immediately in gc.c:garbage_collect() and then running REE under valgrind. REE stack clearing code was also disabled. Problem was also visible by directly instrumenting syck.c with mprotect().</p>
<p>The patch is applicable to REE 1.8 and MRI 1.8.7:</p>
<ol>
<li>Replaces the confusing logic in syck.c:syck_io_str_read() with behavior similar to syck_io_file_read().</li>
<li>Enables ASSERT() by default.</li>
<li>syck_assert() now takes a string msg.</li>
<li>syck_assert() calls rb_raise() instead of calling abort().</li>
<li>Removes a bogus ASSERT() that always fails under MRI unit tests.</li>
</ol>
<p>This patch <em>does not</em> fix unterminated quoted strings that would normally raise a parsing error under psych.</p>
<p>See <a href="http://code.google.com/p/rubyenterpriseedition/issues/detail?id=66" class="external">http://code.google.com/p/rubyenterpriseedition/issues/detail?id=66</a> for patch.</p>
<p>Contact me directly for specific test cases, instrumentation patches, etc.</p>
<p>=end</p> Backport187 - Backport #3359 (Closed): Date::Format::Bag performance improvement https://bugs.ruby-lang.org/issues/3359 2010-05-28T11:36:06Z kstephens (Kurt Stephens)
<p>=begin<br>
Date::Format::Bag spends a lot of time in method_missing for a limited number of method selectors.</p>
<p>The hack below removed ~ 3% of time from an ActiveRecord/Rails app that parses and formats many Date objects.</p>
<pre>
class Date::Format::Bag
  def method_missing(sel, *args) # , &block)
    sel = sel.to_s
    t = sel.dup
    set = t.chomp!('=')
    t = t.intern
    if set
      value = @elem[t] = args[0]
      expr = "def #{sel}(arg); @elem[#{t.inspect}] = arg; end"
    else
      value = @elem[t]
      #
      # There appear to be no callers like:
      #
      # e.foo(something)
      #
      # Thus do not interpret any arguments as
      # the *rest parameter creates Arrays that are
      # never referenced.
      #
      # expr = "def #{sel}(*rest); @elem[#{t.inspect}]; end"
      expr = "def #{sel}(); @elem[#{t.inspect}]; end"
    end
    expr = "class #{self.class}; #{expr}; end;"
    # $stderr.puts " **** #{self.class}\##{sel} => #{expr}"
    eval(expr)
    value
  end
end
</pre>
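<p>An alternative to defining accessors lazily from method_missing is to define them all up front with a small class macro. The sketch below uses a stand-in Bag class; the slot list and the slot_accessor name are assumptions for illustration and may not match what Date's parser actually stores.</p>

```ruby
class Bag
  # Hypothetical slot list; the real set used by Date's parser may differ.
  SLOTS = %i[year mon mday hour min sec zone offset].freeze

  # Class macro: define a plain getter and setter for each slot name.
  def self.slot_accessor(*names)
    names.each do |name|
      define_method(name) { @elem[name] }
      define_method("#{name}=") { |value| @elem[name] = value }
    end
  end

  slot_accessor(*SLOTS)

  def initialize
    @elem = {}
  end

  # Only the slots that were actually set.
  def to_hash
    @elem.reject { |_key, value| value.nil? }
  end
end

bag = Bag.new
bag.year = 2010
bag.mon = 5
p bag.to_hash
```

<p>With every accessor defined at load time, no call ever falls through to method_missing and no code is built and evaluated at run time.</p>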
<p>Might be better to just enumerate all the possible Hash slot getter/setters using a simple class macro,<br>
since Date::Format::Bag is not used by anything outside of Date.<br>
=end</p> Backport187 - Backport #2594 (Closed): 1.8.7 Patch: Reduce time spent in gc.c is_pointer_to_heap(). https://bugs.ruby-lang.org/issues/2594 2010-01-12T03:45:40Z kstephens (Kurt Stephens)
<p>=begin<br>
gc.c:</p>
<p>Rationale:</p>
<ul>
<li>The size of struct heap_slots grows exponentially.</li>
<li>add_heap() puts new heaps on the end of the heaps[] array.</li>
<li>The newest heaps are placed toward the end.</li>
<li>The newer heaps are larger, thus are more likely to contain valid pointers than smaller heaps.</li>
<li>sort_heaps() reorders the heaps[] array such that early probes are more likely to match in larger heaps.</li>
</ul>
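<p>The rationale above can be modeled with a toy linear probe. This is a sketch of the idea only: Heap, sort_heaps and pointer_to_heap? are hypothetical Ruby stand-ins for the C structures, and the addresses and sizes are invented.</p>

```ruby
# A heap is modeled as a contiguous address range.
Heap = Struct.new(:start, :size) do
  def include?(addr)
    addr >= start && addr < start + size
  end
end

# Largest heaps first, so early probes cover the most address space.
def sort_heaps(heaps)
  heaps.sort_by { |heap| -heap.size }
end

# Linear probe; returns [found, number_of_heaps_examined].
def pointer_to_heap?(heaps, addr)
  probes = 0
  found = heaps.any? do |heap|
    probes += 1
    heap.include?(addr)
  end
  [found, probes]
end

# Allocation order: each newer heap is larger than the one before it.
heaps = [Heap.new(0, 10), Heap.new(100, 20), Heap.new(1000, 40)]
addr = 1010 # falls inside the newest, largest heap

p pointer_to_heap?(heaps, addr)
p pointer_to_heap?(sort_heaps(heaps), addr)
```

<p>In allocation order the pointer is found on the last probe; after sorting it is found on the first. Pointers into small heaps cost more probes after sorting, so the gain depends on most valid pointers landing in the large heaps.</p>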
<p>This was developed under REE 1.8.7, and ported to 1.8.7.</p>
<p>Patches:</p>
<p>MRI 1.8.7: <a href="http://github.com/kstephens/ruby/commit/263551bbf8e52aa031433e4e00936f41760b3980" class="external">http://github.com/kstephens/ruby/commit/263551bbf8e52aa031433e4e00936f41760b3980</a><br>
REE 1.8.7: <a href="http://github.com/kstephens/rubyenterpriseedition187/commit/d69554f0b37331a597f8837abba37c302701d292" class="external">http://github.com/kstephens/rubyenterpriseedition187/commit/d69554f0b37331a597f8837abba37c302701d292</a></p>
<p>See also: <a href="http://code.google.com/p/rubyenterpriseedition/issues/detail?id=24&colspec=ID" class="external">http://code.google.com/p/rubyenterpriseedition/issues/detail?id=24&colspec=ID</a> Type Status Priority Milestone Summary</p>
<p>Measurements: ~ 2% faster overall:</p>
<p>cnuapp@kurt-4:/export/bug/103302/cnu_ruby_build/rubyenterpriseedition187$ ./ruby ../test_gc_options.rb<br>
WARMUP:<br>
./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb -r ../close_fds.rb ./runruby.rb --extout=.ext -- ./test/runner.rb --basedir=./test --runner=console:</p>
<p>RUBY_GC_SORT_HEAPS=0 RUBY_GC_COPY_ON_WRITE_FRIENDLY=0 :<br>
Command exited with non-zero status 1<br>
189.05user 10.50system 4:25.50elapsed 75%CPU (0avgtext+0avgdata 0maxresident)k<br>
0inputs+10112outputs (0major+533733minor)pagefaults 0swaps</p>
<p>RUBY_GC_SORT_HEAPS=1 RUBY_GC_COPY_ON_WRITE_FRIENDLY=0 :<br>
Command exited with non-zero status 1<br>
185.37user 10.51system 4:20.12elapsed 75%CPU (0avgtext+0avgdata 0maxresident)k<br>
0inputs+10120outputs (0major+529560minor)pagefaults 0swaps<br>
=end</p> Ruby 1.8 - Feature #2561 (Closed): 1.8.7 Patch reduces time cost of Rational operations by 50%. https://bugs.ruby-lang.org/issues/2561 2010-01-06T10:58:21Z kstephens (Kurt Stephens)
<p>=begin<br>
This change adds a specialized Fixnum#gcd and a tuned rational.rb.<br>
Reduced overall time on Rational operations by > 50%.</p>
<pre>
                                       user     system      total        real
test_it                           22.380000   2.140000  24.520000 ( 24.559388)
test_it ks_rational               17.870000   1.830000  19.700000 ( 19.687221)
test_it ks_rational + Fixnum#gcd  10.660000   0.000000  10.660000 ( 10.665765)
</pre>
<p>The patch will perform better with Fixnum#gcd in numeric.c but is still faster with only the rational.rb changes.</p>
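<p>For readers unfamiliar with why gcd dominates Rational arithmetic, here is a sketch of the iterative Euclidean algorithm and its use in reducing a fraction. It is not the code from the patch; euclid_gcd and reduce_fraction are hypothetical names.</p>

```ruby
# Iterative Euclidean algorithm on the absolute values of a and b.
def euclid_gcd(a, b)
  a, b = a.abs, b.abs
  a, b = b, a % b until b.zero?
  a
end

# Every Rational result is normalized by dividing out the gcd.
def reduce_fraction(numerator, denominator)
  divisor = euclid_gcd(numerator, denominator)
  [numerator / divisor, denominator / divisor]
end

p reduce_fraction(6, 8)
```

<p>Since each Rational operation ends with a reduction like this, a faster gcd speeds up every operation.</p>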
<p>I have a version for 1.8.6 if someone wants it.<br>
=end</p>