Feature #6047

read_all: Grow buffer exponentially in generic case

Added by Martin Bosslet about 2 years ago. Updated over 1 year ago.

[ruby-core:42748]
Status:Assigned
Priority:Normal
Assignee:Martin Bosslet
Category:core
Target version:next minor

Description

In the general case, read_all grows its buffer linearly by just the amount that is currently read from the underlying source. This results in a linear number of reallocs, It might turn out beneficial if the buffer were grown exponentially by multiplying with a constant factor (e.g. 1.5 or 2), thus resulting in only a logarithmic numver of reallocs.

I will provide a patch and benchmarks, but I'm already opening this issue so I won't forget.

See also https://bugs.ruby-lang.org/issues/5353 for more details.

History

#1 Updated by Koichi Sasada over 1 year ago

ping. status?
Do you need helps or comments?

#2 Updated by Martin Bosslet over 1 year ago

ko1 (Koichi Sasada) wrote:

ping. status?
Do you need helps or comments?

Thanks for your help, to be honest, I haven't tried so far. Can we leave it at 2.0.0 target for now? If I run into problems, I'll ask here!

#3 Updated by Eric Wong over 1 year ago

Martin Bosslet Martin.Bosslet@googlemail.com wrote:

In the general case, read_all grows its buffer linearly by just the
amount that is currently read from the underlying source. This results
in a linear number of reallocs, It might turn out beneficial if the
buffer were grown exponentially by multiplying with a constant factor
(e.g. 1.5 or 2), thus resulting in only a logarithmic numver of
reallocs.

I think growing the buffer exponentially makes sense.

I would enforce a hard limit (probably <= 8 MB) for each growth,
to:

1) discourage read_all() for large files, it's very wasteful and
usually hurts performance

2) prevent memory exhaustion for edge cases (especially on 32-bit)

#4 Updated by Yusuke Endoh over 1 year ago

  • Target version changed from 2.0.0 to next minor

My experience also shows that it is useless to open a ticket for a reminder to myself :-)

I'm setting to next minor tentatively, but if it is really just a performance improvement (i.e., it affects no external modules), you can commit it to 2.0.0 before code freeze.

Yusuke Endoh mame@tsg.ne.jp

Also available in: Atom PDF