Bug #1332: Reading file on Windows is 500x slower then with previous Ruby version - Ruby - Ruby Issue Tracking System

Added by ther (Damjan Rems) over 17 years ago. Updated about 15 years ago.

Status: Closed
Assignee: akr (Akira Tanaka)
Target version: 1.9.2
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
Backport:

[ruby-core:23063]

Description

=begin
time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

That is about a 5x slower write and a 500x slower read. The times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close

Tried on two machines. Vista SP1 and XP SP3. Same results.

Tried with virus scanner disabled. Same results.

Tried on an old Win2K P4 2.4GHz machine without a virus scanner:
"0 1.0625"
"1 1.09375"
"2 111.171875"

That's 111 seconds to read a 14,623,232-byte file, which is probably read from cache anyway.

The problem doesn't seem to exist on Linux, although I have only tried Ruby 1.9.0 there.

by
TheR
=end
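
For reference, the file size and the slowdown factors quoted in the report can be re-derived from the script and the 1.9.1 / 1.8.6 timings. This is only an arithmetic sketch; the constant names are made up for illustration.

```ruby
# 'aaaa'.upto('zzzz') yields every 4-letter lowercase string: 26**4 of them.
STRING_COUNT = 26**4
# Each string is 4 bytes, and 3.times { c << c } doubles the buffer three times.
FILE_BYTES = STRING_COUNT * 4 * 2**3

# Timings copied from the report, in seconds.
READ_191  = 40.188834
READ_186  = 0.087
WRITE_191 = 0.696244
WRITE_186 = 0.133

READ_RATIO  = READ_191 / READ_186
WRITE_RATIO = WRITE_191 / WRITE_186

puts "file size:   #{FILE_BYTES} bytes"
puts "read ratio:  #{READ_RATIO.round}x"
puts "write ratio: #{WRITE_RATIO.round(1)}x"
```

So "about 5x" and "about 500x" are rounded figures for the i386-mswin32 timings shown above.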

Related issues 2 (0 open — 2 closed)

#1 Updated by yugui (Yuki Sonoda) about 17 years ago

Target version set to 1.9.2

#2 Updated by yugui (Yuki Sonoda) about 17 years ago

Status changed from Open to Assigned
Assignee set to akr (Akira Tanaka)
Priority changed from Normal to 3

#3 Updated by rogerdpack (Roger Pack) over 16 years ago

I believe this is related to other issues regarding reading files in non-binary mode being slow in 1.9:

require 'benchmark'
a = File.open('l', 'w'); 10000000.times { a.write "abc\n" }; a.close
Benchmark.measure { a = File.open('l', 'r'); a.readlines; a.close }.real
=> 11.890625
Benchmark.measure { a = File.open('l', 'rb'); a.readlines; a.close }.real
=> 3.59375

I believe that it is doing a string conversion from one encoding ["\r\n"] to another ["\n"].

Perhaps there is a way to speed this up? (ex: special case it somehow)?

-r

refs:
http://www.ruby-forum.com/topic/182691
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/24824

#4 Updated by usa (Usaku NAKAMURA) over 16 years ago

=begin
Hello,

In message "[ruby-core:26505] [Bug #1332] Reading file on Windows is 500x slower then with previous Ruby version"
on Nov.04,2009 04:50:49, redmine@ruby-lang.org wrote:

> I believe that it is doing a string conversion from one encoding ["\r\n"] to another ["\n"].

Right.

> Perhaps there is a way to speed this up? (ex: special case it somehow)?

Currently, we have implemented the newline conversion as a
transcode converter, just like encoding conversion.
But, as we have found, the design of transcode is too general
for such a simple operation.
We want to find a better mechanism which doesn't deviate
from the current design of IO...

Regards,

U.Nakamura usa@garbagecollect.jp

=end
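
To make the point about newline conversion being "just like encoding conversion" concrete, here is a small sketch. It assumes that String#encode with the `universal_newline: true` option is a fair user-level stand-in for the converter that text-mode reads go through; the thread does not say that IO uses exactly this path. The method names `via_encode` and `via_gsub` are made up for illustration.

```ruby
# Raw bytes as they would sit in a Windows text file.
RAW = "abc\r\ndef\r\n".freeze

# Newline conversion requested as an option of the general-purpose converter.
# Note: universal_newline also turns a lone "\r" into "\n".
def via_encode(raw)
  raw.encode(raw.encoding, universal_newline: true)
end

# The same result for CRLF-only input from a plain substitution,
# with no converter involved.
def via_gsub(raw)
  raw.gsub("\r\n", "\n")
end

p via_encode(RAW)
p via_gsub(RAW)
```

Both calls give the same string for this input; the difference under discussion is only in how much machinery sits behind each one.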

#5 Updated by jonforums (Jon Forums) over 16 years ago

=begin

> Currently, we have implemented the newline conversion as a
> transcode converter, just like encoding conversion.
> But, as we have found, the design of transcode is too general
> for such a simple operation.
> We want to find a better mechanism which doesn't deviate
> from the current design of IO...

Do you think the current transcode design is also the cause of the following?

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/24839

Jon

=end

#6 Updated by rogerdpack (Roger Pack) over 16 years ago

A temporary workaround [though not actually binary compatible] appears to be:

Index: ruby.c
===================================================================
--- ruby.c      (revision 25830)
+++ ruby.c      (working copy)
@@ -1484,6 +1484,7 @@
        int fd, mode = O_RDONLY;
 #if defined DOSISH || defined __CYGWIN__
        {
+           mode |= O_BINARY;
            const char *ext = strrchr(fname, '.');
            if (ext && STRCASECMP(ext, ".exe") == 0)
                mode |= O_BINARY;

This causes all ruby script files to be loaded as binary. The drawback is that if you have a ruby script that was saved as ascii and contains strings that wrap lines, those strings will have an extra "\r" in them, e.g.:

File.write 'stringy.rb', "a=\"abc\r\ndef\"; puts a.inspect"

normal ruby:

C:\>ruby stringy.rb
"abc\ndef"

patched ruby:

C:\>ruby stringy.rb
"abc\r\ndef"

But if your files were saved in binary mode it will be the same.
And the slowdown is gone for now.
Hopefully a better fix can be created.
Thanks.
-r

#7 Updated by usa (Usaku NAKAMURA) over 16 years ago

Hello,

In message "[ruby-core:26840] [Bug #1332] Reading file on Windows is 500x slower then with previous Ruby version"
on Nov.21,2009 08:10:45, redmine@ruby-lang.org wrote:

> This causes all ruby script files to be loaded as binary. The drawback is that if you have a ruby script that was saved as ascii and contains strings that wrap lines, those strings will have an extra "\r" in them, e.g.:

The pseudo-IO DATA treats the script file as a data file.
So changing the default mode breaks the compatibility of such
scripts.

Regards,

U.Nakamura usa@garbagecollect.jp

#8 Updated by rogerdpack (Roger Pack) over 16 years ago

It appears that:

* the writes have slowed down "only" by about 100% (they take twice as long in ascii mode in 1.9 as in 1.8). Not terrible.
* the reads have slowed down by something like 40000% (!)

I think that to avoid the slowdown with reads you can "hack a workaround" like:

c = File.open('out.file','rb') { |f| f.read }
c.gsub!("\r\n", "\n")

But this seems like there might be a bug in there, too.

-rp
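
The workaround above can be wrapped in a small helper. This is a sketch only: `read_text_via_binary` is a hypothetical name, and it converts just "\r\n" pairs, which is an assumption about what the file contains.

```ruby
require 'tempfile'

# Hypothetical helper: read in binary mode, then normalise newlines by hand.
def read_text_via_binary(path)
  data = File.open(path, 'rb') { |f| f.read }
  # gsub! returns nil when nothing matched, so do not chain on its result.
  data.gsub!("\r\n", "\n")
  data
end

# Usage: write CRLF data in binary mode so the bytes are identical on
# every platform, then read it back through the helper.
Tempfile.create('crlf') do |tmp|
  tmp.binmode
  tmp.write("line1\r\nline2\r\n")
  tmp.flush
  p read_text_via_binary(tmp.path)
end
```

One difference from a text-mode read is worth keeping in mind: a string read with 'rb' is tagged as ASCII-8BIT rather than the default external encoding, so code that depends on the encoding may need a force_encoding call afterwards.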

#9 Updated by mame (Yusuke Endoh) over 16 years ago

Status changed from Assigned to Closed

Hi,

This was fixed at r27340.

The buffer was extended (realloc'ed) in linear steps, which resulted
in O(n^2) behavior. Now it is extended using the "double the memory
when you run out" rule, like String. So the problem is solved, I think.

Thanks,

--
Yusuke Endoh mame@tsg.ne.jp
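
The difference between the two growth rules can be seen with a small simulation. It is a rough model, not the code from r27340: it assumes every growth step copies the whole buffer (a real realloc may sometimes extend in place), and the 8 KB increment is a hypothetical figure, since the thread does not state the actual one.

```ruby
# Count how many bytes get copied while a buffer grows to hold `total` bytes,
# assuming every growth step reallocates and copies the current contents.
def bytes_copied(total, chunk)
  capacity = chunk
  copied = 0
  while capacity < total
    copied += capacity          # worst case: realloc copies everything so far
    capacity = yield(capacity)  # growth policy supplied by the caller
  end
  copied
end

CHUNK = 8 * 1024     # hypothetical growth increment
TOTAL = 14_623_232   # file size from the report

LINEAR   = bytes_copied(TOTAL, CHUNK) { |cap| cap + CHUNK }
DOUBLING = bytes_copied(TOTAL, CHUNK) { |cap| cap * 2 }

puts "linear growth:   #{LINEAR} bytes copied"
puts "doubling growth: #{DOUBLING} bytes copied"
```

Under these assumptions the linear rule copies on the order of n^2 / chunk bytes, while doubling stays below 2n, which matches the O(n^2) versus amortized O(n) description above.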

#10 Updated by rogerdpack (Roger Pack) over 16 years ago

appears to be much better in trunk.

1.9.1:

"0 0.396039"
"1 0.352035"
"2 43.111311"

1.9.2:

"0 0.369037"
"1 0.513051"
"2 1.626163" # still 10x as slow as 1.8.6, but probably because of a different reason.

Thanks!
-rp

#11 Updated by mame (Yusuke Endoh) over 16 years ago

Hi,

2010/4/16 Roger Pack redmine@ruby-lang.org:

> 1.9.2:
>
> "0 0.369037"
> "1 0.513051"
> "2 1.626163" # still 10x as slow as 1.8.6, but probably because of a different reason.

Yes, text mode is still 10x to 30x slower than binary mode.
This reproduces not only on Windows but also on Linux.
Perhaps this is a symptom of the cause explained
in [ruby-core:26515].

--
Yusuke ENDOH mame@tsg.ne.jp

Related to Ruby - Bug #2742: IO#read/gets can be very slow in doze (Closed)
Related to Ruby - Feature #3228: speedup File.read (Rejected)
