Backport #8165
closedProblems with require
Description
Require doesn't work correctly if path contains cyrillic.
require '/home/mak/test.rb' #-> true
require '/home/mak/test.rb' #-> false
require '/home/mak/Проекты/test.rb' #-> true
require '/home/mak/Проекты/test.rb' #-> true
Files
Updated by luislavena (Luis Lavena) over 11 years ago
- Status changed from Open to Feedback
Hello,
Please provide more details of the platform/OS, version and filesystem used.
Thank you.
Updated by mame (Yusuke Endoh) over 11 years ago
- Status changed from Feedback to Assigned
- Assignee set to tarui (Masaya Tarui)
- Target version set to 2.1.0
It reproduces on my Ubuntu 12.10 with ext4 filesystem.
$ cat t.rb
coding: UTF-8¶
p require("/home/mame/あいうえお/test.rb")
p require("/home/mame/あいうえお/test.rb")
p require("/home/mame/あいうえお/test.rb")
p(*$")
$ ruby -v t.rb
ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
true
true
true
"enumerator.so"
snip
"/home/mame/あいうえお/test.rb"
"/home/mame/あいうえお/test.rb"
"/home/mame/あいうえお/test.rb"
$ ruby-1.9.3-p392 -v t.rb
ruby 1.9.3p392 (2013-02-22 revision 39386) [x86_64-linux]
true
false
false
"enumerator.so"
snip
"/home/mame/あいうえお/test.rb"
I guess this is related to the performance improvement of require in 2.0.0.
Tarui-san or Shirosaki-san, do you have any idea?
--
Yusuke Endoh mame@tsg.ne.jp
Updated by funny_falcon (Yura Sokolov) over 11 years ago
This is occasionally fixed in trunk by https://bugs.ruby-lang.org/issues/8048 ,
by de-abusing Ruby Strings and using plain C strings.
Also backported version of https://bugs.ruby-lang.org/issues/8158 fixes this too,
cause it removes string comparison at all.
Updated by h.shirosaki (Hiroshi Shirosaki) over 11 years ago
It seems short_feature
is not proper if the path contains non-ascii characters.
rb_str_substr() would look up string position per character, not per byte. rb_str_subseq() seems to work for non-ascii.
diff --git a/load.c b/load.c
index 7c69461..dab493d 100644
--- a/load.c
+++ b/load.c
@@ -247,16 +247,16 @@ features_index_add(VALUE feature, VALUE offset)
if (p < feature_str)
break;
/* Now *p == '/'. We reach this point for every '/' in feature
. */
- short_feature = rb_str_substr(feature, p + 1 - feature_str, feature_end - p - 1);
- short_feature = rb_str_subseq(feature, p + 1 - feature_str, feature_end - p - 1);
features_index_add_single(short_feature, offset);
if (ext) {
-
short_feature = rb_str_substr(feature, p + 1 - feature_str, ext - p - 1);
-
}short_feature = rb_str_subseq(feature, p + 1 - feature_str, ext - p - 1); features_index_add_single(short_feature, offset);
}
features_index_add_single(feature, offset);
if (ext) {
- short_feature = rb_str_substr(feature, 0, ext - feature_str);
- short_feature = rb_str_subseq(feature, 0, ext - feature_str);
features_index_add_single(short_feature, offset);
}
}
Updated by h.shirosaki (Hiroshi Shirosaki) over 11 years ago
- File 0001-load.c-fix-require-with-non-ascii-path.patch 0001-load.c-fix-require-with-non-ascii-path.patch added
One more issue seems string encoding.
When looking up a feature name in rb_feature_p(), encoding information lacks.
So `short_feature' should not have encoding.
If encoding of the two non-ascii strings is different, hash value would be different.
load.c: rb_feature_p()
feature_val = rb_str_new(feature, len); // not have encoding
this_feature_index = rb_hash_lookup(features_index, feature_val);
I've attached a patch. Tested on both trunk and 2.0.0 branch.
ruby 2.1.0dev (2013-03-29 trunk 39991) [x86_64-linux]
ruby 2.0.0p100 (2013-03-27 revision 39954) [x86_64-linux]
Updated by funny_falcon (Yura Sokolov) over 11 years ago
May be it will be simpler to accept https://bugs.ruby-lang.org/issues/8158 ?
It has less memory requirement, avoids string comparisons, etc
(yes, i'm annoying a bit, sorry)
2013/3/29 h.shirosaki (Hiroshi Shirosaki) h.shirosaki@gmail.com
Issue #8165 has been updated by h.shirosaki (Hiroshi Shirosaki).
File 0001-load.c-fix-require-with-non-ascii-path.patch added
One more issue seems string encoding.
When looking up a feature name in rb_feature_p(), encoding information
lacks.
So `short_feature' should not have encoding.
If encoding of the two non-ascii strings is different, hash value would be
different.load.c: rb_feature_p()
feature_val = rb_str_new(feature, len); // not have encoding this_feature_index = rb_hash_lookup(features_index, feature_val);
I've attached a patch. Tested on both trunk and 2.0.0 branch.
ruby 2.1.0dev (2013-03-29 trunk 39991) [x86_64-linux]
ruby 2.0.0p100 (2013-03-27 revision 39954) [x86_64-linux]Bug #8165: Problems with require
https://bugs.ruby-lang.org/issues/8165#change-38028Author: Krugloff (Alexandr Kruglov)
Status: Assigned
Priority: Normal
Assignee: tarui (Masaya Tarui)
Category:
Target version: current: 2.1.0
ruby -v: ruby 2.0.0-p0Require doesn't work correctly if path contains cyrillic.
require '/home/mak/test.rb' #-> true
require '/home/mak/test.rb' #-> falserequire '/home/mak/Проекты/test.rb' #-> true
require '/home/mak/Проекты/test.rb' #-> true
Updated by h.shirosaki (Hiroshi Shirosaki) over 11 years ago
funny_falcon (Yura Sokolov) wrote:
May be it will be simpler to accept https://bugs.ruby-lang.org/issues/8158 ?
It has less memory requirement, avoids string comparisons, etc
Indeed your patch would not have this bug since using C string instead of string object.
I think smaller patch or splitting patch might be better to be reviewed.
Updated by funny_falcon (Yura Sokolov) over 11 years ago
My patch doesn't store any string value, only hashes of them. It relies on
loaded_feature_path for string comparison.
Nor it uses ruby arrays for index storage - it organize them in linked
lists which are stored in a single C array.
29.03.2013 20:14 пользователь "h.shirosaki (Hiroshi Shirosaki)" <
h.shirosaki@gmail.com> написал:
Issue #8165 has been updated by h.shirosaki (Hiroshi Shirosaki).
funny_falcon (Yura Sokolov) wrote:
May be it will be simpler to accept
https://bugs.ruby-lang.org/issues/8158 ?
It has less memory requirement, avoids string comparisons, etcIndeed your patch would not have this bug since using C string instead of
string object.
I think smaller patch or splitting patch might be better to be reviewed.
Bug #8165: Problems with require
https://bugs.ruby-lang.org/issues/8165#change-38032Author: Krugloff (Alexandr Kruglov)
Status: Assigned
Priority: Normal
Assignee: tarui (Masaya Tarui)
Category:
Target version: current: 2.1.0
ruby -v: ruby 2.0.0-p0Require doesn't work correctly if path contains cyrillic.
require '/home/mak/test.rb' #-> true
require '/home/mak/test.rb' #-> falserequire '/home/mak/Проекты/test.rb' #-> true
require '/home/mak/Проекты/test.rb' #-> true
Updated by Anonymous over 11 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r40135.
Alexandr, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
-
load.c (features_index_add): use rb_str_subseq() to specify C string
position properly to fix require non ascii path.
[ruby-core:53733] [Bug #8165] -
test/ruby/test_require.rb (TestRequire#test_require_nonascii_path):
a test for the above.
Updated by nagachika (Tomoyuki Chikanaga) over 11 years ago
- Tracker changed from Bug to Backport
- Project changed from Ruby master to Backport200
- Status changed from Closed to Assigned
- Assignee changed from tarui (Masaya Tarui) to nagachika (Tomoyuki Chikanaga)
- Target version deleted (
2.1.0)
Updated by nagachika (Tomoyuki Chikanaga) over 11 years ago
working memo: backport 40135,40148,40173 but ruby/test_require.rb TestRequire#test_require_nonascii_path fails.
Updated by h.shirosaki (Hiroshi Shirosaki) over 11 years ago
nagachika (Tomoyuki Chikanaga) wrote:
working memo: backport 40135,40148,40173 but ruby/test_require.rb TestRequire#test_require_nonascii_path fails.
Thank you for backport. #8048 would also be needed for this issue.
Hash lookup of feature_index fails due to String encoding of the key. r39874 changes hash key from ruby String to C string.
Updated by nagachika (Tomoyuki Chikanaga) over 11 years ago
shirosaki san
Thank you for your advice! I'll try to backport with r39874 later.
Updated by nagachika (Tomoyuki Chikanaga) over 11 years ago
- Status changed from Assigned to Closed
This issue was solved with changeset r40398.
Alexandr, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
merge revision(s) 40135,40148,40173: [Backport #8165]
* load.c (features_index_add): use rb_str_subseq() to specify C string
position properly to fix require non ascii path.
[ruby-core:53733] [Bug #8165]
* test/ruby/test_require.rb (TestRequire#test_require_nonascii_path):
a test for the above.
* test/ruby/test_require.rb (TestRequire#test_require_nonascii_path):
fix load path for encoding to run the test as stand-alone.
* test/ruby/test_require.rb (TestRequire#test_require_nonascii_path):
RUBY_PLATFORM should escape as Regexp,
because RUBY_PLATFORM may contain '.'.