Backport #7278

'warning: regexp match /.../n against to UTF-8 string' in net/protocol.rb

Added by kakutani (Shintaro KAKUTANI) almost 7 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
[ruby-dev:46394]

Description

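This is Kakutani.
When I try to send UTF-8 mail from ActionMailer in Rails 3.2.8, I get the warning
'warning: regexp match /.../n against to UTF-8 string'.
The attached patch takes the approach of calling force_encoding with ASCII-8BIT.

In the past month alone I have seen several people working around this with monkey patches, so I would appreciate it if this could be addressed.
http://dev.ywesee.com/Yasu/20121012-create-fachinfo-chapter-exporter-job
http://d.hatena.ne.jp/benikujyaku/20111002/1317536555

Thank you in advance.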


Associated revisions

Revision ccd7a805
Added by naruse (Yui NARUSE) almost 7 years ago

  • lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line): don't use /n in universal regexp. [ruby-dev:46394] [Bug #7278]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37487 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 37487
Added by naruse (Yui NARUSE) almost 7 years ago

  • lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line): don't use /n in universal regexp. [ruby-dev:46394] [Bug #7278]

Revision 6ce8c339
Added by naruse (Yui NARUSE) almost 7 years ago

  • lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line): treat \r as newline as mame pointed. [ruby-dev:46425] [Bug #7278]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37563 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 37563
Added by naruse (Yui NARUSE) almost 7 years ago

  • lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line): treat \r as newline as mame pointed. [ruby-dev:46425] [Bug #7278]

Revision c35e7519
Added by usa (Usaku NAKAMURA) over 6 years ago

merge revision(s) 37487,37563: [Backport #7278]

    * lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line):
      don't use /n in universal regexp. [ruby-dev:46394] [Bug #7278]

    * lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line):
      treat \r as newline as mame pointed. [ruby-dev:46425] [Bug #7278]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_9_3@38830 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 38830
Added by usa (Usaku NAKAMURA) over 6 years ago

merge revision(s) 37487,37563: [Backport #7278]

* lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line):
  don't use /n in universal regexp. [ruby-dev:46394] [Bug #7278]

* lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line):
  treat \r as newline as mame pointed. [ruby-dev:46425] [Bug #7278]

History

Updated by no6v (Nobuhiro IMAI) almost 7 years ago

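With this patch, the strings that get yielded end up with ASCII-8BIT encoding.
Are there cases where each_line would not work? All the tests under test/net/ pass.
I am not sure what the effect of no longer destructively modifying @wbuf would be.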

diff --git a/lib/net/protocol.rb b/lib/net/protocol.rb
index 9733d56..743e59b 100644
--- a/lib/net/protocol.rb
+++ b/lib/net/protocol.rb
@@ -322,7 +322,7 @@ module Net # :nodoc:
 
     def each_crlf_line(src)
       buffer_filling(@wbuf, src) do
-        while line = @wbuf.slice!(/\A.*(?:\n|\r\n|\r(?!\z))/n)
+        @wbuf.each_line do |line|
           yield line.chomp("\n") + "\r\n"
         end
       end
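The encoding concern raised in this comment can be illustrated with a small standalone sketch. It does not use Net::InternetMessageIO; the helper names `slice_as_binary` and `lines_via_each_line` and the sample text are hypothetical, chosen only to contrast the two approaches discussed here (force_encoding plus slice! versus each_line).

```ruby
# The pattern from the diff, including the /n (binary) flag.
BINARY_LINE = /\A.*(?:\n|\r\n|\r(?!\z))/n

# Hypothetical helper: force a copy of the text to ASCII-8BIT and
# destructively slice one line off it, as the attached patch is described to do.
def slice_as_binary(text)
  buffer = text.dup.force_encoding(Encoding::ASCII_8BIT)
  line = buffer.slice!(BINARY_LINE)
  [line, buffer]
end

# Hypothetical helper: iterate with each_line, which leaves the buffer intact.
def lines_via_each_line(text)
  buffer = text.dup
  lines = []
  buffer.each_line { |line| lines << line }
  [lines, buffer]
end

# A UTF-8 sample with non-ASCII characters (written with \u escapes).
SAMPLE = "\u65E5\u672C\u8A9E\nabc\n"

line, rest = slice_as_binary(SAMPLE)
puts line.encoding          # the sliced line is binary, not UTF-8
puts rest.inspect           # the buffer was consumed up to the first newline

lines, buffer = lines_via_each_line(SAMPLE)
puts lines.map(&:encoding).uniq.inspect  # lines keep the UTF-8 encoding
puts buffer == SAMPLE                    # the buffer is not consumed
```

The second helper shows why the buffer question matters: each_line never shortens the buffer, so any code that relies on consumed data being removed would need to clear it some other way.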

Updated by kakutani (Shintaro KAKUTANI) almost 7 years ago

As far as the intent goes, each_line looks fine too, and not mutating the buffer seems better behaved!

Updated by mame (Yusuke Endoh) almost 7 years ago

  • Status changed from Open to Assigned
  • Assignee set to naruse (Yui NARUSE)

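Strictly speaking, this code does not really have a maintainer, but I will assign
this to Naruse-san, since he maintains the closely related net/http and the issue is encoding-related.

I have only read the code, but it appears to manage the buffer by destructively
rewriting it, so I suspect each_line will not work.

In the first place, I strongly suspect this regexp is wrong.
I think what was intended was /\A.*?(?:\n|\r\n|\r(?!\z))/.
The ? in .*? is missing.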

--
Yusuke Endoh mame@tsg.ne.jp

Updated by naruse (Yui NARUSE) almost 7 years ago

mame (Yusuke Endoh) wrote:

> In the first place, I strongly suspect this regexp is wrong.
> I think what was intended was /\A.*?(?:\n|\r\n|\r(?!\z))/.
> The ? in .*? is missing.

By the way, this . means [^\n], so it works without the ?.

Updated by naruse (Yui NARUSE) almost 7 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r37487.
Shintaro, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line): don't use /n in universal regexp. [ruby-dev:46394] [Bug #7278]

Updated by mame (Yusuke Endoh) almost 7 years ago

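naruse (Yui NARUSE) wrote:

> mame (Yusuke Endoh) wrote:
>
> > In the first place, I strongly suspect this regexp is wrong.
> > I think what was intended was /\A.*?(?:\n|\r\n|\r(?!\z))/.
> > The ? in .*? is missing.
>
> By the way, this . means [^\n], so it works without the ?.

I think the intent is for "foo\rbar\r" to be split into "foo" and "bar".
In fact, before r5907, where this regexp was written, it reads as though lines were broken at \r.
I have not run it, so apologies if I am mistaken.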

--
Yusuke Endoh mame@tsg.ne.jp
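The difference under discussion can be seen in a small standalone sketch comparing the greedy pattern with the non-greedy variant (written here as `.*?`). The helper `split_lines` and the input "foo\rbar\n" are hypothetical, picked for illustration; this is not the code that was eventually committed.

```ruby
# Greedy pattern as it appeared in each_crlf_line (without the /n flag).
GREEDY = /\A.*(?:\n|\r\n|\r(?!\z))/
# Non-greedy variant suggested in the thread.
LAZY = /\A.*?(?:\n|\r\n|\r(?!\z))/

# Hypothetical helper: repeatedly slice lines off a copy of the buffer,
# returning the lines found and whatever is left unconsumed.
def split_lines(text, pattern)
  buffer = text.dup
  lines = []
  while (line = buffer.slice!(pattern))
    lines << line
  end
  [lines, buffer]
end

# "." matches "\r" but not "\n", so the greedy form runs past a bare "\r"
# whenever a later "\n" is available in the buffer.
p split_lines("foo\rbar\n", GREEDY).first  # => ["foo\rbar\n"]
p split_lines("foo\rbar\n", LAZY).first    # => ["foo\r", "bar\n"]

# A trailing "\r" is held back by (?!\z), since a "\n" may still follow.
p split_lines("foo\r", LAZY)               # => [[], "foo\r"]
```

The negative lookahead `(?!\z)` is what lets the buffer wait for more input when it ends in a bare "\r", which is consistent with the observation that the method manages its buffer destructively.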

Updated by naruse (Yui NARUSE) over 6 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby master to Backport193
  • Status changed from Closed to Assigned
  • Assignee changed from naruse (Yui NARUSE) to usa (Usaku NAKAMURA)

Updated by usa (Usaku NAKAMURA) over 6 years ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r38830.
Shintaro, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


merge revision(s) 37487,37563: [Backport #7278]

* lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line):
  don't use /n in universal regexp. [ruby-dev:46394] [Bug #7278]

* lib/net/protocol.rb (Net::InternetMessageIO#each_crlf_line):
  treat \r as newline as mame pointed. [ruby-dev:46425] [Bug #7278]
