Project

General

Profile

Feature #19102

Updated by k0kubun (Takashi Kokubun) over 1 year ago

## Proposal 
 Change the behavior of `ERB::Util.html_escape` in the following two parts: 

 1. Skip converting an argument with `#to_s` if the argument is already a `T_STRING`. 
 2. Do not allocate and return a new String when nothing needs to be escaped. 

 ## Background 
 The current `ERB::Util.html_escape` is implemented as `CGI.escapeHTML(s.to_s)`. So the performance is almost equal to `CGI.escapeHTML` except for the `to_s` call. Because it's common to embed non-String expressions in template engines, a template engine typically calls `to_s` to convert non-String expressions to a String before escaping it, which is why the difference exists. Proposal (1) is useful for optimizing the case that something that's already a `String` is embedded. We ignore the extreme case that `String#to_s` is weirdly monkey-patched. 

 As to proposal (2), my original implementation of `CGI.escapeHTML` https://github.com/ruby/ruby/pull/1164 was not calling `rb_str_dup` for that case. However, because [Bug #11858] claimed returning the argument object for non-escaped cases is a backward incompatibility with the old `gsub`-based implementation, we added the unneeded `rb_str_dup` call and the performance for that case has been compromised. This behavior is completely unnecessary for template engines. On the other hand, because `ERB::Util.html_escape` is a helper for ERB, we should not need to consider any backward compatibility that is not relevant to ERB or any template engines. So proposal (2) should be possible in `ERB::Util.html_escape` unlike `CGI.escapeHTML`. 

 ## Benchmark 
 Implementation: https://github.com/ruby/erb/pull/27 

 ```rb 
 require 'benchmark/ips' 
 require 'erb' 

 class << ERB::Util 
   def html_escape_old(s) 
     CGI.escapeHTML(s.to_s) 
   end 
 end 

 Benchmark.ips do |x| 
   s = 'hello world' 
   x.report('before') { ERB::Util.html_escape_old(s) } 
   x.report('after')    { ERB::Util.html_escape(s) } 
   x.compare! 
 end 
 ``` 

 ``` 
 ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux] 
 Warming up -------------------------------------- 
               before       1.066M i/100ms 
                after       1.879M i/100ms 
 Calculating ------------------------------------- 
               before       10.615M (± 0.3%) i/s -       53.320M in     5.023083s 
                after       18.742M (± 0.4%) i/s -       93.929M in     5.011847s 

 Comparison: 
                after: 18741747.6 i/s 
               before: 10615137.1 i/s - 1.77x    (± 0.00) slower 
 ```

Back