Bug #19621
Updated by felix.wolfsteller@betterplace.org (Felix Wolfsteller) over 1 year ago
By default on unixoid systems, `Resolv` will read `/etc/hosts` once. Privacy- and security-aware people might use the file to prevent unwanted traffic; developers use it to quickly manipulate address resolution.

`Resolv::Hosts` uses [`IO.read`](https://github.com/betterplace/ruby/blob/9b07d30df8c6bf65c2558c023fd6452405915610/lib/resolv.rb#LL195C4-L195C4), which seems to be inefficient when dealing with large amounts of data that should be consumed line by line. E.g. if you install the `/etc/hosts` additions by [hblock](https://hblock.molinero.dev/hosts) (https://github.com/hectorm/hblock), the first call to resolve an address will likely take **minutes**.

We believed the solution was easy: use streaming `IO.foreach` (see patch and PR attached). Unfortunately, replacing `.open ... .each` with `IO.foreach` does not help.

Benchmarking with the exemplary `/etc/hosts` from above (172751 lines), done like this

```ruby
require 'resolv'
require 'benchmark'

Benchmark.measure do
  Resolv::Hosts.new.lazy_initialize
end
```

yields

```
With `read`:
...
With `foreach`:
 25.622515   8.821095  34.443610 ( 34.495448)
...
```

Reading all the lines into memory first and then consuming them (`File.readlines`) might improve the situation, but is probably not desirable due to memory concerns.
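To compare the three line-reading strategies mentioned above without touching the real `/etc/hosts`, something like the following sketch could be used. It only measures raw line iteration over a synthetic file, not the parsing that `Resolv::Hosts` does for each line, so it will not reproduce the numbers above. The helper names (`write_sample_hosts`, `compare_strategies`, `STRATEGIES`) and the `example.invalid` hostnames are made up for illustration and are not part of `resolv.rb`.

```ruby
require 'benchmark'
require 'tempfile'

# Hypothetical helper: writes `count` hosts-style entries to the given IO.
def write_sample_hosts(io, count)
  count.times { |i| io.puts("0.0.0.0 blocked-#{i}.example.invalid") }
  io.flush
end

# Each strategy returns the number of lines it saw, so results can be cross-checked.
STRATEGIES = {
  'open/each' => ->(path) { n = 0; File.open(path, 'rb') { |f| f.each { n += 1 } }; n },
  'foreach'   => ->(path) { n = 0; File.foreach(path) { n += 1 }; n },
  'readlines' => ->(path) { File.readlines(path).size }
}.freeze

# Runs every strategy once and returns { name => { lines:, real: } }.
def compare_strategies(path)
  STRATEGIES.to_h do |name, strategy|
    lines = nil
    timing = Benchmark.measure { lines = strategy.call(path) }
    [name, { lines: lines, real: timing.real }]
  end
end

Tempfile.create('hosts-sample') do |file|
  write_sample_hosts(file, 10_000)
  compare_strategies(file.path).each do |name, result|
    puts format('%-10s %6d lines  %.4fs', name, result[:lines], result[:real])
  end
end
```

If all three variants take roughly the same time on a large file, that would suggest the cost lies in what is done with each line rather than in how the file is read.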