Project

General

Profile

Actions

Feature #18033

open

Time.new to parse a string

Added by nobu (Nobuyoshi Nakada) 5 months ago. Updated 2 days ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:104552]

Description

Make Time.new parse Time#inspect and ISO-8601 like strings.

  • Time.iso8601 and Time.parse need an extension library, date.
  • Time.iso8601 can't parse Time#inspect string.
  • Time.parse often results in unintentional/surprising results.
  • Time.new also about 1.9 times faster than Time.iso8601.

    $ ./ruby -rtime -rbenchmark -e '
    n = 1000
    s = Time.now.iso8601
    Benchmark.bm(12) do |x|
      x.report("Time.iso8601") {n.times{Time.iso8601(s)}}
      x.report("Time.parse") {n.times{Time.parse(s)}}
      x.report("Time.new") {n.times{Time.new(s)}}
    end'
                       user     system      total        real
    Time.iso8601   0.006919   0.000185   0.007104 (  0.007091)
    Time.parse     0.018338   0.000207   0.018545 (  0.018590)
    Time.new       0.003671   0.000069   0.003740 (  0.003741)
    

https://github.com/ruby/ruby/pull/4639


Related issues

Related to Ruby master - Feature #16005: A variation of Time.iso8601 that can parse yyyy-MM-dd HH:mm:ssOpenActions
Related to Ruby master - Feature #18331: Kernel.#TimeOpenActions
Actions #1

Updated by nobu (Nobuyoshi Nakada) 5 months ago

  • Description updated (diff)

Updated by ioquatix (Samuel Williams) 5 months ago

It looks like good improvement.

But does any other #new implementation work this way? Can we fix Time#parse? I wonder if user will be confused, should they use Time.parse or Time.new?

Why is Time.parse so slow?

This seems like a good performance improvement. But rather than fixing the existing #parse, we introduce new one. Now we move the complexity to the user.

Updated by nobu (Nobuyoshi Nakada) 5 months ago

ioquatix (Samuel Williams) wrote in #note-2:

But does any other #new implementation work this way? Can we fix Time#parse? I wonder if user will be confused, should they use Time.parse or Time.new?

The reason not to "fix Time#parse" is

Time.parse often results in unintentional/surprising results.

Updated by ioquatix (Samuel Williams) 5 months ago

Time.parse often results in unintentional/surprising results.

Can we change it so that it doesn't return unintentional/surprising results? I realise this might be impossible without extending the interface... e.g. Time.parse(..., surprising: false) :p

I agree clean room implementation is nice. But if Time.parse can do unexpected things, a new interface does not fix existing usage so the problem still exists, and now we have two interfaces for doing largely the same thing - one that "works" and one that does "unintentional/surprising" things.

Updated by nobu (Nobuyoshi Nakada) 5 months ago

ioquatix (Samuel Williams) wrote in #note-4:

Time.parse often results in unintentional/surprising results.

Can we change it so that it doesn't return unintentional/surprising results? I realise this might be impossible without extending the interface... e.g. Time.parse(..., surprising: false) :p

Yes, it seems by design, and changing the behavior will just break something.
I thought that the same name but different behavior method would be more confusing.

I agree clean room implementation is nice. But if Time.parse can do unexpected things, a new interface does not fix existing usage so the problem still exists, and now we have two interfaces for doing largely the same thing - one that "works" and one that does "unintentional/surprising" things.

A wrapper version based on the new Time.new would be possible, I think.

Updated by byroot (Jean Boussier) 5 months ago

                   user     system      total        real
Time.iso8601   0.006919   0.000185   0.007104 (  0.007091)
Time.parse     0.018338   0.000207   0.018545 (  0.018590)
Time.new       0.003671   0.000069   0.003740 (  0.003741)

This new interface being faster than Time.iso8601 at parsing iso8601 seems weird to me. When I have IS0-8601 dates (pretty much the only format I really use), I'd expect that by using the dedicated method to parse them, I get the best possible performance.

Can we optimize Time.iso8601 in the same way? Otherwise I can already see people moving to the new API and losing strictness by doing so.

Actions #7

Updated by mame (Yusuke Endoh) 5 months ago

  • Related to Feature #16005: A variation of Time.iso8601 that can parse yyyy-MM-dd HH:mm:ss added

Updated by ioquatix (Samuel Williams) 5 months ago

Yes, it seems by design, and changing the behavior will just break something.

I think there is value in the following interface:

Time.parse(..., strict: true) where we have some very clearly documented interpretation of strict.

If we want to support multiple formatS:

Time.parse(..., format: :loose/:strict) or Time.parse(...,format: :iso8601) etc.

Updated by ioquatix (Samuel Williams) about 1 month ago

I checked the latest update.

I'm still not sure about Time.new(string) parsing a string.

What about Time.parse_strict or Time.parse_iso8601 etc?

Updated by Eregon (Benoit Daloze) 25 days ago

Like ioquatix (Samuel Williams) and byroot (Jean Boussier) I think Time.new taking a iso-like String is not a good idea.

Also it's technically incompatible (the first argument is always the year for Time.new and it even accepts strings):

> Time.new "2234-01-01T00:00:00+07:00"
=> 2234-01-01 00:00:00 +0100

I understand Time.iso8601 and Time.parse need an extension library, date. is annoying and not intuitive.
How about simply making Time.iso8601 a core method?

Time.iso8601 can't parse Time#inspect string.

One shouldn't parse inspect so this should be no problem in practice.
We could add Time#iso8601 though (in core), I think that's much better than time.to_datetime.iso8601.

Updated by byroot (Jean Boussier) 25 days ago

If I might add a nitpick, the actual format is RFC 3339, which is pretty much a subset of ISO 8601.

But yes, +1 to making Time.iso8601 a core method.

Updated by timcraft (Tim Craft) 23 days ago

Nice performance improvement!

In terms of the interface I think it would be confusing to make Time.new parse a string:

  • Using a .parse method for parsing is more conventional / less surprising / more intention revealing
  • Having both .new and .parse would be confusing because it's not clear when to use one or the other, what the trade-offs are etc

It feels very unintiutive and not "least surprise". Similarly Time.parse(...,format: :iso8601) and Time.parse_iso8601 feel a bit clunky.

Is this awkwardness stemming from trying to fit much functionality into the Time class? It would be more work, but if there was a dedicated ISO8601 module we could shift some responsibility away from the Time class. For example:

ISO8601::Time.parse  # strictly ISO8601 parsing, returns a Time object

This could be extended to encapsulate ISO8601 parsing for other classes:

ISO8601::Date.parse
ISO8601::Duration.parse
ISO8601::TimeInterval.parse

The same general pattern could be followed for other formats:

RFC3339::Time.parse
RFC2822::Time.parse
SQL92::Time.parse

Each of those methods would have the same signature, so they could be easily interchanged to get varying degrees of strictness or performance.

From a naming and readability perspective the names are less surprising, more intention revealing, and very greppable.

The higher level Time.parse method could potentially delegate to these methods to support parsing a wider range of formats.

Each module could also be extended to encapsulate string formatting as an alternative to adding instance methods like #iso8601, #rfc3339 etc.


An alternative to using the formats as the top level modules would be to invert and nest the modules under the Time classes, for example:

Time::ISO8601.parse  # instead of ISO8601::Time.parse

Same benefits—I'm not sure which would be considered more Ruby-ish?

Actions #13

Updated by nobu (Nobuyoshi Nakada) 2 days ago

Updated by nobu (Nobuyoshi Nakada) 2 days ago

Eregon (Benoit Daloze) wrote in #note-10:

Also it's technically incompatible (the first argument is always the year for Time.new and it even accepts strings):

> Time.new "2234-01-01T00:00:00+07:00"
=> 2234-01-01 00:00:00 +0100

I overlooked it.
May #18331 be better?

Updated by nobu (Nobuyoshi Nakada) 2 days ago

I remember #18331 won't work with subclasses of Time.
Does Time.at("2021-12-04 02:00:00") make sense?

Updated by sawa (Tsuyoshi Sawada) 2 days ago

nobu (Nobuyoshi Nakada) wrote in #note-15:

I remember #18331 won't work with subclasses of Time.

Can you elaborate on that? Is your concern about certain frameworks using time-like classes other than Time? I don't see any problem with that. If you create a Time object using Kernel.#Time proposed in #18331, that can easily be converted (often without noticing) to such class in case of Ruby on Rails' ActiveSupport::TimeWithZone.

Updated by Eregon (Benoit Daloze) 2 days ago

Yes, Kernel#Time(str) seems better than Time.new(str).

However I think being precise for parsing is key for times.
Hence, why not make Time.iso8601 fast and a core method?
Then code already using the right method would just become faster, and there is no need migrate code to use a new API.

Updated by nobu (Nobuyoshi Nakada) 2 days ago

sawa (Tsuyoshi Sawada) wrote in #note-16:

nobu (Nobuyoshi Nakada) wrote in #note-15:

I remember #18331 won't work with subclasses of Time.

Can you elaborate on that? Is your concern about certain frameworks using time-like classes other than Time? I don't see any problem with that. If you create a Time object using Kernel.#Time proposed in #18331, that can easily be converted (often without noticing) to such class in case of Ruby on Rails' ActiveSupport::TimeWithZone.

The timezone name will be converted by find_timezone method on the class of a Time instance, so that subclasses can override it.

Updated by nobu (Nobuyoshi Nakada) 2 days ago

Eregon (Benoit Daloze) wrote in #note-17:

Hence, why not make Time.iso8601 fast and a core method?
Then code already using the right method would just become faster, and there is no need migrate code to use a new API.

First, the primary target is the result of Time#inspect, and it is not fully compliant with ISO-8601.
Second, ISO-8601 allows many variants, but I'm not going to implement them all.

Actions

Also available in: Atom PDF