Bug #14102

closed

Date.strptime ignores constraints provided by day name

Added by cyclotron3k (A Samuel) over 6 years ago. Updated over 4 years ago.

Status: Rejected
Assignee: -
Target version: -
ruby -v: 2.4.2
[ruby-core:83746]

Description

RUBY_VERSION
=> "2.4.2" # also tested in 2.5.0

require 'date'
=> true

Date.strptime('Potato, November 13, 2017', '%A, %B %d, %Y').strftime('%A, %B %d, %Y')
=> ArgumentError: invalid date

Date.strptime('Friday, November 31, 2017', '%A, %B %d, %Y').strftime('%A, %B %d, %Y')
=> ArgumentError: invalid date

# November 13, 2017 is a Monday
Date.strptime('Tuesday, November 13, 2017', '%A, %B %d, %Y').strftime('%A, %B %d, %Y')
=> "Monday, November 13, 2017"

None of the dates above are valid, but only the last one gets coerced into a date instead of raising an error.

Updated by jeremyevans0 (Jeremy Evans) over 6 years ago

cyclotron3k (A Samuel) wrote:

Date.strptime('Potato, November 13, 2017', '%A, %B %d, %Y').strftime('%A, %B %d, %Y')
=> ArgumentError: invalid date

Fails because Potato is not a valid day name.

Date.strptime('Friday, November 31, 2017', '%A, %B %d, %Y').strftime('%A, %B %d, %Y')
=> ArgumentError: invalid date

Fails because November only has 30 days.

Date.strptime('Tuesday, November 13, 2017', '%A, %B %d, %Y').strftime('%A, %B %d, %Y')
=> "Monday, November 13, 2017"

Doesn't fail because %d takes precedence over %A. Date.strptime does not check that all the format specifier values are internally consistent. This is true not just for day names but in general:

Date.strptime('4 3', '%W %d')
=> #<Date: 2017-11-03 ((2458061j,0s,0n),+0s,2299161j)>

Date.strptime('4 3', '%w %d')
=> #<Date: 2017-11-03 ((2458061j,0s,0n),+0s,2299161j)>

You can even pass the same specifier multiple times, in which case the last one wins:

Date.strptime('3 4', '%d %d')
=> #<Date: 2017-11-04 ((2458062j,0s,0n),+0s,2299161j)>

Date.strptime('3 4', '%m %m')
=> #<Date: 2017-04-01 ((2457845j,0s,0n),+0s,2299161j)>
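
The results above depend on the date the snippet was run, because missing fields are filled in from today's date. A minimal sketch that shows the "last one wins" behaviour independently of the current date uses Date._strptime, which returns the parsed fields as a hash and does not build a Date:

```ruby
require 'date'

# Date._strptime only parses the string into a Hash of fields.
# A repeated specifier overwrites the value stored by the earlier one.
day_fields   = Date._strptime('3 4', '%d %d')
month_fields = Date._strptime('3 4', '%m %m')

p day_fields    # only mday is present, with the value 4 (the 3 is gone)
p month_fields  # only mon is present, with the value 4
```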

I get the feeling that asking Date.strptime to check that all format specifier values are internally consistent is asking too much. It may be possible to correctly handle all cases, but it would be very complex. There are also a lot of situations where the combination of format specifiers used still results in ambiguity. Consider Date.strptime('4 3', '%w %d'), where week day is 4 and month day is 3. Should it try to find the closest Thursday that is the 3rd of the month?
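
For callers who want contradictory input rejected, a limited check can be done outside the library. The following is only a sketch: conflicting_fields and strict_strptime are hypothetical helpers, not part of the Date API. They compare the raw fields from Date._strptime with the Date that Date.strptime actually returned. Only fields that map directly onto a Date reader method are compared, so %W and %U are not covered.

```ruby
require 'date'

# Parsed fields that have a Date reader method of the same name.
# This list is an assumption made for the sketch, not an exhaustive one.
COMPARABLE_FIELDS = %i[year mon mday wday yday cwyear cweek cwday].freeze

# Hypothetical helper: returns the parsed fields that disagree with the
# Date that Date.strptime produced for the same input.
def conflicting_fields(string, format)
  date   = Date.strptime(string, format)        # raises ArgumentError on invalid input
  parsed = Date._strptime(string, format) || {}
  COMPARABLE_FIELDS.select do |field|
    parsed.key?(field) && parsed[field] != date.public_send(field)
  end
end

# Hypothetical helper: like Date.strptime, but raises when a parsed
# field was silently overridden.
def strict_strptime(string, format)
  conflicts = conflicting_fields(string, format)
  unless conflicts.empty?
    raise ArgumentError, "invalid date (conflicting fields: #{conflicts.join(', ')})"
  end
  Date.strptime(string, format)
end
```

With the example from the description, conflicting_fields('Tuesday, November 13, 2017', '%A, %B %d, %Y') reports :wday, and strict_strptime raises ArgumentError for the same input.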

Updated by cyclotron3k (A Samuel) over 6 years ago

jeremyevans0 (Jeremy Evans) wrote:

Doesn't fail because %d takes precedence over %A.

The point I was trying to make was that they are all invalid dates, and so I would expect all of them to raise errors.

Your explanation of directive precedence makes sense, but it's not mentioned in the documentation, and it would help to have this behaviour documented. Still, although it's possible to ignore the contradictions and extract some kind of meaning, it seems a bit PHP'y to ignore the ambiguity (or outright contradiction) and return a result at any cost.

You can even pass the same specifier multiple times, in which case the last one wins.

I noticed that too. In my opinion, ('3 4', '%d %d') is as valid as Potato (i.e. not valid) and should raise an error.

I get the feeling that asking Date.strptime to check that all format specifier values are internally consistent is asking too much.

I was worried that may be the answer.

Regarding precedence, I did some further testing:

# %W - Week number of the year.  The week starts with Monday.  (00..53)
# %w - Day of the week (Sunday is 0, 0..6)
# %d - Day of the month, zero-padded (01..31)

Date.strptime('1', '%W').strftime('%A, %B %d, %Y') # first week, strptime assumes this year
=> "Monday, January 02, 2017"

Date.strptime('1 2', '%W %w').strftime('%A, %B %d, %Y') # add week day of Tuesday, and strptime picks the Tuesday of that week
=> "Tuesday, January 03, 2017"

Date.strptime('1 2 4', '%W %w %d').strftime('%A, %B %d, %Y') # add a contradictory date, and it is ignored
=> "Tuesday, January 03, 2017"

Date.strptime('1 3', '%W %d').strftime('%A, %B %d, %Y') # remove the %w, and strptime now ignores %W and assumes November
# there is no contradiction here, but strptime interpreted it incorrectly.
=> "Friday, November 03, 2017"

So far, I've discovered that %W and %w will be ignored when %d is also specified, but %d will be ignored when %W and %w are specified at the same time. I'm getting the impression that documenting the precedence of format directives is not going to be easy.
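
One way to separate parsing from precedence in these experiments is to look at the raw fields. The sketch below feeds the same inputs to Date._strptime, which parses but does not build a Date. Every field is present in the hash, which suggests that the choice of which fields to honour is made afterwards, when the Date is constructed:

```ruby
require 'date'

# Formats and inputs taken from the experiments above.
experiments = {
  '%W'       => '1',
  '%W %w'    => '1 2',
  '%W %w %d' => '1 2 4',
  '%W %d'    => '1 3'
}

experiments.each do |format, input|
  # Date._strptime returns a Hash of parsed fields; nothing is dropped here.
  fields = Date._strptime(input, format)
  puts "#{format.ljust(9)} #{fields.inspect}"
end
```

In the printed hashes, %W appears under the key wnum1, %w under wday, and %d under mday.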

Also, %W seems to have the same precedence as %w, because in the second part of the following example strptime could ignore one or the other to make a valid date, but doesn't:

Date.strptime('53 1 2018', '%W %w %Y').strftime('%A, %B %d, %Y') # the Monday in the last week of 2018
=> "Monday, December 31, 2018"

Date.strptime('53 2 2018', '%W %w %Y').strftime('%A, %B %d, %Y')
=> ArgumentError: invalid date

It may be possible to correctly handle all cases, but it would be very complex. There are also a lot of situations where the combination of format specifiers used still results in ambiguity. Consider Date.strptime('4 3', '%w %d'), where week day is 4 and month day is 3. Should it try to find the closest Thursday that is the 3rd of the month?

I think ambiguity is a different problem to outright contradiction and I'm not sure what the solution is, but I think what strptime does at the moment isn't the right answer; it seems to guess parts that are missing, even when it contradicts information that has been provided (see above).

For resolving ambiguity, I'm imagining a kind of inverted Sieve of Eratosthenes, where each piece of supplied information defines one or many valid dates on an infinite calendar; the closest date to today, where all constraints are met, is the answer. The tricky part (impossible?) is determining if a point where all constraints are met even exists.
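
A rough sketch of that idea, under several assumptions: nearest_matching_date and FIELD_CHECKS are hypothetical names, only a handful of fields are supported, and the search is a brute-force walk outwards from a reference date rather than a real sieve. The max_days limit is arbitrary, so a nil result means "nothing found within the window", not a proof that no matching date exists.

```ruby
require 'date'

# Hypothetical mapping from parsed field to a test on a candidate date.
FIELD_CHECKS = {
  year:  ->(date, value) { date.year == value },
  mon:   ->(date, value) { date.mon == value },
  mday:  ->(date, value) { date.mday == value },
  wday:  ->(date, value) { date.wday == value },
  yday:  ->(date, value) { date.yday == value },
  wnum0: ->(date, value) { date.strftime('%U').to_i == value },
  wnum1: ->(date, value) { date.strftime('%W').to_i == value }
}.freeze

# Hypothetical helper: returns the date closest to `today` that satisfies
# every parsed field, or nil if none is found within `max_days`.
def nearest_matching_date(string, format, today: Date.today, max_days: 10_000)
  fields = Date._strptime(string, format)
  raise ArgumentError, 'invalid date' if fields.nil? || fields.key?(:leftover)

  unsupported = fields.keys - FIELD_CHECKS.keys
  raise ArgumentError, "unsupported fields: #{unsupported.join(', ')}" unless unsupported.empty?

  (0..max_days).each do |offset|
    # On a tie, the future date is preferred over the past one.
    [today + offset, today - offset].each do |candidate|
      return candidate if fields.all? { |name, value| FIELD_CHECKS[name].call(candidate, value) }
    end
  end
  nil
end

# Weekday 4 (Thursday) and month day 3, searched from a fixed reference date.
p nearest_matching_date('4 3', '%w %d', today: Date.new(2017, 11, 15))
# A contradictory input has no match, so the bounded search returns nil.
p nearest_matching_date('Tuesday, November 13, 2017', '%A, %B %d, %Y',
                        today: Date.new(2017, 12, 1))
```

With the reference date fixed at 2017-11-15, the first call finds Thursday, August 3, 2017, which is closer than the next match after the reference date (Thursday, May 3, 2018).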

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

  • Status changed from Open to Rejected