Bug #10416: Create mechanism for updating of Unicode data files downstreams when we want - Ruby - Ruby Issue Tracking System


Added by duerst (Martin Dürst) almost 12 years ago. Updated over 2 years ago.

Status: Assigned
Assignee: nobu (Nobuyoshi Nakada)
Target version:
ruby -v: ruby 2.2.0dev (2014-10-22 trunk 48092) [x86_64-cygwin]
Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN
Tags: build

[ruby-core:65843]

Description

The current mechanism for updating Unicode data files will create the following problem:
Downstream compilers/packagers will download Unicode data files ONE time (they may already have done so).

However, if they don't activate ALWAYS_UPDATE_UNICODE = yes, these files will never get updated, and they will stay on Unicode version 7.0 even if in five years Unicode is e.g. on version 12.0.
On the other hand, if they activate ALWAYS_UPDATE_UNICODE = yes (and assuming issue #10415 gets fixed), they constantly update to the latest version of Unicode. That's good for those who actually want this, but not what our current policy is.
What's missing is a way for us (Ruby core) to make sure downstream checkouts update to a new Unicode version when we want them to do so (as we can e.g. do for other parts that are based on Unicode data, see e.g. https://bugs.ruby-lang.org/issues/9092), without sending an email to everybody and hoping they read and follow it.

[Currently, the only solution I know will work is the one pointed out by Yui Naruse in https://bugs.ruby-lang.org/issues/10084#note-17, but I'm okay with any other solution.]

Related issues 3 (0 open — 3 closed)

Updated by nobu (Nobuyoshi Nakada) almost 12 years ago
#1 [ruby-core:65855]

It affects only developers who build from the repository.
Released packages should have the latest (and fixed) version at the release time.

Updated by naruse (Yui NARUSE) over 11 years ago
#2 [ruby-core:65925]

Over the years, the file structure of the Unicode data has changed several times.
Therefore there's no guarantee that Unicode 12 can work with the current script.

Updated by duerst (Martin Dürst) over 11 years ago
#3 [ruby-core:65932]

Yui NARUSE wrote:

Over the years, the file structure of the Unicode data has changed several times.
Therefore there's no guarantee that Unicode 12 can work with the current script.

I agree (but see last paragraph of this comment). But that's not what this issue is about.

What I'm talking about is that next year, at some point in time, we decide that ruby trunk is upgraded to Unicode 8.0 (and so on probably every year). This was the case this year for Unicode 7.0, see issue #9092.

We do this after checking that the new Unicode data files work with the current script (first the beta files and then the final releases), and if they don't work, then we upgrade the script. Then we commit, and everybody on trunk gets the changes when they update. But currently, this is not the case for the Unicode data files, and people on trunk will have to make a special effort to upgrade.

Besides committing lib/unicode_normalize/tables.rb (nobu reverted it but didn't give any reason why), there's another way to achieve this goal:

Note in a file the versions or timestamps of the 'official' version of the Ruby trunk Unicode data files. This could be part of a .mk file, or a new file. Of the three files we currently download, two have a header (first two lines) like this:

# NormalizationTest-7.0.0.txt
# Date: 2013-11-27, 09:54:41 GMT [MD]

So we could note the version and/or date we want people on trunk to use, and check against it. But one file, UnicodeData.txt, doesn't contain this information in the file itself, so we have to rely on the date in the Last-Modified HTTP header (which we already use to avoid repeated downloads of the same file).
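
The header check described above could be sketched in Ruby roughly as follows. This is only an illustration, not the mechanism Ruby's build actually uses: the helper names `unicode_header_info` and `needs_update?`, and the idea of passing the wanted version as a string, are assumptions made for the example.

```ruby
# Patterns for the two header lines shown above, e.g.
#   # NormalizationTest-7.0.0.txt
#   # Date: 2013-11-27, 09:54:41 GMT [MD]
HEADER_NAME = /\A#\s*(?<name>[A-Za-z]+)-(?<version>\d+\.\d+\.\d+)\.txt/
HEADER_DATE = /\A#\s*Date:\s*(?<date>\d{4}-\d{2}-\d{2})/

# Hypothetical helper: returns the name, version and date found in the
# first two lines, or nil when the file has no such header.
def unicode_header_info(lines)
  first, second = lines.first(2)
  name = HEADER_NAME.match(first.to_s)
  return nil unless name
  date = HEADER_DATE.match(second.to_s)
  { name: name[:name], version: name[:version], date: date && date[:date] }
end

# Hypothetical helper: true when the local file lacks a header or carries
# a version other than the one trunk wants.
def needs_update?(lines, wanted_version)
  info = unicode_header_info(lines)
  info.nil? || info[:version] != wanted_version
end

header = ["# NormalizationTest-7.0.0.txt\n",
          "# Date: 2013-11-27, 09:54:41 GMT [MD]\n"]
needs_update?(header, "7.0.0") # local copy matches the wanted version
needs_update?(header, "8.0.0") # local copy is on a different version
```

A file without the header makes `unicode_header_info` return nil, so this check alone cannot decide anything for such a file.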

The reason why UnicodeData.txt doesn't contain these header lines is that this is a very old file and the Unicode Consortium is actually quite careful not to make any changes that could affect the users of a file. If data of a different type is needed, then it is provided in a separate file.
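
For a file without such a header, the comparison would have to use the Last-Modified value instead. A minimal sketch, assuming the wanted timestamp is recorded somewhere in the repository and the header value of the local copy was saved at download time. `outdated?` is a hypothetical helper, and the dates in the usage lines are made up for illustration.

```ruby
require "time"

# Hypothetical helper: true when the saved Last-Modified value of the local
# copy is missing, malformed, or older than the timestamp trunk asks for.
def outdated?(saved_last_modified, wanted_time)
  return true if saved_last_modified.nil?
  Time.httpdate(saved_last_modified) < wanted_time
rescue ArgumentError
  true # unparsable header value: safer to download again
end

# Illustrative values only, not the real timestamps of UnicodeData.txt.
wanted = Time.utc(2015, 6, 1)
outdated?("Wed, 27 Nov 2013 09:54:41 GMT", wanted) # older than wanted
outdated?("Mon, 01 Jun 2015 00:00:00 GMT", wanted) # not older than wanted
```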

Updated by duerst (Martin Dürst) over 11 years ago
#4 [ruby-core:66013]

I committed r48194, switching the download location to http://www.unicode.org/Public/7.0.0/ucd/ (i.e. Unicode Version 7.0.0), as discussed at the meeting yesterday. This does not yet address this bug, because when we change this to http://www.unicode.org/Public/8.0.0/ucd/ next year, the new files won't automatically be downloaded.
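
One way to picture the remaining problem, purely as a sketch: a download that is skipped whenever the local file exists will not notice a changed URL, whereas a local path that also contains the version makes the file look missing after a version bump. The helper names and the directory layout below are assumptions for illustration, not a description of Ruby's build scripts.

```ruby
# Build the download URL for a given Unicode version and file name.
def ucd_url(version, file)
  "http://www.unicode.org/Public/#{version}/ucd/#{file}"
end

# Hypothetical layout: keep each version's files in their own directory.
def ucd_local_path(data_dir, version, file)
  File.join(data_dir, version, file)
end

# With the version in the path, bumping the version makes the file
# "missing" locally, so the existing-file check triggers a download.
def download_needed?(data_dir, version, file)
  !File.exist?(ucd_local_path(data_dir, version, file))
end

ucd_url("7.0.0", "UnicodeData.txt")
# => "http://www.unicode.org/Public/7.0.0/ucd/UnicodeData.txt"
```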

Updated by duerst (Martin Dürst) over 11 years ago
#5 [ruby-core:66019]

Related to Bug #10458: After r48196, make cannot complete because of Unicode file download problem added

Updated by naruse (Yui NARUSE) over 8 years ago
#6

Target version deleted (~~2.2.0~~)

Updated by jeremyevans0 (Jeremy Evans) almost 5 years ago
#7 [ruby-core:105602]

@duerst (Martin Dürst) Do you know if this is still an issue in the master branch?

Updated by duerst (Martin Dürst) almost 5 years ago
#8 [ruby-core:105605]

jeremyevans0 (Jeremy Evans) wrote in #note-7:

@duerst (Martin Dürst) Do you know if this is still an issue in the master branch?

I suspect it is still "an issue", i.e. it still happens.
Nobody has complained about it, and so it may be that it's an irrelevant issue.
I will update Ruby to Unicode 14.0.0 in a couple weeks or so, and will look out for this, and then either close it or push it forward.

Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago
#9 [ruby-core:114450]

@duerst (Martin Dürst) Do you think this can be closed?

Updated by duerst (Martin Dürst) almost 3 years ago
#10

Related to Feature #19171: Update Unicode data to Unicode Version 15.1 added

Updated by duerst (Martin Dürst) almost 3 years ago
#11 [ruby-core:114461]

The next version of Unicode (15.1) will be released in about 3 weeks. I'll check at that point, and close if no longer relevant.

Updated by nobu (Nobuyoshi Nakada) almost 3 years ago
#12 [ruby-core:114933]

The current enc-unicode.rb seems to fail because of the Indic_Conjunct_Break properties, which have values.

I'm not sure how these properties should best be handled.
Should it be /\p{InCB_Linker}/ or /\p{InCB=Linker}/, as in the comments in that file?
https://github.com/nobu/ruby/tree/unicode-15.1 uses the former.
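
To make the two naming options concrete, here is a small sketch of how a data line that carries a property value might be parsed and turned into a name. It is not the code in enc-unicode.rb. `parse_property_line` and `property_name` are hypothetical helpers, and the sample line is assumed to have the usual `codepoints ; Property; Value # comment` shape of the Unicode data files.

```ruby
# Parse one data line of the form "codepoints ; Property[; Value] # comment".
# Returns nil for blank or comment-only lines.
def parse_property_line(line)
  data = line.sub(/#.*/, "").strip
  return nil if data.empty?
  range, prop, value = data.split(/\s*;\s*/)
  first, last = range.split("..").map { |hex| hex.to_i(16) }
  { range: first..(last || first), property: prop, value: value }
end

# Join property and value with the chosen separator; properties without
# a value keep their plain name.
def property_name(entry, separator = "_")
  entry[:value] ? "#{entry[:property]}#{separator}#{entry[:value]}" : entry[:property]
end

entry = parse_property_line("094D          ; InCB; Linker # Mn       DEVANAGARI SIGN VIRAMA")
property_name(entry)      # => "InCB_Linker"
property_name(entry, "=") # => "InCB=Linker"
```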

Updated by nobu (Nobuyoshi Nakada) almost 3 years ago
#13

Related to Feature #19908: Update to Unicode 15.1 added

Updated by hsbt (Hiroshi SHIBATA) over 2 years ago
#14

Status changed from Open to Assigned
