Bug #11706

closed

Clean up files enc/unicode/name2ctype.{h.blt,kwd,src}

Added by duerst (Martin Dürst) about 9 years ago. Updated over 3 years ago.

Status: Closed
Target version: -
[ruby-core:71542]

Description

The files name2ctype.{h.blt,kwd,src} in enc/unicode are intermediate products that are not needed in the repository and haven't been committed consistently. I propose to remove them.

[I'm not sure whether this is a bug or a feature, but it doesn't provide any new functionality, so feature doesn't seem right.]

[I've assigned this to Nobu for feedback; I can execute it once we agree on a way forward.]

On 2015/11/17 15:39, Nobuyoshi Nakada wrote:

Please update name2ctype.{h.blt,kwd,src} files too.

Thanks for the reminder. I had a look at these files. Maybe before further commits, we can try to simplify things a bit, and/or to ignore irrelevant stuff.

Sorry this message is long. Looking at the three files you mentioned, I noticed the following:

enc/unicode/name2ctype.kwd was also produced on the Onigmo side when I worked on the update (see also https://github.com/k-takata/Onigmo/pull/58). However, it is not part of the Onigmo distribution.
It was last committed by Yui Naruse in r36070, on 2012/06/14. This is well before the update to Unicode 7.0.0 in r46831.

On 2011/11/20, K. Takata introduced https://github.com/k-takata/Onigmo/blob/master/tool/convert-name2ctype.sh, which is used as:

    convert-name2ctype.sh name2ctype.kwd > name2ctype.h

to directly convert from name2ctype.kwd to name2ctype.h (although it produces a few numbered intermediary files that are removed in the last step).

enc/unicode/name2ctype.h.blt was last committed by you in r49292 on 2015/01/17. Your log message mentions r46831, but it is unclear why you updated .h.blt and not .kwd and .src. The last commit before this was r36070, the same as for name2ctype.kwd.

enc/unicode/name2ctype.src also was last committed in r36070.

Looking at Makefile.in, it contains instructions to create enc/unicode/name2ctype.h from enc/unicode/name2ctype.kwd at http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/Makefile.in?view=markup#l340. There, .h.blt and .src are mentioned, but my knowledge of shell syntax isn't good enough to understand what exactly is supposed to happen.

My conclusions so far would be:

  • name2ctype.{h.blt,kwd,src} are all intermediary files that are not
    actually used directly for building Ruby.
  • In the last few years, these three files have been committed only
    rarely and accidentally, not in any visible sync with actual bug fixes
    or feature additions.
  • Onigmo no longer uses name2ctype.h.blt and .src, and does not commit
    .kwd.
  • The build process on the Onigmo side, although I did it manually, was
    well documented and painless; on the Ruby side, it may be possible to
    build enc/unicode/name2ctype.h (the file that's finally used for
    compilation), but I haven't found how to do so.
  • For a process that needs to be done about once a year, this amount of
    manual work seems perfectly fine (at least for me, and I volunteer to
    do it again next year).
  • Therefore, I suggest that we stop worrying about committing
    name2ctype.{h.blt,kwd,src}. If you want me to commit
    enc/unicode/name2ctype.kwd, I can do it (because I have the new
    version). Indeed, it might be better to remove these three files;
    they only make checkouts heavier.
  • If we want to simplify the production process, my preference would be
    to update Makefile.in based on convert-name2ctype.sh, or to directly
    integrate convert-name2ctype.sh into tool/enc-unicode.rb
    (why would one want to use sed and friends if we already use Ruby?).
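
To illustrate the last point, here is a minimal sketch of how a chain of sed-style substitutions could be expressed in Ruby. This is only an assumption about what such an integration might look like: the rule list and the method name postprocess are hypothetical, and the patterns are placeholders, not the actual substitutions performed by convert-name2ctype.sh or Makefile.in.

```ruby
# Hypothetical rules for illustration only; the real substitutions are
# whatever convert-name2ctype.sh and Makefile.in currently apply with sed.
HYPOTHETICAL_RULES = [
  [/^#line .*\n/, ""],  # e.g. drop "#line" directives from generated C
  [/[ \t]+$/, ""],      # e.g. strip trailing whitespace on each line
].freeze

# Apply each [pattern, replacement] pair in order, like a chain of `sed -e`.
def postprocess(text, rules = HYPOTHETICAL_RULES)
  rules.reduce(text) { |acc, (pattern, replacement)| acc.gsub(pattern, replacement) }
end
```

A generator script could then call postprocess on the generated text before writing name2ctype.h, so the numbered intermediary files would not be needed.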

Related issues: 1 (0 open, 1 closed)

Related to Ruby master - Feature #11563: Update Onigmo regular expression engine to Unicode Version 8.0.0 (Closed; duerst (Martin Dürst))