Feature #1493

[patch] lex_state as bit field / IS_lex_state() macro

Added by Dave B almost 5 years ago. Updated over 1 year ago.

Assignee:Nobuyoshi Nakada
Target version:2.0.0


# ? Changelog:
# Represent lexstate by bit field instead of serial integer enum
# so that single or multiple values can be checked together using
# unifying macro IS
lex_state(). States remain mutually exclusive.


Each use of current macro ISBEG() ..
#define IS
BEG() (lexstate == EXPRBEG || lexstate == EXPRMID || \
lexstate == EXPRVALUE || lexstate == EXPRCLASS)

.. results in up to 4 tests which can be reduced to 1.

An extreme use case ..

     if (IS_BEG() ||
         lex_state == EXPR_DOT ||
         IS_ARG()) {

.. requires up to 7 tests on lex_state.

To my knowledge, compilers can't optimize this.
As a bit field, there's no need to:

     if (IS_lex_state( EXPR_BEG_ANY | EXPR_ARG_ANY | EXPR_DOT ))

.. reduces to: if (lexstate & (testbits)) // TRUE/FALSE

All changes in this patch were scripted to eliminate the possibility
of typos. Where some replaced sections looked similar or the same,
they were verified with MD5sum before applying!
Multiple state tests were merged using simple, strict logic and any
"surprise" code would have been left unchanged.

Therefore, it should not be necessary to check every line of the patch;
I hope that a check of a random sample will give confidence in the rest.

There were several repeated switches which could have been replaced with:

lexstate = (ISlexstate( EXPRFNAME | EXPRDOT )) ? EXPRARG : EXPR_BEG

but I assumed you would prefer if-else for flexibility and legibility:

if (IS_lex_state( EXPR_FNAME | EXPR_DOT )) {
    lex_state = EXPR_ARG;
else {
    lex_state = EXPR_BEG;

Say what you don't like about anything - it might be easy to change. :)

Wouldn't it be nice to use some spare bits to combine other state?

     if ((lex_state == EXPR_BEG && !cmd_state) || ...
     if (IS_ARG() && space_seen && ...
  • I haven't explored those, yet.

    My motivation was not to reduce processor heat. ;)
    In the future, if we need to break the dependence on Yacc/Bison, these
    state transitions might need to become token information and the change
    makes them far easier to store.
    ( e.g. Any of 'THESE states' => 'THIS state' )


IS_LEX.patch Magnifier (13.2 KB) Dave B, 05/20/2009 02:18 AM

Associated revisions

Revision 37338
Added by Nobuyoshi Nakada over 1 year ago

parse.y: bit field lex_state

  • parse.y (enum lexstatee): [EXPERIMENTAL] lexstate as bit field / ISlex_state() macro. based on the patch by Dave B in . [Feature #1493]


#1 Updated by Marc-Andre Lafortune over 4 years ago

  • Category set to core
  • Assignee set to Yukihiro Matsumoto
  • Priority changed from Normal to Low



#2 Updated by Kazuhiro NISHIYAMA about 4 years ago

  • Category set to core
  • Status changed from Open to Assigned
  • Target version set to 2.0.0



#3 Updated by Yusuke Endoh about 2 years ago

  • Assignee changed from Yukihiro Matsumoto to Nobuyoshi Nakada

Nobu, aren't you interested in this? Could you please review and
import the patch if it looks good to you?

This seems to be just a patch for refactoring. Though, the purpose
is not so clear to me. (Who is planning to break the dependence on
bison?) But, the patch seems benign (I read only the ticket, not
the patch itself).

Yusuke Endoh mame@tsg.ne.jp

#4 Updated by Nobuyoshi Nakada over 1 year ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r37338.
Dave, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

parse.y: bit field lex_state

  • parse.y (enum lexstatee): [EXPERIMENTAL] lexstate as bit field / ISlex_state() macro. based on the patch by Dave B in . [Feature #1493]

Also available in: Atom PDF