attribute as a weak keyword

Note that in the following we use the spelling attribute when
referring to the directive and Attribute for an identifier. This
is according to the GPCS and might make the following text clearer.
However, it cannot be a criterion for resolving the conflict since
the compiler must treat both spellings equally. The same applies,
of course, to the line-breaks and white-space used here for
readability.
Making attribute a weak keyword leads to a shift/reduce (S/R)
conflict in variable declarations (whereas routine declarations go
without conflicts). Consider this case:
var a: Integer; attribute (...)
vs.
var a: Integer; Attribute: ...
After reading the ';', the parser must decide whether to shift it,
or to reduce to a variable declaration. But the next token,
attribute, doesn't decide it, and bison can only look ahead one
token.
The following token would resolve the problem, since the directive
attribute is always followed by '(' whereas an identifier in a
variable declaration can be followed by ',' or ':', but never '('.
More generally, an identifier in an id_list in the parser can never
be followed by '(' (while identifiers in other contexts can be,
e.g. in function calls). This must be carefully checked manually
through the whole grammar!
Thus, the solution consists of two steps. Firstly, the lexer does
the additional look-ahead that bison can't do. When it reads the
word attribute (and it is not disabled by dialect options or by the
user or shadowed by some declaration), then if the next token is
not '(', it can only be an identifier, so the lexer returns LEX_ID.
If the next token is '(', the lexer returns p_attribute.
Lexer look-ahead is not really nice, either, e.g. because it
increases the "shift" of compiler directives. At least, we only
have to read ahead two characters plus preceding white-space (two
because of '(.'), and not an actual token. The latter would bring
additional complications: saving and restoring lexer semantic
values and the state of lexer/parser interrelation variables (see
Lexer/parser) such as lex_const_equal, and then either lexing the
token again later or handling the cases where the parser modifies
these variables in between. This would get really messy.
Secondly, the parser accepts p_attribute as an identifier except in
an id_list. To achieve this, the nonterminal new_identifier_limited
is used within id_list.
Note: Using new_identifier_limited does not mean that Attribute
can't be used as an identifier in this place. Instead, this
nonterminal can never be followed by '(', so the lexer will have
turned Attribute into a LEX_ID token already.
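The relation between the two nonterminals might be modelled like
this. It is a simplified sketch under the assumption that
new_identifier accepts both tokens while new_identifier_limited
accepts only LEX_ID; the function rule_accepts and both enums are
made up for illustration and do not appear in the real grammar.

```c
#include <assert.h>
#include <stdbool.h>

/* Token codes named after those in the text; values are arbitrary. */
enum token { LEX_ID = 1, p_attribute = 2 };

/* Hypothetical stand-ins for the two grammar nonterminals. */
enum ident_rule { NEW_IDENTIFIER, NEW_IDENTIFIER_LIMITED };

/* Returns true if `rule' can derive the single token `tok'. */
bool rule_accepts(enum ident_rule rule, enum token tok)
{
    assert(rule == NEW_IDENTIFIER || rule == NEW_IDENTIFIER_LIMITED);
    if (tok == LEX_ID)
        return true;              /* plain identifiers fit everywhere */
    /* p_attribute only counts as an identifier in the full rule. */
    return rule == NEW_IDENTIFIER && tok == p_attribute;
}
```

Since the lexer only returns p_attribute when '(' follows, and the
limited nonterminal is never followed by '(', rejecting p_attribute
there does not reject any valid program.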
Actually, that's not all: In a constant_definition, the conflict is
not against id_list, but against a simple new_identifier. But we
can just use new_identifier_limited instead in the
constant_definition rule.
This finally solves all conflicts with attribute.
fjf792*.pas are test programs for these cases.