implementation, constructor, destructor, operator, uses, import and initialization as weak keywords

In ISO 7185 Pascal, each section of the source code is uniquely
introduced by a keyword ('program', 'const', 'type', 'var', 'label',
'procedure', 'function', 'begin'). However, the ending of some of
these sections (in particular 'const', 'type' and 'var') is not
intrinsically defined, but only by the context (the next of these
"critical" keywords). E.g., 'var Foo: Integer;' can be a complete
variable declaration part (if one of those keywords follows), or only
a part of one, as in 'var Foo: Integer; Bar: Integer;'. (For the other
keywords, the ending is intrinsically defined: the 'program' heading
and 'label' declarations end with the next ';'. For 'procedure' and
'function' it's a little more complicated, due to forward
declarations, but still well-defined, and 'begin' ends with the
matching 'end'.) The same applies to sections within one routine,
except that 'program' cannot occur there.
Extended Pascal adds 'to' (in 'to begin do' and 'to end do') and 'end'
(in interface modules and implementation modules without initializer
and finalizer) to those "critical" keywords.
It also adds two keywords which are not defined in classic Pascal,
namely 'export' and 'import'. However, they can only occur at the
beginning of the source or of a module implementation, so they have
fewer chances to conflict with those other keywords. The same applies
to UCSD/Borland Pascal's 'uses' for units. ('uses' terminates at the
first ';', whereas 'export' and 'import' do not necessarily, like
'var' etc.)
The problem gets bigger with UCSD/Borland Pascal's 'implementation'
in units. It can occur after the interface part, so it might follow,
e.g., a variable declaration part. And it is not an ISO 7185 Pascal
keyword.
If we want to treat 'implementation' as a weak keyword, it must
not conflict with new identifiers anywhere in the grammar.
However, variable declaration parts are not self-contained in the
sense described above, so after a variable declaration part it is
not immediately clear if the part is finished or will continue. So
this is a place where a new identifier is acceptable. E.g.:
interface var Bar: Integer; Implementation: Integer;
vs.
interface var Bar: Integer; implementation
The same applies to 'implementation' after 'const', 'type', 'export'
and 'import' parts.
The same problem also occurs with the Borland Pascal and Object
Pascal keywords 'constructor' and 'destructor', the Borland Delphi
keyword 'initialization', and the PXSC keyword 'operator', since the
respective declarations can follow variable declaration blocks etc.
It also happens with 'import' (but it is only possible after an
'export' part) and with 'uses' if we allow it after other
declarations (GPC extension).
Again, we play some lexer tricks. We observe that the new identifier
in 'export', 'var', 'const' and 'type' is always followed by either
',', ':' or '=', while none of the keywords 'implementation',
'constructor', 'destructor', 'operator', 'import' and 'uses' is ever
followed by one of these symbols, with two exceptions. The first is
that 'operator =' is valid, overloading the '=' operator.
Consider:
type Foo = record end; Operator = (a, b); { enum type }
vs.
type Foo = record end; operator = (a, b: Foo) c: Foo;
To decide whether 'operator' is a keyword, we would have to
look ahead six tokens! Anyway, that seems to be a new record
(where "record" in this sentence can be read either as a Pascal
keyword or in at least one of the usual English meanings ;-).
The other exception is that 'initialization' can, in principle, be
followed by '(', as in:
implementation type Foo = Integer; Initialization (Obj: Integer)
vs.
implementation type Foo = Integer; Initialization (Obj as SubObj).Method;
This would require three tokens of look-ahead. However, a '(' at the
beginning of a statement is quite uncommon, so we just disallow it
here, and the use of 'Initialization' as an identifier is not
restricted.
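The look-ahead counts quoted above can be checked mechanically. The
following Python sketch is not part of GPC: its tokenizer is
deliberately crude, and the helper names 'tokens' and
'lookahead_needed' are made up for this illustration. It lower-cases
both readings of an example (Pascal is case-insensitive) and reports
how many tokens after the weak keyword must be read before the two
readings differ.

```python
import re

# Crude tokenizer (an assumption of this sketch): identifiers/keywords,
# or any single non-blank character. Enough for the examples in the text.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\S")


def tokens(source):
    """Split a source fragment into lower-cased tokens."""
    return [t.lower() for t in TOKEN.findall(source)]


def lookahead_needed(reading_a, reading_b, word):
    """Number of tokens after `word` needed to tell the readings apart.

    Returns None if the two token streams never differ within the
    shorter of the two.
    """
    ta, tb = tokens(reading_a), tokens(reading_b)
    start = ta.index(word.lower()) + 1
    if ta[:start] != tb[:start]:
        raise ValueError("the readings differ before the weak keyword")
    for n, (x, y) in enumerate(zip(ta[start:], tb[start:]), start=1):
        if x != y:
            return n
    return None


# The two pairs of examples from the text.
OPERATOR_AS_IDENT = "type Foo = record end; Operator = (a, b);"
OPERATOR_AS_KEYWORD = "type Foo = record end; operator = (a, b: Foo) c: Foo;"
INIT_AS_IDENT = "implementation type Foo = Integer; Initialization (Obj: Integer)"
INIT_AS_KEYWORD = ("implementation type Foo = Integer; "
                   "Initialization (Obj as SubObj).Method;")
```

With these definitions,
'lookahead_needed(OPERATOR_AS_IDENT, OPERATOR_AS_KEYWORD, "operator")'
yields 6 and
'lookahead_needed(INIT_AS_IDENT, INIT_AS_KEYWORD, "initialization")'
yields 3, matching the figures given in the text.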
Doing so much look-ahead would be a huge effort and cause some
complications as noted above. This seems inappropriate for such an
academic example. So, until someone comes up with a clever trick to
cope with this case, we give up here and let 'operator' before '='
be a keyword, so overloading '=' is possible. This means that
'operator' cannot be used as an export
interface, a type or an (untyped) constant, unless the keyword is
disabled explicitly or by dialect options. (Enabling and disabling
the keyword by the parser would also have been no option here, since
the parser would need the 6-token look-ahead just as well, which it
cannot do.)
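Taken together, the decisions above amount to a one-token look-ahead
rule in the lexer. A minimal Python sketch follows. It is an
illustration only, not GPC's actual lexer code, and the names
'classify', 'WEAK_KEYWORDS' and 'IDENTIFIER_FOLLOWERS' are assumptions
made for this example. It models only the position after a 'var',
'const', 'type' or 'export' part, where a new identifier may appear.

```python
# Weak keywords discussed in this section.
WEAK_KEYWORDS = {
    "implementation", "constructor", "destructor",
    "operator", "uses", "import", "initialization",
}

# Symbols that can follow a new identifier in the declaration parts
# considered in the text.
IDENTIFIER_FOLLOWERS = {",", ":", "="}


def classify(word, next_token):
    """Decide whether `word` is a keyword or an identifier.

    `next_token` is the single token of look-ahead the lexer peeks at.
    """
    w = word.lower()  # Pascal is case-insensitive
    if w not in WEAK_KEYWORDS:
        return "identifier"
    # Exception 1: 'operator =' is always taken as the keyword, so that
    # '=' can be overloaded; 'Operator' as a type or constant name is lost.
    if w == "operator" and next_token == "=":
        return "keyword"
    # Exception 2: a '(' directly after 'initialization' is never read as
    # the start of a statement, so the word stays usable as an identifier.
    if w == "initialization" and next_token == "(":
        return "identifier"
    # General rule: a new identifier is followed by ',', ':' or '=';
    # the keywords never are.
    if next_token in IDENTIFIER_FOLLOWERS:
        return "identifier"
    return "keyword"
```

For instance, 'classify("Implementation", ":")' gives "identifier",
while 'classify("implementation", "procedure")' gives "keyword". The
situation after an 'import' part, which needs separate treatment, is
not covered by this sketch.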
You may have noticed that we "forgot" 'import' (in the list of
possibly unfinished sections; not in the list of critical following
keywords where it was alright; it actually plays both roles in this
discussion).
This is because the identifier at the beginning of an import
specification can be followed by 'qualified', 'only', 'in', '(' or
';', the first two of which are non-standard keywords as well and
would therefore conflict with a new identifier after, e.g., 'uses'
and 'operator'.
This means that there's no simple general solution. So let's
consider the problematic keywords after an 'import' part in detail:
'import'. This can't happen since EP allows only one 'import' part
(possibly containing multiple import specifications). So this one
doesn't cause a shift/reduce (S/R) conflict, unlike the following
ones.
'uses'. Combining module-style 'import' with unit-style 'uses' is a
direct mix of different standards. According to the discussion above,
it would lead to the following ambiguity:
import Foo; Uses only (a); { import only a from Uses }
vs.
import Foo; uses Only (a); { import a from Only }
Though 'uses' with an import-list is another "cross-standard"
extension, disallowing it would only reduce the issue from an
ambiguity to a two-token look-ahead conflict and not really help
much, whereas it would devalue the usefulness of 'uses' which
otherwise can always serve as a substitute for 'import', e.g. to
avoid all the conflicts discussed here (because 'uses' is terminated
by the first ';').
'operator'.
import Foo; Operator only (a, b);
(i.e., import only 'a' and 'b' from an interface called 'Operator'),
vs.
import Foo; operator Only (a, b: Integer) c: Integer;
As in the case of 'operator =', we would need 6 tokens of
look-ahead. We have to give up.
'implementation'. This does not happen for module implementations
since their syntax is different ('module Foo implementation;'), but
it does for unit implementations. Combining these with module-style
'import' is therefore "cross-standard" already. In addition, it would
imply an empty interface part (apart from the imports) which is
rather pointless in units (whereas it might be useful in modules,
containing only re-exports, but as noted, module implementations are
unproblematic here).
'constructor' and 'destructor'. In an interface, these actually do
not make sense immediately after 'import' since their purpose is to
implement constructors and destructors of object types that must have
been declared before (not imported). But it could happen in an
implementation.
We forbid all of these keywords immediately after an 'import' part.
This is achieved using parser precedence rules.