You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
222 lines
7.5 KiB
Plaintext
222 lines
7.5 KiB
Plaintext
|
|
=-----------------------------------------------------------------------------=
|
|
* *
|
|
* Paren Stack *
|
|
* *
|
|
=-----------------------------------------------------------------------------=
|
|
|
|
At the heart of this algorithm are two stacks.
|
|
There is the Paren Stack (PS) and the Frame stack.
|
|
|
|
The PS (m_parenStack in the code) keeps track of braces, parens,
|
|
if/else/switch/do/while/etc items -- anything that is nestable.
|
|
|
|
Complex statements go through some of these BS_ stages:
|
|
BS_PAREN1 - paren on if/for/switch/while, etc
|
|
BS_PAREN2 - paren on do{}while()
|
|
BS_BRACE_DO - brace set on do{}
|
|
BS_BRACE2 - brace on if/else/for/switch/while
|
|
BS_ELSE - expecting 'else' after 'if'
|
|
BS_ELSEIF - expecting 'if' after 'else'
|
|
BS_WHILE - expecting 'while' after 'do'
|
|
|
|
The file is processed one token at a time to support #if/#else/#endif
|
|
preprocessors at any point.
|
|
|
|
Take this simple if statement as an example:
|
|
|
|
if ( x )
|
|
{
|
|
x--;
|
|
}
|
|
|
|
The stack would look like so:
|
|
The format is first the token processed and then the PSE stack as it appears
|
|
AFTER the token is processed.
|
|
|
|
'if' [IF - PAREN1]
|
|
'(' [IF - PAREN1] [SPAREN OPEN]
|
|
'x' [IF - PAREN1] [SPAREN OPEN]
|
|
')' [IF - BRACE2] <- note that the stage was changed on SPAREN_CLOSE
|
|
'{' [IF - BRACE2] [BRACE OPEN]
|
|
'x' [IF - BRACE2] [BRACE OPEN]
|
|
'--' [IF - BRACE2] [BRACE OPEN]
|
|
';' [IF - BRACE2] [BRACE OPEN]
|
|
'}' [IF - ELSE]
|
|
<- lack of else kills the ELSE, closes statement
|
|
|
|
Virtual brace example:
|
|
if ( x )
|
|
x--;
|
|
else if (y)
|
|
y--;
|
|
else
|
|
z++;
|
|
|
|
'if' [IF - PAREN1]
|
|
'(' [IF - PAREN1] [SPAREN OPEN]
|
|
'x' [IF - PAREN1] [SPAREN OPEN]
|
|
')' [IF - BRACE2]
|
|
'x' [IF - BRACE2] [VBRACE OPEN] <- VBrace open inserted before because
|
|
the token was not '{'
|
|
'--' [IF - BRACE2] [VBRACE OPEN]
|
|
';' [IF - ELSE] <- Semicolon causes a VBrace close to be
|
|
inserted after the semicolon
|
|
'else' [ELSE - ELSEIF] <- IF changed into ELSE, expect IF or BRACE
|
|
'x' [ELSE - BRACE2] [VBRACE OPEN] <- lack of '{' -> VBrace
|
|
'++' [ELSE - BRACE2] [VBRACE OPEN]
|
|
';' <- VBrace close inserted after semicolon
|
|
ELSE removed after statement close
|
|
|
|
Nested virtual brace example: (EOF represents the end of the file)
|
|
if ( x )
|
|
if (y)
|
|
y--;
|
|
else
|
|
z++;
|
|
EOF
|
|
|
|
'if' [IF - PAREN1]
|
|
'(' [IF - PAREN1] [PAREN OPEN]
|
|
'x' [IF - PAREN1] [PAREN OPEN]
|
|
')' [IF - BRACE2]
|
|
'if' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] <- VBrace on BRACE2, IF opened
|
|
'(' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] [SPAREN OPEN]
|
|
'y' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] [SPAREN OPEN]
|
|
')' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2]
|
|
'y' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2] [VBRACE OPEN]
|
|
'--' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2] [VBRACE OPEN]
|
|
';' [IF - BRACE2] [VBRACE OPEN] [IF - ELSE]
|
|
'else' [IF - BRACE2] [VBRACE OPEN] [ELSE - ELSEIF]
|
|
'z' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] [VBRACE OPEN]
|
|
'++' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] [VBRACE OPEN]
|
|
';' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] - step1
|
|
[IF - BRACE2] [VBRACE OPEN] - step2
|
|
[IF - ELSE] - step3
|
|
EOF
|
|
|
|
-- this last semi is more complicated - first it terminates the VBRACE and then
|
|
the else, which then, since it is the end of a statement, terminates the
|
|
VBRACE. That bumps the IF stage to ELSE.
|
|
The EOF kills that off (since it is not an else)
|
|
|
|
Order of operation:
|
|
1) if TOS=VBRACE && PC=SEMI, insert VBRACE close, PC=>VBRACE close
|
|
2) if PC=VBRACE close or PC=BRACE close, and TOS is complex (if/else/etc)
|
|
then advance complex stage. If statement ends, pop and advance
|
|
|
|
|
|
Stages for each complex statement:
|
|
if
|
|
IF-PAREN1, IF-BRACE2, IF-ELSE
|
|
|
|
if/else
|
|
IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, ELSE-BRACE2
|
|
|
|
if/else if/else
|
|
IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, ELSE-BRACE2
|
|
|
|
for
|
|
FOR-PAREN1, FOR-BRACE2
|
|
|
|
while
|
|
WHILE-PAREN1, WHILE-BRACE2
|
|
|
|
switch
|
|
SWITCH-PAREN1, SWITCH-BRACE2
|
|
|
|
synchronized
|
|
SYNCHRONIZED-PAREN1
|
|
|
|
do/while
|
|
DO-BRACE_DO, DO-WHILE, WHILE-PAREN2
|
|
|
|
|
|
Another less-interesting example:
|
|
|
|
{
|
|
if (x)
|
|
volatile
|
|
{
|
|
y++;
|
|
}
|
|
return y;
|
|
}
|
|
|
|
'{' [BRACE OPEN]
|
|
'if' [BRACE OPEN] [IF - PAREN1]
|
|
'(' [BRACE OPEN] [IF - PAREN1] [PAREN OPEN]
|
|
'x' [BRACE OPEN] [IF - PAREN1] [PAREN OPEN]
|
|
')' [BRACE OPEN] [IF - BRACE2]
|
|
'volatile' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2]
|
|
'{' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
|
|
'y' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
|
|
'++' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
|
|
';' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
|
|
'}' [BRACE OPEN] [IF - ELSE] <- the brace close ends brace-open,
|
|
volatile-brace2 and vbrace-open
|
|
'return' [BRACE OPEN] <- not else
|
|
'y' [BRACE OPEN]
|
|
';' [BRACE OPEN]
|
|
'}' <- empties the stack
|
|
|
|
|
|
=-----------------------------------------------------------------------------=
|
|
* *
|
|
* Parse Frames *
|
|
* *
|
|
=-----------------------------------------------------------------------------=
|
|
|
|
The m_parenStack stack is kept on a frame stack.
|
|
The frame stack is need for languages that support preprocessors (C, C++, C#)
|
|
that can arbitrarily change code flow. It also isolates #define macros so
|
|
that they are indented independently and do not affect the rest of the program.
|
|
|
|
When an #if is hit, a copy of the current frame is push on the frame stack.
|
|
When an #else/#elif is hit, a copy of the current stack is pushed under the
|
|
#if frame and the original (pre-#if) frame is copied to the current frame.
|
|
When #endif is hit, the top frame is popped.
|
|
This has the following effects:
|
|
- a simple #if / #endif does not affect program flow
|
|
- #if / #else /#endif - continues from the #if clause
|
|
|
|
When a #define is entered, the current frame is pushed and cleared.
|
|
When a #define is exited, the frame is popped.
|
|
|
|
|
|
Take this example, which isn't very exciting, as both the #if and #else parts
|
|
end up with the same paren stack. This is the usual case.
|
|
|
|
{
|
|
foo(param1,
|
|
#ifdef DEBUG
|
|
"debug");
|
|
#else
|
|
"release");
|
|
#endif
|
|
}
|
|
|
|
Right before the #ifdef, we have this for the paren stack:
|
|
Top> [BRACE OPEN] [PAREN OPEN]
|
|
|
|
The #ifdef pushes a copy of the current stack, so we have this:
|
|
Top> [BRACE OPEN] [PAREN OPEN]
|
|
[BRACE OPEN] [PAREN OPEN]
|
|
|
|
The close paren after "debug" closes out the PAREN-OPEN on the top of the stack.
|
|
Top> [BRACE OPEN]
|
|
[BRACE OPEN] [PAREN OPEN]
|
|
|
|
The #else swaps the top two frames.
|
|
Top> [BRACE OPEN] [PAREN OPEN]
|
|
[BRACE OPEN]
|
|
|
|
Right after the #else, we hit another close paren after the "release".
|
|
Top> [BRACE OPEN]
|
|
[BRACE OPEN]
|
|
|
|
At the #endif, the top of stack is thrown out, which restores us to the #if path.
|
|
Top> [BRACE OPEN]
|
|
|
|
|