=-----------------------------------------------------------------------------=
* *
* Paren Stack *
* *
=-----------------------------------------------------------------------------=
At the heart of this algorithm are two stacks.
There is the Paren Stack (PS) and the Frame stack.
The PS (m_parenStack in the code) keeps track of braces, parens,
if/else/switch/do/while/etc items -- anything that is nestable.
Complex statements go through some of these BS_ stages:
BS_PAREN1 - paren on if/for/switch/while, etc
BS_PAREN2 - paren on do{}while()
BS_BRACE_DO - brace set on do{}
BS_BRACE2 - brace on if/else/for/switch/while
BS_ELSE - expecting 'else' after 'if'
BS_ELSEIF - expecting 'if' after 'else'
BS_WHILE - expecting 'while' after 'do'
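As a rough illustration of the data involved, the stages and a stack entry could be modelled as below. This is only a sketch: the names Stage, Entry, ParenStack, stageName, top and show are hypothetical and chosen for this example. Only the BS_ stage names and the printed form of the entries come from this document.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the BS_ stages listed above; None marks plain
// entries such as braces and parens that carry no stage.
enum class Stage { None, Paren1, Paren2, BraceDo, Brace2, Else, ElseIf, While };

// One entry on the paren stack: what was opened and, for complex
// statements, which stage it has reached.
struct Entry {
    std::string type;  // e.g. "IF", "BRACE OPEN", "VBRACE OPEN"
    Stage stage;
};

using ParenStack = std::vector<Entry>;

// Stage name as printed in the traces in this document.
const char* stageName(Stage s) {
    switch (s) {
        case Stage::Paren1:  return "PAREN1";
        case Stage::Paren2:  return "PAREN2";
        case Stage::BraceDo: return "BRACE_DO";
        case Stage::Brace2:  return "BRACE2";
        case Stage::Else:    return "ELSE";
        case Stage::ElseIf:  return "ELSEIF";
        case Stage::While:   return "WHILE";
        case Stage::None:    break;
    }
    return "";
}

// Top of stack; asking for it on an empty stack is a caller bug.
const Entry& top(const ParenStack& ps) {
    assert(!ps.empty());
    return ps.back();
}

// Format a stack bottom-first, e.g. "[IF - PAREN1] [SPAREN OPEN]".
std::string show(const ParenStack& ps) {
    std::string out;
    for (const Entry& e : ps) {
        if (!out.empty()) out += ' ';
        out += '[' + e.type;
        if (e.stage != Stage::None) out += std::string(" - ") + stageName(e.stage);
        out += ']';
    }
    return out;
}
```

Pushing an IF entry at PAREN1 followed by a plain SPAREN OPEN entry and calling show() gives the same text as the traces in this document use for that state.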
The file is processed one token at a time so that #if/#else/#endif
preprocessor directives can be handled at any point.
Take this simple if statement as an example:
if ( x )
{
x--;
}
The stack evolves as shown below.
Each line gives the token processed, followed by the paren stack as it appears
AFTER that token is processed.
'if' [IF - PAREN1]
'(' [IF - PAREN1] [SPAREN OPEN]
'x' [IF - PAREN1] [SPAREN OPEN]
')' [IF - BRACE2] <- note that the stage was changed on SPAREN_CLOSE
'{' [IF - BRACE2] [BRACE OPEN]
'x' [IF - BRACE2] [BRACE OPEN]
'--' [IF - BRACE2] [BRACE OPEN]
';' [IF - BRACE2] [BRACE OPEN]
'}' [IF - ELSE]
<- lack of else kills the ELSE, closes statement
Virtual brace example:
if ( x )
x--;
else if (y)
y--;
else
z++;
'if' [IF - PAREN1]
'(' [IF - PAREN1] [SPAREN OPEN]
'x' [IF - PAREN1] [SPAREN OPEN]
')' [IF - BRACE2]
'x' [IF - BRACE2] [VBRACE OPEN] <- VBrace open inserted before because
the token was not '{'
'--' [IF - BRACE2] [VBRACE OPEN]
';' [IF - ELSE] <- Semicolon causes a VBrace close to be
inserted after the semicolon
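'else' [ELSE - ELSEIF] <- IF changed into ELSE, expect IF or BRACE
'if' [IF - PAREN1] <- 'if' after 'else' changes ELSE back into IF
'(' [IF - PAREN1] [SPAREN OPEN]
'y' [IF - PAREN1] [SPAREN OPEN]
')' [IF - BRACE2]
'y' [IF - BRACE2] [VBRACE OPEN]
'--' [IF - BRACE2] [VBRACE OPEN]
';' [IF - ELSE]
'else' [ELSE - ELSEIF]
'z' [ELSE - BRACE2] [VBRACE OPEN] <- lack of '{' -> VBrace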
'++' [ELSE - BRACE2] [VBRACE OPEN]
';' <- VBrace close inserted after semicolon
ELSE removed after statement close
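The virtual brace decision in the trace above can be sketched as a small function. This is a minimal sketch under stated assumptions, not the actual implementation: the Entry type, the string token representation and the function name prepareBody are all hypothetical. It looks only at the top of the stack and at the next token.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Stage { None, Paren1, Brace2, Else, ElseIf };

struct Entry {
    std::string type;  // "IF", "ELSE", "BRACE OPEN", "VBRACE OPEN", ...
    Stage stage;
};

using ParenStack = std::vector<Entry>;

// Handles the start of a statement body for the token 'tok'.
// Returns true when a virtual brace open had to be inserted before it.
bool prepareBody(ParenStack& ps, const std::string& tok) {
    assert(!tok.empty());
    if (ps.empty()) return false;
    Entry& tos = ps.back();

    if (tos.stage == Stage::ElseIf) {
        if (tok == "if") {          // "else if": the ELSE turns back into an IF
            tos.type = "IF";
            tos.stage = Stage::Paren1;
            return false;
        }
        tos.stage = Stage::Brace2;  // plain "else": a body is expected next
    }
    if (tos.stage != Stage::Brace2) return false;

    if (tok == "{") {               // a real brace, no virtual brace needed
        ps.push_back({"BRACE OPEN", Stage::None});
        return false;
    }
    ps.push_back({"VBRACE OPEN", Stage::None});  // body without '{'
    return true;
}
```

With [IF - BRACE2] on top and the token 'x', the function pushes a VBRACE OPEN. With [ELSE - ELSEIF] on top, the token 'if' turns the entry into [IF - PAREN1], while any other non-brace token moves it to [ELSE - BRACE2] and opens a virtual brace.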
Nested virtual brace example: (EOF represents the end of the file)
if ( x )
if (y)
y--;
else
z++;
EOF
'if' [IF - PAREN1]
'(' [IF - PAREN1] [SPAREN OPEN]
'x' [IF - PAREN1] [SPAREN OPEN]
')' [IF - BRACE2]
'if' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] <- VBrace on BRACE2, IF opened
'(' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] [SPAREN OPEN]
'y' [IF - BRACE2] [VBRACE OPEN] [IF - PAREN1] [SPAREN OPEN]
')' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2]
'y' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2] [VBRACE OPEN]
'--' [IF - BRACE2] [VBRACE OPEN] [IF - BRACE2] [VBRACE OPEN]
';' [IF - BRACE2] [VBRACE OPEN] [IF - ELSE]
'else' [IF - BRACE2] [VBRACE OPEN] [ELSE - ELSEIF]
'z' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] [VBRACE OPEN]
'++' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] [VBRACE OPEN]
';' [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] - step1
[IF - BRACE2] [VBRACE OPEN] - step2
[IF - ELSE] - step3
EOF
-- This last semicolon is more complicated. It first terminates the inner
VBRACE (step1), which ends the ELSE statement (step2). Because that is the end
of a statement, the outer VBRACE is terminated as well, which bumps the outer
IF stage to ELSE (step3).
The EOF kills that off (since it is not an else).
Order of operations (TOS = top of the paren stack, PC = the current token):
1) if TOS=VBRACE && PC=SEMI, insert VBRACE close, PC=>VBRACE close
2) if PC=VBRACE close or PC=BRACE close, and TOS is complex (if/else/etc)
then advance the complex stage. If the statement ends, pop it and advance again.
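The two rules can be sketched as follows. This is a simplified sketch, not the actual implementation: the Entry type and the function names are hypothetical, only the IF and ELSE handling shown in the traces is modelled, and any other complex statement at BRACE2 is assumed to end when its body closes.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Stage { None, Paren1, Brace2, Else, ElseIf };

struct Entry {
    std::string type;  // "IF", "ELSE", "BRACE OPEN", "VBRACE OPEN", ...
    Stage stage;
};

using ParenStack = std::vector<Entry>;

// Rule 2: a brace or virtual brace was just closed. Advance the complex
// statement on top of the stack; if it ends, pop it and keep going.
void advanceComplex(ParenStack& ps) {
    while (!ps.empty() && ps.back().stage == Stage::Brace2) {
        if (ps.back().type == "IF") {
            ps.back().stage = Stage::Else;  // an 'else' may still follow
            return;
        }
        ps.pop_back();                      // statement ended (e.g. ELSE)
        // The ended statement may have been the body of a virtual brace.
        if (ps.empty() || ps.back().type != "VBRACE OPEN") return;
        ps.pop_back();                      // implied VBRACE close
    }
}

// Rule 1: a semicolon directly inside a virtual brace closes that brace.
void onSemicolon(ParenStack& ps) {
    if (!ps.empty() && ps.back().type == "VBRACE OPEN") {
        ps.pop_back();
        advanceComplex(ps);
    }
}

// A real '}' closes the matching BRACE OPEN, then rule 2 applies.
void onBraceClose(ParenStack& ps) {
    assert(!ps.empty() && ps.back().type == "BRACE OPEN");
    ps.pop_back();
    advanceComplex(ps);
}
```

Starting from [IF - BRACE2] [VBRACE OPEN] [ELSE - BRACE2] [VBRACE OPEN], onSemicolon() leaves [IF - ELSE], which matches the three steps in the nested example. A following token that is not 'else' would then pop that entry; that part is left out of the sketch.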
Stages for each complex statement:
if
IF-PAREN1, IF-BRACE2, IF-ELSE
if/else
IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, ELSE-BRACE2
if/else if/else
IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, IF-PAREN1, IF-BRACE2, IF-ELSE, ELSE-ELSEIF, ELSE-BRACE2
for
FOR-PAREN1, FOR-BRACE2
while
WHILE-PAREN1, WHILE-BRACE2
switch
SWITCH-PAREN1, SWITCH-BRACE2
synchronized
SYNCHRONIZED-PAREN1
do/while
DO-BRACE_DO, DO-WHILE, WHILE-PAREN2
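The do/while sequence is the only one listed that starts at BRACE_DO rather than PAREN1, and no trace is given for it. A minimal sketch of that sequence as a small state machine follows. The event strings, the Complex type, the function names and the Done stage are hypothetical, and the sketch assumes the statement is finished as soon as the paren of the trailing while() closes.

```cpp
#include <cassert>
#include <string>

enum class Stage { BraceDo, While, Paren2, Done };  // Done is hypothetical

struct Complex {
    std::string type;  // "DO" or "WHILE"
    Stage stage;
};

// A 'do' keyword opens the statement at the BRACE_DO stage.
Complex openDo() { return {"DO", Stage::BraceDo}; }

// Label in the style of the stage list above, e.g. "DO-BRACE_DO".
std::string label(const Complex& c) {
    assert(c.stage != Stage::Done);  // a finished statement has been popped
    switch (c.stage) {
        case Stage::BraceDo: return c.type + "-BRACE_DO";
        case Stage::While:   return c.type + "-WHILE";
        case Stage::Paren2:  return c.type + "-PAREN2";
        case Stage::Done:    break;
    }
    return "";
}

// Feed one event. Returns false if the event is not what the current
// stage is waiting for; the statement is left unchanged in that case.
bool advance(Complex& c, const std::string& event) {
    switch (c.stage) {
        case Stage::BraceDo:            // body of do{} has closed
            if (event != "brace close") return false;
            c.stage = Stage::While;
            return true;
        case Stage::While:              // expecting 'while' after 'do'
            if (event != "while") return false;
            c.type = "WHILE";
            c.stage = Stage::Paren2;
            return true;
        case Stage::Paren2:             // paren on do{}while()
            if (event != "sparen close") return false;
            c.stage = Stage::Done;
            return true;
        case Stage::Done:
            break;
    }
    return false;
}
```

Feeding "brace close", "while" and "sparen close" in that order walks through DO-BRACE_DO, DO-WHILE and WHILE-PAREN2, the same order as in the list above.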
Another less-interesting example:
{
if (x)
volatile
{
y++;
}
return y;
}
'{' [BRACE OPEN]
'if' [BRACE OPEN] [IF - PAREN1]
'(' [BRACE OPEN] [IF - PAREN1] [SPAREN OPEN]
'x' [BRACE OPEN] [IF - PAREN1] [SPAREN OPEN]
')' [BRACE OPEN] [IF - BRACE2]
'volatile' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2]
'{' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
'y' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
'++' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
';' [BRACE OPEN] [IF - BRACE2] [VBRACE OPEN] [VOLATILE - BRACE2] [BRACE OPEN]
'}' [BRACE OPEN] [IF - ELSE] <- the brace close ends brace-open,
volatile-brace2 and vbrace-open
'return' [BRACE OPEN] <- not else
'y' [BRACE OPEN]
';' [BRACE OPEN]
'}' <- empties the stack
=-----------------------------------------------------------------------------=
* *
* Parse Frames *
* *
=-----------------------------------------------------------------------------=
The m_parenStack stack is kept on a frame stack.
The frame stack is needed for languages that support preprocessors (C, C++, C#)
that can arbitrarily change code flow. It also isolates #define macros so
that they are indented independently and do not affect the rest of the program.
When an #if is hit, a copy of the current frame is pushed on the frame stack.
When an #else/#elif is hit, a copy of the current stack is pushed under the
#if frame and the original (pre-#if) frame is copied to the current frame.
When #endif is hit, the top frame is popped.
This has the following effects:
- a simple #if / #endif does not affect program flow
- #if / #else / #endif continues from the state left by the #if clause
When a #define is entered, the current frame is pushed and cleared.
When a #define is exited, the frame is popped.
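The frame rules above can be sketched with a vector of paren stacks whose last element is the current frame. This is a simplified sketch, not the actual implementation: the FrameStack class and its method names are hypothetical, a frame is reduced to a list of strings, and only a single #else per #if is modelled (repeated #elif branches would need extra bookkeeping that is not shown).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Frame = std::vector<std::string>;  // one paren stack

class FrameStack {
public:
    FrameStack() : frames_(1) {}         // start with one empty frame

    Frame& current() { return frames_.back(); }
    std::size_t depth() const { return frames_.size(); }

    // #if: keep working on a copy; the pre-#if frame stays underneath.
    void onIf() {
        Frame copy = frames_.back();
        frames_.push_back(std::move(copy));
    }

    // #else: park the #if result underneath and resume from the
    // pre-#if frame (a swap of the top two frames).
    void onElse() {
        assert(frames_.size() >= 2);
        std::swap(frames_[frames_.size() - 1], frames_[frames_.size() - 2]);
    }

    // #endif: drop the top frame.
    void onEndif() {
        assert(frames_.size() >= 2);
        frames_.pop_back();
    }

    // #define: work in a fresh, empty frame until the macro ends.
    void onDefineEnter() { frames_.push_back(Frame()); }
    void onDefineExit() {
        assert(frames_.size() >= 2);
        frames_.pop_back();
    }

private:
    std::vector<Frame> frames_;
};
```

With this model, an #if / #endif pair discards whatever happened inside the #if, and an #if / #else / #endif sequence discards the #else result and keeps the frame left by the #if clause, which are the two effects listed above.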
Take this example, which isn't very exciting, as both the #if and #else parts
end up with the same paren stack. This is the usual case.
{
foo(param1,
#ifdef DEBUG
"debug");
#else
"release");
#endif
}
Right before the #ifdef, we have this for the paren stack:
Top> [BRACE OPEN] [PAREN OPEN]
The #ifdef pushes a copy of the current stack, so we have this:
Top> [BRACE OPEN] [PAREN OPEN]
[BRACE OPEN] [PAREN OPEN]
The close paren after "debug" closes out the PAREN OPEN in the top frame.
Top> [BRACE OPEN]
[BRACE OPEN] [PAREN OPEN]
The #else swaps the top two frames.
Top> [BRACE OPEN] [PAREN OPEN]
[BRACE OPEN]
Right after the #else, we hit another close paren after the "release".
Top> [BRACE OPEN]
[BRACE OPEN]
At the #endif, the top of stack is thrown out, which restores us to the #if path.
Top> [BRACE OPEN]