從零寫一個編譯器(十三):程式碼生成之遍歷AST
阿新 • • 發佈:2019-08-22
專案的完整程式碼在 C2j-Compiler
前言
在上一篇完成對JVM指令的生成,下面就可以真正進入程式碼生成部分了。通常現代編譯器都是先把生成IR,再經過程式碼優化等等,最後才編譯成目標平臺程式碼。但是時間水平有限,我們沒有IR也沒有程式碼優化,就直接利用AST生成Java位元組碼
入口
進行程式碼生成的入口在CodeGen,和之前直譯器一樣:先獲取main函式的頭節點,從這個節點開始,先進入函式定義,再進入程式碼塊
函式定義節點
在進入函式定義節點的時候,就要生成一個函式定義對應的Java位元組碼,即一個靜態方法(因為我們對整個C語言檔案生成為一個類,main方法為public static main,其它的則是對應的靜態方法,結構體則是另外的類)
- 對於函式定義先從節點中拿到對應的函式命和引數
- emitArgs是用來處理引數的,根據引數生成相應的Java位元組碼
- 如果這個函式是main的話已經是交由前面處理了,邏輯差不多(具體在start.Start中)
case SyntaxProductionInit.NewName_LP_RP_TO_FunctDecl: root.reverseChildren(); AstNode n = root.getChildren().get(0); String name = (String) n.getAttribute(NodeKey.TEXT); symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL); generator.setCurrentFuncName(name); if (name != null && !name.equals("main")) { String declaration = name + emitArgs(symbol); generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC, declaration); generator.setNameAndDeclaration(name, declaration); } copyChild(root, root.getChildren().get(0)); break; case SyntaxProductionInit.NewName_LP_VarList_RP_TO_FunctDecl: n = root.getChildren().get(0); name = (String) n.getAttribute(NodeKey.TEXT); symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL); generator.setCurrentFuncName(name); if (name != null && !name.equals("main")) { String declaration = name + emitArgs(symbol); generator.emitDirective(Directive.METHOD_PUBBLIC_STATIC, declaration); generator.setNameAndDeclaration(name, declaration); } Symbol args = symbol.getArgList(); if (args == null || argsList == null || argsList.isEmpty()) { System.err.println("generate function with arg list but arg list is null"); System.exit(1); } break;
建立結構體和陣列
陣列
建立結構體和陣列的節點在DefGenerate裡,可以看到在這裡只處理了陣列和普通變數,有關結構體的處理是在對結構體第一次使用的時候。順便提一下程式碼生成對於賦初值操作是沒有進行處理的。
- 如果是個陣列,酒直接呼叫ProgramGenerator直接生成建立陣列的指令
- 如果是個普通變數,就直接找到它並且賦值為0(這裡變數在佇列裡的位置是根據符號表來計算的,具體可以看上一篇的getLocalVariableIndex方法)
public class DefGenerate extends BaseGenerate { @Override public Object generate(AstNode root) { int production = (int) root.getAttribute(NodeKey.PRODUCTION); ProgramGenerator generator = ProgramGenerator.getInstance(); Symbol symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL); switch (production) { case SyntaxProductionInit.Specifiers_DeclList_Semi_TO_Def: Declarator declarator = symbol.getDeclarator(Declarator.ARRAY); if (declarator != null) { if (symbol.getSpecifierByType(Specifier.STRUCTURE) == null) { generator.createArray(symbol); } } else { int i = generator.getLocalVariableIndex(symbol); generator.emit(Instruction.SIPUSH, "" + 0); generator.emit(Instruction.ISTORE, "" + i); } break; default: break; } return root; } }
結構體
處理結構體定義的程式碼在UnaryNodeGenerate,也就是隻有在使用到結構體定義時才會進行定義
- 先拿到當前UNARY的符號,如果instanceof ArrayValueSetter就說明是一個結構體陣列,就進入getStructSymbolFromStructArray方法建立一個結構體陣列,並返回當前下標的結構體物件
- 設定當前結構體的作用域範圍
- 對結構體作為類進行定義
- 然後對讀取結構體的域
- 其實可以忽略指標部分,因為程式碼生成並沒有對指標進行模擬
case SyntaxProductionInit.Unary_StructOP_Name_TO_Unary:
child = root.getChildren().get(0);
String fieldName = (String) root.getAttribute(NodeKey.TEXT);
Object object = child.getAttribute(NodeKey.SYMBOL);
boolean isStructArray = false;
if (object instanceof ArrayValueSetter) {
symbol = getStructSymbolFromStructArray(object);
symbol.addValueSetter(object);
isStructArray = true;
} else {
symbol = (Symbol) child.getAttribute(NodeKey.SYMBOL);
}
if (isStructArray) {
ArrayValueSetter vs = (ArrayValueSetter) object;
Symbol structArray = vs.getSymbol();
structArray.addScope(ProgramGenerator.getInstance().getCurrentFuncName());
} else {
symbol.addScope(ProgramGenerator.getInstance().getCurrentFuncName());
}
ProgramGenerator.getInstance().putStructToClassDeclaration(symbol);
if (isSymbolStructPointer(symbol)) {
copyBetweenStructAndMem(symbol, false);
}
Symbol args = symbol.getArgList();
while (args != null) {
if (args.getName().equals(fieldName)) {
args.setStructParent(symbol);
break;
}
args = args.getNextSymbol();
}
if (args == null) {
System.err.println("access a filed not in struct object!");
System.exit(1);
}
if (args.getValue() != null) {
ProgramGenerator.getInstance().readValueFromStructMember(symbol, args);
}
root.setAttribute(NodeKey.SYMBOL, args);
root.setAttribute(NodeKey.VALUE, args.getValue());
if (isSymbolStructPointer(symbol)) {
checkValidPointer(symbol);
structObjSymbol = symbol;
monitorSymbol = args;
GenerateBrocasterImpl.getInstance().registerReceiverForAfterExe(this);
} else {
structObjSymbol = null;
}
break;
一元操作節點
這個節點和在直譯器的有很多相同,除了有對結構體的操作,其它的也是有非常重要的作用
- 像數字、字串或者是變數和之前的操作都是把資訊傳遞到父節點,交由父節點處理
case SyntaxProductionInit.Number_TO_Unary:
text = (String) root.getAttribute(NodeKey.TEXT);
boolean isFloat = text.indexOf('.') != -1;
if (isFloat) {
value = Float.valueOf(text);
root.setAttribute(NodeKey.VALUE, value);
} else {
value = Integer.valueOf(text);
root.setAttribute(NodeKey.VALUE, value);
}
break;
case SyntaxProductionInit.Name_TO_Unary:
symbol = (Symbol) root.getAttribute(NodeKey.SYMBOL);
if (symbol != null) {
root.setAttribute(NodeKey.VALUE, symbol.getValue());
root.setAttribute(NodeKey.TEXT, symbol.getName());
}
break;
case SyntaxProductionInit.String_TO_Unary:
text = (String) root.getAttribute(NodeKey.TEXT);
root.setAttribute(NodeKey.VALUE, text);
break;
case SyntaxProductionInit.Unary_LB_Expr_RB_TO_Unary:
child = root.getChildren().get(0);
symbol = (Symbol) child.getAttribute(NodeKey.SYMBOL);
child = root.getChildren().get(1);
int index = 0;
if (child.getAttribute(NodeKey.VALUE) != null) {
index = (Integer) child.getAttribute(NodeKey.VALUE);
}
Object idxObj = child.getAttribute(NodeKey.SYMBOL);
try {
Declarator declarator = symbol.getDeclarator(Declarator.ARRAY);
if (declarator != null) {
Object val = declarator.getElement((int) index);
root.setAttribute(NodeKey.VALUE, val);
ArrayValueSetter setter;
if (idxObj == null) {
setter = new ArrayValueSetter(symbol, index);
} else {
setter = new ArrayValueSetter(symbol, idxObj);
}
root.setAttribute(NodeKey.SYMBOL, setter);
root.setAttribute(NodeKey.TEXT, symbol.getName());
}
Declarator pointer = symbol.getDeclarator(Declarator.POINTER);
if (pointer != null) {
setPointerValue(root, symbol, index);
PointerValueSetter pv = new PointerValueSetter(symbol, index);
root.setAttribute(NodeKey.SYMBOL, pv);
root.setAttribute(NodeKey.TEXT, symbol.getName());
}
} catch (Exception e) {
e.printStackTrace();
System.exit(1);
}
break;
賦值操作
- 如果當前是一個數組,先拿到它的符號和下標
- 如果不是結構體陣列,那麼拿到下標直接用readArrayElement生成讀取陣列元素的指令
- 如果是一個符號則用getLocalVariableIndex讀取這個符號的值
- 如果是一個常數,則直接生成IPUSH指令
- 最後進行賦值操作,如果不是對結構體的域進行賦值就直接用getLocalVariableIndex拿到佇列位置然後生成ISTORE
- 如果是對結構體陣列的元素的域的賦值,就呼叫assignValueToStructMemberFromArray生成程式碼,如果只是結構體就直接呼叫assignValueToStructMember生成程式碼
ProgramGenerator generator = ProgramGenerator.getInstance();
if (BaseGenerate.resultOnStack) {
this.value = obj;
BaseGenerate.resultOnStack = false;
} else if (obj instanceof ArrayValueSetter) {
ArrayValueSetter setter = (ArrayValueSetter) obj;
Symbol symbol = setter.getSymbol();
Object index = setter.getIndex();
if (symbol.getSpecifierByType(Specifier.STRUCTURE) == null) {
if (index instanceof Symbol) {
ProgramGenerator.getInstance().readArrayElement(symbol, index);
if (((Symbol) index).getValue() != null) {
int i = (int) ((Symbol) index).getValue();
try {
this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i);
} catch (Exception e) {
e.printStackTrace();
}
}
} else {
int i = (int) index;
try {
this.value = symbol.getDeclarator(Declarator.ARRAY).getElement(i);
} catch (Exception e) {
e.printStackTrace();
}
ProgramGenerator.getInstance().readArrayElement(symbol, index);
}
}
} else if (obj instanceof Symbol) {
Symbol symbol = (Symbol) obj;
this.value = symbol.value;
int i = generator.getLocalVariableIndex(symbol);
generator.emit(Instruction.ILOAD, "" + i);
} else if (obj instanceof Integer) {
Integer val = (Integer) obj;
generator.emit(Instruction.SIPUSH, "" + val);
this.value = obj;
}
if (!this.isStructMember()) {
int idx = generator.getLocalVariableIndex(this);
if (!generator.isPassingArguments()) {
generator.emit(Instruction.ISTORE, "" + idx);
}
} else {
if (this.getStructSymbol().getValueSetter() != null) {
generator.assignValueToStructMemberFromArray(this.getStructSymbol().getValueSetter(), this, this.value);
} else {
generator.assignValueToStructMember(this.getStructSymbol(), this, this.value);
}
}
最後
完成這部分後,對下面的程式碼
void quicksort(int A[10], int p, int r) {
int x;
int i;
i = p - 1;
int j;
int t;
int v;
v = r - 1;
if (p < r) {
x = A[r];
for (j = p; j <= v; j++) {
if (A[j] <= x) {
i++;
t = A[i];
A[i] = A[j];
A[j] = t;
}
}
v = i + 1;
t = A[v];
A[v] = A[r];
A[r] = t;
t = v - 1;
quicksort(A, p, t);
t = v + 1;
quicksort(A, t, r);
}
}
void main () {
int a[10];
int i;
int t;
printf("before quick sort:");
for(i = 0; i < 10; i++) {
t = (10 - i);
a[i] = t;
printf("value of a[%d] is %d", i, a[i]);
}
quicksort(a, 0, 9);
printf("after quick sort:");
for (i = 0; i < 10; i++) {
printf("value of a[%d] is %d", i, a[i]);
}
}
則會生成下面的Java位元組碼
.class public C2Bytecode
.super java/lang/Object
.method public static main([Ljava/lang/String;)V
sipush 10
newarray int
astore 0
sipush 0
istore 1
sipush 0
istore 2
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "before quick sort:"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
sipush 0
istore 1
loop0:
iload 1
sipush 10
if_icmpge branch0
sipush 10
iload 1
isub
istore 2
aload 0
iload 1
iload 2
iastore
aload 0
iload 1
iaload
istore 3
iload 1
istore 4
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "value of a["
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 4
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "] is "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 3
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
iload 1
sipush 1
iadd
istore 1
goto loop0
branch0:
aload 0
sipush 0
sipush 9
invokestatic C2Bytecode/quicksort([III)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "after quick sort:"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
sipush 0
istore 1
loop2:
iload 1
sipush 10
if_icmpge branch4
aload 0
iload 1
iaload
istore 3
iload 1
istore 4
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "value of a["
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 4
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "] is "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 3
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
iload 1
sipush 1
iadd
istore 1
goto loop2
branch4:
return
.end method
.method public static quicksort([III)V
sipush 2
newarray int
astore 6
sipush 0
istore 5
sipush 1
istore 5
aload 6
iload 5
sipush 1
iastore
aload 6
sipush 1
iaload
istore 10
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "before quick sort: "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 10
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
sipush 0
istore 9
sipush 0
istore 3
iload 1
sipush 1
isub
istore 3
sipush 0
istore 4
sipush 0
istore 7
sipush 0
istore 8
iload 2
sipush 1
isub
istore 8
iload 1
iload 2
if_icmpge branch1
aload 0
iload 2
iaload
istore 9
iload 1
istore 4
loop1:
iload 4
iload 8
if_icmpgt ibranch1
aload 0
iload 4
iaload
iload 9
if_icmpgt ibranch2
iload 3
sipush 1
iadd
istore 3
aload 0
iload 3
iaload
istore 7
aload 0
iload 3
aload 0
iload 4
iaload
iastore
aload 0
iload 4
iload 7
iastore
ibranch2:
iload 4
sipush 1
iadd
istore 4
goto loop1
ibranch1:
iload 3
sipush 1
iadd
istore 8
aload 0
iload 8
iaload
istore 7
aload 0
iload 8
aload 0
iload 2
iaload
iastore
aload 0
iload 2
iload 7
iastore
iload 8
sipush 1
isub
istore 7
aload 0
iload 1
iload 7
invokestatic C2Bytecode/quicksort([III)V
iload 8
sipush 1
iadd
istore 7
aload 0
iload 7
iload 2
invokestatic C2Bytecode/quicksort([III)V
branch1:
return
.end method
.end class
小結
這篇的程式碼生成和之前直譯器的思路很相似,都是根據AST和對應的產生式來執行或者生成程式碼。
其實主要的思路是很清晰的,只是其中有太多細節容易讓人太過糾結。這個系列算作是我自己的學習筆記,到這也有十三篇了,下一篇可能寫寫總結就正式結束了。
歡迎St