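LLVM Study Notes (2)
2. LLVM Backend Description
2.1. Type Description
To better describe the value types (sizes) a register can hold and the types (sizes) of operands, TableGen provides a series of type definitions in ValueTypes.td, all of which derive from ValueType: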
class ValueType<int size, int value> {
  string Namespace = "MVT";
  int Size = size;
  int Value = value;
}
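Size is the size of the type in bits, and Value is the type's identifying number (Value must agree with the number given by the SimpleValueType enumeration in class MVT in MachineValueType.h). TableGen uses the following types, all derived from the ValueType base class (these are also the types LLVM IR uses):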
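OtherVT: "other" value
i1: one-bit boolean value
i8: 8-bit integer value
i16: 16-bit integer value
i32: 32-bit integer value
i64: 64-bit integer value
i128: 128-bit integer value
f16: 16-bit floating-point value
f32: 32-bit floating-point value
f64: 64-bit floating-point value
f80: 80-bit floating-point value
f128: 128-bit floating-point value
ppcf128: PPC 128-bit floating-point value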
v2i1: 2 ✕ i1 vector
v4i1: 4 ✕ i1 vector
v8i1: 8 ✕ i1 vector
v16i1: 16 ✕ i1 vector
v32i1: 32 ✕ i1 vector
v64i1: 64 ✕ i1 vector
v1i8: 1 ✕ i8 vector
v2i8: 2 ✕ i8 vector
v4i8: 4 ✕ i8 vector
v8i8: 8 ✕ i8 vector
v16i8: 16 ✕ i8 vector
v32i8: 32 ✕ i8 vector
v64i8: 64 ✕ i8 vector
v1i16: 1 ✕ i16 vector
v2i16: 2 ✕ i16 vector
v4i16: 4 ✕ i16 vector
v8i16: 8 ✕ i16 vector
v16i16: 16 ✕ i16 vector
v32i16: 32 ✕ i16 vector
v1i32: 1 ✕ i32 vector
v2i32: 2 ✕ i32 vector
v4i32: 4 ✕ i32 vector
v8i32: 8 ✕ i32 vector
v16i32: 16 ✕ i32 vector
v1i64: 1 ✕ i64 vector
v2i64: 2 ✕ i64 vector
v4i64: 4 ✕ i64 vector
v8i64: 8 ✕ i64 vector
v16i64: 16 ✕ i64 vector
v1i128: 1 ✕ i128 vector
v2f16: 2 ✕ f16 vector
v4f16: 4 ✕ f16 vector
v8f16: 8 ✕ f16 vector
v1f32: 1 ✕ f32 vector
v2f32: 2 ✕ f32 vector
v4f32: 4 ✕ f32 vector
v8f32: 8 ✕ f32 vector
v16f32: 16 ✕ f32 vector
v1f64: 1 ✕ f64 vector
v2f64: 2 ✕ f64 vector
v4f64: 4 ✕ f64 vector
v8f64: 8 ✕ f64 vector
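x86mmx: X86 MMX value
FlagVT: glue used by the pre-RA scheduler
isVoid: produces no value
untyped: produces an untyped value
iPTRAny: maps the current pointer size to any address space
vAny: vector of any size
fAny: floating-point value of any format
iAny: integer value of any bit width
iPTR: the current pointer size
Any: value of any size and any type
MetadataVT: metadata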
The vXiY and vXfY types are vector types, such as those supported by the SSE and AVX instruction sets of X86.
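To make the relationship between a type's name, its Size and its Value concrete, here is a minimal sketch of how such entries could be written. The Value numbers are placeholders chosen for illustration; the real numbers come from SimpleValueType in MachineValueType.h and differ between LLVM versions.

```tablegen
// Illustrative only: the Value numbers below are placeholders, not the real
// SimpleValueType enumerators from MachineValueType.h.
class ValueType<int size, int value> {
  string Namespace = "MVT";
  int Size = size;    // size of the type in bits
  int Value = value;  // identifying number of the type
}

def i32   : ValueType<32, 1>;   // 32-bit integer
def f64   : ValueType<64, 2>;   // 64-bit floating point
def v4i32 : ValueType<128, 3>;  // 4 x i32 vector: 4 * 32 = 128 bits
```

For a vector type vXiY or vXfY, Size is simply X times Y, which is why v4i32 is given 128 bits here.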
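2.2. Instruction Description
Target.td (the file that describes the target-independent interface) contains a definition named Instruction. It works more like a container that gathers every aspect of an instruction's description in one place. OutOperandList and InOperandList are the output and input operands respectively; AsmString is the assembly code in string form; Pattern specifies which DAG fragment in the SelectionDAG matches this instruction; Itinerary can describe the steps the instruction goes through while executing in the processor; and SchedRW describes the instruction's use of CPU resources.
Note that describing an instruction does not require every field.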
//===----------------------------------------------------------------------===//
// Instruction set description - These classes correspond to the C++ classes in
// the Target/TargetInstrInfo.h file.
//
class Instruction {
  string Namespace = "";

  dag OutOperandList;       // An dag containing the MI def operand list.
  dag InOperandList;        // An dag containing the MI use operand list.
  string AsmString = "";    // The .s format to print the instruction with.

  // Pattern - Set to the DAG pattern for this instruction, if we know of one,
  // otherwise, uninitialized.
  list<dag> Pattern;

  // The follow state will eventually be inferred automatically from the
  // instruction pattern.

  list<Register> Uses = []; // Default to using no non-operand registers
  list<Register> Defs = []; // Default to modifying no non-operand registers

  // Predicates - List of predicates which will be turned into isel matching
  // code.
  list<Predicate> Predicates = [];

  // Size - Size of encoded instruction, or zero if the size cannot be determined
  // from the opcode.
  int Size = 0;

  // DecoderNamespace - The "namespace" in which this instruction exists, on
  // targets like ARM which multiple ISA namespaces exist.
  string DecoderNamespace = "";

  // Code size, for instruction selection.
  // FIXME: What does this actually mean?
  int CodeSize = 0;

  // Added complexity passed onto matching pattern.
  int AddedComplexity = 0;

  // These bits capture information about the high-level semantics of the
  // instruction.
  bit isReturn     = 0;     // Is this instruction a return instruction?
  bit isBranch     = 0;     // Is this instruction a branch instruction?
  bit isIndirectBranch = 0; // Is this instruction an indirect branch?
  bit isCompare    = 0;     // Is this instruction a comparison instruction?
  bit isMoveImm    = 0;     // Is this instruction a move immediate instruction?
  bit isBitcast    = 0;     // Is this instruction a bitcast instruction?
  bit isSelect     = 0;     // Is this instruction a select instruction?
  bit isBarrier    = 0;     // Can control flow fall through this instruction?
  bit isCall       = 0;     // Is this instruction a call instruction?
  bit canFoldAsLoad = 0;    // Can this be folded as a simple memory operand?
  bit mayLoad      = ?;     // Is it possible for this inst to read memory?
  bit mayStore     = ?;     // Is it possible for this inst to write memory?
  bit isConvertibleToThreeAddress = 0;  // Can this 2-addr instruction promote?
  bit isCommutable = 0;     // Is this 3 operand instruction commutable?
  bit isTerminator = 0;     // Is this part of the terminator for a basic block?
  bit isReMaterializable = 0; // Is this instruction re-materializable?
  bit isPredicable = 0;     // Is this instruction predicable?
  bit hasDelaySlot = 0;     // Does this instruction have an delay slot?
  bit usesCustomInserter = 0; // Pseudo instr needing special help.
  bit hasPostISelHook = 0;  // To be *adjusted* after isel by target hook.
  bit hasCtrlDep   = 0;     // Does this instruction r/w ctrl-flow chains?
  bit isNotDuplicable = 0;  // Is it unsafe to duplicate this instruction?
  bit isConvergent = 0;     // Is this instruction convergent?
  bit isAsCheapAsAMove = 0; // As cheap (or cheaper) than a move instruction.
  bit hasExtraSrcRegAllocReq = 0; // Sources have special regalloc requirement?
  bit hasExtraDefRegAllocReq = 0; // Defs have special regalloc requirement?
  bit isRegSequence = 0;    // Is this instruction a kind of reg sequence?
                            // If so, make sure to override
                            // TargetInstrInfo::getRegSequenceLikeInputs.
  bit isPseudo     = 0;     // Is this instruction a pseudo-instruction?
                            // If so, won't have encoding information for
                            // the [MC]CodeEmitter stuff.
  bit isExtractSubreg = 0;  // Is this instruction a kind of extract subreg?
                            // If so, make sure to override
                            // TargetInstrInfo::getExtractSubregLikeInputs.
  bit isInsertSubreg = 0;   // Is this instruction a kind of insert subreg?
                            // If so, make sure to override
                            // TargetInstrInfo::getInsertSubregLikeInputs.

  // Side effect flags - When set, the flags have these meanings:
  //
  //  hasSideEffects - The instruction has side effects that are not
  //    captured by any operands of the instruction or other flags.
  //
  bit hasSideEffects = ?;

  // Is this instruction a "real" instruction (with a distinct machine
  // encoding), or is it a pseudo instruction used for codegen modeling
  // purposes.
  // FIXME: For now this is distinct from isPseudo, above, as code-gen-only
  // instructions can (and often do) still have encoding information
  // associated with them. Once we've migrated all of them over to true
  // pseudo-instructions that are lowered to real instructions prior to
  // the printer/emitter, we can remove this attribute and just use isPseudo.
  //
  // The intended use is:
  // isPseudo: Does not have encoding information and should be expanded,
  //   at the latest, during lowering to MCInst.
  //
  // isCodeGenOnly: Does have encoding information and can go through to the
  //   CodeEmitter unchanged, but duplicates a canonical instruction
  //   definition's encoding and should be ignored when constructing the
  //   assembler match tables.
  bit isCodeGenOnly = 0;

  // Is this instruction a pseudo instruction for use by the assembler parser.
  bit isAsmParserOnly = 0;

  InstrItinClass Itinerary = NoItinerary; // Execution steps used for scheduling.

  // Scheduling information from TargetSchedule.td.
  list<SchedReadWrite> SchedRW;

  string Constraints = "";  // OperandConstraint, e.g. $src = $dst.

  /// DisableEncoding - List of operand names (e.g. "$op1,$op2") that should not
  /// be encoded into the output machineinstr.
  string DisableEncoding = "";

  string PostEncoderMethod = "";
  string DecoderMethod = "";

  /// Target-specific flags. This becomes the TSFlags field in TargetInstrDesc.
  bits<64> TSFlags = 0;

  ///@name Assembler Parser Support
  ///@{

  string AsmMatchConverter = "";

  /// TwoOperandAliasConstraint - Enable TableGen to auto-generate a
  /// two-operand matcher inst-alias for a three operand instruction.
  /// For example, the arm instruction "add r3, r3, r5" can be written
  /// as "add r3, r5". The constraint is of the same form as a tied-operand
  /// constraint. For example, "$Rn = $Rd".
  string TwoOperandAliasConstraint = "";

  ///@}

  /// UseNamedOperandTable - If set, the operand indices of this instruction
  /// can be queried via the getNamedOperandIdx() function which is generated
  /// by TableGen.
  bit UseNamedOperandTable = 0;
}
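As an illustration of how few of these fields a description needs, the following is a minimal sketch of one instruction for a made-up target. The names Toy, R0, R1, GPR and ADDrr are hypothetical and do not exist in LLVM. The sketch assumes the LLVM include directory is passed to llvm-tblgen with -I so that Target.td can be found, and that Target.td pulls in the SelectionDAG node definitions (such as add) as it does in the LLVM tree.

```tablegen
include "llvm/Target/Target.td"

// Hypothetical registers and register class for a made-up "Toy" target.
def R0 : Register<"r0"> { let Namespace = "Toy"; }
def R1 : Register<"r1"> { let Namespace = "Toy"; }
def GPR : RegisterClass<"Toy", [i32], 32, (add R0, R1)>;

// Hypothetical register-register add. Only the fields that matter for this
// instruction are set; every other field keeps its default from Instruction.
def ADDrr : Instruction {
  let Namespace = "Toy";
  let OutOperandList = (outs GPR:$dst);
  let InOperandList = (ins GPR:$a, GPR:$b);
  let AsmString = "add $dst, $a, $b";
  let Pattern = [(set GPR:$dst, (add GPR:$a, GPR:$b))];
}
```

Running llvm-tblgen with -print-records on such a file should show the ADDrr record with the remaining fields (isReturn, isBranch, and so on) at their default values. A real backend would normally wrap these let statements in its own instruction base classes rather than repeat them per instruction.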
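TableGen handles a class declaration (and likewise def, defm and multiclass declarations) by creating a Record object and storing it in the parser's Records container. The difference is that the Record objects produced by def and defm declarations can be referenced (the Records container holds two sub-containers, one for class declarations and one for def declarations). For the Instruction declaration above, TableGen does not generate a corresponding C++ class declaration; the corresponding C++ class is the hand-written class TargetInstrInfo, declared in Target/TargetInstrInfo.h.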
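The split between class records and def records can be observed with a small sketch like the one below. Shape, Triangle and Square are hypothetical names chosen purely for illustration.

```tablegen
// Hypothetical names, for illustration only.
class Shape<int sides> {
  int Sides = sides;
}

// Each def produces a concrete record that other records can refer to.
def Triangle : Shape<3>;
def Square   : Shape<4>;
```

When llvm-tblgen dumps the records of this file (its -print-records action), the class Shape and the defs Triangle and Square are listed in separate sections, which mirrors the two sub-containers described above.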
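2.2.1. TableGen's Built-in Types
In the Instruction declaration above, dag, bit, string, int and list can all be regarded as reserved words of the TD language. They may appear only inside declarations such as class, multiclass, def, defm and foreach. They are TD's built-in types.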
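2.2.1.1. Dag
dag is a rather special type: it represents the DAG (directed acyclic (sub)graph) structure found in a program's intermediate representation tree. An instance of it is therefore a recursive construct, with the following syntax: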
"(" DagArg DagArgList ")"
DagArgList ::= DagArg ("," DagArg)*
DagArg ::= Value [":" TokVarName] | TokVarName
The DagArg in the first production is called the operator of the dag. The Value in the third production may itself be a dag. Here is a real example from LLVM:
(set VR128:$dst, (v2i64 (scalar_to_vector (i64 (bitconvert (x86mmx VR64:$src))))))
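This dag value has 6 levels of nesting. The operator of the first level is "set", which takes two arguments. In the first argument, "VR128:$dst", "VR128" is the value part and "$dst" is the symbolic name of that value (a symbolic name must begin with "$"), which stands for the value in the surrounding context. The second argument is itself a dag whose operator is "v2i64"; the operand of "v2i64" is again a dag whose operator is "scalar_to_vector", whose operand is a dag with "i64" as its operator, and so on.
This dag value describes a conversion: a 64-bit scalar source operand held in an MMX register is first converted to a 64-bit integer, then converted to a 2 ✕ i64 vector, and stored into a 128-bit destination register.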
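The operator of a dag is one of three things: a plain def (for example "outs", "ins" and "set", which have special meaning to TableGen); a definition derived from SDNode, describing an operation (for example "scalar_to_vector" and "bitconvert" above); or a definition derived from ValueType, describing the type of a value (for example "v2i64", "i64" and "x86mmx" above).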
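The grammar can be tried out in isolation with a minimal sketch such as the following. All names in it (op, inner, RegA, RegB, DagExample) are hypothetical plain defs; they carry no special meaning for TableGen.

```tablegen
// All names below are hypothetical and exist only for this illustration.
def op;
def inner;
def RegA;
def RegB;

def DagExample {
  // Operator "op" with two arguments, each given a symbolic name.
  dag Flat = (op RegA:$lhs, RegB:$rhs);

  // The second argument is itself a dag, giving two levels of nesting.
  dag Nested = (op RegA:$dst, (inner RegA:$a, RegB:$b));
}
```

The operator position must hold a def, which is why op and inner are declared before they are used.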
2.2.1.2. List
As the name suggests, this represents a list. A list value has the following form:
"[" ValueList "]" ["<" Type ">"]
ValueList ::= [ValueListNE]
ValueListNE ::= Value ("," Value)*
A list value may be empty, that is, "[]". Here is a real example from LLVM:
[llvm_ptr_ty, llvm_ptr_ty]
Note that in the TD language the "[{…}]" construct is not a list value; it denotes an embedded code fragment.
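The difference between list values and the "[{…}]" construct can be seen in a small sketch. ListExample and its field names are hypothetical.

```tablegen
// Hypothetical record, for illustration only.
def ListExample {
  list<int> Empty = [];                  // an empty list
  list<int> Primes = [2, 3, 5, 7];       // a list of int values
  list<string> Names = ["eax", "ebx"];   // a list of string values

  // Not a list: the text between [{ and }] is kept as a code fragment.
  code Body = [{ return x + 1; }];
}
```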
2.2.1.3. String
A string is essentially a C++ string literal.
2.2.1.4. Bit, int
bit represents a single bit, and int is a 64-bit integer.
2.2.1.5. Bits
bits represents a fixed number of bits; the length must be given as a parameter, as in "bits<64>" above.
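The scalar built-in types can be put side by side in a minimal sketch. Encoding and its field names are hypothetical and are not taken from any real target.

```tablegen
// Hypothetical record, for illustration only.
def Encoding {
  string Mnemonic = "add";     // string: like a C++ string literal
  bit IsLoad = 1;              // bit: a single bit
  int Latency = 3;             // int: a 64-bit integer
  bits<8> Opcode = 0x2A;       // bits<8>: eight bits, set from an integer
  bits<4> Low = Opcode{3-0};   // the low four bits of Opcode
}
```

The last field uses the bit-range syntax to select part of a bits value, which is how target descriptions typically assemble an instruction encoding from smaller fields.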