
LLVM Study Notes (2)

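2.      LLVM backend description

2.1.           Type description

To better describe the value types (sizes) that a register can hold, as well as the types (sizes) of operands, TableGen provides a series of type definitions in ValueTypes.td, all of which derive from ValueType: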

class ValueType<int size, int value> {
  string Namespace = "MVT";
  int Size = size;
  int Value = value;
}

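Here Size is the size of the type in bits, and Value is the identifying number of the type (Value must agree with the number given by the enum SimpleValueType in class MVT in MachineValueType.h). Building on the ValueType base class, TableGen uses the following types (these largely mirror the types used by LLVM IR):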

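OtherVT: "other" value

i1: one-bit boolean value

i8: 8-bit integer value

i16: 16-bit integer value

i32: 32-bit integer value

i64: 64-bit integer value

i128: 128-bit integer value

f16: 16-bit floating-point value

f32: 32-bit floating-point value

f64: 64-bit floating-point value

f80: 80-bit floating-point value

f128: 128-bit floating-point value

ppcf128: PPC 128-bit floating-point value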

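v2i1: 2 ✕ i1 vector

v4i1: 4 ✕ i1 vector

v8i1: 8 ✕ i1 vector

v16i1: 16 ✕ i1 vector

v32i1: 32 ✕ i1 vector

v64i1: 64 ✕ i1 vector

v1i8: 1 ✕ i8 vector

v2i8: 2 ✕ i8 vector

v4i8: 4 ✕ i8 vector

v8i8: 8 ✕ i8 vector

v16i8: 16 ✕ i8 vector

v32i8: 32 ✕ i8 vector

v64i8: 64 ✕ i8 vector

v1i16: 1 ✕ i16 vector

v2i16: 2 ✕ i16 vector

v4i16: 4 ✕ i16 vector

v8i16: 8 ✕ i16 vector

v16i16: 16 ✕ i16 vector

v32i16: 32 ✕ i16 vector

v1i32: 1 ✕ i32 vector

v2i32: 2 ✕ i32 vector

v4i32: 4 ✕ i32 vector

v8i32: 8 ✕ i32 vector

v16i32: 16 ✕ i32 vector

v1i64: 1 ✕ i64 vector

v2i64: 2 ✕ i64 vector

v4i64: 4 ✕ i64 vector

v8i64: 8 ✕ i64 vector

v16i64: 16 ✕ i64 vector

v1i128: 1 ✕ i128 vector

v2f16: 2 ✕ f16 vector

v4f16: 4 ✕ f16 vector

v8f16: 8 ✕ f16 vector

v1f32: 1 ✕ f32 vector

v2f32: 2 ✕ f32 vector

v4f32: 4 ✕ f32 vector

v8f32: 8 ✕ f32 vector

v16f32: 16 ✕ f32 vector

v1f64: 1 ✕ f64 vector

v2f64: 2 ✕ f64 vector

v4f64: 4 ✕ f64 vector

v8f64: 8 ✕ f64 vector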

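x86mmx: X86 MMX value

FlagVT: pre-RA scheduling glue

isVoid: produces no value

untyped: produces an untyped value

iPTRAny: pseudo value type that maps to the current pointer size, for any address space

vAny: vector of any size

fAny: floating-point value of any format

iAny: integer value of any bit width

iPTR: integer of the current pointer size

Any: value of any size and any type

MetadataVT: metadata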

The vXiY and vXfY entries are vector types, such as those supported by the SSE and AVX instruction sets of X86.
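Each entry in the list above is one def deriving from ValueType. The sketch below shows the shape of such definitions in a file that parses on its own. The numbers passed as the second argument are placeholders chosen for illustration, not the real ones: in LLVM they must equal the matching SimpleValueType enumerators in MachineValueType.h, and those differ between LLVM releases.

```tablegen
// Same shape as the ValueType class quoted from ValueTypes.td.
class ValueType<int size, int value> {
  string Namespace = "MVT";
  int Size = size;     // size of the type in bits
  int Value = value;   // identifying number of the type
}

// Illustrative defs only. The second argument is a placeholder; the real
// value has to match the SimpleValueType enum of the LLVM version in use.
def i32   : ValueType<32, 1>;
def f64   : ValueType<64, 2>;
def v4i32 : ValueType<128, 3>;   // 4 x i32 = 128 bits
```

Assuming the snippet is saved as a standalone .td file, running llvm-tblgen on it with no other options should print the resulting records, which makes it easy to check the Size and Value fields of each def.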

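2.2.           Instruction description

Target.td (the file that describes the target-independent interfaces) contains a definition named Instruction. It is best seen as a container that gathers every aspect of an instruction's description in one place. OutOperandList and InOperandList are the output and input operands respectively; AsmString is the assembly text of the instruction; Pattern specifies which DAG fragment in the SelectionDAG matches this instruction; Itinerary can describe the steps the instruction goes through when it executes in the processor; and SchedRW describes the CPU resources the instruction occupies.

Note that describing an instruction does not require every field.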

//===----------------------------------------------------------------------===//
// Instruction set description - These classes correspond to the C++ classes in
// the Target/TargetInstrInfo.h file.
//
class Instruction {
  string Namespace = "";

  dag OutOperandList;       // An dag containing the MI def operand list.
  dag InOperandList;        // An dag containing the MI use operand list.
  string AsmString = "";    // The .s format to print the instruction with.

  // Pattern - Set to the DAG pattern for this instruction, if we know of one,
  // otherwise, uninitialized.
  list<dag> Pattern;

  // The follow state will eventually be inferred automatically from the
  // instruction pattern.

  list<Register> Uses = []; // Default to using no non-operand registers
  list<Register> Defs = []; // Default to modifying no non-operand registers

  // Predicates - List of predicates which will be turned into isel matching
  // code.
  list<Predicate> Predicates = [];

  // Size - Size of encoded instruction, or zero if the size cannot be determined
  // from the opcode.
  int Size = 0;

  // DecoderNamespace - The "namespace" in which this instruction exists, on
  // targets like ARM which multiple ISA namespaces exist.
  string DecoderNamespace = "";

  // Code size, for instruction selection.
  // FIXME: What does this actually mean?
  int CodeSize = 0;

  // Added complexity passed onto matching pattern.
  int AddedComplexity  = 0;

  // These bits capture information about the high-level semantics of the
  // instruction.
  bit isReturn     = 0;     // Is this instruction a return instruction?
  bit isBranch     = 0;     // Is this instruction a branch instruction?
  bit isIndirectBranch = 0; // Is this instruction an indirect branch?
  bit isCompare    = 0;     // Is this instruction a comparison instruction?
  bit isMoveImm    = 0;     // Is this instruction a move immediate instruction?
  bit isBitcast    = 0;     // Is this instruction a bitcast instruction?
  bit isSelect     = 0;     // Is this instruction a select instruction?
  bit isBarrier    = 0;     // Can control flow fall through this instruction?
  bit isCall       = 0;     // Is this instruction a call instruction?
  bit canFoldAsLoad = 0;    // Can this be folded as a simple memory operand?
  bit mayLoad      = ?;     // Is it possible for this inst to read memory?
  bit mayStore     = ?;     // Is it possible for this inst to write memory?
  bit isConvertibleToThreeAddress = 0;  // Can this 2-addr instruction promote?
  bit isCommutable = 0;     // Is this 3 operand instruction commutable?
  bit isTerminator = 0;     // Is this part of the terminator for a basic block?
  bit isReMaterializable = 0; // Is this instruction re-materializable?
  bit isPredicable = 0;     // Is this instruction predicable?
  bit hasDelaySlot = 0;     // Does this instruction have an delay slot?
  bit usesCustomInserter = 0; // Pseudo instr needing special help.
  bit hasPostISelHook = 0;  // To be *adjusted* after isel by target hook.
  bit hasCtrlDep   = 0;     // Does this instruction r/w ctrl-flow chains?
  bit isNotDuplicable = 0;  // Is it unsafe to duplicate this instruction?
  bit isConvergent = 0;     // Is this instruction convergent?
  bit isAsCheapAsAMove = 0; // As cheap (or cheaper) than a move instruction.
  bit hasExtraSrcRegAllocReq = 0; // Sources have special regalloc requirement?
  bit hasExtraDefRegAllocReq = 0; // Defs have special regalloc requirement?
  bit isRegSequence = 0;    // Is this instruction a kind of reg sequence?
                            // If so, make sure to override
                            // TargetInstrInfo::getRegSequenceLikeInputs.
  bit isPseudo     = 0;     // Is this instruction a pseudo-instruction?
                            // If so, won't have encoding information for
                            // the [MC]CodeEmitter stuff.
  bit isExtractSubreg = 0;  // Is this instruction a kind of extract subreg?
                            // If so, make sure to override
                            // TargetInstrInfo::getExtractSubregLikeInputs.
  bit isInsertSubreg = 0;   // Is this instruction a kind of insert subreg?
                            // If so, make sure to override
                            // TargetInstrInfo::getInsertSubregLikeInputs.

  // Side effect flags - When set, the flags have these meanings:
  //
  //  hasSideEffects - The instruction has side effects that are not
  //    captured by any operands of the instruction or other flags.
  //
  bit hasSideEffects = ?;

  // Is this instruction a "real" instruction (with a distinct machine
  // encoding), or is it a pseudo instruction used for codegen modeling
  // purposes.
  // FIXME: For now this is distinct from isPseudo, above, as code-gen-only
  // instructions can (and often do) still have encoding information
  // associated with them. Once we've migrated all of them over to true
  // pseudo-instructions that are lowered to real instructions prior to
  // the printer/emitter, we can remove this attribute and just use isPseudo.
  //
  // The intended use is:
  // isPseudo: Does not have encoding information and should be expanded,
  //   at the latest, during lowering to MCInst.
  //
  // isCodeGenOnly: Does have encoding information and can go through to the
  //   CodeEmitter unchanged, but duplicates a canonical instruction
  //   definition's encoding and should be ignored when constructing the
  //   assembler match tables.
  bit isCodeGenOnly = 0;

  // Is this instruction a pseudo instruction for use by the assembler parser.
  bit isAsmParserOnly = 0;

  InstrItinClass Itinerary = NoItinerary; // Execution steps used for scheduling.

  // Scheduling information from TargetSchedule.td.
  list<SchedReadWrite> SchedRW;

  string Constraints = "";  // OperandConstraint, e.g. $src = $dst.

  /// DisableEncoding - List of operand names (e.g. "$op1,$op2") that should not
  /// be encoded into the output machineinstr.
  string DisableEncoding = "";

  string PostEncoderMethod = "";
  string DecoderMethod = "";

  /// Target-specific flags. This becomes the TSFlags field in TargetInstrDesc.
  bits<64> TSFlags = 0;

  ///@name Assembler Parser Support
  ///@{

  string AsmMatchConverter = "";

  /// TwoOperandAliasConstraint - Enable TableGen to auto-generate a
  /// two-operand matcher inst-alias for a three operand instruction.
  /// For example, the arm instruction "add r3, r3, r5" can be written
  /// as "add r3, r5". The constraint is of the same form as a tied-operand
  /// constraint. For example, "$Rn = $Rd".
  string TwoOperandAliasConstraint = "";

  ///@}

  /// UseNamedOperandTable - If set, the operand indices of this instruction
  /// can be queried via the getNamedOperandIdx() function which is generated
  /// by TableGen.
  bit UseNamedOperandTable = 0;
}

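TableGen processes a class declaration (and likewise def, defm, and multiclass declarations) by creating a Record object and storing it in the parser's Records container. The difference is that the Record objects produced by def and defm declarations can be referenced (the Records container holds two sub-containers, one for class declarations and one for def declarations). For the Instruction declaration above, TableGen does not generate a corresponding C++ class declaration; the corresponding C++ class is the hand-written class TargetInstrInfo, defined in Target/TargetInstrInfo.h.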
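To make the class/def distinction concrete, and to show that a def only needs to set the fields it cares about, here is a minimal standalone sketch. MiniInstruction and MINI_RET are hypothetical names invented for this example, and MiniInstruction is a cut-down stand-in for Instruction so that the file parses without including Target.td; a real backend would derive from Instruction instead.

```tablegen
// Hypothetical stand-ins so that this file parses by itself. In LLVM,
// "outs" and "ins" are defined in Target.td.
def outs;
def ins;

// Cut-down stand-in for Instruction with only a handful of its fields.
class MiniInstruction {
  dag OutOperandList;
  dag InOperandList;
  string AsmString = "";
  list<dag> Pattern;        // left unset unless a def provides a pattern
  bit isReturn = 0;
  bit isTerminator = 0;
}

// The def sets only the fields it needs; the others keep their defaults
// (or stay uninitialized, like Pattern here).
def MINI_RET : MiniInstruction {
  let OutOperandList = (outs);
  let InOperandList  = (ins);
  let AsmString      = "ret";
  let isReturn       = 1;
  let isTerminator   = 1;
}
```

Assuming the snippet is saved as a standalone .td file, running llvm-tblgen on it with no other options should print the parsed records in two groups, classes and defs, which corresponds to the two sub-containers described above.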

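2.2.1.  TableGen built-in types

In the Instruction declaration above, dag, bit, string, int, and list can all be regarded as reserved words of the TD language; they may appear only in declarations such as class, multiclass, def, defm, and foreach. They are TD's built-in types.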

2.2.1.1.        Dag

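dag is a rather special type: it represents the DAG (directed acyclic (sub)graph) structure found in a program's intermediate expression tree. Its instances are therefore built recursively, with the following syntax:

"(" DagArg DagArgList ")"

DagArgList ::= DagArg ("," DagArg)*

DagArg ::= Value [":" TokVarName] | TokVarName

The DagArg in the first rule is called the operator of the dag. The Value in the third rule can itself be a dag. Here is a real example from LLVM: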

(set VR128:$dst, (v2i64 (scalar_to_vector (i64 (bitconvert (x86mmx VR64:$src))))))

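This dag value has 6 levels of nesting. The operator of the first level is "set", which takes two values. In the first value, "VR128:$dst", "VR128" is the value part and "$dst" is the symbolic name of that value (a symbolic name must start with "$"), which stands for the value in context. The second value is itself a dag whose operator is "v2i64"; the operand of "v2i64" is again a dag, with operator "scalar_to_vector", whose operand is a dag with "i64" as its operator, and so on.

This dag value describes a conversion: a 64-bit scalar source operand held in an MMX register is first converted to a 64-bit integer, then to a 2 ✕ i64 vector, and finally stored into a 128-bit destination register.

The operator of a dag is one of the following: a simple def (for example "outs", "ins", and "set", which have special meaning to TableGen); a definition derived from SDNode, describing an operation (for example "scalar_to_vector" and "bitconvert" above); or a definition derived from ValueType, describing the type of a value (for example "v2i64", "i64", and "x86mmx" above).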
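The grammar above can be tried out in isolation. The following is a minimal sketch, not taken from LLVM: add_op, i32_ty, GPR, and DagExample are hypothetical stand-ins for an SDNode-derived operator, a ValueType-derived operator, a register class, and a holder record, declared as bare defs so that the file parses on its own.

```tablegen
// Stand-in defs. In LLVM the real counterparts come from Target.td,
// TargetSelectionDAG.td, and the backend's register definitions.
def set;
def add_op;   // hypothetical stand-in for an SDNode-derived operator
def i32_ty;   // hypothetical stand-in for a ValueType-derived operator
def GPR;      // hypothetical stand-in for a register class

def DagExample {
  // Three levels of nesting. The operator "set" takes two arguments:
  // "GPR:$dst" (value GPR, symbolic name $dst) and a nested dag whose
  // operator is i32_ty, whose single argument is another nested dag.
  dag Pattern = (set GPR:$dst, (i32_ty (add_op GPR:$lhs, GPR:$rhs)));
}
```

Each opening parenthesis starts a new dag, and the first item after it is that dag's operator, which matches the reading of the LLVM example above.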

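2.2.1.2.        List

As the name suggests, this represents a list. A list value has the following form:

"[" ValueList "]" ["<" Type ">"]

ValueList ::= [ValueListNE]

ValueListNE ::= Value ("," Value)*

A list value may be empty, i.e. "[]". Here is a real example from LLVM:

[llvm_ptr_ty, llvm_ptr_ty]

Note that in the TD language the "[{…}]" construct is not a list value; it denotes an embedded code fragment.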
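As a small illustration of these forms, the sketch below declares a few list fields in a standalone file. Item, A, B, and ListExample are hypothetical names made up for this example.

```tablegen
class Item<string name> { string Name = name; }
def A : Item<"a">;
def B : Item<"b">;

def ListExample {
  list<int>  Empty = [];          // the empty list
  list<int>  Ints  = [1, 2, 3];   // element type comes from the field type
  list<Item> Items = [A, B];      // a list of references to other records
  code       Body  = [{ return 0; }];  // "[{ ... }]" is a code fragment, not a list
}
```

The last field is there only to contrast the two bracket forms: the text between "[{" and "}]" is kept verbatim as a code fragment rather than parsed as list elements.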

2.2.1.3.        String

A string is essentially a C++ string literal.

2.2.1.4.        Bit, int

bit represents a single bit, and int is a 64-bit integer.

2.2.1.5.        Bits

bits represents a sequence of bits and takes a parameter giving its length, as in "bits<64>" above.
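The scalar types from the last three subsections can be seen side by side in the following standalone sketch. ScalarExample and its field names are hypothetical and exist only for illustration.

```tablegen
def ScalarExample {
  bit     Flag   = 1;          // a single bit
  int     Count  = 42;         // a 64-bit integer
  string  Name   = "example";  // written like a C++ string literal
  bits<4> Nibble = 0b1010;     // four bits, given here as a binary literal
  bits<8> Byte   = 0;          // eight bits, all initially zero
  let Byte{7-4}  = Nibble;     // overwrite the upper four bits from Nibble
}
```

The final line uses a bit range on the left-hand side of let, the same mechanism that backends commonly use to assemble an instruction encoding field by field.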