
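LLVM Study Notes (42)

3.6.2.2.3. Describing resources and their usage

We already know there are two ways to describe how an instruction executes. One is the itinerary: a series of InstrItinData definitions, each containing a set of InstrStage definitions; InstrItinClass definitions that associate InstrItinData with instruction definitions; and a ProcessorItineraries definition that ties these together. The other describes resource usage directly, through a set of interrelated definitions derived from SchedReadWrite.

Both approaches model the processor as a collection of resources and state how each instruction uses them. It is now time to emit the corresponding data structures.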

1251     OS << "#ifdef DBGFIELD\n"

1252        << "#error \"<target>GenSubtargetInfo.inc requires a DBGFIELD macro\"\n"

1253        << "#endif\n"

1254        << "#ifndef NDEBUG\n"

1255        << "#define DBGFIELD(x) x,\n"

1256        << "#else\n"

1257        << "#define DBGFIELD(x)\n"

1258        << "#endif\n";

1259  

1260     if (SchedModels.hasItineraries()) {

1261       std::vector<std::vector<InstrItinerary> > ProcItinLists;

1262       // Emit the stage data

1263       EmitStageAndOperandCycleData(OS, ProcItinLists);

1264       EmitItineraries(OS, ProcItinLists);

1265     }

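As in earlier sections, the ProcModels container of the SchedModels object holds the CodeGenProcModel objects of every processor in the family. If any processor is described with itineraries, the condition on line 1260 holds and the stage data of those processors is emitted. Like the InstrStage definition used in the TD files, LLVM has a type of the same name that plays a similar role.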

59        struct InstrStage {

60          enum ReservationKinds {

61            Required = 0,

62            Reserved = 1

63          };

64       

65          unsigned Cycles_;  ///< Length of stage in machine cycles

66          unsigned Units_;   ///< Choice of functional units

67          int NextCycles_;   ///< Number of machine cycles to next stage

68          ReservationKinds Kind_; ///< Kind of the FU reservation

69       

70          /// \brief Returns the number of cycles the stage is occupied.

71          unsigned getCycles() const {

72            return Cycles_;

73          }

74       

75          /// \brief Returns the choice of FUs.

76          unsigned getUnits() const {

77            return Units_;

78          }

79       

80          ReservationKinds getReservationKind() const {

81            return Kind_;

82          }

83       

84          /// \brief Returns the number of cycles from the start of this stage to the

85          /// start of the next stage in the itinerary

86          unsigned getNextCycles() const {

87            return (NextCycles_ >= 0) ? (unsigned)NextCycles_ : Cycles_;

88          }

89        };

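InstrStage represents one non-pipelined step in the execution of an instruction. Cycles is the number of cycles needed to complete the step, and Units lists the functional units that may be chosen to carry it out, for example IntUnit1 or IntUnit2. NextCycles is the number of cycles that should elapse between the start of this step and the start of the next one. A value of -1 means the next step starts immediately after the current one finishes. For example:

{ 1, x, -1 }: the step occupies FU x for one cycle, and the next step starts immediately after it.

{ 2, x|y, 1 }: the step occupies FU x or FU y for two consecutive cycles, and the next step starts one cycle after this one starts. That is, the two steps overlap in time.

{ 1, x, 0 }: the step occupies FU x for one cycle, and the next step starts in the same cycle. This expresses an instruction that needs several steps at the same time.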
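The three examples can be checked with a small stand-alone model. The sketch below is not LLVM code: `Stage` and `stageStartCycles` are hypothetical names introduced for illustration, and only the NextCycles rule of `getNextCycles()` quoted above is reproduced.

```cpp
#include <vector>

// Hypothetical stand-in for llvm::InstrStage; only the fields needed here.
struct Stage {
  unsigned Cycles;  // length of the stage in machine cycles
  unsigned Units;   // bitmask of functional units to choose from
  int NextCycles;   // cycles until the next stage starts, -1 = right after

  // Same rule as InstrStage::getNextCycles() in the listing above.
  unsigned getNextCycles() const {
    return NextCycles >= 0 ? static_cast<unsigned>(NextCycles) : Cycles;
  }
};

// Returns the cycle at which each stage of an itinerary starts,
// counting from cycle 0 for the first stage.
std::vector<unsigned> stageStartCycles(const std::vector<Stage> &Stages) {
  std::vector<unsigned> Starts;
  unsigned Cycle = 0;
  for (const Stage &S : Stages) {
    Starts.push_back(Cycle);
    Cycle += S.getNextCycles();
  }
  return Starts;
}
```

Feeding it the sequence { 1, x, -1 }, { 2, x|y, 1 }, { 1, x, 0 } followed by one more stage gives start cycles 0, 1, 2, 2: the last two stages begin in the same cycle, as the third example describes.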

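There are two kinds of FU reservation: FUs that the instruction actually requires, and FUs that the instruction merely reserves. A reserved unit is unavailable to other instructions for execution, but several instructions may reserve the same unit several times. These two kinds of reservation are used to model stalls caused by instruction domain changes, FUs that share a resource (for example the same register file), and so on.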

97       struct InstrItinerary {

98          int      NumMicroOps;        ///< # of micro-ops, -1 means it's variable

99          unsigned FirstStage;         ///< Index of first stage in itinerary

100        unsigned LastStage;          ///< Index of last + 1 stage in itinerary

101        unsigned FirstOperandCycle;  ///< Index of first operand rd/wr

102        unsigned LastOperandCycle;   ///< Index of last + 1 operand rd/wr

103      };

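InstrItinerary represents the scheduling information of an instruction: the set of stages it occupies and the pipeline cycles in which its operands are read and written. It is LLVM's counterpart of the InstrItinData definition. One level up is InstrItineraryData, which wraps this data for a subtarget; its data members and constructors are shown below.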

109      class InstrItineraryData {

110      public:

111        MCSchedModel          SchedModel;     ///< Basic machine properties.

112        const InstrStage     *Stages;         ///< Array of stages selected

113        const unsigned       *OperandCycles;  ///< Array of operand cycles selected

114        const unsigned       *Forwardings;    ///< Array of pipeline forwarding pathes

115        const InstrItinerary *Itineraries;    ///< Array of itineraries selected

116     

117        /// Ctors.

118        InstrItineraryData() : SchedModel(MCSchedModel::GetDefaultSchedModel()),

119                               Stages(nullptr), OperandCycles(nullptr),

120                               Forwardings(nullptr), Itineraries(nullptr) {}

121     

122        InstrItineraryData(const MCSchedModel &SM, const InstrStage *S,

123                           const unsigned *OS, const unsigned *F)

124          : SchedModel(SM), Stages(S), OperandCycles(OS), Forwardings(F),

125            Itineraries(SchedModel.InstrItineraries) {}

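3.6.2.2.3.1. Functional unit and bypass definitions

We already know that the ItinsDef member of a processor's CodeGenProcModel object is the Record object of the ProcessorItineraries definition actually used by its Processor-derived definition (Processor→ProcItin or Processor→SchedModel→Itineraries).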

359      void SubtargetEmitter::

360      EmitStageAndOperandCycleData(raw_ostream &OS,

361                                   std::vector<std::vector<InstrItinerary> >

362                                     &ProcItinLists) {

363     

364        // Multiple processor models may share an itinerary record. Emit it once.

365        SmallPtrSet<Record*, 8> ItinsDefSet;

366     

367        // Emit functional units for all the itineraries.

368        for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

369               PE = SchedModels.procModelEnd(); PI != PE; ++PI) {

370     

371          if (!ItinsDefSet.insert(PI->ItinsDef).second)

372            continue;

373     

374          std::vector<Record*> FUs = PI->ItinsDef->getValueAsListOfDefs("FU");

375          if (FUs.empty())

376            continue;

377     

378          const std::string &Name = PI->ItinsDef->getName();

379          OS << "\n// Functional units for \"" << Name << "\"\n"

380             << "namespace " << Name << "FU {\n";

381     

382          for (unsigned j = 0, FUN = FUs.size(); j < FUN; ++j)

383            OS << "  const unsigned " << FUs[j]->getName()

384               << " = 1 << " << j << ";\n";

385     

386          OS << "}\n";

387     

388          std::vector<Record*> BPs = PI->ItinsDef->getValueAsListOfDefs("BP");

389          if (!BPs.empty()) {

390            OS << "\n// Pipeline forwarding pathes for itineraries \"" << Name

391               << "\"\n" << "namespace " << Name << "Bypass {\n";

392     

393            OS << "  const unsigned NoBypass = 0;\n";

394            for (unsigned j = 0, BPN = BPs.size(); j < BPN; ++j)

395              OS << "  const unsigned " << BPs[j]->getName()

396                 << " = 1 << " << j << ";\n";

397     

398            OS << "}\n";

399          }

400        }

In the X86 family only Atom uses the itinerary mechanism. Atom's ProcessorItineraries definition declares no BP (bypass) and only two Port resources, so we get the following output:

#ifdef DBGFIELD

#error "<target>GenSubtargetInfo.inc requires a DBGFIELD macro"

#endif

#ifndef NDEBUG

#define DBGFIELD(x) x,

#else

#define DBGFIELD(x)

#endif

// Functional units for "AtomItineraries"

namespace AtomItinerariesFU {

  const unsigned Port0 = 1 << 0;

  const unsigned Port1 = 1 << 1;

}
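The numbering scheme is easy to reproduce outside TableGen. The following is a minimal sketch, assuming the unit names are already available as plain strings; `emitFUNamespace` is a hypothetical helper, not part of SubtargetEmitter. It mirrors lines 379-386 above: the j-th unit receives the value 1 << j, so a stage's choice of units can later be written as a bitwise OR of these constants.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Builds the text of a "<Name>FU" namespace in which the j-th unit
// gets the bit value 1 << j.
std::string emitFUNamespace(const std::string &Name,
                            const std::vector<std::string> &Units) {
  std::ostringstream OS;
  OS << "namespace " << Name << "FU {\n";
  for (unsigned j = 0, N = Units.size(); j < N; ++j)
    OS << "  const unsigned " << Units[j] << " = 1 << " << j << ";\n";
  OS << "}\n";
  return OS.str();
}
```

Called with "AtomItineraries" and the units Port0 and Port1, it yields the namespace shown above.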

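Next, three tables are emitted. The first is the stage array, whose elements are of type InstrStage; the second is the operand cycle array; the third is the bypass array. The emitter builds each of them as a string. The first entry of every array is reserved for the NoItineraries definition.

SubtargetEmitter::EmitStageAndOperandCycleData (continued)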

402        // Begin stages table

403        std::string StageTable = "\nextern const llvm::InstrStage " + Target +

404                                 "Stages[] = {\n";

405        StageTable += "  { 0, 0, 0, llvm::InstrStage::Required }, // No itinerary\n";

406     

407        // Begin operand cycle table

408        std::string OperandCycleTable = "extern const unsigned " + Target +

409          "OperandCycles[] = {\n";

410        OperandCycleTable += "  0, // No itinerary\n";

411     

412        // Begin pipeline bypass table

413        std::string BypassTable = "extern const unsigned " + Target +

414          "ForwardingPaths[] = {\n";

415        BypassTable += " 0, // No itinerary\n";

416     

417        // For each Itinerary across all processors, add a unique entry to the stages,

418        // operand cycles, and pipepine bypess tables. Then add the new Itinerary

419        // object with computed offsets to the ProcItinLists result.

420        unsigned StageCount = 1, OperandCycleCount = 1;

421        std::map<std::string, unsigned> ItinStageMap, ItinOperandMap;

422        for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

423               PE = SchedModels.procModelEnd(); PI != PE; ++PI) {

424          const CodeGenProcModel &ProcModel = *PI;

425     

426          // Add process itinerary to the list.

427          ProcItinLists.resize(ProcItinLists.size()+1);

428     

429          // If this processor defines no itineraries, then leave the itinerary list

430          // empty.

431          std::vector<InstrItinerary> &ItinList = ProcItinLists.back();

432          if (!ProcModel.hasItineraries())

433            continue;

434     

435          const std::string &Name = ProcModel.ItinsDef->getName();

436     

437          ItinList.resize(SchedModels.numInstrSchedClasses());

438          assert(ProcModel.ItinDefList.size() == ItinList.size() && "bad Itins");

439     

440          for (unsigned SchedClassIdx = 0, SchedClassEnd = ItinList.size();

441               SchedClassIdx < SchedClassEnd; ++SchedClassIdx) {

442     

443            // Next itinerary data

444            Record *ItinData = ProcModel.ItinDefList[SchedClassIdx];

445     

446            // Get string and stage count

447            std::string ItinStageString;

448            unsigned NStages = 0;

449            if (ItinData)

450              FormItineraryStageString(Name, ItinData, ItinStageString, NStages);

451     

452            // Get string and operand cycle count

453            std::string ItinOperandCycleString;

454            unsigned NOperandCycles = 0;

455            std::string ItinBypassString;

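456            if (ItinData) {

457              FormItineraryOperandCycleString(ItinData, ItinOperandCycleString,

458                                              NOperandCycles);

459     

460              FormItineraryBypassString(Name, ItinData, ItinBypassString,

461                                        NOperandCycles);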

462            }

463     

464            // Check to see if stage already exists and create if it doesn't

465            unsigned FindStage = 0;

466            if (NStages > 0) {

467              FindStage = ItinStageMap[ItinStageString];

468              if (FindStage == 0) {

469                // Emit as { cycles, u1 | u2 | ... | un, timeinc }, // indices

470                StageTable += ItinStageString + ", // " + itostr(StageCount);

471                if (NStages > 1)

472                  StageTable += "-" + itostr(StageCount + NStages - 1);

473                StageTable += "\n";

474                // Record Itin class number.

475                ItinStageMap[ItinStageString] = FindStage = StageCount;

476                StageCount += NStages;

477              }

478            }

479     

480            // Check to see if operand cycle already exists and create if it doesn't

481            unsigned FindOperandCycle = 0;

482            if (NOperandCycles > 0) {

483              std::string ItinOperandString = ItinOperandCycleString+ItinBypassString;

484              FindOperandCycle = ItinOperandMap[ItinOperandString];

485              if (FindOperandCycle == 0) {

486                // Emit as  cycle, // index

487                OperandCycleTable += ItinOperandCycleString + ", // ";

488                std::string OperandIdxComment = itostr(OperandCycleCount);

489                if (NOperandCycles > 1)

490                  OperandIdxComment += "-"

491                    + itostr(OperandCycleCount + NOperandCycles - 1);

492                OperandCycleTable += OperandIdxComment + "\n";

493                // Record Itin class number.

494                ItinOperandMap[ItinOperandCycleString] =

495                  FindOperandCycle = OperandCycleCount;

496                // Emit as bypass, // index

497                BypassTable += ItinBypassString + ", // " + OperandIdxComment + "\n";

498                OperandCycleCount += NOperandCycles;

499              }

500            }

501     

502            // Set up itinerary as location and location + stage count

503            int NumUOps = ItinData ? ItinData->getValueAsInt("NumMicroOps") : 0;

504            InstrItinerary Intinerary = { NumUOps, FindStage, FindStage + NStages,

505                                          FindOperandCycle,

506                                          FindOperandCycle + NOperandCycles};

507     

508            // Inject - empty slots will be 0, 0

509            ItinList[SchedClassIdx] = Intinerary;

510          }

511        }

512     

513        // Closing stage

514        StageTable += "  { 0, 0, 0, llvm::InstrStage::Required } // End stages\n";

515        StageTable += "};\n";

516     

517        // Closing operand cycles

518        OperandCycleTable += "  0 // End operand cycles\n";

519        OperandCycleTable += "};\n";

520     

521        BypassTable += " 0 // End bypass tables\n";

522        BypassTable += "};\n";

523     

524        // Emit tables.

525        OS << StageTable;

526        OS << OperandCycleTable;

527        OS << BypassTable;

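528      }

3.6.2.2.3.2. Execution step data

For each processor that uses itineraries to assist instruction scheduling, the ItinDefList container of its CodeGenProcModel instance holds entries from the IID list (of type list<InstrItinData>) of the relevant ProcessorItineraries definition; the container associates each scheduling class with the InstrItinData definition that refers to the same InstrItinClass definition. The assertion on line 438 above must hold, because on line 784 of collectProcItins ProcModel.ItinDefList is resized to NumInstrSchedClasses.

For a given CodeGenProcModel object, the loop at line 440 in effect walks all non-inferred CodeGenSchedClass objects. Line 444 therefore fetches the Record object of the InstrItinData definition that matches the scheduling class, and this becomes the second argument of the FormItineraryStageString call on line 450.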

274      void SubtargetEmitter::FormItineraryStageString(const std::string &Name,

275                                                      Record *ItinData,

276                                                      std::string &ItinString,

277                                                      unsigned &NStages) {

278        // Get states list

279        const std::vector<Record*> &StageList =

280          ItinData->getValueAsListOfDefs("Stages");

281     

282        // For each stage

283        unsigned N = NStages = StageList.size();

284        for (unsigned i = 0; i < N;) {

285          // Next stage

286          const Record *Stage = StageList[i];

287     

288          // Form string as ,{ cycles, u1 | u2 | ... | un, timeinc, kind }

289          int Cycles = Stage->getValueAsInt("Cycles");

290          ItinString += "  { " + itostr(Cycles) + ", ";

291     

292          // Get unit list

293          const std::vector<Record*> &UnitList = Stage->getValueAsListOfDefs("Units");

294     

295          // For each unit

296          for (unsigned j = 0, M = UnitList.size(); j < M;) {

297            // Add name and bitwise or

298            ItinString += Name + "FU::" + UnitList[j]->getName();

299            if (++j < M) ItinString += " | ";

300          }

301     

302          int TimeInc = Stage->getValueAsInt("TimeInc");

303          ItinString += ", " + itostr(TimeInc);

304     

305          int Kind = Stage->getValueAsInt("Kind");

306          ItinString += ", (llvm::InstrStage::ReservationKinds)" + itostr(Kind);

307     

308          // Close off stage

309          ItinString += " }";

310          if (++i < N) ItinString += ", ";

311        }

312      }

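For the resulting description string, see the examples given above for InstrStage. An InstrItinData definition also has an OperandCycles field, which gives, for each operand, the number of cycles after issue at which its value has been read or written.

319      void SubtargetEmitter::FormItineraryOperandCycleString(Record *ItinData,

320                               std::string &ItinString, unsigned &NOperandCycles) {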

321        // Get operand cycle list

322        const std::vector<int64_t> &OperandCycleList =

323          ItinData->getValueAsListOfInts("OperandCycles");

324     

325        // For each operand cycle

326        unsigned N = NOperandCycles = OperandCycleList.size();

327        for (unsigned i = 0; i < N;) {

328          // Next operand cycle

329          const int OCycle = OperandCycleList[i];

330     

331          ItinString += "  " + itostr(OCycle);

332          if (++i < N) ItinString += ", ";

333        }

334      }

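Finally, an array describing bypasses is emitted. An InstrItinData definition in the .td file is thus split across three arrays, because these are three fairly independent dimensions of the definition. Each dimension may also contain many duplicates, so building three arrays and identifying an InstrItinData definition by array indices gives a more compact data structure.

336      void SubtargetEmitter::FormItineraryBypassString(const std::string &Name,

337                                                       Record *ItinData,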

338                                                       std::string &ItinString,

339                                                       unsigned NOperandCycles) {

340        const std::vector<Record*> &BypassList =

341          ItinData->getValueAsListOfDefs("Bypasses");

342        unsigned N = BypassList.size();

343        unsigned i = 0;

344        for (; i < N;) {

345          ItinString += Name + "Bypass::" + BypassList[i]->getName();

346          if (++i < NOperandCycles) ItinString += ", ";

347        }

348        for (; i < NOperandCycles;) {

349          ItinString += " 0";

350          if (++i < NOperandCycles) ItinString += ", ";

351        }

352      }

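Note that in FormItineraryOperandCycleString the parameter NOperandCycles is a reference, set on line 326 to the size of OperandCycles in the InstrItinData definition. It is then passed to FormItineraryBypassString to control the size of the bypass array.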
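The padding behaviour is the part that is easy to overlook: when fewer bypasses than operand cycles are listed, the remainder is filled with 0 so that the bypass table stays index-aligned with the operand cycle table. The sketch below restates the two loops of FormItineraryBypassString on plain strings; `formBypassString` and the names "Foo" and "A" used with it are hypothetical.

```cpp
#include <string>
#include <vector>

// Named bypasses come first, then " 0" padding until NOperandCycles
// entries have been written, following the two loops in the listing above.
std::string formBypassString(const std::string &Name,
                             const std::vector<std::string> &Bypasses,
                             unsigned NOperandCycles) {
  std::string Out;
  unsigned i = 0;
  for (unsigned N = Bypasses.size(); i < N;) {
    Out += Name + "Bypass::" + Bypasses[i];
    if (++i < NOperandCycles) Out += ", ";
  }
  for (; i < NOperandCycles;) {
    Out += " 0";
    if (++i < NOperandCycles) Out += ", ";
  }
  return Out;
}
```

With one bypass A and three operand cycles, the result contains FooBypass::A followed by two zero entries.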

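On line 466 of EmitStageAndOperandCycleData, NStages is the number of entries in the Stages list of the InstrItinData definition, as set by FormItineraryStageString. The container ItinStageMap (std::map<std::string, unsigned>) keeps the generated InstrStage entries unique; lines 468~477 make sure each distinct stage string is emitted only once. ItinOperandMap plays the same role for operand cycles.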
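The deduplication relies on index 0 being reserved for "no itinerary", which lets 0 double as "not found" in the map. A minimal sketch of that idea, using a hypothetical `StageInterner` class rather than the emitter's local variables:

```cpp
#include <map>
#include <string>

// Hypothetical helper that mimics how ItinStageMap deduplicates stage strings.
class StageInterner {
  std::map<std::string, unsigned> Map;
  unsigned NextIndex = 1;  // index 0 is reserved for "no itinerary"

public:
  // Returns the first table index of the stage sequence described by Str,
  // reserving NStages new entries only the first time Str is seen.
  unsigned intern(const std::string &Str, unsigned NStages) {
    if (NStages == 0)
      return 0;
    unsigned &Slot = Map[Str];  // value-initialized to 0 on first use
    if (Slot == 0) {
      Slot = NextIndex;
      NextIndex += NStages;
    }
    return Slot;
  }

  // Number of table entries reserved so far, including entry 0.
  unsigned size() const { return NextIndex; }
};
```

Interning the same string twice returns the same index and does not grow the table, which is why several scheduling classes in the generated output can share one stage entry.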

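Line 504 builds an InstrItinerary instance, which is stored at the corresponding position in ProcItinLists. From line 514 on, the three arrays are closed and written out. For the X86 target, for example, the result is: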

extern const llvm::InstrStage X86Stages[] = {

  { 0, 0, 0, llvm::InstrStage::Required }, // No itinerary

  { 13, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 1

  { 7, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 2

  { 21, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 3

  { 1, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 4

   …

  { 202, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 92

  { 0, 0, 0, llvm::InstrStage::Required } // End stages

};

extern const unsigned X86OperandCycles[] = {

  0, // No itinerary

  0 // End operand cycles

};

extern const unsigned X86ForwardingPaths[] = {

 0, // No itinerary

 0 // End bypass tables

};

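The three arrays are tied together by the InstrItinerary arrays generated next. The ProcItinLists parameter of EmitItineraries was prepared by EmitStageAndOperandCycleData above. Note that line 546 walks the ProcModels container of SchedModels in the same order in which EmitStageAndOperandCycleData walked it when preparing the InstrItinerary data, and that ProcItinLists always has the same size as ProcModels (line 427 of EmitStageAndOperandCycleData). Line 432 also shows that for a processor without itineraries the ProcItinLists entry is left empty, whereas line 509 shows that for a processor with itineraries a new InstrItinerary instance is always stored in that processor's entry, whether or not an identical instance already exists. The processors visited below therefore always correspond one-to-one with the entries of ProcItinLists (the condition on line 562 filters out processors that do not use itineraries).

536      void SubtargetEmitter::

537      EmitItineraries(raw_ostream &OS,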

538                      std::vector<std::vector<InstrItinerary> > &ProcItinLists) {

539     

540        // Multiple processor models may share an itinerary record. Emit it once.

541        SmallPtrSet<Record*, 8> ItinsDefSet;

542     

543        // For each processor's machine model

544        std::vector<std::vector<InstrItinerary> >::iterator

545            ProcItinListsIter = ProcItinLists.begin();

546        for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

547               PE = SchedModels.procModelEnd(); PI != PE; ++PI, ++ProcItinListsIter) {

548     

549          Record *ItinsDef = PI->ItinsDef;

550          if (!ItinsDefSet.insert(ItinsDef).second)

551            continue;

552     

553          // Get processor itinerary name

554          const std::string &Name = ItinsDef->getName();

555     

556          // Get the itinerary list for the processor.

557          assert(ProcItinListsIter != ProcItinLists.end() && "bad iterator");

558          std::vector<InstrItinerary> &ItinList = *ProcItinListsIter;

559     

560          // Empty itineraries aren't referenced anywhere in the tablegen output

561          // so don't emit them.

562          if (ItinList.empty())

563            continue;

564     

565          OS << "\n";

566          OS << "static const llvm::InstrItinerary ";

567     

568          // Begin processor itinerary table

569          OS << Name << "[] = {\n";

570     

571          // For each itinerary class in CodeGenSchedClass::Index order.

572          for (unsigned j = 0, M = ItinList.size(); j < M; ++j) {

573            InstrItinerary &Intinerary = ItinList[j];

574     

575            // Emit Itinerary in the form of

576            // { firstStage, lastStage, firstCycle, lastCycle } // index

577            OS << "  { " <<

578              Intinerary.NumMicroOps << ", " <<

579              Intinerary.FirstStage << ", " <<

580              Intinerary.LastStage << ", " <<

581              Intinerary.FirstOperandCycle << ", " <<

582              Intinerary.LastOperandCycle << " }" <<

583              ", // " << j << " " << SchedModels.getSchedClass(j).Name << "\n";

584          }

585          // End processor itinerary table

586          OS << "  { 0, ~0U, ~0U, ~0U, ~0U } // end marker\n";

587          OS << "};\n";

588        }

589      }

In the X86 target only the Atom processor uses itineraries, so the following array is emitted (950 entries):

static const llvm::InstrItinerary AtomItineraries[] = {

  { 0, 0, 0, 0, 0 }, // 0 NoInstrModel

  { 1, 1, 2, 0, 0 }, // 1 IIC_AAA_WriteMicrocoded

  { 1, 2, 3, 0, 0 }, // 2 IIC_AAD_WriteMicrocoded

  { 1, 3, 4, 0, 0 }, // 3 IIC_AAM_WriteMicrocoded

  { 1, 1, 2, 0, 0 }, // 4 IIC_AAS_WriteMicrocoded

  { 1, 4, 5, 0, 0 }, // 5 IIC_BIN_CARRY_NONMEM_WriteALU

  …

  { 1, 43, 44, 0, 0 }, // 948 LDMXCSR_VLDMXCSR

  { 1, 17, 18, 0, 0 }, // 949 STMXCSR_VSTMXCSR

  { 0, ~0U, ~0U, ~0U, ~0U } // end marker

};

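The comments give the so-called scheduling classes. Note that the order of the output is exactly the same as the order of the enumerators in the Sched namespace of X86GenInstrInfo.inc that denote scheduling classes. This consistency lets us use those enumerators to look up the concrete parameters of the corresponding scheduling class.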
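To make the indexing concrete, here is a small sketch of such a lookup. It is only an illustration: `StageRow`, `ItinRow` and `totalStageCycles` are hypothetical names, the tables are trimmed copies of the first few rows of the X86 output above with the unit masks simplified to a plain number, and summing the cycles ignores any overlap that NextCycles might introduce.

```cpp
#include <cstddef>

// Hypothetical, trimmed-down row types for the generated tables.
struct StageRow { unsigned Cycles; unsigned Units; int NextCycles; };
struct ItinRow { int NumMicroOps; unsigned FirstStage, LastStage; };

static const StageRow StageTable[] = {
  {0, 0, 0},    // 0: no itinerary
  {13, 3, -1},  // 1
  {7, 3, -1},   // 2
  {21, 3, -1},  // 3
  {1, 3, -1},   // 4
};

static const ItinRow ItinTable[] = {
  {0, 0, 0},  // 0 NoInstrModel
  {1, 1, 2},  // 1 IIC_AAA_WriteMicrocoded
  {1, 2, 3},  // 2 IIC_AAD_WriteMicrocoded
  {1, 3, 4},  // 3 IIC_AAM_WriteMicrocoded
  {1, 1, 2},  // 4 IIC_AAS_WriteMicrocoded
  {1, 4, 5},  // 5 IIC_BIN_CARRY_NONMEM_WriteALU
};

// Sums the cycles of all stages in [FirstStage, LastStage) for the
// scheduling class with the given index.
unsigned totalStageCycles(std::size_t SchedClass) {
  const ItinRow &I = ItinTable[SchedClass];
  unsigned Total = 0;
  for (unsigned S = I.FirstStage; S != I.LastStage; ++S)
    Total += StageTable[S].Cycles;
  return Total;
}
```

Classes 1 and 4 both point at stage entry 1, so they report the same 13 cycles, while the empty range of class 0 contributes nothing.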