
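10.1 Overview

RHI stands for Render Hardware Interface. It is a fundamental and important module in UE's rendering system: it encapsulates the differences between the graphics APIs (DirectX, OpenGL, Vulkan, Metal) and exposes simple, consistent concepts, data, resources, and interfaces to the Game and Renderer modules, so that a single body of rendering code can run on multiple platforms.

Diagram of the Game, Renderer, and RHI layers; the RHI is the platform-dependent part.

The original RHI was designed around the D3D11 API and comprised resource management and command interfaces.

When the RHI thread is enabled, the RHI is accompanied by a dedicated RHI thread, which translates the intermediate RHI commands pushed by the render thread into GPU commands for the target graphics platform. On graphics APIs that support parallelism (DX12, Vulkan, consoles), if the render thread generates the intermediate RHI commands in parallel, the RHI thread translates them in parallel as well.

Diagram of UE4's render thread generating intermediate commands in parallel and the RHI thread translating them in parallel before submitting the rendering commands.

This article focuses on the RHI's basic concepts, types, and interfaces, the relationships among them, and the principles and mechanisms involved. It also touches briefly on the implementation details of specific graphics APIs.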

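10.2 RHI Basics

This chapter analyzes the basic concepts and types involved in the RHI and explains their relationships and underlying principles.

10.2.1 FRenderResource

FRenderResource represents a rendering resource on the render thread. It is managed and passed along by the render thread and serves as intermediate data between the game thread and the RHI thread. Earlier articles touched on the concept without explaining it in detail, so it is covered here. FRenderResource is defined as follows: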

// Engine\Source\Runtime\RenderCore\Public\RenderResource.h

class RENDERCORE_API FRenderResource
{
public:
    // Iterates over all resources and invokes the callback on each one.
    template<typename FunctionType>
    static void ForAllResources(const FunctionType& Function);
    static void InitRHIForAllResources();
    static void ReleaseRHIForAllResources();
    static void ChangeFeatureLevel(ERHIFeatureLevel::Type NewFeatureLevel);

    FRenderResource();
    FRenderResource(ERHIFeatureLevel::Type InFeatureLevel);
    virtual ~FRenderResource();

    // The interfaces below may only be called from the render thread.

    // Initializes the dynamic RHI resources and/or RHI render target textures of this resource.
    virtual void InitDynamicRHI() {}
    // Releases the dynamic RHI resources and/or RHI render target textures of this resource.
    virtual void ReleaseDynamicRHI() {}

    // Initializes the RHI resources used by this resource.
    virtual void InitRHI() {}
    // Releases the RHI resources used by this resource.
    virtual void ReleaseRHI() {}

    // Initializes the resource.
    virtual void InitResource();
    // Releases the resource.
    virtual void ReleaseResource();

    // If the RHI resources have been initialized, releases and reinitializes them.
    void UpdateRHI();

    virtual FString GetFriendlyName() const { return TEXT("undefined"); }
    FORCEINLINE bool IsInitialized() const { return ListIndex != INDEX_NONE; }

    static void InitPreRHIResources();

private:
    // Global resource list (static).
    static TArray<FRenderResource*>& GetResourceList();
    static FThreadSafeCounter ResourceListIterationActive;

    int32 ListIndex;
    TEnumAsByte<ERHIFeatureLevel::Type> FeatureLevel;

    (......)
};

The game thread uses the following interfaces to send FRenderResource operations to the render thread:

// Initialize / update / release a resource.
extern RENDERCORE_API void BeginInitResource(FRenderResource* Resource);
extern RENDERCORE_API void BeginUpdateResourceRHI(FRenderResource* Resource);
extern RENDERCORE_API void BeginReleaseResource(FRenderResource* Resource);
extern RENDERCORE_API void StartBatchedRelease();
extern RENDERCORE_API void EndBatchedRelease();
extern RENDERCORE_API void ReleaseResourceAndFlush(FRenderResource* Resource);
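
The Begin* functions above do not operate on the resource on the calling thread; they hand the work over to the render thread. The following is a minimal, single-threaded sketch of that idea, not the engine's implementation. `FRenderCommandQueue`, `FDemoResource`, and `BeginInitDemoResource` are hypothetical names invented for illustration, and a plain `std::queue` stands in for the real render command mechanism.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for the render command queue (not an engine type).
class FRenderCommandQueue
{
public:
    // Called on the "game thread": records work without running it.
    void Enqueue(std::function<void()> Command) { Commands.push(std::move(Command)); }

    // Called on the "render thread": runs everything recorded so far, in order.
    void Flush()
    {
        while (!Commands.empty())
        {
            std::function<void()> Command = std::move(Commands.front());
            Commands.pop();
            Command();
        }
    }

private:
    std::queue<std::function<void()>> Commands;
};

// Hypothetical resource that mirrors the ListIndex-based IsInitialized() check.
class FDemoResource
{
public:
    static constexpr int IndexNone = -1;

    void InitResource(std::vector<FDemoResource*>& ResourceList)
    {
        if (ListIndex == IndexNone)
        {
            ListIndex = static_cast<int>(ResourceList.size());
            ResourceList.push_back(this);
        }
    }

    void ReleaseResource(std::vector<FDemoResource*>& ResourceList)
    {
        if (ListIndex != IndexNone)
        {
            ResourceList[ListIndex] = nullptr; // leave a hole so other indices stay valid
            ListIndex = IndexNone;
        }
    }

    bool IsInitialized() const { return ListIndex != IndexNone; }

private:
    int ListIndex = IndexNone;
};

// Hypothetical analogue of BeginInitResource: defer initialization to the render thread.
inline void BeginInitDemoResource(FRenderCommandQueue& Queue,
                                  std::vector<FDemoResource*>& ResourceList,
                                  FDemoResource* Resource)
{
    Queue.Enqueue([&ResourceList, Resource]() { Resource->InitResource(ResourceList); });
}
```

In this sketch the resource stays uninitialized until `Flush()` runs, which is the point of the deferred design: the game thread only records intent, and the thread that owns the rendering state performs the work.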

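FRenderResource is only a base class that defines a common set of behaviors for rendering resources; the actual data and logic are implemented by its subclasses. The subclasses are numerous and their hierarchy is fairly complex. Some of the important ones are defined below: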

// Engine\Source\Runtime\RenderCore\Public\RenderResource.h

// Texture resource.
class FTexture : public FRenderResource
{
public:
    FTextureRHIRef TextureRHI;                       // RHI resource of the texture.
    FSamplerStateRHIRef SamplerStateRHI;             // Sampler RHI resource of the texture.
    FSamplerStateRHIRef DeferredPassSamplerStateRHI; // Sampler RHI resource for deferred passes.

    mutable double LastRenderTime; // Time of the last render.
    FMipBiasFade MipBiasFade;      // Mip bias used to fade in/out.
    bool bGreyScaleFormat;         // Grayscale texture.
    bool bIgnoreGammaConversions;  // Whether to ignore gamma conversions.
    bool bSRGB;                    // Whether the colors are in sRGB space.

    virtual uint32 GetSizeX() const;
    virtual uint32 GetSizeY() const;
    virtual uint32 GetSizeZ() const;

    // Releases the resources.
    virtual void ReleaseRHI() override
    {
        TextureRHI.SafeRelease();
        SamplerStateRHI.SafeRelease();
        DeferredPassSamplerStateRHI.SafeRelease();
    }
    virtual FString GetFriendlyName() const override { return TEXT("FTexture"); }

    (......)

protected:
    RENDERCORE_API static FRHISamplerState* GetOrCreateSamplerState(const FSamplerStateInitializerRHI& Initializer);
};

// Texture resource that also carries an SRV/UAV.
class FTextureWithSRV : public FTexture
{
public:
    // SRV that accesses the entire texture.
    FShaderResourceViewRHIRef ShaderResourceViewRHI;
    // UAV that accesses the entire texture.
    FUnorderedAccessViewRHIRef UnorderedAccessViewRHI;

    virtual void ReleaseRHI() override;
};

// Rendering resource that holds a reference to an RHI texture resource.
class RENDERCORE_API FTextureReference : public FRenderResource
{
public:
    // Reference to the RHI texture resource.
    FTextureReferenceRHIRef TextureReferenceRHI;

    // FRenderResource interface.
    virtual void InitRHI();
    virtual void ReleaseRHI();

    (......)
};

class RENDERCORE_API FVertexBuffer : public FRenderResource
{
public:
    // Reference to the RHI vertex buffer resource.
    FVertexBufferRHIRef VertexBufferRHI;

    virtual void ReleaseRHI() override;

    (......);
};

class RENDERCORE_API FVertexBufferWithSRV : public FVertexBuffer
{
public:
    // SRV/UAV that access the entire buffer.
    FShaderResourceViewRHIRef ShaderResourceViewRHI;
    FUnorderedAccessViewRHIRef UnorderedAccessViewRHI;

    (......)
};

// Index buffer.
class FIndexBuffer : public FRenderResource
{
public:
    // RHI resource of the index buffer.
    FIndexBufferRHIRef IndexBufferRHI;

    (......)
};

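As the definitions above show, each FRenderResource subclass wraps the corresponding RHI resource type so that the render thread can pass game-thread data and operations on to the RHI thread (or module). The UML diagram below shows part of the FRenderResource inheritance hierarchy: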

classDiagram-v2
FRHIResource <-- FRenderResource
FRenderResource <|-- FTextureReference
FRenderResource <|-- FTexture
FTexture <|-- FTextureWithSRV
FTexture <|-- FTextureResource

FTextureResource <|-- FStaticShadowDepthMap
FTextureResource <|-- FTexture2DDynamicResource
FTextureResource <|-- FTextureRenderTargetResource
FTextureRenderTargetResource <|-- FTextureRenderTarget2DResource
FTextureRenderTargetResource <|-- FTextureRenderTargetCubeResource

FRenderResource <|-- FVertexBuffer

FVertexBuffer <|-- FTangentsVertexBuffer
FVertexBuffer <|-- FVertexBufferWithSRV
FVertexBuffer <|-- FColorVertexBuffer
FVertexBuffer <|-- FPositionVertexBuffer
FVertexBuffer <|-- FSkinWeightDataVertexBuffer

FRenderResource <|-- FIndexBuffer
FIndexBuffer <|-- FDynamicMeshIndexBuffer16
FIndexBuffer <|-- FDynamicMeshIndexBuffer32
FIndexBuffer <|-- FRawIndexBuffer
FIndexBuffer <|-- FRawStaticIndexBuffer

FVertexBufferWithSRV <|-- FWhiteVertexBuffer
FVertexBufferWithSRV <|-- FEmptyVertexBuffer

class FRenderResource{
InitDynamicRHI()
ReleaseDynamicRHI()
InitRHI()
ReleaseRHI()
InitResource()
ReleaseResource()
UpdateRHI()
}

class FTexture{
FTextureRHIRef TextureRHI;
FSamplerStateRHIRef SamplerStateRHI;
}
class FTextureWithSRV{
FShaderResourceViewRHIRef ShaderResourceViewRHI;
FUnorderedAccessViewRHIRef UnorderedAccessViewRHI;
}
class FTextureReference{
FTextureReferenceRHIRef TextureReferenceRHI;
}
class FVertexBuffer{
FVertexBufferRHIRef VertexBufferRHI;
}
class FVertexBufferWithSRV{
FShaderResourceViewRHIRef ShaderResourceViewRHI;
FUnorderedAccessViewRHIRef UnorderedAccessViewRHI;
}
class FIndexBuffer{
FIndexBufferRHIRef IndexBufferRHI;
}


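To reiterate, the diagram above covers only part of the FRenderResource inheritance hierarchy; the full hierarchy is too large to draw. FRenderResource has an extensive hierarchy of subclasses to meet the complex and varied resource needs of UE's rendering system.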

10.2.2 FRHIResource

FRHIResource abstracts a GPU-side resource and is the parent class of the many RHI resource types. It is defined as follows:

// Engine\Source\Runtime\RHI\Public\RHIResources.h

class RHI_API FRHIResource
{
public:
    FRHIResource(bool InbDoNotDeferDelete = false);
    virtual ~FRHIResource();

    // Reference counting of the resource.
    uint32 AddRef() const;
    uint32 Release() const
    {
        int32 NewValue = NumRefs.Decrement();
        if (NewValue == 0)
        {
            if (!DeferDelete())
            {
                delete this;
            }
            else
            {
                // Add to the pending-deletion list.
                if (FPlatformAtomics::InterlockedCompareExchange(&MarkedForDelete, 1, 0) == 0)
                {
                    PendingDeletes.Push(const_cast<FRHIResource*>(this));
                }
            }
        }
        return uint32(NewValue);
    }
    uint32 GetRefCount() const;

    // Static interfaces.
    static void FlushPendingDeletes(bool bFlushDeferredDeletes = false);
    static bool PlatformNeedsExtraDeletionLatency();
    static bool Bypass();

    void DoNoDeferDelete();
    // Transient resource tracking.
    void SetCommitted(bool bInCommitted);
    bool IsCommitted() const;
    bool IsValid() const;

private:
    // Runtime flags and data.
    mutable FThreadSafeCounter NumRefs;
    mutable int32 MarkedForDelete;
    bool bDoNotDeferDelete;
    bool bCommitted;

    // Resources pending deletion.
    static TLockFreePointerListUnordered<FRHIResource, PLATFORM_CACHE_LINE_SIZE> PendingDeletes;
    // Resource currently being deleted.
    static FRHIResource* CurrentlyDeleting;

    bool DeferDelete() const;

    // Some APIs do no internal reference counting, so resources must wait a few extra frames
    // before deletion to ensure the GPU has completely finished with them. This avoids expensive fences, etc.
    struct ResourcesToDelete
    {
        TArray<FRHIResource*> Resources; // Resources to delete.
        uint32 FrameDeleted;             // Frame at which the batch was queued, used to count the frames waited.

        (......)
    };

    // Queue of resources whose deletion is deferred.
    static TArray<ResourcesToDelete> DeferredDeletionQueue;
    static uint32 CurrentFrame;
};
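
The Release() path in the definition above queues an unreferenced object for later deletion instead of always freeing it on the spot. Below is a minimal sketch of that pattern using only the standard library (C++17). `FDemoRHIResource` and its members are hypothetical names; a mutex-protected `std::vector` stands in for the lock-free list, and the extra per-frame latency queue is not modeled. Re-checking the reference count during the flush is a choice made for this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical simplified resource; names echo FRHIResource but this is not engine code.
class FDemoRHIResource
{
public:
    explicit FDemoRHIResource(bool bInDoNotDeferDelete = false)
        : bDoNotDeferDelete(bInDoNotDeferDelete) {}
    virtual ~FDemoRHIResource() { ++NumDestroyed; }

    unsigned AddRef() const { return static_cast<unsigned>(++NumRefs); }

    unsigned Release() const
    {
        const int NewValue = --NumRefs;
        if (NewValue == 0)
        {
            if (bDoNotDeferDelete)
            {
                delete this; // immediate deletion
            }
            else
            {
                // Queue the object at most once, even if it is released repeatedly.
                int Expected = 0;
                if (MarkedForDelete.compare_exchange_strong(Expected, 1))
                {
                    std::lock_guard<std::mutex> Lock(PendingMutex);
                    PendingDeletes.push_back(const_cast<FDemoRHIResource*>(this));
                }
            }
        }
        return static_cast<unsigned>(NewValue);
    }

    // Deletes queued resources that are still unreferenced; returns how many were deleted.
    static int FlushPendingDeletes()
    {
        std::vector<FDemoRHIResource*> ToDelete;
        {
            std::lock_guard<std::mutex> Lock(PendingMutex);
            ToDelete.swap(PendingDeletes);
        }
        int NumDeleted = 0;
        for (FDemoRHIResource* Resource : ToDelete)
        {
            Resource->MarkedForDelete = 0;
            if (Resource->NumRefs == 0)
            {
                delete Resource;
                ++NumDeleted;
            }
        }
        return NumDeleted;
    }

    static inline int NumDestroyed = 0; // demo-only counter of destructor calls

private:
    mutable std::atomic<int> NumRefs{0};
    mutable std::atomic<int> MarkedForDelete{0};
    bool bDoNotDeferDelete;

    static inline std::mutex PendingMutex;
    static inline std::vector<FDemoRHIResource*> PendingDeletes;
};
```

Deferring the delete lets the owner choose a safe point, such as a frame boundary, at which the GPU is known to be done with the object.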

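In summary, FRHIResource provides several facilities: reference counting, deferred deletion and its tracking, and runtime data and flags. It has a large number of subclasses, chiefly: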

// Engine\Source\Runtime\RHI\Public\RHIResources.h

// State block resources

class FRHISamplerState : public FRHIResource
{
public:
    virtual bool IsImmutable() const { return false; }
};
class FRHIRasterizerState : public FRHIResource
{
public:
    virtual bool GetInitializer(struct FRasterizerStateInitializerRHI& Init) { return false; }
};
class FRHIDepthStencilState : public FRHIResource
{
public:
    virtual bool GetInitializer(struct FDepthStencilStateInitializerRHI& Init) { return false; }
};
class FRHIBlendState : public FRHIResource
{
public:
    virtual bool GetInitializer(class FBlendStateInitializerRHI& Init) { return false; }
};

// Shader binding resources.

typedef TArray<struct FVertexElement,TFixedAllocator<MaxVertexElementCount> > FVertexDeclarationElementList;
class FRHIVertexDeclaration : public FRHIResource
{
public:
    virtual bool GetInitializer(FVertexDeclarationElementList& Init) { return false; }
};

class FRHIBoundShaderState : public FRHIResource {};

// Shaders

class FRHIShader : public FRHIResource
{
public:
    void SetHash(FSHAHash InHash);
    FSHAHash GetHash() const;
    explicit FRHIShader(EShaderFrequency InFrequency);
    inline EShaderFrequency GetFrequency() const;

private:
    FSHAHash Hash;
    EShaderFrequency Frequency;
};

class FRHIGraphicsShader : public FRHIShader
{
public:
    explicit FRHIGraphicsShader(EShaderFrequency InFrequency) : FRHIShader(InFrequency) {}
};

class FRHIVertexShader : public FRHIGraphicsShader
{
public:
    FRHIVertexShader() : FRHIGraphicsShader(SF_Vertex) {}
};

class FRHIHullShader : public FRHIGraphicsShader
{
public:
    FRHIHullShader() : FRHIGraphicsShader(SF_Hull) {}
};

class FRHIDomainShader : public FRHIGraphicsShader
{
public:
    FRHIDomainShader() : FRHIGraphicsShader(SF_Domain) {}
};

class FRHIPixelShader : public FRHIGraphicsShader
{
public:
    FRHIPixelShader() : FRHIGraphicsShader(SF_Pixel) {}
};

class FRHIGeometryShader : public FRHIGraphicsShader
{
public:
    FRHIGeometryShader() : FRHIGraphicsShader(SF_Geometry) {}
};

class RHI_API FRHIComputeShader : public FRHIShader
{
public:
    FRHIComputeShader() : FRHIShader(SF_Compute), Stats(nullptr) {}

    inline void SetStats(struct FPipelineStateStats* Ptr) { Stats = Ptr; }
    void UpdateStats();

private:
    struct FPipelineStateStats* Stats;
};

// Pipeline states

class FRHIGraphicsPipelineState : public FRHIResource {};
class FRHIComputePipelineState : public FRHIResource {};
class FRHIRayTracingPipelineState : public FRHIResource {};

// Buffers.

class FRHIUniformBuffer : public FRHIResource
{
public:
    FRHIUniformBuffer(const FRHIUniformBufferLayout& InLayout);

    FORCEINLINE_DEBUGGABLE uint32 AddRef() const;
    FORCEINLINE_DEBUGGABLE uint32 Release() const;
    uint32 GetSize() const;
    const FRHIUniformBufferLayout& GetLayout() const;
    bool HasStaticSlot() const;

private:
    const FRHIUniformBufferLayout* Layout;
    uint32 LayoutConstantBufferSize;
};

class FRHIIndexBuffer : public FRHIResource
{
public:
    FRHIIndexBuffer(uint32 InStride,uint32 InSize,uint32 InUsage);

    uint32 GetStride() const;
    uint32 GetSize() const;
    uint32 GetUsage() const;

protected:
    FRHIIndexBuffer();

    void Swap(FRHIIndexBuffer& Other);
    void ReleaseUnderlyingResource();

private:
    uint32 Stride;
    uint32 Size;
    uint32 Usage;
};

class FRHIVertexBuffer : public FRHIResource
{
public:
    FRHIVertexBuffer(uint32 InSize,uint32 InUsage);
    uint32 GetSize() const;
    uint32 GetUsage() const;

protected:
    FRHIVertexBuffer();
    void Swap(FRHIVertexBuffer& Other);
    void ReleaseUnderlyingResource();

private:
    uint32 Size;
    // e.g. BUF_UnorderedAccess
    uint32 Usage;
};

class FRHIStructuredBuffer : public FRHIResource
{
public:
    FRHIStructuredBuffer(uint32 InStride,uint32 InSize,uint32 InUsage);

    uint32 GetStride() const;
    uint32 GetSize() const;
    uint32 GetUsage() const;

private:
    uint32 Stride;
    uint32 Size;
    uint32 Usage;
};

// Textures

class FRHITexture : public FRHIResource
{
public:
    FRHITexture(uint32 InNumMips, uint32 InNumSamples, EPixelFormat InFormat, uint32 InFlags, FLastRenderTimeContainer* InLastRenderTime, const FClearValueBinding& InClearValue);

    // Dynamic type conversion interfaces.
    virtual class FRHITexture2D* GetTexture2D();
    virtual class FRHITexture2DArray* GetTexture2DArray();
    virtual class FRHITexture3D* GetTexture3D();
    virtual class FRHITextureCube* GetTextureCube();
    virtual class FRHITextureReference* GetTextureReference();

    virtual FIntVector GetSizeXYZ() const = 0;
    // Returns the platform-specific native resource pointer.
    virtual void* GetNativeResource() const;
    virtual void* GetNativeShaderResourceView() const;
    // Returns the platform-specific RHI texture base class.
    virtual void* GetTextureBaseRHI();

    // Data accessors.
    uint32 GetNumMips() const;
    EPixelFormat GetFormat();
    uint32 GetFlags() const;
    uint32 GetNumSamples() const;
    bool IsMultisampled() const;
    bool HasClearValue() const;
    FLinearColor GetClearColor() const;
    void GetDepthStencilClearValue(float& OutDepth, uint32& OutStencil) const;
    float GetDepthClearValue() const;
    uint32 GetStencilClearValue() const;
    const FClearValueBinding GetClearBinding() const;
    virtual void GetWriteMaskProperties(void*& OutData, uint32& OutSize);

    (......)

    // RHI resource info.
    FRHIResourceInfo ResourceInfo;

private:
    // Texture data.
    FClearValueBinding ClearValue;
    uint32 NumMips;
    uint32 NumSamples;
    EPixelFormat Format;
    uint32 Flags;
    FLastRenderTimeContainer& LastRenderTime;
    FLastRenderTimeContainer DefaultLastRenderTime;
    FName TextureName;
};

// 2D RHI texture.
class FRHITexture2D : public FRHITexture
{
public:
    FRHITexture2D(uint32 InSizeX,uint32 InSizeY,uint32 InNumMips,uint32 InNumSamples,EPixelFormat InFormat,uint32 InFlags, const FClearValueBinding& InClearValue);

    virtual FRHITexture2D* GetTexture2D() { return this; }

    uint32 GetSizeX() const { return SizeX; }
    uint32 GetSizeY() const { return SizeY; }
    inline FIntPoint GetSizeXY() const;
    virtual FIntVector GetSizeXYZ() const override;

private:
    uint32 SizeX;
    uint32 SizeY;
};

// 2D RHI texture array.
class FRHITexture2DArray : public FRHITexture2D
{
public:
    FRHITexture2DArray(uint32 InSizeX,uint32 InSizeY,uint32 InSizeZ,uint32 InNumMips,uint32 NumSamples, EPixelFormat InFormat,uint32 InFlags, const FClearValueBinding& InClearValue);

    virtual FRHITexture2DArray* GetTexture2DArray() { return this; }
    virtual FRHITexture2D* GetTexture2D() { return NULL; }

    uint32 GetSizeZ() const { return SizeZ; }
    virtual FIntVector GetSizeXYZ() const final override;

private:
    uint32 SizeZ;
};

// 3D RHI texture.
class FRHITexture3D : public FRHITexture
{
public:
    FRHITexture3D(uint32 InSizeX,uint32 InSizeY,uint32 InSizeZ,uint32 InNumMips,EPixelFormat InFormat,uint32 InFlags, const FClearValueBinding& InClearValue);

    virtual FRHITexture3D* GetTexture3D() { return this; }
    uint32 GetSizeX() const { return SizeX; }
    uint32 GetSizeY() const { return SizeY; }
    uint32 GetSizeZ() const { return SizeZ; }
    virtual FIntVector GetSizeXYZ() const final override;

private:
    uint32 SizeX;
    uint32 SizeY;
    uint32 SizeZ;
};

// Cube RHI texture.
class FRHITextureCube : public FRHITexture
{
public:
    FRHITextureCube(uint32 InSize,uint32 InNumMips,EPixelFormat InFormat,uint32 InFlags, const FClearValueBinding& InClearValue);

    virtual FRHITextureCube* GetTextureCube();
    uint32 GetSize() const;
    virtual FIntVector GetSizeXYZ() const final override;

private:
    uint32 Size;
};

// Texture reference.
class FRHITextureReference : public FRHITexture
{
public:
    explicit FRHITextureReference(FLastRenderTimeContainer* InLastRenderTime);

    virtual FRHITextureReference* GetTextureReference() override { return this; }
    inline FRHITexture* GetReferencedTexture() const;
    // Sets the referenced texture.
    void SetReferencedTexture(FRHITexture* InTexture);
    virtual FIntVector GetSizeXYZ() const final override;

private:
    // The referenced texture resource.
    TRefCountPtr<FRHITexture> ReferencedTexture;
};

class FRHITextureReferenceNullImpl : public FRHITextureReference
{
public:
    FRHITextureReferenceNullImpl();

    void SetReferencedTexture(FRHITexture* InTexture)
    {
        FRHITextureReference::SetReferencedTexture(InTexture);
    }
};

// Miscellaneous resources.

// Timestamp calibration query.
class FRHITimestampCalibrationQuery : public FRHIResource
{
public:
    uint64 GPUMicroseconds = 0;
    uint64 CPUMicroseconds = 0;
};

// GPU fence class. Granularity differs per RHI, i.e. it may only represent command-buffer granularity.
// RHI-specific fences derive from this to implement real GPU->CPU fences.
// The default implementation always returns false from Poll until the frame after the fence was inserted,
// because not every API has a GPU/CPU sync object and one has to be faked.
class FRHIGPUFence : public FRHIResource
{
public:
    FRHIGPUFence(FName InName) : FenceName(InName) {}
    virtual ~FRHIGPUFence() {}

    virtual void Clear() = 0;
    // Polls the fence to see whether the GPU has signaled it. Returns true if so.
    virtual bool Poll() const = 0;
    // Polls a subset of the GPUs.
    virtual bool Poll(FRHIGPUMask GPUMask) const { return Poll(); }
    // Number of pending write commands.
    FThreadSafeCounter NumPendingWriteCommands;

protected:
    FName FenceName;
};

// Generic FRHIGPUFence implementation.
class RHI_API FGenericRHIGPUFence : public FRHIGPUFence
{
public:
    FGenericRHIGPUFence(FName InName);

    virtual void Clear() final override;
    virtual bool Poll() const final override;
    void WriteInternal();

private:
    uint32 InsertedFrameNumber;
};

// Render query.
class FRHIRenderQuery : public FRHIResource
{
};

// Pooled render query.
class RHI_API FRHIPooledRenderQuery
{
    TRefCountPtr<FRHIRenderQuery> Query;
    FRHIRenderQueryPool* QueryPool = nullptr;

public:
    bool IsValid() const;
    FRHIRenderQuery* GetQuery() const;
    void ReleaseQuery();

    (.....)
};

// Render query pool.
class FRHIRenderQueryPool : public FRHIResource
{
public:
    virtual ~FRHIRenderQueryPool() {};
    virtual FRHIPooledRenderQuery AllocateQuery() = 0;

private:
    friend class FRHIPooledRenderQuery;
    virtual void ReleaseQuery(TRefCountPtr<FRHIRenderQuery>&& Query) = 0;
};

// Compute fence.
class FRHIComputeFence : public FRHIResource
{
public:
    FRHIComputeFence(FName InName);

    FORCEINLINE bool GetWriteEnqueued() const;
    virtual void Reset();
    virtual void WriteFence();

private:
    // Whether a write to this fence has been enqueued since creation.
    // Checked at command-creation time, when queueing waits, to catch GPU hangs on the CPU.
    bool bWriteEnqueued;
};

// Viewport.
class FRHIViewport : public FRHIResource
{
public:
    // Returns the platform-specific native swap chain.
    virtual void* GetNativeSwapChain() const { return nullptr; }
    // Returns the native back buffer texture.
    virtual void* GetNativeBackBufferTexture() const { return nullptr; }
    // Returns the native back buffer render target.
    virtual void* GetNativeBackBufferRT() const { return nullptr; }
    // Returns the native window.
    virtual void* GetNativeWindow(void** AddParam = nullptr) const { return nullptr; }

    // Sets the FRHICustomPresent handler on the viewport.
    virtual void SetCustomPresent(class FRHICustomPresent*) {}
    virtual class FRHICustomPresent* GetCustomPresent() const { return nullptr; }

    // Ticks the viewport once per frame on the game thread.
    virtual void Tick(float DeltaTime) {}
};

// Views: UAV/SRV

class FRHIUnorderedAccessView : public FRHIResource {};
class FRHIShaderResourceView : public FRHIResource {};

// Typedefs for the various RHI resource reference types.
typedef TRefCountPtr<FRHISamplerState> FSamplerStateRHIRef;
typedef TRefCountPtr<FRHIRasterizerState> FRasterizerStateRHIRef;
typedef TRefCountPtr<FRHIDepthStencilState> FDepthStencilStateRHIRef;
typedef TRefCountPtr<FRHIBlendState> FBlendStateRHIRef;
typedef TRefCountPtr<FRHIVertexDeclaration> FVertexDeclarationRHIRef;
typedef TRefCountPtr<FRHIVertexShader> FVertexShaderRHIRef;
typedef TRefCountPtr<FRHIHullShader> FHullShaderRHIRef;
typedef TRefCountPtr<FRHIDomainShader> FDomainShaderRHIRef;
typedef TRefCountPtr<FRHIPixelShader> FPixelShaderRHIRef;
typedef TRefCountPtr<FRHIGeometryShader> FGeometryShaderRHIRef;
typedef TRefCountPtr<FRHIComputeShader> FComputeShaderRHIRef;
typedef TRefCountPtr<FRHIRayTracingShader> FRayTracingShaderRHIRef;
typedef TRefCountPtr<FRHIComputeFence> FComputeFenceRHIRef;
typedef TRefCountPtr<FRHIBoundShaderState> FBoundShaderStateRHIRef;
typedef TRefCountPtr<FRHIUniformBuffer> FUniformBufferRHIRef;
typedef TRefCountPtr<FRHIIndexBuffer> FIndexBufferRHIRef;
typedef TRefCountPtr<FRHIVertexBuffer> FVertexBufferRHIRef;
typedef TRefCountPtr<FRHIStructuredBuffer> FStructuredBufferRHIRef;
typedef TRefCountPtr<FRHITexture> FTextureRHIRef;
typedef TRefCountPtr<FRHITexture2D> FTexture2DRHIRef;
typedef TRefCountPtr<FRHITexture2DArray> FTexture2DArrayRHIRef;
typedef TRefCountPtr<FRHITexture3D> FTexture3DRHIRef;
typedef TRefCountPtr<FRHITextureCube> FTextureCubeRHIRef;
typedef TRefCountPtr<FRHITextureReference> FTextureReferenceRHIRef;
typedef TRefCountPtr<FRHIRenderQuery> FRenderQueryRHIRef;
typedef TRefCountPtr<FRHIRenderQueryPool> FRenderQueryPoolRHIRef;
typedef TRefCountPtr<FRHITimestampCalibrationQuery> FTimestampCalibrationQueryRHIRef;
typedef TRefCountPtr<FRHIGPUFence> FGPUFenceRHIRef;
typedef TRefCountPtr<FRHIViewport> FViewportRHIRef;
typedef TRefCountPtr<FRHIUnorderedAccessView> FUnorderedAccessViewRHIRef;
typedef TRefCountPtr<FRHIShaderResourceView> FShaderResourceViewRHIRef;
typedef TRefCountPtr<FRHIGraphicsPipelineState> FGraphicsPipelineStateRHIRef;
typedef TRefCountPtr<FRHIRayTracingPipelineState> FRayTracingPipelineStateRHIRef;

// Generic staging buffer class used by FRHIGPUMemoryReadback.
class FRHIStagingBuffer : public FRHIResource
{
public:
    FRHIStagingBuffer();
    virtual ~FRHIStagingBuffer();
    virtual void *Lock(uint32 Offset, uint32 NumBytes) = 0;
    virtual void Unlock() = 0;
protected:
    bool bIsLocked;
};

class FGenericRHIStagingBuffer : public FRHIStagingBuffer
{
public:
    FGenericRHIStagingBuffer();
    ~FGenericRHIStagingBuffer();
    virtual void* Lock(uint32 Offset, uint32 NumBytes) final override;
    virtual void Unlock() final override;

    FVertexBufferRHIRef ShadowBuffer;
    uint32 Offset;
};

// Custom present.
class FRHICustomPresent : public FRHIResource
{
public:
    FRHICustomPresent() {}
    virtual ~FRHICustomPresent() {}

    // Called when the viewport size changes.
    virtual void OnBackBufferResize() = 0;
    // Called from the render thread to see whether a native present will be requested.
    virtual bool NeedsNativePresent() = 0;
    // Called on the RHI thread to perform the custom present.
    virtual bool Present(int32& InOutSyncInterval) = 0;
    // Called on the RHI thread after Present.
    virtual void PostPresent() {};

    // Called when the rendering thread is acquired.
    virtual void OnAcquireThreadOwnership() {}
    // Called when the rendering thread is released.
    virtual void OnReleaseThreadOwnership() {}
};

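As the listing above shows, FRHIResource has many kinds of subclasses, which fall into state blocks, shader bindings, shaders, pipeline states, buffers, textures, views, and miscellaneous others. Note that the listing only shows the platform-independent base types; each graphics API derives its own types from them. Taking FRHIUniformBuffer as an example, its inheritance hierarchy is as follows: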

classDiagram-v2
FRHIResource <|-- FRHIUniformBuffer
FRHIUniformBuffer <|-- FD3D11UniformBuffer
FRHIUniformBuffer <|-- FD3D12UniformBuffer
FRHIUniformBuffer <|-- FOpenGLUniformBuffer
FRHIUniformBuffer <|-- FVulkanUniformBuffer
FRHIUniformBuffer <|-- FMetalSuballocatedUniformBuffer
FRHIUniformBuffer <|-- FEmptyUniformBuffer

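The diagram above shows the FRHIUniformBuffer subclasses for D3D11, D3D12, OpenGL, Vulkan, Metal, and other graphics APIs, which implement the platform-specific resources and operations of a uniform buffer. There is also a special empty implementation, FEmptyUniformBuffer.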
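
The split between a platform-independent base type and per-API subclasses can be sketched as follows. All names here (`FDemoUniformBuffer`, `FBackendAUniformBuffer`, `IDemoDynamicRHI`, `FBackendARHI`) are hypothetical, and the integer "native handle" is a placeholder for whatever object a real graphics API would hand out. The sketch assumes the backend that created an object is also the one that later consumes it, so it can downcast with `static_cast`.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Platform-independent base (hypothetical; loosely modeled on the role of FRHIUniformBuffer).
class FDemoUniformBuffer
{
public:
    explicit FDemoUniformBuffer(std::uint32_t InSize) : Size(InSize) {}
    virtual ~FDemoUniformBuffer() = default;
    std::uint32_t GetSize() const { return Size; }

private:
    std::uint32_t Size;
};

// Hypothetical "backend A" subclass that owns a made-up native handle.
class FBackendAUniformBuffer final : public FDemoUniformBuffer
{
public:
    FBackendAUniformBuffer(std::uint32_t InSize, int InNativeHandle)
        : FDemoUniformBuffer(InSize), NativeHandle(InNativeHandle) {}

    int NativeHandle;
};

// Hypothetical backend interface that callers program against.
class IDemoDynamicRHI
{
public:
    virtual ~IDemoDynamicRHI() = default;
    virtual std::unique_ptr<FDemoUniformBuffer> CreateUniformBuffer(std::uint32_t Size) = 0;
    virtual std::string DescribeBinding(const FDemoUniformBuffer& Buffer) const = 0;
};

class FBackendARHI final : public IDemoDynamicRHI
{
public:
    std::unique_ptr<FDemoUniformBuffer> CreateUniformBuffer(std::uint32_t Size) override
    {
        return std::make_unique<FBackendAUniformBuffer>(Size, NextHandle++);
    }

    std::string DescribeBinding(const FDemoUniformBuffer& Buffer) const override
    {
        // This backend created the object, so it knows the concrete type and can downcast.
        const auto& Native = static_cast<const FBackendAUniformBuffer&>(Buffer);
        return "A:handle=" + std::to_string(Native.NativeHandle) +
               ",size=" + std::to_string(Native.GetSize());
    }

private:
    int NextHandle = 1;
};
```

Callers only ever see `FDemoUniformBuffer` and `IDemoDynamicRHI`, which is what allows the same calling code to run against a different backend.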

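Like FRHIUniformBuffer, the other direct and indirect subclasses of FRHIResource must also be implemented by subclasses specific to a graphics API or operating system in order to support rendering on that platform. The UML diagram below shows the inheritance hierarchy of the most complex case, the texture resource classes: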

classDiagram-v2
FRHIResource <|-- FRHITexture
FRHITexture <|-- FRHITexture2D
FRHITexture2D <|-- FRHITexture2DArray
FRHITexture <|-- FRHITexture3D
FRHITexture <|-- FRHITextureCube
FRHITexture <|-- FRHITextureReference
FRHITextureReference <|-- FRHITextureReferenceNullImpl

FRHITexture2D <|-- FMetalTexture2D
FRHITexture2D <|-- FD3D12BaseTexture2D
FRHITexture2D <|-- FOpenGLBaseTexture2D
FRHITexture2D <|-- FVulkanTexture2D
FRHITexture2D <|-- FD3D11BaseTexture2D
FRHITexture2D <|-- FEmptyTexture2D


Note that the diagram above is simplified: besides FRHITexture2D, the other texture types (such as FRHITexture2DArray, FRHITexture3D, FRHITextureCube, and FRHITextureReference) are also subclassed and implemented by each platform.
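
The GetTexture2D()/GetTextureCube() style accessors on FRHITexture act as a hand-written type query: the base class answers "not that type" and each subclass overrides the one accessor that matches itself. A minimal sketch of the pattern, with hypothetical `FDemoTexture*` classes and a hypothetical `CountTexelsPerSlice` helper, is shown below. For simplicity the base accessors return `nullptr` here.

```cpp
#include <cassert>

class FDemoTexture2D;
class FDemoTextureCube;

// Hypothetical base texture: every typed accessor defaults to "not that type".
class FDemoTexture
{
public:
    virtual ~FDemoTexture() = default;
    virtual FDemoTexture2D* GetTexture2D() { return nullptr; }
    virtual FDemoTextureCube* GetTextureCube() { return nullptr; }
};

class FDemoTexture2D : public FDemoTexture
{
public:
    FDemoTexture2D(unsigned InSizeX, unsigned InSizeY) : SizeX(InSizeX), SizeY(InSizeY) {}
    FDemoTexture2D* GetTexture2D() override { return this; }
    unsigned GetSizeX() const { return SizeX; }
    unsigned GetSizeY() const { return SizeY; }

private:
    unsigned SizeX;
    unsigned SizeY;
};

class FDemoTextureCube : public FDemoTexture
{
public:
    explicit FDemoTextureCube(unsigned InSize) : Size(InSize) {}
    FDemoTextureCube* GetTextureCube() override { return this; }
    unsigned GetSize() const { return Size; }

private:
    unsigned Size;
};

// Returns the texel count of one slice or face, dispatching without dynamic_cast.
inline unsigned CountTexelsPerSlice(FDemoTexture& Texture)
{
    if (FDemoTexture2D* Tex2D = Texture.GetTexture2D())
    {
        return Tex2D->GetSizeX() * Tex2D->GetSizeY();
    }
    if (FDemoTextureCube* Cube = Texture.GetTextureCube())
    {
        return Cube->GetSize() * Cube->GetSize();
    }
    return 0;
}
```

Each query costs a single virtual call and does not depend on RTTI, at the price of one accessor per type in the base class.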

10.2.3 FRHICommand

FRHICommand is the base class of the RHI module's rendering commands. These commands are usually pushed by the render thread to the RHI thread through a command queue and executed by the RHI thread at an appropriate time. FRHICommand in turn inherits from FRHICommandBase. They are defined as follows:

// Engine\Source\Runtime\RHI\Public\RHICommandList.h

// Base class of RHI commands.
struct FRHICommandBase
{
    // The next command (this struct is a node of the command linked list).
    FRHICommandBase* Next = nullptr;

    // Executes the command, then destroys it.
    virtual void ExecuteAndDestruct(FRHICommandListBase& CmdList, FRHICommandListDebugContext& DebugContext) = 0;
};

template<typename TCmd, typename NameType = FUnnamedRhiCommand>
struct FRHICommand : public FRHICommandBase
{
    // Executes the command, then destroys it.
    void ExecuteAndDestruct(FRHICommandListBase& CmdList, FRHICommandListDebugContext& Context) override final
    {
        TCmd *ThisCmd = static_cast<TCmd*>(this);
        ThisCmd->Execute(CmdList);
        ThisCmd->~TCmd();
    }
};

Notably, FRHICommandBase has a Next member that points to the next node, which means FRHICommandBase is a node in a linked list of commands. FRHICommand has a large number of subclasses, which are declared quickly through a special macro:

// Macro that defines an RHI command subclass.
// The declared command inherits from FRHICommand.
#define FRHICOMMAND_MACRO(CommandName) \
struct PREPROCESSOR_JOIN(CommandName##String, __LINE__) \
{ \
    static const TCHAR* TStr() { return TEXT(#CommandName); } \
}; \
struct CommandName final : public FRHICommand<CommandName, PREPROCESSOR_JOIN(CommandName##String, __LINE__)>

With this macro, subclasses of FRHICommand (that is, concrete RHI commands) can be defined quickly, for example:

FRHICOMMAND_MACRO(FRHICommandSetStencilRef)
{
    uint32 StencilRef;
    FORCEINLINE_DEBUGGABLE FRHICommandSetStencilRef(uint32 InStencilRef)
        : StencilRef(InStencilRef)
    {
    }
    RHI_API void Execute(FRHICommandListBase& CmdList);
};

After macro expansion, the code looks like this:

struct FRHICommandSetStencilRefString853
{
    static const TCHAR* TStr() { return TEXT("FRHICommandSetStencilRef"); }
};

// FRHICommandSetStencilRef inherits from FRHICommand.
struct FRHICommandSetStencilRef final : public FRHICommand<FRHICommandSetStencilRef, FRHICommandSetStencilRefString853>
{
    uint32 StencilRef;
    FRHICommandSetStencilRef(uint32 InStencilRef)
        : StencilRef(InStencilRef)
    {
    }
    RHI_API void Execute(FRHICommandListBase& CmdList);
};
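
The same declaration trick can be reproduced in standalone C++. The sketch below is a simplified analogue, not the engine macro: `DEMO_JOIN`, `DEMO_COMMAND_MACRO`, `FDemoCommand`, and `FDemoCommandSetStencilRef` are hypothetical names, and plain `char` strings replace `TCHAR`/`TEXT`. The two-step join is needed so that `__LINE__` is expanded to a number before it is pasted onto the name.

```cpp
#include <cassert>
#include <cstring>

// Two-step join: the inner macro pastes only after the arguments have been expanded.
#define DEMO_JOIN_INNER(A, B) A##B
#define DEMO_JOIN(A, B) DEMO_JOIN_INNER(A, B)

struct FDemoUnnamedCommand
{
    static const char* TStr() { return "FDemoUnnamedCommand"; }
};

// CRTP base: the second template parameter only carries a printable name.
template <typename TCmd, typename NameType = FDemoUnnamedCommand>
struct FDemoCommand
{
    static const char* GetName() { return NameType::TStr(); }
};

// Declares a uniquely named string struct, then opens the command struct itself.
#define DEMO_COMMAND_MACRO(CommandName)                                      \
    struct DEMO_JOIN(CommandName##String, __LINE__)                          \
    {                                                                        \
        static const char* TStr() { return #CommandName; }                   \
    };                                                                       \
    struct CommandName final                                                 \
        : public FDemoCommand<CommandName, DEMO_JOIN(CommandName##String, __LINE__)>

// The macro invocation is followed directly by the body of the command struct.
DEMO_COMMAND_MACRO(FDemoCommandSetStencilRef)
{
    unsigned StencilRef;
    explicit FDemoCommandSetStencilRef(unsigned InStencilRef) : StencilRef(InStencilRef) {}
};
```

Carrying the name as a type rather than as a data member means each command instance pays no per-object storage for it.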

A great many RHI commands are declared with FRHICOMMAND_MACRO. Some of them are listed below:

FRHICOMMAND_MACRO(FRHISyncFrameCommand)
FRHICOMMAND_MACRO(FRHICommandStat)
FRHICOMMAND_MACRO(FRHICommandRHIThreadFence)
FRHICOMMAND_MACRO(FRHIAsyncComputeSubmitList)
FRHICOMMAND_MACRO(FRHICommandSubmitSubList)

FRHICOMMAND_MACRO(FRHICommandWaitForAndSubmitSubListParallel)
FRHICOMMAND_MACRO(FRHICommandWaitForAndSubmitSubList)
FRHICOMMAND_MACRO(FRHICommandWaitForAndSubmitRTSubList)
FRHICOMMAND_MACRO(FRHICommandWaitForTemporalEffect)
FRHICOMMAND_MACRO(FRHICommandBroadcastTemporalEffect)

FRHICOMMAND_MACRO(FRHICommandBeginUpdateMultiFrameResource)
FRHICOMMAND_MACRO(FRHICommandEndUpdateMultiFrameResource)
FRHICOMMAND_MACRO(FRHICommandBeginUpdateMultiFrameUAV)
FRHICOMMAND_MACRO(FRHICommandEndUpdateMultiFrameUAV)
FRHICOMMAND_MACRO(FRHICommandSetGPUMask)

FRHICOMMAND_MACRO(FRHICommandSetStencilRef)
FRHICOMMAND_MACRO(FRHICommandSetBlendFactor)
FRHICOMMAND_MACRO(FRHICommandSetStreamSource)
FRHICOMMAND_MACRO(FRHICommandSetViewport)
FRHICOMMAND_MACRO(FRHICommandSetScissorRect)

FRHICOMMAND_MACRO(FRHICommandBeginRenderPass)
FRHICOMMAND_MACRO(FRHICommandEndRenderPass)
FRHICOMMAND_MACRO(FRHICommandNextSubpass)
FRHICOMMAND_MACRO(FRHICommandBeginParallelRenderPass)
FRHICOMMAND_MACRO(FRHICommandEndParallelRenderPass)
FRHICOMMAND_MACRO(FRHICommandBeginRenderSubPass)
FRHICOMMAND_MACRO(FRHICommandEndRenderSubPass)

FRHICOMMAND_MACRO(FRHICommandDrawPrimitive)
FRHICOMMAND_MACRO(FRHICommandDrawIndexedPrimitive)
FRHICOMMAND_MACRO(FRHICommandDrawPrimitiveIndirect)
FRHICOMMAND_MACRO(FRHICommandDrawIndexedIndirect)
FRHICOMMAND_MACRO(FRHICommandDrawIndexedPrimitiveIndirect)

FRHICOMMAND_MACRO(FRHICommandSetGraphicsPipelineState)
FRHICOMMAND_MACRO(FRHICommandBeginUAVOverlap)
FRHICOMMAND_MACRO(FRHICommandEndUAVOverlap)

FRHICOMMAND_MACRO(FRHICommandSetDepthBounds)
FRHICOMMAND_MACRO(FRHICommandSetShadingRate)
FRHICOMMAND_MACRO(FRHICommandSetShadingRateImage)
FRHICOMMAND_MACRO(FRHICommandClearUAVFloat)
FRHICOMMAND_MACRO(FRHICommandCopyToResolveTarget)
FRHICOMMAND_MACRO(FRHICommandCopyTexture)
FRHICOMMAND_MACRO(FRHICommandBeginTransitions)
FRHICOMMAND_MACRO(FRHICommandEndTransitions)
FRHICOMMAND_MACRO(FRHICommandResourceTransition)
FRHICOMMAND_MACRO(FRHICommandClearColorTexture)
FRHICOMMAND_MACRO(FRHICommandClearDepthStencilTexture)
FRHICOMMAND_MACRO(FRHICommandClearColorTextures)

FRHICOMMAND_MACRO(FRHICommandSetGlobalUniformBuffers)
FRHICOMMAND_MACRO(FRHICommandBuildLocalUniformBuffer)

FRHICOMMAND_MACRO(FRHICommandBeginRenderQuery)
FRHICOMMAND_MACRO(FRHICommandEndRenderQuery)
FRHICOMMAND_MACRO(FRHICommandPollOcclusionQueries)

FRHICOMMAND_MACRO(FRHICommandBeginScene)
FRHICOMMAND_MACRO(FRHICommandEndScene)
FRHICOMMAND_MACRO(FRHICommandBeginFrame)
FRHICOMMAND_MACRO(FRHICommandEndFrame)
FRHICOMMAND_MACRO(FRHICommandBeginDrawingViewport)
FRHICOMMAND_MACRO(FRHICommandEndDrawingViewport)

FRHICOMMAND_MACRO(FRHICommandInvalidateCachedState)
FRHICOMMAND_MACRO(FRHICommandDiscardRenderTargets)

FRHICOMMAND_MACRO(FRHICommandUpdateTextureReference)
FRHICOMMAND_MACRO(FRHICommandUpdateRHIResources)
FRHICOMMAND_MACRO(FRHICommandBackBufferWaitTrackingBeginFrame)
FRHICOMMAND_MACRO(FRHICommandFlushTextureCacheBOP)
FRHICOMMAND_MACRO(FRHICommandCopyBufferRegion)
FRHICOMMAND_MACRO(FRHICommandCopyBufferRegions)

FRHICOMMAND_MACRO(FClearCachedRenderingDataCommand)
FRHICOMMAND_MACRO(FClearCachedElementDataCommand)

FRHICOMMAND_MACRO(FRHICommandRayTraceOcclusion)
FRHICOMMAND_MACRO(FRHICommandRayTraceIntersection)
FRHICOMMAND_MACRO(FRHICommandRayTraceDispatch)
FRHICOMMAND_MACRO(FRHICommandSetRayTracingBindings)
FRHICOMMAND_MACRO(FRHICommandClearRayTracingBindings)

Besides those declared with FRHICOMMAND_MACRO, FRHICommand also has the following directly derived subclasses:

  • FRHICommandSetShaderParameter
  • FRHICommandSetShaderUniformBuffer
  • FRHICommandSetShaderTexture
  • FRHICommandSetShaderResourceViewParameter
  • FRHICommandSetUAVParameter
  • FRHICommandSetShaderSampler
  • FRHICommandSetComputeShader
  • FRHICommandSetComputePipelineState
  • FRHICommandDispatchComputeShader
  • FRHICommandDispatchIndirectComputeShader
  • FRHICommandSetAsyncComputeBudget
  • FRHICommandCopyToStagingBuffer
  • FRHICommandWriteGPUFence
  • FRHICommandSetLocalUniformBuffer
  • FRHICommandSubmitCommandsHint
  • FRHICommandPushEvent
  • FRHICommandPopEvent
  • FRHICommandBuildAccelerationStructure
  • FRHICommandBuildAccelerationStructures
  • ......

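Whether a command derives directly or is declared with FRHICOMMAND_MACRO makes no essential difference: both are subclasses of FRHICommand, and both are intermediate RHI-layer rendering commands available to the render thread. FRHICOMMAND_MACRO is simply more convenient and saves some repetitive code.

RHI commands are therefore highly varied and fall mainly into the following categories:

  • Setting, updating, clearing, transitioning, copying, and reading back data and resources.
  • Drawing primitives.
  • Begin and end events for passes, subpasses, scenes, viewports, and so on.
  • Fences, waits, and broadcast interfaces.
  • Ray tracing.
  • Slate and debugging commands.

The core inheritance hierarchy of FRHICommand is drawn below: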

classDiagram-v2
FRHICommandBase <|-- FRHICommand

class FRHICommandBase{
FRHICommandBase* Next
ExecuteAndDestruct()
}

FRHICommand <|-- FRHICommandDrawPrimitive
FRHICommand <|-- FRHICommandWaitForAndSubmitSubList
FRHICommand <|-- FRHICommandResourceTransition
FRHICommand <|-- etc
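
To make the pieces of this section concrete, here is a minimal sketch of a command list built from an intrusive linked list and a CRTP base whose ExecuteAndDestruct runs the command and then only its destructor. `FDemoCommandBase`, `FDemoCommand`, `FDemoCommandList`, and `FDemoCommandSetStencilRef` are hypothetical names. One heap block per command replaces a real arena allocator, the debug context parameter is dropped, and the sketch assumes commands do not enqueue further commands while executing.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>
#include <vector>

struct FDemoCommandList;

struct FDemoCommandBase
{
    FDemoCommandBase* Next = nullptr; // intrusive singly linked list node
    virtual void ExecuteAndDestruct(FDemoCommandList& CmdList) = 0;
};

template <typename TCmd>
struct FDemoCommand : FDemoCommandBase
{
    void ExecuteAndDestruct(FDemoCommandList& CmdList) final
    {
        TCmd* ThisCmd = static_cast<TCmd*>(this);
        ThisCmd->Execute(CmdList);
        ThisCmd->~TCmd(); // the list owns the memory, so only the destructor runs here
    }
};

struct FDemoCommandList
{
    FDemoCommandList() = default;
    FDemoCommandList(const FDemoCommandList&) = delete;            // CommandLink points into *this
    FDemoCommandList& operator=(const FDemoCommandList&) = delete;

    std::vector<int> Log; // demo-only record of what the commands did

    template <typename TCmd, typename... TArgs>
    void Enqueue(TArgs&&... Args)
    {
        Storage.push_back(std::make_unique<unsigned char[]>(sizeof(TCmd)));
        TCmd* Cmd = new (Storage.back().get()) TCmd(std::forward<TArgs>(Args)...);
        *CommandLink = Cmd;       // append at the tail
        CommandLink = &Cmd->Next; // the new tail's Next becomes the next slot to fill
    }

    void Execute()
    {
        FDemoCommandBase* Cmd = Root;
        while (Cmd != nullptr)
        {
            FDemoCommandBase* Next = Cmd->Next; // read before the command destroys itself
            Cmd->ExecuteAndDestruct(*this);
            Cmd = Next;
        }
        Root = nullptr;
        CommandLink = &Root;
        Storage.clear();
    }

private:
    FDemoCommandBase* Root = nullptr;
    FDemoCommandBase** CommandLink = &Root;
    std::vector<std::unique_ptr<unsigned char[]>> Storage;
};

struct FDemoCommandSetStencilRef final : FDemoCommand<FDemoCommandSetStencilRef>
{
    int StencilRef;
    explicit FDemoCommandSetStencilRef(int InStencilRef) : StencilRef(InStencilRef) {}
    void Execute(FDemoCommandList& CmdList) { CmdList.Log.push_back(StencilRef); }
};
```

Recording and execution are fully separated here: `Enqueue` only constructs and links the command, and nothing runs until `Execute` walks the list in insertion order.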

10.2.4 FRHICommandList

FRHICommandList is the RHI's command queue, used to manage and execute a set of FRHICommand objects. It and its parent classes are defined as follows:

// Engine\Source\Runtime\RHI\Public\RHICommandList.h

// RHI命令列表基類.
class FRHICommandListBase : public FNoncopyable
{
public:
~FRHICommandListBase(); // 附帶了迴圈利用的自定義new/delete操作.
void* operator new(size_t Size);
void operator delete(void *RawMemory); // 重新整理命令佇列.
inline void Flush();
// 是否立即模式.
inline bool IsImmediate();
// 是否立即的非同步計算.
inline bool IsImmediateAsyncCompute(); // 獲取已佔用的記憶體.
const int32 GetUsedMemory() const; // 入隊非同步命令佇列的提交.
void QueueAsyncCommandListSubmit(FGraphEventRef& AnyThreadCompletionEvent, class FRHICommandList* CmdList);
// 入隊並行的非同步命令佇列的提交.
void QueueParallelAsyncCommandListSubmit(FGraphEventRef* AnyThreadCompletionEvents, bool bIsPrepass, class FRHICommandList** CmdLists, int32* NumDrawsIfKnown, int32 Num, int32 MinDrawsPerTranslate, bool bSpewMerge);
// 入隊渲染執行緒命令佇列的提交.
void QueueRenderThreadCommandListSubmit(FGraphEventRef& RenderThreadCompletionEvent, class FRHICommandList* CmdList);
// 入隊命令佇列的提交.
void QueueCommandListSubmit(class FRHICommandList* CmdList);
// 增加派發前序任務.
void AddDispatchPrerequisite(const FGraphEventRef& Prereq); // 等待介面.
void WaitForTasks(bool bKnownToBeComplete = false);
void WaitForDispatch();
void WaitForRHIThreadTasks();
void HandleRTThreadTaskCompletion(const FGraphEventRef& MyCompletionGraphEvent); // 分配介面.
void* Alloc(int32 AllocSize, int32 Alignment);
template <typename T>
void* Alloc();
template <typename T>
const TArrayView<T> AllocArray(const TArrayView<T> InArray);
TCHAR* AllocString(const TCHAR* Name);
// 分配指令.
void* AllocCommand(int32 AllocSize, int32 Alignment);
template <typename TCmd>
void* AllocCommand(); bool HasCommands() const;
bool IsExecuting() const;
bool IsBottomOfPipe() const;
bool IsTopOfPipe() const;
bool IsGraphics() const;
bool IsAsyncCompute() const;
// RHI管線, ERHIPipeline::Graphics或ERHIPipeline::AsyncCompute.
ERHIPipeline GetPipeline() const; // 是否忽略RHI執行緒而直接當同步執行.
bool Bypass() const; // 交換命令佇列.
void ExchangeCmdList(FRHICommandListBase& Other);
// 設定Context.
void SetContext(IRHICommandContext* InContext);
IRHICommandContext& GetContext();
void SetComputeContext(IRHIComputeContext* InComputeContext);
IRHIComputeContext& GetComputeContext();
void CopyContext(FRHICommandListBase& ParentCommandList);
void MaybeDispatchToRHIThread();
void MaybeDispatchToRHIThreadInner();
(......)
private:
// Head of the command linked list.
FRHICommandBase* Root;
// Points to the link slot where the next command will be appended (initially &Root).
FRHICommandBase** CommandLink;
bool bExecuting;
uint32 NumCommands;
uint32 UID;
// Device (graphics) context.
IRHICommandContext* Context;
// Compute context.
IRHIComputeContext* ComputeContext;
FMemStackBase MemManager;
FGraphEventArray RTTasks;
// Reset.
void Reset();
public:
enum class ERenderThreadContext
{
SceneRenderTargets,
Num
};
// Render-thread contexts.
void *RenderThreadContexts[(int32)ERenderThreadContext::Num];
protected:
//the values of this struct must be copied when the commandlist is split
struct FPSOContext
{
uint32 CachedNumSimultanousRenderTargets = 0;
TStaticArray<FRHIRenderTargetView, MaxSimultaneousRenderTargets> CachedRenderTargets;
FRHIDepthRenderTargetView CachedDepthStencilTarget;
ESubpassHint SubpassHint = ESubpassHint::None;
uint8 SubpassIndex = 0;
uint8 MultiViewCount = 0;
bool HasFragmentDensityAttachment = false;
} PSOContext;
// Bound shader inputs.
FBoundShaderStateInput BoundShaderInput;
// RHI resource of the bound compute shader.
FRHIComputeShader* BoundComputeShaderRHI;
// Validate that the given shader is the one currently bound.
void ValidateBoundShader(FRHIVertexShader* ShaderRHI);
void ValidateBoundShader(FRHIPixelShader* ShaderRHI);
(......)
void CacheActiveRenderTargets(...);
void CacheActiveRenderTargets(const FRHIRenderPassInfo& Info);
void IncrementSubpass();
void ResetSubpass(ESubpassHint SubpassHint);
public:
void CopyRenderThreadContexts(const FRHICommandListBase& ParentCommandList);
void SetRenderThreadContext(void* InContext, ERenderThreadContext Slot);
void* GetRenderThreadContext(ERenderThreadContext Slot);
// Common data.
struct FCommonData
{
class FRHICommandListBase* Parent = nullptr;
enum class ECmdListType
{
Immediate = 1,
Regular,
};
ECmdListType Type = ECmdListType::Regular;
bool bInsideRenderPass = false;
bool bInsideComputePass = false;
};
bool DoValidation() const;
inline bool IsOutsideRenderPass() const;
inline bool IsInsideRenderPass() const;
inline bool IsInsideComputePass() const;
FCommonData Data;
};
// Compute command list.
class FRHIComputeCommandList : public FRHICommandListBase
{
public:
FRHIComputeCommandList(FRHIGPUMask GPUMask) : FRHICommandListBase(GPUMask) {}
void* operator new(size_t Size);
void operator delete(void *RawMemory);
// Getting and setting shader parameters.
inline FRHIComputeShader* GetBoundComputeShader() const;
void SetGlobalUniformBuffers(const FUniformBufferStaticBindings& UniformBuffers);
void SetShaderUniformBuffer(FRHIComputeShader* Shader, uint32 BaseIndex, FRHIUniformBuffer* UniformBuffer);
void SetShaderUniformBuffer(const FComputeShaderRHIRef& Shader, uint32 BaseIndex, FRHIUniformBuffer* UniformBuffer);
void SetShaderParameter(FRHIComputeShader* Shader, uint32 BufferIndex, uint32 BaseIndex, uint32 NumBytes, const void* NewValue);
void SetShaderParameter(FComputeShaderRHIRef& Shader, uint32 BufferIndex, uint32 BaseIndex, uint32 NumBytes, const void* NewValue);
void SetShaderTexture(FRHIComputeShader* Shader, uint32 TextureIndex, FRHITexture* Texture);
void SetShaderResourceViewParameter(FRHIComputeShader* Shader, uint32 SamplerIndex, FRHIShaderResourceView* SRV);
void SetShaderSampler(FRHIComputeShader* Shader, uint32 SamplerIndex, FRHISamplerState* State);
void SetUAVParameter(FRHIComputeShader* Shader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV);
void SetUAVParameter(FRHIComputeShader* Shader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV, uint32 InitialCount);
void SetComputeShader(FRHIComputeShader* ComputeShader);
void SetComputePipelineState(FComputePipelineState* ComputePipelineState, FRHIComputeShader* ComputeShader);
void SetAsyncComputeBudget(EAsyncComputeBudget Budget);
// Dispatch compute shaders.
void DispatchComputeShader(uint32 ThreadGroupCountX, uint32 ThreadGroupCountY, uint32 ThreadGroupCountZ);
void DispatchIndirectComputeShader(FRHIVertexBuffer* ArgumentBuffer, uint32 ArgumentOffset);
// Clears.
void ClearUAVFloat(FRHIUnorderedAccessView* UnorderedAccessViewRHI, const FVector4& Values);
void ClearUAVUint(FRHIUnorderedAccessView* UnorderedAccessViewRHI, const FUintVector4& Values);
// Resource transitions.
void BeginTransitions(TArrayView<const FRHITransition*> Transitions);
void EndTransitions(TArrayView<const FRHITransition*> Transitions);
inline void Transition(TArrayView<const FRHITransitionInfo> Infos);
void BeginTransition(const FRHITransition* Transition);
void EndTransition(const FRHITransition* Transition);
void Transition(const FRHITransitionInfo& Info);
// ---- Legacy API ----
void TransitionResource(ERHIAccess TransitionType, const FTextureRHIRef& InTexture);
void TransitionResource(ERHIAccess TransitionType, FRHITexture* InTexture);
inline void TransitionResources(ERHIAccess TransitionType, FRHITexture* const* InTextures, int32 NumTextures);
void TransitionResourceArrayNoCopy(ERHIAccess TransitionType, TArray<FRHITexture*>& InTextures);
inline void TransitionResources(ERHIAccess TransitionType, EResourceTransitionPipeline /* ignored TransitionPipeline */, FRHIUnorderedAccessView* const* InUAVs, int32 NumUAVs, FRHIComputeFence* WriteFence);
void TransitionResource(ERHIAccess TransitionType, EResourceTransitionPipeline TransitionPipeline, FRHIUnorderedAccessView* InUAV, FRHIComputeFence* WriteFence);
void TransitionResource(ERHIAccess TransitionType, EResourceTransitionPipeline TransitionPipeline, FRHIUnorderedAccessView* InUAV);
void TransitionResources(ERHIAccess TransitionType, EResourceTransitionPipeline TransitionPipeline, FRHIUnorderedAccessView* const* InUAVs, int32 NumUAVs);
void WaitComputeFence(FRHIComputeFence* WaitFence);
void BeginUAVOverlap();
void EndUAVOverlap();
void BeginUAVOverlap(FRHIUnorderedAccessView* UAV);
void EndUAVOverlap(FRHIUnorderedAccessView* UAV);
void BeginUAVOverlap(TArrayView<FRHIUnorderedAccessView* const> UAVs);
void EndUAVOverlap(TArrayView<FRHIUnorderedAccessView* const> UAVs);
void PushEvent(const TCHAR* Name, FColor Color);
void PopEvent();
void BreakPoint();
void SubmitCommandsHint();
void CopyToStagingBuffer(FRHIVertexBuffer* SourceBuffer, FRHIStagingBuffer* DestinationStagingBuffer, uint32 Offset, uint32 NumBytes);
void WriteGPUFence(FRHIGPUFence* Fence);
void SetGPUMask(FRHIGPUMask InGPUMask);
(......)
};
// RHI command list.
class FRHICommandList : public FRHIComputeCommandList
{
public:
FRHICommandList(FRHIGPUMask GPUMask) : FRHIComputeCommandList(GPUMask) {}
bool AsyncPSOCompileAllowed() const;
void* operator new(size_t Size);
void operator delete(void *RawMemory);
// Get the bound shaders.
inline FRHIVertexShader* GetBoundVertexShader() const;
inline FRHIHullShader* GetBoundHullShader() const;
inline FRHIDomainShader* GetBoundDomainShader() const;
inline FRHIPixelShader* GetBoundPixelShader() const;
inline FRHIGeometryShader* GetBoundGeometryShader() const;
// Update multi-frame resources.
void BeginUpdateMultiFrameResource(FRHITexture* Texture);
void EndUpdateMultiFrameResource(FRHITexture* Texture);
void BeginUpdateMultiFrameResource(FRHIUnorderedAccessView* UAV);
void EndUpdateMultiFrameResource(FRHIUnorderedAccessView* UAV);
// Uniform buffer interfaces.
FLocalUniformBuffer BuildLocalUniformBuffer(const void* Contents, uint32 ContentsSize, const FRHIUniformBufferLayout& Layout);
template <typename TRHIShader>
void SetLocalShaderUniformBuffer(TRHIShader* Shader, uint32 BaseIndex, const FLocalUniformBuffer& UniformBuffer);
template <typename TShaderRHI>
void SetLocalShaderUniformBuffer(const TRefCountPtr<TShaderRHI>& Shader, uint32 BaseIndex, const FLocalUniformBuffer& UniformBuffer);
void SetShaderUniformBuffer(FRHIGraphicsShader* Shader, uint32 BaseIndex, FRHIUniformBuffer* UniformBuffer);
template <typename TShaderRHI>
FORCEINLINE void SetShaderUniformBuffer(const TRefCountPtr<TShaderRHI>& Shader, uint32 BaseIndex, FRHIUniformBuffer* UniformBuffer);
// Shader parameters.
void SetShaderParameter(FRHIGraphicsShader* Shader, uint32 BufferIndex, uint32 BaseIndex, uint32 NumBytes, const void* NewValue);
template <typename TShaderRHI>
void SetShaderParameter(const TRefCountPtr<TShaderRHI>& Shader, uint32 BufferIndex, uint32 BaseIndex, uint32 NumBytes, const void* NewValue);
void SetShaderTexture(FRHIGraphicsShader* Shader, uint32 TextureIndex, FRHITexture* Texture);
template <typename TShaderRHI>
void SetShaderTexture(const TRefCountPtr<TShaderRHI>& Shader, uint32 TextureIndex, FRHITexture* Texture);
void SetShaderResourceViewParameter(FRHIGraphicsShader* Shader, uint32 SamplerIndex, FRHIShaderResourceView* SRV);
template <typename TShaderRHI>
void SetShaderResourceViewParameter(const TRefCountPtr<TShaderRHI>& Shader, uint32 SamplerIndex, FRHIShaderResourceView* SRV);
void SetShaderSampler(FRHIGraphicsShader* Shader, uint32 SamplerIndex, FRHISamplerState* State);
template <typename TShaderRHI>
void SetShaderSampler(const TRefCountPtr<TShaderRHI>& Shader, uint32 SamplerIndex, FRHISamplerState* State);
void SetUAVParameter(FRHIPixelShader* Shader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV);
void SetUAVParameter(const TRefCountPtr<FRHIPixelShader>& Shader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV);
void SetBlendFactor(const FLinearColor& BlendFactor = FLinearColor::White);
// Primitive drawing.
void DrawPrimitive(uint32 BaseVertexIndex, uint32 NumPrimitives, uint32 NumInstances);
void DrawIndexedPrimitive(FRHIIndexBuffer* IndexBuffer, int32 BaseVertexIndex, uint32 FirstInstance, uint32 NumVertices, uint32 StartIndex, uint32 NumPrimitives, uint32 NumInstances);
void DrawPrimitiveIndirect(FRHIVertexBuffer* ArgumentBuffer, uint32 ArgumentOffset);
void DrawIndexedIndirect(FRHIIndexBuffer* IndexBufferRHI, FRHIStructuredBuffer* ArgumentsBufferRHI, uint32 DrawArgumentsIndex, uint32 NumInstances);
void DrawIndexedPrimitiveIndirect(FRHIIndexBuffer* IndexBuffer, FRHIVertexBuffer* ArgumentsBuffer, uint32 ArgumentOffset);
// State and data setters.
void SetStreamSource(uint32 StreamIndex, FRHIVertexBuffer* VertexBuffer, uint32 Offset);
void SetStencilRef(uint32 StencilRef);
void SetViewport(float MinX, float MinY, float MinZ, float MaxX, float MaxY, float MaxZ);
void SetStereoViewport(float LeftMinX, float RightMinX, float LeftMinY, float RightMinY, float MinZ, float LeftMaxX, float RightMaxX, float LeftMaxY, float RightMaxY, float MaxZ);
void SetScissorRect(bool bEnable, uint32 MinX, uint32 MinY, uint32 MaxX, uint32 MaxY);
void ApplyCachedRenderTargets(FGraphicsPipelineStateInitializer& GraphicsPSOInit);
void SetGraphicsPipelineState(class FGraphicsPipelineState* GraphicsPipelineState, const FBoundShaderStateInput& ShaderInput, bool bApplyAdditionalState);
void SetDepthBounds(float MinDepth, float MaxDepth);
void SetShadingRate(EVRSShadingRate ShadingRate, EVRSRateCombiner Combiner);
void SetShadingRateImage(FRHITexture* RateImageTexture, EVRSRateCombiner Combiner);
// Copy textures.
void CopyToResolveTarget(FRHITexture* SourceTextureRHI, FRHITexture* DestTextureRHI, const FResolveParams& ResolveParams);
void CopyTexture(FRHITexture* SourceTextureRHI, FRHITexture* DestTextureRHI, const FRHICopyTextureInfo& CopyInfo);
void ResummarizeHTile(FRHITexture2D* DepthTexture);
// Render queries.
void BeginRenderQuery(FRHIRenderQuery* RenderQuery);
void EndRenderQuery(FRHIRenderQuery* RenderQuery);
void CalibrateTimers(FRHITimestampCalibrationQuery* CalibrationQuery);
void PollOcclusionQueries();
/* LEGACY API */
void TransitionResource(FExclusiveDepthStencil DepthStencilMode, FRHITexture* DepthTexture);
void BeginRenderPass(const FRHIRenderPassInfo& InInfo, const TCHAR* Name);
void EndRenderPass();
void NextSubpass();
// The interfaces below must be called on the immediate-mode command list.
void BeginScene();
void EndScene();
void BeginDrawingViewport(FRHIViewport* Viewport, FRHITexture* RenderTargetRHI);
void EndDrawingViewport(FRHIViewport* Viewport, bool bPresent, bool bLockToVsync);
void BeginFrame();
void EndFrame();
void RHIInvalidateCachedState();
void DiscardRenderTargets(bool Depth, bool Stencil, uint32 ColorBitMask);
void CopyBufferRegion(FRHIVertexBuffer* DestBuffer, uint64 DstOffset, FRHIVertexBuffer* SourceBuffer, uint64 SrcOffset, uint64 NumBytes);
(......)
};

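FRHICommandListBase defines the basic data a command list needs (the command linked list and the device contexts) and its basic interfaces (flushing, waiting on, enqueuing, and dispatching commands, plus memory allocation). FRHIComputeCommandList adds the compute-shader interfaces, GPU resource state transitions, and a subset of the shader-parameter setters. FRHICommandList adds the interfaces of the regular graphics pipeline, including access to the bound VS, PS, and GS, primitive drawing, further shader-parameter setters and resource state transitions, as well as resource creation, updating, and waiting.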
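
The `Root` / `CommandLink` pair in FRHICommandListBase can be read as an intrusive singly linked list with a tail slot: each allocated command is written through `CommandLink`, which then moves on to that command's `Next` pointer. The following is a minimal, self-contained sketch of that idea and not engine code. All `FToy*` names and `RunToyCommandListDemo` are hypothetical, and plain `operator new` stands in for the engine's `FMemStackBase` allocator.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for FRHICommandBase: an intrusive singly linked node.
struct FToyCommandBase
{
    FToyCommandBase* Next = nullptr;
    virtual ~FToyCommandBase() = default;
    // Runs the command, then destroys it in place (its memory belongs to the list).
    virtual void ExecuteAndDestruct(std::vector<std::string>& Log) = 0;
};

// A recorded "draw" that only stores its arguments until execution time.
struct FToyDrawCommand final : FToyCommandBase
{
    int NumPrimitives;
    explicit FToyDrawCommand(int InNumPrimitives) : NumPrimitives(InNumPrimitives) {}
    void ExecuteAndDestruct(std::vector<std::string>& Log) override
    {
        Log.push_back("Draw " + std::to_string(NumPrimitives));
        this->~FToyDrawCommand(); // nothing may touch *this after this line
    }
};

class FToyCommandList
{
public:
    FToyCommandList() = default;
    FToyCommandList(const FToyCommandList&) = delete;
    FToyCommandList& operator=(const FToyCommandList&) = delete;
    ~FToyCommandList() { Reset(); } // pending toy commands are simply discarded

    // Construct a command in list-owned memory and append it at the tail.
    template <typename TCmd, typename... TArgs>
    void AllocCommand(TArgs&&... Args)
    {
        void* Memory = ::operator new(sizeof(TCmd));
        Blocks.push_back(Memory);
        TCmd* Cmd = new (Memory) TCmd(std::forward<TArgs>(Args)...);
        *CommandLink = Cmd;       // fill the current tail slot (Root for the first command)
        CommandLink = &Cmd->Next; // the next append lands after this command
        ++NumCommands;
    }

    bool HasCommands() const { return Root != nullptr; }

    // Walk the list in recording order, executing and destroying each command.
    void Execute(std::vector<std::string>& Log)
    {
        FToyCommandBase* Cmd = Root;
        while (Cmd != nullptr)
        {
            FToyCommandBase* Current = Cmd;
            Cmd = Cmd->Next; // read the link before the command destroys itself
            Current->ExecuteAndDestruct(Log);
        }
        Reset();
    }

private:
    void Reset()
    {
        for (void* Block : Blocks) { ::operator delete(Block); }
        Blocks.clear();
        Root = nullptr;
        CommandLink = &Root;
        NumCommands = 0;
    }

    FToyCommandBase* Root = nullptr;
    FToyCommandBase** CommandLink = &Root;
    unsigned NumCommands = 0;
    std::vector<void*> Blocks;
};

// Records two draws, executes them, and returns the execution log.
std::vector<std::string> RunToyCommandListDemo()
{
    FToyCommandList CmdList;
    CmdList.AllocCommand<FToyDrawCommand>(2);
    CmdList.AllocCommand<FToyDrawCommand>(5);
    std::vector<std::string> Log;
    CmdList.Execute(Log);
    assert(!CmdList.HasCommands());
    return Log;
}
```

Keeping a pointer to the tail slot rather than to the tail node makes appending branch-free: the empty-list case and the general case are the same two assignments.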

FRHICommandList also has several subclasses, defined as follows:

// Immediate-mode command list.
class FRHICommandListImmediate : public FRHICommandList
{
// Lambda (anonymous function) command.
template <typename LAMBDA>
struct TRHILambdaCommand final : public FRHICommandBase
{
LAMBDA Lambda;
void ExecuteAndDestruct(FRHICommandListBase& CmdList, FRHICommandListDebugContext&) override final;
};
FRHICommandListImmediate();
~FRHICommandListImmediate();
public:
// Flush commands immediately.
void ImmediateFlush(EImmediateFlushType::Type FlushType);
// Stall the RHI thread.
bool StallRHIThread();
// Unstall the RHI thread.
void UnStallRHIThread();
// Whether the RHI thread is currently stalled.
static bool IsStalled();
void SetCurrentStat(TStatId Stat);
static FGraphEventRef RenderThreadTaskFence();
static FGraphEventArray& GetRenderThreadTaskArray();
static void WaitOnRenderThreadTaskFence(FGraphEventRef& Fence);
static bool AnyRenderThreadTasksOutstanding();
FGraphEventRef RHIThreadFence(bool bSetLockFence = false);
// Queue the given async compute command list in order with the current immediate command list.
void QueueAsyncCompute(FRHIComputeCommandList& RHIComputeCmdList);
bool IsBottomOfPipe();
bool IsTopOfPipe();
template <typename LAMBDA>
void EnqueueLambda(LAMBDA&& Lambda);
// Resource creation.
FSamplerStateRHIRef CreateSamplerState(const FSamplerStateInitializerRHI& Initializer);
FRasterizerStateRHIRef CreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer);
FDepthStencilStateRHIRef CreateDepthStencilState(const FDepthStencilStateInitializerRHI& Initializer);
FBlendStateRHIRef CreateBlendState(const FBlendStateInitializerRHI& Initializer);
FPixelShaderRHIRef CreatePixelShader(TArrayView<const uint8> Code, const FSHAHash& Hash);
FVertexShaderRHIRef CreateVertexShader(TArrayView<const uint8> Code, const FSHAHash& Hash);
FHullShaderRHIRef CreateHullShader(TArrayView<const uint8> Code, const FSHAHash& Hash);
FDomainShaderRHIRef CreateDomainShader(TArrayView<const uint8> Code, const FSHAHash& Hash);
FGeometryShaderRHIRef CreateGeometryShader(TArrayView<const uint8> Code, const FSHAHash& Hash);
FComputeShaderRHIRef CreateComputeShader(TArrayView<const uint8> Code, const FSHAHash& Hash);
FComputeFenceRHIRef CreateComputeFence(const FName& Name);
FGPUFenceRHIRef CreateGPUFence(const FName& Name);
FStagingBufferRHIRef CreateStagingBuffer();
FBoundShaderStateRHIRef CreateBoundShaderState(...);
FGraphicsPipelineStateRHIRef CreateGraphicsPipelineState(const FGraphicsPipelineStateInitializer& Initializer);
TRefCountPtr<FRHIComputePipelineState> CreateComputePipelineState(FRHIComputeShader* ComputeShader);
FUniformBufferRHIRef CreateUniformBuffer(...);
FIndexBufferRHIRef CreateAndLockIndexBuffer(uint32 Stride, uint32 Size, EBufferUsageFlags InUsage, ERHIAccess InResourceState, FRHIResourceCreateInfo& CreateInfo, void*& OutDataBuffer);
FIndexBufferRHIRef CreateAndLockIndexBuffer(uint32 Stride, uint32 Size, uint32 InUsage, FRHIResourceCreateInfo& CreateInfo, void*& OutDataBuffer);
// Vertex/index buffer interfaces.
void* LockIndexBuffer(FRHIIndexBuffer* IndexBuffer, uint32 Offset, uint32 SizeRHI, EResourceLockMode LockMode);
void UnlockIndexBuffer(FRHIIndexBuffer* IndexBuffer);
void* LockStagingBuffer(FRHIStagingBuffer* StagingBuffer, FRHIGPUFence* Fence, uint32 Offset, uint32 SizeRHI);
void UnlockStagingBuffer(FRHIStagingBuffer* StagingBuffer);
FVertexBufferRHIRef CreateAndLockVertexBuffer(uint32 Size, EBufferUsageFlags InUsage, ...);
FVertexBufferRHIRef CreateAndLockVertexBuffer(uint32 Size, uint32 InUsage, FRHIResourceCreateInfo& CreateInfo, void*& OutDataBuffer);
void* LockVertexBuffer(FRHIVertexBuffer* VertexBuffer, uint32 Offset, uint32 SizeRHI, EResourceLockMode LockMode);
void UnlockVertexBuffer(FRHIVertexBuffer* VertexBuffer);
void CopyVertexBuffer(FRHIVertexBuffer* SourceBuffer, FRHIVertexBuffer* DestBuffer);
void* LockStructuredBuffer(FRHIStructuredBuffer* StructuredBuffer, uint32 Offset, uint32 SizeRHI, EResourceLockMode LockMode);
void UnlockStructuredBuffer(FRHIStructuredBuffer* StructuredBuffer);
// UAV/SRV creation.
FUnorderedAccessViewRHIRef CreateUnorderedAccessView(FRHIStructuredBuffer* StructuredBuffer, bool bUseUAVCounter, bool bAppendBuffer);
FUnorderedAccessViewRHIRef CreateUnorderedAccessView(FRHITexture* Texture, uint32 MipLevel);
FUnorderedAccessViewRHIRef CreateUnorderedAccessView(FRHITexture* Texture, uint32 MipLevel, uint8 Format);
FUnorderedAccessViewRHIRef CreateUnorderedAccessView(FRHIVertexBuffer* VertexBuffer, uint8 Format);
FUnorderedAccessViewRHIRef CreateUnorderedAccessView(FRHIIndexBuffer* IndexBuffer, uint8 Format);
FShaderResourceViewRHIRef CreateShaderResourceView(FRHIStructuredBuffer* StructuredBuffer);
FShaderResourceViewRHIRef CreateShaderResourceView(FRHIVertexBuffer* VertexBuffer, uint32 Stride, uint8 Format);
FShaderResourceViewRHIRef CreateShaderResourceView(const FShaderResourceViewInitializer& Initializer);
FShaderResourceViewRHIRef CreateShaderResourceView(FRHIIndexBuffer* Buffer);
uint64 CalcTexture2DPlatformSize(...);
uint64 CalcTexture3DPlatformSize(...);
uint64 CalcTextureCubePlatformSize(...);
// Texture operations.
void GetTextureMemoryStats(FTextureMemoryStats& OutStats);
bool GetTextureMemoryVisualizeData(...);
void CopySharedMips(FRHITexture2D* DestTexture2D, FRHITexture2D* SrcTexture2D);
void TransferTexture(FRHITexture2D* Texture, FIntRect Rect, uint32 SrcGPUIndex, uint32 DestGPUIndex, bool PullData);
void TransferTextures(const TArrayView<const FTransferTextureParams> Params);
void GetResourceInfo(FRHITexture* Ref, FRHIResourceInfo& OutInfo);
FShaderResourceViewRHIRef CreateShaderResourceView(FRHITexture* Texture, const FRHITextureSRVCreateInfo& CreateInfo);
FShaderResourceViewRHIRef CreateShaderResourceView(FRHITexture* Texture, uint8 MipLevel);
FShaderResourceViewRHIRef CreateShaderResourceView(FRHITexture* Texture, uint8 MipLevel, uint8 NumMipLevels, uint8 Format);
FShaderResourceViewRHIRef CreateShaderResourceViewWriteMask(FRHITexture2D* Texture2DRHI);
FShaderResourceViewRHIRef CreateShaderResourceViewFMask(FRHITexture2D* Texture2DRHI);
uint32 ComputeMemorySize(FRHITexture* TextureRHI);
FTexture2DRHIRef AsyncReallocateTexture2D(...);
ETextureReallocationStatus FinalizeAsyncReallocateTexture2D(FRHITexture2D* Texture2D, bool bBlockUntilCompleted);
ETextureReallocationStatus CancelAsyncReallocateTexture2D(FRHITexture2D* Texture2D, bool bBlockUntilCompleted);
void* LockTexture2D(...);
void UnlockTexture2D(FRHITexture2D* Texture, uint32 MipIndex, bool bLockWithinMiptail, bool bFlushRHIThread = true);
void* LockTexture2DArray(...);
void UnlockTexture2DArray(FRHITexture2DArray* Texture, uint32 TextureIndex, uint32 MipIndex, bool bLockWithinMiptail);
void UpdateTexture2D(...);
void UpdateFromBufferTexture2D(...);
FUpdateTexture3DData BeginUpdateTexture3D(...);
void EndUpdateTexture3D(FUpdateTexture3DData& UpdateData);
void EndMultiUpdateTexture3D(TArray<FUpdateTexture3DData>& UpdateDataArray);
void UpdateTexture3D(...);
void* LockTextureCubeFace(...);
void UnlockTextureCubeFace(FRHITextureCube* Texture, ...);
// Read texture surface data.
void ReadSurfaceData(FRHITexture* Texture, ...);
void ReadSurfaceData(FRHITexture* Texture, ...);
void MapStagingSurface(FRHITexture* Texture, void*& OutData, int32& OutWidth, int32& OutHeight);
void MapStagingSurface(FRHITexture* Texture, ...);
void UnmapStagingSurface(FRHITexture* Texture);
void ReadSurfaceFloatData(FRHITexture* Texture, ...);
void ReadSurfaceFloatData(FRHITexture* Texture, ...);
void Read3DSurfaceFloatData(FRHITexture* Texture,...);
// Render-thread resource state transitions.
void AcquireTransientResource_RenderThread(FRHITexture* Texture);
void DiscardTransientResource_RenderThread(FRHITexture* Texture);
void AcquireTransientResource_RenderThread(FRHIVertexBuffer* Buffer);
void DiscardTransientResource_RenderThread(FRHIVertexBuffer* Buffer);
void AcquireTransientResource_RenderThread(FRHIStructuredBuffer* Buffer);
void DiscardTransientResource_RenderThread(FRHIStructuredBuffer* Buffer);
// Get render query results.
bool GetRenderQueryResult(FRHIRenderQuery* RenderQuery, ...);
void PollRenderQueryResults();
// Viewport.
FViewportRHIRef CreateViewport(void* WindowHandle, ...);
uint32 GetViewportNextPresentGPUIndex(FRHIViewport* Viewport);
FTexture2DRHIRef GetViewportBackBuffer(FRHIViewport* Viewport);
void AdvanceFrameForGetViewportBackBuffer(FRHIViewport* Viewport);
void ResizeViewport(FRHIViewport* Viewport, ...);
void AcquireThreadOwnership();
void ReleaseThreadOwnership();
// Submit commands and flush them to the GPU.
void SubmitCommandsAndFlushGPU();
// Execute a command list.
void ExecuteCommandList(FRHICommandList* CmdList);
// Update resources.
void UpdateTextureReference(FRHITextureReference* TextureRef, FRHITexture* NewTexture);
void UpdateRHIResources(FRHIResourceUpdateInfo* UpdateInfos, int32 Num, bool bNeedReleaseRefs);
// Flush resources.
void FlushResources();
// Per-frame update.
void Tick(float DeltaTime);
// Block until the GPU is idle.
void BlockUntilGPUIdle();
// Suspend/resume rendering.
void SuspendRendering();
void ResumeRendering();
bool IsRenderingSuspended();
// Compress/decompress data.
bool EnqueueDecompress(uint8_t* SrcBuffer, uint8_t* DestBuffer, int CompressedSize, void* ErrorCodeBuffer);
bool EnqueueCompress(uint8_t* SrcBuffer, uint8_t* DestBuffer, int UnCompressedSize, void* ErrorCodeBuffer);
// Other interfaces.
bool GetAvailableResolutions(FScreenResolutionArray& Resolutions, bool bIgnoreRefreshRate);
void GetSupportedResolution(uint32& Width, uint32& Height);
void VirtualTextureSetFirstMipInMemory(FRHITexture2D* Texture, uint32 FirstMip);
void VirtualTextureSetFirstMipVisible(FRHITexture2D* Texture, uint32 FirstMip);
// Get native (platform) data.
void* GetNativeDevice();
void* GetNativeInstance();
// Get the immediate-mode command context.
IRHICommandContext* GetDefaultContext();
// Get a command context container.
IRHICommandContextContainer* GetCommandContextContainer(int32 Index, int32 Num);
uint32 GetGPUFrameCycles();
};
// Type that marks the recursive use of command lists inside RHI implementations.
class FRHICommandList_RecursiveHazardous : public FRHICommandList
{
public:
FRHICommandList_RecursiveHazardous(IRHICommandContext *Context, FRHIGPUMask InGPUMask = FRHIGPUMask::All());
};
// Helper class used internally by RHIs to make use of FRHICommandList_RecursiveHazardous safer.
template <typename ContextType>
class TRHICommandList_RecursiveHazardous : public FRHICommandList_RecursiveHazardous
{
template <typename LAMBDA>
struct TRHILambdaCommand final : public FRHICommandBase
{
LAMBDA Lambda;
TRHILambdaCommand(LAMBDA&& InLambda);
void ExecuteAndDestruct(FRHICommandListBase& CmdList, FRHICommandListDebugContext&) override final;
};
public:
TRHICommandList_RecursiveHazardous(ContextType *Context, FRHIGPUMask GPUMask = FRHIGPUMask::All());
template <typename LAMBDA>
void RunOnContext(LAMBDA&& Lambda);
};

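FRHICommandListImmediate wraps the immediate-mode graphics API interfaces and is used very widely throughout UE's rendering system. On top of its base classes it defines interfaces for operating on, creating, updating, reading, and transitioning the state of resources, and it adds interfaces for thread synchronization and GPU synchronization.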
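
The interplay of `Bypass()`, `EnqueueLambda()`, `ImmediateFlush()`, and the command context can be illustrated with a small toy model: when bypassing, a call goes straight to the context; otherwise it is recorded and replayed on flush. This is a simplified sketch under stated assumptions, not the engine's implementation. All `Toy` names and `RunToyImmediateDemo` are hypothetical, `std::function` replaces the engine's command allocation, and the lambda receives the context directly rather than the command list.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for IRHICommandContext: the platform-facing interface.
class IToyCommandContext
{
public:
    virtual ~IToyCommandContext() = default;
    virtual void RHIDrawPrimitive(unsigned NumPrimitives) = 0;
};

// A "platform" context that only logs the calls it receives.
class FToyLoggingContext final : public IToyCommandContext
{
public:
    std::vector<std::string> Calls;
    void RHIDrawPrimitive(unsigned NumPrimitives) override
    {
        Calls.push_back("RHIDrawPrimitive(" + std::to_string(NumPrimitives) + ")");
    }
};

class FToyImmediateCommandList
{
public:
    FToyImmediateCommandList(IToyCommandContext& InContext, bool bInBypass)
        : Context(InContext), bBypass(bInBypass) {}

    // True when commands should run synchronously instead of being recorded.
    bool Bypass() const { return bBypass; }

    void DrawPrimitive(unsigned NumPrimitives)
    {
        if (Bypass())
        {
            Context.RHIDrawPrimitive(NumPrimitives); // direct call into the context
            return;
        }
        // Otherwise record the arguments and replay them later.
        Pending.push_back([NumPrimitives](IToyCommandContext& Ctx) { Ctx.RHIDrawPrimitive(NumPrimitives); });
    }

    // Enqueue arbitrary work in order with the other recorded commands.
    template <typename LAMBDA>
    void EnqueueLambda(LAMBDA&& Lambda)
    {
        if (Bypass())
        {
            Lambda(Context);
            return;
        }
        Pending.emplace_back(std::forward<LAMBDA>(Lambda));
    }

    // Replay everything recorded so far, in recording order.
    void ImmediateFlush()
    {
        std::vector<std::function<void(IToyCommandContext&)>> Local;
        Local.swap(Pending);
        for (auto& Command : Local) { Command(Context); }
    }

private:
    IToyCommandContext& Context;
    bool bBypass;
    std::vector<std::function<void(IToyCommandContext&)>> Pending;
};

struct FToyDemoResult
{
    std::size_t CallsBeforeFlush;
    std::size_t CallsAfterFlush;
};

// Issues one draw and one lambda, and reports how many context calls happened before and after the flush.
FToyDemoResult RunToyImmediateDemo(bool bBypass)
{
    FToyLoggingContext Context;
    FToyImmediateCommandList CmdList(Context, bBypass);
    CmdList.DrawPrimitive(3);
    CmdList.EnqueueLambda([](IToyCommandContext& Ctx) { Ctx.RHIDrawPrimitive(1); });
    const std::size_t Before = Context.Calls.size();
    CmdList.ImmediateFlush();
    return { Before, Context.Calls.size() };
}
```

With bypass enabled both calls reach the context before the flush; with bypass disabled nothing reaches it until `ImmediateFlush()` replays the recorded commands.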

The following UML diagram summarizes the core FRHICommandList inheritance hierarchy:

classDiagram-v2
FNoncopyable <|-- FRHICommandListBase

class FRHICommandListBase{
FRHICommandBase* Root
FRHICommandBase** CommandLink
IRHICommandContext* Context
IRHIComputeContext* ComputeContext

AllocCommand()
Flush()
WaitForXXX()
QueueCommandListXXX()
}

FRHICommandListBase <|-- FRHIComputeCommandList
class FRHIComputeCommandList{
DispatchComputeShader()
DispatchIndirectComputeShader()
SetShaderXXX()
}

FRHIComputeCommandList <|-- FRHICommandList
class FRHICommandList{
SetShaderXXX()
GetBoundXXXShader()
DrawPrimitive()
DrawXXX()
}

FRHICommandList <|-- FRHICommandListImmediate
class FRHICommandListImmediate{
SubmitCommandsAndFlushGPU()
ExecuteCommandList()
ImmediateFlush()
FlushResources()
Tick()
BlockUntilGPUIdle()
StallRHIThread()
UnStallRHIThread()
SuspendRendering()
ResumeRendering()
CreateXXX()
}

FRHICommandList <|-- FRHICommandList_RecursiveHazardous
FRHICommandList_RecursiveHazardous <|-- TRHICommandList_RecursiveHazardous

10.3 RHIContext, DynamicRHI

This chapter explains the concepts, types, and relationships of RHI Context and DynamicRHI.

10.3.1 IRHICommandContext

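IRHICommandContext is the command context interface class of the RHI and defines a set of graphics-API-related operations. On platforms that can process command lists in parallel, it is a separate object. It and its related types are defined as follows: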

// Engine\Source\Runtime\RHI\Public\RHIContext.h

// Context that is capable of doing compute work. It can be async compute or compute on the gfx pipe.
class IRHIComputeContext
{
public:
virtual ~IRHIComputeContext();
// Set/dispatch compute shaders.
virtual void RHISetComputeShader(FRHIComputeShader* ComputeShader) = 0;
virtual void RHISetComputePipelineState(FRHIComputePipelineState* ComputePipelineState);
virtual void RHIDispatchComputeShader(uint32 ThreadGroupCountX, uint32 ThreadGroupCountY, uint32 ThreadGroupCountZ) = 0;
virtual void RHIDispatchIndirectComputeShader(FRHIVertexBuffer* ArgumentBuffer, uint32 ArgumentOffset) = 0;
virtual void RHISetAsyncComputeBudget(EAsyncComputeBudget Budget) {}
// Resource transitions.
virtual void RHIBeginTransitions(TArrayView<const FRHITransition*> Transitions) = 0;
virtual void RHIEndTransitions(TArrayView<const FRHITransition*> Transitions) = 0;
// UAV.
virtual void RHIClearUAVFloat(FRHIUnorderedAccessView* UnorderedAccessViewRHI, const FVector4& Values) = 0;
virtual void RHIClearUAVUint(FRHIUnorderedAccessView* UnorderedAccessViewRHI, const FUintVector4& Values) = 0;
virtual void RHIBeginUAVOverlap() {}
virtual void RHIEndUAVOverlap() {}
virtual void RHIBeginUAVOverlap(TArrayView<FRHIUnorderedAccessView* const> UAVs) {}
virtual void RHIEndUAVOverlap(TArrayView<FRHIUnorderedAccessView* const> UAVs) {}
// Shader parameters.
virtual void RHISetShaderTexture(FRHIComputeShader* PixelShader, uint32 TextureIndex, FRHITexture* NewTexture) = 0;
virtual void RHISetShaderSampler(FRHIComputeShader* ComputeShader, uint32 SamplerIndex, FRHISamplerState* NewState) = 0;
virtual void RHISetUAVParameter(FRHIComputeShader* ComputeShader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV) = 0;
virtual void RHISetUAVParameter(FRHIComputeShader* ComputeShader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV, uint32 InitialCount) = 0;
virtual void RHISetShaderResourceViewParameter(FRHIComputeShader* ComputeShader, uint32 SamplerIndex, FRHIShaderResourceView* SRV) = 0;
virtual void RHISetShaderUniformBuffer(FRHIComputeShader* ComputeShader, uint32 BufferIndex, FRHIUniformBuffer* Buffer) = 0;
virtual void RHISetShaderParameter(FRHIComputeShader* ComputeShader, uint32 BufferIndex, uint32 BaseIndex, uint32 NumBytes, const void* NewValue) = 0;
virtual void RHISetGlobalUniformBuffers(const FUniformBufferStaticBindings& InUniformBuffers);
// Push/pop events.
virtual void RHIPushEvent(const TCHAR* Name, FColor Color) = 0;
virtual void RHIPopEvent() = 0;
// Other interfaces.
virtual void RHISubmitCommandsHint() = 0;
virtual void RHIInvalidateCachedState() {}
virtual void RHICopyToStagingBuffer(FRHIVertexBuffer* SourceBufferRHI, FRHIStagingBuffer* DestinationStagingBufferRHI, uint32 InOffset, uint32 InNumBytes);
virtual void RHIWriteGPUFence(FRHIGPUFence* FenceRHI);
virtual void RHISetGPUMask(FRHIGPUMask GPUMask);
// Acceleration structures.
virtual void RHIBuildAccelerationStructure(FRHIRayTracingGeometry* Geometry);
virtual void RHIBuildAccelerationStructures(const TArrayView<const FAccelerationStructureBuildParams> Params);
virtual void RHIBuildAccelerationStructure(FRHIRayTracingScene* Scene);
// Get the compute context.
inline IRHIComputeContext& GetLowestLevelContext() { return *this; }
inline IRHIComputeContext& GetHighestLevelContext() { return *this; }
};
// Command context.
class IRHICommandContext : public IRHIComputeContext
{
public:
virtual ~IRHICommandContext();
// Dispatch compute.
virtual void RHIDispatchComputeShader(uint32 ThreadGroupCountX, uint32 ThreadGroupCountY, uint32 ThreadGroupCountZ) = 0;
virtual void RHIDispatchIndirectComputeShader(FRHIVertexBuffer* ArgumentBuffer, uint32 ArgumentOffset) = 0;
// Render queries.
virtual void RHIBeginRenderQuery(FRHIRenderQuery* RenderQuery) = 0;
virtual void RHIEndRenderQuery(FRHIRenderQuery* RenderQuery) = 0;
virtual void RHIPollOcclusionQueries();
// Begin/end interfaces.
virtual void RHIBeginDrawingViewport(FRHIViewport* Viewport, FRHITexture* RenderTargetRHI) = 0;
virtual void RHIEndDrawingViewport(FRHIViewport* Viewport, bool bPresent, bool bLockToVsync) = 0;
virtual void RHIBeginFrame() = 0;
virtual void RHIEndFrame() = 0;
virtual void RHIBeginScene() = 0;
virtual void RHIEndScene() = 0;
virtual void RHIBeginUpdateMultiFrameResource(FRHITexture* Texture);
virtual void RHIEndUpdateMultiFrameResource(FRHITexture* Texture);
virtual void RHIBeginUpdateMultiFrameResource(FRHIUnorderedAccessView* UAV);
virtual void RHIEndUpdateMultiFrameResource(FRHIUnorderedAccessView* UAV);
// State and data setters.
virtual void RHISetStreamSource(uint32 StreamIndex, FRHIVertexBuffer* VertexBuffer, uint32 Offset) = 0;
virtual void RHISetViewport(float MinX, float MinY, float MinZ, float MaxX, float MaxY, float MaxZ) = 0;
virtual void RHISetStereoViewport(...);
virtual void RHISetScissorRect(bool bEnable, uint32 MinX, uint32 MinY, uint32 MaxX, uint32 MaxY) = 0;
virtual void RHISetGraphicsPipelineState(FRHIGraphicsPipelineState* GraphicsState, bool bApplyAdditionalState) = 0;
// Set shader parameters.
virtual void RHISetShaderTexture(FRHIGraphicsShader* Shader, uint32 TextureIndex, FRHITexture* NewTexture) = 0;
virtual void RHISetShaderTexture(FRHIComputeShader* PixelShader, uint32 TextureIndex, FRHITexture* NewTexture) = 0;
virtual void RHISetShaderSampler(FRHIComputeShader* ComputeShader, uint32 SamplerIndex, FRHISamplerState* NewState) = 0;
virtual void RHISetShaderSampler(FRHIGraphicsShader* Shader, uint32 SamplerIndex, FRHISamplerState* NewState) = 0;
virtual void RHISetUAVParameter(FRHIPixelShader* PixelShader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV) = 0;
virtual void RHISetUAVParameter(FRHIComputeShader* ComputeShader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV) = 0;
virtual void RHISetUAVParameter(FRHIComputeShader* ComputeShader, uint32 UAVIndex, FRHIUnorderedAccessView* UAV, uint32 InitialCount) = 0;
virtual void RHISetShaderResourceViewParameter(FRHIComputeShader* ComputeShader, uint32 SamplerIndex, FRHIShaderResourceView* SRV) = 0;
virtual void RHISetShaderResourceViewParameter(FRHIGraphicsShader* Shader, uint32 SamplerIndex, FRHIShaderResourceView* SRV) = 0;
virtual void RHISetShaderUniformBuffer(FRHIGraphicsShader* Shader, uint32 BufferIndex, FRHIUniformBuffer* Buffer) = 0;
virtual void RHISetShaderUniformBuffer(FRHIComputeShader* ComputeShader, uint32 BufferIndex, FRHIUniformBuffer* Buffer) = 0;
virtual void RHISetShaderParameter(FRHIGraphicsShader* Shader, uint32 BufferIndex, uint32 BaseIndex, uint32 NumBytes, const void* NewValue) = 0;
virtual void RHISetShaderParameter(FRHIComputeShader* ComputeShader, uint32 BufferIndex, uint32 BaseIndex, uint32 NumBytes, const void* NewValue) = 0;
virtual void RHISetStencilRef(uint32 StencilRef) {}
virtual void RHISetBlendFactor(const FLinearColor& BlendFactor) {}
// Draw primitives.
virtual void RHIDrawPrimitive(uint32 BaseVertexIndex, uint32 NumPrimitives, uint32 NumInstances) = 0;
virtual void RHIDrawPrimitiveIndirect(FRHIVertexBuffer* ArgumentBuffer, uint32 ArgumentOffset) = 0;
virtual void RHIDrawIndexedIndirect(FRHIIndexBuffer* IndexBufferRHI, FRHIStructuredBuffer* ArgumentsBufferRHI, int32 DrawArgumentsIndex, uint32 NumInstances) = 0;
virtual void RHIDrawIndexedPrimitive(FRHIIndexBuffer* IndexBuffer, int32 BaseVertexIndex, uint32 FirstInstance, uint32 NumVertices, uint32 StartIndex, uint32 NumPrimitives, uint32 NumInstances) = 0;
virtual void RHIDrawIndexedPrimitiveIndirect(FRHIIndexBuffer* IndexBuffer, FRHIVertexBuffer* ArgumentBuffer, uint32 ArgumentOffset) = 0;
// Other interfaces.
virtual void RHISetDepthBounds(float MinDepth, float MaxDepth) = 0;
virtual void RHISetShadingRate(EVRSShadingRate ShadingRate, EVRSRateCombiner Combiner);
virtual void RHISetShadingRateImage(FRHITexture* RateImageTexture, EVRSRateCombiner Combiner);
virtual void RHISetMultipleViewports(uint32 Count, const FViewportBounds* Data) = 0;
virtual void RHICopyToResolveTarget(FRHITexture* SourceTexture, FRHITexture* DestTexture, const FResolveParams& ResolveParams) = 0;
virtual void RHIResummarizeHTile(FRHITexture2D* DepthTexture);
virtual void RHICalibrateTimers();
virtual void RHICalibrateTimers(FRHITimestampCalibrationQuery* CalibrationQuery);
virtual void RHIDiscardRenderTargets(bool Depth, bool Stencil, uint32 ColorBitMask) {}
// Textures.
virtual void RHIUpdateTextureReference(FRHITextureReference* TextureRef, FRHITexture* NewTexture) = 0;
virtual void RHICopyTexture(FRHITexture* SourceTexture, FRHITexture* DestTexture, const FRHICopyTextureInfo& CopyInfo);
virtual void RHICopyBufferRegion(FRHIVertexBuffer* DestBuffer, ...);
// Render passes.
virtual void RHIBeginRenderPass(const FRHIRenderPassInfo& InInfo, const TCHAR* InName) = 0;
virtual void RHIEndRenderPass() = 0;
virtual void RHINextSubpass();
// Ray tracing.
virtual void RHIClearRayTracingBindings(FRHIRayTracingScene* Scene);
virtual void RHIBuildAccelerationStructures(const TArrayView<const FAccelerationStructureBuildParams> Params);
virtual void RHIBuildAccelerationStructure(FRHIRayTracingGeometry* Geometry) final override;
virtual void RHIBuildAccelerationStructure(FRHIRayTracingScene* Scene);
virtual void RHIRayTraceOcclusion(FRHIRayTracingScene* Scene, ...);
virtual void RHIRayTraceIntersection(FRHIRayTracingScene* Scene, ...);
virtual void RHIRayTraceDispatch(FRHIRayTracingPipelineState* RayTracingPipelineState, ...);
virtual void RHISetRayTracingHitGroups(FRHIRayTracingScene* Scene, ...);
virtual void RHISetRayTracingHitGroup(FRHIRayTracingScene* Scene, ...);
virtual void RHISetRayTracingCallableShader(FRHIRayTracingScene* Scene, ...);
virtual void RHISetRayTracingMissShader(FRHIRayTracingScene* Scene, ...);

(......)

protected:
// Render pass information.
FRHIRenderPassInfo RenderPassInfo;
};

As shown above, the interfaces of IRHICommandContext closely mirror and largely overlap those of FRHICommandList. IRHICommandContext also has many subclasses:

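  • IRHICommandContextPSOFallback: an RHI command context for graphics APIs that do not support true graphics pipeline state objects.

    • FNullDynamicRHI: a dynamically bound RHI with a null (no-op) implementation.
    • FOpenGLDynamicRHI: the dynamic RHI for OpenGL.
    • FD3D11DynamicRHI: the dynamic RHI for D3D11.
  • FMetalRHICommandContext: the command context for the Metal platform.

  • FD3D12CommandContextBase: the base command context for D3D12.

  • FVulkanCommandListContext: the command list context for the Vulkan platform.

  • FEmptyDynamicRHI: an empty, dynamically bound RHI implementation.

  • FValidationContext: the validation context.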

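Among the subclasses above, some of the platform-specific ones also inherit FDynamicRHI. IRHICommandContextPSOFallback is a special case: all of its subclasses target graphics APIs that do not support parallel drawing (OpenGL, D3D11). IRHICommandContextPSOFallback is defined as follows: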

class IRHICommandContextPSOFallback : public IRHICommandContext
{
public:
// Set individual render states.
virtual void RHISetBoundShaderState(FRHIBoundShaderState* BoundShaderState) = 0;
virtual void RHISetDepthStencilState(FRHIDepthStencilState* NewState, uint32 StencilRef) = 0;
virtual void RHISetRasterizerState(FRHIRasterizerState* NewState) = 0;
virtual void RHISetBlendState(FRHIBlendState* NewState, const FLinearColor& BlendFactor) = 0;
virtual void RHIEnableDepthBoundsTest(bool bEnable) = 0;
// Pipeline state.
virtual void RHISetGraphicsPipelineState(FRHIGraphicsPipelineState* GraphicsState, bool bApplyAdditionalState) override;
};
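The declaration above suggests how the fallback works: the only function it overrides is RHISetGraphicsPipelineState, while the individual state setters stay pure virtual for the legacy API to implement. A minimal sketch of that idea follows, assuming the override simply unpacks the pipeline state and forwards each part to the per-state setters. All `Fake*` types and the `RecordLegacyPipelineState` helper are hypothetical stand-ins, not engine code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for RHI state objects.
struct FakeBoundShaderState  { int Id; };
struct FakeDepthStencilState { int Id; };
struct FakeRasterizerState   { int Id; };
struct FakeBlendState        { int Id; };

// A simplified pipeline state object that bundles the individual states.
struct FakeGraphicsPipelineState
{
    FakeBoundShaderState  BoundShaderState;
    FakeDepthStencilState DepthStencilState;
    FakeRasterizerState   RasterizerState;
    FakeBlendState        BlendState;
};

// What callers see: a single "set the whole pipeline state" entry point.
class IFakeCommandContext
{
public:
    virtual ~IFakeCommandContext() = default;
    virtual void SetGraphicsPipelineState(const FakeGraphicsPipelineState& State) = 0;
};

// Fallback layer for APIs without native PSOs (assumed behavior):
// the PSO call is decomposed into the per-state calls a legacy API understands.
class IFakeCommandContextPSOFallback : public IFakeCommandContext
{
public:
    virtual void SetBoundShaderState(const FakeBoundShaderState& State) = 0;
    virtual void SetDepthStencilState(const FakeDepthStencilState& State) = 0;
    virtual void SetRasterizerState(const FakeRasterizerState& State) = 0;
    virtual void SetBlendState(const FakeBlendState& State) = 0;

    void SetGraphicsPipelineState(const FakeGraphicsPipelineState& State) override
    {
        SetBoundShaderState(State.BoundShaderState);
        SetDepthStencilState(State.DepthStencilState);
        SetRasterizerState(State.RasterizerState);
        SetBlendState(State.BlendState);
    }
};

// A legacy-style context that only implements the per-state setters and logs them.
class FakeLegacyContext final : public IFakeCommandContextPSOFallback
{
public:
    std::vector<std::string> Log;

    void SetBoundShaderState(const FakeBoundShaderState& State) override { Log.push_back("BoundShaderState:" + std::to_string(State.Id)); }
    void SetDepthStencilState(const FakeDepthStencilState& State) override { Log.push_back("DepthStencilState:" + std::to_string(State.Id)); }
    void SetRasterizerState(const FakeRasterizerState& State) override { Log.push_back("RasterizerState:" + std::to_string(State.Id)); }
    void SetBlendState(const FakeBlendState& State) override { Log.push_back("BlendState:" + std::to_string(State.Id)); }
};

// Drives the legacy context through the PSO-style interface and returns the recorded calls.
std::vector<std::string> RecordLegacyPipelineState()
{
    FakeLegacyContext Context;
    IFakeCommandContext& Interface = Context;
    Interface.SetGraphicsPipelineState(FakeGraphicsPipelineState{{1}, {2}, {3}, {4}});
    return Context.Log;
}
```

Calling `RecordLegacyPipelineState()` yields four logged setter calls in the order they are forwarded, which is how upper layers can speak only in terms of pipeline state objects regardless of the backend.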

The core inheritance UML diagram of IRHICommandContext is as follows:

classDiagram-v2
IRHIComputeContext <|.. IRHICommandContext
IRHICommandContext <|.. IRHICommandContextPSOFallback
IRHICommandContextPSOFallback <|-- FNullDynamicRHI
IRHICommandContextPSOFallback <|-- FOpenGLDynamicRHI
IRHICommandContextPSOFallback <|-- FD3D11DynamicRHI
IRHICommandContext <|-- FD3D12CommandContextBase
IRHICommandContext <|-- FMetalRHICommandContext
IRHICommandContext <|-- FVulkanCommandListContext
IRHICommandContext <|-- FEmptyDynamicRHI

class IRHIComputeContext{

}

10.3.2 IRHICommandContextContainer

IRHICommandContextContainer is a type that wraps an IRHICommandContext object. Its definition and those of its core subclasses are as follows:

// Engine\Source\Runtime\RHI\Public\RHICommandList.h

class IRHICommandContextContainer
{
public:
virtual ~IRHICommandContextContainer();

// Gets the IRHICommandContext instance.
virtual IRHICommandContext* GetContext();
virtual void SubmitAndFreeContextContainer(int32 Index, int32 Num);
virtual void FinishContext();
};

// Engine\Source\Runtime\Apple\MetalRHI\Private\MetalContext.cpp

class FMetalCommandContextContainer : public IRHICommandContextContainer
{
// The next FMetalRHICommandContext in the list.
FMetalRHICommandContext* CmdContext;
int32 Index;
int32 Num;

public:
void* operator new(size_t Size);
void operator delete(void *RawMemory);

FMetalCommandContextContainer(int32 InIndex, int32 InNum);
virtual ~FMetalCommandContextContainer() override final;

virtual IRHICommandContext* GetContext() override final;
virtual void FinishContext() override final;
// Submits the context and frees this container.
virtual void SubmitAndFreeContextContainer(int32 NewIndex, int32 NewNum) override final;
};

// Allocator for FMetalCommandContextContainer.
static TLockFreeFixedSizeAllocator<sizeof(FMetalCommandContextContainer), PLATFORM_CACHE_LINE_SIZE, FThreadSafeCounter> FMetalCommandContextContainerAllocator;

// Engine\Source\Runtime\D3D12RHI\Private\D3D12CommandContext.cpp

class FD3D12CommandContextContainer : public IRHICommandContextContainer
{
// Adapter.
FD3D12Adapter* Adapter;
// Command context.
FD3D12CommandContext* CmdContext;
// Context redirector.
FD3D12CommandContextRedirector* CmdContextRedirector;
FRHIGPUMask GPUMask;

// Command lists.
TArray<FD3D12CommandListHandle> CommandLists;

public:
void* operator new(size_t Size);
void operator delete(void* RawMemory);

FD3D12CommandContextContainer(FD3D12Adapter* InAdapter, FRHIGPUMask InGPUMask);
virtual ~FD3D12CommandContextContainer() override;

virtual IRHICommandContext* GetContext() override;
virtual void FinishContext() override;
virtual void SubmitAndFreeContextContainer(int32 Index, int32 Num) override;
};

// Engine\Source\Runtime\VulkanRHI\Private\VulkanContext.h

struct FVulkanCommandContextContainer : public IRHICommandContextContainer, public VulkanRHI::FDeviceChild
{
// Command list context.
FVulkanCommandListContext* CmdContext;

FVulkanCommandContextContainer(FVulkanDevice* InDevice);

virtual IRHICommandContext* GetContext() override final;
virtual void FinishContext() override final;
virtual void SubmitAndFreeContextContainer(int32 Index, int32 Num) override final;

void* operator new(size_t Size);
void operator delete(void* RawMemory);
};

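IRHICommandContextContainer acts as a container that holds one command context or a group of them, which allows command lists to be submitted in parallel. Among the graphics APIs, it is implemented only for modern ones such as D3D12, Metal and Vulkan. The full inheritance UML diagram is as follows: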

classDiagram-v2
IRHICommandContextContainer <|-- FMetalCommandContextContainer

class IRHICommandContextContainer{
IRHICommandContext* GetContext()
SubmitAndFreeContextContainer()
FinishContext()
}

class FMetalCommandContextContainer{
FMetalRHICommandContext* CmdContext
}

IRHICommandContextContainer <|-- FD3D12CommandContextContainer
class FD3D12CommandContextContainer{
FD3D12Adapter* Adapter
FD3D12CommandContext* CmdContext
FD3D12CommandContextRedirector* CmdContextRedirector
TArray<FD3D12CommandListHandle> CommandLists
}

IRHICommandContextContainer <|-- FVulkanCommandContextContainer
class FVulkanCommandContextContainer{
FVulkanCommandListContext* CmdContext
}

IRHICommandContextContainer <|-- FValidationRHICommandContextContainer
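To make the role of the `(Index, Num)` parameters more concrete, here is a minimal sketch of how a group of containers might be used: each container records into its own context, is finished, and is then submitted in index order. The policy of holding finished lists until the last index arrives is an assumption made for illustration, and every `Fake*` name and the `RecordAndSubmit` helper are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a platform command context.
struct FakeContext
{
    std::vector<std::string> Commands;
    void Draw(int Id) { Commands.push_back("Draw" + std::to_string(Id)); }
};

// Hypothetical GPU queue that receives finished command lists.
struct FakeQueue
{
    std::vector<std::string> Pending;   // finished lists waiting for the rest of the batch
    std::vector<std::string> Executed;  // lists handed over to the "GPU"
};

// Hypothetical container owning exactly one context.
class FakeContextContainer
{
public:
    explicit FakeContextContainer(FakeQueue& InQueue) : Queue(InQueue) {}

    FakeContext* GetContext() { return &Context; }

    // Marks recording as complete; no more commands may be added afterwards.
    void FinishContext() { bFinished = true; }

    // Appends this container's commands to the batch. When the last container
    // of the batch (Index == Num - 1) arrives, the whole batch is executed.
    void SubmitContextContainer(int Index, int Num)
    {
        assert(bFinished);
        Queue.Pending.insert(Queue.Pending.end(), Context.Commands.begin(), Context.Commands.end());
        if (Index == Num - 1)
        {
            Queue.Executed.insert(Queue.Executed.end(), Queue.Pending.begin(), Queue.Pending.end());
            Queue.Pending.clear();
        }
    }

private:
    FakeQueue& Queue;
    FakeContext Context;
    bool bFinished = false;
};

// Records two draws per container, then submits the containers in index order.
std::vector<std::string> RecordAndSubmit(int Num)
{
    FakeQueue Queue;
    std::vector<std::unique_ptr<FakeContextContainer>> Containers;
    for (int Index = 0; Index < Num; ++Index)
    {
        Containers.push_back(std::make_unique<FakeContextContainer>(Queue));
    }

    // Each iteration touches only its own container, so in a real renderer
    // this loop body could run as an independent worker task.
    for (int Index = 0; Index < Num; ++Index)
    {
        FakeContext* Context = Containers[Index]->GetContext();
        Context->Draw(Index * 2);
        Context->Draw(Index * 2 + 1);
        Containers[Index]->FinishContext();
    }

    // Submission happens in order; the container is released right after it is submitted.
    for (int Index = 0; Index < Num; ++Index)
    {
        Containers[Index]->SubmitContextContainer(Index, Num);
        Containers[Index].reset();
    }
    return Queue.Executed;
}
```

The point of the design is that recording is independent per container while submission stays ordered, so the GPU sees the same command order as a serial recording would produce.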

10.3.3 FDynamicRHI

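FDynamicRHI is the interface implemented by the dynamically bound RHI. The interfaces it defines are similar to those of the command list and the command context. Part of it is shown below: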

class RHI_API FDynamicRHI
{
public:
virtual ~FDynamicRHI() {}

virtual void Init() = 0;
virtual void PostInit() {}
virtual void Shutdown() = 0;

void InitPixelFormatInfo(const TArray<uint32>& PixelFormatBlockBytesIn);

// ---- RHI interfaces ----

// The following interfaces require FlushType: Thread safe
virtual FSamplerStateRHIRef RHICreateSamplerState(const FSamplerStateInitializerRHI& Initializer) = 0;
virtual FRasterizerStateRHIRef RHICreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer) = 0;
virtual FDepthStencilStateRHIRef RHICreateDepthStencilState(const FDepthStencilStateInitializerRHI& Initializer) = 0;
virtual FBlendStateRHIRef RHICreateBlendState(const FBlendStateInitializerRHI& Initializer) = 0;

// The following interfaces require FlushType: Wait RHI Thread
virtual FVertexDeclarationRHIRef RHICreateVertexDeclaration(const FVertexDeclarationElementList& Elements) = 0;
virtual FPixelShaderRHIRef RHICreatePixelShader(TArrayView<const uint8> Code, const FSHAHash& Hash) = 0;
virtual FVertexShaderRHIRef RHICreateVertexShader(TArrayView<const uint8> Code, const FSHAHash& Hash) = 0;
virtual FHullShaderRHIRef RHICreateHullShader(TArrayView<const uint8> Code, const FSHAHash& Hash) = 0;
virtual FDomainShaderRHIRef RHICreateDomainShader(TArrayView<const uint8> Code, const FSHAHash& Hash) = 0;
virtual FGeometryShaderRHIRef RHICreateGeometryShader(TArrayView<const uint8> Code, const FSHAHash& Hash) = 0;
virtual FComputeShaderRHIRef RHICreateComputeShader(TArrayView<const uint8> Code, const FSHAHash& Hash) = 0;

// FlushType: Must be Thread-Safe.
virtual FRenderQueryPoolRHIRef RHICreateRenderQueryPool(ERenderQueryType QueryType, uint32 NumQueries = UINT32_MAX);
inline FComputeFenceRHIRef RHICreateComputeFence(const FName& Name);

virtual FGPUFenceRHIRef RHICreateGPUFence(const FName &Name);
virtual void RHICreateTransition(FRHITransition* Transition, ERHIPipeline SrcPipelines, ERHIPipeline DstPipelines, ERHICreateTransitionFlags CreateFlags, TArrayView<const FRHITransitionInfo> Infos);
virtual void RHIReleaseTransition(FRHITransition* Transition);

// FlushType: Thread safe.
virtual FStagingBufferRHIRef RHICreateStagingBuffer();
virtual void* RHILockStagingBuffer(FRHIStagingBuffer* StagingBuffer, FRHIGPUFence* Fence, uint32 Offset, uint32 SizeRHI);
virtual void RHIUnlockStagingBuffer(FRHIStagingBuffer* StagingBuffer);

// FlushType: Thread safe, but varies depending on the RHI
virtual FBoundShaderStateRHIRef RHICreateBoundShaderState(FRHIVertexDeclaration* VertexDeclaration, FRHIVertexShader* VertexShader, FRHIHullShader* HullShader, FRHIDomainShader* DomainShader, FRHIPixelShader* PixelShader, FRHIGeometryShader* GeometryShader) = 0;
// FlushType: Thread safe
virtual FGraphicsPipelineStateRHIRef RHICreateGraphicsPipelineState(const FGraphicsPipelineStateInitializer& Initializer);

// FlushType: Thread safe, but varies depending on the RHI
virtual FUniformBufferRHIRef RHICreateUniformBuffer(const void* Contents, const FRHIUniformBufferLayout& Layout, EUniformBufferUsage Usage, EUniformBufferValidation Validation) = 0;
virtual void RHIUpdateUniformBuffer(FRHIUniformBuffer* UniformBufferRHI, const void* Contents) = 0;

// FlushType: Wait RHI Thread
virtual FIndexBufferRHIRef RHICreateIndexBuffer(uint32 Stride, uint32 Size, uint32 InUsage, ERHIAccess InResourceState, FRHIResourceCreateInfo& CreateInfo) = 0;
virtual void* RHILockIndexBuffer(FRHICommandListImmediate& RHICmdList, FRHIIndexBuffer* IndexBuffer, uint32 Offset, uint32 Size, EResourceLockMode LockMode);
virtual void RHIUnlockIndexBuffer(FRHICommandListImmediate& RHICmdList, FRHIIndexBuffer* IndexBuffer);
virtual void RHITransferIndexBufferUnderlyingResource(FRHIIndexBuffer* DestIndexBuffer, FRHIIndexBuffer* SrcIndexBuffer);

// FlushType: Wait RHI Thread
virtual FVertexBufferRHIRef RHICreateVertexBuffer(uint32 Size, uint32 InUsage, ERHIAccess InResourceState, FRHIResourceCreateInfo& CreateInfo) = 0;
// FlushType: Flush RHI Thread
virtual void* RHILockVertexBuffer(FRHICommandListImmediate& RHICmdList, FRHIVertexBuffer* VertexBuffer, uint32 Offset, uint32 SizeRHI, EResourceLockMode LockMode);
virtual void RHIUnlockVertexBuffer(FRHICommandListImmediate& RHICmdList, FRHIVertexBuffer* VertexBuffer);
// FlushType: Flush Immediate (seems dangerous)
virtual void RHICopyVertexBuffer(FRHIVertexBuffer* SourceBuffer, FRHIVertexBuffer* DestBuffer) = 0;
virtual void RHITransferVertexBufferUnderlyingResource(FRHIVertexBuffer* DestVertexBuffer, FRHIVertexBuffer* SrcVertexBuffer);

// FlushType: Wait RHI Thread
virtual FStructuredBufferRHIRef RHICreateStructuredBuffer(uint32 Stride, uint32 Size, uint32 InUsage, ERHIAccess InResourceState, FRHIResourceCreateInfo& CreateInfo) = 0;
// FlushType: Flush RHI Thread
virtual void* RHILockStructuredBuffer(FRHICommandListImmediate& RHICmdList, FRHIStructuredBuffer* StructuredBuffer, uint32 Offset, uint32 SizeRHI, EResourceLockMode LockMode);
virtual void RHIUnlockStructuredBuffer(FRHICommandListImmediate& RHICmdList, FRHIStructuredBuffer* StructuredBuffer);

// FlushType: Wait RHI Thread
virtual FUnorderedAccessViewRHIRef RHICreateUnorderedAccessView(FRHIStructuredBuffer* StructuredBuffer, bool bUseUAVCounter, bool bAppendBuffer) = 0;
// FlushType: Wait RHI Thread
virtual FUnorderedAccessViewRHIRef RHICreateUnorderedAccessView(FRHITexture* Texture, uint32 MipLevel) = 0;
// FlushType: Wait RHI Thread
virtual FUnorderedAccessViewRHIRef RHICreateUnorderedAccessView(FRHITexture* Texture, uint32 MipLevel, uint8 Format);

(......)

// Per-frame RHI tick; must be called from the main thread. FlushType: Thread safe
virtual void RHITick(float DeltaTime) = 0;
// Blocks the CPU until the GPU has finished its work and become idle. FlushType: Flush Immediate (seems wrong)
virtual void RHIBlockUntilGPUIdle() = 0;
// Kicks off the current frame and makes sure the GPU is actively working on it. FlushType: Flush Immediate (copied from RHIBlockUntilGPUIdle)
virtual void RHISubmitCommandsAndFlushGPU() {};

// Notifies the RHI that it is about to be suspended.
virtual void RHIBeginSuspendRendering() {};
// Suspends RHI rendering and yields control to the system. FlushType: Thread safe
virtual void RHISuspendRendering() {};
// Resumes RHI rendering. FlushType: Thread safe
virtual void RHIResumeRendering() {};
// FlushType: Flush Immediate
virtual bool RHIIsRenderingSuspended() { return false; };

// FlushType: called from render thread when RHI thread is flushed
// Called once per frame, just before the deferred deletions in FRHIResource::FlushPendingDeletes.
virtual void RHIPerFrameRHIFlushComplete();

// Executes a command list. FlushType: Wait RHI Thread
virtual void RHIExecuteCommandList(FRHICommandList* CmdList) = 0;

// FlushType: Flush RHI Thread
virtual void* RHIGetNativeDevice() = 0;
// FlushType: Flush RHI Thread
virtual void* RHIGetNativeInstance() = 0;

// Gets the default command context. FlushType: Thread safe
virtual IRHICommandContext* RHIGetDefaultContext() = 0;
// Gets the default async compute context. FlushType: Thread safe
virtual IRHIComputeContext* RHIGetDefaultAsyncComputeContext();

// FlushType: Thread safe
virtual class IRHICommandContextContainer* RHIGetCommandContextContainer(int32 Index, int32 Num) = 0;

// Interfaces called directly by the render thread to optimize RHI calls.
virtual FVertexBufferRHIRef CreateAndLockVertexBuffer_RenderThread(class FRHICommandListImmediate& RHICmdList, uint32 Size, uint32 InUsage, ERHIAccess InResourceState, FRHIResourceCreateInfo& CreateInfo, void*& OutDataBuffer);
virtual FIndexBufferRHIRef CreateAndLockIndexBuffer_RenderThread(class FRHICommandListImmediate& RHICmdList, uint32 Stride, uint32 Size, uint32 InUsage, ERHIAccess InResourceState, FRHIResourceCreateInfo& CreateInfo, void*& OutDataBuffer);

(......)

// Buffer Lock/Unlock
virtual void* LockVertexBuffer_BottomOfPipe(class FRHICommandListImmediate& RHICmdList, ...);
virtual void* LockIndexBuffer_BottomOfPipe(class FRHICommandListImmediate& RHICmdList, ...);

(......)
};

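Only part of the interface is shown above. Some of these functions must be called from the render thread, and some from the game thread. Most of them require a specific kind of flush before they are called, for example: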

class RHI_API FDynamicRHI
{
// FlushType: Wait RHI Thread
void RHIExecuteCommandList(FRHICommandList* CmdList);

// FlushType: Flush Immediate
void RHIBlockUntilGPUIdle();

// FlushType: Thread safe
void RHITick(float DeltaTime);
};

The code that calls the interfaces above then looks like this:

class RHI_API FRHICommandListImmediate : public FRHICommandList
{
void ExecuteCommandList(FRHICommandList* CmdList)
{
// Wait for the RHI thread.
FScopedRHIThreadStaller StallRHIThread(*this);
GDynamicRHI->RHIExecuteCommandList(CmdList);
}

void BlockUntilGPUIdle()
{
// Calling FDynamicRHI::RHIBlockUntilGPUIdle requires flushing the RHI thread first.
ImmediateFlush(EImmediateFlushType::FlushRHIThread);
GDynamicRHI->RHIBlockUntilGPUIdle();
}

void Tick(float DeltaTime)
{
// FDynamicRHI::RHITick is thread safe, so there is no need to call ImmediateFlush or wait on any event.
GDynamicRHI->RHITick(DeltaTime);
}
};
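The effect of the different flush types can be illustrated with a small, self-contained model. The sketch below assumes a command list that stores deferred commands as closures: a "thread safe" call goes straight to the RHI, while a call that needs a flush first drains the pending commands. The `Fake*` types and `RunFlushDemo` are hypothetical and model only the ordering, not real threading.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the dynamic RHI; it just records what reaches it.
struct FakeDynamicRHI
{
    std::vector<std::string> Trace;
    void Tick() { Trace.push_back("Tick"); }
    void BlockUntilGPUIdle() { Trace.push_back("BlockUntilGPUIdle"); }
};

// Hypothetical immediate command list that defers draw commands.
class FakeImmediateCommandList
{
public:
    explicit FakeImmediateCommandList(FakeDynamicRHI& InRHI) : RHI(InRHI) {}

    // Deferred command: recorded now, translated only when the list is flushed.
    void EnqueueDraw(int Id)
    {
        Pending.push_back([this, Id] { RHI.Trace.push_back("Draw" + std::to_string(Id)); });
    }

    // Executes and discards every pending command.
    void ImmediateFlush()
    {
        for (auto& Command : Pending)
        {
            Command();
        }
        Pending.clear();
    }

    // "Thread safe" flush type: call the RHI directly, pending commands stay queued.
    void Tick() { RHI.Tick(); }

    // "Flush" flush type: pending commands must reach the RHI before the call is made.
    void BlockUntilGPUIdle()
    {
        ImmediateFlush();
        RHI.BlockUntilGPUIdle();
    }

private:
    FakeDynamicRHI& RHI;
    std::vector<std::function<void()>> Pending;
};

// Interleaves deferred draws with a thread-safe call and a flushing call.
std::vector<std::string> RunFlushDemo()
{
    FakeDynamicRHI RHI;
    FakeImmediateCommandList CmdList(RHI);
    CmdList.EnqueueDraw(1);
    CmdList.Tick();               // bypasses the queue
    CmdList.EnqueueDraw(2);
    CmdList.BlockUntilGPUIdle();  // drains Draw1 and Draw2 first
    return RHI.Trace;
}
```

In this model `Tick` overtakes the queued draw, whereas `BlockUntilGPUIdle` is only issued after both draws have been translated, which is the ordering guarantee a flush exists to provide.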

Next, let's look at the definitions of the FDynamicRHI subclasses:

// Engine\Source\Runtime\Apple\MetalRHI\Private\MetalDynamicRHI.h

class FMetalDynamicRHI : public FDynamicRHI
{
public:
FMetalDynamicRHI(ERHIFeatureLevel::Type RequestedFeatureLevel);
~FMetalDynamicRHI();

// Sets up the required internal resources.
void SetupRecursiveResources();

// FDynamicRHI interface.
virtual void Init();
virtual void Shutdown() {}
virtual const TCHAR* GetName() override { return TEXT("Metal"); }

virtual FSamplerStateRHIRef RHICreateSamplerState(const FSamplerStateInitializerRHI& Initializer) final override;
virtual FRasterizerStateRHIRef RHICreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer) final override;
virtual FDepthStencilStateRHIRef RHICreateDepthStencilState(...) final override;

(......)

private:
// Immediate-mode context.
FMetalRHIImmediateCommandContext ImmediateContext;
// Async compute context.
FMetalRHICommandContext* AsyncComputeContext;
// Vertex declaration cache.
TMap<uint32, FVertexDeclarationRHIRef> VertexDeclarationCache;
};

// Engine\Source\Runtime\D3D12RHI\Private\D3D12RHIPrivate.h

class FD3D12DynamicRHI : public FDynamicRHI
{
static FD3D12DynamicRHI* SingleD3DRHI; public:
static D3D12RHI_API FD3D12DynamicRHI* GetD3DRHI() { return SingleD3DRHI; } FD3D12DynamicRHI(const TArray<TSharedPtr<FD3D12Adapter>>& ChosenAdaptersIn, bool bInPixEventEnabled);
virtual ~FD3D12DynamicRHI(); // FDynamicRHI interface.
virtual void Init() override;
virtual void PostInit() override;
virtual void Shutdown() override;
virtual const TCHAR* GetName() override { return TEXT("D3D12"); } virtual FSamplerStateRHIRef RHICreateSamplerState(const FSamplerStateInitializerRHI& Initializer) final override;
virtual FRasterizerStateRHIRef RHICreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer) final override;
virtual FDepthStencilStateRHIRef RHICreateDepthStencilState(const FDepthStencilStateInitializerRHI& Initializer) final override; (......) protected:
// 已選擇的介面卡.
TArray<TSharedPtr<FD3D12Adapter>> ChosenAdapters;
// AMD AGS工具庫上下文.
AGSContext* AmdAgsContext; // D3D12裝置.
inline FD3D12Device* GetRHIDevice(uint32 GPUIndex)
{
return GetAdapter().GetDevice(GPUIndex);
} (......)
};

// Engine\Source\Runtime\EmptyRHI\Public\EmptyRHI.h

class FEmptyDynamicRHI : public FDynamicRHI, public IRHICommandContext
{
(......)
};

// Engine\Source\Runtime\NullDrv\Public\NullRHI.h

class FNullDynamicRHI : public FDynamicRHI , public IRHICommandContextPSOFallback
{
(......)
};

class OPENGLDRV_API FOpenGLDynamicRHI final : public FDynamicRHI, public IRHICommandContextPSOFallback
{
public:
FOpenGLDynamicRHI();
~FOpenGLDynamicRHI(); // FDynamicRHI interface.
virtual void Init();
virtual void PostInit(); virtual void Shutdown();
virtual const TCHAR* GetName() override { return TEXT("OpenGL"); } virtual FRasterizerStateRHIRef RHICreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer) final override;
virtual FDepthStencilStateRHIRef RHICreateDepthStencilState(const FDepthStencilStateInitializerRHI& Initializer) final override;
virtual FBlendStateRHIRef RHICreateBlendState(const FBlendStateInitializerRHI& Initializer) final override; (......) private:
// 計數器.
uint32 SceneFrameCounter;
uint32 ResourceTableFrameCounter; // RHI裝置狀態, 獨立於使用的底層OpenGL上下文.
FOpenGLRHIState PendingState;
FOpenGLStreamedVertexBufferArray DynamicVertexBuffers;
FOpenGLStreamedIndexBufferArray DynamicIndexBuffers;
FSamplerStateRHIRef PointSamplerState; // 已建立的視口.
TArray<FOpenGLViewport*> Viewports;
TRefCountPtr<FOpenGLViewport> DrawingViewport;
bool bRevertToSharedContextAfterDrawingViewport; // 已繫結的著色器狀態歷史.
TGlobalResource< TBoundShaderStateHistory<10000> > BoundShaderStateHistory; // 逐上下文狀態快取.
FOpenGLContextState InvalidContextState;
FOpenGLContextState SharedContextState;
FOpenGLContextState RenderingContextState; // 統一緩衝區.
TArray<FRHIUniformBuffer*> GlobalUniformBuffers;
TMap<GLuint, TPair<GLenum, GLenum>> TextureMipLimits; // 底層平臺相關的資料.
FPlatformOpenGLDevice* PlatformDevice; // 查詢相關.
TArray<FOpenGLRenderQuery*> Queries;
FCriticalSection QueriesListCriticalSection; // 配置和呈現資料.
FOpenGLGPUProfiler GPUProfilingData;
FCriticalSection CustomPresentSection;
TRefCountPtr<class FRHICustomPresent> CustomPresent; (......)
};

// Engine\Source\Runtime\RHI\Public\RHIValidation.h

class FValidationRHI : public FDynamicRHI
{
public:
RHI_API FValidationRHI(FDynamicRHI* InRHI);
RHI_API virtual ~FValidationRHI(); virtual FSamplerStateRHIRef RHICreateSamplerState(const FSamplerStateInitializerRHI& Initializer) override final;
virtual FRasterizerStateRHIRef RHICreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer) override final;
virtual FDepthStencilStateRHIRef RHICreateDepthStencilState(const FDepthStencilStateInitializerRHI& Initializer) override final; (......) // RHI例項.
FDynamicRHI* RHI;
// 所屬的上下文.
TIndirectArray<IRHIComputeContext> OwnedContexts;
// 深度模板狀態列表.
TMap<FRHIDepthStencilState*, FDepthStencilStateInitializerRHI> DepthStencilStates;
};

// Engine\Source\Runtime\VulkanRHI\Public\VulkanDynamicRHI.h

class FVulkanDynamicRHI : public FDynamicRHI
{
public:
FVulkanDynamicRHI();
~FVulkanDynamicRHI(); // FDynamicRHI interface.
virtual void Init() final override;
virtual void PostInit() final override;
virtual void Shutdown() final override;
virtual const TCHAR* GetName() final override { return TEXT("Vulkan"); } void InitInstance(); virtual FSamplerStateRHIRef RHICreateSamplerState(const FSamplerStateInitializerRHI& Initializer) final override;
virtual FRasterizerStateRHIRef RHICreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer) final override;
virtual FDepthStencilStateRHIRef RHICreateDepthStencilState(const FDepthStencilStateInitializerRHI& Initializer) final override; (......) protected:
// 例項.
VkInstance Instance;
TArray<const ANSICHAR*> InstanceExtensions;
TArray<const ANSICHAR*> InstanceLayers; // 裝置.
TArray<FVulkanDevice*> Devices;
FVulkanDevice* Device; // 視口.
TArray<FVulkanViewport*> Viewports;
TRefCountPtr<FVulkanViewport> DrawingViewport; // 快取.
IConsoleObject* SavePipelineCacheCmd = nullptr;
IConsoleObject* RebuildPipelineCacheCmd = nullptr; // 臨界區.
FCriticalSection LockBufferCS; // 內部介面.
void CreateInstance();
void SelectAndInitDevice();
void InitGPU(FVulkanDevice* Device);
void InitDevice(FVulkanDevice* Device); (......)
};

// Engine\Source\Runtime\Windows\D3D11RHI\Private\D3D11RHIPrivate.h

class D3D11RHI_API FD3D11DynamicRHI : public FDynamicRHI, public IRHICommandContextPSOFallback
{
public:
FD3D11DynamicRHI(IDXGIFactory1* InDXGIFactory1,D3D_FEATURE_LEVEL InFeatureLevel,int32 InChosenAdapter, const DXGI_ADAPTER_DESC& ChosenDescription);
virtual ~FD3D11DynamicRHI(); virtual void InitD3DDevice(); // FDynamicRHI interface.
virtual void Init() override;
virtual void PostInit() override;
virtual void Shutdown() override;
virtual const TCHAR* GetName() override { return TEXT("D3D11"); } // HDR display output
virtual void EnableHDR();
virtual void ShutdownHDR(); virtual FSamplerStateRHIRef RHICreateSamplerState(const FSamplerStateInitializerRHI& Initializer) final override;
virtual FRasterizerStateRHIRef RHICreateRasterizerState(const FRasterizerStateInitializerRHI& Initializer) final override;
virtual FDepthStencilStateRHIRef RHICreateDepthStencilState(const FDepthStencilStateInitializerRHI& Initializer) final override; (......) ID3D11Device* GetDevice() const
{
return Direct3DDevice;
}
FD3D11DeviceContext* GetDeviceContext() const
{
return Direct3DDeviceIMContext;
}
IDXGIFactory1* GetFactory() const
{
return DXGIFactory1;
} protected:
// D3D工廠(介面).
TRefCountPtr<IDXGIFactory1> DXGIFactory1;
// D3D裝置.
TRefCountPtr<FD3D11Device> Direct3DDevice;
// D3D裝置的立即上下文.
TRefCountPtr<FD3D11DeviceContext> Direct3DDeviceIMContext; // 執行緒鎖.
FD3D11LockTracker LockTracker;
FCriticalSection LockTrackerCS; // 視口.
TArray<FD3D11Viewport*> Viewports;
TRefCountPtr<FD3D11Viewport> DrawingViewport; // AMD AGS工具庫上下文.
AGSContext* AmdAgsContext; // RT, UAV, 著色器等資源.
TRefCountPtr<ID3D11RenderTargetView> CurrentRenderTargets[D3D11_SIMULTANEOUS_RENDER_TARGET_COUNT];
TRefCountPtr<FD3D11UnorderedAccessView> CurrentUAVs[D3D11_PS_CS_UAV_REGISTER_COUNT];
ID3D11UnorderedAccessView* UAVBound[D3D11_PS_CS_UAV_REGISTER_COUNT];
TRefCountPtr<ID3D11DepthStencilView> CurrentDepthStencilTarget;
TRefCountPtr<FD3D11TextureBase> CurrentDepthTexture;
FD3D11BaseShaderResource* CurrentResourcesBoundAsSRVs[SF_NumStandardFrequencies][D3D11_COMMONSHADER_INPUT_RESOURCE_SLOT_COUNT];
FD3D11BaseShaderResource* CurrentResourcesBoundAsVBs[D3D11_IA_VERTEX_INPUT_RESOURCE_SLOT_COUNT];
FD3D11BaseShaderResource* CurrentResourceBoundAsIB;
int32 MaxBoundShaderResourcesIndex[SF_NumStandardFrequencies];
FUniformBufferRHIRef BoundUniformBuffers[SF_NumStandardFrequencies][MAX_UNIFORM_BUFFERS_PER_SHADER_STAGE];
uint16 DirtyUniformBuffers[SF_NumStandardFrequencies];
TArray<FRHIUniformBuffer*> GlobalUniformBuffers; // 已建立的常量緩衝區.
TArray<TRefCountPtr<FD3D11ConstantBuffer> > VSConstantBuffers;
TArray<TRefCountPtr<FD3D11ConstantBuffer> > HSConstantBuffers;
TArray<TRefCountPtr<FD3D11ConstantBuffer> > DSConstantBuffers;
TArray<TRefCountPtr<FD3D11ConstantBuffer> > PSConstantBuffers;
TArray<TRefCountPtr<FD3D11ConstantBuffer> > GSConstantBuffers;
TArray<TRefCountPtr<FD3D11ConstantBuffer> > CSConstantBuffers; // 已繫結的著色器狀態歷史.
TGlobalResource< TBoundShaderStateHistory<10000> > BoundShaderStateHistory;
FComputeShaderRHIRef CurrentComputeShader; (......)
};

Their core inheritance UML diagram is as follows:

classDiagram-v2
IRHIComputeContext <|.. IRHICommandContext
IRHICommandContext <|.. IRHICommandContextPSOFallback

class FDynamicRHI{
void* RHIGetNativeDevice()
void* RHIGetNativeInstance()
IRHICommandContext* RHIGetDefaultContext()
IRHIComputeContext* RHIGetDefaultAsyncComputeContext()
IRHICommandContextContainer* RHIGetCommandContextContainer()
}
FDynamicRHI <|-- FMetalDynamicRHI
class FMetalDynamicRHI{
FMetalRHIImmediateCommandContext ImmediateContext
FMetalRHICommandContext* AsyncComputeContext
}
FDynamicRHI <|-- FD3D12DynamicRHI
class FD3D12DynamicRHI{
static FD3D12DynamicRHI* SingleD3DRHI
FD3D12Adapter* ChosenAdapters
FD3D12Device* GetRHIDevice()
}

FDynamicRHI <|-- FD3D11DynamicRHI
IRHICommandContextPSOFallback <|-- FD3D11DynamicRHI
class FD3D11DynamicRHI{
IDXGIFactory1* DXGIFactory1
FD3D11Device* Direct3DDevice
FD3D11DeviceContext* Direct3DDeviceIMContext
}

FDynamicRHI <|-- FOpenGLDynamicRHI
IRHICommandContextPSOFallback <|-- FOpenGLDynamicRHI
class FOpenGLDynamicRHI{
FPlatformOpenGLDevice* PlatformDevice
}

FDynamicRHI <|-- FValidationRHI
class FValidationRHI{
}

FDynamicRHI <|-- FVulkanDynamicRHI
class FVulkanDynamicRHI{
VkInstance Instance
FVulkanDevice* Devices
}

FDynamicRHI <|-- FEmptyDynamicRHI
IRHICommandContext <|-- FEmptyDynamicRHI

FDynamicRHI <|-- FNullDynamicRHI
IRHICommandContextPSOFallback <|-- FNullDynamicRHI

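Note that the traditional graphics APIs (D3D11, OpenGL) inherit not only FDynamicRHI but also IRHICommandContextPSOFallback, because they rely on the latter's interfaces to handle PSO data and behavior, which keeps PSO handling consistent between traditional and modern APIs. For the same reason, the dynamic RHIs of the modern graphics APIs (D3D12, Vulkan, Metal) do not need to inherit any type in the IRHICommandContext hierarchy; inheriting FDynamicRHI alone is enough to handle all RHI-layer data and operations.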

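Since the DynamicRHI classes of the modern graphics APIs (D3D12, Vulkan, Metal) do not inherit any type in the IRHICommandContext hierarchy, how do they implement FDynamicRHI::RHIGetDefaultContext? FD3D12DynamicRHI serves as the example below: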

IRHICommandContext* FD3D12DynamicRHI::RHIGetDefaultContext()
{
FD3D12Adapter& Adapter = GetAdapter();

IRHICommandContext* DefaultCommandContext = nullptr;
if (GNumExplicitGPUsForRendering > 1) // Multiple GPUs
{
DefaultCommandContext = static_cast<IRHICommandContext*>(&Adapter.GetDefaultContextRedirector());
}
else // Single GPU
{
FD3D12Device* Device = Adapter.GetDevice(0);
DefaultCommandContext = static_cast<IRHICommandContext*>(&Device->GetDefaultCommandContext());
}

return DefaultCommandContext;
}

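Whether there is a single GPU or several, the returned pointer is obtained through a static cast: from the adapter's FD3D12CommandContextRedirector in the multi-GPU case, and from the device's default FD3D12CommandContext in the single-GPU case. Both types derive indirectly from IRHICommandContext (through FD3D12CommandContextBase), so the static upcast is entirely safe.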
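The two ways of providing a default context can be contrasted in a short sketch. It assumes the legacy-style RHI inherits the context interface and therefore returns itself, while the modern-style RHI returns a separate context object owned by a device and upcasts it. All `Fake*` names and `DefaultContextName` are hypothetical illustrations, not engine types.

```cpp
#include <cassert>
#include <string>

// Hypothetical command context interface.
class IFakeCommandContext
{
public:
    virtual ~IFakeCommandContext() = default;
    virtual std::string Name() const = 0;
};

// Hypothetical dynamic RHI interface.
class FakeDynamicRHI
{
public:
    virtual ~FakeDynamicRHI() = default;
    virtual IFakeCommandContext* GetDefaultContext() = 0;
};

// Legacy style (assumed): the RHI object is also the command context,
// so the default context is simply the RHI itself.
class FakeLegacyRHI final : public FakeDynamicRHI, public IFakeCommandContext
{
public:
    IFakeCommandContext* GetDefaultContext() override { return this; }
    std::string Name() const override { return "legacy-self"; }
};

// Modern style: contexts form their own hierarchy, separate from the RHI.
class FakeContextBase : public IFakeCommandContext {};

class FakeDeviceContext final : public FakeContextBase
{
public:
    std::string Name() const override { return "device-context"; }
};

// Hypothetical device that owns the default context.
class FakeDevice
{
public:
    FakeDeviceContext& GetDefaultCommandContext() { return DefaultContext; }

private:
    FakeDeviceContext DefaultContext;
};

class FakeModernRHI final : public FakeDynamicRHI
{
public:
    IFakeCommandContext* GetDefaultContext() override
    {
        // Derived-to-base conversion; the explicit static_cast mirrors the listing above.
        return static_cast<IFakeCommandContext*>(&Device.GetDefaultCommandContext());
    }

private:
    FakeDevice Device;
};

// Callers only ever see the two interfaces.
std::string DefaultContextName(FakeDynamicRHI& RHI)
{
    return RHI.GetDefaultContext()->Name();
}
```

Because callers go through `GetDefaultContext()` in both cases, they never need to know whether the context is the RHI object itself or a separate object behind it.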

10.3.3.1 FD3D11DynamicRHI

FD3D11DynamicRHI contains or references several core D3D11 platform types, which are defined as follows:

// Engine\Source\Runtime\Windows\D3D11RHI\Private\D3D11RHIPrivate.h

class D3D11RHI_API FD3D11DynamicRHI : public FDynamicRHI, public IRHICommandContextPSOFallback
{
(......)

protected:
// D3D factory (interface).
TRefCountPtr<IDXGIFactory1> DXGIFactory1;
// D3D device.
TRefCountPtr<FD3D11Device> Direct3DDevice;
// Immediate context of the D3D device.
TRefCountPtr<FD3D11DeviceContext> Direct3DDeviceIMContext;

// Viewports.
TArray<FD3D11Viewport*> Viewports;
TRefCountPtr<FD3D11Viewport> DrawingViewport;

// AMD AGS utility library context.
AGSContext* AmdAgsContext;

(......)
};

// Engine\Source\Runtime\Windows\D3D11RHI\Private\Windows\D3D11RHIBasePrivate.h

typedef ID3D11DeviceContext FD3D11DeviceContext;
typedef ID3D11Device FD3D11Device;

// Engine\Source\Runtime\Windows\D3D11RHI\Public\D3D11Viewport.h

class FD3D11Viewport : public FRHIViewport
{
public:
FD3D11Viewport(class FD3D11DynamicRHI* InD3DRHI) : D3DRHI(InD3DRHI), PresentFailCount(0), ValidState (0), FrameSyncEvent(InD3DRHI) {}
FD3D11Viewport(class FD3D11DynamicRHI* InD3DRHI, HWND InWindowHandle, uint32 InSizeX, uint32 InSizeY, bool bInIsFullscreen, EPixelFormat InPreferredPixelFormat);
~FD3D11Viewport();

virtual void Resize(uint32 InSizeX, uint32 InSizeY, bool bInIsFullscreen, EPixelFormat PreferredPixelFormat);
void ConditionalResetSwapChain(bool bIgnoreFocus);
void CheckHDRMonitorStatus();

// Presents the swap chain.
bool Present(bool bLockToVsync);

// Accessors.
FIntPoint GetSizeXY() const;
FD3D11Texture2D* GetBackBuffer() const;
EColorSpaceAndEOTF GetPixelColorSpace() const;

void WaitForFrameEventCompletion();
void IssueFrameEvent();

IDXGISwapChain* GetSwapChain() const;
virtual void* GetNativeSwapChain() const override;
virtual void* GetNativeBackBufferTexture() const override;
virtual void* GetNativeBackBufferRT() const override;

virtual void SetCustomPresent(FRHICustomPresent* InCustomPresent) override;
virtual FRHICustomPresent* GetCustomPresent() const;

virtual void* GetNativeWindow(void** AddParam = nullptr) const override;
static FD3D11Texture2D* GetSwapChainSurface(FD3D11DynamicRHI* D3DRHI, EPixelFormat PixelFormat, uint32 SizeX, uint32 SizeY, IDXGISwapChain* SwapChain);

protected:
// Dynamic RHI.
FD3D11DynamicRHI* D3DRHI;
// Swap chain.
TRefCountPtr<IDXGISwapChain> SwapChain;
// Back buffer.
TRefCountPtr<FD3D11Texture2D> BackBuffer;

FD3D11EventQuery FrameSyncEvent;
FCustomPresentRHIRef CustomPresent;

(......)
};

Drawn as a UML diagram, FD3D11DynamicRHI looks like this:

classDiagram-v2
IRHIComputeContext <|.. IRHICommandContext
IRHICommandContext <|.. IRHICommandContextPSOFallback

ID3D11DeviceContext -- FD3D11DeviceContext
ID3D11Device -- FD3D11Device

FDynamicRHI <|-- FD3D11DynamicRHI
IRHICommandContextPSOFallback <|-- FD3D11DynamicRHI
IDXGIFactory1 --* FD3D11DynamicRHI
FD3D11Device --* FD3D11DynamicRHI
FD3D11DeviceContext --* FD3D11DynamicRHI

FRHIResource <|-- FRHIViewport
FRHIViewport <|-- FD3D11Viewport
FD3D11Viewport --o FD3D11DynamicRHI

class FD3D11DynamicRHI{
IDXGIFactory1* DXGIFactory1
FD3D11Device* Direct3DDevice
FD3D11DeviceContext* Direct3DDeviceIMContext
FD3D11Viewport* Viewports
}

10.3.3.2 FOpenGLDynamicRHI

The core types related to FOpenGLDynamicRHI are defined as follows:

class OPENGLDRV_API FOpenGLDynamicRHI  final : public FDynamicRHI, public IRHICommandContextPSOFallback
{
(......)

private:
// Created viewports.
TArray<FOpenGLViewport*> Viewports;
// Underlying platform-specific data.
FPlatformOpenGLDevice* PlatformDevice;
};

// Engine\Source\Runtime\OpenGLDrv\Public\OpenGLResources.h

class FOpenGLViewport : public FRHIViewport
{
public:
FOpenGLViewport(class FOpenGLDynamicRHI* InOpenGLRHI,void* InWindowHandle,uint32 InSizeX,uint32 InSizeY,bool bInIsFullscreen,EPixelFormat PreferredPixelFormat);
~FOpenGLViewport();

void Resize(uint32 InSizeX,uint32 InSizeY,bool bInIsFullscreen);

// Accessors.
FIntPoint GetSizeXY() const;
FOpenGLTexture2D *GetBackBuffer() const;
bool IsFullscreen( void ) const;

void WaitForFrameEventCompletion();
void IssueFrameEvent();
virtual void* GetNativeWindow(void** AddParam) const override;

struct FPlatformOpenGLContext* GetGLContext() const;
FOpenGLDynamicRHI* GetOpenGLRHI() const;

virtual void SetCustomPresent(FRHICustomPresent* InCustomPresent) override;
FRHICustomPresent* GetCustomPresent() const;

private:
FOpenGLDynamicRHI* OpenGLRHI;
struct FPlatformOpenGLContext* OpenGLContext;
uint32 SizeX;
uint32 SizeY;
bool bIsFullscreen;
EPixelFormat PixelFormat;
bool bIsValid;
TRefCountPtr<FOpenGLTexture2D> BackBuffer;
FOpenGLEventQuery FrameSyncEvent;
FCustomPresentRHIRef CustomPresent;
};

// Engine\Source\Runtime\OpenGLDrv\Private\Android\AndroidOpenGL.cpp

// OpenGL device on Android.
struct FPlatformOpenGLDevice
{
bool TargetDirty;

void SetCurrentSharedContext();
void SetCurrentRenderingContext();
void SetupCurrentContext();
void SetCurrentNULLContext();

FPlatformOpenGLDevice();
~FPlatformOpenGLDevice();

void Init();
void LoadEXT();
void Terminate();
void ReInit();
};

// Engine\Source\Runtime\OpenGLDrv\Private\Windows\OpenGLWindows.cpp

// OpenGL device on Windows.
struct FPlatformOpenGLDevice
{
FPlatformOpenGLContext SharedContext;
FPlatformOpenGLContext RenderingContext;
TArray<FPlatformOpenGLContext*> ViewportContexts;
bool TargetDirty;

/** Guards against operating on viewport contexts from more than one thread at the same time. */
FCriticalSection* ContextUsageGuard;
};

// Engine\Source\Runtime\OpenGLDrv\Private\Lumin\LuminOpenGL.cpp

// OpenGL device on Lumin.
struct FPlatformOpenGLDevice
{
void SetCurrentSharedContext();
void SetCurrentRenderingContext();
void SetCurrentNULLContext();

FPlatformOpenGLDevice();
~FPlatformOpenGLDevice();

void Init();
void LoadEXT();
void Terminate();
void ReInit();
};

// Engine\Source\Runtime\OpenGLDrv\Private\Linux\OpenGLLinux.cpp

// OpenGL device on Linux.
struct FPlatformOpenGLDevice
{
FPlatformOpenGLContext SharedContext;
FPlatformOpenGLContext RenderingContext;
int32 NumUsedContexts;
FCriticalSection* ContextUsageGuard;
};

// Engine\Source\Runtime\OpenGLDrv\Private\Lumin\LuminGL4.cpp

// OpenGL device on Lumin.
struct FPlatformOpenGLDevice
{
FPlatformOpenGLContext SharedContext;
FPlatformOpenGLContext RenderingContext;
TArray<FPlatformOpenGLContext*> ViewportContexts;
bool TargetDirty;
FCriticalSection* ContextUsageGuard;
};

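As shown above, the definition of the OpenGL device object differs from one operating system to another. The OpenGL context is likewise platform-specific; the Windows version is shown below as an example: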

// Engine\Source\Runtime\OpenGLDrv\Private\Windows\OpenGLWindows.cpp

struct FPlatformOpenGLContext
{
// Window handle.
HWND WindowHandle;
// Device context.
HDC DeviceContext;
// OpenGL context.
HGLRC OpenGLContext;

// Other members.
bool bReleaseWindowOnDestroy;
int32 SyncInterval;
GLuint ViewportFramebuffer;
GLuint VertexArrayObject; // one has to be generated and set for each context (OpenGL 3.2 Core requirements)
GLuint BackBufferResource;
GLenum BackBufferTarget;
};

The UML diagram of FOpenGLDynamicRHI is shown below:

classDiagram-v2
IRHIComputeContext <|.. IRHICommandContext
IRHICommandContext <|.. IRHICommandContextPSOFallback

FDynamicRHI <|-- FOpenGLDynamicRHI
IRHICommandContextPSOFallback <|-- FOpenGLDynamicRHI

FPlatformOpenGLDevice --* FOpenGLDynamicRHI

FRHIResource <|-- FRHIViewport
FRHIViewport <|-- FOpenGLViewport
FOpenGLViewport --o FOpenGLDynamicRHI

FPlatformOpenGLDevice o-- FPlatformOpenGLContext

class FOpenGLDynamicRHI{
FOpenGLViewport* Viewports
FPlatformOpenGLDevice* PlatformDevice
}

10.3.3.3 FD3D12DynamicRHI

The core types of FD3D12DynamicRHI are defined as follows:

// Engine\Source\Runtime\D3D12RHI\Private\D3D12RHIPrivate.h

class FD3D12DynamicRHI : public FDynamicRHI
{
(......)

protected:
// The chosen adapters.
TArray<TSharedPtr<FD3D12Adapter>> ChosenAdapters;

// D3D12 device.
inline FD3D12Device* GetRHIDevice(uint32 GPUIndex)
{
return GetAdapter().GetDevice(GPUIndex);
}

(......)
};

// Engine\Source\Runtime\D3D12RHI\Private\D3D12Adapter.h

class FD3D12Adapter : public FNoncopyable
{
public:
void Initialize(FD3D12DynamicRHI* RHI);
void InitializeDevices();
void InitializeRayTracing();

// Resource creation.
HRESULT CreateCommittedResource(...);
HRESULT CreateBuffer(...);
template <typename BufferType>
BufferType* CreateRHIBuffer(...);

inline FD3D12CommandContextRedirector& GetDefaultContextRedirector();
inline FD3D12CommandContextRedirector& GetDefaultAsyncComputeContextRedirector();
FD3D12FastConstantAllocator& GetTransientUniformBufferAllocator();

void BlockUntilIdle();

(......)

protected:
virtual void CreateRootDevice(bool bWithDebug);

FD3D12DynamicRHI* OwningRHI;

// An LDA setup has a single ID3D12Device.
TRefCountPtr<ID3D12Device> RootDevice;
TRefCountPtr<ID3D12Device1> RootDevice1;

TRefCountPtr<IDXGIAdapter> DxgiAdapter;

TRefCountPtr<IDXGIFactory> DxgiFactory;
TRefCountPtr<IDXGIFactory2> DxgiFactory2;

// Each device represents one physical GPU "node".
FD3D12Device* Devices[MAX_NUM_GPUS];

FD3D12CommandContextRedirector DefaultContextRedirector;
FD3D12CommandContextRedirector DefaultAsyncComputeContextRedirector;

TArray<FD3D12Viewport*> Viewports;
TRefCountPtr<FD3D12Viewport> DrawingViewport;

(......)
};

// Engine\Source\Runtime\D3D12RHI\Private\D3D12RHICommon.h

class FD3D12AdapterChild
{
protected:
FD3D12Adapter* ParentAdapter;

(......)
};

class FD3D12DeviceChild
{
protected:
FD3D12Device* Parent;

(......)
};

// Engine\Source\Runtime\D3D12RHI\Private\D3D12Device.h

class FD3D12Device : public FD3D12SingleNodeGPUObject, public FNoncopyable, public FD3D12AdapterChild
{
public:
TArray<FD3D12CommandListHandle> PendingCommandLists;

void Initialize();
void CreateCommandContexts();
void InitPlatformSpecific();
virtual void Cleanup();
bool GetQueryData(FD3D12RenderQuery& Query, bool bWait);

ID3D12Device* GetDevice();

void BlockUntilIdle();
bool IsGPUIdle();

FD3D12SamplerState* CreateSampler(const FSamplerStateInitializerRHI& Initializer);

(......)

protected:
// CommandListManager
FD3D12CommandListManager* CommandListManager;
FD3D12CommandListManager* CopyCommandListManager;
FD3D12CommandListManager* AsyncCommandListManager;
FD3D12CommandAllocatorManager TextureStreamingCommandAllocatorManager;

// Allocator
FD3D12OfflineDescriptorManager RTVAllocator;
FD3D12OfflineDescriptorManager DSVAllocator;
FD3D12OfflineDescriptorManager SRVAllocator;
FD3D12OfflineDescriptorManager UAVAllocator;
FD3D12DefaultBufferAllocator DefaultBufferAllocator;

// FD3D12CommandContext
TArray<FD3D12CommandContext*> CommandContextArray;
TArray<FD3D12CommandContext*> FreeCommandContexts;
TArray<FD3D12CommandContext*> AsyncComputeContextArray;

(......)
};

// Engine\Source\Runtime\D3D12RHI\Public\D3D12Viewport.h

class FD3D12Viewport : public FRHIViewport, public FD3D12AdapterChild
{
public:
void Init();
void Resize(uint32 InSizeX, uint32 InSizeY, bool bInIsFullscreen, EPixelFormat PreferredPixelFormat);

void ConditionalResetSwapChain(bool bIgnoreFocus);
bool Present(bool bLockToVsync);

void WaitForFrameEventCompletion();
bool CurrentOutputSupportsHDR() const;

(......)

private:
HWND WindowHandle;

#if D3D12_VIEWPORT_EXPOSES_SWAP_CHAIN
TRefCountPtr<IDXGISwapChain1> SwapChain1;
TRefCountPtr<IDXGISwapChain4> SwapChain4;
#endif

TArray<TRefCountPtr<FD3D12Texture2D>> BackBuffers;
TRefCountPtr<FD3D12Texture2D> DummyBackBuffer_RenderThread;
uint32 CurrentBackBufferIndex_RHIThread;
FD3D12Texture2D* BackBuffer_RHIThread;
TArray<TRefCountPtr<FD3D12Texture2D>> SDRBackBuffers;
TRefCountPtr<FD3D12Texture2D> SDRDummyBackBuffer_RenderThread;
FD3D12Texture2D* SDRBackBuffer_RHIThread;

bool CheckHDRSupport();
void EnableHDR();
void ShutdownHDR();

(......)
};

// Engine\Source\Runtime\D3D12RHI\Private\D3D12CommandContext.h

class FD3D12CommandContextBase : public IRHICommandContext, public FD3D12AdapterChild
{
public:
FD3D12CommandContextBase(class FD3D12Adapter* InParent, FRHIGPUMask InGPUMask, bool InIsDefaultContext, bool InIsAsyncComputeContext);

void RHIBeginDrawingViewport(FRHIViewport* Viewport, FRHITexture* RenderTargetRHI) final override;
void RHIEndDrawingViewport(FRHIViewport* Viewport, bool bPresent, bool bLockToVsync) final override;
void RHIBeginFrame() final override;
void RHIEndFrame() final override;

(......)

protected:
virtual FD3D12CommandContext* GetContext(uint32 InGPUIndex) = 0;

FRHIGPUMask GPUMask;

(......)
};

class FD3D12CommandContext : public FD3D12CommandContextBase, public FD3D12DeviceChild
{
public:
FD3D12CommandContext(class FD3D12Device* InParent, bool InIsDefaultContext, bool InIsAsyncComputeContext);
virtual ~FD3D12CommandContext();

void EndFrame();
void ConditionalObtainCommandAllocator();
void ReleaseCommandAllocator();

FD3D12CommandListManager& GetCommandListManager();
void OpenCommandList();
void CloseCommandList();

FD3D12CommandListHandle FlushCommands(bool WaitForCompletion = false, EFlushCommandsExtraAction ExtraAction = FCEA_None);
void Finish(TArray<FD3D12CommandListHandle>& CommandLists);

FD3D12FastConstantAllocator ConstantsAllocator;
FD3D12CommandListHandle CommandListHandle;
FD3D12CommandAllocator* CommandAllocator;
FD3D12CommandAllocatorManager CommandAllocatorManager;

FD3D12DynamicRHI& OwningRHI;

// State Block.
FD3D12RenderTargetView* CurrentRenderTargets[D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT];
FD3D12DepthStencilView* CurrentDepthStencilTarget;
FD3D12TextureBase* CurrentDepthTexture;
uint32 NumSimultaneousRenderTargets;

// Uniform Buffer.
FD3D12UniformBuffer* BoundUniformBuffers[SF_NumStandardFrequencies][MAX_CBS];
FUniformBufferRHIRef BoundUniformBufferRefs[SF_NumStandardFrequencies][MAX_CBS];
uint16 DirtyUniformBuffers[SF_NumStandardFrequencies];

// Constant buffers.
FD3D12ConstantBuffer VSConstantBuffer;
FD3D12ConstantBuffer HSConstantBuffer;
FD3D12ConstantBuffer DSConstantBuffer;
FD3D12ConstantBuffer PSConstantBuffer;
FD3D12ConstantBuffer GSConstantBuffer;
FD3D12ConstantBuffer CSConstantBuffer;

template <class ShaderType> void SetResourcesFromTables(const ShaderType* RESTRICT);
template <class ShaderType> uint32 SetUAVPSResourcesFromTables(const ShaderType* RESTRICT Shader);
void CommitGraphicsResourceTables();
void CommitComputeResourceTables(FD3D12ComputeShader* ComputeShader);
void ValidateExclusiveDepthStencilAccess(FExclusiveDepthStencil Src) const;
void CommitRenderTargetsAndUAVs();

virtual void SetDepthBounds(float MinDepth, float MaxDepth);
virtual void SetShadingRate(EVRSShadingRate ShadingRate, EVRSRateCombiner Combiner);

(......)

protected:
FD3D12CommandContext* GetContext(uint32 InGPUIndex) final override;
TArray<FRHIUniformBuffer*> GlobalUniformBuffers;
};

class FD3D12CommandContextRedirector final : public FD3D12CommandContextBase
{
public:
FD3D12CommandContextRedirector(class FD3D12Adapter* InParent, bool InIsDefaultContext, bool InIsAsyncComputeContext);

virtual void RHISetComputeShader(FRHIComputeShader* ComputeShader) final override;
virtual void RHISetComputePipelineState(FRHIComputePipelineState* ComputePipelineState) final override;
virtual void RHIDispatchComputeShader(uint32 ThreadGroupCountX, uint32 ThreadGroupCountY, uint32 ThreadGroupCountZ) final override;

(......)

private:
FRHIGPUMask PhysicalGPUMask;
FD3D12CommandContext* PhysicalContexts[MAX_NUM_GPUS];
};

// Engine\Source\Runtime\D3D12RHI\Private\D3D12CommandContext.cpp

class FD3D12CommandContextContainer : public IRHICommandContextContainer
{
FD3D12Adapter* Adapter;
FD3D12CommandContext* CmdContext;
FD3D12CommandContextRedirector* CmdContextRedirector;
FRHIGPUMask GPUMask;
TArray<FD3D12CommandListHandle> CommandLists;

(......)
};

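As the code above shows, the D3D12 RHI involves a large number of core types arranged in a deep, multi-level chain of data structures. Their ownership hierarchy is laid out as follows: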

[Engine]--
        |
        |-[RHI]--
                |
                |-[Adapter]-- (LDA)
                |           |
                |           |- [Device]
                |           |
                |           |- [Device]
                |
                |-[Adapter]--
                            |
                            |- [Device]--
                                        |
                                        |-[CommandContext]
                                        |
                                        |-[CommandContext]---
                                                            |
                                                            |-[StateCache]

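Under this scheme, an FD3D12Device represents one node belonging to one physical adapter. This structure lets a single RHI control several different hardware setups, for example:

  • Single-GPU systems (the common case).
  • Multi-GPU systems, i.e. LDA (Crossfire/SLI).
  • Asymmetric multi-GPU systems, i.e. discrete and integrated GPUs cooperating.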
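
The adapter/device/context hierarchy described above can be sketched with a few simplified stand-in types. All names below (`FAdapterSketch`, `FDeviceSketch`, `FCommandContextSketch`, `WalkHierarchyDemo`) are hypothetical and are not engine classes; the sketch only assumes what the declarations show, namely that an adapter owns one device per GPU node, a device owns its command contexts, and each child keeps a back-pointer to its parent in the manner of `FD3D12AdapterChild` and `FD3D12DeviceChild`.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; not the engine's classes.
struct FAdapterSketch;
struct FDeviceSketch;

// Plays the role of a device child: it remembers which device owns it.
struct FCommandContextSketch
{
    explicit FCommandContextSketch(FDeviceSketch* InParent) : Parent(InParent) {}
    FDeviceSketch* Parent;
};

// Plays the role of an adapter child: one instance per GPU node.
struct FDeviceSketch
{
    FDeviceSketch(FAdapterSketch* InAdapter, uint32_t InGPUIndex)
        : ParentAdapter(InAdapter), GPUIndex(InGPUIndex) {}

    FCommandContextSketch& CreateCommandContext()
    {
        Contexts.push_back(std::make_unique<FCommandContextSketch>(this));
        return *Contexts.back();
    }

    FAdapterSketch* ParentAdapter;
    uint32_t GPUIndex;
    std::vector<std::unique_ptr<FCommandContextSketch>> Contexts;
};

// One adapter owns every device (node) that belongs to it.
struct FAdapterSketch
{
    explicit FAdapterSketch(uint32_t NumGPUs)
    {
        for (uint32_t Index = 0; Index < NumGPUs; ++Index)
        {
            Devices.push_back(std::make_unique<FDeviceSketch>(this, Index));
        }
    }
    // Non-copyable: children hold raw back-pointers to this object.
    FAdapterSketch(const FAdapterSketch&) = delete;
    FAdapterSketch& operator=(const FAdapterSketch&) = delete;

    FDeviceSketch* GetDevice(uint32_t GPUIndex) { return Devices.at(GPUIndex).get(); }

    std::vector<std::unique_ptr<FDeviceSketch>> Devices;
};

// Builds a two-node (LDA-style) adapter and walks from a context back to its adapter.
inline bool WalkHierarchyDemo()
{
    FAdapterSketch Adapter(2);
    FCommandContextSketch& Context = Adapter.GetDevice(1)->CreateCommandContext();
    assert(Context.Parent != nullptr);
    return Context.Parent->GPUIndex == 1 && Context.Parent->ParentAdapter == &Adapter;
}
```

Calling `WalkHierarchyDemo()` from a test or a `main` function exercises the whole chain. A single-GPU system is simply the case `FAdapterSketch Adapter(1)`, while an asymmetric setup would correspond to two separate adapter objects.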

Abstracting the core D3D12 classes into a UML diagram gives the following:

classDiagram-v2
IRHIComputeContext <|.. IRHICommandContext
FDynamicRHI <|-- FD3D12DynamicRHI

FD3D12DynamicRHI o-- FD3D12Adapter
FNoncopyable <|-- FD3D12Adapter
ID3D12Device --* FD3D12Adapter
IDXGIAdapter --* FD3D12Adapter
IDXGIFactory --* FD3D12Adapter
FD3D12Device --o FD3D12Adapter
FD3D12Viewport --o FD3D12Adapter

FNoncopyable <|-- FD3D12Device
FD3D12AdapterChild <|-- FD3D12Device
FD3D12CommandListManager --o FD3D12Device
FD3D12CommandContext --o FD3D12Device

FRHIViewport <|-- FD3D12Viewport
FD3D12AdapterChild <|-- FD3D12Viewport

IRHICommandContext <|-- FD3D12CommandContextBase
FD3D12AdapterChild <|-- FD3D12CommandContextBase

FD3D12CommandContextBase <|-- FD3D12CommandContext
FD3D12DeviceChild <|-- FD3D12CommandContext

FD3D12CommandContextBase <|-- FD3D12CommandContextRedirector
FD3D12CommandContext --o FD3D12CommandContextRedirector

IRHICommandContextContainer <|-- FD3D12CommandContextContainer
FD3D12Adapter <-- FD3D12CommandContextContainer
FD3D12CommandContext <-- FD3D12CommandContextContainer
FD3D12CommandContextRedirector <-- FD3D12CommandContextContainer
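
The `PhysicalContexts[MAX_NUM_GPUS]` array and the GPU mask held by FD3D12CommandContextRedirector suggest a fan-out pattern: one logical call is forwarded to the context of every GPU selected by the mask. The following is a minimal sketch of that idea under this assumption. The types `FPhysicalContextSketch` and `FContextRedirectorSketch`, the constant `kMaxNumGPUs`, and the plain `uint32_t` bitmask are hypothetical simplifications, not the engine's implementation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr uint32_t kMaxNumGPUs = 4; // hypothetical limit for this sketch

// Hypothetical per-GPU context that only counts the dispatches it receives.
struct FPhysicalContextSketch
{
    uint32_t NumDispatches = 0;
    void DispatchCompute(uint32_t, uint32_t, uint32_t) { ++NumDispatches; }
};

// Hypothetical redirector: forwards one call to every GPU selected by the mask.
class FContextRedirectorSketch
{
public:
    void SetPhysicalContext(uint32_t GPUIndex, FPhysicalContextSketch* Context)
    {
        assert(GPUIndex < kMaxNumGPUs);
        PhysicalContexts[GPUIndex] = Context;
    }

    // Bit i set means "GPU i receives subsequent commands".
    void SetGPUMask(uint32_t InMask) { GPUMask = InMask; }

    void DispatchCompute(uint32_t X, uint32_t Y, uint32_t Z)
    {
        for (uint32_t GPUIndex = 0; GPUIndex < kMaxNumGPUs; ++GPUIndex)
        {
            const bool bSelected = (GPUMask & (1u << GPUIndex)) != 0;
            if (bSelected && PhysicalContexts[GPUIndex] != nullptr)
            {
                PhysicalContexts[GPUIndex]->DispatchCompute(X, Y, Z);
            }
        }
    }

private:
    uint32_t GPUMask = 0;
    std::array<FPhysicalContextSketch*, kMaxNumGPUs> PhysicalContexts{};
};

// Dispatches once to GPU 1 only, then once to both GPUs.
// Encodes the result as GPU0Count * 10 + GPU1Count.
inline uint32_t RedirectDemo()
{
    FPhysicalContextSketch GPU0, GPU1;
    FContextRedirectorSketch Redirector;
    Redirector.SetPhysicalContext(0, &GPU0);
    Redirector.SetPhysicalContext(1, &GPU1);

    Redirector.SetGPUMask(0b10); // GPU 1 only
    Redirector.DispatchCompute(8, 8, 1);
    Redirector.SetGPUMask(0b11); // GPU 0 and GPU 1
    Redirector.DispatchCompute(8, 8, 1);

    return GPU0.NumDispatches * 10 + GPU1.NumDispatches;
}
```

With this design the caller records commands once and the redirector decides which physical contexts see them, so single-GPU and multi-GPU code paths can look identical at the call site.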


10.3.3.4 FVulkanDynamicRHI

The core classes involved in FVulkanDynamicRHI are as follows:

// Engine\Source\Runtime\VulkanRHI\Public\VulkanDynamicRHI.h

class FVulkanDynamicRHI : public FDynamicRHI
{
public:
// FDynamicRHI interface.
virtual void Init() final override;
virtual void PostInit() final override;
virtual void Shutdown() final override;
void InitInstance();

(......)

protected:
// Instance.
VkInstance Instance;

// Devices.
TArray<FVulkanDevice*> Devices;
FVulkanDevice* Device;

// Viewports.
TArray<FVulkanViewport*> Viewports;

(......)
};

// Engine\Source\Runtime\VulkanRHI\Private\VulkanDevice.h

class FVulkanDevice
{
public:
FVulkanDevice(FVulkanDynamicRHI* InRHI, VkPhysicalDevice Gpu);
~FVulkanDevice();

bool QueryGPU(int32 DeviceIndex);
void InitGPU(int32 DeviceIndex);
void CreateDevice();
void PrepareForDestroy();
void Destroy();

void WaitUntilIdle();
void PrepareForCPURead();
void SubmitCommandsAndFlushGPU();

(......)

private:
void SubmitCommands(FVulkanCommandListContext* Context);

// Vulkan logical device.
VkDevice Device;
// Vulkan physical device.
VkPhysicalDevice Gpu;

VkPhysicalDeviceProperties GpuProps;
VkPhysicalDeviceFeatures PhysicalFeatures;

// Managers.
VulkanRHI::FDeviceMemoryManager DeviceMemoryManager;
VulkanRHI::FMemoryManager MemoryManager;
VulkanRHI::FDeferredDeletionQueue2 DeferredDeletionQueue;
VulkanRHI::FStagingManager StagingManager;
VulkanRHI::FFenceManager FenceManager;
FVulkanDescriptorPoolsManager* DescriptorPoolsManager = nullptr;

FVulkanDescriptorSetCache* DescriptorSetCache = nullptr;
FVulkanShaderFactory ShaderFactory;

// Queues.
FVulkanQueue* GfxQueue;
FVulkanQueue* ComputeQueue;
FVulkanQueue* TransferQueue;
FVulkanQueue* PresentQueue;

// GPU vendor.
EGpuVendorId VendorId = EGpuVendorId::NotQueried;

// Command list contexts.
FVulkanCommandListContextImmediate* ImmediateContext;
FVulkanCommandListContext* ComputeContext;
TArray<FVulkanCommandListContext*> CommandContexts;

FVulkanDynamicRHI* RHI = nullptr;
class FVulkanPipelineStateCacheManager* PipelineStateCache;

(......)
};

// Engine\Source\Runtime\VulkanRHI\Private\VulkanQueue.h

class FVulkanQueue
{
public:
FVulkanQueue(FVulkanDevice* InDevice, uint32 InFamilyIndex);
~FVulkanQueue();

void Submit(FVulkanCmdBuffer* CmdBuffer, uint32 NumSignalSemaphores = 0, VkSemaphore* SignalSemaphores = nullptr);
void Submit(FVulkanCmdBuffer* CmdBuffer, VkSemaphore SignalSemaphore);

void GetLastSubmittedInfo(FVulkanCmdBuffer*& OutCmdBuffer, uint64& OutFenceCounter) const;

(......)

private:
// Vulkan queue.
VkQueue Queue;
// Queue family index.
uint32 FamilyIndex;
// Queue index.
uint32 QueueIndex;
FVulkanDevice* Device;

// Vulkan command buffer tracking.
FVulkanCmdBuffer* LastSubmittedCmdBuffer;
uint64 LastSubmittedCmdBufferFenceCounter;
uint64 SubmitCounter;
mutable FCriticalSection CS;

void UpdateLastSubmittedCommandBuffer(FVulkanCmdBuffer* CmdBuffer);
};

// Engine\Source\Runtime\VulkanRHI\Public\VulkanMemory.h

// Device child.
class FDeviceChild
{
public:
FDeviceChild(FVulkanDevice* InDevice = nullptr);

(......)

protected:
FVulkanDevice* Device;
};

// Engine\Source\Runtime\VulkanRHI\Private\VulkanContext.h

class FVulkanCommandListContext : public IRHICommandContext
{
public:
FVulkanCommandListContext(FVulkanDynamicRHI* InRHI, FVulkanDevice* InDevice, FVulkanQueue* InQueue, FVulkanCommandListContext* InImmediate);
virtual ~FVulkanCommandListContext();

static inline FVulkanCommandListContext& GetVulkanContext(IRHICommandContext& CmdContext);

inline bool IsImmediate() const;

virtual void RHISetStreamSource(uint32 StreamIndex, FRHIVertexBuffer* VertexBuffer, uint32 Offset) final override;
virtual void RHISetViewport(float MinX, float MinY, float MinZ, float MaxX, float MaxY, float MaxZ) final override;
virtual void RHISetScissorRect(bool bEnable, uint32 MinX, uint32 MinY, uint32 MaxX, uint32 MaxY) final override;

(......)

inline FVulkanDevice* GetDevice() const;
void PrepareParallelFromBase(const FVulkanCommandListContext& BaseContext);

protected:
FVulkanDynamicRHI* RHI;
FVulkanCommandListContext* Immediate;
FVulkanDevice* Device;
FVulkanQueue* Queue;

FVulkanUniformBufferUploader* UniformBufferUploader;
FVulkanCommandBufferManager* CommandBufferManager;
static FVulkanLayoutManager LayoutManager;

private:
FVulkanGPUProfiler GpuProfiler;
TArray<FRHIUniformBuffer*> GlobalUniformBuffers;

(......)
};

// Immediate-mode command list context.
class FVulkanCommandListContextImmediate : public FVulkanCommandListContext
{
public:
FVulkanCommandListContextImmediate(FVulkanDynamicRHI* InRHI, FVulkanDevice* InDevice, FVulkanQueue* InQueue);
};

// Command context container.
struct FVulkanCommandContextContainer : public IRHICommandContextContainer, public VulkanRHI::FDeviceChild
{
FVulkanCommandListContext* CmdContext;

FVulkanCommandContextContainer(FVulkanDevice* InDevice);

virtual IRHICommandContext* GetContext() override final;
virtual void FinishContext() override final;
virtual void SubmitAndFreeContextContainer(int32 Index, int32 Num) override final;

void* operator new(size_t Size);
void operator delete(void* RawMemory);

(......)
};

// Engine\Source\Runtime\VulkanRHI\Private\VulkanViewport.h

class FVulkanViewport : public FRHIViewport, public VulkanRHI::FDeviceChild
{
public:
FVulkanViewport(FVulkanDynamicRHI* InRHI, FVulkanDevice* InDevice, void* InWindowHandle, uint32 InSizeX,uint32 InSizeY,bool bInIsFullscreen, EPixelFormat InPreferredPixelFormat);
~FVulkanViewport();

void AdvanceBackBufferFrame(FRHICommandListImmediate& RHICmdList);
void WaitForFrameEventCompletion();

virtual void SetCustomPresent(FRHICustomPresent* InCustomPresent) override final;
virtual FRHICustomPresent* GetCustomPresent() const override final;
virtual void Tick(float DeltaTime) override final;
bool Present(FVulkanCommandListContext* Context, FVulkanCmdBuffer* CmdBuffer, FVulkanQueue* Queue, FVulkanQueue* PresentQueue, bool bLockToVsync);

(......)

protected:
TArray<VkImage, TInlineAllocator<NUM_BUFFERS*2>> BackBufferImages;
TArray<VulkanRHI::FSemaphore*, TInlineAllocator<NUM_BUFFERS*2>> RenderingDoneSemaphores;
TArray<FVulkanTextureView, TInlineAllocator<NUM_BUFFERS*2>> TextureViews;
TRefCountPtr<FVulkanBackBuffer> RHIBackBuffer;
TRefCountPtr<FVulkanTexture2D> RenderingBackBuffer;

/** narrow-scoped section that locks access to back buffer during its recreation*/
FCriticalSection RecreatingSwapchain;

FVulkanDynamicRHI* RHI;
FVulkanSwapChain* SwapChain;
void* WindowHandle;
VulkanRHI::FSemaphore* AcquiredSemaphore;
FCustomPresentRHIRef CustomPresent;
FVulkanCmdBuffer* LastFrameCommandBuffer = nullptr;

(......)
};

Drawn as a UML diagram, the core types of the Vulkan RHI look like this:

classDiagram-v2
FDynamicRHI <|-- FVulkanDynamicRHI
VkInstance --* FVulkanDynamicRHI
FVulkanDevice --o FVulkanDynamicRHI
FVulkanViewport --o FVulkanDynamicRHI

FRHIResource <|-- FRHIViewport
FRHIViewport <|-- FVulkanViewport
FDeviceChild <|-- FVulkanViewport

VkDevice --* FVulkanDevice
VkPhysicalDevice --* FVulkanDevice
FVulkanQueue --o FVulkanDevice

FVulkanCommandListContext --o FVulkanDevice
FVulkanCommandListContextImmediate --* FVulkanDevice

VkQueue --* FVulkanQueue

IRHICommandContext <|-- FVulkanCommandListContext
FVulkanCommandListContext <|-- FVulkanCommandListContextImmediate

IRHICommandContextContainer <|-- FVulkanCommandContextContainer
FDeviceChild <|-- FVulkanCommandContextContainer
FVulkanCommandListContext <-- FVulkanCommandContextContainer
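
The FVulkanQueue declaration above pairs `LastSubmittedCmdBuffer`, a fence counter and a submit counter with a `mutable FCriticalSection`, which hints that submissions and queries may come from different threads and that even the `const` getter takes the lock. Below is a minimal sketch of that bookkeeping, assuming `std::mutex` in place of `FCriticalSection`. `FCmdBufferSketch`, `FQueueSketch` and `GetSubmitCount` are hypothetical names, and the actual driver submission is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical command buffer: only carries the fence counter the queue records.
struct FCmdBufferSketch
{
    uint64_t FenceSignaledCounter = 0;
};

// Hypothetical queue that remembers what was submitted last.
class FQueueSketch
{
public:
    void Submit(FCmdBufferSketch* CmdBuffer)
    {
        assert(CmdBuffer != nullptr);
        // A real queue would hand the buffer to the driver here (vkQueueSubmit); omitted.
        std::lock_guard<std::mutex> Lock(CS);
        LastSubmittedCmdBuffer = CmdBuffer;
        LastSubmittedCmdBufferFenceCounter = CmdBuffer->FenceSignaledCounter;
        ++SubmitCounter;
    }

    // const, yet it locks: this is why the mutex member is declared mutable.
    void GetLastSubmittedInfo(FCmdBufferSketch*& OutCmdBuffer, uint64_t& OutFenceCounter) const
    {
        std::lock_guard<std::mutex> Lock(CS);
        OutCmdBuffer = LastSubmittedCmdBuffer;
        OutFenceCounter = LastSubmittedCmdBufferFenceCounter;
    }

    uint64_t GetSubmitCount() const
    {
        std::lock_guard<std::mutex> Lock(CS);
        return SubmitCounter;
    }

private:
    mutable std::mutex CS;
    FCmdBufferSketch* LastSubmittedCmdBuffer = nullptr;
    uint64_t LastSubmittedCmdBufferFenceCounter = 0;
    uint64_t SubmitCounter = 0;
};

// Submits two buffers and checks that the queue reports the second one.
inline bool QueueTrackingDemo()
{
    FCmdBufferSketch First, Second;
    First.FenceSignaledCounter = 7;
    Second.FenceSignaledCounter = 9;

    FQueueSketch Queue;
    Queue.Submit(&First);
    Queue.Submit(&Second);

    FCmdBufferSketch* Last = nullptr;
    uint64_t FenceCounter = 0;
    Queue.GetLastSubmittedInfo(Last, FenceCounter);
    return Last == &Second && FenceCounter == 9 && Queue.GetSubmitCount() == 2;
}
```

Reading the command buffer pointer and its fence counter under one lock keeps the pair consistent, so a caller never sees the pointer from one submission combined with the counter from another.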

10.3.3.5 FMetalDynamicRHI

The core types of FMetalDynamicRHI are defined as follows:

// Engine\Source\Runtime\Apple\MetalRHI\Private\MetalDynamicRHI.h

class FMetalDynamicRHI : public FDynamicRHI
{
public:
// FDynamicRHI interface.
virtual void Init();
virtual void Shutdown() {}

(......)

private:
// Immediate-mode context.
FMetalRHIImmediateCommandContext ImmediateContext;
// Async compute context.
FMetalRHICommandContext* AsyncComputeContext;

(......)
};

// Engine\Source\Runtime\Apple\MetalRHI\Public\MetalRHIContext.h

class FMetalRHICommandContext : public IRHICommandContext
{
public:
FMetalRHICommandContext(class FMetalProfiler* InProfiler, FMetalContext* WrapContext);
virtual ~FMetalRHICommandContext();

virtual void RHISetComputeShader(FRHIComputeShader* ComputeShader) override;
virtual void RHISetComputePipelineState(FRHIComputePipelineState* ComputePipelineState) override;
virtual void RHIDispatchComputeShader(uint32 ThreadGroupCountX, uint32 ThreadGroupCountY, uint32 ThreadGroupCountZ) final override;

(......)

protected:
// Metal context.
FMetalContext* Context;

TSharedPtr<FMetalCommandBufferFence, ESPMode::ThreadSafe> CommandBufferFence;
class FMetalProfiler* Profiler;
FMetalBuffer PendingVertexBuffer;

TArray<FRHIUniformBuffer*> GlobalUniformBuffers;

(......)
};

class FMetalRHIComputeContext : public FMetalRHICommandContext
{
public:
FMetalRHIComputeContext(class FMetalProfiler* InProfiler, FMetalContext* WrapContext);
virtual ~FMetalRHIComputeContext();

virtual void RHISetAsyncComputeBudget(EAsyncComputeBudget Budget) final override;
virtual void RHISetComputeShader(FRHIComputeShader* ComputeShader) final override;
virtual void RHISetComputePipelineState(FRHIComputePipelineState* ComputePipelineState) final override;
virtual void RHISubmitCommandsHint() final override;
};

class FMetalRHIImmediateCommandContext : public FMetalRHICommandContext
{
public:
FMetalRHIImmediateCommandContext(class FMetalProfiler* InProfiler, FMetalContext* WrapContext);

// FRHICommandContext API accessible only on the immediate device context
virtual void RHIBeginDrawingViewport(FRHIViewport* Viewport, FRHITexture* RenderTargetRHI) final override;
virtual void RHIEndDrawingViewport(FRHIViewport* Viewport, bool bPresent, bool bLockToVsync) final override;

(......)
};

// Engine\Source\Runtime\Apple\MetalRHI\Private\MetalContext.h

// Context.
class FMetalContext
{
public:
FMetalContext(mtlpp::Device InDevice, FMetalCommandQueue& Queue, bool const bIsImmediate);
virtual ~FMetalContext();

mtlpp::Device& GetDevice();

bool PrepareToDraw(uint32 PrimitiveType, EMetalIndexType IndexType = EMetalIndexType_None);
void SetRenderPassInfo(const FRHIRenderPassInfo& RenderTargetsInfo, bool const bRestart = false);

void SubmitCommandsHint(uint32 const bFlags = EMetalSubmitFlagsCreateCommandBuffer);
void SubmitCommandBufferAndWait();
void ResetRenderCommandEncoder();

void DrawPrimitive(uint32 PrimitiveType, uint32 BaseVertexIndex, uint32 NumPrimitives, uint32 NumInstances);
void DrawPrimitiveIndirect(uint32 PrimitiveType, FMetalVertexBuffer* VertexBuffer, uint32 ArgumentOffset);
void DrawIndexedPrimitive(FMetalBuffer const& IndexBuffer, ...);
void DrawIndexedIndirect(FMetalIndexBuffer* IndexBufferRHI, ...);
void DrawIndexedPrimitiveIndirect(uint32 PrimitiveType, ...);
void DrawPatches(uint32 PrimitiveType, ...);

(......)

protected:
// Underlying Metal device.
mtlpp::Device Device;

FMetalCommandQueue& CommandQueue;
FMetalCommandList CommandList;

FMetalStateCache StateCache;
FMetalRenderPass RenderPass;

dispatch_semaphore_t CommandBufferSemaphore;
TSharedPtr<FMetalQueryBufferPool, ESPMode::ThreadSafe> QueryBuffer;
TRefCountPtr<FMetalFence> StartFence;
TRefCountPtr<FMetalFence> EndFence;

int32 NumParallelContextsInPass;

(......)
};

// Engine\Source\Runtime\Apple\MetalRHI\Private\MetalCommandQueue.h

class FMetalCommandQueue
{
public:
FMetalCommandQueue(mtlpp::Device Device, uint32 const MaxNumCommandBuffers = 0);
~FMetalCommandQueue(void);

mtlpp::CommandBuffer CreateCommandBuffer(void);
void CommitCommandBuffer(mtlpp::CommandBuffer& CommandBuffer);
void SubmitCommandBuffers(TArray<mtlpp::CommandBuffer> BufferList, uint32 Index, uint32 Count);
FMetalFence* CreateFence(ns::String const& Label) const;
void GetCommittedCommandBufferFences(TArray<mtlpp::CommandBufferFence>& Fences);

mtlpp::Device& GetDevice(void);

static mtlpp::ResourceOptions GetCompatibleResourceOptions(mtlpp::ResourceOptions Options);
static inline bool SupportsFeature(EMetalFeatures InFeature);
static inline bool SupportsSeparateMSAAAndResolveTarget();

(......)

private:
// Device.
mtlpp::Device Device;
// Command queue.
mtlpp::CommandQueue CommandQueue;
// List of command buffers (note that this is an array of arrays).
TArray<TArray<mtlpp::CommandBuffer>> CommandBuffers;

TLockFreePointerListLIFO<mtlpp::CommandBufferFence> CommandBufferFences;
uint64 ParallelCommandLists;
};

// Engine\Source\Runtime\Apple\MetalRHI\Private\MetalCommandList.h

class FMetalCommandList
{
public:
FMetalCommandList(FMetalCommandQueue& InCommandQueue, bool const bInImmediate);
~FMetalCommandList(void);

void Commit(mtlpp::CommandBuffer& Buffer, TArray<ns::Object<mtlpp::CommandBufferHandler>> CompletionHandlers, bool const bWait, bool const bIsLastCommandBuffer);
void Submit(uint32 Index, uint32 Count);

bool IsImmediate(void) const;
bool IsParallel(void) const;
void SetParallelIndex(uint32 Index, uint32 Num);
uint32 GetParallelIndex(void) const;
uint32 GetParallelNum(void) const;

(......)

private:
// The owning FMetalCommandQueue.
FMetalCommandQueue& CommandQueue;
// List of submitted command buffers.
TArray<mtlpp::CommandBuffer> SubmittedBuffers;
};

Compared with the other modern graphics APIs, the concepts and interfaces of FMetalDynamicRHI are far more concise. Its UML diagram is shown below:

classDiagram-v2
FDynamicRHI <|-- FMetalDynamicRHI
FMetalRHIImmediateCommandContext --* FMetalDynamicRHI
FMetalRHICommandContext --* FMetalDynamicRHI

IRHICommandContext <|-- FMetalRHICommandContext
FMetalRHICommandContext <|-- FMetalRHIComputeContext
FMetalRHICommandContext <|-- FMetalRHIImmediateCommandContext

FMetalContext --* FMetalRHICommandContext
mtlpp_Device --* FMetalContext
FMetalCommandQueue --* FMetalContext
FMetalCommandList --* FMetalContext

mtlpp_CommandQueue --* FMetalCommandQueue
mtlpp_CommandBuffer --o TArray_CommandBuffer
TArray_CommandBuffer --o FMetalCommandQueue

mtlpp_CommandBuffer --o FMetalCommandList
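
The "array of arrays" in FMetalCommandQueue, together with the `ParallelCommandLists` bitmask and the `SubmitCommandBuffers(BufferList, Index, Count)` signature, points to ordered submission of parallel command lists. One plausible reading is that each parallel list parks its buffers in slot `Index`, and the queue commits everything in index order once all `Count` slots have arrived. The sketch below illustrates that reading only. `FParallelSubmitQueueSketch` is a hypothetical type, strings stand in for command buffers, and the behaviour is an assumption rather than a confirmed description of the engine.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical queue: strings stand in for command buffers.
class FParallelSubmitQueueSketch
{
public:
    void SubmitCommandBuffers(std::vector<std::string> BufferList, uint32_t Index, uint32_t Count)
    {
        assert(Count > 0 && Count <= 64 && Index < Count);
        Pending.resize(Count);
        Pending[Index] = std::move(BufferList);
        ArrivedMask |= (uint64_t{1} << Index);

        // Bitmask with the lowest Count bits set.
        const uint64_t AllArrived = (Count == 64) ? ~uint64_t{0} : ((uint64_t{1} << Count) - 1);
        if (ArrivedMask == AllArrived)
        {
            // Every slice is present: commit in index order, not arrival order.
            for (uint32_t Slot = 0; Slot < Count; ++Slot)
            {
                for (std::string& Buffer : Pending[Slot])
                {
                    Committed.push_back(std::move(Buffer));
                }
                Pending[Slot].clear();
            }
            ArrivedMask = 0;
        }
    }

    const std::vector<std::string>& GetCommitted() const { return Committed; }

private:
    std::vector<std::vector<std::string>> Pending; // one slot per parallel list
    uint64_t ArrivedMask = 0;                      // bit i set: slot i has arrived
    std::vector<std::string> Committed;            // final GPU submission order
};

// Slices arrive out of order (2, 0, 1) but are committed as 0, 1, 2.
inline std::string ParallelSubmitDemo()
{
    FParallelSubmitQueueSketch Queue;
    Queue.SubmitCommandBuffers({"C"}, 2, 3);
    Queue.SubmitCommandBuffers({"A0", "A1"}, 0, 3);
    assert(Queue.GetCommitted().empty()); // slot 1 is still missing
    Queue.SubmitCommandBuffers({"B"}, 1, 3);

    std::string Order;
    for (const std::string& Name : Queue.GetCommitted())
    {
        Order += Name + " ";
    }
    return Order;
}
```

Deferring the commit until the last slice arrives lets worker threads finish in any order while the GPU still receives the command buffers in the order the frame was recorded.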

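10.3.4 RHI System Overview

Sections 10.2 and 10.3 described in detail the basic concepts and inheritance hierarchy of the RHI system, covering render-layer resources, RHI-layer resources, commands, contexts and the dynamic RHI. They also covered the concrete implementations for each mainstream graphics API and how those relate to the RHI abstraction layer.

Setting aside the implementation details of the individual graphics APIs and the many concrete RHI subclasses, the top-level concepts (RHI Context, CommandList, Command, Resource and so on) can be summarized in the following UML diagram: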

classDiagram-v2
FRHIResource <-- FRenderResource
FRHICommandBase <|-- FRHICommand
FRHIResource <-- FRHICommand
FNoncopyable <|-- FRHICommandListBase
FRHICommandBase <-- FRHICommandListBase
IRHIComputeContext <-- FRHICommandListBase

FRHICommandListBase <|-- FRHIComputeCommandList
FRHIComputeCommandList <|-- FRHICommandList
FRHICommandList <|-- FRHICommandListImmediate

IRHIComputeContext <|.. IRHICommandContext
IRHICommandContext <|.. IRHICommandContextPSOFallback

IRHICommandContext <-- IRHICommandContextContainer

The following UML diagram refines the one above by adding subclasses:

classDiagram-v2
FRHIResource <-- FRenderResource
FRenderResource <|-- FTexture
FRenderResource <|-- FVertexBuffer
FRenderResource <|-- FIndexBuffer

FRHIResource <|-- FRHITexture
FRHIResource <|-- FRHIShader
FRHIResource <|-- FRHIVertexBuffer

FRHICommandBase <|-- FRHICommand

FRHICommand <|-- FRHICommandDrawPrimitive
FRHICommand <|-- FRHICommandResourceTransition
FRHICommand <|-- FRHICommandSetShaderParameter

FRHIResource <-- FRHICommand

FNoncopyable <|-- FRHICommandListBase
class FRHICommandListBase{
FRHICommandBase* Root
IRHICommandContext* Context
}
FRHICommandBase <-- FRHICommandListBase
IRHIComputeContext <-- FRHICommandListBase

FRHICommandListBase <|-- FRHIComputeCommandList
FRHIComputeCommandList <|-- FRHICommandList
FRHICommandList <|-- FRHICommandListImmediate

IRHIComputeContext <|.. IRHICommandContext
IRHICommandContext <|.. IRHICommandContextPSOFallback
IRHICommandContextPSOFallback <|-- FOpenGLDynamicRHI
IRHICommandContextPSOFallback <|-- FD3D11DynamicRHI
IRHICommandContext <|-- FD3D12CommandContextBase
IRHICommandContext <|-- FMetalRHICommandContext
IRHICommandContext <|-- FVulkanCommandListContext

IRHICommandContext <-- IRHICommandContextContainer
IRHICommandContextContainer <|-- FMetalCommandContextContainer
IRHICommandContextContainer <|-- FD3D12CommandContextContainer
IRHICommandContextContainer <|-- FVulkanCommandContextContainer
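
The relationship between command, command list and context in the diagrams can be sketched as a singly linked list of recorded commands that is later replayed against a context. This is a simplified illustration, not the engine's code: every type name ending in `Sketch` is hypothetical, and it uses virtual dispatch and `std::unique_ptr` storage where the engine uses its own allocation scheme. Only the `Root` pointer mirrors a member shown in the diagram.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical context interface: the platform-specific side.
struct IContextSketch
{
    virtual ~IContextSketch() = default;
    virtual void SetViewport(int Width, int Height) = 0;
    virtual void DrawPrimitive(int NumPrimitives) = 0;
};

// Hypothetical command base: a node in the list with an Execute hook.
struct FCommandBaseSketch
{
    virtual ~FCommandBaseSketch() = default;
    virtual void Execute(IContextSketch& Context) = 0;
    FCommandBaseSketch* Next = nullptr;
};

struct FSetViewportCommandSketch final : FCommandBaseSketch
{
    FSetViewportCommandSketch(int InWidth, int InHeight) : Width(InWidth), Height(InHeight) {}
    void Execute(IContextSketch& Context) override { Context.SetViewport(Width, Height); }
    int Width;
    int Height;
};

struct FDrawPrimitiveCommandSketch final : FCommandBaseSketch
{
    explicit FDrawPrimitiveCommandSketch(int InNum) : NumPrimitives(InNum) {}
    void Execute(IContextSketch& Context) override { Context.DrawPrimitive(NumPrimitives); }
    int NumPrimitives;
};

// Hypothetical command list: records now, replays later.
class FCommandListSketch
{
public:
    FCommandListSketch() = default;
    FCommandListSketch(const FCommandListSketch&) = delete; // CommandLink points into this object
    FCommandListSketch& operator=(const FCommandListSketch&) = delete;

    template <typename TCommand, typename... TArgs>
    void Enqueue(TArgs&&... Args)
    {
        Storage.push_back(std::make_unique<TCommand>(std::forward<TArgs>(Args)...));
        FCommandBaseSketch* Command = Storage.back().get();
        assert(*CommandLink == nullptr); // the tail link is always empty
        *CommandLink = Command;          // append to the list
        CommandLink = &Command->Next;    // the new tail link
    }

    void Execute(IContextSketch& Context)
    {
        for (FCommandBaseSketch* Command = Root; Command != nullptr; Command = Command->Next)
        {
            Command->Execute(Context);
        }
        Storage.clear();
        Root = nullptr;
        CommandLink = &Root;
    }

private:
    std::vector<std::unique_ptr<FCommandBaseSketch>> Storage;
    FCommandBaseSketch* Root = nullptr;
    FCommandBaseSketch** CommandLink = &Root;
};

// Context that logs calls instead of talking to a GPU.
struct FLoggingContextSketch final : IContextSketch
{
    void SetViewport(int Width, int Height) override
    {
        Log += "viewport " + std::to_string(Width) + "x" + std::to_string(Height) + ";";
    }
    void DrawPrimitive(int NumPrimitives) override
    {
        Log += "draw " + std::to_string(NumPrimitives) + ";";
    }
    std::string Log;
};

inline std::string RecordAndReplayDemo()
{
    FCommandListSketch CmdList;
    CmdList.Enqueue<FSetViewportCommandSketch>(1280, 720);
    CmdList.Enqueue<FDrawPrimitiveCommandSketch>(12);

    FLoggingContextSketch Context;
    CmdList.Execute(Context);
    return Context.Log;
}
```

Because the list only depends on the abstract context, the same recorded commands could be replayed against any backend that implements the interface, which is the role the platform-specific contexts play in the diagram.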


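10.4 RHI Mechanisms

This chapter describes the mechanisms and principles by which the RHI system operates.

10.4.1 RHI Command Execution

10.4.1.1 FRHICommandListExecutor

FRHICommandListExecutor is responsible for translating the Renderer layer's intermediate RHI commands into calls to the target platform's graphics API (or for invoking that API directly). It plays a pivotal role in the RHI system and is defined as follows: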

// Engine\Source\Runtime\RHI\Public\RHICommandList.h

class RHI_API FRHICommandListExecutor
{
public:
enum
{
DefaultBypass = PLATFORM_RHITHREAD_DEFAULT_BYPASS
};
FRHICommandListExecutor()
: bLatchedBypass(!!DefaultBypass)
, bLatchedUseParallelAlgorithms(false)
{
}

// Static interface: returns the immediate command list.
static inline FRHICommandListImmediate& GetImmediateCommandList();
// Static interface: returns the immediate async compute command list.
static inline FRHIAsyncComputeCommandListImmediate& GetImmediateAsyncComputeCommandList();

// Executes a command list.
void ExecuteList(FRHICommandListBase& CmdList);
void ExecuteList(FRHICommandListImmediate& CmdList);
void LatchBypass();

// Waits on an RHI thread fence.
static void WaitOnRHIThreadFence(FGraphEventRef& Fence);

// Whether to bypass command recording; if so, the target platform's graphics API is called directly.
FORCEINLINE_DEBUGGABLE bool Bypass()
{
#if CAN_TOGGLE_COMMAND_LIST_BYPASS
return bLatchedBypass;
#else
return !!DefaultBypass;
#endif
}
// Whether to use parallel algorithms.
FORCEINLINE_DEBUGGABLE bool UseParallelAlgorithms()
{
#if CAN_TOGGLE_COMMAND_LIST_BYPASS
return bLatchedUseParallelAlgorithms;
#else
return FApp::ShouldUseThreadingForPerformance() && !Bypass() && (GSupportsParallelRenderingTasksWithSeparateRHIThread || !IsRunningRHIInSeparateThread());
#endif
}
static void CheckNoOutstandingCmdLists();
static bool IsRHIThreadActive();
static bool IsRHIThreadCompletelyFlushed();

private:
// Internal execution.
void ExecuteInner(FRHICommandListBase& CmdList);
// Internal execution; this is where the translation actually happens.
static void ExecuteInner_DoExecute(FRHICommandListBase& CmdList);

bool bLatchedBypass;
bool bLatchedUseParallelAlgorithms;

// Synchronization counters.
FThreadSafeCounter UIDCounter;
FThreadSafeCounter OutstandingCmdListCount;

// Immediate-mode command list.
FRHICommandListImmediate CommandListImmediate;
// Immediate-mode async compute command list.
FRHIAsyncComputeCommandListImmediate AsyncComputeCmdListImmediate;
};
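
The `Bypass()` flag decides between two paths: calling the platform context immediately, or recording a command for later execution. A minimal sketch of that branch follows. `FDirectContextSketch` and `FBypassCommandListSketch` are hypothetical names, and `std::function` is used for recorded commands purely for brevity; the engine's own recording mechanism is different.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical platform context that counts executed draws.
struct FDirectContextSketch
{
    int NumDrawsExecuted = 0;
    void DrawPrimitive(int /*NumPrimitives*/) { ++NumDrawsExecuted; }
};

// Hypothetical command list with a bypass switch.
class FBypassCommandListSketch
{
public:
    FBypassCommandListSketch(FDirectContextSketch& InContext, bool bInBypass)
        : Context(InContext), bBypass(bInBypass) {}

    void DrawPrimitive(int NumPrimitives)
    {
        if (bBypass)
        {
            // Bypass: call the platform context right away, nothing is recorded.
            Context.DrawPrimitive(NumPrimitives);
            return;
        }
        // Otherwise record the call so it can be executed later, possibly on another thread.
        Pending.push_back([NumPrimitives](FDirectContextSketch& Target)
        {
            Target.DrawPrimitive(NumPrimitives);
        });
    }

    // Executes and discards everything recorded so far.
    void Flush()
    {
        for (std::function<void(FDirectContextSketch&)>& Command : Pending)
        {
            Command(Context);
        }
        Pending.clear();
    }

    std::size_t NumPending() const { return Pending.size(); }

private:
    FDirectContextSketch& Context;
    bool bBypass;
    std::vector<std::function<void(FDirectContextSketch&)>> Pending;
};

// Issues the same draw on a bypassing list and on a recording list.
inline bool BypassDemo()
{
    FDirectContextSketch ImmediateTarget;
    FDirectContextSketch DeferredTarget;
    FBypassCommandListSketch BypassList(ImmediateTarget, true);
    FBypassCommandListSketch RecordList(DeferredTarget, false);

    BypassList.DrawPrimitive(3);
    RecordList.DrawPrimitive(3);
    const bool bBeforeFlush = ImmediateTarget.NumDrawsExecuted == 1
        && DeferredTarget.NumDrawsExecuted == 0
        && RecordList.NumPending() == 1;

    RecordList.Flush();
    return bBeforeFlush && DeferredTarget.NumDrawsExecuted == 1 && RecordList.NumPending() == 0;
}
```

Recording costs an extra allocation and an indirect call per command, which is the price paid for being able to move execution off the calling thread; bypass mode avoids that overhead when no separate execution thread is in use.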

The implementations of some of the important FRHICommandListExecutor interfaces are shown below:

// Engine\Source\Runtime\RHI\Private\RHICommandList.cpp

// Checks whether the RHI thread is active.
bool FRHICommandListExecutor::IsRHIThreadActive()
{
// Whether submission is asynchronous.
bool bAsyncSubmit = CVarRHICmdAsyncRHIThreadDispatch.GetValueOnRenderThread() > 0;
// 1. First check whether there is an unfinished sub-command-list dispatch task.
if (bAsyncSubmit)
{
if (RenderThreadSublistDispatchTask.GetReference() && RenderThreadSublistDispatchTask->IsComplete())
{
RenderThreadSublistDispatchTask = nullptr;
}
if (RenderThreadSublistDispatchTask.GetReference())
{
return true; // it might become active at any time
}
// otherwise we can safely look at RHIThreadTask
}

// 2. Then check whether there is an unfinished RHI thread task.
if (RHIThreadTask.GetReference() && RHIThreadTask->IsComplete())
{
RHIThreadTask = nullptr;
PrevRHIThreadTask = nullptr;
}
return !!RHIThreadTask.GetReference();
}

// Checks whether the RHI thread has been completely flushed.
bool FRHICommandListExecutor::IsRHIThreadCompletelyFlushed()
{
if (IsRHIThreadActive() || GetImmediateCommandList().HasCommands())
{
return false;
}
if (RenderThreadSublistDispatchTask.GetReference() && RenderThreadSublistDispatchTask->IsComplete())
{
#if NEEDS_DEBUG_INFO_ON_PRESENT_HANG
bRenderThreadSublistDispatchTaskClearedOnRT = IsInActualRenderingThread();
bRenderThreadSublistDispatchTaskClearedOnGT = IsInGameThread();
#endif
RenderThreadSublistDispatchTask = nullptr;
}
return !RenderThreadSublistDispatchTask;
}

void FRHICommandListExecutor::ExecuteList(FRHICommandListImmediate& CmdList)
{
{
SCOPE_CYCLE_COUNTER(STAT_ImmedCmdListExecuteTime);
ExecuteInner(CmdList);
}
}

void FRHICommandListExecutor::ExecuteList(FRHICommandListBase& CmdList)
{
// Flush the commands already recorded before translating this command list.
if (IsInRenderingThread() && !GetImmediateCommandList().IsExecuting())
{
GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
}

// Internal execution.
ExecuteInner(CmdList);
}

void FRHICommandListExecutor::ExecuteInner(FRHICommandListBase& CmdList)
{
// Whether we are on the rendering thread.
bool bIsInRenderingThread = IsInRenderingThread();
// Whether we are on the game thread.
bool bIsInGameThread = IsInGameThread();

// A dedicated RHI thread is enabled.
if (IsRunningRHIInSeparateThread())
{
bool bAsyncSubmit = false;
ENamedThreads::Type RenderThread_Local = ENamedThreads::GetRenderThread_Local();
if (bIsInRenderingThread)
{
if (!bIsInGameThread && !FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
// Move anything down the pipe that needs to go.
FTaskGraphInterface::Get().ProcessThreadUntilIdle(RenderThread_Local);
}
// Check whether the sub-command-list dispatch task has completed.
bAsyncSubmit = CVarRHICmdAsyncRHIThreadDispatch.GetValueOnRenderThread() > 0;
if (RenderThreadSublistDispatchTask.GetReference() && RenderThreadSublistDispatchTask->IsComplete())
{
RenderThreadSublistDispatchTask = nullptr;
if (bAsyncSubmit && RHIThreadTask.GetReference() && RHIThreadTask->IsComplete())
{
RHIThreadTask = nullptr;
PrevRHIThreadTask = nullptr;
}
}
// Check whether the RHI thread task has completed.
if (!bAsyncSubmit && RHIThreadTask.GetReference() && RHIThreadTask->IsComplete())
{
RHIThreadTask = nullptr;
PrevRHIThreadTask = nullptr;
}
}

if (CVarRHICmdUseThread.GetValueOnRenderThread() > 0 && bIsInRenderingThread && !bIsInGameThread)
{
// Swap the prerequisite list with the command list's render-thread tasks.
FRHICommandList* SwapCmdList;
FGraphEventArray Prereq;
Exchange(Prereq, CmdList.RTTasks);
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_FRHICommandListExecutor_SwapCmdLists);
SwapCmdList = new FRHICommandList(CmdList.GetGPUMask());

static_assert(sizeof(FRHICommandList) == sizeof(FRHICommandListImmediate), "We are memswapping FRHICommandList and FRHICommandListImmediate; they need to be swappable.");
SwapCmdList->ExchangeCmdList(CmdList);
CmdList.CopyContext(*SwapCmdList);
CmdList.GPUMask = SwapCmdList->GPUMask;
CmdList.InitialGPUMask = SwapCmdList->GPUMask;
CmdList.PSOContext = SwapCmdList->PSOContext;
CmdList.Data.bInsideRenderPass = SwapCmdList->Data.bInsideRenderPass;
CmdList.Data.bInsideComputePass = SwapCmdList->Data.bInsideComputePass;
}

// Submit the tasks.
QUICK_SCOPE_CYCLE_COUNTER(STAT_FRHICommandListExecutor_SubmitTasks);

// Create an FDispatchRHIThreadTask, with AllOutstandingTasks and RenderThreadSublistDispatchTask as its prerequisites.
if (AllOutstandingTasks.Num() || RenderThreadSublistDispatchTask.GetReference())
{
Prereq.Append(AllOutstandingTasks);
AllOutstandingTasks.Reset();
if (RenderThreadSublistDispatchTask.GetReference())
{
Prereq.Add(RenderThreadSublistDispatchTask);
}
RenderThreadSublistDispatchTask = TGraphTask<FDispatchRHIThreadTask>::CreateTask(&Prereq, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(SwapCmdList, bAsyncSubmit);
}
// Otherwise create an FExecuteRHIThreadTask with the current RHIThreadTask as its prerequisite.
else
{
if (RHIThreadTask.GetReference())
{
Prereq.Add(RHIThreadTask);
}
PrevRHIThreadTask = RHIThreadTask;
RHIThreadTask = TGraphTask<FExecuteRHIThreadTask>::CreateTask(&Prereq, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(SwapCmdList);
}

if (CVarRHICmdForceRHIFlush.GetValueOnRenderThread() > 0 )
{
// Detect a render-thread deadlock.
if (FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
// this is a deadlock. RT tasks must be done by now or they won't be done. We could add a third queue...
UE_LOG(LogRHI, Fatal, TEXT("Deadlock in FRHICommandListExecutor::ExecuteInner 2."));
}
// Wait for RenderThreadSublistDispatchTask to complete.
if (RenderThreadSublistDispatchTask.GetReference())
{
FTaskGraphInterface::Get().WaitUntilTaskCompletes(RenderThreadSublistDispatchTask, RenderThread_Local);
RenderThreadSublistDispatchTask = nullptr;
}
// Wait for RHIThreadTask to complete.
while (RHIThreadTask.GetReference())
{
FTaskGraphInterface::Get().WaitUntilTaskCompletes(RHIThreadTask, RenderThread_Local);
if (RHIThreadTask.GetReference() && RHIThreadTask->IsComplete())
{
RHIThreadTask = nullptr;
PrevRHIThreadTask = nullptr;
}
}
}
return;
}
// Wait for RTTasks, RenderThreadSublistDispatchTask and RHIThreadTask to finish.
if (bIsInRenderingThread)
{
if (CmdList.RTTasks.Num())
{
if (FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
UE_LOG(LogRHI, Fatal, TEXT("Deadlock in FRHICommandListExecutor::ExecuteInner (RTTasks)."));
}
FTaskGraphInterface::Get().WaitUntilTasksComplete(CmdList.RTTasks, RenderThread_Local);
CmdList.RTTasks.Reset();
}
if (RenderThreadSublistDispatchTask.GetReference())
{
if (FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
// this is a deadlock. RT tasks must be done by now or they won't be done. We could add a third queue...
UE_LOG(LogRHI, Fatal, TEXT("Deadlock in FRHICommandListExecutor::ExecuteInner (RenderThreadSublistDispatchTask)."));
}
FTaskGraphInterface::Get().WaitUntilTaskCompletes(RenderThreadSublistDispatchTask, RenderThread_Local);
#if NEEDS_DEBUG_INFO_ON_PRESENT_HANG
bRenderThreadSublistDispatchTaskClearedOnRT = IsInActualRenderingThread();
bRenderThreadSublistDispatchTaskClearedOnGT = bIsInGameThread;
#endif
RenderThreadSublistDispatchTask = nullptr;
}
while (RHIThreadTask.GetReference())
{
if (FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
// this is a deadlock. RT tasks must be done by now or they won't be done. We could add a third queue...
UE_LOG(LogRHI, Fatal, TEXT("Deadlock in FRHICommandListExecutor::ExecuteInner (RHIThreadTask)."));
}
FTaskGraphInterface::Get().WaitUntilTaskCompletes(RHIThreadTask, RenderThread_Local);
if (RHIThreadTask.GetReference() && RHIThreadTask->IsComplete())
{
RHIThreadTask = nullptr;
PrevRHIThreadTask = nullptr;
}
}
}
}
// The RHI is not running in a separate thread.
else
{
if (bIsInRenderingThread && CmdList.RTTasks.Num())
{
ENamedThreads::Type RenderThread_Local = ENamedThreads::GetRenderThread_Local();
if (FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
// this is a deadlock. RT tasks must be done by now or they won't be done. We could add a third queue...
UE_LOG(LogRHI, Fatal, TEXT("Deadlock in FRHICommandListExecutor::ExecuteInner (RTTasks)."));
}
FTaskGraphInterface::Get().WaitUntilTasksComplete(CmdList.RTTasks, RenderThread_Local);
CmdList.RTTasks.Reset();
}
}
// Execute the commands directly.
ExecuteInner_DoExecute(CmdList);
}

void FRHICommandListExecutor::ExecuteInner_DoExecute(FRHICommandListBase& CmdList)
{
FScopeCycleCounter ScopeOuter(CmdList.ExecuteStat);
CmdList.bExecuting = true;
check(CmdList.Context || CmdList.ComputeContext);
FMemMark Mark(FMemStack::Get());
// Set the multi-GPU mask.
#if WITH_MGPU
if (CmdList.Context != nullptr)
{
CmdList.Context->RHISetGPUMask(CmdList.InitialGPUMask);
}
if (CmdList.ComputeContext != nullptr && CmdList.ComputeContext != CmdList.Context)
{
CmdList.ComputeContext->RHISetGPUMask(CmdList.InitialGPUMask);
}
#endif
FRHICommandListDebugContext DebugContext;
FRHICommandListIterator Iter(CmdList);
// Collect execution stats.
#if STATS
bool bDoStats = CVarRHICmdCollectRHIThreadStatsFromHighLevel.GetValueOnRenderThread() > 0 && FThreadStats::IsCollectingData() && (IsInRenderingThread() || IsInRHIThread());
if (bDoStats)
{
while (Iter.HasCommandsLeft())
{
TStatIdData const* Stat = GCurrentExecuteStat.GetRawPointer();
FScopeCycleCounter Scope(GCurrentExecuteStat);
while (Iter.HasCommandsLeft() && Stat == GCurrentExecuteStat.GetRawPointer())
{
FRHICommandBase* Cmd = Iter.NextCommand();
Cmd->ExecuteAndDestruct(CmdList, DebugContext);
}
}
}
else
// Emit named stat events.
#elif ENABLE_STATNAMEDEVENTS
bool bDoStats = CVarRHICmdCollectRHIThreadStatsFromHighLevel.GetValueOnRenderThread() > 0 && GCycleStatsShouldEmitNamedEvents && (IsInRenderingThread() || IsInRHIThread());
if (bDoStats)
{
while (Iter.HasCommandsLeft())
{
PROFILER_CHAR const* Stat = GCurrentExecuteStat.StatString;
FScopeCycleCounter Scope(GCurrentExecuteStat);
while (Iter.HasCommandsLeft() && Stat == GCurrentExecuteStat.StatString)
{
FRHICommandBase* Cmd = Iter.NextCommand();
Cmd->ExecuteAndDestruct(CmdList, DebugContext);
}
}
}
else
#endif
// Version without debugging or stats collection.
{
// Iterate over all commands, executing and destructing each one.
while (Iter.HasCommandsLeft())
{
FRHICommandBase* Cmd = Iter.NextCommand();
GCurrentCommand = Cmd;
Cmd->ExecuteAndDestruct(CmdList, DebugContext);
}
}
// Reset the command list.
CmdList.Reset();
}

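As the code shows, FRHICommandListExecutor handles many kinds of tasks and has to work out their prerequisite, wait and dependency relationships, as well as the dependencies and waits between threads. The code above involves two important task types: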

// Task that dispatches work to the RHI thread.
class FDispatchRHIThreadTask
{
FRHICommandListBase* RHICmdList; // The command list to dispatch.
bool bRHIThread; // Whether to dispatch on the RHI thread.

public:
FDispatchRHIThreadTask(FRHICommandListBase* InRHICmdList, bool bInRHIThread)
: RHICmdList(InRHICmdList)
, bRHIThread(bInRHIThread)
{
}
FORCEINLINE TStatId GetStatId() const;
static ESubsequentsMode::Type GetSubsequentsMode() { return ESubsequentsMode::TrackSubsequents; }

// The desired thread depends on bRHIThread and on whether the RHI runs in a dedicated thread.
ENamedThreads::Type GetDesiredThread()
{
return bRHIThread ? (IsRunningRHIInDedicatedThread() ? ENamedThreads::RHIThread : CPrio_RHIThreadOnTaskThreads.Get()) : ENamedThreads::GetRenderThread_Local();
}

void DoTask(ENamedThreads::Type CurrentThread, const FGraphEventRef& MyCompletionGraphEvent)
{
// The prerequisite is the current RHIThreadTask, if any.
FGraphEventArray Prereq;
if (RHIThreadTask.GetReference())
{
Prereq.Add(RHIThreadTask);
}
// Remember the current task in PrevRHIThreadTask.
PrevRHIThreadTask = RHIThreadTask;
// Create an FExecuteRHIThreadTask and store it in RHIThreadTask.
RHIThreadTask = TGraphTask<FExecuteRHIThreadTask>::CreateTask(&Prereq, CurrentThread).ConstructAndDispatchWhenReady(RHICmdList);
}
};

// Task that executes a command list on the RHI thread.
class FExecuteRHIThreadTask
{
FRHICommandListBase* RHICmdList;

public:
FExecuteRHIThreadTask(FRHICommandListBase* InRHICmdList)
: RHICmdList(InRHICmdList)
{
}

FORCEINLINE TStatId GetStatId() const;
static ESubsequentsMode::Type GetSubsequentsMode() { return ESubsequentsMode::TrackSubsequents; }

// Run on the dedicated RHI thread if there is one, otherwise on a task thread.
ENamedThreads::Type GetDesiredThread()
{
return IsRunningRHIInDedicatedThread() ? ENamedThreads::RHIThread : CPrio_RHIThreadOnTaskThreads.Get();
}

void DoTask(ENamedThreads::Type CurrentThread, const FGraphEventRef& MyCompletionGraphEvent)
{
// When the RHI runs on task threads, record the current thread ID in the global GRHIThreadId.
if (IsRunningRHIInTaskThread())
{
GRHIThreadId = FPlatformTLS::GetCurrentThreadId();
}
// Execute the RHI command list.
{
// Critical section that serializes access between threads.
FScopeLock Lock(&GRHIThreadOnTasksCritical);
FRHICommandListExecutor::ExecuteInner_DoExecute(*RHICmdList);
delete RHICmdList;
}
// Clear the global variable GRHIThreadId.
if (IsRunningRHIInTaskThread())
{
GRHIThreadId = 0;
}
}
};

As shown above, dispatching and translating command lists may happen on the dedicated RHI thread, on the render thread, or on a worker (task) thread.
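The thread selection performed by the two GetDesiredThread implementations above can be condensed into a small standalone sketch. The enum `EThread` and the two free functions are hypothetical stand-ins for `ENamedThreads::Type`, `IsRunningRHIInDedicatedThread()` and `CPrio_RHIThreadOnTaskThreads`; they are illustrative assumptions, not engine code.

```cpp
#include <cassert>

// Hypothetical stand-in for ENamedThreads::Type (illustration only).
enum class EThread { RenderThreadLocal, RHIThread, TaskThread };

// Mirrors FDispatchRHIThreadTask::GetDesiredThread().
// bRHIThread corresponds to the bAsyncSubmit flag that ExecuteInner passes in.
inline EThread DispatchTaskThread(bool bRHIThread, bool bDedicatedRHIThread)
{
    if (!bRHIThread)
    {
        // Synchronous dispatch stays on the render thread's local queue.
        return EThread::RenderThreadLocal;
    }
    return bDedicatedRHIThread ? EThread::RHIThread : EThread::TaskThread;
}

// Mirrors FExecuteRHIThreadTask::GetDesiredThread().
inline EThread ExecuteTaskThread(bool bDedicatedRHIThread)
{
    return bDedicatedRHIThread ? EThread::RHIThread : EThread::TaskThread;
}
```

Under these assumptions, only the dispatch task can land on the render thread; the execute task always runs on either the dedicated RHI thread or a task thread.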

10.4.1.2 GRHICommandList

At first glance GRHICommandList looks like an instance of FRHICommandListBase, but its actual type is FRHICommandListExecutor. It is declared and defined as follows:

// Engine\Source\Runtime\RHI\Public\RHICommandList.h
extern RHI_API FRHICommandListExecutor GRHICommandList;

// Engine\Source\Runtime\RHI\Private\RHICommandList.cpp
RHI_API FRHICommandListExecutor GRHICommandList;

The global or static interfaces related to GRHICommandList are:

FRHICommandListImmediate& FRHICommandListExecutor::GetImmediateCommandList()
{
return GRHICommandList.CommandListImmediate;
}

FRHIAsyncComputeCommandListImmediate& FRHICommandListExecutor::GetImmediateAsyncComputeCommandList()
{
return GRHICommandList.AsyncComputeCmdListImmediate;
}

The UE renderer and RHI modules use GRHICommandList in many places. One example:

// Engine\Source\Runtime\Renderer\Private\DeferredShadingRenderer.cpp

void ServiceLocalQueue()
{
FTaskGraphInterface::Get().ProcessThreadUntilIdle(ENamedThreads::GetRenderThread_Local());

if (IsRunningRHIInSeparateThread())
{
FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
}
}

Besides GRHICommandList, the RHI command list module also relies on several global task variables:

// Engine\Source\Runtime\RHI\Private\RHICommandList.cpp

static FGraphEventArray AllOutstandingTasks;
static FGraphEventArray WaitOutstandingTasks;
static FGraphEventRef RHIThreadTask;
static FGraphEventRef PrevRHIThreadTask;
static FGraphEventRef RenderThreadSublistDispatchTask;

The code that creates these tasks or adds to these lists is as follows:

void FRHICommandListBase::QueueParallelAsyncCommandListSubmit(FGraphEventRef* AnyThreadCompletionEvents, ...)
{
(......)
if (Num && IsRunningRHIInSeparateThread())
{
(......)
// Create the FParallelTranslateSetupCommandList task.
FGraphEventRef TranslateSetupCompletionEvent = TGraphTask<FParallelTranslateSetupCommandList>::CreateTask(&Prereq, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(CmdList, &RHICmdLists[0], Num, bIsPrepass);
QueueCommandListSubmit(CmdList);
// Add it to AllOutstandingTasks.
AllOutstandingTasks.Add(TranslateSetupCompletionEvent);
(......)
FGraphEventArray Prereq;
FRHICommandListBase** RHICmdLists = (FRHICommandListBase**)Alloc(sizeof(FRHICommandListBase*) * (1 + Last - Start), alignof(FRHICommandListBase*));
// Add every external AnyThreadCompletionEvent to the corresponding lists.
for (int32 Index = Start; Index <= Last; Index++)
{
FGraphEventRef& AnyThreadCompletionEvent = AnyThreadCompletionEvents[Index];
FRHICommandList* CmdList = CmdLists[Index];
RHICmdLists[Index - Start] = CmdList;
if (AnyThreadCompletionEvent.GetReference())
{
Prereq.Add(AnyThreadCompletionEvent);
AllOutstandingTasks.Add(AnyThreadCompletionEvent);
WaitOutstandingTasks.Add(AnyThreadCompletionEvent);
}
}
(......)
// Parallel translation task FParallelTranslateCommandList.
FGraphEventRef TranslateCompletionEvent = TGraphTask<FParallelTranslateCommandList>::CreateTask(&Prereq, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(&RHICmdLists[0], 1 + Last - Start, ContextContainer, bIsPrepass);
AllOutstandingTasks.Add(TranslateCompletionEvent);
(......)
}

void FRHICommandListBase::QueueAsyncCommandListSubmit(FGraphEventRef& AnyThreadCompletionEvent, class FRHICommandList* CmdList)
{
(......)
// Handle the external task AnyThreadCompletionEvent.
if (AnyThreadCompletionEvent.GetReference())
{
if (IsRunningRHIInSeparateThread())
{
AllOutstandingTasks.Add(AnyThreadCompletionEvent);
}
WaitOutstandingTasks.Add(AnyThreadCompletionEvent);
}
(......)
}

class FDispatchRHIThreadTask
{
void DoTask(ENamedThreads::Type CurrentThread, const FGraphEventRef& MyCompletionGraphEvent)
{
(......)
// Create the RHI thread task FExecuteRHIThreadTask.
PrevRHIThreadTask = RHIThreadTask;
RHIThreadTask = TGraphTask<FExecuteRHIThreadTask>::CreateTask(&Prereq, CurrentThread).ConstructAndDispatchWhenReady(RHICmdList);
}
};

class FParallelTranslateSetupCommandList
{
void DoTask(ENamedThreads::Type CurrentThread, const FGraphEventRef& MyCompletionGraphEvent)
{
(......)
// Create the parallel translation task FParallelTranslateCommandList.
FGraphEventRef TranslateCompletionEvent = TGraphTask<FParallelTranslateCommandList>::CreateTask(nullptr, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(&RHICmdLists[Start], 1 + Last - Start, ContextContainer, bIsPrepass);
MyCompletionGraphEvent->DontCompleteUntil(TranslateCompletionEvent);
// Submit through the FRHICommandWaitForAndSubmitSubListParallel command of RHICmdList; the event eventually ends up in AllOutstandingTasks and WaitOutstandingTasks.
ALLOC_COMMAND_CL(*RHICmdList, FRHICommandWaitForAndSubmitSubListParallel)(TranslateCompletionEvent, ContextContainer, EffectiveThreads, ThreadIndex++);
}
};

void FRHICommandListExecutor::ExecuteInner(FRHICommandListBase& CmdList)
{
(......)
if (IsRunningRHIInSeparateThread())
{
(......)
if (AllOutstandingTasks.Num() || RenderThreadSublistDispatchTask.GetReference())
{
(......)
// Create the sublist dispatch (submission) task FDispatchRHIThreadTask.
RenderThreadSublistDispatchTask = TGraphTask<FDispatchRHIThreadTask>::CreateTask(&Prereq, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(SwapCmdList, bAsyncSubmit);
}
else
{
(......)
PrevRHIThreadTask = RHIThreadTask;
// Create the execution (translation) task FExecuteRHIThreadTask.
RHIThreadTask = TGraphTask<FExecuteRHIThreadTask>::CreateTask(&Prereq, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(SwapCmdList);
}
(......)
}
}

To summarize the role of these task variables:

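| Task variable | Threads | Description |
| --- | --- | --- |
| AllOutstandingTasks | Render, RHI, worker | All tasks that are in flight or pending. Holds the events of FParallelTranslateSetupCommandList and FParallelTranslateCommandList tasks, plus the external AnyThreadCompletionEvent events added by the submit functions above. |
| WaitOutstandingTasks | Render, RHI, worker | The tasks that still have to be waited on. Holds the same kinds of events as AllOutstandingTasks. |
| RHIThreadTask | RHI, worker | The RHI thread task currently being processed. Its type is FExecuteRHIThreadTask. |
| PrevRHIThreadTask | RHI, worker | The previously processed RHIThreadTask. Its type is FExecuteRHIThreadTask. |
| RenderThreadSublistDispatchTask | Render, RHI, worker | The task currently being dispatched (submitted). Its type is FDispatchRHIThreadTask. |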
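The way each new RHIThreadTask takes the previous one as its prerequisite is what keeps command lists executing in submission order. The sketch below illustrates that chaining with a deliberately tiny, single-threaded stand-in for the task graph. `FMiniTask`, `FMiniExecutor` and `RunChainDemo` are hypothetical names invented for this illustration, and the real TaskGraph schedules across threads rather than running prerequisites recursively.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical miniature of a graph event: a piece of work plus its prerequisites.
struct FMiniTask
{
    std::vector<std::shared_ptr<FMiniTask>> Prereqs;
    std::function<void()> Work;
    bool bComplete = false;

    // Runs every prerequisite first, then this task (single-threaded for clarity).
    void Run()
    {
        if (bComplete) { return; }
        for (auto& Prereq : Prereqs) { Prereq->Run(); }
        Work();
        bComplete = true;
    }
};

using FMiniTaskRef = std::shared_ptr<FMiniTask>;

// Hypothetical executor mirroring the RHIThreadTask / PrevRHIThreadTask bookkeeping.
struct FMiniExecutor
{
    FMiniTaskRef RHIThreadTask;
    FMiniTaskRef PrevRHIThreadTask;

    void Submit(std::function<void()> Work)
    {
        auto Task = std::make_shared<FMiniTask>();
        // The task currently in flight becomes the prerequisite of the new one.
        if (RHIThreadTask) { Task->Prereqs.push_back(RHIThreadTask); }
        Task->Work = std::move(Work);
        PrevRHIThreadTask = RHIThreadTask;
        RHIThreadTask = Task;
    }

    // Analogue of waiting for RHIThreadTask, then clearing both references.
    void Flush()
    {
        if (RHIThreadTask)
        {
            RHIThreadTask->Run();
            RHIThreadTask = nullptr;
            PrevRHIThreadTask = nullptr;
        }
    }
};

// Submits three "command lists" and returns the order in which they executed.
inline std::vector<int> RunChainDemo()
{
    std::vector<int> Order;
    FMiniExecutor Executor;
    for (int Index = 0; Index < 3; ++Index)
    {
        Executor.Submit([&Order, Index] { Order.push_back(Index); });
    }
    Executor.Flush();
    return Order;
}
```

Waiting on only the newest task is enough in this sketch, because the prerequisite chain pulls in every earlier task before it.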

10.4.1.3 D3D11 Command Execution

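This section examines the generic RHI and D3D11 command execution path of UE 4.26 on PC. On PC, UE 4.26 defaults to the D3D11 RHI, and the key console variables have the following default values:

r.RHICmdBypass = 1
r.RHIThread.Enable = 0

In other words, command bypass mode is on and the RHI thread is disabled. In this configuration, calling an interface of FRHICommandList does not generate a separate FRHICommand; the corresponding Context method is called directly. Taking FRHICommandList::DrawPrimitive as an example: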

class RHI_API FRHICommandList : public FRHIComputeCommandList
{
void DrawPrimitive(uint32 BaseVertexIndex, uint32 NumPrimitives, uint32 NumInstances)
{
// Bypass is 1 by default, so this branch is taken.
if (Bypass())
{
// Call the corresponding method of the graphics API context directly.
GetContext().RHIDrawPrimitive(BaseVertexIndex, NumPrimitives, NumInstances);
return;
}
// Allocate a separate FRHICommandDrawPrimitive command.
ALLOC_COMMAND(FRHICommandDrawPrimitive)(BaseVertexIndex, NumPrimitives, NumInstances);
}
};

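Therefore, under the default PC graphics API (D3D11), with r.RHICmdBypass=1 and r.RHIThread.Enable=0, FRHICommandList calls the graphics API context directly, which amounts to calling the graphics API synchronously. The graphics API calls then run on the render thread (if the render thread is enabled).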
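The difference between bypass mode and recorded mode can be sketched in a few lines. Everything below is a hypothetical miniature: `FMiniContext` and `FMiniBypassCommandList` are invented for illustration, and `std::function` stands in for the engine's FRHICommand objects purely for brevity.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a platform context such as the D3D11 one.
struct FMiniContext
{
    int DrawCalls = 0;
    void RHIDrawPrimitive(unsigned, unsigned, unsigned) { ++DrawCalls; }
};

// Hypothetical command list: forwards immediately in bypass mode, records otherwise.
class FMiniBypassCommandList
{
public:
    FMiniBypassCommandList(FMiniContext& InContext, bool bInBypass)
        : Context(InContext), bBypass(bInBypass) {}

    void DrawPrimitive(unsigned BaseVertexIndex, unsigned NumPrimitives, unsigned NumInstances)
    {
        if (bBypass)
        {
            // Bypass: the context is called synchronously, nothing is recorded.
            Context.RHIDrawPrimitive(BaseVertexIndex, NumPrimitives, NumInstances);
            return;
        }
        // Recorded: store the call and its arguments for later execution.
        Pending.push_back([BaseVertexIndex, NumPrimitives, NumInstances](FMiniContext& Ctx)
        {
            Ctx.RHIDrawPrimitive(BaseVertexIndex, NumPrimitives, NumInstances);
        });
    }

    // Analogue of a flush: execute and discard every recorded command.
    void Flush()
    {
        for (auto& Command : Pending) { Command(Context); }
        Pending.clear();
    }

    int NumPending() const { return static_cast<int>(Pending.size()); }

private:
    FMiniContext& Context;
    bool bBypass;
    std::vector<std::function<void(FMiniContext&)>> Pending;
};
```

In the recorded case the draw only reaches the context when `Flush` runs, which is the behaviour the r.RHICmdBypass=0 configuration relies on.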

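Next, set r.RHICmdBypass to 0 while keeping r.RHIThread.Enable at 0. The Context methods are no longer called directly. Instead, individual FRHICommand instances are generated and later executed by the FRHICommandList machinery. Taking FRHICommandList::DrawPrimitive as the example again, the relevant code path is: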

class RHI_API FRHICommandList : public FRHIComputeCommandList
{
void DrawPrimitive(uint32 BaseVertexIndex, uint32 NumPrimitives, uint32 NumInstances)
{
// With r.RHICmdBypass=0, Bypass() returns false and this branch is skipped.
if (Bypass())
{
// Call the corresponding method of the graphics API context directly.
GetContext().RHIDrawPrimitive(BaseVertexIndex, NumPrimitives, NumInstances);
return;
}
// Allocate a separate FRHICommandDrawPrimitive command.
// The ALLOC_COMMAND macro calls the AllocCommand interface.
ALLOC_COMMAND(FRHICommandDrawPrimitive)(BaseVertexIndex, NumPrimitives, NumInstances);
}

template <typename TCmd>
void* AllocCommand()
{
return AllocCommand(sizeof(TCmd), alignof(TCmd));
}

void* AllocCommand(int32 AllocSize, int32 Alignment)
{
FRHICommandBase* Result = (FRHICommandBase*) MemManager.Alloc(AllocSize, Alignment);
++NumCommands;
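// CommandLink points at the Next field of the previous command node, so this links the new node into the list.
*CommandLink = Result;
// Advance CommandLink to the Next field of the new node.
CommandLink = &Result->Next;
return Result;
}
};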
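The CommandLink bookkeeping in AllocCommand is a classic pointer-to-pointer append on an intrusive singly linked list. The following is a minimal sketch of that technique. `FLinkedCommandBase`, `FLinkedDrawCommand` and `FLinkedCommandList` are hypothetical names, and plain `new`/`delete` replaces the engine's MemManager allocator as a simplifying assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical base command: an intrusive singly linked node.
struct FLinkedCommandBase
{
    FLinkedCommandBase* Next = nullptr;
    virtual ~FLinkedCommandBase() = default;
    virtual void Execute(std::vector<int>& Log) = 0;
};

// Hypothetical draw command that just logs its identifier when executed.
struct FLinkedDrawCommand : FLinkedCommandBase
{
    int Id;
    explicit FLinkedDrawCommand(int InId) : Id(InId) {}
    void Execute(std::vector<int>& Log) override { Log.push_back(Id); }
};

class FLinkedCommandList
{
public:
    FLinkedCommandList() : Root(nullptr), CommandLink(&Root), NumCommands(0) {}
    FLinkedCommandList(const FLinkedCommandList&) = delete;
    FLinkedCommandList& operator=(const FLinkedCommandList&) = delete;
    ~FLinkedCommandList() { DestroyAll(); }

    // O(1) append: CommandLink always points at the slot that receives the next node.
    void Enqueue(FLinkedCommandBase* Command)
    {
        ++NumCommands;
        *CommandLink = Command;        // Link the node into the previous Next (or Root).
        CommandLink = &Command->Next;  // The new node's Next is the slot to fill next time.
    }

    // Walks the list in insertion order, executing and destroying each command.
    void ExecuteAll(std::vector<int>& Log)
    {
        for (FLinkedCommandBase* Command = Root; Command != nullptr; )
        {
            FLinkedCommandBase* NextCommand = Command->Next;
            Command->Execute(Log);
            delete Command;
            Command = NextCommand;
        }
        ResetLinks();
    }

    int Num() const { return NumCommands; }

private:
    // Frees commands that were never executed.
    void DestroyAll()
    {
        for (FLinkedCommandBase* Command = Root; Command != nullptr; )
        {
            FLinkedCommandBase* NextCommand = Command->Next;
            delete Command;
            Command = NextCommand;
        }
        ResetLinks();
    }

    void ResetLinks()
    {
        Root = nullptr;
        CommandLink = &Root;
        NumCommands = 0;
    }

    FLinkedCommandBase* Root;
    FLinkedCommandBase** CommandLink;
    int NumCommands;
};
```

Because `CommandLink` initially points at `Root`, the first append needs no special case, which is presumably why the engine uses the same pointer-to-pointer layout.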

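Command instances allocated through ALLOC_COMMAND are appended to the command linked list of FRHICommandListBase, but they are not executed yet. They wait for a suitable moment, for example a call to FRHICommandListImmediate::ImmediateFlush. (Figure: call stack for executing an FRHICommandList.)

The call stack shows that command execution becomes more involved in this configuration, with many additional intermediate steps. Taking FRHICommandList::DrawPrimitive as the example again, the call flow is: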

graph TD
A[FRHICommandListImmediate::ImmediateFlush] --> B[FRHICommandListExecutor::ExecuteList]
B --> C[FRHICommandListExecutor::ExecuteInner]
C --> D[FRHICommandListExecutor::ExecuteInner_DoExecute]
D --> E[FRHICommand::ExecuteAndDestruct]
E --> F[FRHICommandDrawPrimitive::Execute]
F --> G[INTERNAL_DECORATOR]
G --> H[FD3D11DynamicRHI::RHIDrawPrimitive]

The diagram above involves the macro INTERNAL_DECORATOR. It and its sibling macro are defined as follows:

// Engine\Source\Runtime\RHI\Public\RHICommandListCommandExecutes.inl

#define INTERNAL_DECORATOR(Method) CmdList.GetContext().Method
#define INTERNAL_DECORATOR_COMPUTE(Method) CmdList.GetComputeContext().Method

In effect, the macro calls the named method on the command list's Context.
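To make the macro expansion concrete, here is a minimal standalone sketch. `MINI_INTERNAL_DECORATOR`, `FMiniDecoratorContext`, `FMiniDecoratorCmdList` and `FMiniDrawPrimitiveCommand` are hypothetical names that only imitate the shape of the engine macro and of a command's Execute method.

```cpp
#include <cassert>

// Hypothetical context that counts draw calls.
struct FMiniDecoratorContext
{
    int Draws = 0;
    void RHIDrawPrimitive(int, int, int) { ++Draws; }
};

// Hypothetical command list that owns a context.
struct FMiniDecoratorCmdList
{
    FMiniDecoratorContext Context;
    FMiniDecoratorContext& GetContext() { return Context; }
};

// Same shape as INTERNAL_DECORATOR: expects a variable named CmdList in scope.
#define MINI_INTERNAL_DECORATOR(Method) CmdList.GetContext().Method

// Hypothetical recorded command holding the draw arguments.
struct FMiniDrawPrimitiveCommand
{
    int BaseVertexIndex;
    int NumPrimitives;
    int NumInstances;

    void Execute(FMiniDecoratorCmdList& CmdList)
    {
        // Expands to CmdList.GetContext().RHIDrawPrimitive(...).
        MINI_INTERNAL_DECORATOR(RHIDrawPrimitive)(BaseVertexIndex, NumPrimitives, NumInstances);
    }
};
```

The macro depends on the parameter being named `CmdList`, so it only works inside functions that follow that naming convention.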

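With the RHI thread disabled (r.RHIThread.Enable == 0), the calls above execute on the render thread.

Next, set r.RHIThread.Enable to 1 to turn on the RHI thread. The commands are now executed on the RHI thread.

The call stack now starts from a task on the TaskGraph's RHI thread.

The command execution flow then becomes: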

graph TD
A[FRHICommandListImmediate::ImmediateFlush] --> B[FRHICommandListExecutor::ExecuteList]
B --> C[FRHICommandListExecutor::ExecuteInner]

C --> C1(FExecuteRHIThreadTask::DoTask)
C1 --> D(FRHICommandListExecutor::ExecuteInner_DoExecute)
D --> E(FRHICommand::ExecuteAndDestruct)
E --> F(FRHICommandDrawPrimitive::Execute)
F --> G(INTERNAL_DECORATOR)
G --> H(FD3D11DynamicRHI::RHIDrawPrimitive)

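In the flowchart above, square-cornered nodes execute on the render thread and round-cornered nodes execute on the RHI thread. Once the RHI thread is enabled, its stats show up as well:

Left: stats with the RHI thread disabled. Right: stats with the RHI thread enabled.

The following flowchart combines the cases of Bypass and the RHI thread being on or off (using D3D11's DrawPrimitive as the example):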

graph TD
a1[FRHICommandList::DrawPrimitive] --> a2{Bypass?}
a2 -->|No| a4[ALLOC_COMMAND_FRHICommandDrawPrimitive]
a2 -->|Yes| a3[FD3D11DynamicRHI::RHIDrawPrimitive]
a4 --> a5[FRHICommandListBase::AllocCommand]
a5 --> a6[......]
a6 --> A[FRHICommandListImmediate::ImmediateFlush]

A --> B[FRHICommandListExecutor::ExecuteList]
B --> C[FRHICommandListExecutor::ExecuteInner]

C --> C11{RHIThreadEnabled?}
C11 -->|No| D11[FRHICommandListExecutor::ExecuteInner_DoExecute]
D11 --> E11[FRHICommand::ExecuteAndDestruct]
E11 --> F11[FRHICommandDrawPrimitive::Execute]
F11 --> G11[INTERNAL_DECORATOR_RHIDrawPrimitive]
G11 --> H11[FD3D11DynamicRHI::RHIDrawPrimitive]

C11 -->|Yes| c0(.....)
c0 -->C1(FExecuteRHIThreadTask::DoTask)
C1 --> D(FRHICommandListExecutor::ExecuteInner_DoExecute)
D --> E(FRHICommand::ExecuteAndDestruct)
E --> F(FRHICommandDrawPrimitive::Execute)
F --> G(INTERNAL_DECORATOR_RHIDrawPrimitive)
G --> H(FD3D11DynamicRHI::RHIDrawPrimitive)

In the flowchart above, square-cornered nodes execute on the render thread and round-cornered nodes execute on the RHI thread.

10.4.2 ImmediateFlush

Section 10.3.3 FDynamicRHI mentioned the flush type (FlushType). It refers to the values defined by EImmediateFlushType:

// Engine\Source\Runtime\RHI\Public\RHICommandList.h

namespace EImmediateFlushType
{
enum Type
{
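WaitForOutstandingTasksOnly = 0, // Only wait for the outstanding tasks to complete.
DispatchToRHIThread, // Dispatch to the RHI thread.
WaitForDispatchToRHIThread, // Dispatch and wait until the dispatch to the RHI thread has happened.
FlushRHIThread, // Flush the RHI thread.
FlushRHIThreadFlushResources, // Flush the RHI thread and resources.
FlushRHIThreadFlushResourcesFlushDeferredDeletes // Flush the RHI thread, resources and deferred deletes.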
};
};

The differences between the EImmediateFlushType values show up in the implementation of FRHICommandListImmediate::ImmediateFlush:

// Engine\Source\Runtime\RHI\Public\RHICommandList.inl

void FRHICommandListImmediate::ImmediateFlush(EImmediateFlushType::Type FlushType)
{
switch (FlushType)
{
// Wait for the outstanding tasks to complete.
case EImmediateFlushType::WaitForOutstandingTasksOnly:
{
WaitForTasks();
}
break;
// Dispatch to the RHI thread (execute the command list).
case EImmediateFlushType::DispatchToRHIThread:
{
if (HasCommands())
{
GRHICommandList.ExecuteList(*this);
}
}
break;
// Dispatch, then wait for the dispatch to the RHI thread.
case EImmediateFlushType::WaitForDispatchToRHIThread:
{
if (HasCommands())
{
GRHICommandList.ExecuteList(*this);
}
WaitForDispatch();
}
break;
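// Flush the RHI thread.
case EImmediateFlushType::FlushRHIThread:
{
// Dispatch, then wait for the dispatch to complete.
if (HasCommands())
{
GRHICommandList.ExecuteList(*this);
}
WaitForDispatch();
// Wait for the RHI thread tasks.
if (IsRunningRHIInSeparateThread())
{
WaitForRHIThreadTasks();
}
// Wait for the outstanding tasks (expected to be complete by now) and reset the list.
WaitForTasks(true);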
}
break;
case EImmediateFlushType::FlushRHIThreadFlushResources:
case EImmediateFlushType::FlushRHIThreadFlushResourcesFlushDeferredDeletes:
{
if (HasCommands())
{
GRHICommandList.ExecuteList(*this);
}
WaitForDispatch();
WaitForRHIThreadTasks();
WaitForTasks(true);
// Flush the resources held by the pipeline state cache.
PipelineStateCache::FlushResources();
// Flush the resources pending deletion.
FRHIResource::FlushPendingDeletes(FlushType == EImmediateFlushType::FlushRHIThreadFlushResourcesFlushDeferredDeletes);
}
break;
}
}
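Read top to bottom, the switch above makes each flush type a superset of the previous one (apart from WaitForOutstandingTasksOnly). The sketch below restates that as a function returning the ordered list of steps. `EMiniFlushType` and `FlushSteps` are hypothetical names, and the step strings are only labels for the calls seen in the listing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of EImmediateFlushType.
enum class EMiniFlushType
{
    WaitForOutstandingTasksOnly,
    DispatchToRHIThread,
    WaitForDispatchToRHIThread,
    FlushRHIThread,
    FlushRHIThreadFlushResources,
    FlushRHIThreadFlushResourcesFlushDeferredDeletes
};

// Returns, in order, the steps that ImmediateFlush performs for a flush type.
inline std::vector<std::string> FlushSteps(EMiniFlushType Type, bool bHasCommands, bool bSeparateRHIThread)
{
    std::vector<std::string> Steps;
    if (Type == EMiniFlushType::WaitForOutstandingTasksOnly)
    {
        Steps.push_back("WaitForTasks");
        return Steps;
    }
    if (bHasCommands) { Steps.push_back("ExecuteList"); }
    if (Type == EMiniFlushType::DispatchToRHIThread) { return Steps; }

    Steps.push_back("WaitForDispatch");
    if (Type == EMiniFlushType::WaitForDispatchToRHIThread) { return Steps; }

    // In the listing, FlushRHIThread checks for a separate RHI thread first,
    // while the resource-flushing variants call WaitForRHIThreadTasks unconditionally.
    if (Type != EMiniFlushType::FlushRHIThread || bSeparateRHIThread)
    {
        Steps.push_back("WaitForRHIThreadTasks");
    }
    Steps.push_back("WaitForTasks");
    if (Type == EMiniFlushType::FlushRHIThread) { return Steps; }

    Steps.push_back("FlushResources");
    Steps.push_back(Type == EMiniFlushType::FlushRHIThreadFlushResourcesFlushDeferredDeletes
        ? "FlushPendingDeletes(true)"
        : "FlushPendingDeletes(false)");
    return Steps;
}
```

For example, a FlushRHIThread request with pending commands and a separate RHI thread yields ExecuteList, WaitForDispatch, WaitForRHIThreadTasks, WaitForTasks in that order.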

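The code above uses several interfaces for processing and waiting on tasks. They are implemented as follows:

// Wait for the outstanding tasks to complete.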
void FRHICommandListBase::WaitForTasks(bool bKnownToBeComplete)
{
if (WaitOutstandingTasks.Num())
{
// Check whether any waited-on task is still incomplete.
bool bAny = false;
for (int32 Index = 0; Index < WaitOutstandingTasks.Num(); Index++)
{
if (!WaitOutstandingTasks[Index]->IsComplete())
{
bAny = true;
break;
}
}
// If so, wait for the tasks through the TaskGraph interface.
if (bAny)
{
ENamedThreads::Type RenderThread_Local = ENamedThreads::GetRenderThread_Local();
FTaskGraphInterface::Get().WaitUntilTasksComplete(WaitOutstandingTasks, RenderThread_Local);
}
// Reset the list of tasks to wait on.
WaitOutstandingTasks.Reset();
}
}

// Wait for the render-thread dispatch to complete.
void FRHICommandListBase::WaitForDispatch()
{
// If RenderThreadSublistDispatchTask has already completed, clear it.
if (RenderThreadSublistDispatchTask.GetReference() && RenderThreadSublistDispatchTask->IsComplete())
{
RenderThreadSublistDispatchTask = nullptr;
}
// RenderThreadSublistDispatchTask is still incomplete: wait for it.
while (RenderThreadSublistDispatchTask.GetReference())
{
ENamedThreads::Type RenderThread_Local = ENamedThreads::GetRenderThread_Local();
FTaskGraphInterface::Get().WaitUntilTaskCompletes(RenderThreadSublistDispatchTask, RenderThread_Local);
if (RenderThreadSublistDispatchTask.GetReference() && RenderThreadSublistDispatchTask->IsComplete())
{
RenderThreadSublistDispatchTask = nullptr;
}
}
}

// Wait for the RHI thread tasks to complete.
void FRHICommandListBase::WaitForRHIThreadTasks()
{
bool bAsyncSubmit = CVarRHICmdAsyncRHIThreadDispatch.GetValueOnRenderThread() > 0;
ENamedThreads::Type RenderThread_Local = ENamedThreads::GetRenderThread_Local();
// With asynchronous submission, first do the equivalent of FRHICommandListBase::WaitForDispatch().
if (bAsyncSubmit)
{
if (RenderThreadSublistDispatchTask.GetReference() && RenderThreadSublistDispatchTask->IsComplete())
{
RenderThreadSublistDispatchTask = nullptr;
}
while (RenderThreadSublistDispatchTask.GetReference())
{
if (FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
while (!RenderThreadSublistDispatchTask->IsComplete())
{
FPlatformProcess::SleepNoStats(0);
}
}
else
{
FTaskGraphInterface::Get().WaitUntilTaskCompletes(RenderThreadSublistDispatchTask, RenderThread_Local);
}
if (RenderThreadSublistDispatchTask.GetReference() && RenderThreadSublistDispatchTask->IsComplete())
{
RenderThreadSublistDispatchTask = nullptr;
}
}
// now we can safely look at RHIThreadTask
}
// If the RHI thread task has already completed, clear it.
if (RHIThreadTask.GetReference() && RHIThreadTask->IsComplete())
{
RHIThreadTask = nullptr;
PrevRHIThreadTask = nullptr;
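}
// If an RHI thread task is still incomplete, wait for it.
while (RHIThreadTask.GetReference())
{
// If the render thread's local queue is already being processed, spin with Sleep(0), yielding the time slice until the task completes.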
if (FTaskGraphInterface::Get().IsThreadProcessingTasks(RenderThread_Local))
{
while (!RHIThreadTask->IsComplete())
{
FPlatformProcess::SleepNoStats(0);
}
}
// Otherwise wait for the task through the TaskGraph.
else
{
FTaskGraphInterface::Get().WaitUntilTaskCompletes(RHIThreadTask, RenderThread_Local);
}
// If the RHI thread task has completed, clear it.
if (RHIThreadTask.GetReference() && RHIThreadTask->IsComplete())
{
RHIThreadTask = nullptr;
PrevRHIThreadTask = nullptr;
}
}
}

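10.4.3 Parallel Rendering

As mentioned at the start of this article, when the RHI thread is enabled it translates the intermediate RHI commands pushed by the render thread into GPU commands of the target graphics platform. If the render thread generated those intermediate commands in parallel, the RHI thread translates them in parallel as well.

Before covering parallel rendering and translation in detail, a few basic concepts and types need to be introduced.

10.4.3.1 FParallelCommandListSet

FParallelCommandListSet is defined as follows: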

// Engine\Source\Runtime\Renderer\Private\SceneRendering.h

class FParallelCommandListSet
{
public:
// The view this set belongs to.
const FViewInfo& View;
// The parent command list.
FRHICommandListImmediate& ParentCmdList;
// Snapshot of the scene render targets.
FSceneRenderTargets* Snapshot;

TStatId ExecuteStat;
int32 Width;
int32 NumAlloc;
int32 MinDrawsPerCommandList;
// Whether to balance the command lists, see r.RHICmdBalanceParallelLists.
bool bBalanceCommands;
// See r.RHICmdSpewParallelListBalance.
bool bSpewBalance;

// The command lists in this set.
TArray<FRHICommandList*,SceneRenderingAllocator> CommandLists;
// Synchronization events.
TArray<FGraphEventRef,SceneRenderingAllocator> Events;
// Number of draws in each command list, or -1 if unknown. An overestimate is better than nothing.
TArray<int32,SceneRenderingAllocator> NumDrawsIfKnown;

FParallelCommandListSet(TStatId InExecuteStat, const FViewInfo& InView, FRHICommandListImmediate& InParentCmdList, bool bInCreateSceneContext);
virtual ~FParallelCommandListSet();

// Number of parallel command lists.
int32 NumParallelCommandLists() const;
// Create a new parallel command list.
FRHICommandList* NewParallelCommandList();
// Get the prerequisite tasks.
FORCEINLINE FGraphEventArray* GetPrereqs();
// Add a parallel command list.
void AddParallelCommandList(FRHICommandList* CmdList, FGraphEventRef& CompletionEvent, int32 InNumDrawsIfKnown = -1);
virtual void SetStateOnCommandList(FRHICommandList& CmdList) {}
// Wait for the tasks to complete.
static void WaitForTasks();

protected:
// Dispatch; must be called by the subclass.
void Dispatch(bool bHighPriority = false);
// Allocate a new command list.
FRHICommandList* AllocCommandList();
// Whether to create a scene context.
bool bCreateSceneContext;

private:
void WaitForTasksInternal();
};

The important interfaces of FParallelCommandListSet are implemented as follows:

// Engine\Source\Runtime\Renderer\Private\SceneRendering.cpp

FRHICommandList* FParallelCommandListSet::AllocCommandList()
{
NumAlloc++;
return new FRHICommandList(ParentCmdList.GetGPUMask());
}

void FParallelCommandListSet::Dispatch(bool bHighPriority)
{
ENamedThreads::Type RenderThread_Local = ENamedThreads::GetRenderThread_Local();
if (bSpewBalance)
{
// Wait for the previously queued tasks to complete.
for (auto& Event : Events)
{
FTaskGraphInterface::Get().WaitUntilTaskCompletes(Event, RenderThread_Local);
}
}
// Decide whether to translate in parallel.
bool bActuallyDoParallelTranslate = GRHISupportsParallelRHIExecute && CommandLists.Num() >= CVarRHICmdMinCmdlistForParallelSubmit.GetValueOnRenderThread();
if (bActuallyDoParallelTranslate)
{
int32 Total = 0;
bool bIndeterminate = false;
for (int32 Count : NumDrawsIfKnown)
{
// The draw count is unknown here, so assume parallel translation is worthwhile.
if (Count < 0)
{
bIndeterminate = true;
break;
}
Total += Count;
}
// The total number of draws is too small, so do not translate in parallel.
if (!bIndeterminate && Total < MinDrawsPerCommandList)
{
bActuallyDoParallelTranslate = false;
}
}

if (bActuallyDoParallelTranslate)
{
// Make sure parallel RHI execution is supported.
check(GRHISupportsParallelRHIExecute);
NumAlloc -= CommandLists.Num();
// Queue the parallel asynchronous command list submission on the parent command list.
ParentCmdList.QueueParallelAsyncCommandListSubmit(&Events[0], bHighPriority, &CommandLists[0], &NumDrawsIfKnown[0], CommandLists.Num(), (MinDrawsPerCommandList * 4) / 3, bSpewBalance);
SetStateOnCommandList(ParentCmdList);
// End the render pass.
ParentCmdList.EndRenderPass();
}
else // Non-parallel mode.
{
for (int32 Index = 0; Index < CommandLists.Num(); Index++)
{
ParentCmdList.QueueAsyncCommandListSubmit(Events[Index], CommandLists[Index]);
NumAlloc--;
}
}
// Reset the data.
CommandLists.Reset();
Snapshot = nullptr;
Events.Reset();
// Process the render thread's local queue until it is idle.
FTaskGraphInterface::Get().ProcessThreadUntilIdle(RenderThread_Local);
}

FParallelCommandListSet::~FParallelCommandListSet()
{
GOutstandingParallelCommandListSet = nullptr;
}

FRHICommandList* FParallelCommandListSet::NewParallelCommandList()
{
// Create a new command list.
FRHICommandList* Result = AllocCommandList();
Result->ExecuteStat = ExecuteStat;
SetStateOnCommandList(*Result);
if (bCreateSceneContext)
{
FSceneRenderTargets& SceneContext = FSceneRenderTargets::Get(ParentCmdList);
// Create the snapshot of the scene render targets.
if (!Snapshot)
{
Snapshot = SceneContext.CreateSnapshot(View);
}
// Set the render target snapshot on the command list.
Snapshot->SetSnapshotOnCmdList(*Result);
}
return Result;
}

// Add a parallel command list.
void FParallelCommandListSet::AddParallelCommandList(FRHICommandList* CmdList, FGraphEventRef& CompletionEvent, int32 InNumDrawsIfKnown)
{
// Add the command list.
CommandLists.Add(CmdList);
// Add the completion event to wait on.
Events.Add(CompletionEvent);
// Add the draw count.
NumDrawsIfKnown.Add(InNumDrawsIfKnown);
}

void FParallelCommandListSet::WaitForTasks()
{
if (GOutstandingParallelCommandListSet)
{
GOutstandingParallelCommandListSet->WaitForTasksInternal();
}
}

void FParallelCommandListSet::WaitForTasksInternal()
{
// Collect the events that have not completed yet.
FGraphEventArray WaitOutstandingTasks;
for (int32 Index = 0; Index < Events.Num(); Index++)
{
if (!Events[Index]->IsComplete())
{
WaitOutstandingTasks.Add(Events[Index]);
}
}
// If any task is still in flight, wait for it to complete.
if (WaitOutstandingTasks.Num())
{
ENamedThreads::Type RenderThread_Local = ENamedThreads::GetRenderThread_Local();
FTaskGraphInterface::Get().WaitUntilTasksComplete(WaitOutstandingTasks, RenderThread_Local);
}
}
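The decision at the top of Dispatch about whether to translate in parallel is easy to isolate. The sketch below restates it as a pure function. `ShouldTranslateInParallel` and its parameters are hypothetical names that stand in for GRHISupportsParallelRHIExecute, CVarRHICmdMinCmdlistForParallelSubmit, MinDrawsPerCommandList and NumDrawsIfKnown, under the assumption that there is one draw-count entry per command list.

```cpp
#include <cassert>
#include <vector>

// Returns true when the command lists should be translated in parallel.
// NumDrawsIfKnown holds one entry per command list; -1 means "unknown".
inline bool ShouldTranslateInParallel(
    bool bRHISupportsParallelExecute,
    int MinCmdListsForParallelSubmit,
    int MinDrawsPerCommandList,
    const std::vector<int>& NumDrawsIfKnown)
{
    const int NumCommandLists = static_cast<int>(NumDrawsIfKnown.size());
    if (!bRHISupportsParallelExecute || NumCommandLists < MinCmdListsForParallelSubmit)
    {
        return false;
    }

    int Total = 0;
    for (int Count : NumDrawsIfKnown)
    {
        if (Count < 0)
        {
            // Unknown draw count: assume parallel translation is worthwhile.
            return true;
        }
        Total += Count;
    }
    // Too few draws overall do not justify the parallel path.
    return Total >= MinDrawsPerCommandList;
}
```

Note that the threshold compares the total draw count across all lists against MinDrawsPerCommandList, exactly as the listing does, rather than checking each list individually.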

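FParallelCommandListSet has the following subclasses, which implement the parallel rendering logic for different passes or situations:

  • FAnisotropyPassParallelCommandListSet: the parallel command list set for the anisotropy pass.
  • FPrePassParallelCommandListSet: the parallel command list set for the depth prepass.
  • FShadowParallelCommandListSet: the parallel command list set for shadow rendering.
  • FRDGParallelCommandListSet: the parallel command list set for the RDG system.

The following analyzes FPrePassParallelCommandListSet and FShadowParallelCommandListSet: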

// Engine\Source\Runtime\Renderer\Private\DepthRendering.cpp

class FPrePassParallelCommandListSet : public FParallelCommandListSet
{
public:
FPrePassParallelCommandListSet(FRHICommandListImmediate& InParentCmdList, const FSceneRenderer& InSceneRenderer, const FViewInfo& InView, bool bInCreateSceneContext)
: FParallelCommandListSet(GET_STATID(STAT_CLP_Prepass), InView, InParentCmdList, bInCreateSceneContext)
, SceneRenderer(InSceneRenderer)
{
}

virtual ~FPrePassParallelCommandListSet()
{
// Dispatch the command lists from the destructor.
Dispatch(true);
}

// Set the state on the command list.
virtual void SetStateOnCommandList(FRHICommandList& CmdList) override
{
FParallelCommandListSet::SetStateOnCommandList(CmdList);
FSceneRenderTargets::Get(CmdList).BeginRenderingPrePass(CmdList, false);
SetupPrePassView(CmdList, View, &SceneRenderer);
}

private:
const FSceneRenderer& SceneRenderer;
};

class FShadowParallelCommandListSet : public FParallelCommandListSet
{
public:
FShadowParallelCommandListSet(
FRHICommandListImmediate& InParentCmdList,
const FViewInfo& InView,
bool bInCreateSceneContext,
FProjectedShadowInfo& InProjectedShadowInfo,
FBeginShadowRenderPassFunction InBeginShadowRenderPass)
: FParallelCommandListSet(GET_STATID(STAT_CLP_Shadow), InView, InParentCmdList, bInCreateSceneContext)
, ProjectedShadowInfo(InProjectedShadowInfo)
, BeginShadowRenderPass(InBeginShadowRenderPass)
{
bBalanceCommands = false;
}

virtual ~FShadowParallelCommandListSet()
{
// Dispatch the command lists from the destructor.
Dispatch();
}

virtual void SetStateOnCommandList(FRHICommandList& CmdList) override
{
FParallelCommandListSet::SetStateOnCommandList(CmdList);
BeginShadowRenderPass(CmdList, false);
ProjectedShadowInfo.SetStateForView(CmdList);
}

private:
// The projected shadow info.
FProjectedShadowInfo& ProjectedShadowInfo;
// The function that begins the shadow render pass.
FBeginShadowRenderPassFunction BeginShadowRenderPass;
// The shadow depth render mode.
EShadowDepthRenderMode RenderMode;
};
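Both subclasses dispatch from their own destructor, so submission happens automatically when the set goes out of scope. The sketch below shows that pattern in isolation. `FMiniListSet`, `FMiniPrePassListSet` and `RunScopedDispatchDemo` are hypothetical names, and the log strings only mark the order of events. It also illustrates a plain C++ rule that plausibly explains why Dispatch must be called by the subclass: a virtual call made from the derived destructor still reaches the derived override, whereas the same call from the base destructor would not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical base set: collects work and offers a protected Dispatch().
class FMiniListSet
{
public:
    explicit FMiniListSet(std::vector<std::string>& InLog) : Log(InLog) {}
    virtual ~FMiniListSet() = default;

    void Add(int Id) { Pending.push_back(Id); }
    virtual void SetState() {}

protected:
    // Must be called by the subclass while its overrides are still reachable.
    void Dispatch()
    {
        SetState();
        for (int Id : Pending)
        {
            Log.push_back("submit " + std::to_string(Id));
        }
        Pending.clear();
    }

    std::vector<std::string>& Log;

private:
    std::vector<int> Pending;
};

// Hypothetical subclass that dispatches when it is destroyed.
class FMiniPrePassListSet : public FMiniListSet
{
public:
    using FMiniListSet::FMiniListSet;
    ~FMiniPrePassListSet() override { Dispatch(); }
    void SetState() override { Log.push_back("prepass state"); }
};

// Builds a set inside a scope and returns the recorded order of events.
inline std::vector<std::string> RunScopedDispatchDemo()
{
    std::vector<std::string> Log;
    {
        FMiniPrePassListSet Set(Log);
        Set.Add(0);
        Set.Add(1);
        Log.push_back("scope end");
    } // The destructor runs here and dispatches the pending work.
    return Log;
}
```

In this sketch the log reads "scope end" first and the submissions afterwards, which mirrors how the engine code fills the set inside a block and lets the closing brace trigger the dispatch.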

Using the classes above is fairly straightforward. Taking the PrePass as an example:

// Engine\Source\Runtime\Renderer\Private\DepthRendering.cpp

bool FDeferredShadingSceneRenderer::RenderPrePassViewParallel(const FViewInfo& View, FRHICommandListImmediate& ParentCmdList, TFunctionRef<void()> AfterTasksAreStarted, bool bDoPrePre)
{
bool bDepthWasCleared = false;

{
// Construct the FPrePassParallelCommandListSet instance.
FPrePassParallelCommandListSet ParallelCommandListSet(ParentCmdList, *this, View,
CVarRHICmdFlushRenderThreadTasksPrePass.GetValueOnRenderThread() == 0 && CVarRHICmdFlushRenderThreadTasks.GetValueOnRenderThread() == 0);
// Call FParallelMeshDrawCommandPass::DispatchDraw.
View.ParallelMeshDrawCommandPasses[EMeshPass::DepthPass].DispatchDraw(&ParallelCommandListSet, ParentCmdList);

if (bDoPrePre)
{
bDepthWasCleared = PreRenderPrePass(ParentCmdList);
}
// Leaving this scope destroys ParallelCommandListSet, whose destructor dispatches the command lists.
}

if (bDoPrePre)
{
AfterTasksAreStarted();
}

return bDepthWasCleared;
}

// Engine\Source\Runtime\Renderer\Private\MeshDrawCommands.cpp

void FParallelMeshDrawCommandPass::DispatchDraw(FParallelCommandListSet* ParallelCommandListSet, FRHICommandList& RHICmdList) const
{
if (MaxNumDraws <= 0)
{
return;
}

FRHIVertexBuffer* PrimitiveIdsBuffer = PrimitiveIdVertexBufferPoolEntry.BufferRHI;
const int32 BasePrimitiveIdsOffset = 0;

// Parallel mode.
if (ParallelCommandListSet)
{
if (TaskContext.bUseGPUScene)
{
// Once FMeshDrawCommandPassSetupTask has finished, queue a command for the RHI thread that uploads the PrimitiveIdVertexBuffer data.
FRHICommandListImmediate &RHICommandList = GetImmediateCommandList_ForRenderCommand();

if (TaskEventRef.IsValid())
{
RHICommandList.AddDispatchPrerequisite(TaskEventRef);
}

RHICommandList.EnqueueLambda([
VertexBuffer = PrimitiveIdsBuffer,
VertexBufferData = TaskContext.PrimitiveIdBufferData,
VertexBufferDataSize = TaskContext.PrimitiveIdBufferDataSize,
PrimitiveIdVertexBufferPoolEntry = PrimitiveIdVertexBufferPoolEntry](FRHICommandListImmediate& CmdList)
{
// Upload vertex buffer data.
void* RESTRICT Data = (void* RESTRICT)CmdList.LockVertexBuffer(VertexBuffer, 0, VertexBufferDataSize, RLM_WriteOnly);
FMemory::Memcpy(Data, VertexBufferData, VertexBufferDataSize);
CmdList.UnlockVertexBuffer(VertexBuffer);

FMemory::Free(VertexBufferData);
});

RHICommandList.RHIThreadFence(true);

bPrimitiveIdBufferDataOwnedByRHIThread = true;
}

const ENamedThreads::Type RenderThread = ENamedThreads::GetRenderThread();

// Collect the prerequisite tasks.
FGraphEventArray Prereqs;
if (ParallelCommandListSet->GetPrereqs())
{
Prereqs.Append(*ParallelCommandListSet->GetPrereqs());
}
if (TaskEventRef.IsValid())
{
Prereqs.Add(TaskEventRef);
} // 基於NumEstimatedDraws將工作平均分配給可用的task graph工作執行緒.
// 每個任務將根據FVisibleMeshDrawCommandProcessTask結果調整它的工作範圍.
const int32 NumThreads = FMath::Min<int32>(FTaskGraphInterface::Get().GetNumWorkerThreads(), ParallelCommandListSet->Width);
const int32 NumTasks = FMath::Min<int32>(NumThreads, FMath::DivideAndRoundUp(MaxNumDraws, ParallelCommandListSet->MinDrawsPerCommandList));
const int32 NumDrawsPerTask = FMath::DivideAndRoundUp(MaxNumDraws, NumTasks); // 建立NumTasks個FRHICommandList, 新增到ParallelCommandListSet.
for (int32 TaskIndex = 0; TaskIndex < NumTasks; TaskIndex++)
{
const int32 StartIndex = TaskIndex * NumDrawsPerTask;
const int32 NumDraws = FMath::Min(NumDrawsPerTask, MaxNumDraws - StartIndex);
checkSlow(NumDraws > 0); // 新建命令佇列.
FRHICommandList* CmdList = ParallelCommandListSet->NewParallelCommandList(); // 建立任務FDrawVisibleMeshCommandsAnyThreadTask, 獲得事件物件.
FGraphEventRef AnyThreadCompletionEvent = TGraphTask<FDrawVisibleMeshCommandsAnyThreadTask>::CreateTask(&Prereqs, RenderThread)
.ConstructAndDispatchWhenReady(*CmdList, TaskContext.MeshDrawCommands, TaskContext.MinimalPipelineStatePassSet, PrimitiveIdsBuffer, BasePrimitiveIdsOffset, TaskContext.bDynamicInstancing, TaskContext.InstanceFactor, TaskIndex, NumTasks);
// 新增命令/事件等資料到ParallelCommandListSet.
ParallelCommandListSet->AddParallelCommandList(CmdList, AnyThreadCompletionEvent, NumDraws);
}
}
else // 非並行模式.
{
(......)
}
}

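From the above, a call to FParallelMeshDrawCommandPass::DispatchDraw creates a number of FRHICommandList instances, FDrawVisibleMeshCommandsAnyThreadTask tasks, and their completion events, and adds all of them to the ParallelCommandListSet. The command lists are then actually dispatched when the ParallelCommandListSet is destroyed.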
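The way DispatchDraw splits draws across tasks can be illustrated with a small standalone sketch. This is not engine code: `FTaskRange` and `SplitDraws` are hypothetical names, and the three integer parameters merely stand in for MaxNumDraws, the thread/width limit, and MinDrawsPerCommandList. The early exit for empty ranges is an addition of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the range of draws handled by one task.
struct FTaskRange
{
    int StartIndex;
    int NumDraws;
};

// Integer division rounding up, in the spirit of FMath::DivideAndRoundUp.
static int DivideAndRoundUp(int Dividend, int Divisor)
{
    return (Dividend + Divisor - 1) / Divisor;
}

// Splits MaxNumDraws into contiguous ranges, using at most MaxThreads tasks
// and aiming for at least MinDrawsPerList draws per task.
std::vector<FTaskRange> SplitDraws(int MaxNumDraws, int MaxThreads, int MinDrawsPerList)
{
    std::vector<FTaskRange> Ranges;
    if (MaxNumDraws <= 0 || MaxThreads <= 0 || MinDrawsPerList <= 0)
    {
        return Ranges;
    }

    const int NumTasks = std::min(MaxThreads, DivideAndRoundUp(MaxNumDraws, MinDrawsPerList));
    const int NumDrawsPerTask = DivideAndRoundUp(MaxNumDraws, NumTasks);

    for (int TaskIndex = 0; TaskIndex < NumTasks; ++TaskIndex)
    {
        const int StartIndex = TaskIndex * NumDrawsPerTask;
        const int NumDraws = std::min(NumDrawsPerTask, MaxNumDraws - StartIndex);
        if (NumDraws <= 0)
        {
            break; // Sketch-only guard: rounding up can leave trailing tasks empty.
        }
        Ranges.push_back({StartIndex, NumDraws});
    }
    return Ranges;
}
```

For example, 1000 draws with 4 threads and a minimum of 64 draws per list yield four ranges of 250 draws each, while 100 draws under the same minimum yield only two ranges of 50.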

10.4.3.2 QueueParallelAsyncCommandListSubmit

After the previous subsection calls FParallelCommandListSet::Dispatch, execution enters FRHICommandListBase::QueueParallelAsyncCommandListSubmit:

void FRHICommandListBase::QueueParallelAsyncCommandListSubmit(FGraphEventRef* AnyThreadCompletionEvents, bool bIsPrepass, FRHICommandList** CmdLists, int32* NumDrawsIfKnown, int32 Num, int32 MinDrawsPerTranslate, bool bSpewMerge)
{
if (IsRunningRHIInSeparateThread())
{
// 在提交併行構建的子列表之前,執行立即命令列表上排隊的所有命令.
FRHICommandListImmediate& ImmediateCommandList = FRHICommandListExecutor::GetImmediateCommandList();
ImmediateCommandList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread); // 清空柵欄.
if (RHIThreadBufferLockFence.GetReference() && RHIThreadBufferLockFence->IsComplete())
{
RHIThreadBufferLockFence = nullptr;
}
} #if !UE_BUILD_SHIPPING
// 處理前先重新整理命令,這樣就能知道這個平行集打碎了什麼東西,或是之前有什麼東西.
if (CVarRHICmdFlushOnQueueParallelSubmit.GetValueOnRenderThread())
{
CSV_SCOPED_TIMING_STAT(RHITFlushes, QueueParallelAsyncCommandListSubmit);
FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::FlushRHIThread);
}
#endif // 確保開啟了RHI執行緒.
if (Num && IsRunningRHIInSeparateThread())
{
static const auto ICVarRHICmdBalanceParallelLists = IConsoleManager::Get().FindTConsoleVariableDataInt(TEXT("r.RHICmdBalanceParallelLists"));
// Unbalanced command list submission mode. Taken when r.RHICmdBalanceParallelLists == 0, CVarRHICmdBalanceTranslatesAfterTasks > 0,
// GRHISupportsParallelRHIExecute is true, and deferred contexts are enabled.
if (ICVarRHICmdBalanceParallelLists->GetValueOnRenderThread() == 0 && CVarRHICmdBalanceTranslatesAfterTasks.GetValueOnRenderThread() > 0 && GRHISupportsParallelRHIExecute && CVarRHICmdUseDeferredContexts.GetValueOnAnyThread() > 0)
{
// 處理前序任務.
FGraphEventArray Prereq;
FRHICommandListBase** RHICmdLists = (FRHICommandListBase**)Alloc(sizeof(FRHICommandListBase*) * Num, alignof(FRHICommandListBase*));
for (int32 Index = 0; Index < Num; Index++)
{
FGraphEventRef& AnyThreadCompletionEvent = AnyThreadCompletionEvents[Index];
FRHICommandList* CmdList = CmdLists[Index];
RHICmdLists[Index] = CmdList;
if (AnyThreadCompletionEvent.GetReference())
{
Prereq.Add(AnyThreadCompletionEvent);
WaitOutstandingTasks.Add(AnyThreadCompletionEvent);
}
} // 確保在開始任何並行轉譯之前,所有舊的緩衝區鎖都已完成.
if (RHIThreadBufferLockFence.GetReference())
{
Prereq.Add(RHIThreadBufferLockFence);
} // 新建FRHICommandList.
FRHICommandList* CmdList = new FRHICommandList(GetGPUMask());
// 拷貝渲染執行緒上下文.
CmdList->CopyRenderThreadContexts(*this);
// 建立設定轉譯任務(FParallelTranslateSetupCommandList).
FGraphEventRef TranslateSetupCompletionEvent = TGraphTask<FParallelTranslateSetupCommandList>::CreateTask(&Prereq, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(CmdList, &RHICmdLists[0], Num, bIsPrepass);
// 入隊命令佇列提交.
QueueCommandListSubmit(CmdList);
// 新增設定轉譯事件到列表.
AllOutstandingTasks.Add(TranslateSetupCompletionEvent);
// 避免在非同步命令列表之後的東西被繫結到它.
if (IsRunningRHIInSeparateThread())
{
FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
}
// 重新整理命令到RHI執行緒.
#if !UE_BUILD_SHIPPING
if (CVarRHICmdFlushOnQueueParallelSubmit.GetValueOnRenderThread())
{
FRHICommandListExecutor::GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::FlushRHIThread);
}
#endif
return;
} // 平衡命令佇列提交模式.
IRHICommandContextContainer* ContextContainer = nullptr;
bool bMerge = !!CVarRHICmdMergeSmallDeferredContexts.GetValueOnRenderThread();
int32 EffectiveThreads = 0;
int32 Start = 0;
int32 ThreadIndex = 0;
if (GRHISupportsParallelRHIExecute && CVarRHICmdUseDeferredContexts.GetValueOnAnyThread() > 0)
{
// 由於需要提前知道作業的數量,因此運行了兩次合併邏輯.(可改進)
while (Start < Num)
{
int32 Last = Start;
int32 DrawCnt = NumDrawsIfKnown[Start]; if (bMerge && DrawCnt >= 0)
{
while (Last < Num - 1 && NumDrawsIfKnown[Last + 1] >= 0 && DrawCnt + NumDrawsIfKnown[Last + 1] <= MinDrawsPerTranslate)
{
Last++;
DrawCnt += NumDrawsIfKnown[Last];
}
}
check(Last >= Start);
Start = Last + 1;
EffectiveThreads++;
} Start = 0;
ContextContainer = RHIGetCommandContextContainer(ThreadIndex, EffectiveThreads, GetGPUMask());
} if (ContextContainer)
{
// 又一次合併操作.
while (Start < Num)
{
int32 Last = Start;
int32 DrawCnt = NumDrawsIfKnown[Start];
int32 TotalMem = bSpewMerge ? CmdLists[Start]->GetUsedMemory() : 0; if (bMerge && DrawCnt >= 0)
{
while (Last < Num - 1 && NumDrawsIfKnown[Last + 1] >= 0 && DrawCnt + NumDrawsIfKnown[Last + 1] <= MinDrawsPerTranslate)
{
Last++;
DrawCnt += NumDrawsIfKnown[Last];
TotalMem += bSpewMerge ? CmdLists[Start]->GetUsedMemory() : 0;
}
} // 後面的邏輯和非平衡模式比較相似, 省略. (......) return;
}
} // 非並行模式.
(......)
}

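As shown above, parallel command list submission requires all of the following conditions:

  • The RHI thread is enabled, i.e. IsRunningRHIInSeparateThread() returns true.
  • The current graphics API supports parallel execution, i.e. GRHISupportsParallelRHIExecute is true.
  • Deferred contexts are enabled, i.e. CVarRHICmdUseDeferredContexts is greater than 0.

Whatever the graphics API, a main command list (the ParentCommandList) must be designated so that its QueueParallelAsyncCommandListSubmit can be called to submit the task that sets up the command lists. The task object created above is FParallelTranslateSetupCommandList, which the next subsection describes.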
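The three conditions can be condensed into a single predicate. The sketch below is a simplified model, not the engine's code: `FToyRHIState` and `CanSubmitInParallel` are hypothetical names, and the fields stand in for IsRunningRHIInSeparateThread(), GRHISupportsParallelRHIExecute, CVarRHICmdUseDeferredContexts, and the Num argument.

```cpp
#include <cassert>

// Hypothetical snapshot of the state that the real function queries at runtime.
struct FToyRHIState
{
    bool bRHIThreadEnabled;           // stands in for IsRunningRHIInSeparateThread()
    bool bSupportsParallelRHIExecute; // stands in for GRHISupportsParallelRHIExecute
    int UseDeferredContexts;          // stands in for CVarRHICmdUseDeferredContexts
    int NumCommandLists;              // stands in for the Num argument
};

// True only when every prerequisite for parallel submission holds.
bool CanSubmitInParallel(const FToyRHIState& State)
{
    return State.NumCommandLists > 0
        && State.bRHIThreadEnabled
        && State.bSupportsParallelRHIExecute
        && State.UseDeferredContexts > 0;
}
```

If any one of the inputs is off, the function returns false, which corresponds to falling through to the non-parallel path at the end of the listing.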

10.4.3.3 FParallelTranslateSetupCommandList

FParallelTranslateSetupCommandList sets up the tasks that submit the child command lists in parallel (or serially). It is defined as follows:

class FParallelTranslateSetupCommandList
{
// 用於提交子命令列表的父命令列表.
FRHICommandList* RHICmdList;
// 待提交的子命令佇列列表.
FRHICommandListBase** RHICmdLists; int32 NumCommandLists;
bool bIsPrepass;
int32 MinSize;
int32 MinCount; public:
FParallelTranslateSetupCommandList(FRHICommandList* InRHICmdList, FRHICommandListBase** InRHICmdLists, int32 InNumCommandLists, bool bInIsPrepass)
: RHICmdList(InRHICmdList)
, RHICmdLists(InRHICmdLists)
, NumCommandLists(InNumCommandLists)
, bIsPrepass(bInIsPrepass)
{
// 單個子命令佇列的最小尺寸.
MinSize = CVarRHICmdMinCmdlistSizeForParallelTranslate.GetValueOnRenderThread() * 1024;
MinCount = CVarRHICmdMinCmdlistForParallelTranslate.GetValueOnRenderThread();
} static FORCEINLINE TStatId GetStatId();
// 預期的執行緒.
static FORCEINLINE ENamedThreads::Type GetDesiredThread()
{
return CPrio_FParallelTranslateSetupCommandList.Get();
}
static FORCEINLINE ESubsequentsMode::Type GetSubsequentsMode() { return ESubsequentsMode::TrackSubsequents; } // 執行設定任務.
void DoTask(ENamedThreads::Type CurrentThread, const FGraphEventRef& MyCompletionGraphEvent)
{
TArray<int32, TInlineAllocator<64> > Sizes;
Sizes.Reserve(NumCommandLists);
for (int32 Index = 0; Index < NumCommandLists; Index++)
{
Sizes.Add(RHICmdLists[Index]->GetUsedMemory());
} int32 EffectiveThreads = 0;
int32 Start = 0;
// 合併繪製指令, 計算所需的執行緒數量.
while (Start < NumCommandLists)
{
int32 Last = Start;
int32 DrawCnt = Sizes[Start]; while (Last < NumCommandLists - 1 && DrawCnt + Sizes[Last + 1] <= MinSize)
{
Last++;
DrawCnt += Sizes[Last];
}
check(Last >= Start);
Start = Last + 1;
EffectiveThreads++;
} // 如果需要的執行緒數量太少, 則序列提交子命令佇列.
if (EffectiveThreads < MinCount)
{
FGraphEventRef Nothing;
for (int32 Index = 0; Index < NumCommandLists; Index++)
{
FRHICommandListBase* CmdList = RHICmdLists[Index];
// 使用了ALLOC_COMMAND_CL分配子命令佇列提交介面.
ALLOC_COMMAND_CL(*RHICmdList, FRHICommandWaitForAndSubmitSubList)(Nothing, CmdList);
#if WITH_MGPU
ALLOC_COMMAND_CL(*RHICmdList, FRHICommandSetGPUMask)(RHICmdList->GetGPUMask());
#endif
}
}
// 並行提交.
else
{
Start = 0;
int32 ThreadIndex = 0; // 合併數量太少的命令佇列.
while (Start < NumCommandLists)
{
int32 Last = Start;
int32 DrawCnt = Sizes[Start]; while (Last < NumCommandLists - 1 && DrawCnt + Sizes[Last + 1] <= MinSize)
{
Last++;
DrawCnt += Sizes[Last];
} // 獲取ContextContainer
IRHICommandContextContainer* ContextContainer = RHIGetCommandContextContainer(ThreadIndex, EffectiveThreads, RHICmdList->GetGPUMask()); // 建立並行轉譯任務FParallelTranslateCommandList.
FGraphEventRef TranslateCompletionEvent = TGraphTask<FParallelTranslateCommandList>::CreateTask(nullptr, ENamedThreads::GetRenderThread()).ConstructAndDispatchWhenReady(&RHICmdLists[Start], 1 + Last - Start, ContextContainer, bIsPrepass);
// 此任務結束前須確保轉譯任務完成.
MyCompletionGraphEvent->DontCompleteUntil(TranslateCompletionEvent);
// 呼叫RHICmdList的FRHICommandWaitForAndSubmitSubListParallel介面.
ALLOC_COMMAND_CL(*RHICmdList, FRHICommandWaitForAndSubmitSubListParallel)(TranslateCompletionEvent, ContextContainer, EffectiveThreads, ThreadIndex++);
Start = Last + 1;
}
check(EffectiveThreads == ThreadIndex);
}
}
};

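A few additional notes on the code above:

  • If there are too few commands, so that too few threads would be needed, the serial translation command FRHICommandWaitForAndSubmitSubList is used directly.

  • In the parallel branch, RHIGetCommandContextContainer obtains a context container from the concrete RHI subclass. It is implemented only on modern graphics platforms such as D3D12, Vulkan, and Metal; all other platforms return nullptr.

  • Each thread submits 1 to N child command lists. Adjacent lists are merged as long as their combined memory footprint (GetUsedMemory) stays within MinSize, so that small lists are batched together and each thread has a worthwhile amount of work to submit.

  • Each thread creates one translation task, FParallelTranslateCommandList, and then uses FRHICommandWaitForAndSubmitSubListParallel on RHICmdList to wait for the parallel submission of the child command lists.

  • Note that the desired thread of FParallelTranslateSetupCommandList is determined by CPrio_FParallelTranslateSetupCommandList: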

    FAutoConsoleTaskPriority CPrio_FParallelTranslateSetupCommandList(
    // Console variable name.
    TEXT("TaskGraph.TaskPriorities.ParallelTranslateSetupCommandList"),
    // Description.
    TEXT("Task and thread priority for FParallelTranslateSetupCommandList."),
    // Use a high-priority thread if one is available.
    ENamedThreads::HighThreadPriority,
    // Use high task priority.
    ENamedThreads::HighTaskPriority,
    // If no high-priority thread is available, fall back to a normal-priority thread but use high task priority instead.
    ENamedThreads::HighTaskPriority
    );

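    It follows that the TaskGraph system runs the translate-setup task with high priority, but the thread that launches it is still the render thread, not the RHI thread.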
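The merge loop in DoTask and the serial/parallel decision can be reproduced in isolation. The following is a minimal sketch under stated assumptions: `MergeAdjacent` and `ShouldTranslateInParallel` are hypothetical helper names, `Sizes` plays the role of the per-list GetUsedMemory() values, and `MinSize` and `MinCount` correspond to the two thresholds read in the constructor.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Greedily groups adjacent command lists: a group keeps absorbing the next
// list while the combined size stays within MinSize.
// Returns inclusive [first, last] index pairs, one per group.
std::vector<std::pair<int, int>> MergeAdjacent(const std::vector<int>& Sizes, int MinSize)
{
    std::vector<std::pair<int, int>> Groups;
    const int Num = static_cast<int>(Sizes.size());
    int Start = 0;
    while (Start < Num)
    {
        int Last = Start;
        int Total = Sizes[Start];
        while (Last < Num - 1 && Total + Sizes[Last + 1] <= MinSize)
        {
            ++Last;
            Total += Sizes[Last];
        }
        Groups.emplace_back(Start, Last);
        Start = Last + 1;
    }
    return Groups;
}

// The number of groups is the number of translation threads that would be used.
// Parallel translation is chosen only when that number reaches MinCount.
bool ShouldTranslateInParallel(const std::vector<int>& Sizes, int MinSize, int MinCount)
{
    return static_cast<int>(MergeAdjacent(Sizes, MinSize).size()) >= MinCount;
}
```

With sizes {10, 20, 30, 100, 5, 5} and MinSize 64, the first three lists merge into one group, the list of size 100 stays alone because it already exceeds the threshold, and the last two merge, giving three groups.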

10.4.3.4 FParallelTranslateCommandList

FParallelTranslateCommandList is where the command lists are actually translated. It is defined as follows:

class FParallelTranslateCommandList
{
// 待轉譯的命令列表.
FRHICommandListBase** RHICmdLists;
// 需轉譯的命令列表數量.
int32 NumCommandLists;
// 上下文容器.
IRHICommandContextContainer* ContextContainer;
// 是否提前深度pass.
bool bIsPrepass; public:
FParallelTranslateCommandList(FRHICommandListBase** InRHICmdLists, int32 InNumCommandLists, IRHICommandContextContainer* InContextContainer, bool bInIsPrepass)
: RHICmdLists(InRHICmdLists)
, NumCommandLists(InNumCommandLists)
, ContextContainer(InContextContainer)
, bIsPrepass(bInIsPrepass)
{
check(RHICmdLists && ContextContainer && NumCommandLists);
} static FORCEINLINE TStatId GetStatId(); // 預期的執行緒, 根據是否Prepass而定.
ENamedThreads::Type GetDesiredThread()
{
return bIsPrepass ? CPrio_FParallelTranslateCommandListPrepass.Get() : CPrio_FParallelTranslateCommandList.Get();
} static ESubsequentsMode::Type GetSubsequentsMode() { return ESubsequentsMode::TrackSubsequents; } // 執行任務.
void DoTask(ENamedThreads::Type CurrentThread, const FGraphEventRef& MyCompletionGraphEvent)
{
IRHICommandContext* Context = ContextContainer->GetContext();
for (int32 Index = 0; Index < NumCommandLists; Index++)
{
// 設定子命令佇列的上下文.
RHICmdLists[Index]->SetContext(Context);
// 刪除子命令佇列.
delete RHICmdLists[Index];
}
// 清理上下文.
ContextContainer->FinishContext();
}
};

A few notes on the code above:

  • GetDesiredThread picks one of two console variables depending on whether this is the prepass:

    FAutoConsoleTaskPriority CPrio_FParallelTranslateCommandListPrepass(
    TEXT("TaskGraph.TaskPriorities.ParallelTranslateCommandListPrepass"),
    TEXT("Task and thread priority for FParallelTranslateCommandList for the prepass, which we would like to get to the GPU asap."),
    ENamedThreads::NormalThreadPriority,
    ENamedThreads::HighTaskPriority
    );
    FAutoConsoleTaskPriority CPrio_FParallelTranslateCommandList(
    TEXT("TaskGraph.TaskPriorities.ParallelTranslateCommandList"),
    TEXT("Task and thread priority for FParallelTranslateCommandList."),
    ENamedThreads::NormalThreadPriority,
    ENamedThreads::NormalTaskPriority
    );

    So the prepass uses a normal-priority thread with high task priority, while other passes use a normal-priority thread with normal task priority.

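  • DoTask is very simple: it assigns a context to each command list, deletes the command list, and finally finishes the context. This raises a question: where does the translation actually run? After some digging, it turns out to happen in the destructor of FRHICommandListBase, with the following call chain: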

    FRHICommandListBase::~FRHICommandListBase()
    {
    // Flush the command list.
    Flush();
    GRHICommandList.OutstandingCmdListCount.Decrement();
    }
    void FRHICommandListBase::Flush()
    {
    // If there are any commands.
    if (HasCommands())
    {
    check(!IsImmediate());
    // Execute them through the global command list. GRHICommandList is of type FRHICommandListExecutor.
    GRHICommandList.ExecuteList(*this);
    }
    }
    void FRHICommandListExecutor::ExecuteList(FRHICommandListBase& CmdList)
    {
    if (IsInRenderingThread() && !GetImmediateCommandList().IsExecuting())
    {
    GetImmediateCommandList().ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
    }
    ExecuteInner(CmdList);
    }
    void FRHICommandListExecutor::ExecuteInner(FRHICommandListBase& CmdList)
    {
    (......)
    }

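    Once FRHICommandListExecutor::ExecuteInner is reached, FRHICommandListExecutor takes over; see section 10.4.1 (RHI command execution) for the process and its analysis.

To stress it once more: true parallel rendering requires a graphics API that supports parallel submission and translation. Otherwise the work can only run on the render thread as ordinary tasks.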
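The "translate on destruction" pattern described above can be shown with a toy model. This is a sketch, not the engine's implementation: `FToyContext`, `FToyCommandList`, and `TranslateAll` are hypothetical names, and commands are plain strings instead of RHI command objects.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a command context: it only records what it receives.
struct FToyContext
{
    std::vector<std::string> Log;
};

// Hypothetical child command list that replays its commands when destroyed.
class FToyCommandList
{
public:
    void Enqueue(std::string Name) { Commands.push_back(std::move(Name)); }
    void SetContext(FToyContext* InContext) { Context = InContext; }

    ~FToyCommandList() { Flush(); }

private:
    // Replays every recorded command into the bound context, if any.
    void Flush()
    {
        if (Context == nullptr)
        {
            return;
        }
        for (const std::string& Command : Commands)
        {
            Context->Log.push_back(Command);
        }
        Commands.clear();
    }

    std::vector<std::string> Commands;
    FToyContext* Context = nullptr;
};

// Same shape as DoTask: bind the context, then delete each list.
// The deletion is what triggers the replay.
void TranslateAll(std::vector<FToyCommandList*>& Lists, FToyContext& Context)
{
    for (FToyCommandList* List : Lists)
    {
        List->SetContext(&Context);
        delete List;
    }
    Lists.clear();
}
```

The design keeps the task body short, at the cost of hiding the real work inside a destructor, which is why the call chain is not obvious on first reading.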

10.4.4 Pass Rendering

10.4.4.1 Regular Pass Rendering

Regular pass rendering involves the following interfaces and types:

// Engine\Source\Runtime\RHI\Public\RHIResources.h

// 渲染通道資訊.
struct FRHIRenderPassInfo
{
// 渲染紋理資訊.
struct FColorEntry
{
FRHITexture* RenderTarget;
FRHITexture* ResolveTarget;
int32 ArraySlice;
uint8 MipIndex;
ERenderTargetActions Action;
};
FColorEntry ColorRenderTargets[MaxSimultaneousRenderTargets]; // 深度模板資訊.
struct FDepthStencilEntry
{
FRHITexture* DepthStencilTarget;
FRHITexture* ResolveTarget;
EDepthStencilTargetActions Action;
FExclusiveDepthStencil ExclusiveDepthStencil;
};
FDepthStencilEntry DepthStencilRenderTarget; // 解析引數.
FResolveParams ResolveParameters; // 部分RHI可以使用紋理來控制不同區域的取樣和/或陰影解析度
FTextureRHIRef FoveationTexture = nullptr; // 部分RHI需要一個提示,遮擋查詢將在這個渲染通道中使用
uint32 NumOcclusionQueries = 0;
bool bOcclusionQueries = false; // 部分RHI需要知道,在為部分資源轉換生成mip對映的情況下,這個渲染通道是否將讀取和寫入相同的紋理.
bool bGeneratingMips = false; // 如果這個renderpass應該是多檢視,則需要多少檢視.
uint8 MultiViewCount = 0; // 部分RHI的提示,渲染通道將有特定的子通道.
ESubpassHint SubpassHint = ESubpassHint::None; // 是否太多UAV.
bool bTooManyUAVs = false;
bool bIsMSAA = false; // 不同的建構函式. // Color, no depth, optional resolve, optional mip, optional array slice
explicit FRHIRenderPassInfo(FRHITexture* ColorRT, ERenderTargetActions ColorAction, FRHITexture* ResolveRT = nullptr, uint32 InMipIndex = 0, int32 InArraySlice = -1);
// Color MRTs, no depth
explicit FRHIRenderPassInfo(int32 NumColorRTs, FRHITexture* ColorRTs[], ERenderTargetActions ColorAction);
// Color MRTs, no depth
explicit FRHIRenderPassInfo(int32 NumColorRTs, FRHITexture* ColorRTs[], ERenderTargetActions ColorAction, FRHITexture* ResolveTargets[]);
// Color MRTs and depth
explicit FRHIRenderPassInfo(int32 NumColorRTs, FRHITexture* ColorRTs[], ERenderTargetActions ColorAction, FRHITexture* DepthRT, EDepthStencilTargetActions DepthActions, FExclusiveDepthStencil InEDS = FExclusiveDepthStencil::DepthWrite_StencilWrite);
// Color MRTs and depth
explicit FRHIRenderPassInfo(int32 NumColorRTs, FRHITexture* ColorRTs[], ERenderTargetActions ColorAction, FRHITexture* ResolveRTs[], FRHITexture* DepthRT, EDepthStencilTargetActions DepthActions, FRHITexture* ResolveDepthRT, FExclusiveDepthStencil InEDS = FExclusiveDepthStencil::DepthWrite_StencilWrite);
// Depth, no color
explicit FRHIRenderPassInfo(FRHITexture* DepthRT, EDepthStencilTargetActions DepthActions, FRHITexture* ResolveDepthRT = nullptr, FExclusiveDepthStencil InEDS = FExclusiveDepthStencil::DepthWrite_StencilWrite);
// Depth, no color, occlusion queries
explicit FRHIRenderPassInfo(FRHITexture* DepthRT, uint32 InNumOcclusionQueries, EDepthStencilTargetActions DepthActions, FRHITexture* ResolveDepthRT = nullptr, FExclusiveDepthStencil InEDS = FExclusiveDepthStencil::DepthWrite_StencilWrite);
// Color and depth
explicit FRHIRenderPassInfo(FRHITexture* ColorRT, ERenderTargetActions ColorAction, FRHITexture* DepthRT, EDepthStencilTargetActions DepthActions, FExclusiveDepthStencil InEDS = FExclusiveDepthStencil::DepthWrite_StencilWrite);
// Color and depth with resolve
explicit FRHIRenderPassInfo(FRHITexture* ColorRT, ERenderTargetActions ColorAction, FRHITexture* ResolveColorRT,
FRHITexture* DepthRT, EDepthStencilTargetActions DepthActions, FRHITexture* ResolveDepthRT, FExclusiveDepthStencil InEDS = FExclusiveDepthStencil::DepthWrite_StencilWrite);
// Color and depth with resolve and optional sample density
explicit FRHIRenderPassInfo(FRHITexture* ColorRT, ERenderTargetActions ColorAction, FRHITexture* ResolveColorRT,
FRHITexture* DepthRT, EDepthStencilTargetActions DepthActions, FRHITexture* ResolveDepthRT, FRHITexture* InFoveationTexture, FExclusiveDepthStencil InEDS = FExclusiveDepthStencil::DepthWrite_StencilWrite); enum ENoRenderTargets
{
NoRenderTargets,
};
explicit FRHIRenderPassInfo(ENoRenderTargets Dummy);
explicit FRHIRenderPassInfo(); inline int32 GetNumColorRenderTargets() const;
RHI_API void Validate() const;
RHI_API void ConvertToRenderTargetsInfo(FRHISetRenderTargetsInfo& OutRTInfo) const; (......)
}; // Engine\Source\Runtime\RHI\Public\RHICommandList.h class RHI_API FRHICommandList : public FRHIComputeCommandList
{
public:
void BeginRenderPass(const FRHIRenderPassInfo& InInfo, const TCHAR* Name)
{
if (InInfo.bTooManyUAVs)
{
UE_LOG(LogRHI, Warning, TEXT("RenderPass %s has too many UAVs"));
}
InInfo.Validate(); // 直接呼叫RHI的介面.
if (Bypass())
{
GetContext().RHIBeginRenderPass(InInfo, Name);
}
// 分配RHI命令.
else
{
TCHAR* NameCopy = AllocString(Name);
ALLOC_COMMAND(FRHICommandBeginRenderPass)(InInfo, NameCopy);
}
// 設定在RenderPass內標記.
Data.bInsideRenderPass = true; // 快取活動的RT.
CacheActiveRenderTargets(InInfo);
// 重置子Pass.
ResetSubpass(InInfo.SubpassHint);
Data.bInsideRenderPass = true;
} void EndRenderPass()
{
// 呼叫或分配RHI介面.
if (Bypass())
{
GetContext().RHIEndRenderPass();
}
else
{
ALLOC_COMMAND(FRHICommandEndRenderPass)();
}
// 重置在RenderPass內標記.
Data.bInsideRenderPass = false;
// 重置子Pass標記為None.
ResetSubpass(ESubpassHint::None);
}
};

A usage example:

void FSceneRenderer::RenderShadowDepthMaps(FRHICommandListImmediate& RHICmdList)
{
(......) for (int32 AtlasIndex = 0; AtlasIndex < SortedShadowsForShadowDepthPass.TranslucencyShadowMapAtlases.Num(); AtlasIndex++)
{
const FSortedShadowMapAtlas& ShadowMapAtlas = SortedShadowsForShadowDepthPass.TranslucencyShadowMapAtlases[AtlasIndex];
FIntPoint TargetSize = ShadowMapAtlas.RenderTargets.ColorTargets[0]->GetDesc().Extent; FSceneRenderTargetItem ColorTarget0 = ShadowMapAtlas.RenderTargets.ColorTargets[0]->GetRenderTargetItem();
FSceneRenderTargetItem ColorTarget1 = ShadowMapAtlas.RenderTargets.ColorTargets[1]->GetRenderTargetItem(); FRHITexture* RenderTargetArray[2] =
{
ColorTarget0.TargetableTexture,
ColorTarget1.TargetableTexture
}; // 建立FRHIRenderPassInfo例項.
FRHIRenderPassInfo RPInfo(UE_ARRAY_COUNT(RenderTargetArray), RenderTargetArray, ERenderTargetActions::Load_Store);
TransitionRenderPassTargets(RHICmdList, RPInfo);
// 開始渲染Pass.
RHICmdList.BeginRenderPass(RPInfo, TEXT("RenderTranslucencyDepths"));
{
// 渲染陰影.
for (int32 ShadowIndex = 0; ShadowIndex < ShadowMapAtlas.Shadows.Num(); ShadowIndex++)
{
FProjectedShadowInfo* ProjectedShadowInfo = ShadowMapAtlas.Shadows[ShadowIndex];
ProjectedShadowInfo->SetupShadowUniformBuffers(RHICmdList, Scene);
ProjectedShadowInfo->RenderTranslucencyDepths(RHICmdList, this);
}
}
// 結束渲染Pass.
RHICmdList.EndRenderPass(); RHICmdList.Transition(FRHITransitionInfo(ColorTarget0.TargetableTexture, ERHIAccess::Unknown, ERHIAccess::SRVMask));
RHICmdList.Transition(FRHITransitionInfo(ColorTarget1.TargetableTexture, ERHIAccess::Unknown, ERHIAccess::SRVMask));
} (......)
}

10.4.4.2 Subpass Rendering

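First, some background on where subpasses come from, what they do, and what characterizes them.

In traditional multi-pass rendering, each pass typically ends by producing a set of render textures, some of which are bound as shader parameters for the next pass to sample. This sampling is unrestricted: it can read arbitrary neighboring pixels and use any texture filtering mode. That flexibility is costly on devices with a TBR (Tile-Based Renderer) hardware architecture: a pass that renders to a texture keeps its results in on-chip tile memory and writes them back to GPU memory (VRAM) when the pass ends, and this write-back costs both time and power.

Memory access model between traditional passes: data moves repeatedly between on-chip memory and global memory.

Now consider a special usage pattern: the texture rendered by one pass is consumed immediately by the next pass, and the next pass reads only the data at its own pixel position, never at neighboring positions. This is exactly the case subpasses are designed for. A texture rendered in a subpass stays in tile memory; it is not written back to VRAM when the subpass ends, and the tile memory contents are handed directly to the next subpass to read. This avoids the write-back to GPU memory at the end of a traditional pass and the subsequent read from GPU memory by the next pass, both of which cost time and power, and therefore improves performance.

Memory access model between subpasses: everything stays on-chip.

The UE interfaces and types related to subpasses are as follows: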

// Engine\Source\Runtime\RHI\Public\RHIResources.h

// Subpass hint passed to the RHI.
enum class ESubpassHint : uint8
{
None, // Traditional rendering (no subpass).
DepthReadSubpass, // Depth-read subpass.
DeferredShadingSubpass, // Mobile deferred shading subpass.
};
// Engine\Source\Runtime\RHI\Public\RHICommandList.h
class RHI_API FRHICommandListBase : public FNoncopyable
{
(......) protected:
// PSO上下文.
struct FPSOContext
{
uint32 CachedNumSimultanousRenderTargets = 0;
TStaticArray<FRHIRenderTargetView, MaxSimultaneousRenderTargets> CachedRenderTargets;
FRHIDepthRenderTargetView CachedDepthStencilTarget; // Subpass提示標記.
ESubpassHint SubpassHint = ESubpassHint::None;
uint8 SubpassIndex = 0;
uint8 MultiViewCount = 0;
bool HasFragmentDensityAttachment = false;
} PSOContext;
}; class RHI_API FRHICommandList : public FRHIComputeCommandList
{
public:
void BeginRenderPass(const FRHIRenderPassInfo& InInfo, const TCHAR* Name)
{
(......) CacheActiveRenderTargets(InInfo);
// 設定Subpass資料.
ResetSubpass(InInfo.SubpassHint);
Data.bInsideRenderPass = true;
} void EndRenderPass()
{
(......) // 重置Subpass標記為None.
ResetSubpass(ESubpassHint::None);
} // 下一個Subpass.
void NextSubpass()
{
// 分配或呼叫RHI介面.
if (Bypass())
{
GetContext().RHINextSubpass();
}
else
{
ALLOC_COMMAND(FRHICommandNextSubpass)();
} // 增加Subpass計數.
IncrementSubpass();
} // 增加subpass計數.
void IncrementSubpass()
{
PSOContext.SubpassIndex++;
} // 重置Subpass資料.
void ResetSubpass(ESubpassHint SubpassHint)
{
PSOContext.SubpassHint = SubpassHint;
PSOContext.SubpassIndex = 0;
}
};
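The subpass bookkeeping in the listing above amounts to a small state machine. The sketch below models only that bookkeeping, with no RHI calls: `EToySubpassHint` and `FToyPassTracker` are hypothetical names, and the assertions on begin/end pairing are an assumption of this sketch rather than something shown in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of ESubpassHint.
enum class EToySubpassHint : std::uint8_t
{
    None,
    DepthReadSubpass,
    DeferredShadingSubpass,
};

// Tracks the hint and index the same way PSOContext does in the listing.
class FToyPassTracker
{
public:
    void BeginRenderPass(EToySubpassHint InHint)
    {
        assert(!bInsideRenderPass); // passes must not nest in this sketch
        ResetSubpass(InHint);
        bInsideRenderPass = true;
    }

    void NextSubpass()
    {
        assert(bInsideRenderPass);
        ++SubpassIndex;
    }

    void EndRenderPass()
    {
        assert(bInsideRenderPass);
        bInsideRenderPass = false;
        ResetSubpass(EToySubpassHint::None);
    }

    EToySubpassHint GetHint() const { return SubpassHint; }
    int GetSubpassIndex() const { return SubpassIndex; }
    bool IsInsideRenderPass() const { return bInsideRenderPass; }

private:
    // Stores the new hint and restarts the subpass counter at zero.
    void ResetSubpass(EToySubpassHint InHint)
    {
        SubpassHint = InHint;
        SubpassIndex = 0;
    }

    EToySubpassHint SubpassHint = EToySubpassHint::None;
    int SubpassIndex = 0;
    bool bInsideRenderPass = false;
};
```

A pass that begins with the deferred shading hint and advances twice ends at subpass index 2; ending the pass returns the tracker to hint None and index 0.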

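In UE, subpasses are used mainly in the mobile renderer:

The reason is that TBR hardware accounts for a growing share of mobile devices, so it is natural and reasonable for subpasses to be the first choice of the main mobile renderer.

Subpass rendering also involves pass overlap. Using overlap raises GPU utilization and improves rendering performance (see the figure below).

Top: a subpass pipeline without overlap. Bottom: a subpass pipeline with overlap.

The RHI commands related to overlap mainly concern UAVs: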

class RHI_API FRHIComputeCommandList : public FRHICommandListBase
{
(......) void BeginUAVOverlap()
{
if (Bypass())
{
GetContext().RHIBeginUAVOverlap();
return;
}
ALLOC_COMMAND(FRHICommandBeginUAVOverlap)();
} void EndUAVOverlap()
{
if (Bypass())
{
GetContext().RHIEndUAVOverlap();
return;
}
ALLOC_COMMAND(FRHICommandEndUAVOverlap)();
} void BeginUAVOverlap(FRHIUnorderedAccessView* UAV)
{
FRHIUnorderedAccessView* UAVs[1] = { UAV };
BeginUAVOverlap(MakeArrayView(UAVs, 1));
} void EndUAVOverlap(FRHIUnorderedAccessView* UAV)
{
FRHIUnorderedAccessView* UAVs[1] = { UAV };
EndUAVOverlap(MakeArrayView(UAVs, 1));
} void BeginUAVOverlap(TArrayView<FRHIUnorderedAccessView* const> UAVs)
{
if (Bypass())
{
GetContext().RHIBeginUAVOverlap(UAVs);
return;
} const uint32 AllocSize = UAVs.Num() * sizeof(FRHIUnorderedAccessView*);
FRHIUnorderedAccessView** InlineUAVs = (FRHIUnorderedAccessView**)Alloc(AllocSize, alignof(FRHIUnorderedAccessView*));
FMemory::Memcpy(InlineUAVs, UAVs.GetData(), AllocSize);
ALLOC_COMMAND(FRHICommandBeginSpecificUAVOverlap)(MakeArrayView(InlineUAVs, UAVs.Num()));
} void EndUAVOverlap(TArrayView<FRHIUnorderedAccessView* const> UAVs)
{
if (Bypass())
{
GetContext().RHIEndUAVOverlap(UAVs);
return;
} const uint32 AllocSize = UAVs.Num() * sizeof(FRHIUnorderedAccessView*);
FRHIUnorderedAccessView** InlineUAVs = (FRHIUnorderedAccessView**)Alloc(AllocSize, alignof(FRHIUnorderedAccessView*));
FMemory::Memcpy(InlineUAVs, UAVs.GetData(), AllocSize);
ALLOC_COMMAND(FRHICommandEndSpecificUAVOverlap)(MakeArrayView(InlineUAVs, UAVs.Num()));
}
};

10.4.5 RHI Resource Management

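Section 10.2.2 (FRHIResource) already covered the basic interface of RHI resources. FRHIResource holds its own reference count and exposes functions to increment and decrement it: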

class RHI_API FRHIResource
{
public:
// 增加引用計數.
uint32 AddRef() const;
// 減少引用計數.
uint32 Release() const;
// 獲取引用計數.
uint32 GetRefCount() const;
};

In practice we do not reference FRHIResource instances or manage their counts directly. Instead, the TRefCountPtr template class manages RHI resources automatically:

// 各種RHI資源引用型別定義.
typedef TRefCountPtr<FRHISamplerState> FSamplerStateRHIRef;
typedef TRefCountPtr<FRHIRasterizerState> FRasterizerStateRHIRef;
typedef TRefCountPtr<FRHIDepthStencilState> FDepthStencilStateRHIRef;
typedef TRefCountPtr<FRHIBlendState> FBlendStateRHIRef;
typedef TRefCountPtr<FRHIVertexDeclaration> FVertexDeclarationRHIRef;
typedef TRefCountPtr<FRHIVertexShader> FVertexShaderRHIRef;
typedef TRefCountPtr<FRHIHullShader> FHullShaderRHIRef;
typedef TRefCountPtr<FRHIDomainShader> FDomainShaderRHIRef;
typedef TRefCountPtr<FRHIPixelShader> FPixelShaderRHIRef;
typedef TRefCountPtr<FRHIGeometryShader> FGeometryShaderRHIRef;
typedef TRefCountPtr<FRHIComputeShader> FComputeShaderRHIRef;
typedef TRefCountPtr<FRHIRayTracingShader> FRayTracingShaderRHIRef;
typedef TRefCountPtr<FRHIComputeFence> FComputeFenceRHIRef;
typedef TRefCountPtr<FRHIBoundShaderState> FBoundShaderStateRHIRef;
typedef TRefCountPtr<FRHIUniformBuffer> FUniformBufferRHIRef;
typedef TRefCountPtr<FRHIIndexBuffer> FIndexBufferRHIRef;
typedef TRefCountPtr<FRHIVertexBuffer> FVertexBufferRHIRef;
typedef TRefCountPtr<FRHIStructuredBuffer> FStructuredBufferRHIRef;
typedef TRefCountPtr<FRHITexture> FTextureRHIRef;
typedef TRefCountPtr<FRHITexture2D> FTexture2DRHIRef;
typedef TRefCountPtr<FRHITexture2DArray> FTexture2DArrayRHIRef;
typedef TRefCountPtr<FRHITexture3D> FTexture3DRHIRef;
typedef TRefCountPtr<FRHITextureCube> FTextureCubeRHIRef;
typedef TRefCountPtr<FRHITextureReference> FTextureReferenceRHIRef;
typedef TRefCountPtr<FRHIRenderQuery> FRenderQueryRHIRef;
typedef TRefCountPtr<FRHIRenderQueryPool> FRenderQueryPoolRHIRef;
typedef TRefCountPtr<FRHITimestampCalibrationQuery> FTimestampCalibrationQueryRHIRef;
typedef TRefCountPtr<FRHIGPUFence> FGPUFenceRHIRef;
typedef TRefCountPtr<FRHIViewport> FViewportRHIRef;
typedef TRefCountPtr<FRHIUnorderedAccessView> FUnorderedAccessViewRHIRef;
typedef TRefCountPtr<FRHIShaderResourceView> FShaderResourceViewRHIRef;
typedef TRefCountPtr<FRHIGraphicsPipelineState> FGraphicsPipelineStateRHIRef;
typedef TRefCountPtr<FRHIRayTracingPipelineState> FRayTracingPipelineStateRHIRef;

With these types, TRefCountPtr manages the reference count of RHI resources automatically, and the resource is released in FRHIResource::Release:

class RHI_API FRHIResource
{
uint32 Release() const
{
// 計數-1.
int32 NewValue = NumRefs.Decrement();
// 如果計數為0, 處理資源刪除.
if (NewValue == 0)
{
// 非延遲刪除, 直接delete.
if (!DeferDelete())
{
delete this;
}
// 延遲刪除模式.
else
{
// 使用平臺相關的原子對比, 為0則加入待刪除列表.
if (FPlatformAtomics::InterlockedCompareExchange(&MarkedForDelete, 1, 0) == 0)
{
PendingDeletes.Push(const_cast<FRHIResource*>(this));
}
}
} // 返回新的值.
return uint32(NewValue);
} bool DeferDelete() const
{
// Defer deletion unless the resource is flagged bDoNotDeferDelete, and only if GRHINeedsExtraDeletionLatency is true or the command list is not in bypass mode.
return !bDoNotDeferDelete && (GRHINeedsExtraDeletionLatency || !Bypass());
}
};
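The interplay between Release, the pending list, and the later flush can be modeled in a few lines. This is a single-threaded sketch under stated assumptions: `FToyResource` is a hypothetical class, a plain vector replaces the lock-free list, plain integers replace the atomic counter and the compare-exchange, and the `LiveCounter` reference exists only so the example can observe deletions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical ref-counted resource whose deletion is always deferred.
class FToyResource
{
public:
    explicit FToyResource(int& InLiveCounter) : LiveCounter(InLiveCounter) { ++LiveCounter; }

    void AddRef() { ++NumRefs; }

    // Dropping the last reference only queues the object for deletion.
    void Release()
    {
        assert(NumRefs > 0);
        if (--NumRefs == 0 && !bMarkedForDelete)
        {
            bMarkedForDelete = true;
            PendingDeletes().push_back(this);
        }
    }

    int GetRefCount() const { return NumRefs; }

    // Deletes queued objects that are still unreferenced. An object that
    // regained a reference in the meantime is kept and unmarked.
    static void FlushPendingDeletes()
    {
        std::vector<FToyResource*> ToDelete;
        ToDelete.swap(PendingDeletes());
        for (FToyResource* Resource : ToDelete)
        {
            if (Resource->NumRefs == 0)
            {
                delete Resource;
            }
            else
            {
                Resource->bMarkedForDelete = false;
            }
        }
    }

private:
    ~FToyResource() { --LiveCounter; }

    static std::vector<FToyResource*>& PendingDeletes()
    {
        static std::vector<FToyResource*> List;
        return List;
    }

    int NumRefs = 0;
    bool bMarkedForDelete = false;
    int& LiveCounter;
};
```

The private destructor forces every deletion through FlushPendingDeletes in this sketch, which makes the deferred path the only path; the engine additionally allows immediate deletion when deferral is not required.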

PendingDeletes is a static member of FRHIResource. The related data and interfaces are:

class RHI_API FRHIResource
{
public:
FRHIResource(bool InbDoNotDeferDelete = false)
: MarkedForDelete(0)
, bDoNotDeferDelete(InbDoNotDeferDelete)
, bCommitted(true)
{
}
virtual ~FRHIResource()
{
check(PlatformNeedsExtraDeletionLatency() || (NumRefs.GetValue() == 0 && (CurrentlyDeleting == this || bDoNotDeferDelete || Bypass()))); // this should not have any outstanding refs
} // 待刪除資源列表, 注意是無鎖無序的指標列表.
static TLockFreePointerListUnordered<FRHIResource, PLATFORM_CACHE_LINE_SIZE> PendingDeletes;
// 當前正在刪除的資源.
static FRHIResource* CurrentlyDeleting; // 平臺需要額外的刪除延遲.
static bool PlatformNeedsExtraDeletionLatency()
{
return GRHINeedsExtraDeletionLatency && GIsRHIInitialized;
} // 待刪除資源列表.
struct ResourcesToDelete
{
TArray<FRHIResource*> Resources;
uint32 FrameDeleted;
}; // 延遲刪除佇列.
static TArray<ResourcesToDelete> DeferredDeletionQueue;
static uint32 CurrentFrame;
}; void FRHIResource::FlushPendingDeletes(bool bFlushDeferredDeletes)
{
FRHICommandListImmediate& RHICmdList = FRHICommandListExecutor::GetImmediateCommandList(); // 在刪除RHI資源之前, 先確保命令列表已被重新整理到GPU.
RHICmdList.ImmediateFlush(EImmediateFlushType::FlushRHIThread);
// 確保沒有等待的任務.
FRHICommandListExecutor::CheckNoOutstandingCmdLists();
// 通知RHI重新整理完成.
if (GDynamicRHI)
{
GDynamicRHI->RHIPerFrameRHIFlushComplete();
} // 刪除匿名函式.
auto Delete = [](TArray<FRHIResource*>& ToDelete)
{
for (int32 Index = 0; Index < ToDelete.Num(); Index++)
{
FRHIResource* Ref = ToDelete[Index];
check(Ref->MarkedForDelete == 1);
if (Ref->GetRefCount() == 0) // caches can bring dead objects back to life
{
CurrentlyDeleting = Ref;
delete Ref;
CurrentlyDeleting = nullptr;
}
else
{
Ref->MarkedForDelete = 0;
FPlatformMisc::MemoryBarrier();
}
}
}; while (1)
{
if (PendingDeletes.IsEmpty())
{
break;
} // 平臺需要額外的刪除延遲.
if (PlatformNeedsExtraDeletionLatency())
{
const int32 Index = DeferredDeletionQueue.AddDefaulted();
// 加入延遲刪除佇列DeferredDeletionQueue.
ResourcesToDelete& ResourceBatch = DeferredDeletionQueue[Index];
ResourceBatch.FrameDeleted = CurrentFrame;
PendingDeletes.PopAll(ResourceBatch.Resources);
}
// 不需要額外的延遲, 刪除整個列表.
else
{
TArray<FRHIResource*> ToDelete;
PendingDeletes.PopAll(ToDelete);
Delete(ToDelete);
}
} const uint32 NumFramesToExpire = RHIRESOURCE_NUM_FRAMES_TO_EXPIRE; // 刪除DeferredDeletionQueue.
if (DeferredDeletionQueue.Num())
{
// 清空整個DeferredDeletionQueue佇列.
if (bFlushDeferredDeletes)
{
FRHICommandListExecutor::GetImmediateCommandList().BlockUntilGPUIdle(); for (int32 Idx = 0; Idx < DeferredDeletionQueue.Num(); ++Idx)
{
ResourcesToDelete& ResourceBatch = DeferredDeletionQueue[Idx];
Delete(ResourceBatch.Resources);
} DeferredDeletionQueue.Empty();
}
// 刪除過期的資源列表.
else
{
int32 DeletedBatchCount = 0;
while (DeletedBatchCount < DeferredDeletionQueue.Num())
{
ResourcesToDelete& ResourceBatch = DeferredDeletionQueue[DeletedBatchCount];
if (((ResourceBatch.FrameDeleted + NumFramesToExpire) < CurrentFrame) || !GIsRHIInitialized)
{
Delete(ResourceBatch.Resources);
++DeletedBatchCount;
}
else
{
break;
}
} if (DeletedBatchCount)
{
DeferredDeletionQueue.RemoveAt(0, DeletedBatchCount);
}
} ++CurrentFrame;
}
}
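The frame-delayed part of FlushPendingDeletes can be isolated as a queue of batches that expire after a number of frames. The following is a simplified model: `FToyBatch` and `FToyDeferredDeletionQueue` are hypothetical names, resources are represented by integer ids, the expiry length is a constructor parameter rather than RHIRESOURCE_NUM_FRAMES_TO_EXPIRE, and, unlike the listing, the frame counter advances on every call to `Tick`.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// One batch of resources queued in the same frame.
struct FToyBatch
{
    std::vector<int> ResourceIds;
    std::uint32_t FrameDeleted;
};

class FToyDeferredDeletionQueue
{
public:
    explicit FToyDeferredDeletionQueue(std::uint32_t InNumFramesToExpire)
        : NumFramesToExpire(InNumFramesToExpire)
    {
    }

    // Queues a batch stamped with the current frame number.
    void Enqueue(std::vector<int> Ids)
    {
        if (!Ids.empty())
        {
            Queue.push_back({std::move(Ids), CurrentFrame});
        }
    }

    // Destroys expired batches from the front of the queue, or every batch
    // when bFlushAll is set, then advances the frame counter.
    // Returns the ids destroyed by this call.
    std::vector<int> Tick(bool bFlushAll)
    {
        std::vector<int> Destroyed;
        while (!Queue.empty())
        {
            const FToyBatch& Front = Queue.front();
            const bool bExpired = Front.FrameDeleted + NumFramesToExpire < CurrentFrame;
            if (!bFlushAll && !bExpired)
            {
                break; // Batches are in frame order, so later ones are not expired either.
            }
            Destroyed.insert(Destroyed.end(), Front.ResourceIds.begin(), Front.ResourceIds.end());
            Queue.pop_front();
        }
        ++CurrentFrame;
        return Destroyed;
    }

private:
    std::deque<FToyBatch> Queue;
    std::uint32_t NumFramesToExpire;
    std::uint32_t CurrentFrame = 0;
};
```

With an expiry of 3 frames, a batch queued in frame 0 survives the first four ticks and is destroyed on the fifth, because the comparison is strict.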

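Note, however, that FRHIResource's own destructor does not release any RHI resources. That normally happens in the destructor of the platform-specific subclass of FRHIResource. Take FD3D11UniformBuffer as an example: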

// Engine\Source\Runtime\Windows\D3D11RHI\Public\D3D11Resources.h

class FD3D11UniformBuffer : public FRHIUniformBuffer
{
public:
// D3D11固定緩衝資源.
TRefCountPtr<ID3D11Buffer> Resource;
// 包含了RHI引用的資源表.
TArray<TRefCountPtr<FRHIResource> > ResourceTable; FD3D11UniformBuffer(class FD3D11DynamicRHI* InD3D11RHI, const FRHIUniformBufferLayout& InLayout, ID3D11Buffer* InResource,const FRingAllocation& InRingAllocation);
virtual ~FD3D11UniformBuffer(); (......)
}; // Engine\Source\Runtime\Windows\D3D11RHI\Private\D3D11UniformBuffer.cpp FD3D11UniformBuffer::~FD3D11UniformBuffer()
{
if (!RingAllocation.IsValid() && Resource != nullptr)
{
D3D11_BUFFER_DESC Desc;
Resource->GetDesc(&Desc); // 將此統一緩衝區返回給空閒池.
if (Desc.CPUAccessFlags == D3D11_CPU_ACCESS_WRITE && Desc.Usage == D3D11_USAGE_DYNAMIC)
{
FPooledUniformBuffer NewEntry;
NewEntry.Buffer = Resource;
NewEntry.FrameFreed = GFrameNumberRenderThread;
NewEntry.CreatedSize = Desc.ByteWidth; // Add to this frame's array of free uniform buffers
const int32 SafeFrameIndex = (GFrameNumberRenderThread - 1) % NumSafeFrames;
const uint32 BucketIndex = GetPoolBucketIndex(Desc.ByteWidth);
int32 LastNum = SafeUniformBufferPools[SafeFrameIndex][BucketIndex].Num();
SafeUniformBufferPools[SafeFrameIndex][BucketIndex].Add(NewEntry); FPlatformMisc::MemoryBarrier(); // check for unwanted concurrency
}
}
}

The analysis above shows that RHI resources are released mainly in FlushPendingDeletes, which is called from the following places:

// Engine\Source\Runtime\RenderCore\Private\RenderingThread.cpp

void FlushPendingDeleteRHIResources_RenderThread()
{
if (!IsRunningRHIInSeparateThread())
{
FRHIResource::FlushPendingDeletes();
}
}

// Engine\Source\Runtime\RHI\Private\RHICommandList.cpp

void FRHICommandListExecutor::LatchBypass()
{
#if CAN_TOGGLE_COMMAND_LIST_BYPASS
if (IsRunningRHIInSeparateThread())
{
(......)
}
else
{
(......)

if (NewBypass && !bLatchedBypass)
{
FRHIResource::FlushPendingDeletes();
}
}
#endif

(......)
}

// Engine\Source\Runtime\RHI\Public\RHICommandList.inl

void FRHICommandListImmediate::ImmediateFlush(EImmediateFlushType::Type FlushType)
{
switch (FlushType)
{
(......)

case EImmediateFlushType::FlushRHIThreadFlushResources:
case EImmediateFlushType::FlushRHIThreadFlushResourcesFlushDeferredDeletes:
{
(......)

PipelineStateCache::FlushResources();
FRHIResource::FlushPendingDeletes(FlushType == EImmediateFlushType::FlushRHIThreadFlushResourcesFlushDeferredDeletes);
}
break;
(......)
}
}

These are the main call sites of FlushPendingDeletes in the RHI abstraction layer, but the following platform-specific interfaces call it as well:

  • FD3D12Adapter::Cleanup()
  • FD3D12Device::Cleanup()
  • FVulkanDevice::Destroy()
  • FVulkanDynamicRHI::Shutdown()
  • FD3D11DynamicRHI::CleanupD3DDevice()

10.4.6 Multithreaded Rendering Revisited

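The earlier article in this series, "Dissecting the Unreal Rendering System (02): Multithreaded Rendering", already describes UE's multithreading architecture and rendering mechanism in detail. This section adds a few notes with reference to the figure below.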

UE's rendering flow involves up to four kinds of workers: the game thread, the render thread, the RHI thread, and the GPU (including the driver).

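The game thread drives the whole engine. It supplies all source data and events, which in turn drive the render thread and the RHI thread. The game thread runs at most one frame ahead of the render thread. More precisely, if the render thread has not finished frame N by the time the game thread finishes its Tick for frame N+1, the game thread is blocked by the render thread. Conversely, if the game thread is overloaded and fails to send events and data to the render thread in time, the render thread stalls as well.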
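
Both stall directions can be seen in a small deterministic timeline model. This is only a sketch under simplifying assumptions, not how the engine schedules work: every frame costs the same on each thread, a frame is handed to the render thread in one piece at the end of the game Tick, and `SimulateFrames` and `PipelineStats` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>

// Accumulated waiting time for each side, in arbitrary time units.
struct PipelineStats {
    std::int64_t gameStall = 0;   // Game thread blocked on the render thread.
    std::int64_t renderIdle = 0;  // Render thread waiting for work.
    std::int64_t totalTime = 0;   // Time at which the last frame finished rendering.
};

// Simulate frameCount frames with fixed per-frame costs on both threads.
PipelineStats SimulateFrames(int frameCount, std::int64_t gameCost,
                             std::int64_t renderCost) {
    PipelineStats stats;
    std::int64_t gameStart = 0;      // When the game thread starts the current Tick.
    std::int64_t prevRenderEnd = 0;  // When the render thread finished the previous frame.

    for (int frame = 0; frame < frameCount; ++frame) {
        const std::int64_t tickEnd = gameStart + gameCost;

        // One-frame lead: the game thread may not move on to the next frame
        // until the render thread has finished the previous one.
        const std::int64_t handOff = std::max(tickEnd, prevRenderEnd);
        stats.gameStall += handOff - tickEnd;

        // The render thread starts as soon as the frame is handed over;
        // any time since it finished the previous frame was spent idle.
        stats.renderIdle += handOff - prevRenderEnd;
        prevRenderEnd = handOff + renderCost;

        // The game thread begins its next Tick right after the hand-off.
        gameStart = handOff;
    }

    stats.totalTime = prevRenderEnd;
    return stats;
}
```

In this model, `SimulateFrames(4, 10, 30)` makes the game thread the one that waits, while `SimulateFrames(4, 30, 10)` leaves the render thread idle for most of each frame. Either way, the slower side sets the frame rate.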

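The render thread generates the intermediate RHI commands and dispatches and flushes them to the RHI thread at appropriate times. A stall on the render thread can therefore stall the RHI thread too.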

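The RHI thread dispatches (optionally), translates, and submits commands. The last step of rendering is SwapBuffer, which has to wait for the GPU to finish its rendering work. A busy GPU can therefore stall the RHI thread as well.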

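Unlike the game thread, the render thread, the RHI thread, and the GPU all have gaps in their workloads. The timing with which the game thread hands over rendering tasks affects how densely the rendering work is packed and how long rendering takes. Submitting many small batches wastes rendering efficiency.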

10.4.7 RHI Console Variables

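As the code in earlier sections shows, the RHI system involves a great many console variables. Some of them are listed below to help with debugging and with tuning RHI rendering quality or performance: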

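| Name | Description |
| --- | --- |
| r.RHI.Name | Shows the name of the current RHI, e.g. D3D11. |
| r.RHICmdAsyncRHIThreadDispatch | Experimental option. Whether RHI dispatches are done asynchronously. Keeps data flowing to the RHI thread faster and avoids a block at the end of the frame. |
| r.RHICmdBalanceParallelLists | Enables a preprocess of the draw lists that tries to balance the load across command lists. 0: off, 1: on, 2: experimental, uses the previous frame's results (does nothing in split screen and similar cases). |
| r.RHICmdBalanceTranslatesAfterTasks | Experimental option. Balances the parallel translates after the render tasks are complete. Minimizes the number of deferred contexts but adds latency to starting the translates. |
| r.RHICmdBufferWriteLocks | Only relevant with an RHI thread. Debugging option for diagnosing problems with buffer locks. |
| r.RHICmdBypass | Whether to bypass the RHI command list and send RHI commands immediately. 0: disabled (required for the multithreaded renderer), 1: enabled. |
| r.RHICmdCollectRHIThreadStatsFromHighLevel | Pushes stats onto the RHI thread's executes so that you can tell which high-level pass they came from. Has an adverse effect on frame rate. On by default. |
| r.RHICmdFlushOnQueueParallelSubmit | Waits for parallel command lists to complete immediately after submitting them. For issue diagnosis. Only available on some RHIs. |
| r.RHICmdFlushRenderThreadTasks | If true, render thread tasks are flushed on every pass. For issue diagnosis. This is a master switch for several finer-grained cvars. |
| r.RHICmdForceRHIFlush | Forces a flush for every task sent to the RHI thread. For issue diagnosis. |
| r.RHICmdMergeSmallDeferredContexts | Merges small parallel translate tasks, based on r.RHICmdMinDrawsPerParallelCmdList. |
| r.RHICmdUseDeferredContexts | Uses deferred contexts to execute command lists in parallel. Only available on some RHIs. |
| r.RHICmdUseParallelAlgorithms | True to use parallel algorithms. Ignored if r.RHICmdBypass is 1. |
| r.RHICmdUseThread | Uses the RHI thread. For issue diagnosis. |
| r.RHICmdWidth | Controls the task granularity of a great number of things in the parallel renderer. |
| r.RHIThread.Enable | Enables or disables the RHI thread and determines whether RHI work runs on a dedicated thread. |
| RHI.GPUHitchThreshold | Threshold for detecting hitches on the GPU (in milliseconds). |
| RHI.MaximumFrameLatency | Number of frames that can be queued for rendering. |
| RHI.SyncThreshold | Number of consecutive "fast" frames before vsync is enabled. |
| RHI.TargetRefreshRate | If non-zero, the display never updates more often than the target refresh rate (in Hz). |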

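Note that the table above covers only some of the RHI-related variables; many more are not listed. The full set of commands can be viewed from the following menu: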

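10.5 Summary

This article covered the basic concepts, types, and mechanisms of UE's RHI system. Hopefully, after working through it, readers will no longer find the RHI unfamiliar and will be able to use and extend it with ease.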

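10.5.1 Questions for Reflection

As usual, here are a few small questions to help deepen your understanding of UE's RHI system:

  • What types of RHI resources are there? How do they relate to, and differ from, the resources of the render layer? How does the rendering system delete RHI resources?

  • What are the main types of RHI commands? What are the execution mechanism and flow of a command list?

  • Briefly describe the relationship between the RHI contexts and DynamicRHI. Briefly describe the architecture of the D3D11 implementation.

  • How are UE's threads related to one another? What factors can cause them to stall?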

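Special Notes

  • Thanks to the authors of all the references. Some images come from the references and from the web; they will be removed on request if they infringe any rights.
  • This series is the author's original work and is published only on cnblogs. Feel free to share the link to this article, but do not repost it without permission.
  • The series is ongoing. For the full table of contents, see the content outline.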

References