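8.1 Overview
A shader is a program of logic instructions executed on the GPU. Depending on the execution unit it runs on, it can be a vertex shader (Vertex Shader), pixel shader (Pixel Shader), or compute shader (Compute Shader), as well as a geometry shader, mesh shader, and so on.
UE wraps and abstracts its shaders heavily in order to support multiple platforms and graphics APIs, which produces a large number of types and concepts. In addition, to optimize and to improve code reuse, it adds further concepts and types such as permutations, PSO, and DDC.
Many earlier chapters touched on shader concepts, types, and code. This chapter describes the shader system in greater depth and breadth. It mainly covers the following aspects of UE:
- Basic shader concepts.
- Basic shader types.
- Shader implementation layers.
- Shader usage and use cases.
- Shader implementation and principles.
- The cross-platform shader mechanism.
Note that the shaders discussed in this chapter include both C++-side concepts and types and GPU-side concepts and types.
8.2 Shader Basics
This section analyzes the basic concepts and types related to shaders and describes their basic relationships and usage.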
8.2.1 FShader
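FShader is the type that represents a compiled piece of shader code together with its parameter bindings. It is one of the most basic, central, and frequently used types in rendering code. It is defined as follows: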
// Engine\Source\Runtime\RenderCore\Public\Shader.h
class RENDERCORE_API FShader
{
public:
(......)
// Modify the compilation environment before compilation is triggered; subclasses can redefine this.
static void ModifyCompilationEnvironment(const FShaderPermutationParameters&, FShaderCompilerEnvironment&) {}
// Whether the given permutation needs to be compiled; subclasses can redefine this.
static bool ShouldCompilePermutation(const FShaderPermutationParameters&) { return true; }
// Check whether the compiled result is valid; subclasses can redefine this.
static bool ValidateCompiledResult(EShaderPlatform InPlatform, const FShaderParameterMap& InParameterMap, TArray<FString>& OutError) { return true; }
// Accessors for the hashes of the various data.
const FSHAHash& GetHash() const;
const FSHAHash& GetVertexFactoryHash() const;
const FSHAHash& GetOutputHash() const;
// Store and validate the compiled result of the shader code.
void Finalize(const FShaderMapResourceCode* Code);
// Data accessors.
inline FShaderType* GetType(const FShaderMapPointerTable& InPointerTable) const { return Type.Get(InPointerTable.ShaderTypes); }
inline FShaderType* GetType(const FPointerTableBase* InPointerTable) const { return Type.Get(InPointerTable); }
inline FVertexFactoryType* GetVertexFactoryType(const FShaderMapPointerTable& InPointerTable) const { return VFType.Get(InPointerTable.VFTypes); }
inline FVertexFactoryType* GetVertexFactoryType(const FPointerTableBase* InPointerTable) const { return VFType.Get(InPointerTable); }
inline FShaderType* GetTypeUnfrozen() const { return Type.GetUnfrozen(); }
inline int32 GetResourceIndex() const { checkSlow(ResourceIndex != INDEX_NONE); return ResourceIndex; }
inline EShaderPlatform GetShaderPlatform() const { return Target.GetPlatform(); }
inline EShaderFrequency GetFrequency() const { return Target.GetFrequency(); }
inline const FShaderTarget GetTarget() const { return Target; }
inline bool IsFrozen() const { return Type.IsFrozen(); }
inline uint32 GetNumInstructions() const { return NumInstructions; }
#if WITH_EDITORONLY_DATA
inline uint32 GetNumTextureSamplers() const { return NumTextureSamplers; }
inline uint32 GetCodeSize() const { return CodeSize; }
inline void SetNumInstructions(uint32 Value) { NumInstructions = Value; }
#else
inline uint32 GetNumTextureSamplers() const { return 0u; }
inline uint32 GetCodeSize() const { return 0u; }
#endif
// Try to return the automatically bound uniform buffer matching the given type; if none exists, return an unbound one.
template<typename UniformBufferStructType>
const TShaderUniformBufferParameter<UniformBufferStructType>& GetUniformBufferParameter() const;
const FShaderUniformBufferParameter& GetUniformBufferParameter(const FShaderParametersMetadata* SearchStruct) const;
const FShaderUniformBufferParameter& GetUniformBufferParameter(const FHashedName SearchName) const;
const FShaderParametersMetadata* FindAutomaticallyBoundUniformBufferStruct(int32 BaseIndex) const;
static inline const FShaderParametersMetadata* GetRootParametersMetadata();
(......)
public:
// Shader parameter bindings.
LAYOUT_FIELD(FShaderParameterBindings, Bindings);
// Mapping information for the shader parameter bindings.
LAYOUT_FIELD(FShaderParameterMapInfo, ParameterMapInfo);
protected:
LAYOUT_FIELD(TMemoryImageArray<FHashedName>, UniformBufferParameterStructs);
LAYOUT_FIELD(TMemoryImageArray<FShaderUniformBufferParameter>, UniformBufferParameters);
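// The following 3 are editor-only fields.
// Hash of the shader's compiled output and resulting parameter map, used to find matching resources.
LAYOUT_FIELD_EDITORONLY(FSHAHash, OutputHash);
// Hash of the vertex factory source.
LAYOUT_FIELD_EDITORONLY(FSHAHash, VFSourceHash);
// Hash of the shader source.
LAYOUT_FIELD_EDITORONLY(FSHAHash, SourceHash);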
private:
// Shader type.
LAYOUT_FIELD(TIndexedPtr<FShaderType>, Type);
// Vertex factory type.
LAYOUT_FIELD(TIndexedPtr<FVertexFactoryType>, VFType);
// Target platform and shader frequency.
LAYOUT_FIELD(FShaderTarget, Target);
// Index of the shader within the FShaderMapResource.
LAYOUT_FIELD(int32, ResourceIndex);
// Number of shader instructions.
LAYOUT_FIELD(uint32, NumInstructions);
// Number of texture samplers.
LAYOUT_FIELD_EDITORONLY(uint32, NumTextureSamplers);
// Shader code size.
LAYOUT_FIELD_EDITORONLY(uint32, CodeSize);
};
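As shown above, FShader stores the bound parameters, the vertex factory, and the various compiled resources associated with a shader. It also provides hooks for modifying and validating compilation, plus accessors for its data.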
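The compilation hooks above are static functions, so a subclass does not override them virtually; it redeclares a static function with the same name, and the type system picks it up at compile time. The following is a minimal standalone sketch of that pattern, not engine code: PermutationParams, ShaderBase, EvenOnlyShader, DefaultShader, ShaderTypeInfo, MakeTypeInfo, and CollectPermutations are all hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the permutation parameters passed to the hooks.
struct PermutationParams { int PermutationId = 0; };

// Base class with a default static hook, mirroring the "return true" default.
struct ShaderBase {
    static bool ShouldCompilePermutation(const PermutationParams&) { return true; }
};

// A subclass hides the base hook by redeclaring a static function of the same name.
struct EvenOnlyShader : ShaderBase {
    static bool ShouldCompilePermutation(const PermutationParams& P) {
        return P.PermutationId % 2 == 0;
    }
};

// A subclass that keeps the default hook.
struct DefaultShader : ShaderBase {};

// A type record that stores the hook as a plain function pointer.
struct ShaderTypeInfo {
    std::string Name;
    bool (*ShouldCompile)(const PermutationParams&);
};

// Capture whichever hook is visible on TShader (its own, or the inherited default).
template <typename TShader>
ShaderTypeInfo MakeTypeInfo(const std::string& Name) {
    return ShaderTypeInfo{Name, &TShader::ShouldCompilePermutation};
}

// Ask the type which of the first Count permutation ids should be compiled.
std::vector<int> CollectPermutations(const ShaderTypeInfo& Type, int Count) {
    assert(Type.ShouldCompile != nullptr);
    std::vector<int> Result;
    for (int Id = 0; Id < Count; ++Id) {
        PermutationParams P;
        P.PermutationId = Id;
        if (Type.ShouldCompile(P)) {
            Result.push_back(Id);
        }
    }
    return Result;
}
```

Calling CollectPermutations(MakeTypeInfo<EvenOnlyShader>("Even"), 5) keeps only the even ids, while DefaultShader keeps all of them. The design choice in this sketch is that no shader instance is needed to ask the question, which fits a decision that has to be made before anything is compiled.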
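FShader is in fact a base class. Its subclasses include:
FGlobalShader: the global shader. Each of its subclasses has only a single instance in memory. It is commonly used for drawing screen quads, post-processing, and the like. It is defined as follows: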
// Engine\Source\Runtime\RenderCore\Public\GlobalShader.h
class FGlobalShader : public FShader
{
public:
(......)
FGlobalShader() : FShader() {}
FGlobalShader(const ShaderMetaType::CompiledShaderInitializerType& Initializer);
// Set the view shader parameters.
template<typename TViewUniformShaderParameters, typename ShaderRHIParamRef, typename TRHICmdList>
inline void SetParameters(TRHICmdList& RHICmdList, ...);
};
Compared with its parent FShader, it adds SetParameters, an interface for setting the view uniform buffer.
FMaterialShader: the material shader, a shader referenced by the material that an FMaterialShaderType specifies. It is a subset of the shaders produced when a material blueprint is instantiated. It is defined as follows:
// Engine\Source\Runtime\Renderer\Public\MaterialShader.h
class RENDERER_API FMaterialShader : public FShader
{
public:
(......)
FMaterialShader() = default;
FMaterialShader(const FMaterialShaderType::CompiledShaderInitializerType& Initializer);
// Set the view uniform buffer parameters.
template<typename ShaderRHIParamRef>
void SetViewParameters(FRHICommandList& RHICmdList, ...);
// Set pixel shader parameters that are material-specific but not FMeshBatch-specific.
template< typename TRHIShader >
void SetParameters(FRHICommandList& RHICmdList, ...);
// Get the shader parameter bindings.
void GetShaderBindings(const FScene* Scene, ...) const;
private:
// Whether caching of uniform expressions is allowed.
static int32 bAllowCachedUniformExpressions;
// The console variable backing bAllowCachedUniformExpressions.
static FAutoConsoleVariableRef CVarAllowCachedUniformExpressions;
#if !(UE_BUILD_TEST || UE_BUILD_SHIPPING || !WITH_EDITOR)
// Verify that the expressions and the shader maps are valid.
void VerifyExpressionAndShaderMaps(const FMaterialRenderProxy* MaterialRenderProxy, const FMaterial& Material, const FUniformExpressionCache* UniformExpressionCache) const;
#endif
// The allocated parameter collection uniform buffers.
LAYOUT_FIELD(TMemoryImageArray<FShaderUniformBufferParameter>, ParameterCollectionUniformBuffers);
// The material's shader uniform buffer.
LAYOUT_FIELD(FShaderUniformBufferParameter, MaterialUniformBuffer);
(......)
};
Below are some of the subclasses in the FShader inheritance hierarchy:
FShader
    FGlobalShader
        TMeshPaintVertexShader
        TMeshPaintPixelShader
        FDistanceFieldDownsamplingCS
        FBaseGPUSkinCacheCS
        TGPUSkinCacheCS
        FBaseRecomputeTangentsPerTriangleShader
        FBaseRecomputeTangentsPerVertexShader
        FRadixSortUpsweepCS
        FRadixSortDownsweepCS
        FParticleTileVS
        FBuildMipTreeCS
        FScreenVS
        FScreenPS
        FScreenPSInvertAlpha
        FSimpleElementVS
        FSimpleElementPS
        FStereoLayerVS
        FStereoLayerPS_Base
        FStereoLayerPS
        FUpdateTexture2DSubresouceCS
        FUpdateTexture3DSubresouceCS
        FCopyTexture2DCS
        TCopyDataCS
        FLandscapeLayersVS
        FLandscapeLayersHeightmapPS
        FGenerateMipsCS
        FGenerateMipsVS
        FGenerateMipsPS
        FCopyTextureCS
        FMediaShadersVS
        FRGBConvertPS
        FYUVConvertPS
        FYUY2ConvertPS
        FRGB10toYUVv210ConvertPS
        FInvertAlphaPS
        FSetAlphaOnePS
        FReadTextureExternalPS
        FOculusVertexShader
        FRasterizeToRectsVS
        FResolveVS
        FResolveDepthPS
        FResolveDepth2XPS
        FAmbientOcclusionPS
        FGTAOSpatialFilterCS
        FGTAOTemporalFilterCS
        FDeferredDecalVS
        FDitheredTransitionStencilPS
        FObjectCullVS
        FObjectCullPS
        FDeferredLightPS
        TDeferredLightHairVS
        FFXAAVS
        FFXAAPS
        FMotionBlurShader
        FSubsurfaceShader
        FTonemapVS
        FTonemapPS
        FTonemapCS
        FUpscalePS
        FTAAStandaloneCS
        FSceneCapturePS
        FHZBTestPS
        FOcclusionQueryVS
        FOcclusionQueryPS
        FHZBBuildPS
        FHZBBuildCS
        FDownsampleDepthPS
        FTiledDeferredLightingCS
        FShader_VirtualTextureCompress
        FShader_VirtualTextureCopy
        FPageTableUpdateVS
        FPageTableUpdatePS
        FSlateElementVS
        FSlateElementPS
        (......)
    FMaterialShader
        FDeferredDecalPS
        FLightHeightfieldsPS
        FLightFunctionVS
        FLightFunctionPS
        FPostProcessMaterialShader
        TTranslucentLightingInjectPS
        FVolumetricFogLightFunctionPS
        FMeshMaterialShader
            FLightmapGBufferVS
            FLightmapGBufferPS
            FVLMVoxelizationVS
            FVLMVoxelizationGS
            FVLMVoxelizationPS
            FLandscapeGrassWeightVS
            FLandscapeGrassWeightPS
            FLandscapePhysicalMaterial
            FAnisotropyVS
            FAnisotropyPS
            TBasePassVertexShaderPolicyParamType
            TBasePassVertexShaderBaseType
            TBasePassVS
            TBasePassPixelShaderPolicyParamType
            TBasePassPixelShaderBaseType
            TBasePassPS
            FMeshDecalsVS
            FMeshDecalsPS
            TDepthOnlyVS
            TDepthOnlyPS
            FDistortionMeshVS
            FDistortionMeshPS
            FHairMaterialVS
            FHairMaterialPS
            FHairVisibilityVS
            FHairVisibilityPS
            TLightMapDensityVS
            TLightMapDensityPS
            FShadowDepthVS
            FShadowDepthBasePS
            TShadowDepthPS
            FTranslucencyShadowDepthVS
            FTranslucencyShadowDepthPS
            FVelocityVS
            FVelocityPS
            FRenderVolumetricCloudVS
            FVolumetricCloudShadowPS
            FVoxelizeVolumeVS
            FVoxelizeVolumePS
            FShader_VirtualTextureMaterialDraw
            (......)
        FSlateMaterialShaderVS
        FSlateMaterialShaderPS
        (......)
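The list above is only part of the FShader hierarchy. It includes some shader types analyzed in earlier chapters, such as FDeferredLightPS, FFXAAPS, FTonemapPS, FUpscalePS, TBasePassPS, and TDepthOnlyPS.
FGlobalShader covers shader code for post-processing, lighting, utilities, visualization, landscape, virtual textures, and so on. Its subclasses can be VS, PS, or CS, and every CS is a subclass of FGlobalShader. FMaterialShader mainly covers shader code for meshes, dedicated passes, voxelization, and so on. Its subclasses can be VS, PS, GS, and so on, but never CS.
A newly defined FShader subclass needs the following macros to declare and implement the corresponding code (only some common macros are listed):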
// ------ Shader declaration and implementation macros ------
// Declare a shader of the given type (an FShader subclass); it can be Global, Material, MeshMaterial, ...
#define DECLARE_SHADER_TYPE(ShaderClass,ShaderMetaTypeShortcut,...)
// Implement a shader of the given type; it can be Global, Material, MeshMaterial, ...
#define IMPLEMENT_SHADER_TYPE(TemplatePrefix,ShaderClass,SourceFilename,FunctionName,Frequency)
// Declare FGlobalShader and its subclasses.
#define DECLARE_GLOBAL_SHADER(ShaderClass)
// Implement FGlobalShader and its subclasses.
#define IMPLEMENT_GLOBAL_SHADER(ShaderClass,SourceFilename,FunctionName,Frequency)
// Implement a material shader.
#define IMPLEMENT_MATERIAL_SHADER_TYPE(TemplatePrefix,ShaderClass,SourceFilename,FunctionName,Frequency)
// Other, less common macros
(......)
// ------ Example 1 ------
class FDeferredLightPS : public FGlobalShader
{
// Declare the global shader inside the FDeferredLightPS class
DECLARE_SHADER_TYPE(FDeferredLightPS, Global)
(......)
};
// Implement the FDeferredLightPS shader, associating it with its source file, entry point, and shader frequency.
IMPLEMENT_GLOBAL_SHADER(FDeferredLightPS, "/Engine/Private/DeferredLightPixelShaders.usf", "DeferredLightPixelMain", SF_Pixel);
// ------ Example 2 ------
class FDeferredDecalPS : public FMaterialShader
{
// Declare the material shader inside the class
DECLARE_SHADER_TYPE(FDeferredDecalPS,Material);
(......)
};
// Implement the FDeferredDecalPS class, associating it with its source file, entry point, and shader frequency.
IMPLEMENT_MATERIAL_SHADER_TYPE(,FDeferredDecalPS,TEXT("/Engine/Private/DeferredDecal.usf"),TEXT("MainPS"),SF_Pixel);
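The declare/implement macro pair above ties a C++ class to a source file, an entry point, and a shader frequency. One common way to achieve that kind of association is a self-registering static object. The sketch below illustrates the idea under that assumption; it is not how the engine macros are actually written. Frequency, ShaderTypeRecord, GetShaderTypeRegistry, ShaderTypeRegistrar, DECLARE_TOY_SHADER, IMPLEMENT_TOY_SHADER, FToyLightPS, FindShaderType, and the "/Toy/Light.usf" path are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical shader frequency enum for this sketch.
enum class Frequency { Vertex, Pixel, Compute };

// What the registry remembers about each shader class.
struct ShaderTypeRecord {
    std::string SourceFile;
    std::string EntryPoint;
    Frequency Freq;
};

// A function-local static avoids static-initialization-order problems.
inline std::map<std::string, ShaderTypeRecord>& GetShaderTypeRegistry() {
    static std::map<std::string, ShaderTypeRecord> Registry;
    return Registry;
}

// Constructing a registrar adds one entry to the registry.
struct ShaderTypeRegistrar {
    ShaderTypeRegistrar(const char* ClassName, const char* SourceFile,
                        const char* EntryPoint, Frequency Freq) {
        GetShaderTypeRegistry()[ClassName] = ShaderTypeRecord{SourceFile, EntryPoint, Freq};
    }
};

// "Declare": give the class a static registrar member.
#define DECLARE_TOY_SHADER(ShaderClass) \
    public: static const ShaderTypeRegistrar StaticType;

// "Implement": define that member, which registers the class before main() runs.
#define IMPLEMENT_TOY_SHADER(ShaderClass, SourceFile, EntryPoint, Freq) \
    const ShaderTypeRegistrar ShaderClass::StaticType(#ShaderClass, SourceFile, EntryPoint, Freq)

class FToyLightPS {
    DECLARE_TOY_SHADER(FToyLightPS)
};
IMPLEMENT_TOY_SHADER(FToyLightPS, "/Toy/Light.usf", "MainPS", Frequency::Pixel);

// Look a shader class up by name; returns nullptr when it was never registered.
const ShaderTypeRecord* FindShaderType(const std::string& ClassName) {
    assert(!ClassName.empty());
    const auto& Registry = GetShaderTypeRegistry();
    const auto It = Registry.find(ClassName);
    return It == Registry.end() ? nullptr : &It->second;
}
```

After static initialization, FindShaderType("FToyLightPS") returns the record built by the implement macro, so code that enumerates shader types never needs to name the classes explicitly.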
8.2.2 Shader Parameter
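Shader parameters are a set of data passed from the CPU-side C++ layer into GPU shaders and stored in GPU registers or video memory. Common shader parameter types are defined as follows: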
// Engine\Source\Runtime\RenderCore\Public\ShaderParameters.h
// A register-bound shader parameter; its type can be float1/2/3/4, an array, a UAV, and so on.
class FShaderParameter
{
(......)
public:
// Bind the parameter with the given name.
void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* ParameterName, EShaderParameterFlags Flags = SPF_Optional);
// Whether it has been bound by the shader.
bool IsBound() const;
// Whether it has been initialized.
inline bool IsInitialized() const;
// Data accessors.
uint32 GetBufferIndex() const;
uint32 GetBaseIndex() const;
uint32 GetNumBytes() const;
(......)
};
// Shader resource binding (texture or sampler)
class FShaderResourceParameter
{
(......)
public:
// Bind the parameter with the given name.
void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* ParameterName,EShaderParameterFlags Flags = SPF_Optional);
bool IsBound() const;
inline bool IsInitialized() const;
uint32 GetBaseIndex() const;
uint32 GetNumResources() const;
(......)
};
// A type that binds a UAV or SRV resource.
class FRWShaderParameter
{
(......)
public:
// Bind the parameter with the given name.
void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* BaseName);
bool IsBound() const;
bool IsUAVBound() const;
uint32 GetUAVIndex() const;
// Set buffer data to the RHI.
template<typename TShaderRHIRef, typename TRHICmdList>
inline void SetBuffer(TRHICmdList& RHICmdList, const TShaderRHIRef& Shader, const FRWBuffer& RWBuffer) const;
template<typename TShaderRHIRef, typename TRHICmdList>
inline void SetBuffer(TRHICmdList& RHICmdList, const TShaderRHIRef& Shader, const FRWBufferStructured& RWBuffer) const;
// Set texture data to the RHI.
template<typename TShaderRHIRef, typename TRHICmdList>
inline void SetTexture(TRHICmdList& RHICmdList, const TShaderRHIRef& Shader, FRHITexture* Texture, FRHIUnorderedAccessView* UAV) const;
// Unset the UAV from the RHI.
template<typename TRHICmdList>
inline void UnsetUAV(TRHICmdList& RHICmdList, FRHIComputeShader* ComputeShader) const;
(......)
};
// Create the shader code declaration of a uniform buffer struct for the given platform.
extern void CreateUniformBufferShaderDeclaration(const TCHAR* Name,const FShaderParametersMetadata& UniformBufferStruct, EShaderPlatform Platform, FString& OutDeclaration);
// Shader uniform buffer parameter.
class FShaderUniformBufferParameter
{
(......)
public:
// Modify the compilation environment.
static void ModifyCompilationEnvironment(const TCHAR* ParameterName,const FShaderParametersMetadata& Struct,EShaderPlatform Platform,FShaderCompilerEnvironment& OutEnvironment);
// Bind the shader parameter.
void Bind(const FShaderParameterMap& ParameterMap,const TCHAR* ParameterName,EShaderParameterFlags Flags = SPF_Optional);
bool IsBound() const;
inline bool IsInitialized() const;
uint32 GetBaseIndex() const;
(......)
};
// Shader uniform buffer parameter for a specific struct
template<typename TBufferStruct>
class TShaderUniformBufferParameter : public FShaderUniformBufferParameter
{
public:
static void ModifyCompilationEnvironment(const TCHAR* ParameterName,EShaderPlatform Platform, FShaderCompilerEnvironment& OutEnvironment);
(......)
};
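As shown above, shader parameters can bind any kind of GPU resource or data, but each class can only bind particular kinds of shader parameters, and they cannot be mixed. For example, FRWShaderParameter can only bind a UAV or SRV. With these types in place, concrete shader parameters can be declared in a C++ shader class using the LAYOUT_FIELD family of macros.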
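The Bind and IsBound interface that these parameter classes share can be pictured as a name lookup against a table produced by the shader compiler. The following is a simplified standalone model of that idea, not the engine implementation: ParameterAllocation, ToyParameterMap, BindFlags, and ToyShaderParameter are hypothetical, and throwing an exception for a missing mandatory parameter is a choice made only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Where the compiler placed a parameter (hypothetical layout for this sketch).
struct ParameterAllocation {
    std::uint16_t BufferIndex;
    std::uint16_t BaseIndex;
    std::uint16_t NumBytes;
};

// Name -> allocation table, standing in for the compiler's parameter map.
class ToyParameterMap {
public:
    void Add(const std::string& Name, ParameterAllocation Alloc) { Allocations[Name] = Alloc; }

    bool Find(const std::string& Name, ParameterAllocation& Out) const {
        const auto It = Allocations.find(Name);
        if (It == Allocations.end()) {
            return false;
        }
        Out = It->second;
        return true;
    }

private:
    std::map<std::string, ParameterAllocation> Allocations;
};

enum class BindFlags { Optional, Mandatory };

// A C++-side handle that resolves its location once, by name.
class ToyShaderParameter {
public:
    void Bind(const ToyParameterMap& Map, const std::string& Name,
              BindFlags Flags = BindFlags::Optional) {
        ParameterAllocation Alloc{};
        bBound = Map.Find(Name, Alloc);
        if (bBound) {
            Allocation = Alloc;
        } else if (Flags == BindFlags::Mandatory) {
            throw std::runtime_error("missing mandatory shader parameter: " + Name);
        }
    }

    bool IsBound() const { return bBound; }

    std::uint16_t GetBaseIndex() const {
        assert(bBound);
        return Allocation.BaseIndex;
    }

    std::uint16_t GetNumBytes() const {
        assert(bBound);
        return Allocation.NumBytes;
    }

private:
    ParameterAllocation Allocation{};
    bool bBound = false;
};
```

An optional parameter that the compiler stripped simply stays unbound, so calling code can skip uploading it, whereas a mandatory one fails loudly at bind time.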
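LAYOUT_FIELD and its related macros declare a shader parameter's type and name, and optionally an initial value, a bit field, a writer function, and so on. The related definitions are as follows: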
// Engine\Source\Runtime\Core\Public\Serialization\MemoryLayout.h
// Plain field
#define LAYOUT_FIELD(T, Name, ...)
// With an initial value
#define LAYOUT_FIELD_INITIALIZED(T, Name, Value, ...)
// Mutable, with an initial value
#define LAYOUT_MUTABLE_FIELD_INITIALIZED(T, Name, Value, ...)
// Array
#define LAYOUT_ARRAY(T, Name, NumArray, ...)
// Bit field
#define LAYOUT_MUTABLE_BITFIELD(T, Name, BitFieldSize, ...)
#define LAYOUT_BITFIELD(T, Name, BitFieldSize, ...)
// With a writer function
#define LAYOUT_FIELD_WITH_WRITER(T, Name, Func)
#define LAYOUT_MUTABLE_FIELD_WITH_WRITER(T, Name, Func)
#define LAYOUT_WRITE_MEMORY_IMAGE(Func)
#define LAYOUT_TOSTRING(Func)
With LAYOUT_FIELD and the related macros, shader parameters of a given type can be declared in a C++ class. For example:
struct FMyExampleParam
{
// Declare a non-virtual type.
DECLARE_TYPE_LAYOUT(FMyExampleParam, NonVirtual);
// Ordinary fields
LAYOUT_FIELD(FShaderParameter, ShaderParam); // Equivalent to: FShaderParameter ShaderParam;
LAYOUT_FIELD(FShaderResourceParameter, TextureParam); // Equivalent to: FShaderResourceParameter TextureParam;
LAYOUT_FIELD(FRWShaderParameter, OutputUAV); // Equivalent to: FRWShaderParameter OutputUAV;
// Arrays; the third argument is the maximum count.
LAYOUT_ARRAY(FShaderResourceParameter, TextureArray, 5); // Equivalent to: FShaderResourceParameter TextureArray[5];
LAYOUT_ARRAY(int32, Ids, 64); // Equivalent to: int32 Ids[64];
// With an initial value.
LAYOUT_FIELD_INITIALIZED(uint32, Size, 0); // Equivalent to: uint32 Size = 0;
void WriteDataFunc(FMemoryImageWriter& Writer, const TMemoryImagePtr<FOtherExampleParam>& InParameters) const;
// With a writer function.
LAYOUT_FIELD_WITH_WRITER(TMemoryImagePtr<FOtherExampleParam>, Parameters, WriteDataFunc);
};
8.2.3 Uniform Buffer
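UE's Uniform Buffer involves several core concepts. At the lowest level is FRHIUniformBuffer in the RHI layer, which wraps the uniform buffer (also called a Constant Buffer) of the various graphics APIs. It is defined as follows (with the implementation and debug code removed):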
// Engine\Source\Runtime\RHI\Public\RHIResources.h
class FRHIUniformBuffer : public FRHIResource
{
public:
// Constructor.
FRHIUniformBuffer(const FRHIUniformBufferLayout& InLayout);
// Reference counting.
uint32 AddRef() const;
uint32 Release() const;
// Data accessors.
uint32 GetSize() const;
const FRHIUniformBufferLayout& GetLayout() const;
bool IsGlobal() const;
private:
// The layout of the RHI uniform buffer.
const FRHIUniformBufferLayout* Layout;
// Buffer size.
uint32 LayoutConstantBufferSize;
};
One layer up is TUniformBufferRef, which references the FRHIUniformBuffer above:
// Engine\Source\Runtime\RHI\Public\RHIResources.h
// Define the reference type for FRHIUniformBuffer.
typedef TRefCountPtr<FRHIUniformBuffer> FUniformBufferRHIRef;
// Engine\Source\Runtime\RenderCore\Public\ShaderParameterMacros.h
// A reference to an FRHIUniformBuffer instance of a specific struct type. Note that it inherits from FUniformBufferRHIRef.
template<typename TBufferStruct>
class TUniformBufferRef : public FUniformBufferRHIRef
{
public:
TUniformBufferRef();
// Create a uniform buffer from the given value and return a reference to it. (template)
static TUniformBufferRef<TBufferStruct> CreateUniformBufferImmediate(const TBufferStruct& Value, EUniformBufferUsage Usage, EUniformBufferValidation Validation = EUniformBufferValidation::ValidateResources);
// Create a [local] uniform buffer from the given value and return it.
static FLocalUniformBuffer CreateLocalUniformBuffer(FRHICommandList& RHICmdList, const TBufferStruct& Value, EUniformBufferUsage Usage);
// Immediately flush the buffer data to the RHI.
void UpdateUniformBufferImmediate(const TBufferStruct& Value);
private:
// Private constructor; only TUniformBuffer and TRDGUniformBuffer can create instances through it.
TUniformBufferRef(FRHIUniformBuffer* InRHIRef);
template<typename TBufferStruct2>
friend class TUniformBuffer;
friend class TRDGUniformBuffer<TBufferStruct>;
};
One more layer up are TUniformBuffer and TRDGUniformBuffer, which reference FUniformBufferRHIRef. They are defined as follows:
// Engine\Source\Runtime\RenderCore\Public\UniformBuffer.h
// A resource that references a uniform buffer.
template<typename TBufferStruct>
class TUniformBuffer : public FRenderResource
{
public:
// Constructor.
TUniformBuffer()
: BufferUsage(UniformBuffer_MultiFrame)
, Contents(nullptr){}
// Destructor.
~TUniformBuffer()
{
if (Contents)
{
FMemory::Free(Contents);
}
}
// Set the contents of the uniform buffer and update the RHI resource.
void SetContents(const TBufferStruct& NewContents)
{
SetContentsNoUpdate(NewContents);
UpdateRHI();
}
// Zero the contents of the uniform buffer (allocating them first if they do not exist yet).
void SetContentsToZero()
{
if (!Contents)
{
Contents = (uint8*)FMemory::Malloc(sizeof(TBufferStruct), SHADER_PARAMETER_STRUCT_ALIGNMENT);
}
FMemory::Memzero(Contents, sizeof(TBufferStruct));
UpdateRHI();
}
// Get the contents.
const uint8* GetContents() const
{
return Contents;
}
// ---- Overrides of the FRenderResource interface ----
// Initialize the dynamic RHI resource.
virtual void InitDynamicRHI() override
{
check(IsInRenderingThread());
UniformBufferRHI.SafeRelease();
if (Contents)
{
// Create the RHI resource from the raw bytes of the contents.
UniformBufferRHI = CreateUniformBufferImmediate<TBufferStruct>(*((const TBufferStruct*)Contents), BufferUsage);
}
}
// Release the dynamic RHI resource.
virtual void ReleaseDynamicRHI() override
{
UniformBufferRHI.SafeRelease();
}
// Data accessors.
FRHIUniformBuffer* GetUniformBufferRHI() const
{
return UniformBufferRHI;
}
const TUniformBufferRef<TBufferStruct>& GetUniformBufferRef() const
{
return UniformBufferRHI;
}
// Buffer usage flag.
EUniformBufferUsage BufferUsage;
protected:
// Set the contents of the uniform buffer without updating the RHI resource.
void SetContentsNoUpdate(const TBufferStruct& NewContents)
{
if (!Contents)
{
Contents = (uint8*)FMemory::Malloc(sizeof(TBufferStruct), SHADER_PARAMETER_STRUCT_ALIGNMENT);
}
FMemory::Memcpy(Contents,&NewContents,sizeof(TBufferStruct));
}
private:
// The TUniformBufferRef reference.
TUniformBufferRef<TBufferStruct> UniformBufferRHI;
// The CPU-side contents.
uint8* Contents;
};
// Engine\Source\Runtime\RenderCore\Public\RenderGraphResources.h
class FRDGUniformBuffer : public FRDGResource
{
public:
bool IsGlobal() const;
const FRDGParameterStruct& GetParameters() const;
//////////////////////////////////////////////////////////////////////////
// Get the RHI resource; may only be called while a pass is executing.
FRHIUniformBuffer* GetRHI() const
{
return static_cast<FRHIUniformBuffer*>(FRDGResource::GetRHI());
}
//////////////////////////////////////////////////////////////////////////
protected:
// Constructor.
template <typename TParameterStruct>
explicit FRDGUniformBuffer(TParameterStruct* InParameters, const TCHAR* InName)
: FRDGResource(InName)
, ParameterStruct(InParameters)
, bGlobal(ParameterStruct.HasStaticSlot())
{}
private:
const FRDGParameterStruct ParameterStruct;
// The resource that references the FRHIUniformBuffer.
// Note that TUniformBufferRef<TBufferStruct> and FUniformBufferRHIRef are equivalent here.
TRefCountPtr<FRHIUniformBuffer> UniformBufferRHI;
FRDGUniformBufferHandle Handle;
// Whether the buffer is bound globally (it has a static slot) rather than by a local shader.
uint8 bGlobal : 1;
friend FRDGBuilder;
friend FRDGUniformBufferRegistry;
friend FRDGAllocator;
};
// The templated version of FRDGUniformBuffer.
template <typename ParameterStructType>
class TRDGUniformBuffer : public FRDGUniformBuffer
{
public:
// Data accessors.
const TRDGParameterStruct<ParameterStructType>& GetParameters() const;
TUniformBufferRef<ParameterStructType> GetRHIRef() const;
const ParameterStructType* operator->() const;
private:
explicit TRDGUniformBuffer(ParameterStructType* InParameters, const TCHAR* InName)
: FRDGUniformBuffer(InParameters, InName)
{}
friend FRDGBuilder;
friend FRDGUniformBufferRegistry;
friend FRDGAllocator;
};
Abstracted into a UML inheritance diagram, they look like this:
FRHIResource <|-- FRHIUniformBuffer
FUniformBufferRHIRef <|-- TUniformBufferRef
FRHIUniformBuffer <-- FUniformBufferRHIRef
class FRHIResource{
}
class FRHIUniformBuffer{
FRHIUniformBufferLayout* Layout
uint32 LayoutConstantBufferSize
}
class FUniformBufferRHIRef{
FRHIUniformBuffer* Reference
}
class TUniformBufferRef{
TUniformBufferRef(FRHIUniformBuffer* InRHIRef)
CreateUniformBufferImmediate()
CreateLocalUniformBuffer()
UpdateUniformBufferImmediate()
}
FRenderResource <|-- TUniformBuffer
TUniformBufferRef <-- TUniformBuffer
class FRenderResource{
}
class TUniformBuffer{
SetContents()
GetUniformBufferRHI()
GetUniformBufferRef()
uint8* Contents
EUniformBufferUsage BufferUsage
TUniformBufferRef<TBufferStruct> UniformBufferRHI
}
FRDGUniformBuffer <|-- TRDGUniformBuffer
FUniformBufferRHIRef <-- FRDGUniformBuffer
class FRDGUniformBuffer{
FUniformBufferRHIRef UniformBufferRHI
FRDGUniformBufferHandle Handle
}
class TRDGUniformBuffer{
GetRHIRef()
}
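A small gripe: the text-based diagram syntax Mermaid cannot specify a layout, the automatically generated layout is not very attractive, and after enlarging the UI on Windows the text is no longer fully displayed. Please bear with it.
The Uniform Buffer types above can have their structs and struct members defined through the SHADER_PARAMETER family of macros, which are defined as follows: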
// Engine\Source\Runtime\RenderCore\Public\ShaderParameterMacros.h
// Shader Parameter Struct: begin/end.
#define BEGIN_SHADER_PARAMETER_STRUCT(StructTypeName, PrefixKeywords)
#define END_SHADER_PARAMETER_STRUCT()
// Uniform Buffer Struct: begin/end/implement.
#define BEGIN_UNIFORM_BUFFER_STRUCT(StructTypeName, PrefixKeywords)
#define BEGIN_UNIFORM_BUFFER_STRUCT_WITH_CONSTRUCTOR(StructTypeName, PrefixKeywords)
#define END_UNIFORM_BUFFER_STRUCT()
#define IMPLEMENT_UNIFORM_BUFFER_STRUCT(StructTypeName,ShaderVariableName)
#define IMPLEMENT_UNIFORM_BUFFER_ALIAS_STRUCT(StructTypeName, UniformBufferAlias)
#define IMPLEMENT_STATIC_UNIFORM_BUFFER_STRUCT(StructTypeName,ShaderVariableName,StaticSlotName)
#define IMPLEMENT_STATIC_UNIFORM_BUFFER_SLOT(SlotName)
// Global Shader Parameter Struct: begin/end/implement.
#define BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT
#define BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT_WITH_CONSTRUCTOR
#define END_GLOBAL_SHADER_PARAMETER_STRUCT
#define IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT
#define IMPLEMENT_GLOBAL_SHADER_PARAMETER_ALIAS_STRUCT
// Shader Parameter: single value, array.
#define SHADER_PARAMETER(MemberType, MemberName)
#define SHADER_PARAMETER_EX(MemberType,MemberName,Precision)
#define SHADER_PARAMETER_ARRAY(MemberType,MemberName,ArrayDecl)
#define SHADER_PARAMETER_ARRAY_EX(MemberType,MemberName,ArrayDecl,Precision)
// Shader Parameter: texture, SRV, UAV, sampler, and their arrays
#define SHADER_PARAMETER_TEXTURE(ShaderType,MemberName)
#define SHADER_PARAMETER_TEXTURE_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_SRV(ShaderType,MemberName)
#define SHADER_PARAMETER_UAV(ShaderType,MemberName)
#define SHADER_PARAMETER_SAMPLER(ShaderType,MemberName)
#define SHADER_PARAMETER_SAMPLER_ARRAY(ShaderType,MemberName, ArrayDecl)
// A Shader Parameter Struct nested inside a Shader Parameter Struct.
#define SHADER_PARAMETER_STRUCT(StructType,MemberName)
#define SHADER_PARAMETER_STRUCT_ARRAY(StructType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_STRUCT_INCLUDE(StructType,MemberName)
// Reference a [global] shader parameter struct.
#define SHADER_PARAMETER_STRUCT_REF(StructType,MemberName)
// Shader parameters in RDG mode.
#define SHADER_PARAMETER_RDG_TEXTURE(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_TEXTURE_SRV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_TEXTURE_UAV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_TEXTURE_UAV_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_RDG_BUFFER(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_BUFFER_SRV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_BUFFER_SRV_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_RDG_BUFFER_UAV(ShaderType,MemberName)
#define SHADER_PARAMETER_RDG_BUFFER_UAV_ARRAY(ShaderType,MemberName, ArrayDecl)
#define SHADER_PARAMETER_RDG_UNIFORM_BUFFER(StructType, MemberName)
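Note that a local (ordinary) Shader Parameter Struct has no implementation macro (there is no IMPLEMENT_SHADER_PARAMETER_STRUCT); only the global one does (IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT).
The following example shows how to use some of the macros above to declare the various parameters of a shader: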
// Define a global shader parameter struct (in a .h or a .cpp, but usually in a .h)
BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT(FMyShaderParameterStruct, )
// Ordinary single and array parameters.
SHADER_PARAMETER(float, Intensity)
SHADER_PARAMETER_ARRAY(FVector3, Vertexes, [8])
// Sampler, texture, SRV, UAV
SHADER_PARAMETER_SAMPLER(SamplerState, TextureSampler)
SHADER_PARAMETER_TEXTURE(Texture3D, Texture3d)
SHADER_PARAMETER_SRV(Buffer<float4>, VertexColorBuffer)
SHADER_PARAMETER_UAV(RWStructuredBuffer<float4>, OutputTexture)
// Shader parameter structs
// Reference a shader parameter struct (global ones only)
SHADER_PARAMETER_STRUCT_REF(FViewUniformShaderParameters, View)
// Include a shader parameter struct (local or global)
SHADER_PARAMETER_STRUCT_INCLUDE(FSceneTextureShaderParameters, SceneTextures)
END_GLOBAL_SHADER_PARAMETER_STRUCT()
// Implement the global shader parameter struct (only in a .cpp)
IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT(FMyShaderParameterStruct, "MyShaderParameterStruct");
The shader parameter struct above is declared and implemented on the C++ side. Passing it correctly into a shader requires some additional C++ code:
// Declare the struct.
FMyShaderParameterStruct MyShaderParameterStruct;
// Create the RHI resource.
// It can be multi-frame (UniformBuffer_MultiFrame): create it once, cache the pointer, and call UpdateUniformBufferImmediate whenever the data changes.
// It can also be single-frame (UniformBuffer_SingleFrame), in which case it must be created and filled every frame.
auto MyShaderParameterStructRHI = TUniformBufferRef<FMyShaderParameterStruct>::CreateUniformBufferImmediate(MyShaderParameterStruct, EUniformBufferUsage::UniformBuffer_MultiFrame);
// Update the shader parameter struct.
MyShaderParameterStruct.Intensity = 1.0f;
(......)
// Upload the updated data to the RHI.
MyShaderParameterStructRHI.UpdateUniformBufferImmediate(MyShaderParameterStruct);
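The create-once, update-many flow of a multi-frame uniform buffer can be modeled outside the engine with a shared byte buffer. The sketch below is only an illustration under stated assumptions: std::shared_ptr stands in for the engine's reference-counted pointer, a std::vector of bytes stands in for GPU memory, and ToyGpuBuffer, ToyUniformBufferRef, CreateImmediate, UpdateImmediate, Read, UseCount, and ToyParams are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <type_traits>
#include <vector>

// Stand-in for GPU-side storage: raw bytes plus an update counter.
struct ToyGpuBuffer {
    std::vector<unsigned char> Bytes;
    int UpdateCount = 0;
};

// A typed, reference-counted handle to a ToyGpuBuffer.
template <typename TStruct>
class ToyUniformBufferRef {
    static_assert(std::is_trivially_copyable<TStruct>::value,
                  "the parameter struct is copied as raw bytes");

public:
    ToyUniformBufferRef() = default;

    // Allocate the buffer and upload the initial value.
    static ToyUniformBufferRef CreateImmediate(const TStruct& Value) {
        ToyUniformBufferRef Ref;
        Ref.Buffer = std::make_shared<ToyGpuBuffer>();
        Ref.Buffer->Bytes.resize(sizeof(TStruct));
        std::memcpy(Ref.Buffer->Bytes.data(), &Value, sizeof(TStruct));
        return Ref;
    }

    // Overwrite the existing buffer in place; every handle sees the new data.
    void UpdateImmediate(const TStruct& Value) {
        assert(Buffer != nullptr);
        std::memcpy(Buffer->Bytes.data(), &Value, sizeof(TStruct));
        ++Buffer->UpdateCount;
    }

    // Read the current contents back as a typed struct.
    TStruct Read() const {
        assert(Buffer != nullptr);
        TStruct Out;
        std::memcpy(&Out, Buffer->Bytes.data(), sizeof(TStruct));
        return Out;
    }

    bool IsValid() const { return Buffer != nullptr; }
    long UseCount() const { return Buffer.use_count(); }

private:
    std::shared_ptr<ToyGpuBuffer> Buffer;
};

// Hypothetical parameter struct for this sketch.
struct ToyParams {
    float Intensity;
    int LightCount;
};
```

Because copies of the handle share one buffer, a holder that cached the reference earlier observes later updates without being handed a new pointer, which is the property the multi-frame usage relies on.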
8.2.4 Vertex Factory
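As we know, the engine has many kinds of meshes, such as static meshes, skinned skeletal meshes, procedural meshes, and landscape, and materials support all of these mesh types through the vertex factory, FVertexFactory. In practice, a vertex factory involves many kinds of data and types, including but not limited to:
- The vertex shader. The vertex shader's inputs and outputs need the vertex factory to describe the data layout.
- Vertex factory parameters and RHI resources. This data is passed from the C++ layer into the vertex shader for processing.
- Vertex buffers and vertex layouts. Through the vertex layout we can customize and extend the vertex buffer inputs, and so implement customized shader code.
- Geometry preprocessing. Vertex buffers, mesh resources, material parameters, and so on can all be preprocessed before actual rendering.
Figure: the place of the vertex factory in the rendering hierarchy. As the figure shows, the vertex factory is a render-thread object that spans both the CPU and the GPU.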
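To make the idea of a vertex layout concrete, the sketch below describes an interleaved vertex buffer with a small table of elements (stream index, byte offset, type, attribute index, stride) and uses that table to locate one attribute of one vertex. It is a standalone illustration rather than engine code: ToyVertex, ElementType, ToyVertexElement, MakeToyLayout, PackVertices, and FetchFloat are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// An interleaved vertex: position followed by one UV set.
struct ToyVertex {
    float Position[3];
    float UV[2];
};

enum class ElementType { Float2, Float3 };

// One entry of the layout: where an attribute lives inside each vertex.
struct ToyVertexElement {
    std::uint8_t StreamIndex;     // which vertex buffer the attribute comes from
    std::uint8_t Offset;          // byte offset inside one vertex
    ElementType Type;             // data type of the attribute
    std::uint8_t AttributeIndex;  // input slot seen by the vertex shader
    std::uint16_t Stride;         // bytes from one vertex to the next
};

// Describe ToyVertex: attribute 0 is the position, attribute 1 is the UV.
std::vector<ToyVertexElement> MakeToyLayout() {
    return {
        {0, static_cast<std::uint8_t>(offsetof(ToyVertex, Position)), ElementType::Float3, 0,
         static_cast<std::uint16_t>(sizeof(ToyVertex))},
        {0, static_cast<std::uint8_t>(offsetof(ToyVertex, UV)), ElementType::Float2, 1,
         static_cast<std::uint16_t>(sizeof(ToyVertex))},
    };
}

// Copy the vertices into a raw byte stream, as a vertex buffer would hold them.
std::vector<unsigned char> PackVertices(const std::vector<ToyVertex>& Vertices) {
    std::vector<unsigned char> Bytes(Vertices.size() * sizeof(ToyVertex));
    if (!Bytes.empty()) {
        std::memcpy(Bytes.data(), Vertices.data(), Bytes.size());
    }
    return Bytes;
}

// Read one float component of one attribute of one vertex, using only the layout.
float FetchFloat(const std::vector<unsigned char>& Stream, const ToyVertexElement& Element,
                 std::size_t VertexIndex, std::size_t Component) {
    const std::size_t ByteOffset =
        VertexIndex * Element.Stride + Element.Offset + Component * sizeof(float);
    assert(ByteOffset + sizeof(float) <= Stream.size());
    float Value;
    std::memcpy(&Value, Stream.data() + ByteOffset, sizeof(float));
    return Value;
}
```

Adding a new attribute to the vertex only requires a new field in the struct and one more layout entry; the fetch logic stays unchanged, which is the kind of extensibility the layout description provides.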
FVertexFactory encapsulates the vertex data resources that can be linked to a vertex shader. It and its related types are defined as follows:
// Engine\Source\Runtime\RHI\Public\RHI.h
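// Vertex element.
struct FVertexElement
{
uint8 StreamIndex; // Stream index
uint8 Offset; // Offset
TEnumAsByte<EVertexElementType> Type; // Type
uint8 AttributeIndex;// Attribute index
uint16 Stride; // Stride
// Whether the element is consumed by instance index or by vertex index. If 0, the element is repeated for every instance.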
uint16 bUseInstanceIndex;
FVertexElement();
FVertexElement(uint8 InStreamIndex, ...);
void operator=(const FVertexElement& Other);
friend FArchive& operator<<(FArchive& Ar,FVertexElement& Element);
FString ToString() const;
void FromString(const FString& Src);
void FromString(const FStringView& Src);
};
// The type of a vertex declaration element list.
typedef TArray<FVertexElement,TFixedAllocator<MaxVertexElementCount> > FVertexDeclarationElementList;
// Engine\Source\Runtime\RHI\Public\RHIResources.h
// The RHI resource of a vertex declaration
class FRHIVertexDeclaration : public FRHIResource
{
public:
virtual bool GetInitializer(FVertexDeclarationElementList& Init) { return false; }
};
// Vertex buffer
class FRHIVertexBuffer : public FRHIResource
{
public:
FRHIVertexBuffer(uint32 InSize,uint32 InUsage);
uint32 GetSize() const;
uint32 GetUsage() const;
protected:
FRHIVertexBuffer();
void Swap(FRHIVertexBuffer& Other);
void ReleaseUnderlyingResource();
private:
// Size.
uint32 Size;
// Buffer usage flags, such as BUF_UnorderedAccess
uint32 Usage;
};
// Engine\Source\Runtime\RenderCore\Public\VertexFactory.h
// Vertex input stream.
struct FVertexInputStream
{
// Vertex stream index
uint32 StreamIndex : 4;
// Offset into the VertexBuffer.
uint32 Offset : 28;
// Vertex buffer
FRHIVertexBuffer* VertexBuffer;
FVertexInputStream();
FVertexInputStream(uint32 InStreamIndex, uint32 InOffset, FRHIVertexBuffer* InVertexBuffer);
inline bool operator==(const FVertexInputStream& rhs) const;
inline bool operator!=(const FVertexInputStream& rhs) const;
};
// Array of vertex input streams.
typedef TArray<FVertexInputStream, TInlineAllocator<4>> FVertexInputStreamArray;
// Vertex stream usage flags
enum class EVertexStreamUsage : uint8
{
Default = 0 << 0, // Default
Instancing = 1 << 0, // Instanced
Overridden = 1 << 1, // Overridden
ManualFetch = 1 << 2 // Manual fetch
};
// Vertex input stream type.
enum class EVertexInputStreamType : uint8
{
Default = 0, // Default
PositionOnly, // Position only
PositionAndNormalOnly // Position and normal only
};
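// Vertex stream component.
struct FVertexStreamComponent
{
// The vertex buffer holding the stream data; if null, no data is read from this vertex stream.
const FVertexBuffer* VertexBuffer = nullptr;
// Offset into the vertex buffer.
uint32 StreamOffset = 0;
// Offset of the data, relative to the start of each element in the vertex buffer.
uint8 Offset = 0;
// Stride of the data.
uint8 Stride = 0;
// The type of data read from the stream.
TEnumAsByte<EVertexElementType> Type = VET_None;
// Vertex stream usage flags.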
EVertexStreamUsage VertexStreamUsage = EVertexStreamUsage::Default;
(......)
};
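// The parameter binding interface of the vertex factory used by shaders.
class FVertexFactoryShaderParameters
{
public:
// Bind parameters to the ParameterMap. The actual logic is provided by subclasses.
void Bind(const class FShaderParameterMap& ParameterMap) {}
// Get the vertex factory's shader bindings and vertex streams. The actual logic is provided by subclasses.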
void GetElementShaderBindings(
const class FSceneInterface* Scene,
const class FSceneView* View,
const class FMeshMaterialShader* Shader,
const EVertexInputStreamType InputStreamType,
ERHIFeatureLevel::Type FeatureLevel,
const class FVertexFactory* VertexFactory,
const struct FMeshBatchElement& BatchElement,
class FMeshDrawSingleShaderBindings& ShaderBindings,
FVertexInputStreamArray& VertexStreams) const {}
(......)
};
// The class that represents a vertex factory type.
class FVertexFactoryType
{
public:
// Type definitions
typedef FVertexFactoryShaderParameters* (*ConstructParametersType)(EShaderFrequency ShaderFrequency, const class FShaderParameterMap& ParameterMap);
typedef const FTypeLayoutDesc* (*GetParameterTypeLayoutType)(EShaderFrequency ShaderFrequency);
(......)
// Get the number of vertex factory types.
static int32 GetNumVertexFactoryTypes();
// Get the global list of vertex factory types.
static RENDERCORE_API TLinkedList<FVertexFactoryType*>*& GetTypeList();
// Get the sorted list of vertex factory types used with materials.
static RENDERCORE_API const TArray<FVertexFactoryType*>& GetSortedMaterialTypes();
// Find an FVertexFactoryType by name
static RENDERCORE_API FVertexFactoryType* GetVFByName(const FHashedName& VFName);
// Initialize the static members of FVertexFactoryType; must be called before any VF type is created.
static void Initialize(const TMap<FString, TArray<const TCHAR*> >& ShaderFileToUniformBufferVariables);
static void Uninitialize();
// Constructor / destructor.
RENDERCORE_API FVertexFactoryType(...);
virtual ~FVertexFactoryType();
// Data accessors.
const TCHAR* GetName() const;
FName GetFName() const;
const FHashedName& GetHashedName() const;
const TCHAR* GetShaderFilename() const;
// Shader parameter interfaces.
FVertexFactoryShaderParameters* CreateShaderParameters(...) const;
const FTypeLayoutDesc* GetShaderParameterLayout(...) const;
void GetShaderParameterElementShaderBindings(...) const;
// Flag accessors.
bool IsUsedWithMaterials() const;
bool SupportsStaticLighting() const;
bool SupportsDynamicLighting() const;
bool SupportsPrecisePrevWorldPos() const;
bool SupportsPositionOnly() const;
bool SupportsCachingMeshDrawCommands() const;
bool SupportsPrimitiveIdStream() const;
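// Get the hash.
friend uint32 GetTypeHash(const FVertexFactoryType* Type);
// The hash computed from the vertex factory type's source code and its includes.
const FSHAHash& GetSourceHash(EShaderPlatform ShaderPlatform) const;
// Whether shader types for the material need to be cached.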
bool ShouldCache(const FVertexFactoryShaderPermutationParameters& Parameters) const;
void ModifyCompilationEnvironment(...);
void ValidateCompiledResult(EShaderPlatform Platform, ...);
bool SupportsTessellationShaders() const;
// Add the includes of the referenced uniform buffers.
void AddReferencedUniformBufferIncludes(...);
void FlushShaderFileCache(...);
const TMap<const TCHAR*, FCachedUniformBufferDeclaration>& GetReferencedUniformBufferStructsCache() const;
private:
static uint32 NumVertexFactories;
static bool bInitializedSerializationHistory;
// The various data and flags of the vertex factory type.
const TCHAR* Name;
const TCHAR* ShaderFilename;
FName TypeName;
FHashedName HashedName;
uint32 bUsedWithMaterials : 1;
uint32 bSupportsStaticLighting : 1;
uint32 bSupportsDynamicLighting : 1;
uint32 bSupportsPrecisePrevWorldPos : 1;
uint32 bSupportsPositionOnly : 1;
uint32 bSupportsCachingMeshDrawCommands : 1;
uint32 bSupportsPrimitiveIdStream : 1;
ConstructParametersType ConstructParameters;
GetParameterTypeLayoutType GetParameterTypeLayout;
GetParameterTypeElementShaderBindingsType GetParameterTypeElementShaderBindings;
ShouldCacheType ShouldCacheRef;
ModifyCompilationEnvironmentType ModifyCompilationEnvironmentRef;
ValidateCompiledResultType ValidateCompiledResultRef;
SupportsTessellationShadersType SupportsTessellationShadersRef;
// 全域性頂點工廠型別列表.
TLinkedList<FVertexFactoryType*> GlobalListLink;
// 快取引用的Uniform Buffer的包含.
TMap<const TCHAR*, FCachedUniformBufferDeclaration> ReferencedUniformBufferStructsCache;
// 跟蹤ReferencedUniformBufferStructsCache快取了哪些平臺的宣告.
bool bCachedUniformBufferStructDeclarations;
};
// ------頂點工廠的工具巨集------
// 實現頂點工廠引數型別
#define IMPLEMENT_VERTEX_FACTORY_PARAMETER_TYPE(FactoryClass, ShaderFrequency, ParameterClass)
// 頂點工廠型別的宣告
#define DECLARE_VERTEX_FACTORY_TYPE(FactoryClass)
// 頂點工廠型別的實現
#define IMPLEMENT_VERTEX_FACTORY_TYPE(FactoryClass,ShaderFilename,bUsedWithMaterials,bSupportsStaticLighting,bSupportsDynamicLighting,bPrecisePrevWorldPos,bSupportsPositionOnly)
// Implements the vertex factory's virtual function table.
#define IMPLEMENT_VERTEX_FACTORY_VTABLE(FactoryClass)
// 頂點工廠
class FVertexFactory : public FRenderResource
{
public:
FVertexFactory(ERHIFeatureLevel::Type InFeatureLevel);
virtual FVertexFactoryType* GetType() const;
// 獲取頂點資料流.
void GetStreams(ERHIFeatureLevel::Type InFeatureLevel, EVertexInputStreamType VertexStreamType, FVertexInputStreamArray& OutVertexStreams) const
{
// Default頂點流型別
if (VertexStreamType == EVertexInputStreamType::Default)
{
bool bSupportsVertexFetch = SupportsManualVertexFetch(InFeatureLevel);
// 將頂點工廠的資料構造到FVertexInputStream中並新增到輸出列表
for (int32 StreamIndex = 0;StreamIndex < Streams.Num();StreamIndex++)
{
const FVertexStream& Stream = Streams[StreamIndex];
if (!(EnumHasAnyFlags(EVertexStreamUsage::ManualFetch, Stream.VertexStreamUsage) && bSupportsVertexFetch))
{
if (!Stream.VertexBuffer)
{
OutVertexStreams.Add(FVertexInputStream(StreamIndex, 0, nullptr));
}
else
{
if (EnumHasAnyFlags(EVertexStreamUsage::Overridden, Stream.VertexStreamUsage) && !Stream.VertexBuffer->IsInitialized())
{
OutVertexStreams.Add(FVertexInputStream(StreamIndex, 0, nullptr));
}
else
{
OutVertexStreams.Add(FVertexInputStream(StreamIndex, Stream.Offset, Stream.VertexBuffer->VertexBufferRHI));
}
}
}
}
}
// Position-only vertex stream type.
else if (VertexStreamType == EVertexInputStreamType::PositionOnly)
{
// Set the predefined vertex streams.
for (int32 StreamIndex = 0; StreamIndex < PositionStream.Num(); StreamIndex++)
{
const FVertexStream& Stream = PositionStream[StreamIndex];
OutVertexStreams.Add(FVertexInputStream(StreamIndex, Stream.Offset, Stream.VertexBuffer->VertexBufferRHI));
}
}
// 只有位置和法線的頂點流型別
else if (VertexStreamType == EVertexInputStreamType::PositionAndNormalOnly)
{
// Set the predefined vertex streams.
for (int32 StreamIndex = 0; StreamIndex < PositionAndNormalStream.Num(); StreamIndex++)
{
const FVertexStream& Stream = PositionAndNormalStream[StreamIndex];
OutVertexStreams.Add(FVertexInputStream(StreamIndex, Stream.Offset, Stream.VertexBuffer->VertexBufferRHI));
}
}
else
{
// NOT_IMPLEMENTED
}
}
// 偏移例項的資料流.
void OffsetInstanceStreams(uint32 InstanceOffset, EVertexInputStreamType VertexStreamType, FVertexInputStreamArray& VertexStreams) const;
static void ModifyCompilationEnvironment(...);
static void ValidateCompiledResult(...);
static bool SupportsTessellationShaders();
// FRenderResource介面, 釋放RHI資源.
virtual void ReleaseRHI();
// 設定/獲取頂點宣告的RHI引用.
FVertexDeclarationRHIRef& GetDeclaration();
void SetDeclaration(FVertexDeclarationRHIRef& NewDeclaration);
// 根據型別獲取頂點宣告的RHI引用.
const FVertexDeclarationRHIRef& GetDeclaration(EVertexInputStreamType InputStreamType) const
{
switch (InputStreamType)
{
case EVertexInputStreamType::Default: return Declaration;
case EVertexInputStreamType::PositionOnly: return PositionDeclaration;
case EVertexInputStreamType::PositionAndNormalOnly: return PositionAndNormalDeclaration;
}
return Declaration;
}
// 各類標記.
virtual bool IsGPUSkinned() const;
virtual bool SupportsPositionOnlyStream() const;
virtual bool SupportsPositionAndNormalOnlyStream() const;
virtual bool SupportsNullPixelShader() const;
// 用面向攝像機精靈的方式渲染圖元.
virtual bool RendersPrimitivesAsCameraFacingSprites() const;
// 是否需要頂點宣告.
bool NeedsDeclaration() const;
// 是否支援手動的頂點獲取.
inline bool SupportsManualVertexFetch(const FStaticFeatureLevel InFeatureLevel) const;
// 根據流型別獲取索引.
inline int32 GetPrimitiveIdStreamIndex(EVertexInputStreamType InputStreamType) const;
protected:
inline void SetPrimitiveIdStreamIndex(EVertexInputStreamType InputStreamType, int32 StreamIndex)
{
PrimitiveIdStreamIndex[static_cast<uint8>(InputStreamType)] = StreamIndex;
}
// 為頂點流元件建立頂點元素.
FVertexElement AccessStreamComponent(const FVertexStreamComponent& Component,uint8 AttributeIndex);
FVertexElement AccessStreamComponent(const FVertexStreamComponent& Component, uint8 AttributeIndex, EVertexInputStreamType InputStreamType);
// 初始化頂點宣告.
void InitDeclaration(const FVertexDeclarationElementList& Elements, EVertexInputStreamType StreamType = EVertexInputStreamType::Default)
{
if (StreamType == EVertexInputStreamType::PositionOnly)
{
PositionDeclaration = PipelineStateCache::GetOrCreateVertexDeclaration(Elements);
}
else if (StreamType == EVertexInputStreamType::PositionAndNormalOnly)
{
PositionAndNormalDeclaration = PipelineStateCache::GetOrCreateVertexDeclaration(Elements);
}
else // (StreamType == EVertexInputStreamType::Default)
{
// Create the vertex declaration for rendering the factory normally.
Declaration = PipelineStateCache::GetOrCreateVertexDeclaration(Elements);
}
}
// 頂點流, 需要設定到頂點流的資訊體.
struct FVertexStream
{
const FVertexBuffer* VertexBuffer = nullptr;
uint32 Offset = 0;
uint16 Stride = 0;
EVertexStreamUsage VertexStreamUsage = EVertexStreamUsage::Default;
uint8 Padding = 0;
friend bool operator==(const FVertexStream& A,const FVertexStream& B);
FVertexStream();
};
// 用於渲染頂點工廠的頂點流.
TArray<FVertexStream,TInlineAllocator<8> > Streams;
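// A VF (vertex factory) can explicitly set this to false to avoid errors when it has no declaration. Mainly used by VFs that fetch data directly from buffers (such as Niagara).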
bool bNeedsDeclaration = true;
bool bSupportsManualVertexFetch = false;
int8 PrimitiveIdStreamIndex[3] = { -1, -1, -1 };
private:
// 只有位置的頂點流, 用於渲染深度Pass的頂點工廠.
TArray<FVertexStream,TInlineAllocator<2> > PositionStream;
// 只有位置和法線的頂點流.
TArray<FVertexStream, TInlineAllocator<3> > PositionAndNormalStream;
// 用於常規渲染頂點工廠的RHI頂點宣告.
FVertexDeclarationRHIRef Declaration;
// PositionStream和PositionAndNormalStream對應的RHI資源.
FVertexDeclarationRHIRef PositionDeclaration;
FVertexDeclarationRHIRef PositionAndNormalDeclaration;
};
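The listings above introduce many Vertex Factory types. Several of them are core classes: FVertexFactory, FVertexElement, FRHIVertexDeclaration, FRHIVertexBuffer, FVertexFactoryType, FVertexStreamComponent, FVertexInputStream, and FVertexFactoryShaderParameters. How do they relate to one another?
To illustrate the relationships, take FStaticMeshDataType, the data type used for static meshes, as an example:
FStaticMeshDataType contains several FVertexStreamComponent instances. Each FVertexStreamComponent maps to the index of an FVertexElement instance in the FVertexDeclarationElementList and to the index of an FVertexStream instance in the FVertexInputStreamArray list.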
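The index relationship described above can be sketched with a small standalone model. This is not engine code: `VertexStream`, `VertexStreamComponent`, `VertexElement`, and `VertexFactorySketch` are simplified, hypothetical stand-ins, and the find-or-add behavior is an assumption about how `AccessStreamComponent` shares one stream between components that read from the same buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the engine types (not the real UE definitions).
struct VertexStream {
    const void* Buffer;   // which vertex buffer this stream reads
    uint32_t    Offset;   // byte offset of the stream inside the buffer
    uint16_t    Stride;   // bytes between consecutive vertices
    bool operator==(const VertexStream& o) const {
        return Buffer == o.Buffer && Offset == o.Offset && Stride == o.Stride;
    }
};

struct VertexStreamComponent {
    const void* Buffer;
    uint32_t    StreamOffset; // offset of the stream in the buffer
    uint32_t    Offset;       // offset of this attribute inside one vertex
    uint16_t    Stride;
};

struct VertexElement {
    uint8_t  StreamIndex;     // index into the factory's stream list
    uint32_t Offset;
    uint8_t  AttributeIndex;  // matches ATTRIBUTEn on the shader side
    uint16_t Stride;
};

class VertexFactorySketch {
public:
    // Assumed behavior: find an identical stream or append a new one,
    // then return an element that refers to the stream by index.
    VertexElement AccessStreamComponent(const VertexStreamComponent& c, uint8_t attributeIndex) {
        const VertexStream s{c.Buffer, c.StreamOffset, c.Stride};
        std::size_t idx = 0;
        for (; idx < streams_.size(); ++idx) {
            if (streams_[idx] == s) break;
        }
        if (idx == streams_.size()) streams_.push_back(s);
        return VertexElement{static_cast<uint8_t>(idx), c.Offset, attributeIndex, c.Stride};
    }
    const std::vector<VertexStream>& Streams() const { return streams_; }

private:
    std::vector<VertexStream> streams_;
};
```

In this model, two tangent components that live in the same buffer produce two elements with different attribute indices but the same stream index, which is the sharing the paragraph above describes.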
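In addition, FVertexFactory is a base class. Its main built-in subclasses are:
FGeometryCacheVertexVertexFactory: the vertex factory for geometry cache vertices, commonly used for pre-generated mesh types such as cloth and animation.
FGPUBaseSkinVertexFactory: the parent class for GPU-skinned skeletal meshes. Its subclasses include:
- TGPUSkinVertexFactory: a GPU skinning vertex factory with a configurable bone-weight mode.
FLocalVertexFactory: the local vertex factory, commonly used for static meshes. It has a large number of subclasses:
- FInstancedStaticMeshVertexFactory: the vertex factory for instanced static meshes.
- FSplineMeshVertexFactory: the vertex factory for spline meshes.
- FGeometryCollectionVertexFactory: the vertex factory for geometry collections.
- FGPUSkinPassthroughVertexFactory: the vertex factory for skinned skeletal meshes with Skin Cache enabled.
- FSingleTriangleMeshVertexFactory: the vertex factory for a single-triangle mesh, used for volumetric cloud rendering.
- ......
FParticleVertexFactoryBase: the base class of the vertex factories used for particle rendering.
FLandscapeVertexFactory: the vertex factory used for landscape rendering.
Besides the types above, which derive from FVertexFactory, some related types do not derive from it, for example:
- FGPUBaseSkinAPEXClothVertexFactory: the cloth vertex factory.
- TGPUSkinAPEXClothVertexFactory: a cloth vertex factory with a configurable bone-weight mode.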
Besides FVertexFactory, the other core classes have their own inheritance hierarchies. For example, the subclasses of FVertexFactoryShaderParameters include:
- FGeometryCacheVertexFactoryShaderParameters
- FGPUSkinVertexFactoryShaderParameters
- FMeshParticleVertexFactoryShaderParameters
- FParticleSpriteVertexFactoryShaderParameters
- FGPUSpriteVertexFactoryShaderParametersVS
- FGPUSpriteVertexFactoryShaderParametersPS
- FSplineMeshVertexFactoryShaderParameters
- FLocalVertexFactoryShaderParametersBase
- FLandscapeVertexFactoryVertexShaderParameters
- FLandscapeVertexFactoryPixelShaderParameters
- ......
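In addition, some vertex factories derive their own type from FStaticMeshDataType internally, in order to reuse the data members related to static meshes.
To show how a vertex factory is used, the following takes the most common one, FLocalVertexFactory, together with CableComponent, which uses it, as an example: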
// Engine\Source\Runtime\Engine\Public\LocalVertexFactory.h
class ENGINE_API FLocalVertexFactory : public FVertexFactory
{
public:
FLocalVertexFactory(ERHIFeatureLevel::Type InFeatureLevel, const char* InDebugName);
// 派生自FStaticMeshDataType的資料型別.
struct FDataType : public FStaticMeshDataType
{
FRHIShaderResourceView* PreSkinPositionComponentSRV = nullptr;
};
// 環境變數更改和校驗.
static bool ShouldCompilePermutation(const FVertexFactoryShaderPermutationParameters& Parameters);
static void ModifyCompilationEnvironment(const FVertexFactoryShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment);
static void ValidateCompiledResult(const FVertexFactoryType* Type, EShaderPlatform Platform, const FShaderParameterMap& ParameterMap, TArray<FString>& OutErrors);
// 由TSynchronizedResource從遊戲執行緒更新而來的資料.
void SetData(const FDataType& InData);
// 從其它頂點工廠複製資料.
void Copy(const FLocalVertexFactory& Other);
// FRenderResource介面.
virtual void InitRHI() override;
virtual void ReleaseRHI() override
{
UniformBuffer.SafeRelease();
FVertexFactory::ReleaseRHI();
}
// 頂點顏色介面.
void SetColorOverrideStream(FRHICommandList& RHICmdList, const FVertexBuffer* ColorVertexBuffer) const;
void GetColorOverrideStream(const FVertexBuffer* ColorVertexBuffer, FVertexInputStreamArray& VertexStreams) const;
// 著色器引數和其它資料介面.
inline FRHIShaderResourceView* GetPositionsSRV() const;
inline FRHIShaderResourceView* GetPreSkinPositionSRV() const;
inline FRHIShaderResourceView* GetTangentsSRV() const;
inline FRHIShaderResourceView* GetTextureCoordinatesSRV() const;
inline FRHIShaderResourceView* GetColorComponentsSRV() const;
inline const uint32 GetColorIndexMask() const;
inline const int GetLightMapCoordinateIndex() const;
inline const int GetNumTexcoords() const;
FRHIUniformBuffer* GetUniformBuffer() const;
(......)
protected:
// 從遊戲執行緒傳入的資料. FDataType是FStaticMeshDataType的子類.
FDataType Data;
// 區域性頂點工廠的著色器引數.
TUniformBufferRef<FLocalVertexFactoryUniformShaderParameters> UniformBuffer;
// 頂點顏色流索引.
int32 ColorStreamIndex;
(......)
};
// Engine\Source\Runtime\Engine\Private\LocalVertexFactory.cpp
void FLocalVertexFactory::InitRHI()
{
// 是否使用gpu場景.
const bool bCanUseGPUScene = UseGPUScene(GMaxRHIShaderPlatform, GMaxRHIFeatureLevel);
// 初始化位置流和位置宣告.
if (Data.PositionComponent.VertexBuffer != Data.TangentBasisComponents[0].VertexBuffer)
{
// 增加頂點宣告.
auto AddDeclaration = [this, bCanUseGPUScene](EVertexInputStreamType InputStreamType, bool bAddNormal)
{
// 頂點流元素.
FVertexDeclarationElementList StreamElements;
StreamElements.Add(AccessStreamComponent(Data.PositionComponent, 0, InputStreamType));
bAddNormal = bAddNormal && Data.TangentBasisComponents[1].VertexBuffer != NULL;
if (bAddNormal)
{
StreamElements.Add(AccessStreamComponent(Data.TangentBasisComponents[1], 2, InputStreamType));
}
const uint8 TypeIndex = static_cast<uint8>(InputStreamType);
PrimitiveIdStreamIndex[TypeIndex] = -1;
if (GetType()->SupportsPrimitiveIdStream() && bCanUseGPUScene)
{
// When the VF is used for rendering in normal mesh passes, this vertex buffer and offset will be overridden
StreamElements.Add(AccessStreamComponent(FVertexStreamComponent(&GPrimitiveIdDummy, 0, 0, sizeof(uint32), VET_UInt, EVertexStreamUsage::Instancing), 1, InputStreamType));
PrimitiveIdStreamIndex[TypeIndex] = StreamElements.Last().StreamIndex;
}
// 初始化宣告.
InitDeclaration(StreamElements, InputStreamType);
};
// 增加PositionOnly和PositionAndNormalOnly兩種頂點宣告, 其中前者不需要法線.
AddDeclaration(EVertexInputStreamType::PositionOnly, false);
AddDeclaration(EVertexInputStreamType::PositionAndNormalOnly, true);
}
// 頂點宣告元素列表.
FVertexDeclarationElementList Elements;
// 頂點位置
if(Data.PositionComponent.VertexBuffer != NULL)
{
Elements.Add(AccessStreamComponent(Data.PositionComponent,0));
}
// 圖元id
{
const uint8 Index = static_cast<uint8>(EVertexInputStreamType::Default);
PrimitiveIdStreamIndex[Index] = -1;
if (GetType()->SupportsPrimitiveIdStream() && bCanUseGPUScene)
{
// When the VF is used for rendering in normal mesh passes, this vertex buffer and offset will be overridden
Elements.Add(AccessStreamComponent(FVertexStreamComponent(&GPrimitiveIdDummy, 0, 0, sizeof(uint32), VET_UInt, EVertexStreamUsage::Instancing), 13));
PrimitiveIdStreamIndex[Index] = Elements.Last().StreamIndex;
}
}
// Tangent and normal: only these two go through the vertex streams; the binormal is generated in the shader.
uint8 TangentBasisAttributes[2] = { 1, 2 };
for(int32 AxisIndex = 0;AxisIndex < 2;AxisIndex++)
{
if(Data.TangentBasisComponents[AxisIndex].VertexBuffer != NULL)
{
Elements.Add(AccessStreamComponent(Data.TangentBasisComponents[AxisIndex],TangentBasisAttributes[AxisIndex]));
}
}
if (Data.ColorComponentsSRV == nullptr)
{
Data.ColorComponentsSRV = GNullColorVertexBuffer.VertexBufferSRV;
Data.ColorIndexMask = 0;
}
// 頂點顏色
ColorStreamIndex = -1;
if(Data.ColorComponent.VertexBuffer)
{
Elements.Add(AccessStreamComponent(Data.ColorComponent,3));
ColorStreamIndex = Elements.Last().StreamIndex;
}
else
{
FVertexStreamComponent NullColorComponent(&GNullColorVertexBuffer, 0, 0, VET_Color, EVertexStreamUsage::ManualFetch);
Elements.Add(AccessStreamComponent(NullColorComponent, 3));
ColorStreamIndex = Elements.Last().StreamIndex;
}
// 紋理座標
if(Data.TextureCoordinates.Num())
{
const int32 BaseTexCoordAttribute = 4;
for(int32 CoordinateIndex = 0;CoordinateIndex < Data.TextureCoordinates.Num();CoordinateIndex++)
{
Elements.Add(AccessStreamComponent(
Data.TextureCoordinates[CoordinateIndex],
BaseTexCoordAttribute + CoordinateIndex
));
}
for (int32 CoordinateIndex = Data.TextureCoordinates.Num(); CoordinateIndex < MAX_STATIC_TEXCOORDS / 2; CoordinateIndex++)
{
Elements.Add(AccessStreamComponent(
Data.TextureCoordinates[Data.TextureCoordinates.Num() - 1],
BaseTexCoordAttribute + CoordinateIndex
));
}
}
// 光照圖
if(Data.LightMapCoordinateComponent.VertexBuffer)
{
Elements.Add(AccessStreamComponent(Data.LightMapCoordinateComponent,15));
}
else if(Data.TextureCoordinates.Num())
{
Elements.Add(AccessStreamComponent(Data.TextureCoordinates[0],15));
}
// 初始化頂點宣告
InitDeclaration(Elements);
const int32 DefaultBaseVertexIndex = 0;
const int32 DefaultPreSkinBaseVertexIndex = 0;
if (RHISupportsManualVertexFetch(GMaxRHIShaderPlatform) || bCanUseGPUScene)
{
SCOPED_LOADTIMER(FLocalVertexFactory_InitRHI_CreateLocalVFUniformBuffer);
UniformBuffer = CreateLocalVFUniformBuffer(this, Data.LODLightmapDataIndex, nullptr, DefaultBaseVertexIndex, DefaultPreSkinBaseVertexIndex);
}
}
// 實現FLocalVertexFactory的引數型別.
IMPLEMENT_VERTEX_FACTORY_PARAMETER_TYPE(FLocalVertexFactory, SF_Vertex, FLocalVertexFactoryShaderParameters);
// 實現FLocalVertexFactory.
IMPLEMENT_VERTEX_FACTORY_TYPE_EX(FLocalVertexFactory,"/Engine/Private/LocalVertexFactory.ush",true,true,true,true,true,true,true);
Next, here is how the CableComponent types use FLocalVertexFactory:
// Engine\Plugins\Runtime\CableComponent\Source\CableComponent\Private\CableComponent.cpp
class FCableSceneProxy final : public FPrimitiveSceneProxy
{
public:
FCableSceneProxy(UCableComponent* Component)
: FPrimitiveSceneProxy(Component)
, Material(NULL)
// 構造頂點工廠.
, VertexFactory(GetScene().GetFeatureLevel(), "FCableSceneProxy")
(......)
{
// 利用頂點工廠初始化緩衝區.
VertexBuffers.InitWithDummyData(&VertexFactory, GetRequiredVertexCount());
(......)
}
virtual ~FCableSceneProxy()
{
// 釋放頂點工廠.
VertexFactory.ReleaseResource();
(......)
}
// 構建Cable網格.
void BuildCableMesh(const TArray<FVector>& InPoints, TArray<FDynamicMeshVertex>& OutVertices, TArray<int32>& OutIndices)
{
(......)
}
// 設定動態資料(渲染執行緒呼叫)
void SetDynamicData_RenderThread(FCableDynamicData* NewDynamicData)
{
// 釋放舊資料.
if(DynamicData)
{
delete DynamicData;
DynamicData = NULL;
}
DynamicData = NewDynamicData;
// 從Cable點構建頂點.
TArray<FDynamicMeshVertex> Vertices;
TArray<int32> Indices;
BuildCableMesh(NewDynamicData->CablePoints, Vertices, Indices);
// 填充頂點緩衝區資料.
for (int i = 0; i < Vertices.Num(); i++)
{
const FDynamicMeshVertex& Vertex = Vertices[i];
VertexBuffers.PositionVertexBuffer.VertexPosition(i) = Vertex.Position;
VertexBuffers.StaticMeshVertexBuffer.SetVertexTangents(i, Vertex.TangentX.ToFVector(), Vertex.GetTangentY(), Vertex.TangentZ.ToFVector());
VertexBuffers.StaticMeshVertexBuffer.SetVertexUV(i, 0, Vertex.TextureCoordinate[0]);
VertexBuffers.ColorVertexBuffer.VertexColor(i) = Vertex.Color;
}
// 更新頂點緩衝區資料到RHI.
{
auto& VertexBuffer = VertexBuffers.PositionVertexBuffer;
void* VertexBufferData = RHILockVertexBuffer(VertexBuffer.VertexBufferRHI, 0, VertexBuffer.GetNumVertices() * VertexBuffer.GetStride(), RLM_WriteOnly);
FMemory::Memcpy(VertexBufferData, VertexBuffer.GetVertexData(), VertexBuffer.GetNumVertices() * VertexBuffer.GetStride());
RHIUnlockVertexBuffer(VertexBuffer.VertexBufferRHI);
}
(......)
}
virtual void GetDynamicMeshElements(const TArray<const FSceneView*>& Views, const FSceneViewFamily& ViewFamily, uint32 VisibilityMap, FMeshElementCollector& Collector) const override
{
(......)
for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
{
if (VisibilityMap & (1 << ViewIndex))
{
const FSceneView* View = Views[ViewIndex];
// 構造FMeshBatch例項.
FMeshBatch& Mesh = Collector.AllocateMesh();
// 將頂點工廠例項傳給FMeshBatch例項.
Mesh.VertexFactory = &VertexFactory;
(......)
Collector.AddMesh(ViewIndex, Mesh);
}
}
}
(......)
private:
// 材質
UMaterialInterface* Material;
// 頂點和索引緩衝.
FStaticMeshVertexBuffers VertexBuffers;
FCableIndexBuffer IndexBuffer;
// 頂點工廠.
FLocalVertexFactory VertexFactory;
// 動態資料.
FCableDynamicData* DynamicData;
(......)
};
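As the code above shows, using an existing vertex factory is not complicated: the main steps are initializing it, assigning its data, and passing it to an FMeshBatch instance.
However, whether you use an existing vertex factory or a custom one, the order, types, component counts, and slots of its vertex declaration must match FVertexFactoryInput on the HLSL side. For example, FLocalVertexFactory::InitRHI declares position, tangents, color, texture coordinates, and lightmap coordinates in that order, so let's look at the HLSL file that corresponds to FLocalVertexFactory (specified by macros such as IMPLEMENT_VERTEX_FACTORY_TYPE):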
// Engine\Shaders\Private\LocalVertexFactory.ush
// 區域性頂點工廠對應的輸入結構體.
struct FVertexFactoryInput
{
// 位置
float4 Position : ATTRIBUTE0;
// 切線和顏色
#if !MANUAL_VERTEX_FETCH
#if METAL_PROFILE
float3 TangentX : ATTRIBUTE1;
// TangentZ.w contains sign of tangent basis determinant
float4 TangentZ : ATTRIBUTE2;
float4 Color : ATTRIBUTE3;
#else
half3 TangentX : ATTRIBUTE1;
// TangentZ.w contains sign of tangent basis determinant
half4 TangentZ : ATTRIBUTE2;
half4 Color : ATTRIBUTE3;
#endif
#endif
// 紋理座標
#if NUM_MATERIAL_TEXCOORDS_VERTEX
#if !MANUAL_VERTEX_FETCH
#if GPUSKIN_PASS_THROUGH
// These must match GPUSkinVertexFactory.usf
float2 TexCoords[NUM_MATERIAL_TEXCOORDS_VERTEX] : ATTRIBUTE4;
#if NUM_MATERIAL_TEXCOORDS_VERTEX > 4
#error Too many texture coordinate sets defined on GPUSkin vertex input. Max: 4.
#endif
#else
#if NUM_MATERIAL_TEXCOORDS_VERTEX > 1
float4 PackedTexCoords4[NUM_MATERIAL_TEXCOORDS_VERTEX/2] : ATTRIBUTE4;
#endif
#if NUM_MATERIAL_TEXCOORDS_VERTEX == 1
float2 PackedTexCoords2 : ATTRIBUTE4;
#elif NUM_MATERIAL_TEXCOORDS_VERTEX == 3
float2 PackedTexCoords2 : ATTRIBUTE5;
#elif NUM_MATERIAL_TEXCOORDS_VERTEX == 5
float2 PackedTexCoords2 : ATTRIBUTE6;
#elif NUM_MATERIAL_TEXCOORDS_VERTEX == 7
float2 PackedTexCoords2 : ATTRIBUTE7;
#endif
#endif
#endif
#elif USE_PARTICLE_SUBUVS && !MANUAL_VERTEX_FETCH
float2 TexCoords[1] : ATTRIBUTE4;
#endif
(......)
};
It follows that the data order of the FVertexFactoryInput struct corresponds one-to-one with the vertex declaration of FLocalVertexFactory.
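As a worked example of that correspondence, the sketch below reproduces the attribute slots that the HLSL listing assigns to texture coordinates in the branch without MANUAL_VERTEX_FETCH and without GPUSKIN_PASS_THROUGH. The function `TexCoordSlots` and the struct `AttributeSlot` are hypothetical helpers written for illustration; only the base attribute 4 and the packing rule are taken from the listings above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type: one shader input and the ATTRIBUTEn slot it binds to.
struct AttributeSlot {
    std::string Name;
    int Attribute;
};

// Mirrors the #if ladder in FVertexFactoryInput for NUM_MATERIAL_TEXCOORDS_VERTEX = numTexCoords
// (non-manual-fetch, non-GPUSkin-passthrough branch only).
std::vector<AttributeSlot> TexCoordSlots(int numTexCoords) {
    const int baseTexCoordAttribute = 4;      // same base as BaseTexCoordAttribute in InitRHI
    std::vector<AttributeSlot> slots;

    // Every pair of UV sets is packed into one float4, starting at ATTRIBUTE4.
    const int packedPairs = numTexCoords / 2;
    for (int i = 0; i < packedPairs; ++i) {
        slots.push_back({"PackedTexCoords4[" + std::to_string(i) + "]", baseTexCoordAttribute + i});
    }
    // An odd count leaves one float2, which takes the next free slot
    // (ATTRIBUTE4 for 1, ATTRIBUTE5 for 3, ATTRIBUTE6 for 5, ATTRIBUTE7 for 7).
    if (numTexCoords % 2 == 1) {
        slots.push_back({"PackedTexCoords2", baseTexCoordAttribute + packedPairs});
    }
    return slots;
}
```

With three UV sets, for instance, the result is PackedTexCoords4[0] on ATTRIBUTE4 followed by PackedTexCoords2 on ATTRIBUTE5, which is the same slot the C++ side reaches with BaseTexCoordAttribute + CoordinateIndex.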
8.2.5 Shader Permutation
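UE's shader code adopts an uber shader design, which means a single shader source file carries many macros that separate the code branches for different passes, features, feature levels, and quality levels. On the C++ side, to make it easy to extend these macro definitions and to control whether they are enabled and which values they take, UE uses the concept of a shader permutation.
Each permutation has a unique hash key. The values of that permutation are filled into the HLSL code, which is then compiled into the corresponding shader code. The definitions of the core shader permutation types are analyzed below: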
// Engine\Source\Runtime\RenderCore\Public\ShaderPermutation.h
// Bool的著色器排列
struct FShaderPermutationBool
{
using Type = bool;
// 維度數量.
static constexpr int32 PermutationCount = 2;
// 是否多維的排列.
static constexpr bool IsMultiDimensional = false;
// 轉換bool到int值.
static int32 ToDimensionValueId(Type E)
{
return E ? 1 : 0;
}
// 轉換為定義的值.
static bool ToDefineValue(Type E)
{
return E;
}
// 從排列id轉成bool.
static Type FromDimensionValueId(int32 PermutationId)
{
checkf(PermutationId == 0 || PermutationId == 1, TEXT("Invalid shader permutation dimension id %i."), PermutationId);
return PermutationId == 1;
}
};
// 整型的著色器排列
template <typename TType, int32 TDimensionSize, int32 TFirstValue=0>
struct TShaderPermutationInt
{
using Type = TType;
static constexpr int32 PermutationCount = TDimensionSize;
static constexpr bool IsMultiDimensional = false;
// 最大最小值.
static constexpr Type MinValue = static_cast<Type>(TFirstValue);
static constexpr Type MaxValue = static_cast<Type>(TFirstValue + TDimensionSize - 1);
static int32 ToDimensionValueId(Type E);
static int32 ToDefineValue(Type E);
static Type FromDimensionValueId(int32 PermutationId);
};
// 可變維度的整型著色器排列.
template <int32... Ts>
struct TShaderPermutationSparseInt
{
using Type = int32;
static constexpr int32 PermutationCount = 0;
static constexpr bool IsMultiDimensional = false;
static int32 ToDimensionValueId(Type E);
static Type FromDimensionValueId(int32 PermutationId);
};
// 著色器排列域, 數量是可變的
template <typename... Ts>
struct TShaderPermutationDomain
{
using Type = TShaderPermutationDomain<Ts...>;
static constexpr bool IsMultiDimensional = true;
static constexpr int32 PermutationCount = 1;
// 建構函式.
TShaderPermutationDomain<Ts...>() {}
explicit TShaderPermutationDomain<Ts...>(int32 PermutationId)
{
checkf(PermutationId == 0, TEXT("Invalid shader permutation id %i."), PermutationId);
}
// 設定某個維度的值.
template<class DimensionToSet>
void Set(typename DimensionToSet::Type)
{
static_assert(sizeof(typename DimensionToSet::Type) == 0, "Unknown shader permutation dimension.");
}
// 獲取某個維度的值.
template<class DimensionToGet>
const typename DimensionToGet::Type Get() const
{
static_assert(sizeof(typename DimensionToGet::Type) == 0, "Unknown shader permutation dimension.");
return DimensionToGet::Type();
}
// 修改編譯環境變數.
void ModifyCompilationEnvironment(FShaderCompilerEnvironment& OutEnvironment) const {}
// 資料轉換.
static int32 ToDimensionValueId(const Type& PermutationVector)
{
return 0;
}
int32 ToDimensionValueId() const
{
return ToDimensionValueId(*this);
}
static Type FromDimensionValueId(const int32 PermutationId)
{
return Type(PermutationId);
}
bool operator==(const Type& Other) const
{
return true;
}
};
// 下面的巨集方便編寫shader的c++程式碼時實現和設定著色器排列.
// 宣告指定名字的bool型別著色器排列
#define SHADER_PERMUTATION_BOOL(InDefineName)
// 宣告指定名字的int型別著色器排列
#define SHADER_PERMUTATION_INT(InDefineName, Count)
// 宣告指定名字和範圍的int型別著色器排列
#define SHADER_PERMUTATION_RANGE_INT(InDefineName, Start, Count)
// 宣告指定名字的稀疏int型別著色器排列
#define SHADER_PERMUTATION_SPARSE_INT(InDefineName,...)
// 宣告指定名字的列舉型別著色器排列
#define SHADER_PERMUTATION_ENUM_CLASS(InDefineName, EnumName)
If the templates and macros above look confusing, don't worry. Looking at how FDeferredLightPS uses them shows that shader permutations are actually quite simple:
// 延遲光源的PS.
class FDeferredLightPS : public FGlobalShader
{
DECLARE_SHADER_TYPE(FDeferredLightPS, Global)
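// Declare the shader permutation for each dimension. Note that inheritance is used, and the parent class is a type defined by SHADER_PERMUTATION_xxx.
// Note that the name passed to the parent class (LIGHT_SOURCE_SHAPE, USE_SOURCE_TEXTURE, USE_IES_PROFILE, ...) is the macro name used in the HLSL code.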
class FSourceShapeDim : SHADER_PERMUTATION_ENUM_CLASS("LIGHT_SOURCE_SHAPE", ELightSourceShape);
class FSourceTextureDim : SHADER_PERMUTATION_BOOL("USE_SOURCE_TEXTURE");
class FIESProfileDim : SHADER_PERMUTATION_BOOL("USE_IES_PROFILE");
class FInverseSquaredDim : SHADER_PERMUTATION_BOOL("INVERSE_SQUARED_FALLOFF");
class FVisualizeCullingDim : SHADER_PERMUTATION_BOOL("VISUALIZE_LIGHT_CULLING");
class FLightingChannelsDim : SHADER_PERMUTATION_BOOL("USE_LIGHTING_CHANNELS");
class FTransmissionDim : SHADER_PERMUTATION_BOOL("USE_TRANSMISSION");
class FHairLighting : SHADER_PERMUTATION_INT("USE_HAIR_LIGHTING", 2);
class FAtmosphereTransmittance : SHADER_PERMUTATION_BOOL("USE_ATMOSPHERE_TRANSMITTANCE");
class FCloudTransmittance : SHADER_PERMUTATION_BOOL("USE_CLOUD_TRANSMITTANCE");
class FAnistropicMaterials : SHADER_PERMUTATION_BOOL("SUPPORTS_ANISOTROPIC_MATERIALS");
// 宣告著色器排列域, 包含了上面定義的所有維度.
using FPermutationDomain = TShaderPermutationDomain<
FSourceShapeDim,
FSourceTextureDim,
FIESProfileDim,
FInverseSquaredDim,
FVisualizeCullingDim,
FLightingChannelsDim,
FTransmissionDim,
FHairLighting,
FAtmosphereTransmittance,
FCloudTransmittance,
FAnistropicMaterials>;
// 是否需要編譯指定的著色器排列.
static bool ShouldCompilePermutation(const FGlobalShaderPermutationParameters& Parameters)
{
// 獲取著色器排列的值.
FPermutationDomain PermutationVector(Parameters.PermutationId);
// For a directional light, the IES profile and inverse-squared falloff are meaningless, so those permutations need not be compiled.
if( PermutationVector.Get< FSourceShapeDim >() == ELightSourceShape::Directional && (
PermutationVector.Get< FIESProfileDim >() ||
PermutationVector.Get< FInverseSquaredDim >() ) )
{
return false;
}
// 如果不是平行光, 那麼大氣和雲體透射將沒有任何意義, 可以不編譯.
if (PermutationVector.Get< FSourceShapeDim >() != ELightSourceShape::Directional && (PermutationVector.Get<FAtmosphereTransmittance>() || PermutationVector.Get<FCloudTransmittance>()))
{
return false;
}
(......)
return IsFeatureLevelSupported(Parameters.Platform, ERHIFeatureLevel::SM5);
}
(......)
};
// 渲染光源.
void FDeferredShadingSceneRenderer::RenderLight(FRHICommandList& RHICmdList, ...)
{
(......)
for (int32 ViewIndex = 0; ViewIndex < Views.Num(); ViewIndex++)
{
FViewInfo& View = Views[ViewIndex];
(......)
if (LightSceneInfo->Proxy->GetLightType() == LightType_Directional)
{
(......)
// 宣告FDeferredLightPS的著色器排列的例項.
FDeferredLightPS::FPermutationDomain PermutationVector;
// 根據渲染狀態填充排列值.
PermutationVector.Set< FDeferredLightPS::FSourceShapeDim >( ELightSourceShape::Directional );
PermutationVector.Set< FDeferredLightPS::FIESProfileDim >( false );
PermutationVector.Set< FDeferredLightPS::FInverseSquaredDim >( false );
PermutationVector.Set< FDeferredLightPS::FVisualizeCullingDim >( View.Family->EngineShowFlags.VisualizeLightCulling );
PermutationVector.Set< FDeferredLightPS::FLightingChannelsDim >( View.bUsesLightingChannels );
PermutationVector.Set< FDeferredLightPS::FAnistropicMaterials >(ShouldRenderAnisotropyPass());
PermutationVector.Set< FDeferredLightPS::FTransmissionDim >( bTransmission );
PermutationVector.Set< FDeferredLightPS::FHairLighting>(0);
PermutationVector.Set< FDeferredLightPS::FAtmosphereTransmittance >(bAtmospherePerPixelTransmittance);
PermutationVector.Set< FDeferredLightPS::FCloudTransmittance >(bLight0CloudPerPixelTransmittance || bLight1CloudPerPixelTransmittance);
// 用填充好的排列從檢視的ShaderMap獲取對應的PS例項.
TShaderMapRef< FDeferredLightPS > PixelShader( View.ShaderMap, PermutationVector );
// 填充PS的其它資料.
GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);
PixelShader->SetParameters(RHICmdList, View, LightSceneInfo, ScreenShadowMaskTexture, LightingChannelsTexture, &RenderLightParams);
(......)
}
(......)
}
}
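From this we can see that a shader permutation is essentially a key with a variable number of dimensions. At shader compile time, the shader compiler tries to generate shader code for every distinct permutation, although ShouldCompilePermutation can exclude permutations that are meaningless. All precompiled shaders are stored in the view's ShaderMap. The value of each dimension can be computed dynamically at runtime; the resulting permutation domain is then used to fetch the matching compiled shader from the view's ShaderMap, after which the shader data is set and rendering proceeds.
It is also worth noting that the name passed to each permutation dimension's parent class (such as LIGHT_SOURCE_SHAPE, USE_SOURCE_TEXTURE, USE_IES_PROFILE, ...) is the macro name used in the HLSL code. For example, FSourceShapeDim controls LIGHT_SOURCE_SHAPE in HLSL: depending on its value, different code fragments are selected, which is how the different versions and branches of the shader code are controlled.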
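One way to picture how a multi-dimensional permutation collapses into a single permutation id is a mixed-radix number, where each dimension contributes one digit. The sketch below is a runtime model under that assumption. It is not the engine's template code: `Dimension`, `ToPermutationId`, `FromPermutationId`, and `TotalPermutationCount` are hypothetical names, the choice of the first dimension as the least significant digit is an assumption, and the dimension sizes in the usage note are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one permutation dimension:
// the HLSL macro it drives and how many values it can take.
struct Dimension {
    const char* DefineName;
    int Count;
};

// Flatten one value per dimension into a single id (first dimension = least significant digit).
int ToPermutationId(const std::vector<Dimension>& dims, const std::vector<int>& values) {
    int id = 0;
    int stride = 1;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        assert(values[i] >= 0 && values[i] < dims[i].Count);
        id += values[i] * stride;
        stride *= dims[i].Count;
    }
    return id;
}

// Inverse of ToPermutationId: recover the per-dimension values from an id.
std::vector<int> FromPermutationId(const std::vector<Dimension>& dims, int id) {
    std::vector<int> values(dims.size());
    for (std::size_t i = 0; i < dims.size(); ++i) {
        values[i] = id % dims[i].Count;
        id /= dims[i].Count;
    }
    return values;
}

// Upper bound on the number of shaders to compile before ShouldCompilePermutation filters any out.
int TotalPermutationCount(const std::vector<Dimension>& dims) {
    int total = 1;
    for (const Dimension& d : dims) total *= d.Count;
    return total;
}
```

For a domain made of LIGHT_SOURCE_SHAPE with 3 values plus two bool dimensions, this model yields 3 * 2 * 2 = 12 candidate permutations, which shows why every added dimension multiplies compile cost and why filtering in ShouldCompilePermutation matters.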
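8.3 Shader Mechanisms
This chapter analyzes some of the low-level mechanisms behind shaders, such as how Shader Maps are stored and the strategy for compiling and caching shaders.
8.3.1 Shader Map
A ShaderMap stores compiled shader code. There are three kinds: FGlobalShaderMap, FMaterialShaderMap, and FMeshMaterialShaderMap.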
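Conceptually, a shader map can be thought of as a lookup table keyed by shader type name and permutation id. The following is a minimal sketch of that idea only. `ShaderMapSketch` and `CompiledShader` are hypothetical names, and the real engine types use hashed names and frozen memory images rather than `std::map` and `std::string`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled shader; the real type holds bytecode and parameter bindings.
struct CompiledShader {
    std::string Bytecode;
};

// Hypothetical model of a shader map: (type name, permutation id) -> compiled shader.
class ShaderMapSketch {
public:
    void AddShader(const std::string& typeName, int permutationId, CompiledShader shader) {
        shaders_[std::make_pair(typeName, permutationId)] = std::move(shader);
    }

    // Returns nullptr when the permutation was never compiled,
    // for example because it was filtered out before compilation.
    const CompiledShader* GetShader(const std::string& typeName, int permutationId = 0) const {
        auto it = shaders_.find(std::make_pair(typeName, permutationId));
        return it == shaders_.end() ? nullptr : &it->second;
    }

    bool HasShader(const std::string& typeName, int permutationId) const {
        return GetShader(typeName, permutationId) != nullptr;
    }

private:
    std::map<std::pair<std::string, int>, CompiledShader> shaders_;
};
```

The point of the model is that a lookup needs both parts of the key: the same shader type with a different permutation id resolves to different compiled code, or to nothing at all.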
8.3.1.1 FShaderMapBase
This subsection first covers the basic types and concepts related to Shader Maps, as follows:
// Engine\Source\Runtime\Core\Public\Serialization\MemoryImage.h
// 指標表基類.
class FPointerTableBase
{
public:
virtual ~FPointerTableBase() {}
virtual int32 AddIndexedPointer(const FTypeLayoutDesc& TypeDesc, void* Ptr) = 0;
virtual void* GetIndexedPointer(const FTypeLayoutDesc& TypeDesc, uint32 i) const = 0;
};
// Engine\Source\Runtime\RenderCore\Public\Shader.h
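// Used to serialize, deserialize, compile, and cache a particular shader class. A single FShaderType can manage multiple FShader instances across several dimensions, such as EShaderPlatform or permutation id. The number of permutations of an FShaderType is simply given by GetPermutationCount().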
class FShaderType
{
public:
// 著色器種類, 有全域性, 材質, 網格材質, Niagara等.
enum class EShaderTypeForDynamicCast : uint32
{
Global,
Material,
MeshMaterial,
Niagara,
OCIO,
NumShaderTypes,
};
(......)
// 靜態資料獲取介面.
static TLinkedList<FShaderType*>*& GetTypeList();
static FShaderType* GetShaderTypeByName(const TCHAR* Name);
static TArray<const FShaderType*> GetShaderTypesByFilename(const TCHAR* Filename);
static TMap<FHashedName, FShaderType*>& GetNameToTypeMap();
static const TArray<FShaderType*>& GetSortedTypes(EShaderTypeForDynamicCast Type);
static void Initialize(const TMap<FString, TArray<const TCHAR*> >& ShaderFileToUniformBufferVariables);
static void Uninitialize();
// 建構函式.
FShaderType(...);
virtual ~FShaderType();
FShader* ConstructForDeserialization() const;
FShader* ConstructCompiled(const FShader::CompiledShaderInitializerType& Initializer) const;
bool ShouldCompilePermutation(...) const;
void ModifyCompilationEnvironment(...) const;
bool ValidateCompiledResult(...) const;
// 基於shader type的原始碼和包含計算雜湊值.
const FSHAHash& GetSourceHash(EShaderPlatform ShaderPlatform) const;
// 獲取FShaderType指標的雜湊值.
friend uint32 GetTypeHash(FShaderType* Ref);
// 訪問介面.
(......)
void AddReferencedUniformBufferIncludes(FShaderCompilerEnvironment& OutEnvironment, FString& OutSourceFilePrefix, EShaderPlatform Platform);
void FlushShaderFileCache(const TMap<FString, TArray<const TCHAR*> >& ShaderFileToUniformBufferVariables);
void GetShaderStableKeyParts(struct FStableShaderKeyAndValue& SaveKeyVal);
private:
EShaderTypeForDynamicCast ShaderTypeForDynamicCast;
const FTypeLayoutDesc* TypeLayout;
// 名稱.
const TCHAR* Name;
// 型別名.
FName TypeName;
// 雜湊名
FHashedName HashedName;
// 雜湊的原始碼檔名.
FHashedName HashedSourceFilename;
// 原始檔名.
const TCHAR* SourceFilename;
// Entry point name.
const TCHAR* FunctionName;
// 著色頻率.
uint32 Frequency;
uint32 TypeSize;
// 排列數量.
int32 TotalPermutationCount;
(......)
// 全域性的列表.
TLinkedList<FShaderType*> GlobalListLink;
protected:
bool bCachedUniformBufferStructDeclarations;
// 引用的Uniform Buffer包含的快取.
TMap<const TCHAR*, FCachedUniformBufferDeclaration> ReferencedUniformBufferStructsCache;
};
// 著色器對映表指標表
class FShaderMapPointerTable : public FPointerTableBase
{
public:
virtual int32 AddIndexedPointer(const FTypeLayoutDesc& TypeDesc, void* Ptr) override;
virtual void* GetIndexedPointer(const FTypeLayoutDesc& TypeDesc, uint32 i) const override;
virtual void SaveToArchive(FArchive& Ar, void* FrozenContent, bool bInlineShaderResources) const;
virtual void LoadFromArchive(FArchive& Ar, void* FrozenContent, bool bInlineShaderResources, bool bLoadedByCookedMaterial);
// 著色器型別
TPtrTable<FShaderType> ShaderTypes;
// 頂點工廠型別
TPtrTable<FVertexFactoryType> VFTypes;
};
// 包含編譯期狀態的著色器管線例項.
class FShaderPipeline
{
public:
explicit FShaderPipeline(const FShaderPipelineType* InType);
~FShaderPipeline();
// 增加著色器.
void AddShader(FShader* Shader, int32 PermutationId);
// 獲取著色器數量.
inline uint32 GetNumShaders() const;
// 查詢shader.
template<typename ShaderType>
ShaderType* GetShader(const FShaderMapPointerTable& InPtrTable);
FShader* GetShader(EShaderFrequency Frequency);
const FShader* GetShader(EShaderFrequency Frequency) const;
inline TArray<TShaderRef<FShader>> GetShaders(const FShaderMapBase& InShaderMap) const;
// 校驗.
void Validate(const FShaderPipelineType* InPipelineType) const;
// 處理編譯好的著色器程式碼.
void Finalize(const FShaderMapResourceCode* Code);
(......)
enum EFilter
{
EAll, // All pipelines
EOnlyShared, // Only pipelines with shared shaders
EOnlyUnique, // Only pipelines with unique shaders
};
// 雜湊值.
LAYOUT_FIELD(FHashedName, TypeName);
// 所有著色頻率的FShader例項.
LAYOUT_ARRAY(TMemoryImagePtr<FShader>, Shaders, SF_NumGraphicsFrequencies);
// 排列id.
LAYOUT_ARRAY(int32, PermutationIds, SF_NumGraphicsFrequencies);
};
// 著色器對映表內容.
class FShaderMapContent
{
public:
struct FProjectShaderPipelineToKey
{
inline FHashedName operator()(const FShaderPipeline* InShaderPipeline)
{ return InShaderPipeline->TypeName; }
};
explicit FShaderMapContent(EShaderPlatform InPlatform);
~FShaderMapContent();
EShaderPlatform GetShaderPlatform() const;
// 校驗.
void Validate(const FShaderMapBase& InShaderMap);
// 查詢shader.
template<typename ShaderType>
ShaderType* GetShader(int32 PermutationId = 0) const;
template<typename ShaderType>
ShaderType* GetShader( const typename ShaderType::FPermutationDomain& PermutationVector ) const;
FShader* GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
FShader* GetShader(const FHashedName& TypeName, int32 PermutationId = 0) const;
// Checks whether the given shader exists.
bool HasShader(const FHashedName& TypeName, int32 PermutationId) const;
bool HasShader(const FShaderType* Type, int32 PermutationId) const;
inline TArrayView<const TMemoryImagePtr<FShader>> GetShaders() const;
inline TArrayView<const TMemoryImagePtr<FShaderPipeline>> GetShaderPipelines() const;
// Interfaces for adding or finding shaders and pipelines.
void AddShader(const FHashedName& TypeName, int32 PermutationId, FShader* Shader);
FShader* FindOrAddShader(const FHashedName& TypeName, int32 PermutationId, FShader* Shader);
void AddShaderPipeline(FShaderPipeline* Pipeline);
FShaderPipeline* FindOrAddShaderPipeline(FShaderPipeline* Pipeline);
// Removal interfaces.
void RemoveShaderTypePermutaion(const FHashedName& TypeName, int32 PermutationId);
inline void RemoveShaderTypePermutaion(const FShaderType* Type, int32 PermutationId);
void RemoveShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);
// Gets the shader list.
void GetShaderList(const FShaderMapBase& InShaderMap, const FSHAHash& InMaterialShaderMapHash, TMap<FShaderId, TShaderRef<FShader>>& OutShaders) const;
void GetShaderList(const FShaderMapBase& InShaderMap, TMap<FHashedName, TShaderRef<FShader>>& OutShaders) const;
// Gets the shader pipeline list.
void GetShaderPipelineList(const FShaderMapBase& InShaderMap, TArray<FShaderPipelineRef>& OutShaderPipelines, FShaderPipeline::EFilter Filter) const;
(.......)
// Gets the maximum instruction count for shaders of the given type.
uint32 GetMaxNumInstructionsForShader(const FShaderMapBase& InShaderMap, FShaderType* ShaderType) const;
// Stores the compiled shader code.
void Finalize(const FShaderMapResourceCode* Code);
// Updates the hash.
void UpdateHash(FSHA1& Hasher) const;
protected:
using FMemoryImageHashTable = THashTable<FMemoryImageAllocator>;
// Hash table used to look up shaders.
LAYOUT_FIELD(FMemoryImageHashTable, ShaderHash);
// Shader type names.
LAYOUT_FIELD(TMemoryImageArray<FHashedName>, ShaderTypes);
// Shader permutation ids.
LAYOUT_FIELD(TMemoryImageArray<int32>, ShaderPermutations);
// Shader instances.
LAYOUT_FIELD(TMemoryImageArray<TMemoryImagePtr<FShader>>, Shaders);
// Shader pipelines.
LAYOUT_FIELD(TMemoryImageArray<TMemoryImagePtr<FShaderPipeline>>, ShaderPipelines);
// Platform the shaders were compiled for.
LAYOUT_FIELD(TEnumAsByte<EShaderPlatform>, Platform);
};
// Base class of the shader maps (TShaderMap).
class FShaderMapBase
{
public:
(......)
private:
const FTypeLayoutDesc& ContentTypeLayout;
// Shader map resource.
TRefCountPtr<FShaderMapResource> Resource;
// Shader map resource code.
TRefCountPtr<FShaderMapResourceCode> Code;
// Shader map pointer table.
FShaderMapPointerTable* PointerTable;
// Shader map content.
FShaderMapContent* Content;
// Size of the frozen content.
uint32 FrozenContentSize;
// Number of frozen shaders.
uint32 NumFrozenShaders;
};
// Shader map. The FShaderMapContent and FShaderMapPointerTable types must be specified.
template<typename ContentType, typename PointerTableType = FShaderMapPointerTable>
class TShaderMap : public FShaderMapBase
{
public:
inline const PointerTableType& GetPointerTable();
inline const ContentType* GetContent() const;
inline ContentType* GetMutableContent();
void FinalizeContent()
{
ContentType* LocalContent = this->GetMutableContent();
LocalContent->Finalize(this->GetResourceCode());
FShaderMapBase::FinalizeContent();
}
protected:
TShaderMap();
virtual FShaderMapPointerTable* CreatePointerTable();
};
// Reference to a shader pipeline.
class FShaderPipelineRef
{
public:
FShaderPipelineRef();
FShaderPipelineRef(FShaderPipeline* InPipeline, const FShaderMapBase& InShaderMap);
(......)
// Gets a shader.
template<typename ShaderType>
TShaderRef<ShaderType> GetShader() const;
TShaderRef<FShader> GetShader(EShaderFrequency Frequency) const;
inline TArray<TShaderRef<FShader>> GetShaders() const;
// Accessors for the pipeline, the resource, and so on.
inline FShaderPipeline* GetPipeline() const;
FShaderMapResource* GetResource() const;
const FShaderMapPointerTable& GetPointerTable() const;
inline FShaderPipeline* operator->() const;
private:
FShaderPipeline* ShaderPipeline; // The shader pipeline.
const FShaderMapBase* ShaderMap; // The owning shader map.
};
Many of the types above are base classes; the concrete logic is implemented by their subclasses.
8.3.1.2 FGlobalShaderMap
FGlobalShaderMap stores and manages all compiled FGlobalShader code. Its definition and related types are shown below:
// Engine\Source\Runtime\RenderCore\Public\GlobalShader.h
// Shader meta type for the simplest shaders (not linked to a material or a vertex factory). Each simple shader should have only one instance.
class FGlobalShaderType : public FShaderType
{
friend class FGlobalShaderTypeCompiler;
public:
typedef FShader::CompiledShaderInitializerType CompiledShaderInitializerType;
FGlobalShaderType(...);
bool ShouldCompilePermutation(EShaderPlatform Platform, int32 PermutationId) const;
void SetupCompileEnvironment(EShaderPlatform Platform, int32 PermutationId, FShaderCompilerEnvironment& Environment);
};
// Content of a global shader map section (a sub-map of the global shader map).
class FGlobalShaderMapContent : public FShaderMapContent
{
(......)
public:
const FHashedName& GetHashedSourceFilename();
private:
inline FGlobalShaderMapContent(EShaderPlatform InPlatform, const FHashedName& InHashedSourceFilename);
// Hashed source filename.
LAYOUT_FIELD(FHashedName, HashedSourceFilename);
};
class FGlobalShaderMapSection : public TShaderMap<FGlobalShaderMapContent, FShaderMapPointerTable>
{
(......)
private:
inline FGlobalShaderMapSection();
inline FGlobalShaderMapSection(EShaderPlatform InPlatform, const FHashedName& InHashedSourceFilename);
TShaderRef<FShader> GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
FShaderPipelineRef GetShaderPipeline(const FShaderPipelineType* PipelineType) const;
};
// The global shader map.
class FGlobalShaderMap
{
public:
explicit FGlobalShaderMap(EShaderPlatform InPlatform);
~FGlobalShaderMap();
// Gets the compiled shader for a shader type and permutation id.
TShaderRef<FShader> GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
// Gets the compiled shader for a permutation id.
template<typename ShaderType>
TShaderRef<ShaderType> GetShader(int32 PermutationId = 0) const
{
TShaderRef<FShader> Shader = GetShader(&ShaderType::StaticType, PermutationId);
return TShaderRef<ShaderType>::Cast(Shader);
}
// Gets the compiled shader for a permutation vector of the shader type.
template<typename ShaderType>
TShaderRef<ShaderType> GetShader(const typename ShaderType::FPermutationDomain& PermutationVector) const
{
return GetShader<ShaderType>(PermutationVector.ToDimensionValueId());
}
// Checks whether the given shader exists.
bool HasShader(FShaderType* Type, int32 PermutationId) const
{
return GetShader(Type, PermutationId).IsValid();
}
// Gets a shader pipeline.
FShaderPipelineRef GetShaderPipeline(const FShaderPipelineType* PipelineType) const;
// Checks whether the shader pipeline exists.
bool HasShaderPipeline(const FShaderPipelineType* ShaderPipelineType) const
{
return GetShaderPipeline(ShaderPipelineType).IsValid();
}
bool IsEmpty() const;
void Empty();
void ReleaseAllSections();
// Finds or adds a shader.
FShader* FindOrAddShader(const FShaderType* ShaderType, int32 PermutationId, FShader* Shader);
// Finds or adds a shader pipeline.
FShaderPipeline* FindOrAddShaderPipeline(const FShaderPipelineType* ShaderPipelineType, FShaderPipeline* ShaderPipeline);
// Removal interfaces.
void RemoveShaderTypePermutaion(const FShaderType* Type, int32 PermutationId);
void RemoveShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);
// Shader map section operations.
void AddSection(FGlobalShaderMapSection* InSection);
FGlobalShaderMapSection* FindSection(const FHashedName& HashedShaderFilename);
FGlobalShaderMapSection* FindOrAddSection(const FShaderType* ShaderType);
// I/O interfaces.
void LoadFromGlobalArchive(FArchive& Ar);
void SaveToGlobalArchive(FArchive& Ar);
// Begins creating all shaders.
void BeginCreateAllShaders();
(......)
private:
// Map that stores the FGlobalShaderMapSection instances.
TMap<FHashedName, FGlobalShaderMapSection*> SectionMap;
EShaderPlatform Platform;
};
// Array of global shader maps, one per shader platform. SP_NumPlatforms is 49 in the engine version discussed here.
extern RENDERCORE_API FGlobalShaderMap* GGlobalShaderMap[SP_NumPlatforms];
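The code above touches on the Content, Section, PointerTable, and ShaderType aspects of a shader map. There is a lot of data and the relationships are complex, but they become much clearer once abstracted into a UML diagram: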
FShaderType <|-- FGlobalShaderType
FPointerTableBase <|-- FShaderMapPointerTable
FShaderMapContent <|-- FGlobalShaderMapContent
FShaderMapBase <|-- TShaderMap
TShaderMap <|-- FGlobalShaderMapSection
FShaderPipeline <-- FShaderPipelineRef
For brevity, the class diagram above shows mainly inheritance. With association, aggregation, and composition added, it looks like this:
FShaderType <|-- FGlobalShaderType
FPointerTableBase <|-- FShaderMapPointerTable
FShaderMapContent <|-- FGlobalShaderMapContent
FShaderMapBase <|-- TShaderMap
TShaderMap <|-- FGlobalShaderMapSection
FShaderPipeline <-- FShaderPipelineRef
FShader o-- FShaderPipeline
class FShaderPipeline{
FShader Shaders[5]
}
FShaderPipeline <-- FShaderMapContent
FShaderType <-- FShaderMapContent
FShader o-- FShaderMapContent
class FShaderMapContent{
FHashedName ShaderTypes
FShader Shaders
FShaderPipeline ShaderPipelines
}
FShaderMapContent <-- FShaderMapBase
FShaderMapPointerTable <-- FShaderMapBase
class FShaderMapBase{
FShaderMapPointerTable* PointerTable
FShaderMapContent* Content
}
class FShaderPipelineRef{
FShaderPipeline* ShaderPipeline
}
class FGlobalShaderMapContent{
FHashedName HashedSourceFilename
}
FGlobalShaderMapSection o-- FGlobalShaderMap
class FGlobalShaderMap{
TMap<FHashedName, FGlobalShaderMapSection*> SectionMap
}
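The two-level layout in the diagram (a section per shader source file, then a lookup by shader type and permutation id inside the section) can be illustrated with a small standalone sketch. This is not engine code: `GlobalMap`, `Section`, and `Shader` are hypothetical stand-ins, and plain strings take the place of FHashedName.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled shader; not the engine's FShader.
struct Shader {
    std::string TypeName;
    int32_t PermutationId;
};

// One section per shader source file, playing the role of FGlobalShaderMapSection.
class Section {
public:
    Shader* FindOrAdd(const std::string& TypeName, int32_t PermutationId) {
        const auto Key = std::make_pair(TypeName, PermutationId);
        auto It = Shaders.find(Key);
        if (It == Shaders.end()) {
            It = Shaders.emplace(Key, std::make_unique<Shader>(Shader{TypeName, PermutationId})).first;
        }
        return It->second.get();
    }

    const Shader* Find(const std::string& TypeName, int32_t PermutationId) const {
        auto It = Shaders.find(std::make_pair(TypeName, PermutationId));
        return It == Shaders.end() ? nullptr : It->second.get();
    }

private:
    // Keyed by (shader type name, permutation id).
    std::map<std::pair<std::string, int32_t>, std::unique_ptr<Shader>> Shaders;
};

// Top-level map, playing the role of FGlobalShaderMap::SectionMap.
class GlobalMap {
public:
    // SourceFile stands in for the hashed source filename of the shader type.
    Shader* FindOrAddShader(const std::string& SourceFile, const std::string& TypeName, int32_t PermutationId) {
        return Sections[SourceFile].FindOrAdd(TypeName, PermutationId);
    }

    const Shader* GetShader(const std::string& SourceFile, const std::string& TypeName, int32_t PermutationId = 0) const {
        auto It = Sections.find(SourceFile);
        return It == Sections.end() ? nullptr : It->second.Find(TypeName, PermutationId);
    }

    bool HasShader(const std::string& SourceFile, const std::string& TypeName, int32_t PermutationId) const {
        return GetShader(SourceFile, TypeName, PermutationId) != nullptr;
    }

    std::size_t NumSections() const { return Sections.size(); }

private:
    std::map<std::string, Section> Sections;
};
```

Grouping shaders by source file means that shaders defined in the same file share one section, which is presumably why FGlobalShaderMapContent carries a HashedSourceFilename.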
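With FGlobalShaderMap and the relationships among its core classes covered, let's look at how it is applied in actual rendering. First, GlobalShader.h and GlobalShader.cpp declare and define the FGlobalShaderMap instances and the related interfaces: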
// Engine\Source\Runtime\RenderCore\Public\GlobalShader.h
// Declares the externally accessible array of FGlobalShaderMap pointers.
extern RENDERCORE_API FGlobalShaderMap* GGlobalShaderMap[SP_NumPlatforms];
// Gets the FGlobalShaderMap for the given shader platform.
extern RENDERCORE_API FGlobalShaderMap* GetGlobalShaderMap(EShaderPlatform Platform);
// Gets the FGlobalShaderMap for the given feature level.
inline FGlobalShaderMap* GetGlobalShaderMap(ERHIFeatureLevel::Type FeatureLevel)
{
return GetGlobalShaderMap(GShaderPlatformForFeatureLevel[FeatureLevel]);
}
// Engine\Source\Runtime\RenderCore\Private\GlobalShader.cpp
// Defines the FGlobalShaderMap array that covers all shader platforms.
FGlobalShaderMap* GGlobalShaderMap[SP_NumPlatforms] = {};
// Gets the FGlobalShaderMap.
FGlobalShaderMap* GetGlobalShaderMap(EShaderPlatform Platform)
{
return GGlobalShaderMap[Platform];
}
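However, the code above only defines GGlobalShaderMap, and every entry of the array is still null. The call chain that actually creates the instances is as follows: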
// Engine\Source\Runtime\Launch\Private\LaunchEngineLoop.cpp
// Engine pre-initialization.
int32 FEngineLoop::PreInitPreStartupScreen(const TCHAR* CmdLine)
{
(......)
// Whether shader compilation is enabled; it normally is.
bool bEnableShaderCompile = !FParse::Param(FCommandLine::Get(), TEXT("NoShaderCompile"));
(......)
if (bEnableShaderCompile && !IsRunningDedicatedServer() && !bIsCook)
{
(......)
// Compile the global shader map.
CompileGlobalShaderMap(false);
(......)
}
(......)
}
// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp
void CompileGlobalShaderMap(EShaderPlatform Platform, const ITargetPlatform* TargetPlatform, bool bRefreshShaderMap)
{
(......)
// Create the global shader map for this platform if it does not exist yet.
if (!GGlobalShaderMap[Platform])
{
(......)
// Create the FGlobalShaderMap for this platform.
GGlobalShaderMap[Platform] = new FGlobalShaderMap(Platform);
// Cooked mode.
if (FPlatformProperties::RequiresCookedData())
{
(......)
}
// Uncooked mode.
else
{
// Id of the FGlobalShaderMap.
FGlobalShaderMapId ShaderMapId(Platform);
const int32 ShaderFilenameNum = ShaderMapId.GetShaderFilenameToDependeciesMap().Num();
const float ProgressStep = 25.0f / ShaderFilenameNum;
TArray<uint32> AsyncDDCRequestHandles;
AsyncDDCRequestHandles.SetNum(ShaderFilenameNum);
int32 HandleIndex = 0;
// Submit the DDC requests.
for (const auto& ShaderFilenameDependencies : ShaderMapId.GetShaderFilenameToDependeciesMap())
{
SlowTask.EnterProgressFrame(ProgressStep);
const FString DataKey = GetGlobalShaderMapKeyString(ShaderMapId, Platform, TargetPlatform, ShaderFilenameDependencies.Value);
AsyncDDCRequestHandles[HandleIndex] = GetDerivedDataCacheRef().GetAsynchronous(*DataKey, TEXT("GlobalShaderMap"_SV));
++HandleIndex;
}
// Process the completed DDC requests.
TArray<uint8> CachedData;
HandleIndex = 0;
for (const auto& ShaderFilenameDependencies : ShaderMapId.GetShaderFilenameToDependeciesMap())
{
SlowTask.EnterProgressFrame(ProgressStep);
CachedData.Reset();
GetDerivedDataCacheRef().WaitAsynchronousCompletion(AsyncDDCRequestHandles[HandleIndex]);
if (GetDerivedDataCacheRef().GetAsynchronousResults(AsyncDDCRequestHandles[HandleIndex], CachedData))
{
FMemoryReader MemoryReader(CachedData);
GGlobalShaderMap[Platform]->AddSection(FGlobalShaderMapSection::CreateFromArchive(MemoryReader));
}
else
{
// Not found in the DDC; skip it.
}
++HandleIndex;
}
}
// Compile any shaders that were not loaded.
VerifyGlobalShaders(Platform, bLoadedFromCacheFile);
// Create all shaders.
if (GCreateShadersOnLoad && Platform == GMaxRHIShaderPlatform)
{
GGlobalShaderMap[Platform]->BeginCreateAllShaders();
}
}
}
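As shown above, the FGlobalShaderMap instance is created during engine pre-initialization. The engine then tries to read already-compiled shader data from the DDC (Derived Data Cache) and compiles whatever is still missing. From that point on, other modules can access and operate on the FGlobalShaderMap objects normally.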
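The "try the cache first, compile only what is missing" flow can be reduced to a minimal sketch. Everything here is hypothetical: `Cache` is a plain map standing in for the derived data cache, `MakeKey` is an invented key scheme (the real key also encodes dependency hashes), and the function only reports which source files would still need compiling.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the derived data cache: key -> serialized section.
using Cache = std::map<std::string, std::string>;

struct LoadResult {
    std::set<std::string> Loaded;        // source files restored from the cache
    std::vector<std::string> ToCompile;  // source files that still need compiling
};

// Invented key scheme for illustration only.
std::string MakeKey(const std::string& Platform, const std::string& SourceFile) {
    return Platform + "|" + SourceFile;
}

// Looks up every source file in the cache and collects the misses.
LoadResult LoadGlobalShaders(const std::string& Platform,
                             const std::vector<std::string>& SourceFiles,
                             const Cache& Ddc) {
    LoadResult Result;
    for (const std::string& File : SourceFiles) {
        if (Ddc.count(MakeKey(Platform, File)) > 0) {
            Result.Loaded.insert(File);      // cache hit: a section would be deserialized here
        } else {
            Result.ToCompile.push_back(File); // cache miss: left for the verify-and-compile step
        }
    }
    return Result;
}
```

In the engine excerpt the misses are not collected explicitly; VerifyGlobalShaders later walks all shader types and queues jobs for whatever the map does not contain.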
In addition, FViewInfo also holds a pointer to an FGlobalShaderMap, which it likewise obtains through GetGlobalShaderMap:
// Engine\Source\Runtime\Renderer\Private\SceneRendering.h
class FViewInfo : public FSceneView
{
public:
(......)
FGlobalShaderMap* ShaderMap;
(......)
};
// Engine\Source\Runtime\Renderer\Private\SceneRendering.cpp
void FViewInfo::Init()
{
(......)
ShaderMap = GetGlobalShaderMap(FeatureLevel);
(......)
}
Since most logic in the renderer module has easy access to an FViewInfo instance, it can also reach the FGlobalShaderMap conveniently, without having to specify the FeatureLevel.
8.3.1.3 FMaterialShaderMap
FMaterialShaderMap stores and manages a set of FMaterialShader instances. It and its related types are defined as follows:
// Engine\Source\Runtime\Engine\Public\MaterialShared.h
// Material shader map content.
class FMaterialShaderMapContent : public FShaderMapContent
{
public:
(......)
inline uint32 GetNumShaders() const;
inline uint32 GetNumShaderPipelines() const;
private:
struct FProjectMeshShaderMapToKey
{
inline const FHashedName& operator()(const FMeshMaterialShaderMap* InShaderMap) { return InShaderMap->GetVertexFactoryTypeName(); }
};
// Get/add/remove operations.
FMeshMaterialShaderMap* GetMeshShaderMap(const FHashedName& VertexFactoryTypeName) const;
void AddMeshShaderMap(const FVertexFactoryType* VertexFactoryType, FMeshMaterialShaderMap* MeshShaderMap);
void RemoveMeshShaderMap(const FVertexFactoryType* VertexFactoryType);
// Ordered mesh shader maps, indexed by VFType->GetId(), for fast lookup at runtime.
LAYOUT_FIELD(TMemoryImageArray<TMemoryImagePtr<FMeshMaterialShaderMap>>, OrderedMeshShaderMaps);
// Material compilation output.
LAYOUT_FIELD(FMaterialCompilationOutput, MaterialCompilationOutput);
// Hash of the shader content.
LAYOUT_FIELD(FSHAHash, ShaderContentHash);
LAYOUT_FIELD_EDITORONLY(TMemoryImageArray<FMaterialProcessedSource>, ShaderProcessedSource);
LAYOUT_FIELD_EDITORONLY(FMemoryImageString, FriendlyName);
LAYOUT_FIELD_EDITORONLY(FMemoryImageString, DebugDescription);
LAYOUT_FIELD_EDITORONLY(FMemoryImageString, MaterialPath);
};
// Material shader map; its parent class is TShaderMap.
class FMaterialShaderMap : public TShaderMap<FMaterialShaderMapContent, FShaderMapPointerTable>, public FDeferredCleanupInterface
{
public:
using Super = TShaderMap<FMaterialShaderMapContent, FShaderMapPointerTable>;
// Finds the FMaterialShaderMap instance for the given id and platform.
static TRefCountPtr<FMaterialShaderMap> FindId(const FMaterialShaderMapId& ShaderMapId, EShaderPlatform Platform);
(......)
// ShaderMap interface
// Gets a shader instance.
TShaderRef<FShader> GetShader(FShaderType* ShaderType, int32 PermutationId = 0) const;
template<typename ShaderType> TShaderRef<ShaderType> GetShader(int32 PermutationId = 0) const;
template<typename ShaderType> TShaderRef<ShaderType> GetShader(const typename ShaderType::FPermutationDomain& PermutationVector) const;
uint32 GetMaxNumInstructionsForShader(FShaderType* ShaderType) const;
void FinalizeContent();
// Compiles the shaders of a material and caches them in the shader map.
void Compile(FMaterial* Material,const FMaterialShaderMapId& ShaderMapId, TRefCountPtr<FShaderCompilerEnvironment> MaterialEnvironment, const FMaterialCompilationOutput& InMaterialCompilationOutput, EShaderPlatform Platform, bool bSynchronousCompile);
// Checks whether any shaders are missing.
bool IsComplete(const FMaterial* Material, bool bSilent);
// Tries to attach the material to an existing compilation task.
bool TryToAddToExistingCompilationTask(FMaterial* Material);
// Builds the list of shaders in the shader map.
void GetShaderList(TMap<FShaderId, TShaderRef<FShader>>& OutShaders) const;
void GetShaderList(TMap<FHashedName, TShaderRef<FShader>>& OutShaders) const;
void GetShaderPipelineList(TArray<FShaderPipelineRef>& OutShaderPipelines) const;
uint32 GetShaderNum() const;
// Registers a material shader map in the global map so that materials can use it.
void Register(EShaderPlatform InShaderPlatform);
// Reference counting.
void AddRef();
void Release();
// Remove all cached entries for the given shader type, pipeline type, or vertex factory type.
void FlushShadersByShaderType(const FShaderType* ShaderType);
void FlushShadersByShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);
void FlushShadersByVertexFactoryType(const FVertexFactoryType* VertexFactoryType);
static void RemovePendingMaterial(FMaterial* Material);
static const FMaterialShaderMap* GetShaderMapBeingCompiled(const FMaterial* Material);
// Accessors.
FMeshMaterialShaderMap* GetMeshShaderMap(FVertexFactoryType* VertexFactoryType) const;
FMeshMaterialShaderMap* GetMeshShaderMap(const FHashedName& VertexFactoryTypeName) const;
const FMaterialShaderMapId& GetShaderMapId() const;
(......)
private:
// The global material shader maps.
static TMap<FMaterialShaderMapId,FMaterialShaderMap*> GIdToMaterialShaderMap[SP_NumPlatforms];
static FCriticalSection GIdToMaterialShaderMapCS;
// Materials currently being compiled.
static TMap<TRefCountPtr<FMaterialShaderMap>, TArray<FMaterial*> > ShaderMapsBeingCompiled;
// Shader map id.
FMaterialShaderMapId ShaderMapId;
// Id used during compilation.
uint32 CompilingId;
// The target platform being compiled for.
const ITargetPlatform* CompilingTargetPlatform;
// Reference count.
mutable int32 NumRefs;
// Flags.
bool bDeletedThroughDeferredCleanup;
uint32 bRegistered : 1;
uint32 bCompilationFinalized : 1;
uint32 bCompiledSuccessfully : 1;
uint32 bIsPersistent : 1;
(......)
};
Unlike FGlobalShaderMap, FMaterialShaderMap is additionally associated with a material and with vertex factories. The internal data of a single FMaterialShaderMap looks like this:
FMaterialShaderMap
    FLightFunctionPixelShader - FMaterialShaderType
    FLocalVertexFactory - FVertexFactoryType
        TDepthOnlyPS - FMeshMaterialShaderType
        TDepthOnlyVS - FMeshMaterialShaderType
        TBasePassPS - FMeshMaterialShaderType
        TBasePassVS - FMeshMaterialShaderType
        (......)
    FGPUSkinVertexFactory - FVertexFactoryType
        (......)
FMaterialShaderMap is bound to a material blueprint because it is a member of FMaterial:
// Engine\Source\Runtime\Engine\Public\MaterialShared.h
class FMaterial
{
public:
// Gets a shader instance of the material.
TShaderRef<FShader> GetShader(class FMeshMaterialShaderType* ShaderType, FVertexFactoryType* VertexFactoryType, int32 PermutationId, bool bFatalIfMissing = true) const;
(......)
private:
// Material shader map used by the game thread.
TRefCountPtr<FMaterialShaderMap> GameThreadShaderMap;
// Material shader map used by the rendering thread.
TRefCountPtr<FMaterialShaderMap> RenderingThreadShaderMap;
(......)
};
// Engine\Source\Runtime\Engine\Private\Materials\MaterialShared.cpp
TShaderRef<FShader> FMaterial::GetShader(FMeshMaterialShaderType* ShaderType, FVertexFactoryType* VertexFactoryType, int32 PermutationId, bool bFatalIfMissing) const
{
// Fetch the shader from RenderingThreadShaderMap.
const FMeshMaterialShaderMap* MeshShaderMap = RenderingThreadShaderMap->GetMeshShaderMap(VertexFactoryType);
FShader* Shader = MeshShaderMap ? MeshShaderMap->GetShader(ShaderType, PermutationId) : nullptr;
(......)
// Return a reference to the FShader.
return TShaderRef<FShader>(Shader, *RenderingThreadShaderMap);
}
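We can see that each FMaterial has an FMaterialShaderMap (one for the game thread and one for the rendering thread). To obtain a shader of a given type for an FMaterial, you fetch it from that FMaterial's FMaterialShaderMap instance, which is what links the two together.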
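The lookup path in FMaterial::GetShader (vertex factory first, then shader type and permutation, tolerating a missing mesh shader map) can be sketched in isolation. The types below are hypothetical stand-ins rather than engine classes: an `int` handle replaces FShader, and a numeric id replaces FVertexFactoryType.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for FMeshMaterialShaderMap: the shaders of one vertex factory type.
struct MeshShaderMap {
    // (shader type name, permutation id) -> opaque shader handle (hypothetical).
    std::map<std::pair<std::string, int>, int> Shaders;

    const int* GetShader(const std::string& TypeName, int PermutationId) const {
        auto It = Shaders.find(std::make_pair(TypeName, PermutationId));
        return It == Shaders.end() ? nullptr : &It->second;
    }
};

// Hypothetical stand-in for FMaterialShaderMap.
struct MaterialShaderMap {
    // Indexed by vertex factory id; a null slot means nothing was compiled for it.
    std::vector<std::unique_ptr<MeshShaderMap>> OrderedMeshShaderMaps;

    MeshShaderMap& AddMeshShaderMap(std::size_t VertexFactoryId) {
        if (OrderedMeshShaderMaps.size() <= VertexFactoryId) {
            OrderedMeshShaderMaps.resize(VertexFactoryId + 1);
        }
        OrderedMeshShaderMaps[VertexFactoryId] = std::make_unique<MeshShaderMap>();
        return *OrderedMeshShaderMaps[VertexFactoryId];
    }

    const MeshShaderMap* GetMeshShaderMap(std::size_t VertexFactoryId) const {
        return VertexFactoryId < OrderedMeshShaderMaps.size()
            ? OrderedMeshShaderMaps[VertexFactoryId].get()
            : nullptr;
    }
};

// Null-tolerant two-step lookup: mesh shader map first, then the shader inside it.
const int* GetShader(const MaterialShaderMap& Map, std::size_t VertexFactoryId,
                     const std::string& TypeName, int PermutationId) {
    const MeshShaderMap* Mesh = Map.GetMeshShaderMap(VertexFactoryId);
    return Mesh ? Mesh->GetShader(TypeName, PermutationId) : nullptr;
}
```

Indexing by a dense vertex factory id keeps the first step a plain array access, which matches the "fast lookup at runtime" intent noted on OrderedMeshShaderMaps.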
8.3.1.4 FMeshMaterialShaderMap
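The previous subsections explained that FGlobalShaderMap stores and manages FGlobalShader, and that FMaterialShaderMap stores and manages FMaterialShader. Correspondingly, FMeshMaterialShaderMap stores and manages FMeshMaterialShader. Its definition is as follows: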
// Engine\Source\Runtime\Engine\Public\MaterialShared.h
class FMeshMaterialShaderMap : public FShaderMapContent
{
public:
FMeshMaterialShaderMap(EShaderPlatform InPlatform, FVertexFactoryType* InVFType);
// Begins compiling all shaders for the given material and vertex factory type.
uint32 BeginCompile(
uint32 ShaderMapId,
const FMaterialShaderMapId& InShaderMapId,
const FMaterial* Material,
const FMeshMaterialShaderMapLayout& MeshLayout,
FShaderCompilerEnvironment* MaterialEnvironment,
EShaderPlatform Platform,
TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>>& NewJobs,
FString DebugDescription,
FString DebugExtension
);
void FlushShadersByShaderType(const FShaderType* ShaderType);
void FlushShadersByShaderPipelineType(const FShaderPipelineType* ShaderPipelineType);
(......)
private:
// Vertex factory type name.
LAYOUT_FIELD(FHashedName, VertexFactoryTypeName);
};
An FMeshMaterialShaderMap is normally not created on its own. Instead it is attached to an FMaterialShaderMapContent and is created and destroyed along with it; see the previous subsection for details and usage.
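8.3.2 Shader Compilation
This section covers how material blueprints and usf files are compiled into shader code for the target platform. To make the compilation of a single shader file easier to follow, let's trace how the RecompileShaders command is handled (the shader being compiled here is a global shader):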
// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp
bool RecompileShaders(const TCHAR* Cmd, FOutputDevice& Ar)
{
(......)
FString FlagStr(FParse::Token(Cmd, 0));
if( FlagStr.Len() > 0 )
{
// Flush the shader file cache.
FlushShaderFileCache();
// Flush the rendering commands.
FlushRenderingCommands();
// Handle the `RecompileShaders Changed` command.
if( FCString::Stricmp(*FlagStr,TEXT("Changed"))==0)
{
(......)
}
// Handle the `RecompileShaders Global` command.
else if( FCString::Stricmp(*FlagStr,TEXT("Global"))==0)
{
(......)
}
// Handle the `RecompileShaders Material` command.
else if( FCString::Stricmp(*FlagStr,TEXT("Material"))==0)
{
(......)
}
// Handle the `RecompileShaders All` command.
else if( FCString::Stricmp(*FlagStr,TEXT("All"))==0)
{
(......)
}
// Handle the `RecompileShaders <ShaderPath>` command.
else
{
// Get the FShaderTypes by filename.
TArray<const FShaderType*> ShaderTypes = FShaderType::GetShaderTypesByFilename(*FlagStr);
// Get the FShaderPipelineTypes by filename.
TArray<const FShaderPipelineType*> ShaderPipelineTypes = FShaderPipelineType::GetShaderPipelineTypesByFilename(*FlagStr);
if (ShaderTypes.Num() > 0 || ShaderPipelineTypes.Num() > 0)
{
FRecompileShadersTimer TestTimer(TEXT("RecompileShaders SingleShader"));
TArray<const FVertexFactoryType*> FactoryTypes;
// Iterate over all active feature levels and compile for each of them.
UMaterialInterface::IterateOverActiveFeatureLevels([&](ERHIFeatureLevel::Type InFeatureLevel) {
auto ShaderPlatform = GShaderPlatformForFeatureLevel[InFeatureLevel];
// Begin compiling the shaders for the given ShaderTypes, ShaderPipelineTypes, and ShaderPlatform.
BeginRecompileGlobalShaders(ShaderTypes, ShaderPipelineTypes, ShaderPlatform);
// Finish compilation.
FinishRecompileGlobalShaders();
});
}
}
return 1;
}
(......)
}
The code above enters the key function BeginRecompileGlobalShaders to start compiling the specified shaders:
void BeginRecompileGlobalShaders(const TArray<const FShaderType*>& OutdatedShaderTypes, const TArray<const FShaderPipelineType*>& OutdatedShaderPipelineTypes, EShaderPlatform ShaderPlatform, const ITargetPlatform* TargetPlatform)
{
if (!FPlatformProperties::RequiresCookedData())
{
// Flush pending accesses to the existing global shaders.
FlushRenderingCommands();
// Compile the global shader map.
CompileGlobalShaderMap(ShaderPlatform, TargetPlatform, false);
// Verify the shaders and recompile the outdated ones.
FGlobalShaderMap* GlobalShaderMap = GetGlobalShaderMap(ShaderPlatform);
if (OutdatedShaderTypes.Num() > 0 || OutdatedShaderPipelineTypes.Num() > 0)
{
VerifyGlobalShaders(ShaderPlatform, false, &OutdatedShaderTypes, &OutdatedShaderPipelineTypes);
}
}
}
// Compiles a single global shader map.
void CompileGlobalShaderMap(EShaderPlatform Platform, const ITargetPlatform* TargetPlatform, bool bRefreshShaderMap)
{
(......)
// Delete the old resources.
if (bRefreshShaderMap || GGlobalShaderTargetPlatform[Platform] != TargetPlatform)
{
delete GGlobalShaderMap[Platform];
GGlobalShaderMap[Platform] = nullptr;
GGlobalShaderTargetPlatform[Platform] = TargetPlatform;
// Make sure we pick up updated shader source files.
FlushShaderFileCache();
}
// Create and compile the shaders.
if (!GGlobalShaderMap[Platform])
{
(......)
GGlobalShaderMap[Platform] = new FGlobalShaderMap(Platform);
(......)
// Check whether any shaders are not loaded, and compile them if so.
VerifyGlobalShaders(Platform, bLoadedFromCacheFile);
if (GCreateShadersOnLoad && Platform == GMaxRHIShaderPlatform)
{
GGlobalShaderMap[Platform]->BeginCreateAllShaders();
}
}
}
// Checks whether any shaders are not loaded, and compiles them if so.
void VerifyGlobalShaders(EShaderPlatform Platform, bool bLoadedFromCacheFile, const TArray<const FShaderType*>* OutdatedShaderTypes, const TArray<const FShaderPipelineType*>* OutdatedShaderPipelineTypes)
{
(......)
// Get the FGlobalShaderMap instance.
FGlobalShaderMap* GlobalShaderMap = GetGlobalShaderMap(Platform);
(......)
// All jobs, both single-shader and pipeline.
TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> GlobalShaderJobs;
// Add the single-shader jobs first.
TMap<TShaderTypePermutation<const FShaderType>, FShaderCompileJob*> SharedShaderJobs;
for (TLinkedList<FShaderType*>::TIterator ShaderTypeIt(FShaderType::GetTypeList()); ShaderTypeIt; ShaderTypeIt.Next())
{
FGlobalShaderType* GlobalShaderType = ShaderTypeIt->GetGlobalShaderType();
if (!GlobalShaderType)
{
continue;
}
int32 PermutationCountToCompile = 0;
for (int32 PermutationId = 0; PermutationId < GlobalShaderType->GetPermutationCount(); PermutationId++)
{
if (GlobalShaderType->ShouldCompilePermutation(Platform, PermutationId)
&& (!GlobalShaderMap->HasShader(GlobalShaderType, PermutationId) || (OutdatedShaderTypes && OutdatedShaderTypes->Contains(GlobalShaderType))))
{
// When outdated shader types are being recompiled, remove the old entry.
if (OutdatedShaderTypes)
{
GlobalShaderMap->RemoveShaderTypePermutaion(GlobalShaderType, PermutationId);
}
// Create a job that compiles the global shader type.
auto* Job = FGlobalShaderTypeCompiler::BeginCompileShader(GlobalShaderType, PermutationId, Platform, nullptr, GlobalShaderJobs);
TShaderTypePermutation<const FShaderType> ShaderTypePermutation(GlobalShaderType, PermutationId);
// Record the job in the shared job map.
SharedShaderJobs.Add(ShaderTypePermutation, Job);
PermutationCountToCompile++;
}
}
(......)
}
// Process the shader pipelines; a shareable pipeline does not need duplicate jobs.
for (TLinkedList<FShaderPipelineType*>::TIterator ShaderPipelineIt(FShaderPipelineType::GetTypeList()); ShaderPipelineIt; ShaderPipelineIt.Next())
{
const FShaderPipelineType* Pipeline = *ShaderPipelineIt;
if (Pipeline->IsGlobalTypePipeline())
{
if (!GlobalShaderMap->HasShaderPipeline(Pipeline) || (OutdatedShaderPipelineTypes && OutdatedShaderPipelineTypes->Contains(Pipeline)))
{
auto& StageTypes = Pipeline->GetStages();
TArray<FGlobalShaderType*> ShaderStages;
for (int32 Index = 0; Index < StageTypes.Num(); ++Index)
{
FGlobalShaderType* GlobalShaderType = ((FShaderType*)(StageTypes[Index]))->GetGlobalShaderType();
if (GlobalShaderType->ShouldCompilePermutation(Platform, kUniqueShaderPermutationId))
{
ShaderStages.Add(GlobalShaderType);
}
else
{
break;
}
}
// Remove the outdated pipeline type.
if (OutdatedShaderPipelineTypes)
{
GlobalShaderMap->RemoveShaderPipelineType(Pipeline);
}
if (ShaderStages.Num() == StageTypes.Num())
{
(......)
if (Pipeline->ShouldOptimizeUnusedOutputs(Platform))
{
// Make a pipeline job with all the stages
FGlobalShaderTypeCompiler::BeginCompileShaderPipeline(Platform, Pipeline, ShaderStages, GlobalShaderJobs);
}
else
{
for (const FShaderType* ShaderType : StageTypes)
{
TShaderTypePermutation<const FShaderType> ShaderTypePermutation(ShaderType, kUniqueShaderPermutationId);
FShaderCompileJob** Job = SharedShaderJobs.Find(ShaderTypePermutation);
auto* SingleJob = (*Job)->GetSingleShaderJob();
auto& SharedPipelinesInJob = SingleJob->SharingPipelines.FindOrAdd(nullptr);
// Register the pipeline with the shared single-shader job.
SharedPipelinesInJob.Add(Pipeline);
}
}
}
}
}
}
if (GlobalShaderJobs.Num() > 0)
{
GetOnGlobalShaderCompilation().Broadcast();
// Add the compile jobs to GShaderCompilingManager.
GShaderCompilingManager->AddJobs(GlobalShaderJobs, true, false, "Globals");
// Some platforms do not support asynchronous shader compilation.
const bool bAllowAsynchronousGlobalShaderCompiling =
!IsOpenGLPlatform(GMaxRHIShaderPlatform) && !IsVulkanPlatform(GMaxRHIShaderPlatform) &&
!IsMetalPlatform(GMaxRHIShaderPlatform) && !IsSwitchPlatform(GMaxRHIShaderPlatform) &&
GShaderCompilingManager->AllowAsynchronousShaderCompiling();
if (!bAllowAsynchronousGlobalShaderCompiling)
{
TArray<int32> ShaderMapIds;
ShaderMapIds.Add(GlobalShaderMapId);
GShaderCompilingManager->FinishCompilation(TEXT("Global"), ShaderMapIds);
}
}
}
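The core decision in VerifyGlobalShaders (queue a job for every permutation that should be compiled and is either missing or belongs to an outdated type) can be isolated in a small sketch. The names are hypothetical: `ShaderTypeInfo` replaces FGlobalShaderType, a set of keys replaces the shader map, and the returned keys replace real compile jobs.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a shader type and its permutation count.
struct ShaderTypeInfo {
    std::string Name;
    int PermutationCount;
};

// (shader type name, permutation id).
using PermutationKey = std::pair<std::string, int>;

// Returns the permutations that need a compile job.
// CompiledShaders models the shader map; OutdatedTypes may be null.
std::vector<PermutationKey> CollectCompileJobs(
    const std::vector<ShaderTypeInfo>& Types,
    std::set<PermutationKey>& CompiledShaders,
    const std::set<std::string>* OutdatedTypes,
    const std::function<bool(const std::string&, int)>& ShouldCompile)
{
    std::vector<PermutationKey> Jobs;
    for (const ShaderTypeInfo& Type : Types) {
        const bool bOutdated = OutdatedTypes && OutdatedTypes->count(Type.Name) > 0;
        for (int Id = 0; Id < Type.PermutationCount; ++Id) {
            const PermutationKey Key(Type.Name, Id);
            const bool bMissing = CompiledShaders.count(Key) == 0;
            if (ShouldCompile(Type.Name, Id) && (bMissing || bOutdated)) {
                // Like the excerpt, drop the old entry whenever an outdated list was supplied.
                if (OutdatedTypes) {
                    CompiledShaders.erase(Key);
                }
                Jobs.push_back(Key);
            }
        }
    }
    return Jobs;
}
```

Calling it without an outdated list corresponds to the startup path, where only missing permutations are compiled; passing a list corresponds to the RecompileShaders path.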
As we can see, shader compile jobs are handled by the global object GShaderCompilingManager. Next is the definition of the FShaderCompilingManager type:
// Engine\Source\Runtime\Engine\Public\ShaderCompiler.h
class FShaderCompilingManager
{
(......)
private:
//////////////////////////////////////////////////////
// Properties shared across threads: may only be read or written while CompileQueueSection is held.
bool bCompilingDuringGame;
// Queue of compile jobs.
TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> CompileQueue;
TMap<int32, FShaderMapCompileResults> ShaderMapJobs;
int32 NumOutstandingJobs;
int32 NumExternalJobs;
FCriticalSection CompileQueueSection;
//////////////////////////////////////////////////////
// Main thread state: accessed only by the main thread.
TMap<int32, FShaderMapFinalizeResults> PendingFinalizeShaderMaps;
TUniquePtr<FShaderCompileThreadRunnableBase> Thread;
//////////////////////////////////////////////////////
// Configuration properties.
uint32 NumShaderCompilingThreads;
uint32 NumShaderCompilingThreadsDuringGame;
int32 MaxShaderJobBatchSize;
int32 NumSingleThreadedRunsBeforeRetry;
uint32 ProcessId;
(......)
public:
// Accessors and setters.
bool ShouldDisplayCompilingNotification() const;
bool AllowAsynchronousShaderCompiling() const;
bool IsCompiling() const;
bool HasShaderJobs() const;
int32 GetNumRemainingJobs() const;
void SetExternalJobs(int32 NumJobs);
enum class EDumpShaderDebugInfo : int32
{
Never = 0,
Always = 1,
OnError = 2,
OnErrorOrWarning = 3
};
(......)
// Adds compile jobs.
ENGINE_API void AddJobs(TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>>& NewJobs, bool bOptimizeForLowLatency, bool bRecreateComponentRenderStateOnCompletion, const FString MaterialBasePath, FString PermutationString = FString(""), bool bSkipResultProcessing = false);
// Cancels compile jobs.
ENGINE_API void CancelCompilation(const TCHAR* MaterialName, const TArray<int32>& ShaderMapIdsToCancel);
// Finishes the compile jobs, blocking the thread until the specified materials have finished compiling.
ENGINE_API void FinishCompilation(const TCHAR* MaterialName, const TArray<int32>& ShaderMapIdsToFinishCompiling);
// Blocks until all shader compilation has finished.
ENGINE_API void FinishAllCompilation();
// Shuts down the compiling manager.
ENGINE_API void Shutdown();
// Processes completed asynchronous results and attaches them to the associated materials.
ENGINE_API void ProcessAsyncResults(bool bLimitExecutionTime, bool bBlockOnGlobalShaderCompletion);
static bool IsShaderCompilerWorkerRunning(FProcHandle & WorkerHandle);
};
// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp
void FShaderCompilingManager::AddJobs(TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>>& NewJobs, bool bOptimizeForLowLatency, bool bRecreateComponentRenderStateOnCompletion, const FString MaterialBasePath, const FString PermutationString, bool bSkipResultProcessing)
{
(......)
// Register the jobs with GShaderCompilerStats.
if(NewJobs.Num())
{
FShaderCompileJob* Job = NewJobs[0]->GetSingleShaderJob();
if(Job) //assume that all jobs are for the same platform
{
GShaderCompilerStats->RegisterCompiledShaders(NewJobs.Num(), Job->Input.Target.GetPlatform(), MaterialBasePath, PermutationString);
}
else
{
GShaderCompilerStats->RegisterCompiledShaders(NewJobs.Num(), SP_NumPlatforms, MaterialBasePath, PermutationString);
}
}
// Enqueue the jobs.
if (bOptimizeForLowLatency)
{
int32 InsertIndex = 0;
for (; InsertIndex < CompileQueue.Num(); InsertIndex++)
{
if (!CompileQueue[InsertIndex]->bOptimizeForLowLatency)
{
break;
}
}
CompileQueue.InsertZeroed(InsertIndex, NewJobs.Num());
for (int32 JobIndex = 0; JobIndex < NewJobs.Num(); JobIndex++)
{
CompileQueue[InsertIndex + JobIndex] = NewJobs[JobIndex];
}
}
else
{
CompileQueue.Append(NewJobs);
}
// Increase the number of outstanding jobs.
FPlatformAtomics::InterlockedAdd(&NumOutstandingJobs, NewJobs.Num());
// Increase the per-shader-map job counts.
for (int32 JobIndex = 0; JobIndex < NewJobs.Num(); JobIndex++)
{
NewJobs[JobIndex]->bOptimizeForLowLatency = bOptimizeForLowLatency;
FShaderMapCompileResults& ShaderMapInfo = ShaderMapJobs.FindOrAdd(NewJobs[JobIndex]->Id);
ShaderMapInfo.bRecreateComponentRenderStateOnCompletion = bRecreateComponentRenderStateOnCompletion;
ShaderMapInfo.bSkipResultProcessing = bSkipResultProcessing;
auto* PipelineJob = NewJobs[JobIndex]->GetShaderPipelineJob();
if (PipelineJob)
{
ShaderMapInfo.NumJobsQueued += PipelineJob->StageJobs.Num();
}
else
{
ShaderMapInfo.NumJobsQueued++;
}
}
}
void FShaderCompilingManager::FinishCompilation(const TCHAR* MaterialName, const TArray<int32>& ShaderMapIdsToFinishCompiling)
{
(......)
TMap<int32, FShaderMapFinalizeResults> CompiledShaderMaps;
CompiledShaderMaps.Append( PendingFinalizeShaderMaps );
PendingFinalizeShaderMaps.Empty();
// Block until compilation completes.
BlockOnShaderMapCompletion(ShaderMapIdsToFinishCompiling, CompiledShaderMaps);
// Handle potential errors, retrying as long as a retry is requested.
bool bRetry = false;
do
{
bRetry = HandlePotentialRetryOnError(CompiledShaderMaps);
}
while (bRetry);
// Process the compiled shader maps.
ProcessCompiledShaderMaps(CompiledShaderMaps, FLT_MAX);
(......)
}
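The queueing policy in AddJobs is worth isolating: low-latency jobs are inserted after the low-latency jobs already at the front of the queue but ahead of all normal jobs, while normal jobs are simply appended. The sketch below reproduces only that ordering rule with a hypothetical `Job` struct; locking, statistics, and the per-shader-map counters are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical job record; only the fields needed for ordering.
struct Job {
    int Id;
    bool bLowLatency;
};

// Inserts NewJobs into Queue following the low-latency-first policy.
void AddJobs(std::vector<Job>& Queue, std::vector<Job> NewJobs, bool bOptimizeForLowLatency) {
    // Tag the new jobs so that later insertions can find the boundary.
    for (Job& NewJob : NewJobs) {
        NewJob.bLowLatency = bOptimizeForLowLatency;
    }

    if (bOptimizeForLowLatency) {
        // First queued job that is not low latency marks the insertion point.
        auto InsertAt = std::find_if(Queue.begin(), Queue.end(),
                                     [](const Job& Queued) { return !Queued.bLowLatency; });
        Queue.insert(InsertAt, NewJobs.begin(), NewJobs.end());
    } else {
        Queue.insert(Queue.end(), NewJobs.begin(), NewJobs.end());
    }
}
```

This keeps arrival order within each priority class, so global shaders submitted with bOptimizeForLowLatency set do not overtake earlier low-latency work.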
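As shown above, the final type of a shader compile job is FShaderCommonCompileJob. Its instances enter a global queue so that they can be compiled asynchronously on multiple threads. FShaderCommonCompileJob and its related types are defined as follows: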
// Engine\Source\Runtime\Engine\Public\ShaderCompiler.h
// Stores the common data used to compile a shader or a shader pipeline.
class FShaderCommonCompileJob
{
public:
uint32 Id;
// Whether compilation has finished.
bool bFinalized;
// Whether compilation succeeded.
bool bSucceeded;
bool bOptimizeForLowLatency;
FShaderCommonCompileJob(uint32 InId);
virtual ~FShaderCommonCompileJob();
// Accessors.
virtual FShaderCompileJob* GetSingleShaderJob();
virtual const FShaderCompileJob* GetSingleShaderJob() const;
virtual FShaderPipelineCompileJob* GetShaderPipelineJob();
virtual const FShaderPipelineCompileJob* GetShaderPipelineJob() const;
// Gets a global id for a shader compile job.
ENGINE_API static uint32 GetNextJobId();
private:
// Counter used to generate job ids.
static FThreadSafeCounter JobIdCounter;
};
// All input and output information needed to compile a single shader.
class FShaderCompileJob : public FShaderCommonCompileJob
{
public:
// The shader's vertex factory type; may be null.
FVertexFactoryType* VFType;
// Shader type.
FShaderType* ShaderType;
// Permutation id.
int32 PermutationId;
// Compilation input and output.
FShaderCompilerInput Input;
FShaderCompilerOutput Output;
// Pipelines that share this job.
TMap<const FVertexFactoryType*, TArray<const FShaderPipelineType*>> SharingPipelines;
FShaderCompileJob(uint32 InId, FVertexFactoryType* InVFType, FShaderType* InShaderType, int32 InPermutationId);
virtual FShaderCompileJob* GetSingleShaderJob() override;
virtual const FShaderCompileJob* GetSingleShaderJob() const override;
};
// Information used to compile a ShaderPipeline.
class FShaderPipelineCompileJob : public FShaderCommonCompileJob
{
public:
// List of stage jobs.
TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> StageJobs;
bool bFailedRemovingUnused;
// The ShaderPipeline this job belongs to.
const FShaderPipelineType* ShaderPipeline;
FShaderPipelineCompileJob(uint32 InId, const FShaderPipelineType* InShaderPipeline, int32 NumStages);
virtual FShaderPipelineCompileJob* GetShaderPipelineJob() override;
virtual const FShaderPipelineCompileJob* GetShaderPipelineJob() const override;
};
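The jobs above are added to the FShaderCompilingManager::CompileQueue queue through interfaces such as FShaderCompilingManager::AddJobs. They are then mainly pulled and executed by FShaderCompileThreadRunnable::PullTasksFromQueue (a multi-producer, multi-consumer pattern):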
// Engine\Source\Runtime\Engine\Private\ShaderCompiler\ShaderCompiler.cpp
int32 FShaderCompileThreadRunnable::PullTasksFromQueue()
{
int32 NumActiveThreads = 0;
{
// Enter the critical section so that the input and output queues can be accessed.
FScopeLock Lock(&Manager->CompileQueueSection);
const int32 NumWorkersToFeed = Manager->bCompilingDuringGame ? Manager->NumShaderCompilingThreadsDuringGame : WorkerInfos.Num();
// Compute the number of jobs per worker.
const auto NumJobsPerWorker = (Manager->CompileQueue.Num() / NumWorkersToFeed) + 1;
// Iterate over all WorkerInfos.
for (int32 WorkerIndex = 0; WorkerIndex < WorkerInfos.Num(); WorkerIndex++)
{
FShaderCompileWorkerInfo& CurrentWorkerInfo = *WorkerInfos[WorkerIndex];
// If this worker has no queued jobs, look for some in the input queue.
if (CurrentWorkerInfo.QueuedJobs.Num() == 0 && WorkerIndex < NumWorkersToFeed)
{
if (Manager->CompileQueue.Num() > 0)
{
bool bAddedLowLatencyTask = false;
const auto MaxNumJobs = FMath::Min3(NumJobsPerWorker, Manager->CompileQueue.Num(), Manager->MaxShaderJobBatchSize);
int32 JobIndex = 0;
// Don't put more than one low latency task into a batch
for (; JobIndex < MaxNumJobs && !bAddedLowLatencyTask; JobIndex++)
{
bAddedLowLatencyTask |= Manager->CompileQueue[JobIndex]->bOptimizeForLowLatency;
// Add the job from the manager's CompileQueue to this worker's QueuedJobs.
CurrentWorkerInfo.QueuedJobs.Add(Manager->CompileQueue[JobIndex]);
}
CurrentWorkerInfo.bIssuedTasksToWorker = false;
CurrentWorkerInfo.bLaunchedWorker = false;
CurrentWorkerInfo.StartTime = FPlatformTime::Seconds();
NumActiveThreads++;
// Remove the jobs that were just taken from the manager's CompileQueue. CompileQueue is a TArray of ThreadSafe-mode shared references, accessed under the lock above.
Manager->CompileQueue.RemoveAt(0, JobIndex);
}
}
// Otherwise, this worker already has jobs (or is not being fed).
else
{
if (CurrentWorkerInfo.QueuedJobs.Num() > 0)
{
NumActiveThreads++;
}
// Add the completed jobs to the output queue (ShaderMapJobs).
if (CurrentWorkerInfo.bComplete)
{
for (int32 JobIndex = 0; JobIndex < CurrentWorkerInfo.QueuedJobs.Num(); JobIndex++)
{
FShaderMapCompileResults& ShaderMapResults = Manager->ShaderMapJobs.FindChecked(CurrentWorkerInfo.QueuedJobs[JobIndex]->Id);
ShaderMapResults.FinishedJobs.Add(CurrentWorkerInfo.QueuedJobs[JobIndex]);
ShaderMapResults.bAllJobsSucceeded = ShaderMapResults.bAllJobsSucceeded && CurrentWorkerInfo.QueuedJobs[JobIndex]->bSucceeded;
}
(......)
// Update NumOutstandingJobs.
FPlatformAtomics::InterlockedAdd(&Manager->NumOutstandingJobs, -CurrentWorkerInfo.QueuedJobs.Num());
// Clear the job data.
CurrentWorkerInfo.bComplete = false;
CurrentWorkerInfo.QueuedJobs.Empty();
}
}
}
}
return NumActiveThreads;
}
The worker info CurrentWorkerInfo above is of type FShaderCompileWorkerInfo:
// Shader compile worker info.
struct FShaderCompileWorkerInfo
{
// Handle of the worker process. May be invalid.
FProcHandle WorkerProcess;
// Tracks whether tasks have been issued to the worker.
bool bIssuedTasksToWorker;
// Whether the worker has been launched.
bool bLaunchedWorker;
// Tracks whether all tasks issued to the worker have been received.
bool bComplete;
// Time at which the most recent batch of tasks was launched.
double StartTime;
// Jobs that this worker process is responsible for compiling (note the thread-safe shared references).
TArray<TSharedRef<FShaderCommonCompileJob, ESPMode::ThreadSafe>> QueuedJobs;
// Constructor.
FShaderCompileWorkerInfo();
// Destructor; not virtual.
~FShaderCompileWorkerInfo()
{
if(WorkerProcess.IsValid())
{
FPlatformProcess::TerminateProc(WorkerProcess);
FPlatformProcess::CloseProc(WorkerProcess);
}
}
};
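The queueing behaviour shown in AddJobs and PullTasksFromQueue can be condensed into a small standalone model. The sketch below is only an approximation in standard C++: Job, addJobs and pullBatch are hypothetical names, and locking, worker bookkeeping and shader-map accounting are left out. It shows two rules from the engine code: low-latency jobs are inserted ahead of normal jobs but behind low-latency jobs that are already waiting, and a pulled batch stops right after the first low-latency job.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for FShaderCommonCompileJob; only the fields needed here.
struct Job {
    int id;
    bool lowLatency;
};

// Enqueue rule: low-latency jobs go behind the low-latency jobs that are already
// queued but ahead of every normal job; normal jobs are simply appended.
void addJobs(std::vector<Job>& queue, const std::vector<Job>& newJobs, bool lowLatency) {
    auto insertPos = queue.end();
    if (lowLatency) {
        insertPos = std::find_if(queue.begin(), queue.end(),
                                 [](const Job& j) { return !j.lowLatency; });
    }
    std::vector<Job> tagged = newJobs;
    for (Job& j : tagged) {
        j.lowLatency = lowLatency;
    }
    queue.insert(insertPos, tagged.begin(), tagged.end());
}

// Pull rule: take at most maxJobs from the front, and stop right after the first
// low-latency job so that it does not wait for the rest of a large batch.
std::vector<Job> pullBatch(std::vector<Job>& queue, std::size_t maxJobs) {
    std::vector<Job> batch;
    std::size_t count = 0;
    bool addedLowLatency = false;
    while (count < maxJobs && count < queue.size() && !addedLowLatency) {
        addedLowLatency = queue[count].lowLatency;
        batch.push_back(queue[count]);
        ++count;
    }
    queue.erase(queue.begin(), queue.begin() + static_cast<std::ptrdiff_t>(count));
    return batch;
}
```

In this model a worker that asks for three jobs receives a single job whenever a low-latency job sits at the front of the queue, which is the trade-off the engine makes between throughput and responsiveness.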
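This covers the main flow and mechanics of shader compilation; the remaining details are left for readers to explore.
8.3.3 Shader Cross-Platform Support
During development we only write one kind of UE-style HLSL. How does UE compile it into shader instructions for the different graphics APIs (see the table below) and feature levels?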
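Graphics API | Shading language | Notes |
---|---|---|
Direct3D | HLSL (High Level Shading Language) | High-level shading language; Windows platforms only |
OpenGL | GLSL (OpenGL Shading Language) | Cross-platform, but its state-machine design is at odds with modern GPU architectures |
OpenGL ES | ES GLSL | Dedicated to mobile platforms |
Metal | MSL (Metal Shading Language) | Apple systems only |
Vulkan | SPIR-V | SPIR-V is an intermediate language; shaders for other platforms can be translated through it conveniently and completely |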
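SPIR-V is governed by Khronos (also the creator of OpenGL and Vulkan). It is in fact a large ecosystem that includes shading languages, toolchains, and runtime libraries:
An overview of the SPIR-V ecosystem; cross-platform shaders are only one part of it.
SPIR-V is also the cross-platform shader solution of quite a few commercial engines and renderers. Does UE use SPIR-V as well, or did it choose something else? This section answers that question and digs into the cross-platform shader solution UE uses.
A cross-platform shader solution usually has to consider the following points: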
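- Write once, use on many platforms. This is the baseline requirement. Without it there is no cross-platform support to speak of, and developers end up with more work and lower productivity.
- Offline compilation. Most shader compilers support this today.
- Reflection, to build the metadata used by the renderer at runtime, such as which index a texture is bound to or whether a uniform is actually used.
- Targeted optimizations, such as offline validation, inlining, detection and removal of unused instructions and data, instruction merging and simplification, and the choice between compiling offline to an intermediate language or to target machine code.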
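Early on, UE considered several approaches to cross-platform shaders:
- Wrap the differences between shading languages purely in macros. This might work for simple shading logic, but in practice the languages differ so much that macros can hardly abstract them, so it is not viable.
- Compile HLSL with FXC and then convert the bytecode. The results are good, but the fatal flaw is that it cannot support Mac OS, so it was dropped.
- Use a third-party cross-platform compiler. At the time (2014), no such compiler supported SM5.0 syntax and Compute Shaders.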
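Given the situation at the time (around 2014), UE4.3 took inspiration from glsl-optimizer and built its own wheel, HLSLCC (HLSL Cross Compiler), on top of the Mesa GLSL parser and IR. HLSLCC adapts the parser to handle SM5.0 HLSL (rather than GLSL) and implements a Mesa IR to GLSL converter (similar to glsl-optimizer). In addition, Mesa supports IR optimization natively, so HLSLCC supports it too.
Pipeline diagram of HLSLCC targeting GLSL. The shader compiler takes HLSL source as input, preprocesses it with MCPP first, and then HLSLCC turns it into GLSL source plus a parameter table.
The main working steps of HLSLCC are as follows: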
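- Preprocessing. The code is run through a C-style preprocessor. UE already preprocesses with MCPP before compilation, so this step is skipped.
- Parsing. Through Mesa's _mesa_hlsl_parse interface, the HLSL is parsed into an abstract syntax tree. The lexer (lexical analysis) and the parser are generated by flex and bison respectively.
- Compilation. _mesa_ast_to_hir compiles the AST (abstract syntax tree) into Mesa IR. In this stage the compiler performs implicit conversions, resolves function overloads, and generates instructions for intrinsic functions. It also generates the GLSL main entry point: it adds global declarations of the input and output variables to the IR, computes the inputs of the HLSL entry point, calls the HLSL entry point, and writes the outputs to the global output variables.
- Optimization. do_optimization_pass runs multiple optimization passes over the IR, including function inlining, dead code elimination, constant propagation, common subexpression elimination, and so on.
- Uniform packing. Global uniform variables are packed into arrays and the mapping information is kept, so that the engine can bind parameters to the relevant parts of the uniform arrays.
- Final optimization. After the uniforms are packed, a second round of optimization runs on the IR to simplify the code generated by packing.
- Generate GLSL. The last step converts the optimized IR into GLSL source. Besides the definitions of all structures and uniform buffers and the source itself, a mapping table is written into a comment at the top of the file.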
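To make the uniform packing step more concrete, here is a minimal standalone sketch. It is not HLSLCC's actual algorithm: UniformDecl, PackedRange and packUniforms are hypothetical names, and the packing rule (a value never straddles a four-float slot unless it starts on a slot boundary) is an assumption chosen for illustration. The point is the shape of the result: one flat array plus a mapping table that tells the engine where each named parameter lives.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A loose global uniform, described only by its name and size in floats
// (float -> 1, float3 -> 3, float4x4 -> 16).
struct UniformDecl {
    std::string name;
    int numFloats;
};

// Where a uniform ended up inside the packed float array.
struct PackedRange {
    int offsetFloats;
    int numFloats;
};

struct PackedLayout {
    std::map<std::string, PackedRange> ranges;  // the mapping table kept for the engine
    int numVec4 = 0;                            // size of the packed array in vec4 slots
};

// Illustrative rule: pack in declaration order and skip to the next vec4 slot
// whenever the value would not fit in what is left of the current one.
PackedLayout packUniforms(const std::vector<UniformDecl>& decls) {
    PackedLayout layout;
    int cursor = 0;
    for (const UniformDecl& d : decls) {
        const int used = cursor % 4;
        if (used != 0 && d.numFloats > 4 - used) {
            cursor += 4 - used;  // pad to the next slot boundary
        }
        layout.ranges[d.name] = PackedRange{cursor, d.numFloats};
        cursor += d.numFloats;
    }
    layout.numVec4 = (cursor + 3) / 4;
    return layout;
}
```

With such a table the runtime can copy a parameter value to its offset in the packed array instead of binding every uniform individually.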
The source code involved in the description above lives under the Engine\Source\ThirdParty\hlslcc directory. The core files are:
- ast.h
- glcpp-parse.h
- glsl_parser_extras.h
- hlsl_parser.h
- ir_optimization.h
The core functions involved in the compilation stage are listed below:
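Function | Notes |
---|---|
apply_type_conversion | Converts a value of one type to another type, if possible. A parameter controls whether the conversion is implicit or explicit. |
arithmetic_result_type | A group of functions that determine the result type of applying an operation to the input values. |
validate_assignment | Determines whether an rvalue can be assigned to an lvalue of a particular type. Allowed implicit conversions are applied when necessary. |
do_assignment | Assigns an rvalue to an lvalue (if validate_assignment allows it). |
ast_expression::hir | Converts an expression node in the AST into a set of IR instructions. |
process_initializer | Applies an initializer expression to a variable. |
ast_struct_specifier::hir | Builds an aggregate type that represents the declared struct. |
ast_cbuffer_declaration::hir | Builds a struct for the constant buffer layout and stores it as a uniform block. |
process_mul | Special-case code that handles the HLSL mul intrinsic. |
match_function_by_name | Looks up a function signature by name and list of input parameters. |
rank_parameter_lists | Compares two parameter lists and assigns a numeric rank that indicates how well they match. It is a helper for overload resolution: the signature with the lowest rank wins, and if any other signature has the same rank as the lowest one, the call is reported as ambiguous. A rank of zero means an exact match. |
gen_texture_op | Handles method calls on built-in HLSL texture and sampler objects. |
_mesa_glsl_initialize_functions | Generates the built-in functions for the HLSL intrinsics. Most functions (such as sin and cos) generate IR code that performs the operation, but some (such as transpose and determinant) keep the function call so that the operation is deferred to the driver's GLSL compiler. |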
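The ranking rule described for rank_parameter_lists (lowest rank wins, a tie at the lowest rank is ambiguous, zero is an exact match) can be sketched independently of Mesa. Everything below is hypothetical: the Type enum, the conversion costs and the function names are assumptions made for illustration and do not reproduce the HLSLCC implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Type { Int, Float, Float4 };

constexpr int kNoMatch = -1;

// Hypothetical cost model: 0 = exact match, 1 = int-to-float conversion,
// anything else cannot be converted implicitly.
int conversionRank(Type from, Type to) {
    if (from == to) return 0;
    if (from == Type::Int && to == Type::Float) return 1;
    return kNoMatch;
}

// Sum of per-argument conversion costs, or kNoMatch if the lists are incompatible.
int rankParameterLists(const std::vector<Type>& params, const std::vector<Type>& args) {
    if (params.size() != args.size()) return kNoMatch;
    int total = 0;
    for (std::size_t i = 0; i < params.size(); ++i) {
        const int rank = conversionRank(args[i], params[i]);
        if (rank == kNoMatch) return kNoMatch;
        total += rank;
    }
    return total;
}

enum class Resolution { Found, NoMatch, Ambiguous };

struct ResolveResult {
    Resolution status;
    std::size_t index;  // valid only when status == Found
};

// The lowest rank wins; two signatures sharing the lowest rank make the call ambiguous.
ResolveResult resolveOverload(const std::vector<std::vector<Type>>& signatures,
                              const std::vector<Type>& args) {
    int best = kNoMatch;
    std::size_t bestIndex = 0;
    bool tie = false;
    for (std::size_t i = 0; i < signatures.size(); ++i) {
        const int rank = rankParameterLists(signatures[i], args);
        if (rank == kNoMatch) continue;
        if (best == kNoMatch || rank < best) {
            best = rank;
            bestIndex = i;
            tie = false;
        } else if (rank == best) {
            tie = true;
        }
    }
    if (best == kNoMatch) return ResolveResult{Resolution::NoMatch, 0};
    if (tie) return ResolveResult{Resolution::Ambiguous, 0};
    return ResolveResult{Resolution::Found, bestIndex};
}
```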
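From its first release in UE4.3 to today's 4.26, HLSLCC has gone through several iterations. For example, in UE4.22 the cross-platform shader pipeline looked like this:
Cross-platform shader diagram of UE4.22. Metal SL is translated from Mesa IR, and Vulkan goes through several translations: Mesa IR -> GLSL -> glslang -> SPIR-V.
In UE4.25 the cross-platform shader pipeline looks like this:
Cross-platform shader diagram of UE4.25. The biggest change is the addition of Shader Conductor, which goes through DXC -> SPIR-V and then translates to Metal, Vulkan, DX, and other platforms.
So the biggest change in UE4.25 is the new Shader Conductor, which converts shaders to SPIR-V in order to translate them for Metal, Vulkan, and other platforms.
Shader Conductor is also a third-party library, located in the engine's Engine\Source\ThirdParty\ShaderConductor directory. Its core modules are: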
- ShaderConductor.hpp
- ShaderConductor.cpp
- Native.h
- Native.cpp
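Shader Conductor also bundles components such as DirectXShaderCompiler, SPIRV-Cross, SPIRV-Headers, and SPIRV-Tools.
The approach taken by UE4.25 is essentially the same as the cross-platform shader solution of KlayGE by Minmin Gong (龔敏敏, known online as 叛逆者):
Vulkan not only has a brand-new API, it also brings a new intermediate shader format, SPIR-V. This is the most important step on the road to unified cross-platform shader compilation. Judging from the trend, more and more engines and renderers will adopt SPIR-V as their preferred cross-platform solution.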
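One more small detail: although Direct3D and OpenGL agree on the X/Y orientation of normalized device coordinates, their UV-space coordinates differ:
To hide this difference from shader developers, UE uses flipped images so that UV coordinates follow one unified convention:
The consequence is that OpenGL textures are actually flipped vertically (captures of UE applications on OpenGL taken with RenderDoc confirm this), although flipping once more late in rendering would fix it. Instead, UE renders upside down and folds the flip into the projection matrix:
As a result, both the normalized device coordinates and the textures look vertically flipped relative to D3D.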
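8.3.4 Shader Cache
There are two kinds of shader cache. One is the offline data stored in the DDC, commonly used to speed up work in the editor and during development; see 8.3.1.2 FGlobalShaderMap for details. The other is the runtime shader cache. In earlier versions of UE this was handled by FShaderCache, but UE4.26 has removed FShaderCache and replaced it with FShaderPipelineCache.
FShaderPipelineCache provides a new mechanism for logging, serializing, and precompiling pipeline state objects (PSOs). Caching pipeline state objects and serializing their initializers to disk allows these states to be precompiled the next time the game runs, which can reduce hitching. Note that FShaderPipelineCache depends on FShaderCodeLibrary, on the Share Material Shader Code setting, and on the RHI-side PipelineFileCache.
FShaderPipelineCache is defined as follows: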
// Engine\Source\Runtime\RenderCore\Public\ShaderPipelineCache.h
class FShaderPipelineCache : public FTickableObjectRenderThread
{
// Compile job struct.
struct CompileJob
{
FPipelineCacheFileFormatPSO PSO;
FShaderPipelineCacheArchive* ReadRequests;
};
public:
// Initialize FShaderPipelineCache.
static void Initialize(EShaderPlatform Platform);
// Shut down FShaderPipelineCache.
static void Shutdown();
// Pause/resume batched precompilation.
static void PauseBatching();
static void ResumeBatching();
// Batch mode.
enum class BatchMode
{
Background, // Max batch size is set by r.ShaderPipelineCache.BackgroundBatchSize.
Fast, // Max batch size is set by r.ShaderPipelineCache.BatchSize.
Precompile // Max batch size is set by r.ShaderPipelineCache.PrecompileBatchSize.
};
// Setters and getters.
static void SetBatchMode(BatchMode Mode);
static uint32 NumPrecompilesRemaining();
static uint32 NumPrecompilesActive();
static int32 GetGameVersionForPSOFileCache();
static bool SetGameUsageMaskWithComparison(uint64 Mask, FPSOMaskComparisonFn InComparisonFnPtr);
static bool IsBatchingPaused();
// Open the pipeline file cache.
static bool OpenPipelineFileCache(EShaderPlatform Platform);
static bool OpenPipelineFileCache(FString const& Name, EShaderPlatform Platform);
// Save/close the pipeline file cache.
static bool SavePipelineFileCache(FPipelineFileCache::SaveMode Mode);
static void ClosePipelineFileCache();
// Constructor/destructor.
FShaderPipelineCache(EShaderPlatform Platform);
virtual ~FShaderPipelineCache();
// Tick-related interface.
bool IsTickable() const;
// Per-frame tick.
void Tick( float DeltaTime );
bool NeedsRenderingResumedForRenderingThreadTick() const;
TStatId GetStatId() const;
enum ELibraryState
{
Opened,
Closed
};
// State change notification.
static void ShaderLibraryStateChanged(ELibraryState State, EShaderPlatform Platform, FString const& Name);
// Precompile context.
class FShaderCachePrecompileContext
{
bool bSlowPrecompileTask;
public:
FShaderCachePrecompileContext() : bSlowPrecompileTask(false) {}
void SetPrecompilationIsSlowTask() { bSlowPrecompileTask = true; }
bool IsPrecompilationSlowTask() const { return bSlowPrecompileTask; }
};
// Delegates used as signals.
static FShaderCachePreOpenDelegate& GetCachePreOpenDelegate();
static FShaderCacheOpenedDelegate& GetCacheOpenedDelegate();
static FShaderCacheClosedDelegate& GetCacheClosedDelegate();
static FShaderPrecompilationBeginDelegate& GetPrecompilationBeginDelegate();
static FShaderPrecompilationCompleteDelegate& GetPrecompilationCompleteDelegate();
(......)
private:
// Data used by batched precompilation.
static FShaderPipelineCache* ShaderPipelineCache;
TArray<CompileJob> ReadTasks;
TArray<CompileJob> CompileTasks;
TArray<FPipelineCachePSOHeader> OrderedCompileTasks;
TDoubleLinkedList<FPipelineCacheFileFormatPSORead*> FetchTasks;
TSet<uint32> CompiledHashes;
FString FileName;
EShaderPlatform CurrentPlatform;
FGuid CacheFileGuid;
uint32 BatchSize;
FShaderCachePrecompileContext ShaderCachePrecompileContext;
FCriticalSection Mutex;
TArray<FPipelineCachePSOHeader> PreFetchedTasks;
TArray<CompileJob> ShutdownReadCompileTasks;
TDoubleLinkedList<FPipelineCacheFileFormatPSORead*> ShutdownFetchTasks;
TMap<FBlendStateInitializerRHI, FRHIBlendState*> BlendStateCache;
TMap<FRasterizerStateInitializerRHI, FRHIRasterizerState*> RasterizerStateCache;
TMap<FDepthStencilStateInitializerRHI, FRHIDepthStencilState*> DepthStencilStateCache;
(......)
};
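The batched precompilation that this class drives can be pictured with a small standalone model. The sketch below is an assumption-laden simplification: PsoPrecompiler is a hypothetical class, the batch sizes are arbitrary placeholders rather than the engine's defaults, and "compiling" a PSO is reduced to remembering its hash. It mirrors three ideas visible in the declaration above: a per-mode batch size, the ability to pause and resume batching, and a set of compiled hashes that prevents the same PSO from being processed twice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <unordered_set>

enum class BatchMode { Background, Fast, Precompile };

// Hypothetical model of a tick-driven PSO precompiler.
class PsoPrecompiler {
public:
    void setBatchMode(BatchMode mode) { batchSize_ = batchSizeFor(mode); }
    void pause() { paused_ = true; }
    void resume() { paused_ = false; }

    // Queue a PSO (identified by its hash) unless it has already been compiled.
    void enqueue(std::uint32_t psoHash) {
        if (compiled_.count(psoHash) == 0) {
            pending_.push_back(psoHash);
        }
    }

    // Called once per frame; compiles at most one batch and returns how many
    // PSOs were actually compiled during this tick.
    std::size_t tick() {
        if (paused_) return 0;
        std::size_t compiledNow = 0;
        while (compiledNow < batchSize_ && !pending_.empty()) {
            const std::uint32_t hash = pending_.front();
            pending_.pop_front();
            if (compiled_.insert(hash).second) {
                ++compiledNow;  // a real implementation would create the PSO here
            }
        }
        return compiledNow;
    }

    std::size_t numRemaining() const { return pending_.size(); }

private:
    // Placeholder sizes for illustration only.
    static std::size_t batchSizeFor(BatchMode mode) {
        switch (mode) {
            case BatchMode::Background: return 1;
            case BatchMode::Fast: return 4;
            case BatchMode::Precompile: return 8;
        }
        return 1;
    }

    std::deque<std::uint32_t> pending_;
    std::unordered_set<std::uint32_t> compiled_;
    std::size_t batchSize_ = 1;
    bool paused_ = false;
};
```

A game would typically switch to a larger batch while a loading screen is visible and drop back to the background size during gameplay, so that precompilation does not cause hitches of its own.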
The data produced by FShaderPipelineCache's batched precompilation is saved in the project's Saved directory with the .upipelinecache extension:
// Engine\Source\Runtime\RHI\Private\PipelineFileCache.cpp
bool FPipelineFileCache::SavePipelineFileCache(FString const& Name, SaveMode Mode)
{
bool bOk = false;
// The pipeline file cache must be enabled and PSO logging to the file cache must be on.
if(IsPipelineFileCacheEnabled() && LogPSOtoFileCache())
{
if(FileCache)
{
// Platform name used for the saved file.
FName PlatformName = FileCache->GetPlatformName();
// Path of the saved file.
FString Path = FPaths::ProjectSavedDir() / FString::Printf(TEXT("%s_%s.upipelinecache"), *Name, *PlatformName.ToString());
// Perform the save.
bOk = FileCache->SavePipelineFileCache(Path, Mode, Stats, NewPSOs, RequestedOrder, NewPSOUsage);
(......)
}
}
return bOk;
}
Because this is a shader cache that takes effect at runtime, it has to be integrated into UE's runtime modules. In practice it is driven from FEngineLoop:
int32 FEngineLoop::PreInitPreStartupScreen(const TCHAR* CmdLine)
{
(......)
{
bool bUseCodeLibrary = FPlatformProperties::RequiresCookedData() || GAllowCookedDataInEditorBuilds;
if (bUseCodeLibrary)
{
{
FShaderCodeLibrary::InitForRuntime(GMaxRHIShaderPlatform);
}
#if !UE_EDITOR
// Cooked data only - but also requires the code library - game only
if (FPlatformProperties::RequiresCookedData())
{
// Initialize FShaderPipelineCache.
FShaderPipelineCache::Initialize(GMaxRHIShaderPlatform);
}
#endif //!UE_EDITOR
}
}
(......)
}
int32 FEngineLoop::PreInitPostStartupScreen(const TCHAR* CmdLine)
{
(......)
IInstallBundleManager* BundleManager = IInstallBundleManager::GetPlatformInstallBundleManager();
if (BundleManager == nullptr || BundleManager->IsNullInterface())
{
(......)
{
// Open the game library that contains the material shaders.
FShaderCodeLibrary::OpenLibrary(FApp::GetProjectName(), FPaths::ProjectContentDir());
for (const FString& RootDir : FPlatformMisc::GetAdditionalRootDirectories())
{
FShaderCodeLibrary::OpenLibrary(FApp::GetProjectName(), FPaths::Combine(RootDir, FApp::GetProjectName(), TEXT("Content")));
}
// Open the pipeline file cache.
FShaderPipelineCache::OpenPipelineFileCache(GMaxRHIShaderPlatform);
}
}
(......)
}
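In addition, GameEngine responds at runtime to commands that resume and pause batched precompilation. Once the FShaderPipelineCache events are available, the RHI layer can respond to its events and signals. Vulkan's FVulkanPipelineStateCacheManager is one example: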
// Engine\Source\Runtime\VulkanRHI\Private\VulkanPipeline.h
class FVulkanPipelineStateCacheManager
{
(......)
private:
// Delegates that track ShaderPipelineCache precompilation.
void OnShaderPipelineCacheOpened(FString const& Name, EShaderPlatform Platform, uint32 Count, const FGuid& VersionGuid, FShaderPipelineCache::FShaderCachePrecompileContext& ShaderCachePrecompileContext);
void OnShaderPipelineCachePrecompilationComplete(uint32 Count, double Seconds, const FShaderPipelineCache::FShaderCachePrecompileContext& ShaderCachePrecompileContext);
(......)
};
To enable the Shader Pipeline Cache, tick the following two options in the project settings (both are on by default):
The following console variables configure the Shader Pipeline Cache:
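Console variable | Effect |
---|---|
r.ShaderPipelineCache.Enabled | Enables the Shader Pipeline Cache so that existing data is loaded from disk and precompiled. |
r.ShaderPipelineCache.BatchSize / BackgroundBatchSize | Sets the batch size for the different batch modes. |
r.ShaderPipelineCache.LogPSO | Enables PSO logging under the Shader Pipeline Cache. |
r.ShaderPipelineCache.SaveAfterPSOsLogged | Sets the expected number of logged PSOs; the cache is saved automatically once this number is reached. |
In addition, the Shader Pipeline Cache stores its information under the [ShaderPipelineCache.CacheFile] section in GGameIni or GGameUserSettingsIni.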
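8.4 Shader Development
This chapter covers shader development examples, debugging techniques, and optimization techniques.
8.4.1 Shader Debugging
While a project is in development, it is best to switch shader compilation to development settings, which can be done by changing the following entries in Engine\Config\ConsoleVariables.ini:
Just remove the semicolon in front of each variable. Their meanings are as follows: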
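Console variable | Notes |
---|---|
r.ShaderDevelopmentMode=1 | Gives detailed logs about shader compilation and the chance to retry on errors. |
r.DumpShaderDebugInfo=1 | Saves the files of every compiled shader to the ProjectName/Saved/ShaderDebugInfo directory on disk, including the source file, the preprocessed version, and a batch file (which compiles the preprocessed version with command-line options equivalent to those of the compiler). |
r.DumpShaderDebugShortNames=1 | Shortens the saved shader paths. |
r.Shaders.Optimize=0 | Disables shader optimization so that shader debug information is preserved. |
r.Shaders.KeepDebugInfo=1 | Keeps debug information; especially useful together with frame-capture tools such as RenderDoc. |
r.Shaders.SkipCompression=1 | Skips shader compression, which saves time when debugging shaders. |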
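With the variables above enabled, a RenderDoc capture shows the shader's variables and HLSL code in full (without them you only get assembly instructions) and supports single-step debugging, which makes shader development and debugging considerably more efficient.
With r.DumpShaderDebugInfo enabled, change any line in one of UE's built-in shaders (for example, add a space in Common.ush) and restart the UE editor. The shaders are recompiled, and once that finishes, useful debug information is generated under ProjectName/Saved/ShaderDebugInfo:
Opening the directory of a specific material shader reveals the source file, the preprocessed version, the batch file, and a hash: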
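In addition, after modifying certain shader files (such as BasePassPixelShader.usf), there is no need to restart the UE editor: enter the RecompileShaders command in the console to recompile the specified shader files. The variants of RecompileShaders mean the following: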
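Command | Notes |
---|---|
RecompileShaders all | Compiles all shaders whose source has changed, including global, material, and meshmaterial shaders. |
RecompileShaders changed | Compiles shaders whose source has changed. |
RecompileShaders global | Compiles global shaders whose source has changed. |
RecompileShaders material | Compiles material shaders whose source has changed. |
RecompileShaders material <MaterialName> | Compiles the material with the given name. |
RecompileShaders <ShaderPath> | Compiles the shader source file at the given path. |
Shader file changes must be saved before running the commands above.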
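In addition, to debug the compiler itself, set the solution configuration of ShaderCompileWorker (Visual Studio: Build -> Configuration Manager) to Debug_Program when building the project:
You can then debug a shader with ShaderCompileWorker (SCW) by passing it the following command line: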
PathToGeneratedUsfFile -directcompile -format=ShaderFormat -ShaderType -entry=EntryPoint
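- PathToGeneratedUsfFile is the final usf file in the ShaderDebugInfo folder.
- ShaderFormat is the shader platform format you want to debug (PCD3D_SM5 in this example).
- ShaderType is one of vs/ps/gs/hs/ds/cs, which correspond to the vertex, pixel, geometry, hull, domain, and compute shader types.
- EntryPoint is the name of the entry-point function of this shader in the usf file.
For example: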
<ProjectPath>\Saved\ShaderDebugInfo\PCD3D_SM5\M_Egg\LocalVF\BPPSFNoLMPolicy\BasePassPixelShader.usf -format=PCD3D_SM5 -ps -entry=Main
Set a breakpoint in the CompileD3DShader() function in D3DShaderCompiler.cpp and run SCW with that command line to see how the platform compiler is invoked:
// Engine\Source\Developer\Windows\ShaderFormatD3D\Private\D3DShaderCompiler.cpp
void CompileD3DShader(const FShaderCompilerInput& Input, FShaderCompilerOutput& Output, FShaderCompilerDefinitions& AdditionalDefines, const FString& WorkingDirectory, ELanguage Language)
{
FString PreprocessedShaderSource;
const bool bIsRayTracingShader = IsRayTracingShader(Input.Target);
const bool bUseDXC = bIsRayTracingShader
|| Input.Environment.CompilerFlags.Contains(CFLAG_WaveOperations)
|| Input.Environment.CompilerFlags.Contains(CFLAG_ForceDXC);
const TCHAR* ShaderProfile = GetShaderProfileName(Input.Target, bUseDXC);
if(!ShaderProfile)
{
Output.Errors.Add(FShaderCompilerError(TEXT("Unrecognized shader frequency")));
return;
}
// Set additional defines.
AdditionalDefines.SetDefine(TEXT("COMPILER_HLSL"), 1);
if (bUseDXC)
{
AdditionalDefines.SetDefine(TEXT("PLATFORM_SUPPORTS_SM6_0_WAVE_OPERATIONS"), 1);
AdditionalDefines.SetDefine(TEXT("PLATFORM_SUPPORTS_STATIC_SAMPLERS"), 1);
}
if (Input.bSkipPreprocessedCache)
{
if (!FFileHelper::LoadFileToString(PreprocessedShaderSource, *Input.VirtualSourceFilePath))
{
return;
}
// Cast away const, since this is a debug-only mode.
CrossCompiler::CreateEnvironmentFromResourceTable(PreprocessedShaderSource, (FShaderCompilerEnvironment&)Input.Environment);
}
else
{
if (!PreprocessShader(PreprocessedShaderSource, Output, Input, AdditionalDefines))
{
return;
}
}
GD3DAllowRemoveUnused = Input.Environment.CompilerFlags.Contains(CFLAG_ForceRemoveUnusedInterpolators) ? 1 : 0;
FString EntryPointName = Input.EntryPointName;
Output.bFailedRemovingUnused = false;
if (GD3DAllowRemoveUnused == 1 && Input.Target.Frequency == SF_Vertex && Input.bCompilingForShaderPipeline)
{
// Always add SV_Position.
TArray<FString> UsedOutputs = Input.UsedOutputs;
UsedOutputs.AddUnique(TEXT("SV_POSITION"));
// Output-only system semantics must never be removed.
TArray<FString> Exceptions;
Exceptions.AddUnique(TEXT("SV_ClipDistance"));
Exceptions.AddUnique(TEXT("SV_ClipDistance0"));
Exceptions.AddUnique(TEXT("SV_ClipDistance1"));
Exceptions.AddUnique(TEXT("SV_ClipDistance2"));
Exceptions.AddUnique(TEXT("SV_ClipDistance3"));
Exceptions.AddUnique(TEXT("SV_ClipDistance4"));
Exceptions.AddUnique(TEXT("SV_ClipDistance5"));
Exceptions.AddUnique(TEXT("SV_ClipDistance6"));
Exceptions.AddUnique(TEXT("SV_ClipDistance7"));
Exceptions.AddUnique(TEXT("SV_CullDistance"));
Exceptions.AddUnique(TEXT("SV_CullDistance0"));
Exceptions.AddUnique(TEXT("SV_CullDistance1"));
Exceptions.AddUnique(TEXT("SV_CullDistance2"));
Exceptions.AddUnique(TEXT("SV_CullDistance3"));
Exceptions.AddUnique(TEXT("SV_CullDistance4"));
Exceptions.AddUnique(TEXT("SV_CullDistance5"));
Exceptions.AddUnique(TEXT("SV_CullDistance6"));
Exceptions.AddUnique(TEXT("SV_CullDistance7"));
DumpDebugShaderUSF(PreprocessedShaderSource, Input);
TArray<FString> Errors;
if (!RemoveUnusedOutputs(PreprocessedShaderSource, UsedOutputs, Exceptions, EntryPointName, Errors))
{
DumpDebugShaderUSF(PreprocessedShaderSource, Input);
UE_LOG(LogD3D11ShaderCompiler, Warning, TEXT("Failed to Remove unused outputs [%s]!"), *Input.DumpDebugInfoPath);
for (int32 Index = 0; Index < Errors.Num(); ++Index)
{
FShaderCompilerError NewError;
NewError.StrippedErrorMessage = Errors[Index];
Output.Errors.Add(NewError);
}
Output.bFailedRemovingUnused = true;
}
}
FShaderParameterParser ShaderParameterParser;
if (!ShaderParameterParser.ParseAndMoveShaderParametersToRootConstantBuffer(
Input, Output, PreprocessedShaderSource,
IsRayTracingShader(Input.Target) ? TEXT("cbuffer") : nullptr))
{
return;
}
RemoveUniformBuffersFromSource(Input.Environment, PreprocessedShaderSource);
uint32 CompileFlags = D3D10_SHADER_ENABLE_BACKWARDS_COMPATIBILITY
// Unpack uniform matrices as row-major to match the CPU layout.
| D3D10_SHADER_PACK_MATRIX_ROW_MAJOR;
if (Input.Environment.CompilerFlags.Contains(CFLAG_Debug))
{
// Add debug flags.
CompileFlags |= D3D10_SHADER_DEBUG | D3D10_SHADER_SKIP_OPTIMIZATION;
}
else
{
if (Input.Environment.CompilerFlags.Contains(CFLAG_StandardOptimization))
{
CompileFlags |= D3D10_SHADER_OPTIMIZATION_LEVEL1;
}
else
{
CompileFlags |= D3D10_SHADER_OPTIMIZATION_LEVEL3;
}
}
for (int32 FlagIndex = 0; FlagIndex < Input.Environment.CompilerFlags.Num(); FlagIndex++)
{
// Accumulate the translated compiler flags.
CompileFlags |= TranslateCompilerFlagD3D11((ECompilerFlags)Input.Environment.CompilerFlags[FlagIndex]);
}
TArray<FString> FilteredErrors;
if (bUseDXC)
{
if (!CompileAndProcessD3DShaderDXC(PreprocessedShaderSource, CompileFlags, Input, EntryPointName, ShaderProfile, Language, false, FilteredErrors, Output))
{
if (!FilteredErrors.Num())
{
FilteredErrors.Add(TEXT("Compile Failed without errors!"));
}
}
CrossCompiler::FShaderConductorContext::ConvertCompileErrors(MoveTemp(FilteredErrors), Output.Errors);
}
else
{
// Override the default compiler path to the newer dll.
FString CompilerPath = FPaths::EngineDir();
CompilerPath.Append(TEXT("Binaries/ThirdParty/Windows/DirectX/x64/d3dcompiler_47.dll"));
if (!CompileAndProcessD3DShaderFXC(PreprocessedShaderSource, CompilerPath, CompileFlags, Input, EntryPointName, ShaderProfile, false, FilteredErrors, Output))
{
if (!FilteredErrors.Num())
{
FilteredErrors.Add(TEXT("Compile Failed without errors!"));
}
}
// Process the errors.
for (int32 ErrorIndex = 0; ErrorIndex < FilteredErrors.Num(); ErrorIndex++)
{
const FString& CurrentError = FilteredErrors[ErrorIndex];
FShaderCompilerError NewError;
// Extract filename and line number from FXC output with format:
// "d:\UE4\Binaries\BasePassPixelShader(30,7): error X3000: invalid target or usage string"
int32 FirstParenIndex = CurrentError.Find(TEXT("("));
int32 LastParenIndex = CurrentError.Find(TEXT("):"));
if (FirstParenIndex != INDEX_NONE &&
LastParenIndex != INDEX_NONE &&
LastParenIndex > FirstParenIndex)
{
// Extract and store error message with source filename
NewError.ErrorVirtualFilePath = CurrentError.Left(FirstParenIndex);
NewError.ErrorLineString = CurrentError.Mid(FirstParenIndex + 1, LastParenIndex - FirstParenIndex - FCString::Strlen(TEXT("(")));
NewError.StrippedErrorMessage = CurrentError.Right(CurrentError.Len() - LastParenIndex - FCString::Strlen(TEXT("):")));
}
else
{
NewError.StrippedErrorMessage = CurrentError;
}
Output.Errors.Add(NewError);
}
}
const bool bDirectCompile = FParse::Param(FCommandLine::Get(), TEXT("directcompile"));
if (bDirectCompile)
{
for (const auto& Error : Output.Errors)
{
FPlatformMisc::LowLevelOutputDebugStringf(TEXT("%s\n"), *Error.GetErrorStringWithLineMarker());
}
}
ShaderParameterParser.ValidateShaderParameterTypes(Input, Output);
if (Input.ExtraSettings.bExtractShaderSource)
{
Output.OptionalFinalShaderSource = PreprocessedShaderSource;
}
}
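The error handling on the FXC path splits each message of the form `file(line,column): message` into a file path, a line string and the remaining text. The following is a minimal standalone sketch of that same splitting with std::string. ShaderError and parseFxcError are hypothetical names introduced here, and, like the engine code, the message keeps whatever follows the `):` marker, including the leading space.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical counterpart of the fields filled in on FShaderCompilerError.
struct ShaderError {
    std::string file;     // text before the first '('
    std::string line;     // text between '(' and "):", for example "30,7"
    std::string message;  // text after "):", or the whole input if unparsable
};

ShaderError parseFxcError(const std::string& text) {
    ShaderError error;
    const std::size_t firstParen = text.find('(');
    const std::size_t lastParen = text.find("):");
    if (firstParen != std::string::npos &&
        lastParen != std::string::npos &&
        lastParen > firstParen) {
        error.file = text.substr(0, firstParen);
        error.line = text.substr(firstParen + 1, lastParen - firstParen - 1);
        error.message = text.substr(lastParen + 2);
    } else {
        // No recognizable location: keep the full text as the message.
        error.message = text;
    }
    return error;
}
```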
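Besides, without tools such as RenderDoc, you can convert the data you want to inspect into color values in a sensible range and check visually whether it looks right. For example:
// Divide the world position by a range value and output it as a color.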
OutColor = frac(WorldPosition / 1000);
Combined with the RecompileShaders command, this trick is very handy and efficient.
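To predict what such a debug color will look like, it helps to evaluate the same expression on the CPU. The sketch below mirrors the HLSL frac intrinsic, which returns x - floor(x) and therefore wraps negative inputs into [0, 1) as well. Float3, fracHlsl and debugColorFromWorldPosition are hypothetical helper names, not engine API.

```cpp
#include <cassert>
#include <cmath>

struct Float3 {
    float x, y, z;
};

// Same definition as the HLSL frac intrinsic: x - floor(x).
float fracHlsl(float v) {
    return v - std::floor(v);
}

// CPU-side equivalent of frac(WorldPosition / range): the color pattern
// repeats every `range` world units along each axis.
Float3 debugColorFromWorldPosition(const Float3& position, float range) {
    return Float3{fracHlsl(position.x / range),
                  fracHlsl(position.y / range),
                  fracHlsl(position.z / range)};
}
```

A position of (-250, 1500, 0) with a range of 1000 maps to the color (0.75, 0.5, 0), so both negative coordinates and coordinates beyond the range wrap instead of clamping.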
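8.4.2 Shader Optimization
Rendering optimization techniques come in all shapes and sizes, from the system, architecture, and project level down to individual statements. This section focuses on routine shader optimization techniques in the UE environment.
8.4.2.1 Optimizing Permutations
Because UE adopts an uber shader design, a single shader source file contains a large number of macro definitions. Depending on their values, these macros combine into a very large number of compiled variants, and the macros are usually controlled by permutations. Keeping the number of permutations under control therefore reduces the number of shaders to compile and the compile time, and improves runtime efficiency.
The project settings contain some options that can be unticked to reduce the number of permutations:
Note, however, that unticking an option means the engine disables that feature. Weigh the trade-off against the actual needs of the project rather than optimizing for optimization's sake.
In addition, many built-in types in the engine's rendering modules provide a ShouldCompilePermutation interface, which lets the compiler ask the object being compiled, before actually compiling, whether a given permutation is needed. If it returns false, the compiler skips that permutation, which reduces the number of shaders. Types that support ShouldCompilePermutation include, but are not limited to: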
- FShader
- FGlobalShader
- FMaterialShader
- FMeshMaterialShader
- FVertexFactory
- FLocalVertexFactory
- FShaderType
- FGlobalShaderType
- FMaterialShaderType
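- Subclasses of the types above
So whenever we add a new subclass of the types above, it is worth taking ShouldCompilePermutation seriously in order to cull useless shader permutations.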
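The effect of such a filter on the number of compiled shaders can be demonstrated with a standalone model. The sketch below does not use the engine's permutation types. The three switches, the Platform struct and the two culling rules are all hypothetical, chosen only to show how a predicate evaluated before compilation shrinks the permutation set.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Each bit of a permutation id stands for one boolean switch (macro) of a
// hypothetical uber shader.
constexpr std::uint32_t kUseFog = 1u << 0;
constexpr std::uint32_t kUseShadow = 1u << 1;
constexpr std::uint32_t kMobileOnly = 1u << 2;
constexpr int kNumSwitches = 3;

struct Platform {
    bool isMobile;
};

// Plays the role of ShouldCompilePermutation: returning false means the
// permutation is never compiled for this platform.
bool shouldCompilePermutation(std::uint32_t id, const Platform& platform) {
    // Hypothetical rule 1: the mobile-only path is useless on desktop.
    if ((id & kMobileOnly) != 0 && !platform.isMobile) return false;
    // Hypothetical rule 2: the mobile-only path never combines with shadows.
    if ((id & kMobileOnly) != 0 && (id & kUseShadow) != 0) return false;
    return true;
}

// Enumerate all 2^N permutations and keep only the ones worth compiling.
std::vector<std::uint32_t> collectPermutations(const Platform& platform) {
    std::vector<std::uint32_t> ids;
    for (std::uint32_t id = 0; id < (1u << kNumSwitches); ++id) {
        if (shouldCompilePermutation(id, platform)) {
            ids.push_back(id);
        }
    }
    return ids;
}
```

With three switches there are eight candidate permutations. Under these rules a desktop platform compiles four of them and a mobile platform six, and the savings grow exponentially as more switches are added.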
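For materials, you can turn off the Automatically Set Usage in Editor option in the material properties, which prevents extra usage flags from being added during editing and increasing the number of shader permutations:
The benefit may not be obvious, though, and missing a required flag can stop the material from working properly (for example, no support for skeletal skinning or for blend shapes).
Also, add Switch nodes with care, since they usually increase the number of permutations as well:
8.4.2.2 Instruction Optimization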
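- Avoid if and switch branch statements.
- Avoid for loops, especially ones with a variable iteration count.
- Reduce the number of texture samples.
- Avoid clip or discard operations.
- Reduce calls to complex math functions.
- Use lower-precision floats. OpenGL ES floats come in three precisions: highp (typically 32-bit), mediump (typically 16-bit), and lowp (roughly 8 bits of precision over a small range). Many computations do not need high precision and can use a lower one.
- Avoid repeated computation. Values that are the same for every pixel can be computed ahead of time or passed in from the C++ side: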
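precision mediump float;
float a = 0.9;
float b = 0.6;
varying vec4 vColor;
void main()
{
gl_FragColor = vColor * a * b; // a * b is evaluated for every pixel, which is redundant work. Compute a * b on the C++ side and pass the result into the shader.
}
- Defer vector computation.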
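highp float f0, f1;
highp vec4 v0, v1;
v0 = (v1 * f0) * f1; // v1 * f0 yields a vector that is then multiplied by f1: one vector operation more than necessary.
// Change it to:
v0 = v1 * (f0 * f1); // Multiply the two scalars first, so only one vector operation is needed.
- Make full use of vector component masks.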
highp vec4 v0;
highp vec4 v1;
highp vec4 v2;
v2.xz = v0.xz * v1.xz; // Only the xz components of v2 are needed, which is cheaper than v2 = v0 * v1.
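- Avoid or reduce temporary variables.
- Move pixel shader computation to the vertex shader where possible, for example by replacing per-pixel lighting with per-vertex lighting.
- Move computation that depends on neither vertices nor pixels to the CPU and pass the result in through uniforms.
- Use tiered strategies: algorithms of different complexity for different quality levels and platforms.
- Lay out vertex input as an array of structures (interleaved) rather than one array per vertex attribute. The interleaved layout helps the GPU cache hit rate.
- Replace the traditional VS/PS pipeline with compute shaders where possible. The CS pipeline is simpler and purer, lends itself to parallel computation, and combined with LDS can improve efficiency noticeably.
- Render at reduced resolution. Some information does not need full-resolution rendering, such as blurry reflections, SSR, and SSGI.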
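The interleaved vertex layout recommended in the list above can be written down as a plain struct. The sketch below is illustrative only: InterleavedVertex, AttributeLayout and separateArrayLayout are hypothetical names, and the attribute set (position, normal, uv) is an assumption. It shows how offsets and strides differ between an interleaved layout and one tightly packed array per attribute.

```cpp
#include <cassert>
#include <cstddef>

// Array-of-structures layout: all attributes of one vertex sit next to each
// other, so fetching a vertex touches one contiguous block of memory.
struct InterleavedVertex {
    float position[3];
    float normal[3];
    float uv[2];
};

// What a graphics API needs to know about one vertex attribute.
struct AttributeLayout {
    std::size_t offset;  // byte offset of the attribute inside one vertex
    std::size_t stride;  // byte distance between two consecutive vertices
    int components;      // number of floats
};

const AttributeLayout kPosition{offsetof(InterleavedVertex, position), sizeof(InterleavedVertex), 3};
const AttributeLayout kNormal{offsetof(InterleavedVertex, normal), sizeof(InterleavedVertex), 3};
const AttributeLayout kUv{offsetof(InterleavedVertex, uv), sizeof(InterleavedVertex), 2};

// For comparison: with one tightly packed array per attribute, every attribute
// starts at offset 0 of its own buffer and the stride is just its own size.
AttributeLayout separateArrayLayout(int components) {
    return AttributeLayout{0, static_cast<std::size_t>(components) * sizeof(float), components};
}
```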
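8.4.3 Shader Development Examples
Working through examples helps consolidate the understanding of UE's shader system.
8.4.3.1 Adding a Global Shader
This section adds a brand-new, minimal global shader to illustrate the process and steps of adding a shader.
First add a new shader source file, named MyTest.usf here:
// VS entry point.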
void MainVS(
in float4 InPosition : ATTRIBUTE0,
out float4 Output : SV_POSITION)
{
Output = InPosition;
}
// Color variable, passed in from the C++ side.
float4 MyColor;
// PS entry point.
float4 MainPS() : SV_Target0
{
return MyColor;
}
Then add the C++ side of the VS and PS:
#include "GlobalShader.h"
// VS, derived from FGlobalShader.
class FMyVS : public FGlobalShader
{
DECLARE_EXPORTED_SHADER_TYPE(FMyVS, Global, /*MYMODULE_API*/);
FMyVS() {}
FMyVS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
: FGlobalShader(Initializer)
{
}
static bool ShouldCache(EShaderPlatform Platform)
{
return true;
}
};
// Implement the VS.
IMPLEMENT_SHADER_TYPE(, FMyVS, TEXT("MyTest"), TEXT("MainVS"), SF_Vertex);
// PS, derived from FGlobalShader.
class FMyPS : public FGlobalShader
{
DECLARE_EXPORTED_SHADER_TYPE(FMyPS, Global, /*MYMODULE_API*/);
FShaderParameter MyColorParameter;
FMyPS() {}
FMyPS(const ShaderMetaType::CompiledShaderInitializerType& Initializer)
: FGlobalShader(Initializer)
{
// Bind the shader parameter.
MyColorParameter.Bind(Initializer.ParameterMap, TEXT("MyColor"), SPF_Mandatory);
}
static void ModifyCompilationEnvironment(EShaderPlatform Platform, FShaderCompilerEnvironment& OutEnvironment)
{
FGlobalShader::ModifyCompilationEnvironment(Platform, OutEnvironment);
// Add a define.
OutEnvironment.SetDefine(TEXT("MY_DEFINE"), 1);
}
static bool ShouldCache(EShaderPlatform Platform)
{
return true;
}
// Serialization.
virtual bool Serialize(FArchive& Ar) override
{
bool bShaderHasOutdatedParameters = FGlobalShader::Serialize(Ar);
Ar << MyColorParameter;
return bShaderHasOutdatedParameters;
}
void SetColor(FRHICommandList& RHICmdList, const FLinearColor& Color)
{
// Set the color on the RHI.
SetShaderValue(RHICmdList, RHICmdList.GetBoundPixelShader(), MyColorParameter, Color);
}
};
// Implement the PS.
IMPLEMENT_SHADER_TYPE(, FMyPS, TEXT("MyTest"), TEXT("MainPS"), SF_Pixel);
Finally, write the rendering code that calls the custom VS and PS above:
void RenderMyTest(FRHICommandList& RHICmdList, ERHIFeatureLevel::Type FeatureLevel, const FLinearColor& Color)
{
// Get the global shader map.
auto ShaderMap = GetGlobalShaderMap(FeatureLevel);
// Get the VS and PS instances.
TShaderMapRef<FMyVS> MyVS(ShaderMap);
TShaderMapRef<FMyPS> MyPS(ShaderMap);
// Bound shader state.
static FGlobalBoundShaderState MyTestBoundShaderState;
SetGlobalBoundShaderState(RHICmdList, FeatureLevel, MyTestBoundShaderState, GetVertexDeclarationFVector4(), *MyVS, *MyPS);
// Set the PS color.
MyPS->SetColor(RHICmdList, Color);
// Set the render states.
RHICmdList.SetRasterizerState(TStaticRasterizerState<>::GetRHI());
RHICmdList.SetBlendState(TStaticBlendState<>::GetRHI());
RHICmdList.SetDepthStencilState(TStaticDepthStencilState<>::GetRHI(), 0);
// Create the vertices of a full-screen quad.
FVector4 Vertices[4];
Vertices[0].Set(-1.0f, 1.0f, 0, 1.0f);
Vertices[1].Set(1.0f, 1.0f, 0, 1.0f);
Vertices[2].Set(-1.0f, -1.0f, 0, 1.0f);
Vertices[3].Set(1.0f, -1.0f, 0, 1.0f);
// Draw the quad.
DrawPrimitiveUP(RHICmdList, PT_TriangleStrip, 2, Vertices, sizeof(Vertices[0]));
}
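Once RenderMyTest is implemented, it can be added to FDeferredShadingSceneRenderer::RenderFinish to hook it into the main rendering flow:
// Console variable, so that the effect can be toggled at runtime.
static TAutoConsoleVariable<int32> CVarMyTest(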
TEXT("r.MyTest"),
0,
TEXT("Test My Global Shader, set it to 0 to disable, or to 1, 2 or 3 for fun!"),
ECVF_RenderThreadSafe
);
void FDeferredShadingSceneRenderer::RenderFinish(FRHICommandListImmediate& RHICmdList)
{
(......)
// Custom code added here; it draws over what UE rendered earlier.
int32 MyTestValue = CVarMyTest.GetValueOnAnyThread();
if (MyTestValue != 0)
{
FLinearColor Color(MyTestValue == 1, MyTestValue == 2, MyTestValue == 3, 1);
RenderMyTest(RHICmdList, FeatureLevel, Color);
}
FSceneRenderer::RenderFinish(RHICmdList);
(......)
}
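The color finally rendered by the logic above is decided by r.MyTest: 0 disables it, 1 shows red, 2 shows green, and 3 shows blue.
8.4.3.2 Adding a Vertex Factory
The process of adding a new FVertexFactory subclass is as follows: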
// FMyVertexFactory.h
// Declare the vertex factory shader parameters.
BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT(FMyVertexFactoryParameters, )
SHADER_PARAMETER(FVector4, Color)
END_GLOBAL_SHADER_PARAMETER_STRUCT()
// Declare the uniform buffer reference type.
typedef TUniformBufferRef<FMyVertexFactoryParameters> FMyVertexFactoryBufferRef;
// Index buffer.
class FMyMeshIndexBuffer : public FIndexBuffer
{
public:
FMyMeshIndexBuffer(int32 InNumQuadsPerSide) : NumQuadsPerSide(InNumQuadsPerSide) {}
void InitRHI() override
{
if (NumQuadsPerSide < 256)
{
IndexBufferRHI = CreateIndexBuffer<uint16>();
}
else
{
IndexBufferRHI = CreateIndexBuffer<uint32>();
}
}
int32 GetIndexCount() const { return NumIndices; };
private:
template <typename IndexType>
FIndexBufferRHIRef CreateIndexBuffer()
{
TResourceArray<IndexType, INDEXBUFFER_ALIGNMENT> Indices;
// Reserve memory for the vertex indices.
Indices.Reserve(NumQuadsPerSide * NumQuadsPerSide * 6);
// Build the index buffer in Morton order for better vertex reuse.
for (int32 Morton = 0; Morton < NumQuadsPerSide * NumQuadsPerSide; Morton++)
{
int32 SquareX = FMath::ReverseMortonCode2(Morton);
int32 SquareY = FMath::ReverseMortonCode2(Morton >> 1);
bool ForwardDiagonal = false;
if (SquareX % 2)
{
ForwardDiagonal = !ForwardDiagonal;
}
if (SquareY % 2)
{
ForwardDiagonal = !ForwardDiagonal;
}
int32 Index0 = SquareX + SquareY * (NumQuadsPerSide + 1);
int32 Index1 = Index0 + 1;
int32 Index2 = Index0 + (NumQuadsPerSide + 1);
int32 Index3 = Index2 + 1;
Indices.Add(Index3);
Indices.Add(Index1);
Indices.Add(ForwardDiagonal ? Index2 : Index0);
Indices.Add(Index0);
Indices.Add(Index2);
Indices.Add(ForwardDiagonal ? Index1 : Index3);
}
NumIndices = Indices.Num();
const uint32 Size = Indices.GetResourceDataSize();
const uint32 Stride = sizeof(IndexType);
// Create index buffer. Fill buffer with initial data upon creation
FRHIResourceCreateInfo CreateInfo(&Indices);
return RHICreateIndexBuffer(Stride, Size, BUF_Static, CreateInfo);
}
int32 NumIndices = 0;
const int32 NumQuadsPerSide = 0;
};
// Vertex buffer.
class FMyMeshVertexBuffer : public FVertexBuffer
{
public:
FMyMeshVertexBuffer(int32 InNumQuadsPerSide) : NumQuadsPerSide(InNumQuadsPerSide) {}
virtual void InitRHI() override
{
const uint32 NumVertsPerSide = NumQuadsPerSide + 1;
NumVerts = NumVertsPerSide * NumVertsPerSide;
FRHIResourceCreateInfo CreateInfo;
void* BufferData = nullptr;
VertexBufferRHI = RHICreateAndLockVertexBuffer(sizeof(FVector4) * NumVerts, BUF_Static, CreateInfo, BufferData);
FVector4* DummyContents = (FVector4*)BufferData;
for (uint32 VertY = 0; VertY < NumVertsPerSide; VertY++)
{
FVector4 VertPos;
VertPos.Y = (float)VertY / NumQuadsPerSide - 0.5f;
for (uint32 VertX = 0; VertX < NumVertsPerSide; VertX++)
{
VertPos.X = (float)VertX / NumQuadsPerSide - 0.5f;
DummyContents[NumVertsPerSide * VertY + VertX] = VertPos;
}
}
RHIUnlockVertexBuffer(VertexBufferRHI);
}
int32 GetVertexCount() const { return NumVerts; }
private:
int32 NumVerts = 0;
const int32 NumQuadsPerSide = 0;
};
// Vertex factory.
class FMyVertexFactory : public FVertexFactory
{
DECLARE_VERTEX_FACTORY_TYPE(FMyVertexFactory);
public:
using Super = FVertexFactory;
FMyVertexFactory(ERHIFeatureLevel::Type InFeatureLevel);
~FMyVertexFactory();
virtual void InitRHI() override;
virtual void ReleaseRHI() override;
static bool ShouldCompilePermutation(const FVertexFactoryShaderPermutationParameters& Parameters);
static void ModifyCompilationEnvironment(const FVertexFactoryShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment);
static void ValidateCompiledResult(const FVertexFactoryType* Type, EShaderPlatform Platform, const FShaderParameterMap& ParameterMap, TArray<FString>& OutErrors);
inline const FUniformBufferRHIRef GetMyVertexFactoryUniformBuffer() const { return UniformBuffer; }
private:
void SetupUniformData();
FMyMeshVertexBuffer* VertexBuffer = nullptr;
FMyMeshIndexBuffer* IndexBuffer = nullptr;
FMyVertexFactoryBufferRef UniformBuffer;
};
// FMyVertexFactory.cpp
#include "ShaderParameterUtils.h"
// Implement FMyVertexFactoryParameters. Note that its name on the shader side is MyVF.
IMPLEMENT_GLOBAL_SHADER_PARAMETER_STRUCT(FMyVertexFactoryParameters, "MyVF");
// Vertex factory shader parameters.
class FMyVertexFactoryShaderParameters : public FVertexFactoryShaderParameters
{
DECLARE_TYPE_LAYOUT(FMyVertexFactoryShaderParameters, NonVirtual);
public:
void Bind(const FShaderParameterMap& ParameterMap)
{
}
void GetElementShaderBindings(
const class FSceneInterface* Scene,
const class FSceneView* View,
const class FMeshMaterialShader* Shader,
const EVertexInputStreamType InputStreamType,
ERHIFeatureLevel::Type FeatureLevel,
const class FVertexFactory* InVertexFactory,
const struct FMeshBatchElement& BatchElement,
class FMeshDrawSingleShaderBindings& ShaderBindings,
FVertexInputStreamArray& VertexStreams) const
{
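// Cast to FMyVertexFactory.
FMyVertexFactory* VertexFactory = (FMyVertexFactory*)InVertexFactory;
// Add the uniform buffer binding to the shader binding table. The template argument is the uniform buffer struct FMyVertexFactoryParameters.
ShaderBindings.Add(Shader->GetUniformBufferParameter<FMyVertexFactoryParameters>(), VertexFactory->GetMyVertexFactoryUniformBuffer());
// Fill in the vertex streams.
if (VertexStreams.Num() > 0)
{
// Patch the instance vertex streams (stream indices 1 and 2).
// InstanceDataBuffers and InstanceOffsetValue are not declared in this listing; the surrounding code must provide them.
for (int32 i = 0; i < 2; ++i)
{
FVertexInputStream* InstanceInputStream = VertexStreams.FindByPredicate([i](const FVertexInputStream& InStream) { return InStream.StreamIndex == i+1; });
// Bind the instance data buffer to the stream.
InstanceInputStream->VertexBuffer = InstanceDataBuffers->GetBuffer(i);
}
// Apply the instance offset.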
if (InstanceOffsetValue > 0)
{
VertexFactory->OffsetInstanceStreams(InstanceOffsetValue, InputStreamType, VertexStreams);
}
}
}
};
// ----------- Vertex factory implementation -----------
FMyVertexFactory::FMyVertexFactory(ERHIFeatureLevel::Type InFeatureLevel)
{
VertexBuffer = new FMyMeshVertexBuffer(16);
IndexBuffer = new FMyMeshIndexBuffer(16);
}
FMyVertexFactory::~FMyVertexFactory()
{
delete VertexBuffer;
delete IndexBuffer;
}
void FMyVertexFactory::InitRHI()
{
Super::InitRHI();
// Set up the uniform data.
SetupUniformData();
VertexBuffer->InitResource();
IndexBuffer->InitResource();
// Vertex stream: position.
FVertexStream PositionVertexStream;
PositionVertexStream.VertexBuffer = VertexBuffer;
PositionVertexStream.Stride = sizeof(FVector4);
PositionVertexStream.Offset = 0;
PositionVertexStream.VertexStreamUsage = EVertexStreamUsage::Default;
// Simple instancing vertex stream data; its VertexBuffer is assigned at binding time.
FVertexStream InstanceDataVertexStream;
InstanceDataVertexStream.VertexBuffer = nullptr;
InstanceDataVertexStream.Stride = sizeof(FVector4);
InstanceDataVertexStream.Offset = 0;
InstanceDataVertexStream.VertexStreamUsage = EVertexStreamUsage::Instancing;
FVertexElement VertexPositionElement(Streams.Add(PositionVertexStream), 0, VET_Float4, 0, PositionVertexStream.Stride, false);
// Vertex declaration.
FVertexDeclarationElementList Elements;
Elements.Add(VertexPositionElement);
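// Add the instance vertex streams. NumAdditionalVertexStreams is not declared in this listing; the HLSL input below expects two (ATTRIBUTE8 and ATTRIBUTE9).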
for (int32 StreamIdx = 0; StreamIdx < NumAdditionalVertexStreams; ++StreamIdx)
{
FVertexElement InstanceElement(Streams.Add(InstanceDataVertexStream), 0, VET_Float4, 8 + StreamIdx, InstanceDataVertexStream.Stride, true);
Elements.Add(InstanceElement);
}
// Initialize the declaration.
InitDeclaration(Elements);
}
void FMyVertexFactory::ReleaseRHI()
{
UniformBuffer.SafeRelease();
if (VertexBuffer)
{
VertexBuffer->ReleaseResource();
}
if (IndexBuffer)
{
IndexBuffer->ReleaseResource();
}
Super::ReleaseRHI();
}
void FMyVertexFactory::SetupUniformData()
{
FMyVertexFactoryParameters UniformParams;
UniformParams.Color = FVector4(1,0,0,1);
UniformBuffer = FMyVertexFactoryBufferRef::CreateUniformBufferImmediate(UniformParams, UniformBuffer_MultiFrame);
}
bool FMyVertexFactory::ShouldCompilePermutation(const FVertexFactoryShaderPermutationParameters& Parameters)
{
return true;
}
void FMyVertexFactory::ModifyCompilationEnvironment(const FVertexFactoryShaderPermutationParameters& Parameters, FShaderCompilerEnvironment& OutEnvironment)
{
OutEnvironment.SetDefine(TEXT("MY_MESH_FACTORY"), 1);
}
void FMyVertexFactory::ValidateCompiledResult(const FVertexFactoryType* Type, EShaderPlatform Platform, const FShaderParameterMap& ParameterMap, TArray<FString>& OutErrors)
{
}
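The Morton-order index construction in FMyMeshIndexBuffer above can be tried outside the engine. The following is a minimal standalone sketch, not engine code: `ReverseMortonCode2` here is a stand-in written as the usual even-bit compaction and is assumed to behave like `FMath::ReverseMortonCode2`, and `BuildMortonGridIndices` is a hypothetical helper name. It also assumes NumQuadsPerSide is a power of two (as with the value 16 used above), since otherwise the decoded quad coordinates can fall outside the grid.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for FMath::ReverseMortonCode2 (assumption): gathers the even bits of x
// into the low 16 bits, i.e. decodes one axis of a 2D Morton code.
inline uint32_t ReverseMortonCode2(uint32_t x)
{
    x &= 0x55555555u;
    x = (x ^ (x >> 1)) & 0x33333333u;
    x = (x ^ (x >> 2)) & 0x0f0f0f0fu;
    x = (x ^ (x >> 4)) & 0x00ff00ffu;
    x = (x ^ (x >> 8)) & 0x0000ffffu;
    return x;
}

// Hypothetical helper mirroring the loop in CreateIndexBuffer: emits two triangles
// per quad, visiting quads in Morton (Z-curve) order.
std::vector<uint32_t> BuildMortonGridIndices(uint32_t NumQuadsPerSide)
{
    std::vector<uint32_t> Indices;
    Indices.reserve(NumQuadsPerSide * NumQuadsPerSide * 6);
    const uint32_t VertsPerSide = NumQuadsPerSide + 1;
    for (uint32_t Morton = 0; Morton < NumQuadsPerSide * NumQuadsPerSide; ++Morton)
    {
        const uint32_t X = ReverseMortonCode2(Morton);
        const uint32_t Y = ReverseMortonCode2(Morton >> 1);
        // Flip the diagonal on a checkerboard pattern (X and Y parity differ).
        const bool ForwardDiagonal = ((X ^ Y) & 1u) != 0;
        const uint32_t I0 = X + Y * VertsPerSide;
        const uint32_t I1 = I0 + 1;
        const uint32_t I2 = I0 + VertsPerSide;
        const uint32_t I3 = I2 + 1;
        const uint32_t Quad[6] = { I3, I1, ForwardDiagonal ? I2 : I0,
                                   I0, I2, ForwardDiagonal ? I1 : I3 };
        Indices.insert(Indices.end(), Quad, Quad + 6);
    }
    return Indices;
}
```

Because consecutive Morton codes stay spatially close, neighbouring quads tend to share recently used vertices, which is the reuse the original comment refers to.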
The C++ side is now complete, but the matching code also has to be written on the HLSL side:
#include "/Engine/Private/VertexFactoryCommon.ush"
// Struct of data interpolated from the VS to the PS.
struct FVertexFactoryInterpolantsVSToPS
{
#if NUM_TEX_COORD_INTERPOLATORS
float4 TexCoords[(NUM_TEX_COORD_INTERPOLATORS+1)/2] : TEXCOORD0;
#endif
#if VF_USE_PRIMITIVE_SCENE_DATA
nointerpolation uint PrimitiveId : PRIMITIVE_ID;
#endif
#if INSTANCED_STEREO
nointerpolation uint EyeIndex : PACKED_EYE_INDEX;
#endif
};
struct FVertexFactoryInput
{
float4 Position : ATTRIBUTE0;
float4 InstanceData0 : ATTRIBUTE8;
float4 InstanceData1 : ATTRIBUTE9;
#if VF_USE_PRIMITIVE_SCENE_DATA
uint PrimitiveId : ATTRIBUTE13;
#endif
};
struct FPositionOnlyVertexFactoryInput
{
float4 Position : ATTRIBUTE0;
float4 InstanceData0 : ATTRIBUTE8;
float4 InstanceData1 : ATTRIBUTE9;
#if VF_USE_PRIMITIVE_SCENE_DATA
uint PrimitiveId : ATTRIBUTE1;
#endif
};
struct FPositionAndNormalOnlyVertexFactoryInput
{
float4 Position : ATTRIBUTE0;
float4 Normal : ATTRIBUTE2;
float4 InstanceData0 : ATTRIBUTE8;
float4 InstanceData1 : ATTRIBUTE9;
#if VF_USE_PRIMITIVE_SCENE_DATA
uint PrimitiveId : ATTRIBUTE1;
#endif
};
struct FVertexFactoryIntermediates
{
float3 OriginalWorldPos;
uint PrimitiveId;
};
uint GetPrimitiveId(FVertexFactoryInterpolantsVSToPS Interpolants)
{
#if VF_USE_PRIMITIVE_SCENE_DATA
return Interpolants.PrimitiveId;
#else
return 0;
#endif
}
void SetPrimitiveId(inout FVertexFactoryInterpolantsVSToPS Interpolants, uint PrimitiveId)
{
#if VF_USE_PRIMITIVE_SCENE_DATA
Interpolants.PrimitiveId = PrimitiveId;
#endif
}
#if NUM_TEX_COORD_INTERPOLATORS
float2 GetUV(FVertexFactoryInterpolantsVSToPS Interpolants, int UVIndex)
{
float4 UVVector = Interpolants.TexCoords[UVIndex / 2];
return UVIndex % 2 ? UVVector.zw : UVVector.xy;
}
void SetUV(inout FVertexFactoryInterpolantsVSToPS Interpolants, int UVIndex, float2 InValue)
{
FLATTEN
if (UVIndex % 2)
{
Interpolants.TexCoords[UVIndex / 2].zw = InValue;
}
else
{
Interpolants.TexCoords[UVIndex / 2].xy = InValue;
}
}
#endif
FMaterialPixelParameters GetMaterialPixelParameters(FVertexFactoryInterpolantsVSToPS Interpolants, float4 SvPosition)
{
// GetMaterialPixelParameters is responsible for fully initializing the result
FMaterialPixelParameters Result = MakeInitializedMaterialPixelParameters();
#if NUM_TEX_COORD_INTERPOLATORS
UNROLL
for (int CoordinateIndex = 0; CoordinateIndex < NUM_TEX_COORD_INTERPOLATORS; CoordinateIndex++)
{
Result.TexCoords[CoordinateIndex] = GetUV(Interpolants, CoordinateIndex);
}
#endif //NUM_TEX_COORD_INTERPOLATORS
Result.TwoSidedSign = 1;
Result.PrimitiveId = GetPrimitiveId(Interpolants);
return Result;
}
FMaterialVertexParameters GetMaterialVertexParameters(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, float3 WorldPosition, half3x3 TangentToLocal)
{
FMaterialVertexParameters Result = (FMaterialVertexParameters)0;
Result.WorldPosition = WorldPosition;
Result.TangentToWorld = float3x3(1,0,0,0,1,0,0,0,1);
Result.PreSkinnedPosition = Input.Position.xyz;
Result.PreSkinnedNormal = float3(0,0,1);
#if NUM_MATERIAL_TEXCOORDS_VERTEX
UNROLL
for(int CoordinateIndex = 0; CoordinateIndex < NUM_MATERIAL_TEXCOORDS_VERTEX; CoordinateIndex++)
{
// FVertexFactoryIntermediates in this listing only declares OriginalWorldPos, so use it here.
Result.TexCoords[CoordinateIndex] = Intermediates.OriginalWorldPos.xy;
}
#endif //NUM_MATERIAL_TEXCOORDS_VERTEX
return Result;
}
FVertexFactoryIntermediates GetVertexFactoryIntermediates(FVertexFactoryInput Input)
{
FVertexFactoryIntermediates Intermediates;
// Get the packed instance data
float4 Data0 = Input.InstanceData0;
float4 Data1 = Input.InstanceData1;
const float3 Translation = Data0.xyz;
const float3 Scale = float3(Data1.zw, 1.0f);
const uint PackedDataChannel = asuint(Data1.x);
// Lod level is in first 8 bits and ShouldMorph bit is in the 9th bit
const float LODLevel = (float)(PackedDataChannel & 0xFF);
const uint ShouldMorph = ((PackedDataChannel >> 8) & 0x1);
// Calculate the world pos
Intermediates.OriginalWorldPos = float3(Input.Position.xy, 0.0f) * Scale + Translation;
#if VF_USE_PRIMITIVE_SCENE_DATA
Intermediates.PrimitiveId = Input.PrimitiveId;
#else
Intermediates.PrimitiveId = 0;
#endif
return Intermediates;
}
half3x3 VertexFactoryGetTangentToLocal(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates)
{
return half3x3(1,0,0,0,1,0,0,0,1);
}
float4 VertexFactoryGetRasterizedWorldPosition(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, float4 InWorldPosition)
{
return InWorldPosition;
}
float3 VertexFactoryGetPositionForVertexLighting(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, float3 TranslatedWorldPosition)
{
return TranslatedWorldPosition;
}
FVertexFactoryInterpolantsVSToPS VertexFactoryGetInterpolantsVSToPS(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates, FMaterialVertexParameters VertexParameters)
{
FVertexFactoryInterpolantsVSToPS Interpolants;
Interpolants = (FVertexFactoryInterpolantsVSToPS)0;
#if NUM_TEX_COORD_INTERPOLATORS
float2 CustomizedUVs[NUM_TEX_COORD_INTERPOLATORS];
GetMaterialCustomizedUVs(VertexParameters, CustomizedUVs);
GetCustomInterpolators(VertexParameters, CustomizedUVs);
UNROLL
for (int CoordinateIndex = 0; CoordinateIndex < NUM_TEX_COORD_INTERPOLATORS; CoordinateIndex++)
{
SetUV(Interpolants, CoordinateIndex, CustomizedUVs[CoordinateIndex]);
}
#endif
#if INSTANCED_STEREO
Interpolants.EyeIndex = 0;
#endif
SetPrimitiveId(Interpolants, Intermediates.PrimitiveId);
return Interpolants;
}
float4 VertexFactoryGetWorldPosition(FPositionOnlyVertexFactoryInput Input)
{
return Input.Position;
}
float4 VertexFactoryGetPreviousWorldPosition(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates)
{
float4x4 PreviousLocalToWorldTranslated = GetPrimitiveData(Intermediates.PrimitiveId).PreviousLocalToWorld;
PreviousLocalToWorldTranslated[3][0] += ResolvedView.PrevPreViewTranslation.x;
PreviousLocalToWorldTranslated[3][1] += ResolvedView.PrevPreViewTranslation.y;
PreviousLocalToWorldTranslated[3][2] += ResolvedView.PrevPreViewTranslation.z;
return mul(Input.Position, PreviousLocalToWorldTranslated);
}
float4 VertexFactoryGetTranslatedPrimitiveVolumeBounds(FVertexFactoryInterpolantsVSToPS Interpolants)
{
float4 ObjectWorldPositionAndRadius = GetPrimitiveData(GetPrimitiveId(Interpolants)).ObjectWorldPositionAndRadius;
return float4(ObjectWorldPositionAndRadius.xyz + ResolvedView.PreViewTranslation.xyz, ObjectWorldPositionAndRadius.w);
}
uint VertexFactoryGetPrimitiveId(FVertexFactoryInterpolantsVSToPS Interpolants)
{
return GetPrimitiveId(Interpolants);
}
float3 VertexFactoryGetWorldNormal(FPositionAndNormalOnlyVertexFactoryInput Input)
{
return Input.Normal.xyz;
}
float3 VertexFactoryGetWorldNormal(FVertexFactoryInput Input, FVertexFactoryIntermediates Intermediates)
{
return float3(0.0f, 0.0f, 1.0f);
}
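GetVertexFactoryIntermediates above unpacks a LOD level (low 8 bits) and a ShouldMorph flag (bit 8) from the raw bits of InstanceData1.x, with the translation in InstanceData0.xyz and the XY scale in InstanceData1.zw. The original does not show the CPU code that fills these instance streams, so the following is only a sketch of a packing routine consistent with that layout. `FFloat4`, `FPackedInstance`, `PackInstance` and the unpack helpers are hypothetical names, and `FFloat4` merely stands in for FVector4.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for FVector4.
struct FFloat4 { float X, Y, Z, W; };

// Bit-preserving reinterpretation, the CPU counterpart of HLSL asfloat/asuint.
inline float BitsToFloat(uint32_t Bits) { float F; std::memcpy(&F, &Bits, sizeof(F)); return F; }
inline uint32_t FloatToBits(float F) { uint32_t B; std::memcpy(&B, &F, sizeof(B)); return B; }

// One instance as consumed by the two instance streams (ATTRIBUTE8 and ATTRIBUTE9).
struct FPackedInstance
{
    FFloat4 Data0; // xyz = translation, w unused
    FFloat4 Data1; // x = packed bits, y unused, zw = XY scale
};

// Assumed layout, derived from the HLSL unpacking: bits 0..7 = LOD level, bit 8 = ShouldMorph.
inline FPackedInstance PackInstance(float Tx, float Ty, float Tz,
                                    float ScaleX, float ScaleY,
                                    uint32_t LODLevel, bool bShouldMorph)
{
    const uint32_t Packed = (LODLevel & 0xFFu) | (bShouldMorph ? (1u << 8) : 0u);
    FPackedInstance Out;
    Out.Data0 = { Tx, Ty, Tz, 0.0f };
    Out.Data1 = { BitsToFloat(Packed), 0.0f, ScaleX, ScaleY };
    return Out;
}

// Mirrors of the HLSL unpacking, useful for checking the layout on the CPU.
inline uint32_t UnpackLODLevel(const FPackedInstance& I) { return FloatToBits(I.Data1.X) & 0xFFu; }
inline bool UnpackShouldMorph(const FPackedInstance& I) { return ((FloatToBits(I.Data1.X) >> 8) & 0x1u) != 0; }
```

The integer is stored by copying its bits into a float rather than by converting its value, which is why the shader reads it back with asuint instead of a cast.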
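In summary, a newly added custom FVertexFactory type needs to implement the following interfaces in HLSL:
Interface | Description |
---|---|
FVertexFactoryInput | Defines the layout of the data fed into the VS; it must match the FVertexFactory type on the C++ side. |
FVertexFactoryIntermediates | Stores cached intermediate data that is used across several vertex factory functions, such as TangentToLocal. |
FVertexFactoryInterpolantsVSToPS | Vertex factory data passed from the VS to the PS. |
VertexFactoryGetWorldPosition | Called from the vertex shader to get the world-space vertex position. |
VertexFactoryGetInterpolantsVSToPS | Converts FVertexFactoryInput to FVertexFactoryInterpolantsVSToPS, computing the data that must be interpolated or passed to the PS before hardware rasterization interpolation. |
GetMaterialPixelParameters | Called by the PS; computes and fills the FMaterialPixelParameters struct from FVertexFactoryInterpolantsVSToPS. |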
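8.5 Summary
This article covered the basic concepts, types, and mechanisms of UE's shader system. After working through it, readers should no longer find UE's shaders unfamiliar and should be able to apply them in real projects.
8.5.1 Questions for Reflection
As usual, here are a few questions to help deepen your understanding of the UE shader system:
- Which important subclasses exist in FShader's inheritance hierarchy? What does each one do, and how do they differ?
- How are Shader Parameters and Uniform Buffers declared, implemented, applied, and updated to the GPU?
- How does the storage and compilation mechanism of the Shader Map work?
- What approach does UE take for cross-platform shaders? Why was it chosen? Is there a better way?
- How can shaders be debugged or optimized more effectively?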
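Special Notes
- Thanks to the authors of all the references. Some images come from the references and the web; they will be removed on request if they infringe.
- This series is the author's original work and is published only on 部落格園 (Cnblogs). You are welcome to share a link to this article, but reproduction without permission is not allowed!
- This series is ongoing; for the full table of contents, see the content outline.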
References
- Unreal Engine Source
- Rendering and Graphics
- Materials
- Graphics Programming
- Shader Development
- Debugging the Shader Compile Process
- Creating a Custom Mesh Component in UE4 | Part 0: Intro
- Creating a Custom Mesh Component in UE4 | Part 1: An In-depth Explanation of Vertex Factories
- Creating a Custom Mesh Component in UE4 | Part 2: Implementing the Vertex Factory
- Unreal Engine 4 Rendering Part 1: Introduction
- Unreal Engine 4 Rendering Part 5: Shader Permutations
- 【UE4 Renderer】<03> PipelineBase
- UE4材質系統原始碼分析之材質編譯成HLSL CODE
- UE4 HLSL 和 Shader 開發指南和技巧
- Uniform Buffer、FVertexFactory、FVertexFactoryType
- 遊戲引擎隨筆 0x02:Shader 跨平臺編譯之路
- UE4 Shader 編譯以及變種實現
- 虛幻4渲染程式設計(Shader篇)【第四卷:虛幻4C++層和Shader層的簡單資料通訊】
- UE4渲染部分2: Shaders和Vertex Data
- HLSL Cross Compiler
- AsyncCompute
- 深入GPU硬體架構及執行機制
- 移動遊戲效能優化通用技法
- Adding Global Shaders to Unreal Engine
- Create a New Global Shader as a Plugin
- The Industry Open Standard Intermediate Language for Parallel Compute and Graphics
- 跨平臺引擎Shader編譯流程分析
- 關於Shader的跨平臺方案的考慮
- UE4的著色器跨平臺解決方案
- 跨平臺shader編譯的過去、現在和未來
- BRINGING UNREAL ENGINE 4 TO OPENGL
- FShaderCache