Introduction to 3D Game Programming with DirectX 12 Study Notes: Chapter 7, Drawing in Direct3D (Part II)
Code repository:
https://github.com/jiabaodan/Direct12BookReadingNotes
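Learning Objectives
- Understand this chapter's change to how the command queue is used (we no longer flush it every frame), which improves performance;
- Understand the two other root parameter types of a root signature: root descriptors and root constants;
- Become familiar with procedurally generating common geometric shapes: boxes, cylinders, and spheres;
- Learn how to animate vertices on the CPU and upload the vertex data to GPU memory through a dynamic vertex buffer.
1 Frame Resources
In the previous code we called D3DApp::FlushCommandQueue at the end of every frame to synchronize the CPU and GPU. This works, but it is inefficient:
- At the start of each frame the GPU has no commands to execute, so it sits idle until the CPU submits some;
- At the end of each frame the CPU has to wait for the GPU to finish executing the commands.
One solution is to create a circular array of the resources the CPU updates each frame. We call these frame resources, and the array usually has 3 elements. With this scheme, after the CPU submits the commands for one frame, it moves on to the next available frame resource (one the GPU is not currently using) and keeps updating data. With 3 elements the CPU can get up to 2 frames ahead of the GPU, which keeps the GPU busy. The example below is the one used in the Shapes demo; because the CPU only needs to update constant buffers there, the frame resource contains only constant buffers: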
// Stores the resources needed for the CPU to build the command lists
// for a frame. The contents here will vary from app to app based on
// the needed resources.
struct FrameResource
{
public:
    FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount);
    FrameResource(const FrameResource& rhs) = delete;
    FrameResource& operator=(const FrameResource& rhs) = delete;
    ~FrameResource();
    // We cannot reset the allocator until the GPU is done processing the
    // commands. So each frame needs their own allocator.
    Microsoft::WRL::ComPtr<ID3D12CommandAllocator> CmdListAlloc;
    // We cannot update a cbuffer until the GPU is done processing the
    // commands that reference it. So each frame needs their own cbuffers.
    std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
    std::unique_ptr<UploadBuffer<ObjectConstants>> ObjectCB = nullptr;
    // Fence value to mark commands up to this fence point. This lets us
    // check if these frame resources are still in use by the GPU.
    UINT64 Fence = 0;
};
FrameResource::FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount)
{
    ThrowIfFailed(device->CreateCommandAllocator(
        D3D12_COMMAND_LIST_TYPE_DIRECT,
        IID_PPV_ARGS(CmdListAlloc.GetAddressOf())));
    PassCB = std::make_unique<UploadBuffer<PassConstants>>(device, passCount, true);
    ObjectCB = std::make_unique<UploadBuffer<ObjectConstants>>(device, objectCount, true);
}
FrameResource::~FrameResource() { }
In our application we use a vector to instantiate 3 frame resources, and keep member variables that track the current one:
const int gNumFrameResources = 3;
std::vector<std::unique_ptr<FrameResource>> mFrameResources;
FrameResource* mCurrFrameResource = nullptr;
int mCurrFrameResourceIndex = 0;
void ShapesApp::BuildFrameResources()
{
    for(int i = 0; i < gNumFrameResources; ++i)
    {
        mFrameResources.push_back(std::make_unique<FrameResource>(
            md3dDevice.Get(), 1, (UINT)mAllRitems.size()));
    }
}
Now, for CPU frame n, the algorithm works as follows:
void ShapesApp::Update(const GameTimer& gt)
{
// Cycle through the circular frame resource array.
mCurrFrameResourceIndex = (mCurrFrameResourceIndex + 1) % gNumFrameResources;
mCurrFrameResource = mFrameResources[mCurrFrameResourceIndex].get();
// Has the GPU finished processing the commands of the current frame
// resource. If not, wait until the GPU has completed commands up to
// this fence point.
if(mCurrFrameResource->Fence != 0
&& mFence->GetCompletedValue() < mCurrFrameResource->Fence)
{
HANDLE eventHandle = CreateEventEx(nullptr, nullptr, 0, EVENT_ALL_ACCESS);
ThrowIfFailed(mFence->SetEventOnCompletion(
mCurrFrameResource->Fence, eventHandle));
WaitForSingleObject(eventHandle, INFINITE);
CloseHandle(eventHandle);
}
// […] Update resources in mCurrFrameResource (like cbuffers).
}
void ShapesApp::Draw(const GameTimer& gt)
{
// […] Build and submit command lists for this frame.
// Advance the fence value to mark commands up to this fence point.
mCurrFrameResource->Fence = ++mCurrentFence;
// Add an instruction to the command queue to set a new fence point.
// Because we are on the GPU timeline, the new fence point won’t be
// set until the GPU finishes processing all the commands prior to
// this Signal().
mCommandQueue->Signal(mFence.Get(), mCurrentFence);
// Note that GPU could still be working on commands from previous
// frames, but that is okay, because we are not touching any frame
// resources associated with those frames.
}
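This scheme does not eliminate waiting entirely: if one processor works through frames much faster than the other, it will still end up waiting for the slower one.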
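The wrap-around behavior of the ring can be checked without any Direct3D objects. The following is a minimal CPU-only sketch, not the demo's code: FrameRingModel and its members are hypothetical names, and the GPU's progress is simply a number passed in by the caller. It mirrors the two steps from Update and Draw above: advance to the next slot and decide whether a wait is needed, then tag the slot with a new fence value.

```cpp
#include <array>
#include <cstdint>

// Hypothetical CPU-only model of the frame-resource ring. No Direct3D
// objects are involved; the GPU's progress is a number the caller passes in.
struct FrameRingModel
{
    static constexpr int kNumFrameResources = 3;

    // Fence value recorded for each slot; 0 means "never submitted".
    std::array<std::uint64_t, kNumFrameResources> fence{};
    std::uint64_t currentFence = 0; // last fence value handed out by the CPU
    int currIndex = 0;

    // Mirrors the start of Update(): move to the next slot and report
    // whether the CPU would have to block before reusing it.
    bool AdvanceAndCheckWait(std::uint64_t gpuCompletedFence)
    {
        currIndex = (currIndex + 1) % kNumFrameResources;
        return fence[currIndex] != 0 && gpuCompletedFence < fence[currIndex];
    }

    // Mirrors the end of Draw(): tag the current slot with a new fence value.
    void Submit()
    {
        fence[currIndex] = ++currentFence;
    }
};
```

With a GPU that has completed nothing, the first three frames find fresh slots and never wait; the fourth frame wraps around to a slot whose fence is still pending, so the CPU has to block. That is the "up to 2 frames ahead" limit described above.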
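2 Render Items
Drawing an object requires setting a number of parameters: binding vertex and index buffers, binding the object's constant buffer, setting the primitive topology, and specifying the DrawIndexedInstanced arguments. When we draw many objects, it helps to have a lightweight structure that stores all of this data. We call the set of data needed to submit a single draw call a render item. In the current demo, the RenderItem structure looks like this: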
// Lightweight structure stores parameters to draw a shape. This will
// vary from app-to-app.
struct RenderItem
{
RenderItem() = default;
// World matrix of the shape that describes the object’s local space
// relative to the world space, which defines the position,
// orientation, and scale of the object in the world.
XMFLOAT4X4 World = MathHelper::Identity4x4();
// Dirty flag indicating the object data has changed and we need
// to update the constant buffer. Because we have an object
// cbuffer for each FrameResource, we have to apply the
// update to each FrameResource. Thus, when we modify object data we
// should set
// NumFramesDirty = gNumFrameResources so that each frame resource
// gets the update.
int NumFramesDirty = gNumFrameResources;
// Index into GPU constant buffer corresponding to the ObjectCB
// for this render item.
UINT ObjCBIndex = -1;
// Geometry associated with this render-item. Note that multiple
// render-items can share the same geometry.
MeshGeometry* Geo = nullptr;
// Primitive topology.
D3D12_PRIMITIVE_TOPOLOGY PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
// DrawIndexedInstanced parameters.
UINT IndexCount = 0;
UINT StartIndexLocation = 0;
int BaseVertexLocation = 0;
};
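Our application keeps lists of render items organized by how they need to be drawn; render items that need different PSOs are kept in different lists: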
// List of all the render items.
std::vector<std::unique_ptr<RenderItem>> mAllRitems;
// Render items divided by PSO.
std::vector<RenderItem*> mOpaqueRitems;
std::vector<RenderItem*> mTransparentRitems;
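One way to fill such per-PSO lists is to bucket the items by a key. The sketch below is only an illustration under assumptions: SketchRenderItem and its PsoName field are hypothetical (the demo's RenderItem does not store a PSO name), and BucketByPso is not a function from the book. It shows the ownership split used above: one owning list, plus non-owning pointer lists per PSO.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, trimmed-down render item with only the fields needed here.
struct SketchRenderItem
{
    std::string PsoName;      // assumption: the real RenderItem has no such field
    unsigned ObjCBIndex = 0;
};

// Groups non-owning pointers by PSO name. Ownership stays in allItems,
// so allItems must outlive the returned buckets.
inline std::unordered_map<std::string, std::vector<SketchRenderItem*>>
BucketByPso(const std::vector<std::unique_ptr<SketchRenderItem>>& allItems)
{
    std::unordered_map<std::string, std::vector<SketchRenderItem*>> buckets;
    for (const auto& item : allItems)
        buckets[item->PsoName].push_back(item.get());
    return buckets;
}
```

Drawing then becomes one loop per bucket: set the PSO once, then submit every item in that bucket.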
3 Pass Constants
The FrameResource class in the earlier section introduced a new constant buffer:
std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
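It stores constants that are the same for every object within a rendering pass, such as the eye position, the view and projection matrices, information about the screen (render target) dimensions, and game timing information. Our demo does not need all of this data yet, but having it available is convenient and costs very little extra space. For example, if we wanted to implement a post-processing effect, the render target size would be useful: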
cbuffer cbPass : register(b1)
{
float4x4 gView;
float4x4 gInvView;
float4x4 gProj;
float4x4 gInvProj;
float4x4 gViewProj;
float4x4 gInvViewProj;
float3 gEyePosW;
float cbPerObjectPad1;
float2 gRenderTargetSize;
float2 gInvRenderTargetSize;
float gNearZ;
float gFarZ;
float gTotalTime;
float gDeltaTime;
};
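The padding member in cbPass exists because the C++ structure that is copied into this buffer has to match the HLSL layout byte for byte. The following sketch checks that layout at compile time. It assumes plain float structs in place of the DirectXMath types (Float4x4, Float3, and Float2 are stand-ins, not real library types), and AlignTo256 is a hypothetical helper that rounds a size up to a multiple of 256 bytes, which is the granularity Direct3D 12 requires for constant buffer views.

```cpp
#include <cstddef>

// Plain-float stand-ins for XMFLOAT4X4 / XMFLOAT3 / XMFLOAT2 so the sketch
// compiles without DirectXMath (assumption: same sizes as the real types).
struct Float4x4 { float m[4][4]; };
struct Float3   { float x, y, z; };
struct Float2   { float x, y; };

// C++ mirror of cbPass, member for member.
struct PassConstantsSketch
{
    Float4x4 View, InvView, Proj, InvProj, ViewProj, InvViewProj;
    Float3 EyePosW;
    float  cbPerObjectPad1;      // fills the 4th component of EyePosW's register
    Float2 RenderTargetSize;
    Float2 InvRenderTargetSize;
    float  NearZ, FarZ, TotalTime, DeltaTime;
};

// Six matrices take 6 * 64 = 384 bytes; every later group fills exactly one
// 16-byte register, so no member straddles a register boundary.
static_assert(offsetof(PassConstantsSketch, EyePosW) == 384, "matrices first");
static_assert(offsetof(PassConstantsSketch, RenderTargetSize) == 400, "one register later");
static_assert(offsetof(PassConstantsSketch, NearZ) == 416, "one register later");
static_assert(sizeof(PassConstantsSketch) == 432, "27 registers in total");

// Hypothetical helper: round a byte size up to the next multiple of 256.
constexpr unsigned AlignTo256(unsigned byteSize)
{
    return (byteSize + 255u) & ~255u;
}
```

Under these assumptions the 432-byte structure occupies a 512-byte slot in the upload buffer.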
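We also modify the per-object constant buffer so that it stores only constants associated with an object. For now, that is just the world matrix: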
cbuffer cbPerObject : register(b0)
{
float4x4 gWorld;
};
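The point of this change is to group constants by update frequency. Per-pass constants need to be updated only once per rendering pass; object constants need to be updated only when an object's world matrix changes; and for a static object, the constants need to be set only once at initialization. In our demo the following methods update the constant buffers; each is called once per frame from Update: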
void ShapesApp::UpdateObjectCBs(const GameTimer& gt)
{
auto currObjectCB = mCurrFrameResource->ObjectCB.get();
for(auto& e : mAllRitems)
{
// Only update the cbuffer data if the constants have changed.
// This needs to be tracked per frame resource.
if(e->NumFramesDirty > 0)
{
XMMATRIX world = XMLoadFloat4x4(&e->World);
ObjectConstants objConstants;
XMStoreFloat4x4(&objConstants.World, XMMatrixTranspose(world));
currObjectCB->CopyData(e->ObjCBIndex, objConstants);
// Next FrameResource need to be updated too.
e->NumFramesDirty--;
}
}
}
void ShapesApp::UpdateMainPassCB(const GameTimer& gt)
{
XMMATRIX view = XMLoadFloat4x4(&mView);
XMMATRIX proj = XMLoadFloat4x4(&mProj);
XMMATRIX viewProj = XMMatrixMultiply(view, proj);
XMMATRIX invView = XMMatrixInverse(&XMMatrixDeterminant(view), view);
XMMATRIX invProj = XMMatrixInverse(&XMMatrixDeterminant(proj), proj);
XMMATRIX invViewProj = XMMatrixInverse(&XMMatrixDeterminant(viewProj), viewProj);
XMStoreFloat4x4(&mMainPassCB.View, XMMatrixTranspose(view));
XMStoreFloat4x4(&mMainPassCB.InvView, XMMatrixTranspose(invView));
XMStoreFloat4x4(&mMainPassCB.Proj, XMMatrixTranspose(proj));
XMStoreFloat4x4(&mMainPassCB.InvProj, XMMatrixTranspose(invProj));
XMStoreFloat4x4(&mMainPassCB.ViewProj, XMMatrixTranspose(viewProj));
XMStoreFloat4x4(&mMainPassCB.InvViewProj, XMMatrixTranspose(invViewProj));
mMainPassCB.EyePosW = mEyePos;
mMainPassCB.RenderTargetSize = XMFLOAT2((float)mClientWidth, (float)mClientHeight);
mMainPassCB.InvRenderTargetSize = XMFLOAT2(1.0f / mClientWidth, 1.0f / mClientHeight);
mMainPassCB.NearZ = 1.0f;
mMainPassCB.FarZ = 1000.0f;
mMainPassCB.TotalTime = gt.TotalTime();
mMainPassCB.DeltaTime = gt.DeltaTime();
auto currPassCB = mCurrFrameResource->PassCB.get();
currPassCB->CopyData(0, mMainPassCB);
}
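The NumFramesDirty counter in UpdateObjectCBs is what spreads a single change across all frame resources. The sketch below isolates that idea with hypothetical stand-ins: DirtyItemSketch uses a single float in place of the world matrix, and each frame resource's object buffer is reduced to one float slot. It is an illustration of the counting logic only, not the demo's code.

```cpp
// Hypothetical stand-in for a render item: one float replaces the world matrix.
struct DirtyItemSketch
{
    float World = 0.0f;
    int   NumFramesDirty = 3; // reset to the frame-resource count on every change
};

// Copies the item's data into one frame resource's slot if that frame
// resource has not received the latest value yet. Returns true if a copy
// was made.
inline bool UpdateObjectCB(DirtyItemSketch& item, float& frameSlot)
{
    if (item.NumFramesDirty <= 0)
        return false;          // every frame resource is already up to date
    frameSlot = item.World;
    --item.NumFramesDirty;     // one fewer frame resource left to refresh
    return true;
}
```

Calling this once per frame while cycling through three slots performs exactly three copies after a change, one per frame resource, and then stops touching the buffers until the item is marked dirty again.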
We update the vertex shader accordingly to support these constant buffer changes:
VertexOut VS(VertexIn vin)
{
VertexOut vout;
// Transform to homogeneous clip space.
float4 posW = mul(float4(vin.PosL, 1.0f), gWorld);
vout.PosH = mul(posW, gViewProj);
// Just pass vertex color into the pixel shader.
vout.Color = vin.Color;
return vout;
}
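The extra vector-matrix multiplication per vertex is negligible on modern GPUs, which have plenty of computational power.
Because the resources the shaders expect have changed, we need to update the root signature so that it takes two descriptor tables: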
CD3DX12_DESCRIPTOR_RANGE cbvTable0;
cbvTable0.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 0);
CD3DX12_DESCRIPTOR_RANGE cbvTable1;
cbvTable1.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 1);
// Root parameter can be a table, root descriptor or root constants.
CD3DX12_ROOT_PARAMETER slotRootParameter[2];
// Create root CBVs.
slotRootParameter[0].InitAsDescriptorTable(1, &cbvTable0);
slotRootParameter[1].InitAsDescriptorTable(1, &cbvTable1);
// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(2,
slotRootParameter, 0, nullptr,
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);
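Do not use too many constant buffers in your shaders; for performance, [Thibieroz13] recommends keeping the number under five.
4 Shape Geometry
This section shows how to create the geometry for ellipsoids, spheres, cylinders, and cones. These shapes are useful for drawing sky domes, debugging, visualizing collision detection, and deferred rendering.
We put our procedural geometry generation code in the GeometryGenerator class (GeometryGenerator.h/.cpp). The class generates its data in system memory, so we still need to copy it into vertex/index buffers. MeshData is a simple structure nested inside GeometryGenerator that stores a vertex list and an index list: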
class GeometryGenerator
{
public:
using uint16 = std::uint16_t;
using uint32 = std::uint32_t;
struct Vertex
{
Vertex(){}
Vertex(
const DirectX::XMFLOAT3& p,
const DirectX::XMFLOAT3& n,
const DirectX::XMFLOAT3& t,
const DirectX::XMFLOAT2& uv) :
Position(p),
Normal(n),
TangentU(t),
TexC(uv){}
Vertex(
float px, float py, float pz,
float nx, float ny, float nz,
float tx, float ty, float tz,
float u, float v) :
Position(px,py,pz),
Normal(nx,ny,nz),
TangentU(tx, ty, tz),
TexC(u,v){}
DirectX::XMFLOAT3 Position;
DirectX::XMFLOAT3 Normal;
DirectX::XMFLOAT3 TangentU;
DirectX::XMFLOAT2 TexC;
};
struct MeshData
{
std::vector<Vertex> Vertices;
std::vector<uint32> Indices32;
std::vector<uint16>& GetIndices16()
{
if(mIndices16.empty())
{
mIndices16.resize(Indices32.size());
for(size_t i = 0; i < Indices32.size(); ++i)
mIndices16[i] = static_cast<uint16> (Indices32[i]);
}
return mIndices16;
}
private:
std::vector<uint16> mIndices16;
};
…
};
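GetIndices16 narrows every index with a plain static_cast, so an index above 65535 would be truncated without any warning. For the small meshes in this chapter that cannot happen, but a checked conversion is easy to write. The following is a sketch under assumptions: TryNarrowIndices is a hypothetical helper, not part of GeometryGenerator, and it needs C++17 for std::optional.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Returns a 16-bit copy of the indices, or std::nullopt if any index does
// not fit into 16 bits (in which case a 32-bit index buffer is needed).
inline std::optional<std::vector<std::uint16_t>>
TryNarrowIndices(const std::vector<std::uint32_t>& indices32)
{
    std::vector<std::uint16_t> out;
    out.reserve(indices32.size());
    for (std::uint32_t index : indices32)
    {
        if (index > std::numeric_limits<std::uint16_t>::max())
            return std::nullopt;
        out.push_back(static_cast<std::uint16_t>(index));
    }
    return out;
}
```

A caller could fall back to DXGI_FORMAT_R32_UINT and the 32-bit list whenever the function returns no value.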
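4.1 Generating a Cylinder Mesh
We define a cylinder mesh by its bottom and top radii, its height, and its slice and stack counts. As the accompanying figure shows, we break the cylinder into three parts: the side, the bottom cap, and the top cap:
4.1.1 Cylinder Side Geometry
We generate the cylinder centered at the origin and parallel to the y-axis. All the vertices lie on rings: there are stackCount + 1 rings, and each ring has sliceCount unique vertices. The radius changes between consecutive rings by (topRadius - bottomRadius)/stackCount. So the basic idea is to iterate over each ring and generate the vertices that lie on it: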
GeometryGenerator::MeshData
GeometryGenerator::CreateCylinder(
float bottomRadius, float topRadius,
float height, uint32 sliceCount, uint32
stackCount)
{
MeshData meshData;
//
// Build Stacks.
//
float stackHeight = height / stackCount;
// Amount to increment radius as we move up each stack level from
// bottom to top.
float radiusStep = (topRadius - bottomRadius) / stackCount;
uint32 ringCount = stackCount+1;
// Compute vertices for each stack ring starting at the bottom and
// moving up.
for(uint32 i = 0; i < ringCount; ++i)
{
float y = -0.5f*height + i*stackHeight;
float r = bottomRadius + i*radiusStep;
// vertices of ring
float dTheta = 2.0f*XM_PI/sliceCount;
for(uint32 j = 0; j <= sliceCount; ++j)
{
Vertex vertex;
float c = cosf(j*dTheta);
float s = sinf(j*dTheta);
vertex.Position = XMFLOAT3(r*c, y, r*s);
vertex.TexC.x = (float)j/sliceCount;
vertex.TexC.y = 1.0f - (float)i/stackCount;
// Cylinder can be parameterized as follows, where we introduce v
// parameter that goes in the same direction as the v tex-coord
// so that the bitangent goes in the same direction as the
// v tex-coord.
// Let r0 be the bottom radius and let r1 be the top radius.
// y(v) = h - hv for v in [0,1].
// r(v) = r1 + (r0-r1)v
//
// x(t, v) = r(v)*cos(t)
// y(t, v) = h - hv
// z(t, v) = r(v)*sin(t)
//
// dx/dt = -r(v)*sin(t)
// dy/dt = 0
// dz/dt = +r(v)*cos(t)
//
// dx/dv = (r0-r1)*cos(t)
// dy/dv = -h
// dz/dv = (r0-r1)*sin(t)
// This is unit length.
vertex.TangentU = XMFLOAT3(-s, 0.0f, c);
float dr = bottomRadius-topRadius;
XMFLOAT3 bitangent(dr*c, -height, dr*s);
XMVECTOR T = XMLoadFloat3(&vertex.TangentU);
XMVECTOR B = XMLoadFloat3(&bitangent);
XMVECTOR N = XMVector3Normalize(XMVector3Cross(T, B));
XMStoreFloat3(&vertex.Normal, N);
meshData.Vertices.push_back(vertex);
}
}
// CreateCylinder continues below with the index generation.
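Each quad on the side consists of 2 triangles. With n the number of vertices per ring, the indices of the two triangles for slice j of stack i are:
first triangle: (i·n + j, (i+1)·n + j, (i+1)·n + j + 1)
second triangle: (i·n + j, (i+1)·n + j + 1, i·n + j + 1)
So the idea is to loop over every slice of every stack and apply these formulas: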
// Add one because we duplicate the first and last vertex per ring
// since the texture coordinates are different.
uint32 ringVertexCount = sliceCount+1;
// Compute indices for each stack.
for(uint32 i = 0; i < stackCount; ++i)
{
for(uint32 j = 0; j < sliceCount; ++j)
{
meshData.Indices32.push_back(i*ringVertexCount + j);
meshData.Indices32.push_back((i+1)*ringVertexCount + j);
meshData.Indices32.push_back((i+1)*ringVertexCount + j+1);
meshData.Indices32.push_back(i*ringVertexCount + j);
meshData.Indices32.push_back((i+1)*ringVertexCount + j+1);
meshData.Indices32.push_back(i*ringVertexCount + j+1);
}
}
BuildCylinderTopCap(bottomRadius, topRadius, height, sliceCount, stackCount, meshData);
BuildCylinderBottomCap(bottomRadius, topRadius, height, sliceCount, stackCount, meshData);
return meshData;
}
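The vertex and index counts of the side geometry follow directly from the two loops: (stackCount + 1)·(sliceCount + 1) vertices and 6·stackCount·sliceCount indices. The sketch below reproduces only the positions and indices so these counts can be checked without DirectXMath. Pos3, CylinderSideSketch, and BuildCylinderSide are hypothetical names; normals, tangents, texture coordinates, and the caps are left out.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct Pos3 { float x, y, z; };

struct CylinderSideSketch
{
    std::vector<Pos3> Positions;
    std::vector<std::uint32_t> Indices;
};

// Positions and indices of the cylinder side only.
inline CylinderSideSketch BuildCylinderSide(
    float bottomRadius, float topRadius, float height,
    std::uint32_t sliceCount, std::uint32_t stackCount)
{
    CylinderSideSketch mesh;
    const float kPi = 3.14159265f;
    const float stackHeight = height / stackCount;
    const float radiusStep = (topRadius - bottomRadius) / stackCount;
    const float dTheta = 2.0f * kPi / sliceCount;

    // stackCount + 1 rings, each with sliceCount + 1 vertices (the first
    // vertex of a ring is duplicated at the end of the ring).
    for (std::uint32_t i = 0; i <= stackCount; ++i)
    {
        const float y = -0.5f * height + i * stackHeight;
        const float r = bottomRadius + i * radiusStep;
        for (std::uint32_t j = 0; j <= sliceCount; ++j)
            mesh.Positions.push_back(
                Pos3{ r * std::cos(j * dTheta), y, r * std::sin(j * dTheta) });
    }

    // Two triangles per quad, using n = vertices per ring.
    const std::uint32_t n = sliceCount + 1;
    for (std::uint32_t i = 0; i < stackCount; ++i)
    {
        for (std::uint32_t j = 0; j < sliceCount; ++j)
        {
            const std::uint32_t quad[6] = {
                i * n + j, (i + 1) * n + j,     (i + 1) * n + j + 1,
                i * n + j, (i + 1) * n + j + 1, i * n + j + 1 };
            mesh.Indices.insert(mesh.Indices.end(), quad, quad + 6);
        }
    }
    return mesh;
}
```

For the parameters the Shapes demo passes to CreateCylinder (20 slices, 20 stacks), this gives 441 side vertices and 2400 side indices.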
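4.1.2 Cap Geometry
To generate the cap geometry, we approximate a circle with sliceCount triangles fanned around the vertices of the top and bottom rings: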
void GeometryGenerator::BuildCylinderTopCap(
float bottomRadius, float topRadius, float height,
uint32 sliceCount, uint32 stackCount, MeshData& meshData)
{
uint32 baseIndex = (uint32)meshData.Vertices.size();
float y = 0.5f*height;
float dTheta = 2.0f*XM_PI/sliceCount;
// Duplicate cap ring vertices because the texture coordinates and
// normals differ.
for(uint32 i = 0; i <= sliceCount; ++i)
{
float x = topRadius*cosf(i*dTheta);
float z = topRadius*sinf(i*dTheta);
// Scale down by the height to try and make top cap texture coord
// area proportional to base.
float u = x/height + 0.5f;
float v = z/height + 0.5f;
meshData.Vertices.push_back(Vertex(x, y, z, 0.0f, 1.0f, 0.0f, 1.0f, 0.0f, 0.0f, u, v) );
}
// Cap center vertex.
meshData.Vertices.push_back( Vertex(0.0f, y, 0.0f, 0.0f, 1.0f, 0.0f, 1.0f, 0.0f, 0.0f, 0.5f, 0.5f) );
// Index of center vertex.
uint32 centerIndex = (uint32)meshData.Vertices.size()-1;
for(uint32 i = 0; i < sliceCount; ++i)
{
meshData.Indices32.push_back(centerIndex);
meshData.Indices32.push_back(baseIndex + i+1);
meshData.Indices32.push_back(baseIndex + i);
}
}
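The bottom cap code is analogous.
4.2 Generating a Sphere Mesh
We define a sphere by its radius and its slice and stack counts. The algorithm is very similar to the one for the cylinder, except that the radius of each ring changes nonlinearly, according to trigonometric functions (see GeometryGenerator::CreateSphere). Applying a non-uniform scale to the sphere turns it into an ellipsoid.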
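To make the "nonlinear ring radius" concrete, here is a small sketch. It is not the code of GeometryGenerator::CreateSphere, which these notes do not reproduce; RingSketch and SphereRing are hypothetical names, and the sketch assumes rings are spaced by equal steps of the polar angle phi, measured from the top pole.

```cpp
#include <cmath>
#include <cstdint>

struct RingSketch
{
    float y;      // height of the ring above the sphere's center
    float radius; // radius of the ring in the xz-plane
};

// Ring i of a sphere cut into stackCount stacks; i = 0 is the top pole and
// i = stackCount is the bottom pole.
inline RingSketch SphereRing(float sphereRadius, std::uint32_t i, std::uint32_t stackCount)
{
    const float kPi = 3.14159265f;
    const float phi = kPi * static_cast<float>(i) / static_cast<float>(stackCount);
    return RingSketch{ sphereRadius * std::cos(phi), sphereRadius * std::sin(phi) };
}
```

Where the cylinder uses bottomRadius + i*radiusStep, the sphere's ring radius follows sin(phi): it is zero at the poles and equal to the sphere radius at the equator, and every ring satisfies y² + radius² = sphereRadius².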
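4.3 Generating a Geosphere Mesh
The triangles of the sphere from the previous section do not have equal areas, which is undesirable in some situations. A geosphere approximates a sphere using triangles with almost equal areas and equal side lengths.
To generate a geosphere, we start with an icosahedron, subdivide each triangle, and then project the new vertices onto a sphere of the given radius. We can repeat this process to increase the tessellation.
The figure shows how a triangle is subdivided: the new vertices are simply the midpoints of its edges.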
GeometryGenerator::MeshData GeometryGenerator::CreateGeosphere(float radius, uint32 numSubdivisions)
{
MeshData meshData;
// Put a cap on the number of subdivisions.
numSubdivisions = std::min<uint32> (numSubdivisions, 6u);
// Approximate a sphere by tessellating an icosahedron.
const float X = 0.525731f;
const float Z = 0.850651f;
XMFLOAT3 pos[12] =
{
XMFLOAT3(-X, 0.0f, Z), XMFLOAT3(X, 0.0f, Z),
XMFLOAT3(-X, 0.0f, -Z), XMFLOAT3(X, 0.0f, -Z),
XMFLOAT3(0.0f, Z, X), XMFLOAT3(0.0f, Z, -X),
XMFLOAT3(0.0f, -Z, X), XMFLOAT3(0.0f, -Z, -X),
XMFLOAT3(Z, X, 0.0f), XMFLOAT3(-Z, X, 0.0f),
XMFLOAT3(Z, -X, 0.0f), XMFLOAT3(-Z, -X, 0.0f)
};
uint32 k[60] =
{
1,4,0, 4,9,0, 4,5,9, 8,5,4, 1,8,4,
1,10,8, 10,3,8, 8,3,5, 3,2,5, 3,7,2,
3,10,7, 10,6,7, 6,11,7, 6,0,11, 6,1,0,
10,1,6, 11,0,9, 2,11,9, 5,2,9, 11,2,7
};
meshData.Vertices.resize(12);
meshData.Indices32.assign(&k[0], &k[60]);
for(uint32 i = 0; i < 12; ++i)
meshData.Vertices[i].Position = pos[i];
for(uint32 i = 0; i < numSubdivisions; ++i)
Subdivide(meshData);
// Project vertices onto sphere and scale.
for(uint32 i = 0; i < meshData.Vertices.size(); ++i)
{
// Project onto unit sphere.
XMVECTOR n = XMVector3Normalize(XMLoadFloat3(&meshData.Vertices[i].Position));
// Project onto sphere.
XMVECTOR p = radius*n;
XMStoreFloat3(&meshData.Vertices[i].Position, p);
XMStoreFloat3(&meshData.Vertices[i].Normal, n);
// Derive texture coordinates from spherical coordinates.
float theta = atan2f(meshData.Vertices[i].Position.z, meshData.Vertices[i].Position.x);
// Put in [0, 2pi].
if(theta < 0.0f)
theta += XM_2PI;
float phi = acosf(meshData.Vertices[i].Position.y / radius);
meshData.Vertices[i].TexC.x = theta/XM_2PI;
meshData.Vertices[i].TexC.y = phi/XM_PI;
// Partial derivative of P with respect to theta
meshData.Vertices[i].TangentU.x = -radius*sinf(phi)*sinf(theta);
meshData.Vertices[i].TangentU.y = 0.0f;
meshData.Vertices[i].TangentU.z = +radius*sinf(phi)*cosf(theta);
XMVECTOR T = XMLoadFloat3(&meshData.Vertices[i].TangentU);
XMStoreFloat3(&meshData.Vertices[i].TangentU, XMVector3Normalize(T));
}
return meshData;
}
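CreateGeosphere calls Subdivide, whose body is not shown in these notes. The following sketch shows one way such a midpoint subdivision can be written, under stated assumptions: P3, TriMeshSketch, Midpoint, and SubdivideSketch are hypothetical names, only positions are handled, and vertices are not shared between triangles in the output, which is simple but duplicates data.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct P3 { float x, y, z; };

struct TriMeshSketch
{
    std::vector<P3> Vertices;
    std::vector<std::uint32_t> Indices; // three indices per triangle
};

inline P3 Midpoint(const P3& a, const P3& b)
{
    return P3{ 0.5f * (a.x + b.x), 0.5f * (a.y + b.y), 0.5f * (a.z + b.z) };
}

// Splits every triangle into four by connecting its edge midpoints.
inline TriMeshSketch SubdivideSketch(const TriMeshSketch& in)
{
    TriMeshSketch out;
    const std::size_t numTris = in.Indices.size() / 3;
    for (std::size_t t = 0; t < numTris; ++t)
    {
        const P3 v0 = in.Vertices[in.Indices[t * 3 + 0]];
        const P3 v1 = in.Vertices[in.Indices[t * 3 + 1]];
        const P3 v2 = in.Vertices[in.Indices[t * 3 + 2]];
        const P3 m0 = Midpoint(v0, v1);
        const P3 m1 = Midpoint(v1, v2);
        const P3 m2 = Midpoint(v0, v2);

        // Local vertex order: 0:v0 1:v1 2:v2 3:m0 4:m1 5:m2
        const std::uint32_t base = static_cast<std::uint32_t>(out.Vertices.size());
        out.Vertices.insert(out.Vertices.end(), { v0, v1, v2, m0, m1, m2 });

        // Four triangles with the same winding as the original (v0, v1, v2).
        const std::uint32_t local[12] = { 0,3,5, 3,4,5, 5,4,2, 3,1,4 };
        for (std::uint32_t k : local)
            out.Indices.push_back(base + k);
    }
    return out;
}
```

Each pass multiplies the triangle count by four, so the 20 faces of the icosahedron become 80 after one subdivision and 320 after two. That growth rate is why the code above caps numSubdivisions.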
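5 Shapes Demo
To exercise the code above, we implement the "Shapes" demo. Along the way we also learn how to position multiple objects in a scene, and how to put the data of several objects into one shared vertex buffer and index buffer.
5.1 Vertex and Index Buffers
In the demo we store only one copy of the sphere and cylinder geometry and draw it multiple times with different world matrices. This is an example of instancing, and it saves memory.
The following code shows how the geometry buffers are created, how the required drawing parameters are cached, and how the objects are drawn: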
void ShapesApp::BuildShapeGeometry()
{
GeometryGenerator geoGen;
GeometryGenerator::MeshData box = geoGen.CreateBox(1.5f, 0.5f, 1.5f, 3);
GeometryGenerator::MeshData grid = geoGen.CreateGrid(20.0f, 30.0f, 60, 40);
GeometryGenerator::MeshData sphere = geoGen.CreateSphere(0.5f, 20, 20);
GeometryGenerator::MeshData cylinder = geoGen.CreateCylinder(0.5f, 0.3f, 3.0f, 20, 20);
//
// We are concatenating all the geometry into one big vertex/index
// buffer. So define the regions in the buffer each submesh covers.
//
// Cache the vertex offsets to each object in the concatenated vertex
// buffer.
UINT boxVertexOffset = 0;
UINT gridVertexOffset = (UINT)box.Vertices.size();
UINT sphereVertexOffset = gridVertexOffset + (UINT)grid.Vertices.size();
UINT cylinderVertexOffset = sphereVertexOffset + (UINT)sphere.Vertices.size();
// Cache the starting index for each object in the concatenated index
// buffer.
UINT boxIndexOffset = 0;
UINT gridIndexOffset = (UINT)box.Indices32.size();
UINT sphereIndexOffset = gridIndexOffset + (UINT)grid.Indices32.size();
UINT cylinderIndexOffset = sphereIndexOffset + (UINT)sphere.Indices32.size();
// Define the SubmeshGeometry that cover different
// regions of the vertex/index buffers.
SubmeshGeometry boxSubmesh;
boxSubmesh.IndexCount = (UINT)box.Indices32.size();
boxSubmesh.StartIndexLocation = boxIndexOffset;
boxSubmesh.BaseVertexLocation = boxVertexOffset;
SubmeshGeometry gridSubmesh;
gridSubmesh.IndexCount = (UINT)grid.Indices32.size();
gridSubmesh.StartIndexLocation = gridIndexOffset;
gridSubmesh.BaseVertexLocation = gridVertexOffset;
SubmeshGeometry sphereSubmesh;
sphereSubmesh.IndexCount = (UINT)sphere.Indices32.size();
sphereSubmesh.StartIndexLocation = sphereIndexOffset;
sphereSubmesh.BaseVertexLocation = sphereVertexOffset;
SubmeshGeometry cylinderSubmesh;
cylinderSubmesh.IndexCount = (UINT)cylinder.Indices32.size();
cylinderSubmesh.StartIndexLocation = cylinderIndexOffset;
cylinderSubmesh.BaseVertexLocation = cylinderVertexOffset;
//
// Extract the vertex elements we are interested in and pack the
// vertices of all the meshes into one vertex buffer.
//
auto totalVertexCount = box.Vertices.size() + grid.Vertices.size() + sphere.Vertices.size() + cylinder.Vertices.size();
std::vector<Vertex> vertices(totalVertexCount);
UINT k = 0;
for(size_t i = 0; i < box.Vertices.size(); ++i, ++k)
{
vertices[k].Pos = box.Vertices[i].Position;
vertices[k].Color = XMFLOAT4(DirectX::Colors::DarkGreen);
}
for(size_t i = 0; i < grid.Vertices.size(); ++i, ++k)
{
vertices[k].Pos = grid.Vertices[i].Position;
vertices[k].Color = XMFLOAT4(DirectX::Colors::ForestGreen);
}
for(size_t i = 0; i < sphere.Vertices.size(); ++i, ++k)
{
vertices[k].Pos = sphere.Vertices[i].Position;
vertices[k].Color = XMFLOAT4(DirectX::Colors::Crimson);
}
for(size_t i = 0; i < cylinder.Vertices.size(); ++i, ++k)
{
vertices[k].Pos = cylinder.Vertices[i].Position;
vertices[k].Color = XMFLOAT4(DirectX::Colors::SteelBlue);
}
std::vector<std::uint16_t> indices;
indices.insert(indices.end(), std::begin(box.GetIndices16()), std::end(box.GetIndices16()));
indices.insert(indices.end(), std::begin(grid.GetIndices16()), std::end(grid.GetIndices16()));
indices.insert(indices.end(), std::begin(sphere.GetIndices16()), std::end(sphere.GetIndices16()));
indices.insert(indices.end(), std::begin(cylinder.GetIndices16()), std::end(cylinder.GetIndices16()));
const UINT vbByteSize = (UINT)vertices.size() * sizeof(Vertex);
const UINT ibByteSize = (UINT)indices.size() * sizeof(std::uint16_t);
auto geo = std::make_unique<MeshGeometry>();
geo->Name = "shapeGeo";
ThrowIfFailed(D3DCreateBlob(vbByteSize, &geo->VertexBufferCPU));
CopyMemory(geo->VertexBufferCPU->GetBufferPointer(), vertices.data(), vbByteSize);
ThrowIfFailed(D3DCreateBlob(ibByteSize, &geo->IndexBufferCPU));
CopyMemory(geo->IndexBufferCPU->GetBufferPointer(), indices.data(), ibByteSize);
geo->VertexBufferGPU = d3dUtil::CreateDefaultBuffer(md3dDevice.Get(),
mCommandList.Get(), vertices.data(),
vbByteSize, geo->VertexBufferUploader);
geo->IndexBufferGPU = d3dUtil::CreateDefaultBuffer(md3dDevice.Get(),
mCommandList.Get(), indices.data(),
ibByteSize, geo->IndexBufferUploader);
geo->VertexByteStride = sizeof(Vertex);
geo->VertexBufferByteSize = vbByteSize;
geo->IndexFormat = DXGI_FORMAT_R16_UINT;
geo->IndexBufferByteSize = ibByteSize;
geo->DrawArgs["box"] = boxSubmesh;
geo->DrawArgs["grid"] = gridSubmesh;
geo->DrawArgs["sphere"] = sphereSubmesh;
geo->DrawArgs["cylinder"] = cylinderSubmesh;
mGeometries[geo->Name] = std::move(geo);
}
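The offset bookkeeping in BuildShapeGeometry is a pair of running sums: each submesh starts where the previous one ended, once for vertices and once for indices. A compact sketch of that pattern follows. MeshSketch, SubmeshSketch, and ComputeDrawArgs are hypothetical names, and only counts are tracked, not the actual vertex data.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct SubmeshSketch
{
    std::uint32_t IndexCount = 0;
    std::uint32_t StartIndexLocation = 0;
    std::int32_t  BaseVertexLocation = 0;
};

struct MeshSketch
{
    std::string   Name;
    std::uint32_t VertexCount;
    std::uint32_t IndexCount;
};

// Computes the draw arguments for meshes concatenated, in order, into one
// vertex buffer and one index buffer.
inline std::unordered_map<std::string, SubmeshSketch>
ComputeDrawArgs(const std::vector<MeshSketch>& meshes)
{
    std::unordered_map<std::string, SubmeshSketch> drawArgs;
    std::uint32_t vertexOffset = 0;
    std::uint32_t indexOffset = 0;
    for (const MeshSketch& m : meshes)
    {
        SubmeshSketch s;
        s.IndexCount = m.IndexCount;
        s.StartIndexLocation = indexOffset;
        s.BaseVertexLocation = static_cast<std::int32_t>(vertexOffset);
        drawArgs[m.Name] = s;
        vertexOffset += m.VertexCount; // next mesh's vertices start here
        indexOffset += m.IndexCount;   // next mesh's indices start here
    }
    return drawArgs;
}
```

The indices of each mesh stay relative to that mesh's own first vertex; the BaseVertexLocation argument of DrawIndexedInstanced is added to every index at draw time, which is why the index values do not have to be rewritten when the meshes are concatenated.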
The mGeometries variable is defined as follows:
std::unordered_map<std::string, std::unique_ptr<MeshGeometry>> mGeometries;
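This pattern is used throughout the rest of the book. Creating a new variable name for every geometry, PSO, texture, and shader is cumbersome, so we use unordered maps to look up and reference objects by name in constant time. Here are some more examples: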
std::unordered_map<std::string,
std::unique_ptr<MeshGeometry>> mGeometries;
std::unordered_map<std::string, ComPtr<ID3DBlob>> mShaders;
std::unordered_map<std::string, ComPtr<ID3D12PipelineState>> mPSOs;
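5.2 Render Items
We now define the render items in the scene. Observe how all the render items share the same MeshGeometry, and how we use DrawArgs to get the DrawIndexedInstanced parameters needed to draw a subregion of the vertex/index buffers: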
// ShapesApp member variable.
std::vector<std::unique_ptr<RenderItem>> mAllRitems;
std::vector<RenderItem*> mOpaqueRitems;
void ShapesApp::BuildRenderItems()
{
auto boxRitem = std::make_unique<RenderItem>();
XMStoreFloat4x4(&boxRitem->World, XMMatrixScaling(2.0f, 2.0f, 2.0f)*XMMatrixTranslation(0.0f, 0.5f, 0.0f));
boxRitem->ObjCBIndex = 0;
boxRitem->Geo = mGeometries["shapeGeo"].get();
boxRitem->PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
boxRitem->IndexCount = boxRitem->Geo->DrawArgs["box"].IndexCount;
boxRitem->StartIndexLocation = boxRitem->Geo->DrawArgs["box"].StartIndexLocation;
boxRitem->BaseVertexLocation = boxRitem->Geo->DrawArgs["box"].BaseVertexLocation;
mAllRitems.push_back(std::move(boxRitem));
auto gridRitem = std::make_unique<RenderItem> ();
gridRitem->World = MathHelper::Identity4x4();
gridRitem->ObjCBIndex = 1;
gridRitem->Geo = mGeometries["shapeGeo"].get();
gridRitem->PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
gridRitem->IndexCount = gridRitem->Geo->DrawArgs["grid"].IndexCount;
gridRitem->StartIndexLocation = gridRitem->Geo->DrawArgs["grid"].StartIndexLocation;
gridRitem->BaseVertexLocation = gridRitem->Geo->DrawArgs["grid"].BaseVertexLocation;
mAllRitems.push_back(std::move(gridRitem));
// Build the columns and spheres in rows as in Figure 7.6.
UINT objCBIndex = 2;
for(int i = 0; i < 5; ++i)
{
auto leftCylRitem = std::make_unique<RenderItem>();
auto rightCylRitem = std::make_unique<RenderItem>();
auto leftSphereRitem = std::make_unique<RenderItem>();
auto rightSphereRitem = std::make_unique<RenderItem>();
XMMATRIX leftCylWorld = XMMatrixTranslation(-5.0f, 1.5f, -10.0f + i*5.0f);
XMMATRIX rightCylWorld = XMMatrixTranslation(+5.0f, 1.5f, -10.0f + i*5.0f);
XMMATRIX leftSphereWorld = XMMatrixTranslation(-5.0f, 3.5f, -10.0f + i*5.0f);
XMMATRIX rightSphereWorld = XMMatrixTranslation(+5.0f, 3.5f, -10.0f + i*5.0f);
XMStoreFloat4x4(&leftCylRitem->World, leftCylWorld);
leftCylRitem->ObjCBIndex = objCBIndex++;
leftCylRitem->Geo = mGeometries["shapeGeo"].get();
leftCylRitem->PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
leftCylRitem->IndexCount = leftCylRitem->Geo->DrawArgs["cylinder"].IndexCount;
leftCylRitem->StartIndexLocation = leftCylRitem->Geo->DrawArgs["cylinder"].StartIndexLocation;
leftCylRitem->BaseVertexLocation = leftCylRitem->Geo->DrawArgs["cylinder"].BaseVertexLocation;
XMStoreFloat4x4(&rightCylRitem->World, rightCylWorld);
rightCylRitem->ObjCBIndex = objCBIndex++;
rightCylRitem->Geo = mGeometries["shapeGeo"].get();
rightCylRitem->PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
rightCylRitem->IndexCount = rightCylRitem->Geo->DrawArgs["cylinder"].IndexCount;
rightCylRitem->StartIndexLocation = rightCylRitem->Geo->DrawArgs["cylinder"].StartIndexLocation;
rightCylRitem->BaseVertexLocation = rightCylRitem->Geo->DrawArgs["cylinder"].BaseVertexLocation;
XMStoreFloat4x4(&leftSphereRitem->World, leftSphereWorld);
leftSphereRitem->ObjCBIndex = objCBIndex++;
leftSphereRitem->Geo = mGeometries["shapeGeo"].get();
leftSphereRitem->PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
leftSphereRitem->IndexCount = leftSphereRitem->Geo->DrawArgs["sphere"].IndexCount;
leftSphereRitem->StartIndexLocation = leftSphereRitem->Geo->DrawArgs["sphere"].StartIndexLocation;
leftSphereRitem->BaseVertexLocation = leftSphereRitem->Geo->DrawArgs["sphere"].BaseVertexLocation;
XMStoreFloat4x4(&rightSphereRitem->World, rightSphereWorld);
rightSphereRitem->ObjCBIndex = objCBIndex++;
rightSphereRitem->Geo = mGeometries["shapeGeo"].get();
rightSphereRitem->PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
rightSphereRitem->IndexCount = rightSphereRitem->Geo->DrawArgs["sphere"].IndexCount;
rightSphereRitem->StartIndexLocation = rightSphereRitem->Geo->DrawArgs["sphere"].StartIndexLocation;
rightSphereRitem->BaseVertexLocation = rightSphereRitem->Geo->DrawArgs["sphere"].BaseVertexLocation;
mAllRitems.push_back(std::move(leftCylRitem));
mAllRitems.push_back(std::move(rightCylRitem));
mAllRitems.push_back(std::move(leftSphereRitem));
mAllRitems.push_back(std::move(rightSphereRitem));
}
// All the render items are opaque in this demo.
for(auto& e : mAllRitems)
mOpaqueRitems.push_back(e.get());
}
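5.3 Frame Resources and Constant Buffer Views
We have a vector of FrameResources, and each FrameResource has upload buffers that store the pass constants and the object constants for every render item in the scene: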
std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
std::unique_ptr<UploadBuffer<ObjectConstants>> ObjectCB = nullptr;
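If we have 3 frame resources and n render items, then we have 3n object constant buffers and 3 pass constant buffers. Hence we need 3(n+1) constant buffer views (CBVs), and we must modify our CBV heap to hold the additional descriptors: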
void ShapesApp::BuildDescriptorHeaps()
{
UINT objCount = (UINT)mOpaqueRitems.size();
// Need a CBV descriptor for each object for each frame resource,
// +1 for the perPass CBV for each frame resource.
UINT numDescriptors = (objCount+1) * gNumFrameResources;
// Save an offset to the start of the pass CBVs. These are the last 3 descriptors.
mPassCbvOffset = objCount * gNumFrameResources;
D3D12_DESCRIPTOR_HEAP_DESC cbvHeapDesc;
cbvHeapDesc.NumDescriptors = numDescriptors;
cbvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
cbvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
cbvHeapDesc.NodeMask = 0;
ThrowIfFailed(md3dDevice->CreateDescriptorHeap(&cbvHeapDesc, IID_PPV_ARGS(&mCbvHeap)));
}
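Now we can populate the CBV heap with the code below. Descriptors 0 to n-1 hold the object CBVs of frame resource 0, descriptors n to 2n-1 hold the object CBVs of frame resource 1, descriptors 2n to 3n-1 hold the object CBVs of frame resource 2, and descriptors 3n, 3n+1, and 3n+2 hold the pass CBVs of frame resources 0, 1, and 2, respectively: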
void ShapesApp::BuildConstantBufferViews()
{
UINT objCBByteSize = d3dUtil::CalcConstantBufferByteSize(sizeof(ObjectConstants));
UINT objCount = (UINT)mOpaqueRitems.size();
// Need a CBV descriptor for each object for each frame resource.
for(int frameIndex = 0; frameIndex < gNumFrameResources; ++frameIndex)
{
auto objectCB = mFrameResources[frameIndex]->ObjectCB->Resource();
for(UINT i = 0; i < objCount; ++i)
{
D3D12_GPU_VIRTUAL_ADDRESS cbAddress = objectCB->GetGPUVirtualAddress();
// Offset to the ith object constant buffer in the current buffer.
cbAddress += i*objCBByteSize;
// Offset to the object CBV in the descriptor heap.
int heapIndex = frameIndex*objCount + i;
auto handle = CD3DX12_CPU_DESCRIPTOR_HANDLE( mCbvHeap->GetCPUDescriptorHandleForHeapStart());
handle.Offset(heapIndex, mCbvSrvUavDescriptorSize);
D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc;
cbvDesc.BufferLocation = cbAddress;
cbvDesc.SizeInBytes = objCBByteSize;
md3dDevice->CreateConstantBufferView(&cbvDesc, handle);
}
}
UINT passCBByteSize = d3dUtil::CalcConstantBufferByteSize(sizeof (PassConstants));
// Last three descriptors are the pass CBVs for each frame resource.
for(int frameIndex = 0; frameIndex < gNumFrameResources; ++frameIndex)
{
auto passCB = mFrameResources[frameIndex]->PassCB->Resource();
// Pass buffer only stores one cbuffer per frame resource.
D3D12_GPU_VIRTUAL_ADDRESS cbAddress = passCB->GetGPUVirtualAddress();
// Offset to the pass cbv in the descriptor heap.
int heapIndex = mPassCbvOffset + frameIndex;
auto handle = CD3DX12_CPU_DESCRIPTOR_HANDLE(mCbvHeap->GetCPUDescriptorHandleForHeapStart());
handle.Offset(heapIndex, mCbvSrvUavDescriptorSize);
D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc;
cbvDesc.BufferLocation = cbAddress;
cbvDesc.SizeInBytes = passCBByteSize;
md3dDevice->CreateConstantBufferView(&cbvDesc, handle);
}
}
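The heap layout used by BuildConstantBufferViews boils down to two index formulas, which can be written out and checked at compile time. In the sketch below, ObjectCbvHeapIndex and PassCbvHeapIndex are hypothetical helper names; the formulas are the same heapIndex expressions used in the code above. The object count of 22 matches the Shapes demo (1 box, 1 grid, and 5 rows of 2 cylinders and 2 spheres).

```cpp
#include <cstdint>

// Heap slot of the CBV for object objIndex in frame resource frameIndex.
constexpr std::uint32_t ObjectCbvHeapIndex(
    std::uint32_t frameIndex, std::uint32_t objIndex, std::uint32_t objCount)
{
    return frameIndex * objCount + objIndex;
}

// Heap slot of the pass CBV for frame resource frameIndex. The pass CBVs
// come after all object CBVs of all frame resources.
constexpr std::uint32_t PassCbvHeapIndex(
    std::uint32_t frameIndex, std::uint32_t objCount, std::uint32_t numFrameResources)
{
    return objCount * numFrameResources + frameIndex;
}

// With n = 22 objects and 3 frame resources the heap has 3 * (22 + 1) = 69 slots.
static_assert(ObjectCbvHeapIndex(0, 0, 22) == 0,   "first object, first frame resource");
static_assert(ObjectCbvHeapIndex(1, 0, 22) == 22,  "frame resource 1 starts at n");
static_assert(ObjectCbvHeapIndex(2, 21, 22) == 65, "last object CBV is 3n - 1");
static_assert(PassCbvHeapIndex(0, 22, 3) == 66,    "pass CBVs start at 3n");
static_assert(PassCbvHeapIndex(2, 22, 3) == 68,    "last slot is 3n + 2");
```

The same two formulas are needed again at draw time, when the command list has to bind the object and pass CBVs for the current frame resource.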
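We can get a handle to the first descriptor in a heap with the ID3D12DescriptorHeap::GetCPUDescriptorHandleForHeapStart method. However, now that our heap holds more than one descriptor, that method alone is no longer enough: we need to be able to offset to the other descriptors in the heap. To do this, we need to know how far to step to reach the next descriptor. This increment size is hardware specific, so we have to query it from the device, and it depends on the heap type: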
mRtvDescriptorSize = md3dDevice->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);
mDsvDescriptorSize = md3dDevice->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_DSV);
mCbvSrvUavDescriptorSize = md3dDevice->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
Once we know the descriptor increment size, we can use one of the two CD3DX12_CPU_DESCRIPTOR_HANDLE::Offset overloads to offset to the target descriptor:
// Offset by n descriptors by passing the byte offset directly: the
// number of descriptors times the descriptor increment size.
CD3DX12_CPU_DESCRIPTOR_HANDLE handle(mCbvHeap->GetCPUDescriptorHandleForHeapStart());
handle.Offset(n * mCbvSrvUavDescriptorSize);
// Or equivalently, specify the number of descriptors to offset,
// followed by the descriptor increment size:
CD3DX12_CPU_DESCRIPTOR_HANDLE handle(mCbvHeap->GetCPUDescriptorHandleForHeapStart());
handle.Offset(n, mCbvSrvUavDescriptorSize);
CD3DX12_GPU_DESCRIPTOR_HANDLE has the same Offset methods.
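Both Offset overloads are plain pointer arithmetic on the handle's address. The sketch below models that with a hypothetical HandleSketch type; it is not the d3dx12.h class, and the increment size of 32 bytes used in the usage note is a made-up value, since the real one must be queried from the device.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a descriptor handle: just an address.
struct HandleSketch
{
    std::size_t ptr = 0;

    // Overload 1: the caller has already multiplied the descriptor count
    // by the increment size and passes the byte offset.
    HandleSketch& Offset(std::int32_t offsetScaledByIncrementSize)
    {
        ptr = static_cast<std::size_t>(
            static_cast<std::int64_t>(ptr) + offsetScaledByIncrementSize);
        return *this;
    }

    // Overload 2: the caller passes the descriptor count and the increment
    // size separately; the multiplication happens here.
    HandleSketch& Offset(std::int32_t offsetInDescriptors, std::uint32_t descriptorIncrementSize)
    {
        ptr = static_cast<std::size_t>(
            static_cast<std::int64_t>(ptr) +
            static_cast<std::int64_t>(offsetInDescriptors) *
            static_cast<std::int64_t>(descriptorIncrementSize));
        return *this;
    }
};
```

Starting from address 1000 with an assumed increment of 32 bytes, Offset(5 * 32) and Offset(5, 32) both land on address 1160, the sixth descriptor in the heap.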
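5.4 Drawing the Scene
At last we can draw our render items. Perhaps the only slightly tricky part is offsetting to the correct CBV in the heap for the object we want to draw: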
void ShapesApp::DrawRenderItems(ID3D12GraphicsCommandList* cmdList, const std::vector<RenderItem*>& ritems)
{
UINT objCBByteSize = d3dUtil::CalcConstantBufferByteSize(sizeof(ObjectConstants));
auto objectCB = mCurrFrameResource->ObjectCB->Resource();
// For each render item…
for(size_t i = 0; i < ritems.size(); ++i)
{
auto ri = ritems[i];
cmdList->IASetVertexBuffers(0, 1, &ri->Geo->VertexBufferView());
cmdList->IASetIndexBuffer(&ri->Geo->IndexBufferView());
cmdList->IASetPrimitiveTopology(ri->PrimitiveType);
// Offset to the CBV in the descriptor heap for this object and
// for this frame resource.
UINT cbvIndex = mCurrFrameResourceIndex* (UINT)mOpaqueRitems.size() + ri->ObjCBIndex;
auto cbvHandle = CD3DX12_GPU_DESCRIPTOR_HANDLE(mCbvHeap->GetGPUDescriptorHandleForHeapStart());
cbvHandle.Offset(cbvIndex, mCbvSrvUavDescriptorSize);
cmdList->SetGraphicsRootDescriptorTable(0, cbvHandle);
cmdList->DrawIndexedInstanced(ri->IndexCount,
1,
ri->StartIndexLocation, ri->BaseVertexLocation, 0);
}
}
The DrawRenderItems method is called from the main Draw method:
void ShapesApp::Draw(const GameTimer& gt)
{
auto cmdListAlloc = mCurrFrameResource->CmdListAlloc;
// Reuse the memory associated with command recording.
// We can only reset when the associated command lists have
// finished execution on the GPU.
ThrowIfFailed(cmdListAlloc->Reset());
// A command list can be reset after it has been added to the
// command queue via ExecuteCommandList.
// Reusing the command list reuses memory.
if(mIsWireframe)
{
ThrowIfFailed(mCommandList->Reset(cmdListAlloc.Get(), mPSOs["opaque_wireframe"].Get()));
}
else
{
ThrowIfFailed(mCommandList->Reset(cmdListAlloc.Get(), mPSOs["opaque"].Get()));
}
mCommandList->RSSetViewports(1, &mScreenViewport);
mCommandList->RSSetScissorRects(1, &mScissorRect);
// Indicate a state transition on the resource usage.
mCommandList->ResourceBarrier(1,
&CD3DX12_RESOURCE_BARRIER::Transition(CurrentBackBuffer(),
D3D12_RESOURCE_STATE_PRESENT,
D3D12_RESOURCE_STATE_RENDER_TARGET));
// Clear the back buffer and depth buffer.
mCommandList->ClearRenderTargetView(CurrentBackBufferView(),Colors::LightSteelBlue, 0, nullptr);
mCommandList->ClearDepthStencilView(DepthStencilView(),
D3D12_CLEAR_FLAG_DEPTH |
D3D12_CLEAR_FLAG_STENCIL,
1.0f, 0, 0, nullptr);
// Specify the buffers we are going to render to.
mCommandList->OMSetRenderTargets(1, &CurrentBackBufferView(), true, &DepthStencilView());
ID3D12DescriptorHeap* descriptorHeaps[] = { mCbvHeap.Get() };
mCommandList->SetDescriptorHeaps(_countof(descriptorHeaps), descriptorHeaps);
mCommandList->SetGraphicsRootSignature(mRootSignature.Get());
int passCbvIndex = mPassCbvOffset + mCurrFrameResourceIndex;
auto passCbvHandle = CD3DX12_GPU_DESCRIPTOR_HANDLE(mCbvHeap->GetGPUDescriptorHandleForHeapStart());
passCbvHandle.Offset(passCbvIndex, mCbvSrvUavDescriptorSize);
mCommandList->SetGraphicsRootDescriptorTable(1, passCbvHandle);
DrawRenderItems(mCommandList.Get(), mOpaqueRitems);
// Indicate a state transition on the resource usage.
mCommandList->ResourceBarrier(1,
&CD3DX12_RESOURCE_BARRIER::Transition(CurrentBackBuffer(),
D3D12_RESOURCE_STATE_RENDER_TARGET,
D3D12_RESOURCE_STATE_PRESENT));
// Done recording commands.
ThrowIfFailed(mCommandList->Close());
// Add the command list to the queue for execution.
ID3D12CommandList* cmdsLists[] = { mCommandList.Get() };
mCommandQueue->ExecuteCommandLists(_countof(cmdsLists), cmdsLists);
// Swap the back and front buffers
ThrowIfFailed(mSwapChain->Present(0, 0));
mCurrBackBuffer = (mCurrBackBuffer + 1) % SwapChainBufferCount;
// Advance the fence value to mark commands up to this fence point.
mCurrFrameResource->Fence = ++mCurrentFence;
// Add an instruction to the command queue to set a new fence point.
// Because we are on the GPU timeline, the new fence point won’t be
// set until the GPU finishes processing all the commands prior to this Signal().
mCommandQueue->Signal(mFence.Get(),
mCurrentFence);
}
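The fence value stored at the end of Draw is what later tells the CPU whether a frame resource may be reused. The following is a small CPU-only simulation of that bookkeeping, with no Direct3D calls. FrameSlot and FrameRing are hypothetical names introduced for this sketch, and the GPU's progress is passed in as a plain number instead of being read from a fence object.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumFrameResources = 3;

// Hypothetical stand-in for FrameResource: only the fence value matters here.
struct FrameSlot
{
    std::uint64_t fence = 0; // 0 means the slot has never been submitted
};

class FrameRing
{
public:
    // Start of a frame: cycle to the next slot in the ring.
    void Advance() { mCurrIndex = (mCurrIndex + 1) % kNumFrameResources; }

    // True if the GPU has not yet finished the commands that last used
    // the current slot, so the CPU must wait before overwriting it.
    bool MustWait(std::uint64_t gpuCompletedFence) const
    {
        const FrameSlot& slot = mSlots[mCurrIndex];
        return slot.fence != 0 && gpuCompletedFence < slot.fence;
    }

    // End of a frame: tag the slot with a new fence value, which is the
    // value that would be passed to Signal().
    std::uint64_t Submit()
    {
        mSlots[mCurrIndex].fence = ++mCurrentFence;
        return mCurrentFence;
    }

    std::size_t CurrentIndex() const { return mCurrIndex; }

private:
    std::array<FrameSlot, kNumFrameResources> mSlots{};
    std::size_t mCurrIndex = 0;
    std::uint64_t mCurrentFence = 0;
};
```

With three slots, the CPU can submit three frames without waiting. It only has to wait on the fourth frame if the GPU has not yet reached the fence value recorded for the slot being reused.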
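6 More on Root Signatures
We introduced root signatures earlier. A root signature defines which resources must be bound to the rendering pipeline before a draw call is issued, and how those resources map to shader input registers. When a PSO is created, the combination of root signature and shader programs is validated.
6.1 Root Parameters
A root signature is defined by an array of root parameters. So far we have only created a root parameter that stores a descriptor table. However, a root parameter can be one of three types:
- Descriptor table: references a contiguous range of descriptors in a heap that identify the resources to bind;
- Root descriptor (inline descriptor): a descriptor set directly that identifies the resource to bind; it does not need to live in a heap. Only CBVs to constant buffers and SRVs/UAVs to buffers can be bound as root descriptors, so SRVs to textures cannot.
- Root constant: a list of 32-bit constant values bound directly.
For performance, a root signature is limited to 64 DWORDs. Each type of root parameter costs the following: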
- Descriptor Table: 1 DWORD
- Root Descriptor: 2 DWORDs
- Root Constant: 1 DWORD per 32-bit constant
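We can create any root signature as long as it stays within the 64 DWORD limit. Root constants are very convenient, but their cost adds up quickly. For example, if the only constant data we need is a world-view-projection matrix, we can store it in 16 root constants and skip the constant buffer and CBV heap entirely; however, that uses up a quarter of the root signature budget, whereas a root descriptor costs only two DWORDs and a descriptor table only one. A real game application will probably combine all three types of root parameters.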
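The budget arithmetic can be checked with a few lines of code. This is a sketch only: RootSignatureLayout and the two functions are hypothetical helpers written for this note, and the per-type costs are the ones listed above.

```cpp
#include <cstdint>

constexpr std::uint32_t kMaxRootSignatureDwords = 64;

// Hypothetical summary of a root signature: how many parameters of each
// kind it contains.
struct RootSignatureLayout
{
    std::uint32_t descriptorTables;    // 1 DWORD each
    std::uint32_t rootDescriptors;     // 2 DWORDs each
    std::uint32_t rootConstants32Bit;  // 1 DWORD per 32-bit constant
};

constexpr std::uint32_t RootSignatureCostInDwords(const RootSignatureLayout& layout)
{
    return layout.descriptorTables * 1u +
           layout.rootDescriptors * 2u +
           layout.rootConstants32Bit * 1u;
}

constexpr bool FitsInRootSignature(const RootSignatureLayout& layout)
{
    return RootSignatureCostInDwords(layout) <= kMaxRootSignatureDwords;
}
```

A 4x4 matrix stored as root constants costs 16 DWORDs, a quarter of the limit. One descriptor table plus three root descriptors costs only 7 DWORDs.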
In code, a root parameter is described by filling out a CD3DX12_ROOT_PARAMETER structure, which extends D3D12_ROOT_PARAMETER:
typedef struct D3D12_ROOT_PARAMETER
{
D3D12_ROOT_PARAMETER_TYPE ParameterType;
union
{
D3D12_ROOT_DESCRIPTOR_TABLE DescriptorTable;
D3D12_ROOT_CONSTANTS Constants;
D3D12_ROOT_DESCRIPTOR Descriptor;
};
D3D12_SHADER_VISIBILITY ShaderVisibility;
}D3D12_ROOT_PARAMETER;
- ParameterType: a member of the following enumerated type that specifies the root parameter type:
enum D3D12_ROOT_PARAMETER_TYPE
{
D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE = 0,
D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS= 1,
D3D12_ROOT_PARAMETER_TYPE_CBV = 2,
D3D12_ROOT_PARAMETER_TYPE_SRV = 3 ,
D3D12_ROOT_PARAMETER_TYPE_UAV = 4
} D3D12_ROOT_PARAMETER_TYPE;
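- DescriptorTable/Constants/Descriptor: the structure describing the root parameter; which union member to fill out depends on the parameter type;
- ShaderVisibility: a member of the following enumeration that specifies which shader programs can see this root parameter. In this book we usually specify D3D12_SHADER_VISIBILITY_ALL. If a resource is only used in the pixel shader, however, we can specify D3D12_SHADER_VISIBILITY_PIXEL. Limiting the visibility of a root parameter may improve performance: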
enum D3D12_SHADER_VISIBILITY
{
D3D12_SHADER_VISIBILITY_ALL = 0,
D3D12_SHADER_VISIBILITY_VERTEX = 1,
D3D12_SHADER_VISIBILITY_HULL = 2,
D3D12_SHADER_VISIBILITY_DOMAIN = 3,
D3D12_SHADER_VISIBILITY_GEOMETRY = 4,
D3D12_SHADER_VISIBILITY_PIXEL = 5
} D3D12_SHADER_VISIBILITY;
6.2 Descriptor Tables
A descriptor table root parameter is defined by filling out the DescriptorTable member of D3D12_ROOT_PARAMETER:
typedef struct D3D12_ROOT_DESCRIPTOR_TABLE
{
UINT NumDescriptorRanges;
const D3D12_DESCRIPTOR_RANGE *pDescriptorRanges;
} D3D12_ROOT_DESCRIPTOR_TABLE;
It simply specifies an array of D3D12_DESCRIPTOR_RANGEs and the number of ranges in that array.
The D3D12_DESCRIPTOR_RANGE structure is defined as follows:
typedef struct D3D12_DESCRIPTOR_RANGE
{
D3D12_DESCRIPTOR_RANGE_TYPE RangeType;
UINT NumDescriptors;
UINT BaseShaderRegister;
UINT RegisterSpace;
UINT OffsetInDescriptorsFromTableStart;
} D3D12_DESCRIPTOR_RANGE;
- RangeType: a member of the following enumerated type that specifies the type of descriptors in this range:
enum D3D12_DESCRIPTOR_RANGE_TYPE
{
D3D12_DESCRIPTOR_RANGE_TYPE_SRV = 0,
D3D12_DESCRIPTOR_RANGE_TYPE_UAV = 1,
D3D12_DESCRIPTOR_RANGE_TYPE_CBV = 2 ,
D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER = 3
} D3D12_DESCRIPTOR_RANGE_TYPE;
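- NumDescriptors: the number of descriptors in the range;
- BaseShaderRegister: the base shader register the range is bound to. For example, if you set NumDescriptors to 3, BaseShaderRegister to 1, and the range type is CBV, the range maps to these HLSL registers: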
cbuffer cbA : register(b1) {…};
cbuffer cbB : register(b2) {…};
cbuffer cbC : register(b3) {…};
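- RegisterSpace: this property gives you another dimension for specifying shader registers. For example, the following two registers appear to overlap, but they are distinct because they live in different spaces: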
Texture2D gDiffuseMap : register(t0, space0);
Texture2D gNormalMap : register(t0, space1);
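If a shader does not specify a register space, it defaults to space0. We usually use space0, but extra spaces are useful for arrays of resources, and they are necessary when the size of an array is unknown.
- OffsetInDescriptorsFromTableStart: the offset of this descriptor range from the start of the table.
A slot's root parameter is initialized as a descriptor table from an array of D3D12_DESCRIPTOR_RANGE instances because one table can mix several types of descriptors. Suppose we define a table of six descriptors from three ranges, in order: two CBVs, three SRVs, and one UAV. The table would be defined like so: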
// Create a table with 2 CBVs, 3 SRVs and 1 UAV.
CD3DX12_DESCRIPTOR_RANGE descRange[3];
descRange[0].Init(
D3D12_DESCRIPTOR_RANGE_TYPE_CBV, // descriptor type
2, // descriptor count
0, // base shader register arguments are bound to for this root
// parameter
0, // register space
0);// offset from start of table
descRange[1].Init(
D3D12_DESCRIPTOR_RANGE_TYPE_SRV, // descriptor type
3, // descriptor count
0, // base shader register arguments are bound to for this root
// parameter
0, // register space
2);// offset from start of table
descRange[2].Init(
D3D12_DESCRIPTOR_RANGE_TYPE_UAV, // descriptor type
1, // descriptor count
0, // base shader register arguments are bound to for this root
// parameter
0, // register space
5);// offset from start of table
slotRootParameter[0].InitAsDescriptorTable(3, descRange, D3D12_SHADER_VISIBILITY_ALL);
CD3DX12_DESCRIPTOR_RANGE is a structure that extends D3D12_DESCRIPTOR_RANGE; we use its initialization method:
void CD3DX12_DESCRIPTOR_RANGE::Init(
D3D12_DESCRIPTOR_RANGE_TYPE rangeType,
UINT numDescriptors,
UINT baseShaderRegister,
UINT registerSpace = 0,
UINT offsetInDescriptorsFromTableStart = D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND);
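The table covers six descriptors. Each range starts at register 0, yet there is no overlap, because CBVs, SRVs, and UAVs use different register types (b, t, and u). Specifying D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND lets Direct3D compute offsetInDescriptorsFromTableStart for us: the offset is derived from the descriptor counts of the preceding ranges in the table. Note that CD3DX12_DESCRIPTOR_RANGE::Init defaults the register space to 0 and OffsetInDescriptorsFromTableStart to D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND.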
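To illustrate how appended offsets resolve, here is a small sketch that walks the ranges of a table and computes each range's offset from the table start. Range and ResolveOffsets are hypothetical names used only for illustration, and kOffsetAppend is assumed to mirror the value of D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND (0xffffffff).

```cpp
#include <cstdint>
#include <vector>

// Assumed to mirror D3D12_DESCRIPTOR_RANGE_OFFSET_APPEND.
constexpr std::uint32_t kOffsetAppend = 0xffffffffu;

// Hypothetical reduced form of D3D12_DESCRIPTOR_RANGE: only the fields
// needed to resolve offsets.
struct Range
{
    std::uint32_t numDescriptors;
    std::uint32_t offsetFromTableStart; // explicit offset or kOffsetAppend
};

// Returns the resolved offset of every range. An appended range starts
// immediately after the previous range in the table.
std::vector<std::uint32_t> ResolveOffsets(const std::vector<Range>& ranges)
{
    std::vector<std::uint32_t> resolved;
    resolved.reserve(ranges.size());
    std::uint32_t next = 0;
    for (const Range& r : ranges)
    {
        const std::uint32_t offset =
            (r.offsetFromTableStart == kOffsetAppend) ? next : r.offsetFromTableStart;
        resolved.push_back(offset);
        next = offset + r.numDescriptors;
    }
    return resolved;
}
```

For the table above (2 CBVs, 3 SRVs, 1 UAV), appending every range yields the same offsets that were written out by hand: 0, 2, and 5.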
6.3 Root Descriptors
A root descriptor root parameter is defined by further filling out the Descriptor member of D3D12_ROOT_PARAMETER:
typedef struct D3D12_ROOT_DESCRIPTOR
{
UINT ShaderRegister;
UINT RegisterSpace;
}D3D12_ROOT_DESCRIPTOR;
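- ShaderRegister: the shader register the descriptor is bound to.
- RegisterSpace: see D3D12_DESCRIPTOR_RANGE::RegisterSpace above.
Unlike a descriptor table, a root descriptor does not need a handle into a descriptor heap; we simply bind the virtual address of the resource directly: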
UINT objCBByteSize = d3dUtil::CalcConstantBufferByteSize(sizeof(ObjectConstants));
D3D12_GPU_VIRTUAL_ADDRESS objCBAddress = objectCB->GetGPUVirtualAddress();
// Offset to the constants for this object in the buffer.
objCBAddress += ri->ObjCBIndex*objCBByteSize;
cmdList->SetGraphicsRootConstantBufferView(
0, // root parameter index
objCBAddress);
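The address computation above can be reproduced without any Direct3D types. In this sketch, CalcConstantBufferByteSize re-implements the 256-byte rounding that the book's d3dUtil helper performs, and ObjectConstantsAddress is a hypothetical helper name. The base address used in the example afterwards is an arbitrary assumed value.

```cpp
#include <cstdint>

// Constant buffer elements must be a multiple of 256 bytes, so round the
// size up to the next multiple of 256.
constexpr std::uint32_t CalcConstantBufferByteSize(std::uint32_t byteSize)
{
    return (byteSize + 255u) & ~255u;
}

// GPU virtual address of the i-th element in a buffer of padded elements.
constexpr std::uint64_t ObjectConstantsAddress(std::uint64_t bufferBaseAddress,
                                               std::uint32_t objCBIndex,
                                               std::uint32_t elementByteSize)
{
    return bufferBaseAddress +
           static_cast<std::uint64_t>(objCBIndex) * elementByteSize;
}
```

For instance, a 64-byte ObjectConstants (a single 4x4 float matrix) is padded to 256 bytes, so with an assumed base address of 0x10000 the element at index 3 starts at 0x10300.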
6.4 Root Constants
A root constants root parameter is defined by further filling out the Constants member of D3D12_ROOT_PARAMETER:
typedef struct D3D12_ROOT_CONSTANTS
{
UINT ShaderRegister;
UINT RegisterSpace;
UINT Num32BitValues;
} D3D12_ROOT_CONSTANTS;
The following example shows how root constants are used:
// Application code: Root signature definition.
CD3DX12_ROOT_PARAMETER slotRootParameter[1];
slotRootParameter[0].InitAsConstants(12, 0);
// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(1,
slotRootParameter,
0, nullptr,
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);
// Application code: to set the constants to register b0.
auto weights = CalcGaussWeights(2.5f);
int blurRadius = (int)weights.size() / 2;
cmdList->SetGraphicsRoot32BitConstants(0, 1, &blurRadius, 0);
cmdList->SetGraphicsRoot32BitConstants(0, (UINT)weights.size(), weights.data(), 1);
// HLSL code.
cbuffer cbSettings : register(b0)
{
// We cannot have an array entry in a constant buffer that gets
// mapped onto root constants, so list each element.
int gBlurRadius;
// Support up to 11 blur weights.
float w0;
float w1;
float w2;
float w3;
float w4;
float w5;
float w6;
float w7;
float w8;
float w9;
float w10;
};
ID3D12GraphicsCommandList::SetGraphicsRoot32BitConstants has the following prototype:
void ID3D12GraphicsCommandList::SetGraphicsRoot32BitConstants(
UINT RootParameterIndex,
UINT Num32BitValuesToSet,
const void *pSrcData,
UINT DestOffsetIn32BitValues);
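To make the role of Num32BitValuesToSet and DestOffsetIn32BitValues concrete, the sketch below models the 12 root constants as a plain array of 32-bit slots. RootConstantSlot is a hypothetical stand-in, not a Direct3D type. Its Set method keeps the argument order of the prototype above minus the root parameter index, and the bounds check is an addition of this sketch.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the 12 root constants mapped to register b0.
struct RootConstantSlot
{
    std::uint32_t values[12];

    // Copies num32BitValuesToSet 32-bit values from pSrcData into the
    // slot, starting at destOffsetIn32BitValues. Returns false instead of
    // writing past the end of the slot.
    bool Set(std::uint32_t num32BitValuesToSet,
             const void* pSrcData,
             std::uint32_t destOffsetIn32BitValues)
    {
        if (destOffsetIn32BitValues + num32BitValuesToSet > 12)
            return false;
        std::memcpy(values + destOffsetIn32BitValues, pSrcData,
                    num32BitValuesToSet * sizeof(std::uint32_t));
        return true;
    }
};
```

The blur radius goes to offset 0 and the weights follow from offset 1, which is why the HLSL cbuffer declares gBlurRadius first and w0 to w10 after it.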
6.5 A More Complicated Root Signature Example
Suppose a shader program expects the following resources:
Texture2D gDiffuseMap : register(t0);
cbuffer cbPerObject : register(b0)
{
float4x4 gWorld;
float4x4 gTexTransform;
};
cbuffer cbPass : register(b1)
{
float4x4 gView;
float4x4 gInvView;
float4x4 gProj;
float4x4 gInvProj;
float4x4 gViewProj;
float4x4 gInvViewProj;
float3 gEyePosW;
float cbPerObjectPad1;
float2 gRenderTargetSize;
float2 gInvRenderTargetSize;
float gNearZ;
float gFarZ;
float gTotalTime;
float gDeltaTime;
float4 gAmbientLight;
Light gLights[MaxLights];
};
cbuffer cbMaterial : register(b2)
{
float4 gDiffuseAlbedo;
float3 gFresnelR0;
float gRoughness;
float4x4 gMatTransform;
};
The root signature for this shader would then be described as follows:
CD3DX12_DESCRIPTOR_RANGE texTable;
texTable.Init(
D3D12_DESCRIPTOR_RANGE_TYPE_SRV,
1, // number of descriptors
0); // register t0
// Root parameter can be a table, root descriptor or root constants.
CD3DX12_ROOT_PARAMETER slotRootParameter[4];
// Performance TIP: Order from most frequent to least frequent.
slotRootParameter[0].InitAsDescriptorTable(1, &texTable, D3D12_SHADER_VISIBILITY_PIXEL);
slotRootParameter[1].InitAsConstantBufferView(0); // register b0
slotRootParameter[2].InitAsConstantBufferView(1); // register b1
slotRootParameter[3].InitAsConstantBufferView(2); // register b2
// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(4,
slotRootParameter,
0, nullptr,
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);
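6.6 Root Parameter Versioning
Root arguments are the actual values we pass for root parameters. Consider the following code, where the root arguments change with every draw call: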
for(size_t i = 0; i < mRitems.size(); ++i)
{
const auto& ri = mRitems[i];
…
// Offset to the CBV for this frame and this render item.
int cbvOffset = mCurrFrameResourceIndex*(int)mRitems.size();
cbvOffset += ri.CbIndex;
cbvHandle.Offset(cbvOffset, mCbvSrvDescriptorSize);
// Identify descriptors to use for this draw call.
cmdList->SetGraphicsRootDescriptorTable(0, cbvHandle);
cmdList->DrawIndexedInstanced(
ri.IndexCount, 1,
ri.StartIndexLocation,
ri.BaseVertexLocation, 0);
}
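Each draw call executes with the root arguments that are set at the time of that call. This works because the hardware automatically saves a snapshot of the current root arguments for each draw call. In other words, the root arguments are automatically versioned per draw call.
A root signature can provide more fields than a shader uses.
For performance, we should keep the root signature as small as possible. One reason is the automatic versioning of root arguments per draw call: the larger the root signature, the larger each snapshot. Furthermore, the SDK documentation advises ordering root parameters by how frequently they change (most frequent first) and minimizing root signature switches. Sharing one root signature across several PSOs is therefore a good idea: a "super" root signature shared by multiple shader programs can improve performance even if some shaders do not use all of its parameters. However, if this "super" root signature grows too large, its size cancels out the benefit of switching less often.
7 Land and Waves Demo
The graph of a real-valued function y = f(x, z) is a surface. We can approximate it by building a grid in the xz-plane (each quad made of two triangles) and then applying the function to each grid vertex.
7.1 Generating the Grid Vertices
An m × n grid of vertices contains (m - 1) × (n - 1) quads, and each quad consists of two triangles, for a total of 2 (m - 1) × (n - 1) triangles. If the grid has width w and depth d, the quad spacing along the x-axis is dx = w/(n - 1) and along the z-axis is dz = d/(m - 1). Starting from the upper-left corner, the ijth vertex (row i, column j) has coordinates:
v_ij = (-0.5w + j·dx, 0.0, 0.5d - i·dz)
The code that generates the vertices is as follows: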
GeometryGenerator::MeshData GeometryGenerator::CreateGrid(float width, float depth, uint32 m, uint32 n)
{
MeshData meshData;
uint32 vertexCount = m*n;
uint32 faceCount = (m-1)*(n-1)*2;
float halfWidth = 0.5f*width;
float halfDepth = 0.5f*depth;
float dx = width / (n-1);
float dz = depth / (m-1);
float du = 1.0f / (n-1);
float dv = 1.0f / (m-1);
meshData.Vertices.resize(vertexCount);
for(uint32 i = 0; i < m; ++i)
{
float z = halfDepth - i*dz;
for(uint32 j = 0; j < n; ++j)
{
float x = -halfWidth + j*dx;
meshData.Vertices[i*n+j].Position = XMFLOAT3(x, 0.0f, z);
meshData.Vertices[i*n+j].Normal = XMFLOAT3(0.0f, 1.0f, 0.0f);
meshData.Vertices[i*n+j].TangentU = XMFLOAT3(1.0f, 0.0f, 0.0f);
// Stretch texture over grid.
meshData.Vertices[i*n+j].TexC.x = j*du;
meshData.Vertices[i*n+j].TexC.y = i*dv;
}
}
// CreateGrid continues in 7.2, which fills in the indices and returns meshData.
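7.2 Generating the Grid Indices
Starting from the upper-left corner, we iterate over each quad and compute the indices of its two triangles:
The code that generates the indices is as follows: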
meshData.Indices32.resize(faceCount*3); // 3 indices per face
// Iterate over each quad and compute indices.
uint32 k = 0;
for(uint32 i = 0; i < m-1; ++i)
{
for(uint32 j = 0; j < n-1; ++j)
{
meshData.Indices32[k] = i*n+j;
meshData.Indices32[k+1] = i*n+j+1;
meshData.Indices32[k+2] = (i+1)*n+j;
meshData.Indices32[k+3] = (i+1)*n+j;
meshData.Indices32[k+4] = i*n+j+1;
meshData.Indices32[k+5] = (i+1)*n+j+1;
k += 6; // next quad
}
}
return meshData;
}
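The vertex and index steps from 7.1 and 7.2 can be combined into one self-contained function that compiles without the book's framework. This is a reduced sketch: GridVertex, GridMesh, and MakeGrid are hypothetical names, only positions are generated (no normals, tangents, or texture coordinates), and the early return for m < 2 or n < 2 is an addition of this sketch.

```cpp
#include <cstdint>
#include <vector>

struct GridVertex { float x, y, z; };

struct GridMesh
{
    std::vector<GridVertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Builds an m x n grid of vertices in the xz-plane, centered at the origin.
GridMesh MakeGrid(float width, float depth, std::uint32_t m, std::uint32_t n)
{
    GridMesh mesh;
    if (m < 2 || n < 2)
        return mesh; // a grid needs at least one quad

    const float dx = width / (n - 1);
    const float dz = depth / (m - 1);

    // Vertices: row i runs along -z, column j runs along +x.
    mesh.vertices.reserve(static_cast<std::size_t>(m) * n);
    for (std::uint32_t i = 0; i < m; ++i)
    {
        const float z = 0.5f * depth - i * dz;
        for (std::uint32_t j = 0; j < n; ++j)
        {
            const float x = -0.5f * width + j * dx;
            mesh.vertices.push_back({x, 0.0f, z});
        }
    }

    // Indices: two triangles (six indices) per quad.
    mesh.indices.reserve(static_cast<std::size_t>(m - 1) * (n - 1) * 6);
    for (std::uint32_t i = 0; i < m - 1; ++i)
    {
        for (std::uint32_t j = 0; j < n - 1; ++j)
        {
            mesh.indices.push_back(i * n + j);
            mesh.indices.push_back(i * n + j + 1);
            mesh.indices.push_back((i + 1) * n + j);
            mesh.indices.push_back((i + 1) * n + j);
            mesh.indices.push_back(i * n + j + 1);
            mesh.indices.push_back((i + 1) * n + j + 1);
        }
    }
    return mesh;
}
```

A 3 × 5 grid, for example, has 15 vertices and 2 × 4 = 8 quads, so 16 triangles and 48 indices.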
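7.3 Applying the Height Function
After creating the grid, we can extract the vertex data from MeshData, turn the flat grid into a surface of varying height that represents hills, and then color each vertex based on its height: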
// Not to be confused with GeometryGenerator::Vertex.
struct Vertex
{
XMFLOAT3 Pos;
XMFLOAT4 Color;
};
void LandAndWavesApp::BuildLandGeometry()
{
GeometryGenerator geoGen;
GeometryGenerator::MeshData grid = geoGen.CreateGrid(160.0f, 160.0f, 50, 50);
//
// Extract the vertex elements we are interested and apply the height
// function to each vertex. In addition, color the vertices based on
// their height so we have sandy looking beaches, grassy low hills,
// and snow mountain peaks.
//
std::vector<Vertex> vertices(grid.Vertices.size());
for(size_t i = 0; i < grid.Vertices.size(); ++i)
{
auto& p = grid.Vertices[i].Position;
vertices[i].Pos = p;
vertices[i].Pos.y = GetHillsHeight(p.x, p.z);
// Color the vertex based on its height.
if(vertices[i].Pos.y < -10.0f)
{
// Sandy beach color.
vertices[i].Color = XMFLOAT4(1.0f, 0.96f, 0.62f, 1.0f);
}
else if(vertices[i].Pos.y < 5.0f)
{
// Light yellow-green.
vertices[i].Color = XMFLOAT4(0.48f, 0.77f, 0.46f, 1.0f);
}
else if(vertices[i].Pos.y < 12.0f)
{
// Dark yellow-green.
vertices[i].Color = XMFLOAT4(0.1f, 0.48f, 0.19f, 1.0f);
}
else if(vertices[i].Pos.y < 20.0f)
{
// Dark brown.
vertices[i].Color = XMFLOAT4(0.45f, 0.39f, 0.34f, 1.0f);
}
else
{
// White snow.
vertices[i].Color = XMFLOAT4(1.0f, 1.0f, 1.0f, 1.0f);
}
}
const UINT vbByteSize = (UINT)vertices.size() * sizeof(Vertex);
std::vector<std::uint16_t> indices = grid.GetIndices16();
const UINT ibByteSize = (UINT)indices.size() * sizeof(std::uint16_t);
auto geo = std::make_unique<MeshGeometry>();
geo->Name = "landGeo";
ThrowIfFailed(D3DCreateBlob(vbByteSize, &geo->VertexBufferCPU));
CopyMemory(geo->VertexBufferCPU->GetBufferPointer(), vertices.data(), vbByteSize);
ThrowIfFailed(D3DCreateBlob(ibByteSize, &geo->IndexBufferCPU));
CopyMemory(geo->IndexBufferCPU->GetBufferPointer(), indices.data(), ibByteSize);
geo->VertexBufferGPU = d3dUtil::CreateDefaultBuffer(md3dDevice.Get(),
mCommandList.Get(), vertices.data(),
vbByteSize, geo->VertexBufferUploader);
geo->IndexBufferGPU = d3dUtil::CreateDefaultBuffer(md3dDevice.Get(),
mCommandList.Get(), indices.data(),
ibByteSize, geo->IndexBufferUploader);
geo->VertexByteStride = sizeof(Vertex);
geo->VertexBufferByteSize = vbByteSize