
系統技術非業餘研究 » Investigating CPU Topology

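When writing multicore programs (Erlang programs, for example), we need to understand the CPU topology: how logical CPUs map to physical CPUs, and what the CPU's internal hardware parameters are, such as the L1 and L2 cache sizes.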

On Linux, /proc/cpuinfo provides some of this information, but it is incomplete. /sys/devices/system/cpu/ also exposes the topology, but it is hard to interpret.
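As a rough illustration of what /sys/devices/system/cpu/ contains, here is a minimal Python sketch that groups logical CPUs by physical package and core. It assumes a Linux kernel that exposes the standard topology/physical_package_id and topology/core_id files. The function name read_topology and its sysfs_root parameter are made up for this example.

```python
from collections import defaultdict
from pathlib import Path


def read_topology(sysfs_root="/sys/devices/system/cpu"):
    """Group logical CPUs by (physical package, core) using sysfs topology files."""
    topology = defaultdict(list)
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not cpufreq or cpuidle.
    for cpu_dir in Path(sysfs_root).glob("cpu[0-9]*"):
        topo = cpu_dir / "topology"
        try:
            package = int((topo / "physical_package_id").read_text())
            core = int((topo / "core_id").read_text())
        except (OSError, ValueError):
            continue  # offline CPU, or a kernel that lacks these files
        topology[(package, core)].append(int(cpu_dir.name[3:]))
    return {key: sorted(cpus) for key, cpus in sorted(topology.items())}


if __name__ == "__main__":
    for (package, core), cpus in read_topology().items():
        print(f"package {package} core {core}: logical CPUs {cpus}")
```

Two logical CPUs listed under the same (package, core) pair are hyper-threads of one core.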

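A dedicated topology enumeration utility gives a much clearer picture. Just download it, compile it, and run it: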

$ ./cpu_topology64.out

Advisory to Users on system topology enumeration

This utility is for demonstration purpose only. It assumes the hardware topology
configuration within a coherent domain does not change during the life of an OS
session. If an OS support advanced features that can change hardware topology
configurations, more sophisticated adaptation may be necessary to account for
the hardware configuration change that might have added and reduced the number
of logical processors being managed by the OS.

User should also be aware that the system topology enumeration algorithm is
based on the assumption that CPUID instruction will return raw data reflecting
the native hardware configuration. When an application runs inside a virtual
machine hosted by a Virtual Machine Monitor (VMM), any CPUID instructions
issued by an app (or a guest OS) are trapped by the VMM and it is the VMM’s
responsibility and decision to emulate/supply CPUID return data to the virtual
machines. When deploying topology enumeration code based on querying CPUID
inside a VM environment, the user must consult with the VMM vendor on how an VMM
will emulate CPUID instruction relating to topology enumeration.

Software visible enumeration in the system:
Number of logical processors visible to the OS: 16
Number of logical processors visible to this process: 16
Number of processor cores visible to this process: 8
Number of physical packages visible to this process: 2

Hierarchical counts by levels of processor topology:
# of cores in package 0 visible to this process: 4 .
# of logical processors in Core 0 visible to this process: 2 .
# of logical processors in Core 1 visible to this process: 2 .
# of logical processors in Core 2 visible to this process: 2 .
# of logical processors in Core 3 visible to this process: 2 .
# of cores in package 1 visible to this process: 4 .
# of logical processors in Core 0 visible to this process: 2 .
# of logical processors in Core 1 visible to this process: 2 .
# of logical processors in Core 2 visible to this process: 2 .
# of logical processors in Core 3 visible to this process: 2 .

Affinity masks per SMT thread, per core, per package:
Individual:
P:0, C:0, T:0 --> 1
P:0, C:0, T:1 --> 100

Core-aggregated:
P:0, C:0 --> 101
Individual:
P:0, C:1, T:0 --> 4
P:0, C:1, T:1 --> 400

Core-aggregated:
P:0, C:1 --> 404
Individual:
P:0, C:2, T:0 --> 10
P:0, C:2, T:1 --> 1z3

Core-aggregated:
P:0, C:2 --> 1010
Individual:
P:0, C:3, T:0 --> 40
P:0, C:3, T:1 --> 4z3

Core-aggregated:
P:0, C:3 --> 4040

Pkg-aggregated:
P:0 --> 5555
Individual:
P:1, C:0, T:0 --> 2
P:1, C:0, T:1 --> 200

Core-aggregated:
P:1, C:0 --> 202
Individual:
P:1, C:1, T:0 --> 8
P:1, C:1, T:1 --> 800

Core-aggregated:
P:1, C:1 --> 808
Individual:
P:1, C:2, T:0 --> 20
P:1, C:2, T:1 --> 2z3

Core-aggregated:
P:1, C:2 --> 2020
Individual:
P:1, C:3, T:0 --> 80
P:1, C:3, T:1 --> 8z3

Core-aggregated:
P:1, C:3 --> 8080

Pkg-aggregated:
P:1 --> aaaa
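
The affinity masks above are hexadecimal bitmaps in which bit N stands for OS CPU N. As a small sketch of how to read them, the hypothetical helper mask_to_cpus below (not part of the utility) expands a mask into a list of CPU numbers:

```python
def mask_to_cpus(mask):
    """Return the OS CPU numbers whose bits are set in an affinity mask."""
    cpus = []
    bit = 0
    while mask:
        if mask & 1:
            cpus.append(bit)
        mask >>= 1
        bit += 1
    return cpus


if __name__ == "__main__":
    # Masks copied from the listing above, parsed as plain hexadecimal.
    for label, text in [("P:0, C:0", "101"), ("P:0", "5555"), ("P:1", "aaaa")]:
        print(label, mask_to_cpus(int(text, 16)))
```

For example, mask 101 expands to CPUs 0 and 8, the two hardware threads of package 0, core 0. On Linux, a list like this could presumably be passed to os.sched_setaffinity(0, cpus) to pin the current process to those CPUs.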

APIC ID listings from affinity masks
OS cpu 0, Affinity mask 000001 - apic id 10
OS cpu 1, Affinity mask 000002 - apic id 0
OS cpu 2, Affinity mask 000004 - apic id 12
OS cpu 3, Affinity mask 000008 - apic id 2
OS cpu 4, Affinity mask 000010 - apic id 14
OS cpu 5, Affinity mask 000020 - apic id 4
OS cpu 6, Affinity mask 000040 - apic id 16
OS cpu 7, Affinity mask 000080 - apic id 6
OS cpu 8, Affinity mask 000100 - apic id 11
OS cpu 9, Affinity mask 000200 - apic id 1
OS cpu 10, Affinity mask 000400 - apic id 13
OS cpu 11, Affinity mask 000800 - apic id 3
OS cpu 12, Affinity mask 001000 - apic id 15
OS cpu 13, Affinity mask 002000 - apic id 5
OS cpu 14, Affinity mask 004000 - apic id 17
OS cpu 15, Affinity mask 008000 - apic id 7

Package 0 Cache and Thread details

Box Description:
Cache is cache level designator
Size is cache size
OScpu# is cpu # as seen by OS
Core is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache
CmbMsk will differ from AffMsk if > 1 hw_thread/cache
Extended Hex replaces trailing zeroes with 'z#'
where # is number of zeroes (so '8z5' is '0x800000')
L1D is Level 1 Data cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 4
L1I is Level 1 Instruction cache, size(KBytes)= 32, Cores/cache= 2, Caches/package= 4
L2 is Level 2 Unified cache, size(KBytes)= 256, Cores/cache= 2, Caches/package= 4
L3 is Level 3 Unified cache, size(KBytes)= 8192, Cores/cache= 8, Caches/package= 1
      +-----------+-----------+-----------+-----------+
Cache |    L1D    |    L1D    |    L1D    |    L1D    |
Size  |    32K    |    32K    |    32K    |    32K    |
OScpu#|    0     8|    2    10|    4    12|    6    14|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|    1   100|    4   400|   10   1z3|   40   4z3|
CmbMsk|    101    |    404    |   1010    |   4040    |
      +-----------+-----------+-----------+-----------+

Cache |    L1I    |    L1I    |    L1I    |    L1I    |
Size  |    32K    |    32K    |    32K    |    32K    |
      +-----------+-----------+-----------+-----------+

Cache |    L2     |    L2     |    L2     |    L2     |
Size  |   256K    |   256K    |   256K    |   256K    |
      +-----------+-----------+-----------+-----------+

Cache |                      L3                       |
Size  |                      8M                       |
CmbMsk|                     5555                      |
      +-----------------------------------------------+

Combined socket AffinityMask= 0x5555

Package 1 Cache and Thread details

Box Description:
Cache is cache level designator
Size is cache size
OScpu# is cpu # as seen by OS
Core is core#[_thread# if > 1 thread/core] inside socket
AffMsk is AffinityMask(extended hex) for core and thread
CmbMsk is Combined AffinityMask(extended hex) for hw threads sharing cache
CmbMsk will differ from AffMsk if > 1 hw_thread/cache
Extended Hex replaces trailing zeroes with 'z#'
where # is number of zeroes (so '8z5' is '0x800000')
      +-----------+-----------+-----------+-----------+
Cache |    L1D    |    L1D    |    L1D    |    L1D    |
Size  |    32K    |    32K    |    32K    |    32K    |
OScpu#|    1     9|    3    11|    5    13|    7    15|
Core  |c0_t0 c0_t1|c1_t0 c1_t1|c2_t0 c2_t1|c3_t0 c3_t1|
AffMsk|    2   200|    8   800|   20   2z3|   80   8z3|
CmbMsk|    202    |    808    |   2020    |   8080    |
      +-----------+-----------+-----------+-----------+

Cache |    L1I    |    L1I    |    L1I    |    L1I    |
Size  |    32K    |    32K    |    32K    |    32K    |
      +-----------+-----------+-----------+-----------+

Cache |    L2     |    L2     |    L2     |    L2     |
Size  |   256K    |   256K    |   256K    |   256K    |
      +-----------+-----------+-----------+-----------+

Cache |                      L3                       |
Size  |                      8M                       |
CmbMsk|                     aaaa                      |
      +-----------------------------------------------+
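
The box description defines an "extended hex" notation in which a trailing z# stands for # zero digits. To make that rule concrete, here is a minimal decoding sketch. The function parse_extended_hex is a hypothetical helper written for this post, not something the utility provides.

```python
import re


def parse_extended_hex(text):
    """Decode extended hex: a trailing 'z<N>' stands for N zero hex digits."""
    match = re.fullmatch(r"([0-9a-fA-F]+)(?:z(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"not an extended hex value: {text!r}")
    digits = match.group(1)
    zeros = int(match.group(2) or 0)
    return int(digits + "0" * zeros, 16)


if __name__ == "__main__":
    for text in ("8z5", "1z3", "4z3", "5555"):
        print(f"{text} -> {parse_extended_hex(text):#x}")
```

With this rule, 1z3 is 0x1000, and OR-ing it with the sibling thread's mask 0x10 gives 0x1010, which matches the CmbMsk value 1010 shown for core 2 of package 0.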

We can now see our CPU's details clearly (the L1, L2, and L3 caches, the cache line size, and so on), which is information we often need when writing programs.
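
For comparison, Linux exposes similar cache details under /sys/devices/system/cpu/cpuN/cache/. The sketch below reads a few of them, assuming the kernel provides the usual level, type, size, coherency_line_size, and shared_cpu_list attributes. The function name read_caches and its parameters are invented for this illustration.

```python
from pathlib import Path


def read_caches(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return one dict per cache visible to a logical CPU, read from sysfs."""
    names = ("level", "type", "size", "coherency_line_size", "shared_cpu_list")
    caches = []
    cache_dir = Path(sysfs_root) / f"cpu{cpu}" / "cache"
    for index in sorted(cache_dir.glob("index[0-9]*")):
        entry = {}
        for name in names:
            try:
                entry[name] = (index / name).read_text().strip()
            except OSError:
                entry[name] = None  # attribute not exposed by this kernel
        caches.append(entry)
    return caches


if __name__ == "__main__":
    for cache in read_caches(0):
        print(cache)
```

The coherency_line_size value is the cache line size in bytes, and shared_cpu_list names the logical CPUs that share that cache.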
Have fun!
