Learning to Write Code from Google -- Chromium/base -- cpu: Source Study and Usage

阿新 • Published: 2019-01-14

Today's post covers CPU-related operations.

Let's start with this enum:

  enum IntelMicroArchitecture {
    PENTIUM,
    SSE,
    SSE2,
    SSE3,
    SSSE3,
    SSE41,
    SSE42,
    AVX,
    MAX_INTEL_MICRO_ARCHITECTURE
  };

What is SSE?
SSE (Streaming SIMD Extensions) is the instruction set Intel introduced in its Pentium III chip, one year after AMD released 3DNow!. It builds on the earlier MMX extension.

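SSE2
SSE2 was introduced by Intel in the first version of the Pentium 4; AMD later added SSE2 support to its Opteron and Athlon 64 processors. SSE2 adds support for 64-bit double-precision floating point, along with instructions for controlling the CPU cache. AMD's extension adds eight more XMM registers, but they are usable only after switching to 64-bit mode (AMD64).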

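SSE3
SSE3 is the third-generation SIMD instruction set, introduced by Intel in the Prescott core of the Pentium 4. AMD added SSE3 support in the fifth revision of the Athlon 64, the Venice core. SSE3 also adds instructions that support Hyper-Threading.

SSSE3
SSSE3 is Intel's supplemental extension to the SSE3 instruction set. It first shipped in Core 2 Duo processors.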

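SSE4
SSE4 began as the 47 new multimedia instructions Intel added with the Penryn-core Core 2 Duo and Core 2 Solo processors (SSE4.1), and was later extended to SSE4.2. These two levels correspond to the SSE41 and SSE42 values in the enum above. AMD developed its own SSE4a instruction set, built into K10-architecture processors such as Phenom and Opteron, but it is not compatible with Intel's SSE4 family.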

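SSE5
SSE5 was proposed by AMD to break Intel's dominance over processor instruction sets. The initial plan called for more than 100 new instructions, the most notable being three-operand instructions (3-Operand Instructions) and fused multiply-accumulate (Fused Multiply Accumulate).

A three-operand instruction lets the processor apply a mathematical or logical function to operands or input data. By raising the operand count, one x86 instruction can work on two to three pieces of data, so SSE5 can fold several simple instructions into one and process them more efficiently. Three-operand capability puts x86 on the level of a few RISC architectures.

Fused multiply-accumulate combines a multiplication and an addition, so a single instruction can carry out the repeated multiply-add steps found in complex computations. The simpler code lets the system run performance-intensive work such as graphics shading, fast photo rendering, spatial audio, and complex vector math more quickly. SSE5 was slated to ship, at the earliest, in AMD's next-generation Bulldozer core.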
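To make the fused multiply-accumulate idea concrete, here is a minimal sketch in standard C++. It does not use SSE5 itself: it relies on `std::fma` from `<cmath>`, which computes `a * b + c` with a single rounding. Whether the compiler maps it to a hardware FMA instruction depends on the target and flags, so treat that part as an assumption. `DotFma` is a hypothetical helper name.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: a dot product in which every step is one fused
// multiply-add, i.e. a[i] * b[i] + acc rounded once instead of twice.
double DotFma(const double* a, const double* b, std::size_t n) {
  double acc = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    acc = std::fma(a[i], b[i], acc);
  }
  return acc;
}
```

Calling `DotFma` on {1, 2, 3} and {4, 5, 6} accumulates 4, then 14, then 32.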

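AVX
AVX is Intel's extension of the SSE architecture. Much as IA-16 grew into IA-32, it widens the 128-bit XMM registers to 256-bit YMM registers, doubling the amount of data each instruction can process. It also supports three-operand instructions (3-Operand Instructions), which removes the copy that two-operand code needs before an operation that would overwrite one of its sources. In the instruction encoding, the opcodes of two rarely used instructions, LES and LDS, serve as the extension prefix.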

cpu.h
Since this class is fairly short, the whole header file is pasted here:

#ifndef BASE_CPU_H_
#define BASE_CPU_H_

#include <string>

#include "base/base_export.h"

namespace base {

// Query information about the processor.
class BASE_EXPORT CPU {
 public:
  // Constructor
  CPU();

  enum IntelMicroArchitecture {
    PENTIUM,
    SSE,
    SSE2,
    SSE3,
    SSSE3,
    SSE41,
    SSE42,
    AVX,
    MAX_INTEL_MICRO_ARCHITECTURE
  };

  // Accessors for CPU information.
  const std::string& vendor_name() const { return cpu_vendor_; }
  int signature() const { return signature_; }
  int stepping() const { return stepping_; }
  int model() const { return model_; }
  int family() const { return family_; }
  int type() const { return type_; }
  int extended_model() const { return ext_model_; }
  int extended_family() const { return ext_family_; }
  bool has_mmx() const { return has_mmx_; }
  bool has_sse() const { return has_sse_; }
  bool has_sse2() const { return has_sse2_; }
  bool has_sse3() const { return has_sse3_; }
  bool has_ssse3() const { return has_ssse3_; }
  bool has_sse41() const { return has_sse41_; }
  bool has_sse42() const { return has_sse42_; }
  bool has_avx() const { return has_avx_; }
  // has_avx_hardware returns true when AVX is present in the CPU. This might
  // differ from the value of |has_avx()| because |has_avx()| also tests for
  // operating system support needed to actually call AVX instructions.
  // Note: you should never need to call this function. It was added in order
  // to workaround a bug in NSS but |has_avx()| is what you want.
  bool has_avx_hardware() const { return has_avx_hardware_; }
  bool has_aesni() const { return has_aesni_; }
  bool has_non_stop_time_stamp_counter() const {
    return has_non_stop_time_stamp_counter_;
  }
  // has_broken_neon is only valid on ARM chips. If true, it indicates that we
  // believe that the NEON unit on the current CPU is flawed and cannot execute
  // some code. See https://code.google.com/p/chromium/issues/detail?id=341598
  bool has_broken_neon() const { return has_broken_neon_; }

  IntelMicroArchitecture GetIntelMicroArchitecture() const;
  const std::string& cpu_brand() const { return cpu_brand_; }

 private:
  // Query the processor for CPUID information.
  void Initialize();

  int signature_;  // raw form of type, family, model, and stepping
  int type_;  // processor type
  int family_;  // family of the processor
  int model_;  // model of processor
  int stepping_;  // processor revision number
  int ext_model_;
  int ext_family_;
  bool has_mmx_;
  bool has_sse_;
  bool has_sse2_;
  bool has_sse3_;
  bool has_ssse3_;
  bool has_sse41_;
  bool has_sse42_;
  bool has_avx_;
  bool has_avx_hardware_;
  bool has_aesni_;
  bool has_non_stop_time_stamp_counter_;
  bool has_broken_neon_;
  std::string cpu_vendor_;
  std::string cpu_brand_;
};

}  // namespace base

#endif  // BASE_CPU_H_

Implementation of Initialize

void CPU::Initialize() {
#if defined(ARCH_CPU_X86_FAMILY)
  int cpu_info[4] = {-1};
  char cpu_string[48];

  // __cpuid with an InfoType argument of 0 returns the number of
  // valid Ids in CPUInfo[0] and the CPU identification string in
  // the other three array elements. The CPU identification string is
  // not in linear order. The code below arranges the information
  // in a human readable form. The human readable order is CPUInfo[1] |
  // CPUInfo[3] | CPUInfo[2]. CPUInfo[2] and CPUInfo[3] are swapped
  // before using memcpy to copy these three array elements to cpu_string.
  __cpuid(cpu_info, 0);
  int num_ids = cpu_info[0];
  std::swap(cpu_info[2], cpu_info[3]);
  memcpy(cpu_string, &cpu_info[1], 3 * sizeof(cpu_info[1]));
  cpu_vendor_.assign(cpu_string, 3 * sizeof(cpu_info[1]));

  // Interpret CPU feature information.
  if (num_ids > 0) {
    __cpuid(cpu_info, 1);
    signature_ = cpu_info[0];
    stepping_ = cpu_info[0] & 0xf;
    model_ = ((cpu_info[0] >> 4) & 0xf) + ((cpu_info[0] >> 12) & 0xf0);
    family_ = (cpu_info[0] >> 8) & 0xf;
    type_ = (cpu_info[0] >> 12) & 0x3;
    ext_model_ = (cpu_info[0] >> 16) & 0xf;
    ext_family_ = (cpu_info[0] >> 20) & 0xff;
    has_mmx_ =   (cpu_info[3] & 0x00800000) != 0;
    has_sse_ =   (cpu_info[3] & 0x02000000) != 0;
    has_sse2_ =  (cpu_info[3] & 0x04000000) != 0;
    has_sse3_ =  (cpu_info[2] & 0x00000001) != 0;
    has_ssse3_ = (cpu_info[2] & 0x00000200) != 0;
    has_sse41_ = (cpu_info[2] & 0x00080000) != 0;
    has_sse42_ = (cpu_info[2] & 0x00100000) != 0;
    has_avx_hardware_ =
                 (cpu_info[2] & 0x10000000) != 0;
    // AVX instructions will generate an illegal instruction exception unless
    //   a) they are supported by the CPU,
    //   b) XSAVE is supported by the CPU and
    //   c) XSAVE is enabled by the kernel.
    // See http://software.intel.com/en-us/blogs/2011/04/14/is-avx-enabled
    //
    // In addition, we have observed some crashes with the xgetbv instruction
    // even after following Intel's example code. (See crbug.com/375968.)
    // Because of that, we also test the XSAVE bit because its description in
    // the CPUID documentation suggests that it signals xgetbv support.
    has_avx_ =
        has_avx_hardware_ &&
        (cpu_info[2] & 0x04000000) != 0 /* XSAVE */ &&
        (cpu_info[2] & 0x08000000) != 0 /* OSXSAVE */ &&
        (_xgetbv(0) & 6) == 6 /* XSAVE enabled by kernel */;
    has_aesni_ = (cpu_info[2] & 0x02000000) != 0;
  }

  // Get the brand string of the cpu.
  __cpuid(cpu_info, 0x80000000);
  const int parameter_end = 0x80000004;
  int max_parameter = cpu_info[0];

  if (cpu_info[0] >= parameter_end) {
    char* cpu_string_ptr = cpu_string;

    for (int parameter = 0x80000002; parameter <= parameter_end &&
         cpu_string_ptr < &cpu_string[sizeof(cpu_string)]; parameter++) {
      __cpuid(cpu_info, parameter);
      memcpy(cpu_string_ptr, cpu_info, sizeof(cpu_info));
      cpu_string_ptr += sizeof(cpu_info);
    }
    cpu_brand_.assign(cpu_string, cpu_string_ptr - cpu_string);
  }

  const int parameter_containing_non_stop_time_stamp_counter = 0x80000007;
  if (max_parameter >= parameter_containing_non_stop_time_stamp_counter) {
    __cpuid(cpu_info, parameter_containing_non_stop_time_stamp_counter);
    has_non_stop_time_stamp_counter_ = (cpu_info[3] & (1 << 8)) != 0;
  }
#elif defined(ARCH_CPU_ARM_FAMILY) && (defined(OS_ANDROID) || defined(OS_LINUX))
  cpu_brand_.assign(g_lazy_cpuinfo.Get().brand());
  has_broken_neon_ = g_lazy_cpuinfo.Get().has_broken_neon();
#endif
}

CPU::IntelMicroArchitecture CPU::GetIntelMicroArchitecture() const {
  if (has_avx()) return AVX;
  if (has_sse42()) return SSE42;
  if (has_sse41()) return SSE41;
  if (has_ssse3()) return SSSE3;
  if (has_sse3()) return SSE3;
  if (has_sse2()) return SSE2;
  if (has_sse()) return SSE;
  return PENTIUM;
}
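The bit manipulation in the listing above can be tried without running CPUID at all. The sketch below copies the same shifts and masks into two standalone functions that take register values as plain integers. `Signature`, `SimdLevel`, `DecodeSignature` and `HighestLevel` are hypothetical names, not part of Chromium. `HighestLevel` looks only at the hardware feature bits, so its top value is named `AVX_HARDWARE`; it does not repeat the XSAVE/OSXSAVE/xgetbv checks that `has_avx()` depends on.

```cpp
#include <cstdint>

// Hypothetical plain-data mirror of the signature fields CPU stores.
struct Signature {
  int stepping;
  int model;
  int family;
  int type;
  int ext_model;
  int ext_family;
};

// Splits CPUID leaf 1 eax with the same shifts and masks as Initialize().
Signature DecodeSignature(std::uint32_t eax) {
  Signature s;
  s.stepping = eax & 0xf;
  s.model = ((eax >> 4) & 0xf) + ((eax >> 12) & 0xf0);
  s.family = (eax >> 8) & 0xf;
  s.type = (eax >> 12) & 0x3;
  s.ext_model = (eax >> 16) & 0xf;
  s.ext_family = (eax >> 20) & 0xff;
  return s;
}

enum SimdLevel { PENTIUM, SSE, SSE2, SSE3, SSSE3, SSE41, SSE42, AVX_HARDWARE };

// Returns the highest level whose feature bit is set in leaf 1 ecx/edx,
// checking from newest to oldest as GetIntelMicroArchitecture() does.
SimdLevel HighestLevel(std::uint32_t ecx, std::uint32_t edx) {
  if (ecx & 0x10000000) return AVX_HARDWARE;  // hardware bit only
  if (ecx & 0x00100000) return SSE42;
  if (ecx & 0x00080000) return SSE41;
  if (ecx & 0x00000200) return SSSE3;
  if (ecx & 0x00000001) return SSE3;
  if (edx & 0x04000000) return SSE2;
  if (edx & 0x02000000) return SSE;
  return PENTIUM;
}
```

As an illustration, the sample value 0x000306D4 (chosen here only as an example input) decodes to family 6, model 0x3D, stepping 4, because the extended model nibble 3 is shifted into the high half of `model`.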

The code above uses __cpuid, so here is a brief introduction to it.

__cpuid

Purpose:
Generates the cpuid instruction available on x86 and x64, which queries the processor for information about the supported features and CPU type.

Prototype:

void __cpuid(
   int CPUInfo[4],
   int InfoType
);

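The InfoType parameter is the value placed in eax for the CPUID instruction, that is, the function ID. The related __cpuidex function takes an additional ECXValue parameter, which is placed in ecx as the sub-function ID. The CPUInfo parameter receives the four output registers eax, ebx, ecx and edx, in that order.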
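Given that register order, the vendor string returned by function ID 0 has to be reassembled as ebx, edx, ecx, which is why Initialize swaps two array elements before copying. A minimal sketch of that step follows; `VendorFromRegisters` is a hypothetical helper, and the byte layout assumes a little-endian host, as on x86.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: rebuilds the 12-character vendor string from the
// registers CPUID returns for function ID 0. Readable order: ebx, edx, ecx.
std::string VendorFromRegisters(std::uint32_t ebx, std::uint32_t ecx,
                                std::uint32_t edx) {
  const std::uint32_t ordered[3] = {ebx, edx, ecx};
  char text[sizeof(ordered)];
  // Each register holds four ASCII characters, lowest byte first.
  std::memcpy(text, ordered, sizeof(ordered));
  return std::string(text, sizeof(text));
}
```

With the register values 0x756e6547, 0x6c65746e and 0x49656e69 for ebx, ecx and edx, the helper yields "GenuineIntel".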

Because __cpuid is a Visual C++ intrinsic, use conditional compilation on _MSC_VER to check whether the compiler provides it.
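One way to arrange that conditional compilation is sketched below. It is not how Chromium does it. The MSVC branch uses `__cpuid` from `<intrin.h>`; the other branch assumes GCC or Clang on x86 and uses `__get_cpuid` from `<cpuid.h>`. On any other target the function simply reports failure. `QueryCpuid` is a hypothetical wrapper name.

```cpp
#include <array>
#include <cstdint>

#if defined(_MSC_VER)
#include <intrin.h>  // __cpuid
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   // __get_cpuid
#endif

// Hypothetical wrapper: fills out with {eax, ebx, ecx, edx} for the given
// function ID and returns true, or returns false where CPUID is unavailable.
bool QueryCpuid(std::uint32_t leaf, std::array<std::uint32_t, 4>* out) {
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  int regs[4];
  __cpuid(regs, static_cast<int>(leaf));
  for (int i = 0; i < 4; ++i) {
    (*out)[i] = static_cast<std::uint32_t>(regs[i]);
  }
  return true;
#elif defined(__i386__) || defined(__x86_64__)
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid(leaf, &eax, &ebx, &ecx, &edx)) return false;
  *out = {eax, ebx, ecx, edx};
  return true;
#else
  (void)leaf;
  (void)out;
  return false;
#endif
}
```

Callers should check the return value before reading the registers, since the fallback branch leaves `out` untouched.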

Usage

#include <iostream>

#include "base/cpu.h"

int main(int argc, char* argv[]) {
  // The constructor queries the processor; a stack object needs no delete.
  base::CPU cpu;
  std::cout << cpu.cpu_brand() << std::endl;
  return 0;
}

Output:
Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz
