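RISC-V Biweekly Brief 0x0e: Intel's Vulnerability Stirs Up Trouble! (2018-01-04)

阿新 • Published: 2018-12-12

RISC-V Biweekly Brief (2018-01-04)

Highlights

Outlook for 2018
Commentary on "Intel CPUs hit by a shocking vulnerability"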

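New Year Outlook

A lot of interesting things happened over the past year, and compared with the year before I see more and more people taking an optimistic view of RISC-V. The RISC-V Workshop held in Shanghai in May 2017 had a far-reaching impact in China: many domestic semiconductor vendors began to realize that RISC-V may be the trend they can least afford to miss, apart from AI. Of course, some of these vendors want to take part but have nowhere to apply their effort, some of the capable ones have started positioning themselves actively, and many more are quietly working their own angles and making their money. Either way, if RISC-V cannot bring people real, tangible benefits, what use is it?

But whether in China or abroad, amid the generally favorable reception of RISC-V I sense a certain mood: people are still looking at RISC-V through the development mindset of the past 20 to 30 years. Many want to build a second Arm or a second Intel. I cannot refute that line of thinking, but I believe technical innovation has to be accompanied by business-model innovation, and that may be one of the main reasons you cannot build a second Arm. It is hard to say whether a giant that made its fortune on RISC-V will eventually emerge, but if RISC-V is to cross the chasm and be accepted by the general public, it will probably be pushed there by countless small forces coming together.

RISC-V represents a trend toward “decentralization”. This means that anyone with the ability can start using and improving RISC-V, and the barrier to entry will keep falling: a small number of capable people will develop stable, reliable open-source CPUs, while you may need only a modest amount of knowledge and skill to add your own pieces and put the result to work for yourself. It is much like web development, where a wealth of high-performance libraries, frameworks, and scalable cloud computing services has significantly lowered the skill threshold for developers.

So in the new year, let us learn to “emancipate our minds” and think about how to combine RISC-V with your own strengths in order to consolidate and reinforce them. Black cat or white cat, the one that catches the mouse is a good cat.

Finally, CNRV will keep working in the new year to build a bigger, healthier, and more active community. We are also preparing a large offline event, so do not forget to come and show your support, folks!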

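Security Commentary

Commentary on "Intel CPUs hit by a shocking vulnerability"

A news story about “Intel CPUs hit by a shocking vulnerability” has recently spread across the internet. In this editor's view the news is not all that new, and it was inevitable that Intel would fix the vulnerability; it was only a matter of time. RISC-V, by contrast, has already sidestepped Intel's problem and is, in this respect, a more secure instruction set. Let me go into a little more detail below.

Some have speculated that the vulnerability is related to branch prediction (branch speculation) in out-of-order pipelines, which allowed low-privilege user programs to access kernel space. This editor has reservations about that explanation: branch prediction is standard equipment in out-of-order processors, so it would be hard to explain why Intel is vulnerable while AMD is spared. After reading more of the English-language material [Lipp2017], it turns out the English sources actually speak of speculative execution, which is a much broader notion than branch prediction. Put simply, the processor allows speculatively executed user-mode code to access kernel data, and the kernel data really is read and placed in the cache. Judging from a paper at the 2016 CCS security conference, Linux systems on Intel processors had already been broken some time ago, and the key there was likewise lax supervision of user programs' memory reads and writes.

First, a brief introduction to basic kernel-space protection. For a variety of reasons, at least on Linux, kernel space is mapped in its entirety into the virtual address space of every user process. This makes it easy for a user program to reach kernel space and creates a security risk. To mitigate the problem, modern kernels usually employ kernel address space layout randomization (KASLR): once the layout is randomized, a user program has to guess how kernel space is laid out before it can peek into the kernel. Bypassing KASLR has therefore become an important step in attacking the kernel.

Intel processors from Sandy Bridge onward allow a user program's prefetch instructions to prefetch arbitrary addresses, including addresses in kernel space. This gives an attacker the means to construct a special cache-timing side channel for probing kernel space [Gruss2016]. Worse still, the Linux kernel maps all of user space into kernel space to simplify memory management (physmap). By locating the kernel-space mapping that corresponds to their own data, attackers can inject code that runs in kernel mode and hijack the kernel [Gruss2014, Kemerlis2014].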

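This exposes a number of problems:

Kernel address randomization is usually fixed at boot time; once it is broken, it no longer offers any protection.
Mapping kernel space into user space was a security risk to begin with.
Intel's prefetch lets user programs prefetch kernel data, which provides the conditions for side-channel attacks.
Linux mapping user memory into kernel space is also dangerous.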

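Now let us look at RISC-V. The RISC-V instruction set defines the SUM flag in the SSTATUS control register. Normally SUM=0, meaning kernel-mode code cannot read or write user-mode data; only when necessary does the kernel temporarily set SUM to manipulate user data directly. In no case, however, can the kernel execute instructions from a user page. So even if physmap maps a user page into kernel space, the page-table entry still marks the page as belonging to user mode, which completely closes off code injection into the kernel via physmap.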
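The access rule described here can be modeled in a few lines. The following is only a minimal Python sketch of the rule, not a model of any real core: the class `Pte`, the function `supervisor_access_allowed`, and the two example pages are hypothetical names introduced for illustration. Only the U bit of a page-table entry and the SUM bit come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pte:
    """Simplified leaf page-table entry with only the bits used here."""
    r: bool
    w: bool
    x: bool
    u: bool  # True when the page is marked as a user-mode page


def supervisor_access_allowed(pte: Pte, kind: str, sum_bit: bool) -> bool:
    """Return True if kernel (S-mode) code may do `kind` ('r', 'w', 'x') on the page."""
    if kind not in ("r", "w", "x"):
        raise ValueError("kind must be 'r', 'w' or 'x'")
    if pte.u:
        # The kernel can never execute from a user page, whatever SUM says.
        if kind == "x":
            return False
        # Loads and stores to user pages are only allowed while SUM=1.
        if not sum_bit:
            return False
    return {"r": pte.r, "w": pte.w, "x": pte.x}[kind]


# Hypothetical example pages.
user_page = Pte(r=True, w=True, x=True, u=True)
kernel_page = Pte(r=True, w=False, x=True, u=False)
```

With SUM=0 the kernel cannot even read `user_page`; with SUM=1 it can read and write it, but an instruction fetch from it is still refused, which is the property that defeats physmap-based code injection.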

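In addition, the RISC-V discussion group has long since begun discussing a user/kernel page-table separation scheme that is more thorough than Intel's [isa-dev-2017-11-30]. The noticeable performance drop of the current page-table separation scheme on Intel comes mainly from having to flush the TLB on every switch between kernel and user, which causes a large number of TLB misses [Gruss2017]. If user and kernel use different page-table base addresses and ASIDs, the TLB does not need to be flushed frequently, which greatly reduces the performance loss. With user and kernel page tables fully separated, the prefetch instruction completely loses the ability to prefetch kernel data, which removes the side channel it creates.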
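To make the idea of "different page-table base addresses and ASIDs" concrete, here is a small Python sketch that packs the RV64 satp register, assuming the field layout of privileged specification 1.10 (MODE in bits 63:60, ASID in bits 59:44, PPN in bits 43:0). The helper name `make_satp`, the root-table addresses, and the ASID values are made up for illustration, and the sketch does not claim to reproduce the scheme discussed on isa-dev.

```python
SATP_MODE_SV39 = 8  # MODE value that selects Sv39 translation on RV64
PAGE_SHIFT = 12     # 4 KiB pages


def make_satp(root_table_paddr: int, asid: int, mode: int = SATP_MODE_SV39) -> int:
    """Pack an RV64 satp value from a root page-table address and an ASID."""
    if root_table_paddr % (1 << PAGE_SHIFT):
        raise ValueError("root page table must be 4 KiB aligned")
    ppn = root_table_paddr >> PAGE_SHIFT
    if not 0 <= ppn < (1 << 44):
        raise ValueError("PPN must fit in 44 bits")
    if not 0 <= asid < (1 << 16):
        raise ValueError("ASID must fit in 16 bits")
    return (mode << 60) | (asid << 44) | ppn


# Hypothetical addresses and ASIDs: one root table for the kernel, one for a user process.
kernel_satp = make_satp(0x8010_0000, asid=0)
user_satp = make_satp(0x8020_0000, asid=1)
```

Because TLB entries are tagged with the ASID, switching between `kernel_satp` and `user_satp` does not by itself require flushing the entries that belong to the other address space.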

[Lipp2017] M. Lipp, M. Schwarz, D. Gruss, T. Prescher, W. Haas, S. Mangard, P. Kocher, D. Genkin, Y. Yarom and M. Hamburg. “Meltdown.” https://meltdownattack.com/meltdown.pdf
[Gruss2017] D. Gruss, M. Lipp, M. Schwarz, R. Fellner, C. Maurice and S. Mangard. “KASLR is dead: long live KASLR.” In Proc. of ESSoS, Bonn, Germany, July 3-5, 2017, pp. 161-176.
[Gruss2016] D. Gruss, C. Maurice, A. Fogh, M. Lipp and S. Mangard. “Prefetch side-channel attacks: bypassing SMAP and kernel ASLR.” In Proc. of CCS, Vienna, Austria, October 24-28, 2016, pp. 368-379.
[Kemerlis2014] V. P. Kemerlis, M. Polychronakis and A. Keromytis. “ret2dir: rethinking kernel isolation.” In Proc. of USENIX Security, San Diego, CA, USA, August 20-22, 2014, pp. 957-972.

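(Wei Song: The above is my personal view based on limited literature and is constrained by my knowledge and ability. If there are mistakes, corrections are most welcome. Thanks also to Shawn for reviewing.)

RV News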

glibc port v3

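Palmer recently submitted glibc port v2 and v3 to the libc-alpha mailing list. Because upper-layer software such as Debian, Fedora, and OpenEmbedded all depend on glibc and on ABI stability, upstreaming glibc is both urgent and important. The RISC-V port of glibc will include six configurations. For details, see Palmer's introduction in port v2: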

Our QEMU port is not yet upstream (we’re preparing it for submission now), and general availability of the first commercial, Linux-capable RISC-V chips is scheduled for Q1 2018.

We’ve had preliminary ports of Debian, Fedora, and OpenEmbedded to RISC-V, all of which are currently waiting on glibc so we can declare the ABI stable.

QEMU port v1

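Michael Clark recently submitted QEMU port v1, hoping to have it merged into mainline in QEMU 2.12. This submission mainly covers the base ISA. In the future, features such as the hypervisor extension, the vector extension, and device emulation may be added.

Technical Discussion

How to support semihosting

Many instruction sets support using the debugger to implement some simple I/O functions to make early system bring-up easier. In that case the debugger takes on part of the role of a host, which is what is called semihosting. Since the only trap visible to the debugger is the EBREAK instruction, semihosting has to be built around that instruction. When the RISC-V instruction set was defined, EBREAK was specified as an instruction with no operands, for the following basic reasons: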

From Krste:

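A debugger knows the addresses at which it has inserted breakpoints. (The debugger should know where the breakpoints sit in the source code.)

When the debugger is notified that a hart has stopped at a breakpoint, it needs to read the pc at least. (When a breakpoint wakes the debugger, it will at least read the current PC.)

A debugger-side hashtable indexed by the pc can uncover all the necessary information about that breakpoint without having to go back and read the EBREAK instruction bits from target memory, and without being constrained by the size of the EBREAK immediate field. (A simple hash table should be enough to work out quickly what a breakpoint is for, without parsing any argument of the breakpoint instruction.)

A compressed EBREAK should work the same as an uncompressed EBREAK, and encoding space is more critical in compressed code. (Compressed and uncompressed breakpoint instructions should behave identically.)

More generally, we viewed EBREAKs as instructions that are poked into compiled code from outside, not instructions that are compiled in to a binary, though it’s clear software breakpoints are being used in this way for other architectures. (Fundamentally, a breakpoint instruction should be regarded as something added from outside, not as an inherent instruction of the original program, even though software interrupts are widely used as a basic software facility on other architectures.)

The software breakpoints mentioned above allude to the software interrupts of the ARM family. Remember how SWI #imm was used on ARM7?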
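The debugger-side hashtable in Krste's reasoning can be pictured with a short Python sketch. Everything here is illustrative: the class names, the `purpose` labels, and the addresses are hypothetical and do not come from any real debugger.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Breakpoint:
    purpose: str            # hypothetical label, e.g. "user" or "assert"
    saved_instruction: int  # original instruction bits that EBREAK replaced


class BreakpointTable:
    """Debugger-side table that maps a pc to what the breakpoint there means."""

    def __init__(self) -> None:
        self._by_pc: Dict[int, Breakpoint] = {}

    def insert(self, pc: int, purpose: str, saved_instruction: int) -> None:
        self._by_pc[pc] = Breakpoint(purpose, saved_instruction)

    def lookup(self, pc: int) -> Optional[Breakpoint]:
        # Called after the hart stops: the pc alone identifies the breakpoint,
        # so the EBREAK bits never have to be read back from target memory.
        return self._by_pc.get(pc)


table = BreakpointTable()
table.insert(0x8000_0010, "assert", saved_instruction=0x00000013)
```

A lookup at an address the debugger never patched returns `None`. That is the case where the debugger has no prior knowledge of the program it is attached to.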

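However, these basic assumptions now seem to be running into trouble. For example, Liviu pointed out that a debugger does not always have access to a copy of the executable being debugged (think of attaching a debugger directly to an embedded system and debugging blind). In such cases one cannot assume that the debugger can work out on its own what a breakpoint is supposed to do.

To solve this, Liviu proposed changing the format of EBREAK to add an immediate operand. That, however, would break the user-level RISC-V instruction set standard that has already been defined. As a compromise, Krste proposed: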

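Use EBREAK + 16-bit zero + 16-bit tag as current workaround to lack of EBREAK immediate (effectively making a new 64-bit EBREAK encoding). (Keep the current EBREAK instruction, but follow it with a 16-bit all-zero illegal instruction as a marker and then a 16-bit value as the breakpoint argument.)

Add “undelegate” feature to debug spec so ECALL (and other exceptions/interrupts) can trap to debugger. (Give the ECALL system call an interface to the debugger, so that some system calls are handled by the debugger.)

Move eventually to using ECALL for semihosting-like applications where debug hardware supports “undelegate”. (Gradually move toward implementing most semihosting functionality with system calls.)

Use labeled ELF for EBREAK “assert failure” use cases. (Implement assert using breakpoints together with a copy of the executable file.)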
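The first item of the proposal can be sketched as a byte layout. The Python below is a rough illustration under stated assumptions: it uses the standard 32-bit EBREAK encoding (0x00100073) and little-endian instruction memory, and the function names `encode_tagged_ebreak` and `decode_tagged_ebreak` are invented for this example. The proposal was still under discussion, so this is not a final format.

```python
import struct
from typing import Optional

EBREAK = 0x00100073   # 32-bit EBREAK instruction
ZERO_MARKER = 0x0000  # 16-bit all-zero pattern, an illegal instruction


def encode_tagged_ebreak(tag: int) -> bytes:
    """Emit EBREAK, a 16-bit zero marker and a 16-bit tag (8 bytes, little-endian)."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("tag must fit in 16 bits")
    return struct.pack("<IHH", EBREAK, ZERO_MARKER, tag)


def decode_tagged_ebreak(code: bytes) -> Optional[int]:
    """Return the tag if `code` starts with a tagged EBREAK, otherwise None."""
    if len(code) < 8:
        return None
    insn, marker, tag = struct.unpack_from("<IHH", code)
    if insn != EBREAK or marker != ZERO_MARKER:
        return None
    return tag
```

A debugger that stops on EBREAK could read the eight bytes at the pc and call `decode_tagged_ebreak`; a plain EBREAK followed by ordinary code yields `None`.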

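Even so, this does not satisfy every need. The most strongly voiced objection is that implementing semihosting with system calls forces every host call to trap into exception handling, which is too costly when semihosting is used for lightweight print output. Lightweight print output via semihosting is one of the basic operations of trace debugging, and trace debugging is very useful when bringing up hardware and low-level software early in a system's life. Supporting this lightweight form of semihosting is therefore also necessary.

The discussion is still ongoing; once a decision is reached, the RISC-V debug specification will be revised accordingly. As things stand, semihosting for I/O is likely to use the system-call approach, while printing debug information may use a breakpoint with an attached breakpoint argument.

Hiding a 16-bit instruction inside a 32-bit NOP?

This trick is a real eye-opener. Consider the following program: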

if(a>0) b=b+1;
else    b=b-1;

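The else branch of this small program consists of a single instruction. A compiler would normally append a jump at the end of the if branch to skip over the else branch. But because the else branch is only one instruction, instead of a jump one can use a NOP-like instruction such as LUI x0, #imm20 (register x0 is always 0, so writing to it has no effect), replace the else-branch instruction with an RVC instruction, and hide that instruction inside the 20-bit immediate. Quite a mind-bending idea.

The main point of doing this is to save instruction space; the trick is said to have been common on old Z80 systems with limited ROM. On a modern processor it brings essentially no speed advantage (branch predictors predict such jumps very accurately), and it may cause errors in the instruction fetch stage. Still, it might have its uses during early system initialization, when the initial memory is tightly constrained.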
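The bit arithmetic behind the trick can be checked with a short Python sketch. It assumes the standard encodings of LUI (opcode 0x37, immediate in bits 31:12) and of the compressed C.ADDI instruction, and it assumes, purely for illustration, that the variable b lives in register a0 (x10). The helper names are hypothetical.

```python
LUI_OPCODE = 0x37


def c_addi(rd: int, imm: int) -> int:
    """Encode the 16-bit RVC instruction `c.addi rd, imm`."""
    if not 1 <= rd <= 31 or not -32 <= imm <= 31 or imm == 0:
        raise ValueError("invalid c.addi operands")
    imm6 = imm & 0x3F
    return ((imm6 >> 5) << 12) | (rd << 7) | ((imm6 & 0x1F) << 2) | 0b01


def hide_rvc_in_lui_x0(rvc: int) -> int:
    """Build `lui x0, imm20` whose upper halfword is the given 16-bit instruction."""
    if not 0 <= rvc <= 0xFFFF or (rvc & 0b11) == 0b11:
        raise ValueError("not a valid 16-bit instruction encoding")
    imm20 = rvc << 4  # imm[19:4] carries the hidden instruction, imm[3:0] is free
    return (imm20 << 12) | (0 << 7) | LUI_OPCODE  # rd = x0


# else branch `b = b - 1`, with b assumed to be in a0 (x10)
hidden = c_addi(10, -1)
word = hide_rvc_in_lui_x0(hidden)
code = word.to_bytes(4, "little")
```

Executed from its first byte, `code` is a harmless `lui x0`; a branch that lands two bytes later decodes the upper halfword as `c.addi a0, -1`. That upper halfword is also exactly where a fetch unit may misbehave if it assumes instruction boundaries never overlap.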

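Code Updates

RISC-V port updates for Linux kernel 4.15-rc4

This round of updates mostly fixes a few minor bugs, including:

A fix for a typo in sys_riscv_flush_icache(). This is the syscall in RISC-V Linux dedicated to flushing the I$. Since RISC-V has no instructions that operate on caches directly, the syscall is implemented underneath with the fence.i instruction. For details, see the implementation of fence.i and the implementation of the syscall.
Removal of the code for the old HVC_driver. It was originally there for early printk, and a better approach has since become available. See the link for details.

For more details, see the Git pull request: link

Useful Resources

What went upstream in the Linux kernel (Palmer's All Aboard, Part 8: The RISC-V Linux Port is Upstream!)

In this post Palmer records what was upstreamed in 4.15 and also discusses what still needs work: the memory model, the PLIC, DMA, timers, the device tree documentation, and the SBI console driver. It is a good reference if you want an overall picture of the Linux kernel port. To follow further updates, you can also subscribe to [email protected].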

Palmer 的 All Aboard, Part 9: Paging and the MMU in the RISC-V Linux Kernel

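In the first half Palmer introduces the AEE (Application Execution Environment) and the SEE (Supervisor Execution Environment) of a RISC-V system. In RISC-V, the basic requirements of the AEE include the following (editor's note: scall was in fact renamed to ecall in version 2.1):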

The ISA string, which determines what the vast majority of instructions do as well as which registers constitute the machine’s current state.

The supervisor’s user-visible ABI, which determines what the scall instruction does. This is different than the C compiler’s ABI, which defines the interface between different components of the application.

The contents of the entire memory address space.

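In practice, of course, the AEE is determined by each operating system; the AEEs of FreeBSD and Linux differ, for example. As for the Unix-class SEE, the Platform spec group will lay down some basic requirements. Palmer's guess at the definition is as follows (apart from the PMAs and the SBI, this is similar to what this editor noted down at the workshop):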

Either the RV32I or RV64I base ISAs, along with the M, A, and C extensions. The F and D extensions are optional but paired together, leaving the possible standard ISAs for application-class SEEs as RV32IMAC, RV32IMAFDC (RV32GC), RV64IMAC, and RV64IMAFDC (RV64GC).

On RV32I-based systems, support for Sv32 page-based virtual memory.

On RV64I-based systems, support for at least Sv48 page-based virtual memory.

Upon entering the SEE, the PMAs are set such that memory accesses are point-to-point strongly ordered between harts and devices.

An SBI that implements various fences, timers, and a console.

After covering these basic system requirements of the AEE and SEE, Palmer summarizes some characteristics of page tables in RISC-V:

Pages are 4KiB at the leaf node, and it’s possible to map large contiguous regions with every level of the page table.

RV32I-based systems can have up to 34-bit physical addresses with a three level page table.

RV64I-based systems can have multiple virtual address widths, starting with 39-bit and extending up to 64-bit in increments of 9 bits.

Mappings must be synchronized via the sfence.vma instruction.

There are bits for global mappings, supervisor-only, read/write/execute, and accessed/dirty.

There is a single valid bit, which allows storing XLEN-1 bits of flags in an otherwise unused page tables. Additionally, there are two bits of software flags in mapped pages.

Address space identifiers are 9 bits or RV32I and 16 bits on RV64I, and they’re a hint so a valid implementation is to ignore them.

The accessed and dirty bits are strongly ordered with respect to accesses from the same hart, but are optional (with a trap-based mechanism when unsupported)
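The 4 KiB leaf pages and the 9-bit steps in virtual address width listed above can be illustrated by splitting an address into its page-table indices. This Python sketch covers only the RV64 schemes (Sv39, Sv48), in which every level indexes 512 entries; the function names are hypothetical, and Sv32 on RV32I uses a different field width, so it is deliberately left out.

```python
from typing import List, Tuple

PAGE_SHIFT = 12  # 4 KiB leaf pages
VPN_BITS = 9     # each RV64 page-table level is indexed by 9 address bits


def split_virtual_address(va: int, va_bits: int = 39) -> Tuple[List[int], int]:
    """Return ([VPN of root level, ..., VPN[0]], page offset) for Sv39, Sv48, ..."""
    levels, rem = divmod(va_bits - PAGE_SHIFT, VPN_BITS)
    if rem or levels < 1:
        raise ValueError("va_bits must be 12 plus a multiple of 9")
    va &= (1 << va_bits) - 1
    offset = va & ((1 << PAGE_SHIFT) - 1)
    vpn = [(va >> (PAGE_SHIFT + VPN_BITS * i)) & ((1 << VPN_BITS) - 1)
           for i in range(levels)]
    return vpn[::-1], offset


def mapping_size(level: int) -> int:
    """Bytes covered by a leaf entry placed at `level` (0 is the last level)."""
    return 1 << (PAGE_SHIFT + VPN_BITS * level)
```

For example, `split_virtual_address(0x40201ABC)` gives index 1 at each of the three Sv39 levels and an offset of 0xABC, while `mapping_size(1)` and `mapping_size(2)` are the 2 MiB and 1 GiB regions that a leaf entry at a higher level can map.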

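It is worth noting that extra features such as ASIDs have not actually been implemented yet in the RISC-V Linux kernel, so interested readers can seize the opportunity. Finally, Palmer covers the parts related to devices and DMA. Because RISC-V does not define an IOMMU, the RISC-V Linux kernel currently handles device addressing with bounce buffers and a 32-bit ZONE_DMA.

Sodor design document

Shimomura Shigeki has put together a Sodor design doc of about 20 pages that lays out the code structure and the role of each file in detail. It is worth a look if you are using Sodor to study RISC-V or want to learn Chisel.

Another Chisel learning resource: Berkeley's Generator Bootcamp

Those who want to learn Rocket and BOOM by way of Chisel can refer to this new repo, which uses Jupyter Notebook to teach how to write generators in Chisel. The chisel learning journey documentation is also being updated continuously; you can consult it or join the chisel learning journey hangout.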

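Industry Perspectives

RISC-V coverage at the 34C3 congress

You may first want to ask what 34C3 is. 34C3 stands for the 34th Chaos Communication Congress, a long-running hacker gathering held in Germany.

At this 34C3, Clifford Wolf, who is active in open-source EDA, gave a talk. In it he mentioned that in open-source hardware, with RISC-V as the representative example, 2017 can be called the year of open silicon.

The talk also covered some of Clifford's work on formal verification of RISC-V.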

If we were to pick one of the largest developments in the open-source hardware industry this year, we’d call 2017 the year of open silicon. In particular the open RISC-V processor came out in hardware that you can play around with now. In ten years, when we’re all running open-silicon “Arduinos”, remember this time. And if you haven’t been watching [Clifford Wolf], you might have missed that he wrote a 3D modelling software called openSCAD or a free FPGA toolchain, project Icestorm.

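The “Reflection On 2017” series

Brian Bailey of Semiconductor Engineering published his reflections on the industry in 2017. Every year he asks his friends in the industry for their predictions for the coming year, and the following year he has them revisit those predictions to see which came true and which turned out to be surprises.

Schirrmeister had predicted that 2017 would be a very interesting year for the microprocessor industry. During the year MIPS carried on through the Imagination transaction, while RISC-V attracted a great deal of attention. More and more companies are endorsing RISC-V, and some have announced that they are doing related development work. This trend will continue in 2018, forcing some traditional vendors to react quickly, as Arm did very actively with DesignStart in response to RISC-V.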

Schirrmeister also predicted that 2017 would be an interesting year for processor architectures. At that point he had said that “even Open Source hardware architectures are looking like they will be very relevant. It’s definitely one of the most entertaining spaces to watch in 2017 and for years to come.” He now responds. “The battle is raging and has resulted in some interesting new business models. MIPS has been spun out as part of the overall Imagination transition, and RISC-V certainly enjoys A LOT of attention with more companies endorsing it and announcing serious product developments for it. This trend will only accelerate in 2018 and has forced some of the traditional players to react fast—like Arm did with its Design Start initiative.”

“The semiconductor industry witnessed a consolidation slowdown with new startups offering free, open solutions for today’s design challenges – not to mention established companies moving away from closed architectures,” adds Rick O’Connor, executive director of the RISC-V Foundation. “There is a growing interest in open-source instruction set architectures (ISAs), such as RISC-V. The portability and flexibility of the RISC-V architecture has driven innovation in a number of applications, addressing the increasing demands of our connected world from big data to the IoT. This newfound freedom in silicon design has also encouraged collaboration across the ecosystem by fostering a system-level approach to SoC design.”

…

In fact, the RISC-V appears to be creating a lot of excitement. Rick O’Connor, executive director of the RISC-V Foundation, notes that “in 2017, the semiconductor industry witnessed a consolidation slowdown with new startups offering free, open solutions for today’s design challenges – not to mention established companies moving away from closed architectures. There is a growing interest in open-source instruction set architectures (ISAs), such as RISC-V. The portability and flexibility of the RISC-V architecture has driven innovation in a number of applications, addressing the increasing demands of our connected world from big data to the IoT. This newfound freedom in silicon design also has encouraged collaboration across the ecosystem by fostering a system-level approach to SoC design.”