Go has good support for calling into assembly, and a lot of the fast cryptographic code in the stdlib is carefully optimized assembly, bringing speedups of over 20 times.

However, writing assembly code is hard, reviewing it is possibly harder, and cryptography is unforgiving. Wouldn't it be nice if we could write these hot functions in a higher level language?

This post is the story of a slightly-less-than-sane experiment to call Rust code from Go fast enough to replace assembly. No need to know Rust, or compiler internals, but knowing what a linker is would help.

Why Rust

I'll be upfront: I don't know Rust, and don't feel compelled to do my day-to-day programming in it. However, I know Rust is a very tweakable and optimizable language, while still more readable than assembly. (After all, everything is more readable than assembly!)

Go strives to find defaults that are good for its core use cases, and only accepts features that are fast enough to be enabled by default, in a constant and successful fight against knobs. I love it for that. But for what we are doing today we need a language that won't flinch when asked to generate stack-only functions with manually hinted away safety checks.

So if there's a language that we might be able to constrain enough to behave like assembly, and to optimize enough to be as useful as assembly, it might be Rust.

Finally, Rust is safe, actively developed, and not least, there's already a good ecosystem of high-performance Rust cryptography code to tap into.

Why not cgo

Go has a Foreign Function Interface, cgo. cgo allows Go programs to call C functions in the most natural way possible—which is unfortunately not very natural at all. (I know more than I'd like to about cgo, and I can tell you it's not fun.)

By using the C ABI as lingua franca of FFIs, we can call anything from anything: Rust can compile into a library exposing the C ABI, and cgo can use that. It's awkward, but it works.

We can even use reverse-cgo to build Go into a C library and call it from random languages, like I did with Python as a stunt. (It was a stunt folks, stop taking me seriously.)

But cgo does a lot of things to enable that bit of Go naturalness it provides: it will setup a whole stack for C to live in, it makes defer calls to prepare for a panic in a Go callback... this could be will be a whole post of its own.

As a result, the performance cost of each cgo call is way too high for the use case we are thinking about—small hot functions.

Linking it together

So here's the idea: if we have Rust code that is as constrained as assembly, we should be able to use it just like assembly, and call straight into it. Maybe with a thin layer of glue.

We don't have to work at the IR level: the Go compiler converts both code and high-level assembly into machine code before linking since Go 1.3.

This is confirmed by the existence of "external linking", where the system linker is used to put together a Go program. It's how cgo works, too: it compiles C with the C compiler, Go with the Go compiler, and links it all together with clang or gcc. We can even pass flags to the linker with CGO_LDFLAGS.

Underneath all the safety features of cgo, we surely find a cross-language function call, after all.

It would be nice if we could figure out how to do this without patching the compiler, though. First, let's figure out how to link a Go program with a Rust archive.

I could not find a decent way to link against a foreign blob with go build (why should there be one?) except using #cgo directives. However, invoking cgo makes .s files go to the C compiler instead of the Go one, and my friends, we will need Go assembly.

Thankfully go/build is nothing but a frontend! Go offers a set of low level tools to compile and link programs, go build just collects files and invokes those tools. We can follow what it does by using the -x flag.

I built this small Makefile by following a -x -ldflags "-v -linkmode=external '-extldflags=-v'" invocation of a cgo build.

rustgo: rustgo.a  
        go tool link -o rustgo -extld clang -buildmode exe -buildid b01dca11ab1e -linkmode external -v rustgo.a

rustgo.a: hello.go hello.o  
        go tool compile -o rustgo.a -p main -buildid b01dca11ab1e -pack hello.go
        go tool pack r rustgo.a hello.o

hello.o: hello.s  
        go tool asm -I "$(shell go env GOROOT)/pkg/include" -D GOOS_darwin -D GOARCH_amd64 -o hello.o hello.s

This compiles a simple main package composed of a Go file (hello.go) and a Go assembly file (hello.s).

Now, if we want to link in a Rust object we first build it as a static library...

libhello.a: hello.rs  
        rustc -g -O --crate-type staticlib hello.rs

... and then just tell the external linker to link it together.

rustgo: rustgo.a libhello.a  
        go tool link -o rustgo -extld clang -buildmode exe -buildid b01dca11ab1e -linkmode external -v -extldflags='-lhello -L"$(CURDIR)"' rustgo.a
$ make
go tool asm -I "/usr/local/Cellar/go/1.8.1_1/libexec/pkg/include" -D GOOS_darwin -D GOARCH_amd64 -o hello.o hello.s  
go tool compile -o rustgo.a -p main -buildid b01dca11ab1e -pack hello.go  
go tool pack r rustgo.a hello.o  
rustc --crate-type staticlib hello.rs  
note: link against the following native artifacts when linking against this static library

note: the order and any duplication can be significant on some platforms, and so may need to be preserved

note: library: System

note: library: c

note: library: m

go tool link -o rustgo -extld clang -buildmode exe -buildid b01dca11ab1e -linkmode external -v -extldflags="-lhello -L/Users/filippo/code/misc/rustgo" rustgo.a  
HEADER = -H1 -T0x1001000 -D0x0 -R0x1000  
searching for runtime.a in /usr/local/Cellar/go/1.8.1_1/libexec/pkg/darwin_amd64/runtime.a  
searching for runtime/cgo.a in /usr/local/Cellar/go/1.8.1_1/libexec/pkg/darwin_amd64/runtime/cgo.a  
 0.00 deadcode
 0.00 pclntab=166785 bytes, funcdata total 17079 bytes
 0.01 dodata
 0.01 symsize = 0
 0.01 symsize = 0
 0.01 reloc
 0.01 dwarf
 0.02 symsize = 0
 0.02 reloc
 0.02 asmb
 0.02 codeblk
 0.03 datblk
 0.03 sym
 0.03 headr
 0.06 host link: "clang" "-m64" "-gdwarf-2" "-Wl,-headerpad,1144" "-Wl,-no_pie" "-Wl,-pagezero_size,4000000" "-o" "rustgo" "-Qunused-arguments" "/var/folders/ry/v14gg02d0y9cb2w9809hf6ch0000gn/T/go-link-412633279/go.o" "/var/folders/ry/v14gg02d0y9cb2w9809hf6ch0000gn/T/go-link-412633279/000000.o" "-g" "-O2" "-lpthread" "-lhello" "-L/Users/filippo/code/misc/rustgo"
 0.34 cpu time
12641 symbols  
5764 liveness data  

Jumping into Rust

Alright, so we linked it, but the symbols are not going to do anything just by sitting next to each other. We need to somehow call the Rust function from our Go code.

We know how to call a Go function from Go. In assembly the same call looks like CALL hello(SB), where SB is a virtual register all global symbols are relative to.

If we want to call an assembly function from Go we make the compiler aware of its existence like a C header, by writing func hello() without a function body.

I tried all combinations of the above to call an external (Rust) function, but they all complained that they couldn't find either the symbol name, or the function body.

But cgo, which at the end of the day is just a giant code generator, somehow manages to eventually invoke that foreign function! How?

I stumbled upon the answer a couple days later.

//go:cgo_import_static _cgoPREFIX_Cfunc__Cmalloc
//go:linkname __cgofn__cgoPREFIX_Cfunc__Cmalloc _cgoPREFIX_Cfunc__Cmalloc
var __cgofn__cgoPREFIX_Cfunc__Cmalloc byte  
var _cgoPREFIX_Cfunc__Cmalloc = unsafe.Pointer(&__cgofn__cgoPREFIX_Cfunc__Cmalloc)  

That looks like an interesting pragma! //go:linkname just creates a symbol alias in the local scope (which can be used to call private functions!), and I'm pretty sure the byte trick is only cleverness to have something to take the address of, but //go:cgo_import_static... this imports an external symbol!

Armed with this new tool and the Makefile above, we have a chance to invoke this Rust function (hello.rs)

#[no_mangle]
pub extern fn hello() {  
    println!("Hello, Rust!");
}

(The no-mangle-pub-extern incantation is from this tutorial.)

from this Go program (hello.go)

package main

//go:cgo_import_static hello

func trampoline()

func main() {  
    println("Hello, Go!")
    trampoline()
}

with the help of this assembly snippet. (hello.s)

TEXT ·trampoline(SB), 0, $2048  
    JMP hello(SB)
    RET

CALL was a bit too smart to work, but using a simple JMP...

Hello, Go!  
Hello, Rust!  
panic: runtime error: invalid memory address or nil pointer dereference  
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]