Tag Archives: cgo

cgo is not Go

To steal a quote from JWZ,

Some people, when confronted with a problem, think “I know, I’ll use cgo.”
Now they have two problems.

Recently the use of cgo came up on the Gophers’ slack channel and I voiced my concerns that using cgo, especially on a project that is intended to showcase Go inside an organisation was a bad idea. I’ve said this a number of times, and people are probably sick of hearing my spiel, so I figured that I’d write it down and be done with it.

cgo is an amazing technology which allows Go programs to interoperate with C libraries. It’s a tremendously useful feature without which Go would not be in the position it is today. cgo is key to ability to run Go programs on Android and iOS.

However, and to be clear these are my opinions, I am not speaking for anyone else, I think cgo is overused in Go projects. I believe that when faced with reimplementing a large piece of C code in Go, programmers choose instead to use cgo to wrap the library, believing that it is a more tractable problem. I believe this is a false economy.

Obviously, there are some cases where cgo is unavoidable, most notably where you have to interoperate with a graphics driver or windowing system that is only available as a binary blob. But those cases where cgo’s use justifies its trade-offs are fewer and further between than many are prepared to admit.

Here is an incomplete list of trade-offs you make, possibly without realising them, when you base your Go project on a cgo library.

Slower build times

When you import "C" in your Go package, go build has to do a lot more work to build your code. Building your package is no longer simply passing a list of all the .go files in scope to a single invocation of go tool compile, instead:

The cgo tool needs to be invoked to generate the C to Go and Go to C thunks and stubs.
Your system C compiler has to be invoked for every C file in the package.
The individual compilation units are combined together into a single .o file.
The resulting .o file take a trip through the system linker for fix-ups against shared objects they reference.

All this work happens every time you compile or test your package, which is constantly, if you’re actively working in that package. The Go tool parallelises some of this work where possible, but your packages’ compile time just grew to include a full rebuild of all that C code.

It’s possible to work around this by pushing the cgo shims out into their own package, avoiding the compile time hit, but now you’ve had to restructure your application to work around a problem that you didn’t have before you started to use cgo.

Oh, and you have to debug C compilation failures on the various platforms your package supports.

Complicated builds

One of the goals of Go was to produce a language who’s build process was self describing; the source of your program contains enough information for a tool to build the project. This is not to say that using a Makefile to automate your build workflow is bad, but before cgo was introduced into a project, you may not have needed anything but the go tool to build and test. Afterwards, to set all the environment variables, keep track of shared objects and header files that may be installed in weird places, now you do.

Keep in mind that Go supports platforms that don’t ship with make out of the box, so you’ll have to dedicate some time to coming up with a solution for your Windows users.

Oh, and now your users have to have a C compiler installed, not just a Go compiler. They also have to install the C libraries your project depends on, so you’ll be taking on that support cost as well.

Cross compilation goes out the window

Go’s support for cross compilation is best in class. As of Go 1.5 you can cross compile from any supported platform to any other platform with the official installer available on the Go project website.

By default cgo is disabled when cross compiling. Normally this isn’t a problem if your project is pure Go. When you mix in dependencies on C libraries, you either have to give up the option to cross compile your product, or you have to invest time in finding and maintaining cross compilation C toolchains for all your targets.

Maybe if you work on a product that only communicates with clients over TCP sockets and you intend to run it in a SaaS model it’s reasonable to say that you don’t care about cross compilation. However, if you’re making a product which others will use, possibly integrated into their products, maybe it’s a monitoring solution, maybe it’s a client for your SaaS service, then you’ve locked them out of being able to easily cross compile.

The number of platforms that Go supports continues to grow. Go 1.5 added support for 64 bit ARM and PowerPC. Go 1.6 adds support for 64 bit MIPS, and IBM’s s390 architecture is touted for Go 1.7. RISC-V is in the pipeline. If your product relies on a C library, not only do you have the all problems of cross compilation described above, you also have to make sure the C code you depend on works reliably on the new platforms Go is supporting — and you have to do that with the limited debuggability a C/Go hybrid affords you. Which brings me to my next point.

You lose access to all your tools

Go has great tools; we have the race detector, pprof for profiling code, coverage, fuzz testing, and source code analysis tools. None of those work across the cgo blood/brain barrier.

Conversely excellent tools like valgrind don’t understand Go’s calling conventions or stack layout. On that point, Ian Lance Taylor’s work to integrate clang’s memory sanitiser to debug dangling pointers on the C side will be of benefit for cgo users in Go 1.6.

Combing Go code and C code results in the intersection of both worlds, not the union; the memory safety of C, and the debuggability of a Go program.

Performance will always be an issue

C code and Go code live in two different universes, cgo traverses the boundary between them. This transition is not free and depending on where it exists in your code, the cost could be inconsequential, or substantial.

C doesn’t know anything about Go’s calling convention or growable stacks, so a call down to C code must record all the details of the goroutine stack, switch to the C stack, and run C code which has no knowledge of how it was invoked, or the larger Go runtime in charge of the program.

To be fair, Go doesn’t know anything about C’s world either. This is why the rules for passing data between the two have become more onerous over time as the compiler becomes better at spotting stack data that is no longer considered live, and the garbage collector becomes better at doing the same for the heap.

If there is a fault while in the C universe, the Go code has to recover enough state to at least print a stack trace and exit the program cleanly, rather than barfing up a core file.

Managing this transition across call stacks, especially where signals, threads and callbacks are involved is non trivial, and again Ian Lance Taylor has done a huge amount of work in Go 1.6 to improve the interoperability of signal handling with C.

The take away is that the transition between the C and Go world is non trivial, and it will never be free from overhead.

C calls the shots, not your code

It doesn’t matter which language you’re writing bindings or wrapping C code with; Python, Java with JNI, some language using libFFI, or Go via cgo; it’s C’s world, you’re just living in it.

Go code and C code have to agree on how resources like address space, signal handlers, and thread TLS slots are to be shared — and when I say agree, I actually mean Go has to work around the C code’s assumption. C code that can assume it always runs on one thread, or blithely be unprepared to work in a multi threaded environment at all.

You’re not writing a Go program that uses some logic from a C library, instead you’re writing a Go program that has to coexist with a belligerent piece of C code that is hard to replace, has the upper hand negotiations, and doesn’t care about your problems.

Deployment gets more complicated

Any presentation on Go to a general audience will contain at least one slide with these words:

Single, static binary

This is Go’s ace in the hole that has lead it to become a poster child of the movement away from virtual machines and managed runtimes. Using cgo, you give that up.

Depending on your environment, it’s probably possible to build your Go project into a deb or rpm, and assuming your other dependencies are also packaged, add them as an install dependency and push the problem off the operating system’s package manager. But that’s several significant changes to a build and deploy process that was previously as straight forward as go build && scp.

It is possible to compile a Go program entirely statically, but it is by no means simple and shows that the ramifications of including cgo in your project will ripple through your entire build and deploy life cycle.

Choose wisely

To be clear, I am not saying that you should not use cgo. But before you make that Faustian bargain, please consider carefully the qualities of Go that you’ll be giving up in return.

Associative commentary

This is a quick post to discuss the rules of comments in Go.

To quickly recap, Go comments come in two forms

// everything from the double slash to the end of line is a comment
/* everything from the opening slash star, to the closing one is a comment */

As the first form declares that the remainder of the line is a comment, if you want to comment out more than one line, you need to do this

// this comment form is useful for
// commenting out sections of your code
// as you are working

The second form is also useful, and generally preferred, for large blocks of commentary

/*
The generally accepted rule when writing large
comment blocks in this form is to leave a newline
at the start and the end of your comment
*/

One important thing to note is that comments do not nest, thus

// // This is fine because everything from the double 
// // slash to the end of line is ignored

/* 
But, if you you were to start a new
/* comment inside an old one 
the closing star slash would close the comment block and */
this line would generate a syntax error 
*/

Association

A feature of the tools that consume Go source code, not the language, is the convention that a comment which directly preceeds a declaration is associated with that declaration.

// A Foo is a Fooid in the class of Endofoos.
func Foo() { .... }

Conversely, a comment followed by a newline stands alone, it’s just a comment.

package foo

// utility foos

func Quxx() { ... }

godoc allows comments to be associated with any of the top level declarations; package, var, const, type, and func.

import “C”

The trap that catches people out when they are using cgo is they don’t realise the significance of the newline, or more correctly, the lack of newline between their block of C code and the import "C" declaration.

package main

/*
#include "stdio.h"
*/
import "C"

func main() {
        C.printf(C.CString("Hello world\n"))
}

In this example the import "C" declaration is immediately preceded by the comment block containing our C code, in this case including stdio.h to obtain a reference to the printf function.

If there was a newline between the comment block and import "C" then the cgo preprocessor would not associate the comment with the import declaration and act as if #include "stdio.h" was never there.

% go run cgo.go
# command-line-arguments
37: error: use of undeclared identifier 'printf'

Update don’t forget to read the follow up to this post.

How to include C code in your Go package

It looks like Go 1.4 will remove support for Go packages containing C code (as described below, don’t confuse this with CGO), so enjoy it while it lasts.

This is a short post designed to illustrate how Go package authors can write package level functions in C and call them from Go code without using cgo. The code for this article is available on GitHub, https://github.com/davecheney/ccode.

Some warnings

Before we start, there are a few warnings that need to be spelled out

Using C code is inherently unsafe, not just because it unholsters all the C footguns, but because you can address any symbol in the runtime. With great power comes great responsibility.
The Go 1 compatibility guarantee does not extend to C code.
C functions cannot be inlined.
Escape analysis cannot follow values passed into C functions.
Code coverage does not extend to C functions.
The C compilers (5c, 6c, 8c) are not as optimised as their companion Go compilers, you may find that the code generated is not as efficient as the same code in Go.
You are writing plan 9 style C code, which is a rough analogue of C89.

Returning a value

The first example is a simple function called True that always returns true.

void ·True(bool res) {
        res = true;
        FLUSH(&res);
}

Even with this simple example there is a lot going on, let’s start with the function signature. The signature of True is void ·True(bool res). The void return code is required as all C to Go interworking is done via arguments passed on the stack. The Interpunkt, the middle dot, ·, is part of the package naming system in Go. By proceeding the name of the function with · we are declaring this function is part of the current package. It is possible to define C functions in other packages, or to provide a package name before the interpunkt, but it gets complicated when your package is heavily namespaced and so is beyond the scope of this article.

The next part of the function signature is the return argument specified in the C declaration style. Both calling and return arguments are supplied as parameters to any C function you want to call from Go. We’ll see in a moment how to write the corresponding forward declaration.

Moving on to the body of the function, assigning true to res is fairly straight forward, but the final line, FLUSH(&res) needs some explanation. Because res is not used inside the body of the function a sufficiently aggressive compiler may optimise the assignment away. FLUSH is used to ensure the final value of res is written back to the stack.

The forward declaration

To make the True function available to our Go code, we need to write a forward declaration. Without the forward declaration the function is invisible to the Go compiler. This is unrelated to the normal rules for making Go a symbol public or private via a capital letter.

A forward declaration for the True function looks like this

// True always returns true.
func True() bool

The forward declaration says that True takes no arguments and returns one boolean argument. As this is a normal Go function, you can attach a comment describing the function which will appear in godoc (comments on the function in C code will not appear in documentation).

That is all you need to do make True available to Go code in your package. It should be noted that while True is a public function, this was not required.

Passing arguments to C functions

Extending from the previous example, let’s define a function called Max which returns the maximum of two ints.

void ·Max(intptr a, intptr b, intptr res) {
        res = a > b ? a : b;
        FLUSH(&res);
}

Max is similar to the previous function; the first two arguments are function arguments, the final is the return value. Using res for the name of the return argument is not required, but appears to be the convention used heavily throughout the standard library.

The type of a and b is intptr which is the C equivalent of Go’s platform dependant int type.

The forward declaration of Max is shown below. You can see the how function arguments and return values map between Go and C functions.

// Max returns the maximum of two integers.
func Max(a, b int) int

Passing addresses

In the previous two examples we have passed values to functions and returned copies of the result via the stack. In Go, all arguments are passed by value, and calling to C functions is no different. For this final example we’ll write a function that increments a value by passing a pointer to that value.

void ·Inc(intptr* addr) {
        *addr+=1;
        USED(addr);
}

The Inc function takes the address of a intptr (a *int in Go terms), dereferences it, increments it by one, and stores the result at the address addr. The USED macro is similar in function to FLUSH and is used mainly to silence the compiler.

Looking at the forward declaration, we define the function to take a pointer to the int to be incremented.

// Inc increments the value of the integer add address p.
func Inc(p *int)

Putting it all together

To demonstate using these C defined functions in Go code I’ve written a few tests which exercise the code. The code for the tests are here, and the results of running the tests are shown below.

% go test -v
github.com/davecheney/ccode
=== RUN TestTrue
--- PASS: TestTrue (0.00 seconds)
=== RUN TestMax
--- PASS: TestMax (0.00 seconds)
=== RUN TestInc
--- PASS: TestInc (0.00 seconds)
PASS
ok      github.com/davecheney/ccode     0.005s

Conclusion

In this short article I’ve shown how you can write a Go package that includes functions written in C. While a quite niche use case, it may come in handy for someone and also lays important groundwork for writing packages containing functions in raw assembler.

An introduction to cross compilation with Go 1.1

This post is a compliment to one I wrote in August of last year, updating it for Go 1.1. Since last year tools such as goxc have appeared which go a beyond a simple shell wrapper to provide a complete build and distribution solution.

Introduction

Go provides excellent support for producing binaries for foreign platforms without having to install Go on the target. This is extremely handy for testing packages that use build tags or where the target platform is not suitable for development.

Support for building a version of Go suitable for cross compilation is built into the Go build scripts; just set the GOOS, GOARCH, and possibly GOARM correctly and invoke ./make.bash in $GOROOT/src. Therefore, what follows is provided simply for convenience.

Getting started

1. Install Go from source. The instructions are well documented on the Go website, golang.org/doc/install/source. A summary for those familiar with the process follows.

% hg clone https://code.google.com/p/go
% cd go/src
% ./all.bash

2. Checkout the support scripts from Github, github.com/davecheney/golang-crosscompile

% git clone git://github.com/davecheney/golang-crosscompile.git
% source golang-crosscompile/crosscompile.bash

3. Build Go for all supported platforms

% go-crosscompile-build-all
go-crosscompile-build darwin/386
go-crosscompile-build darwin/amd64
go-crosscompile-build freebsd/386
go-crosscompile-build freebsd/amd64
go-crosscompile-build linux/386
go-crosscompile-build linux/amd64
go-crosscompile-build linux/arm
go-crosscompile-build windows/386
go-crosscompile-build windows/amd64

This will compile the Go runtime and standard library for each platform. You can see these packages if you look in go/pkg.

% ls -1 go/pkg 
darwin_386
darwin_amd64
freebsd_386
freebsd_amd64
linux_386
linux_amd64
linux_arm
obj
tool
windows_386
windows_amd64

Using your cross compilation environment

Sourcing crosscompile.bash provides a go-$GOOS-$GOARCH function for each platform, you can use these as you would the standard go tool. For example, to compile a program to run on linux/arm.

% cd $GOPATH/github.com/davecheney/gmx/gmxc
% go-linux-arm build 
% file ./gmxc 
./gmxc: ELF 32-bit LSB executable, ARM, version 1 (SYSV), 
statically linked, not stripped

This file is not executable on the host system (darwin/amd64), but will work on linux/arm.

Some caveats

Cross compiled binaries, not a cross compiled Go installation

This post describes how to produce an environment that will build Go programs for your target environment, it will not however build a Go environment for your target. For that, you must build Go directly on the target platform. For most platforms this means installing from source, or using a version of Go provided by your operating systems packaging system.
If you are using

No cgo in cross platform builds

It is currently not possible to produce a cgo enabled binary when cross compiling from one operating system to another. This is because packages that use cgo invoke the C compiler directly as part of the build process to compile their C code and produce the C to Go trampoline functions. At the moment the name of the C compiler is hard coded to gcc, which assumes the system default gcc compiler even if a cross compiler is installed.

In Go 1.1 this restriction was reinforced further by making CGO_ENABLED default to 0 (off) when any cross compilation was attempted.

GOARM flag needed for cross compiling to linux/arm.

Because some arm platforms lack a hardware floating point unit the GOARM value is used to tell the linker to use hardware or software floating point code. Depending on the specifics of the target machine you are building for, you may need to supply this environment value when building.

% GOARM=5 go-linux-arm build

As of e4b20018f797 you will at least get a nice error telling you which GOARM value to use.

$ ./gmxc 
runtime: this CPU has no floating point hardware, so it cannot 
run this GOARM=7 binary. Recompile using GOARM=5.

By default, Go assumes a hardware floating point unit if no GOARM value is supplied. You can read more about Go on linux/arm on the Go Language Community Wiki.

Dave Cheney

The acme of foolishness