Unhelpful abstractions

Sandi Metz’s post on abstraction struck a chord with me recently. I was working with a piece of code which looked like this (in pseudo code):

func Start() {
        const filename = "..."
        createOutputFile(filename)
        go run(filename)
}

It turned out that createOutputFile was written in an obscure way which first caused me to look at it more closely. Why the code expected the file to exist before starting wasn’t immediately clear. It may have been because some other goroutine was expecting the file to exist on disk, even if nothing had been written yet (slight race smell), or more likely the necessary information was not available for the job itself to create the file with the correct permissions. This calls for a refactoring!

There is a well known UNIX utility that provides these semantics, touch(1). So, I reasoned, createOutputFile is really touch plus the ability to set file permissions, which the former was hard coded to do implicitly. This was a job for abstraction!

How would you write the signature for TouchFile? Here is what I came up with:

// TouchFile ensures path exists, or creates it using
// the supplied file mode.
func TouchFile(path string, mode os.FileMode) error

This looked pretty reasonable, and was a nice generalisation over the previous function. TouchFile makes setting the file mode on creation, the primary reason why this code existed in the first place, explicit.

However, this is precisely the train of thought that Metz warned of in her post. By generalising this function I had made its API worse.

Specifically, now every caller to this function has to pass in a mode value, even if the file exists, even if they don’t really care and are happy with a default file mode. Worse still, mode is only applied if the file is not already present. Not only had I made this function harder to use in its default use case, I’d added a footgun to the API that someone might call this function expecting it to update the mode of an existing file.

The second clue that I was heading in the wrong direction was the implementation of TouchFile itself:

func TouchFile(path string, mode os.FileMode) error {
        f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE, mode)
        if err != nil {
                return err
        }
        return f.Close()
}

TouchFile just calls os.OpenFile passing in the right flags to get the create-if-missing semantics it wants. You still have to pass in the mode, because os.OpenFile requires a mode, so the utility of TouchFile as a wrapper is undermined by the cognitive overhead of having to remember its quirks.
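The footgun can be seen in a short sketch. It assumes a Unix-like system, and the temporary directory and the output.log file name are hypothetical; TouchFile is the function from the text. The file is created with mode 0600, then touched again asking for 0644.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// TouchFile is the wrapper from the text: it ensures path exists,
// creating it with mode only if it is missing.
func TouchFile(path string, mode os.FileMode) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE, mode)
	if err != nil {
		return err
	}
	return f.Close()
}

// modeAfterTouch creates a file with mode 0600, touches it a second time
// asking for 0644, and reports the permission bits the file ends up with.
func modeAfterTouch() (os.FileMode, error) {
	dir, err := os.MkdirTemp("", "touch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "output.log") // hypothetical file name
	if err := TouchFile(path, 0600); err != nil {
		return 0, err
	}
	// The file already exists, so the mode argument is silently ignored.
	if err := TouchFile(path, 0644); err != nil {
		return 0, err
	}
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return fi.Mode().Perm(), nil
}

func main() {
	mode, err := modeAfterTouch()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("mode after second TouchFile: %v\n", mode)
}
```

The second call succeeds without complaint, yet the file keeps the permissions it was created with. A caller who wanted 0644 would need os.Chmod instead.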

Coming to my senses, I reverted my change and replaced createOutputFile with a direct call to os.OpenFile.

Someone reading this code and seeing a call to TouchFile may think that the goal is to ensure the file exists, and miss the subtle point that the purpose of Start was to ensure that the file exists with the right permissions. By making a direct call to the os package in the body of the function, it becomes explicit to the next reader, who already knows the os package, that the reason the file is being created explicitly is to set the mode.
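A minimal sketch of what the direct call might look like follows. The original code is only shown as pseudo code, so the 0600 mode, the run callback, and the demo helper are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// start mirrors Start from the text, with the call to os.OpenFile inlined
// so the file mode is visible at the call site. The 0600 mode and the run
// callback are assumptions of this sketch.
func start(filename string, run func(string)) error {
	f, err := os.OpenFile(filename, os.O_WRONLY|os.O_CREATE, 0600)
	if err != nil {
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	go run(filename)
	return nil
}

// demo calls start against a temporary directory and returns the base name
// of the file the background job was handed.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "start")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	started := make(chan string)
	path := filepath.Join(dir, "output.log") // hypothetical file name
	if err := start(path, func(name string) { started <- name }); err != nil {
		return "", err
	}
	return filepath.Base(<-started), nil
}

func main() {
	name, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("job started for", name)
}
```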

I realised that I hadn’t made things simpler by adding this abstraction, instead I’d made them more opaque. Sometimes it’s better to be explicit than abstract.

cgo is not Go

To steal a quote from JWZ,

Some people, when confronted with a problem, think “I know, I’ll use cgo.”
Now they have two problems.

Recently the use of cgo came up on the Gophers’ slack channel and I voiced my concerns that using cgo, especially on a project that is intended to showcase Go inside an organisation was a bad idea. I’ve said this a number of times, and people are probably sick of hearing my spiel, so I figured that I’d write it down and be done with it.

cgo is an amazing technology which allows Go programs to interoperate with C libraries. It’s a tremendously useful feature without which Go would not be in the position it is today. cgo is key to the ability to run Go programs on Android and iOS.

However, and to be clear these are my opinions, I am not speaking for anyone else, I think cgo is overused in Go projects. I believe that when faced with reimplementing a large piece of C code in Go, programmers choose instead to use cgo to wrap the library, believing that it is a more tractable problem. I believe this is a false economy.

Obviously, there are some cases where cgo is unavoidable, most notably where you have to interoperate with a graphics driver or windowing system that is only available as a binary blob. But those cases where cgo’s use justifies its trade-offs are fewer and further between than many are prepared to admit.

Here is an incomplete list of trade-offs you make, possibly without realising them, when you base your Go project on a cgo library.

Slower build times

When you import "C" in your Go package, go build has to do a lot more work to build your code. Building your package is no longer simply passing a list of all the .go files in scope to a single invocation of go tool compile, instead:

  • The cgo tool needs to be invoked to generate the C to Go and Go to C thunks and stubs.
  • Your system C compiler has to be invoked for every C file in the package.
  • The individual compilation units are combined together into a single .o file.
  • The resulting .o file takes a trip through the system linker for fix-ups against the shared objects it references.

All this work happens every time you compile or test your package, which is constantly, if you’re actively working in that package. The Go tool parallelises some of this work where possible, but your packages’ compile time just grew to include a full rebuild of all that C code.

It’s possible to work around this by pushing the cgo shims out into their own package, avoiding the compile time hit, but now you’ve had to restructure your application to work around a problem that you didn’t have before you started to use cgo.

Oh, and you have to debug C compilation failures on the various platforms your package supports.

Complicated builds

One of the goals of Go was to produce a language whose build process was self describing; the source of your program contains enough information for a tool to build the project. This is not to say that using a Makefile to automate your build workflow is bad, but before cgo was introduced into a project, you may not have needed anything but the go tool to build and test. Afterwards, to set all the environment variables, keep track of shared objects and header files that may be installed in weird places, now you do.

Keep in mind that Go supports platforms that don’t ship with make out of the box, so you’ll have to dedicate some time to coming up with a solution for your Windows users.

Oh, and now your users have to have a C compiler installed, not just a Go compiler. They also have to install the C libraries your project depends on, so you’ll be taking on that support cost as well.

Cross compilation goes out the window

Go’s support for cross compilation is best in class. As of Go 1.5 you can cross compile from any supported platform to any other platform with the official installer available on the Go project website.

By default cgo is disabled when cross compiling. Normally this isn’t a problem if your project is pure Go. When you mix in dependencies on C libraries, you either have to give up the option to cross compile your product, or you have to invest time in finding and maintaining cross compilation C toolchains for all your targets.

Maybe if you work on a product that only communicates with clients over TCP sockets and you intend to run it in a SaaS model it’s reasonable to say that you don’t care about cross compilation. However, if you’re making a product which others will use, possibly integrated into their products, maybe it’s a monitoring solution, maybe it’s a client for your SaaS service, then you’ve locked them out of being able to easily cross compile.

The number of platforms that Go supports continues to grow. Go 1.5 added support for 64 bit ARM and PowerPC. Go 1.6 adds support for 64 bit MIPS, and IBM’s s390 architecture is touted for Go 1.7. RISC-V is in the pipeline. If your product relies on a C library, not only do you have all the problems of cross compilation described above, you also have to make sure the C code you depend on works reliably on the new platforms Go is supporting, and you have to do that with the limited debuggability a C/Go hybrid affords you. Which brings me to my next point.

You lose access to all your tools

Go has great tools; we have the race detector, pprof for profiling code, coverage, fuzz testing, and source code analysis tools. None of those work across the cgo blood/brain barrier.

Conversely excellent tools like valgrind don’t understand Go’s calling conventions or stack layout.  On that point, Ian Lance Taylor’s work to integrate clang’s memory sanitiser to debug dangling pointers on the C side will be of benefit for cgo users in Go 1.6.

Combining Go code and C code results in the intersection of both worlds, not the union; the memory safety of C, and the debuggability of a Go program.

Performance will always be an issue

C code and Go code live in two different universes, cgo traverses the boundary between them. This transition is not free and depending on where it exists in your code, the cost could be inconsequential, or substantial.

C doesn’t know anything about Go’s calling convention or growable stacks, so a call down to C code must record all the details of the goroutine stack, switch to the C stack, and run C code which has no knowledge of how it was invoked, or the larger Go runtime in charge of the program.

To be fair, Go doesn’t know anything about C’s world either. This is why the rules for passing data between the two have become more onerous over time as the compiler becomes better at spotting stack data that is no longer considered live, and the garbage collector becomes better at doing the same for the heap.

If there is a fault while in the C universe, the Go code has to recover enough state to at least print a stack trace and exit the program cleanly, rather than barfing up a core file.

Managing this transition across call stacks, especially where signals, threads and callbacks are involved is non trivial, and again Ian Lance Taylor has done a huge amount of work in Go 1.6 to improve the interoperability of signal handling with C.

The take away is that the transition between the C and Go world is non trivial, and it will never be free from overhead.

C calls the shots, not your code

It doesn’t matter which language you’re writing bindings or wrapping C code with; Python, Java with JNI, some language using libFFI, or Go via cgo; it’s C’s world, you’re just living in it.

Go code and C code have to agree on how resources like address space, signal handlers, and thread TLS slots are to be shared, and when I say agree, I actually mean Go has to work around the C code’s assumptions. C code can assume it always runs on one thread, or be blithely unprepared to work in a multi threaded environment at all.

You’re not writing a Go program that uses some logic from a C library, instead you’re writing a Go program that has to coexist with a belligerent piece of C code that is hard to replace, has the upper hand in negotiations, and doesn’t care about your problems.

Deployment gets more complicated

Any presentation on Go to a general audience will contain at least one slide with these words:

Single, static binary

This is Go’s ace in the hole that has led it to become a poster child of the movement away from virtual machines and managed runtimes. Using cgo, you give that up.

Depending on your environment, it’s probably possible to build your Go project into a deb or rpm, and assuming your other dependencies are also packaged, add them as an install dependency and push the problem off to the operating system’s package manager. But that’s several significant changes to a build and deploy process that was previously as straight forward as go build && scp.

It is possible to compile a Go program entirely statically, but it is by no means simple and shows that the ramifications of including cgo in your project will ripple through your entire build and deploy life cycle.

Choose wisely

To be clear, I am not saying that you should not use cgo. But before you make that Faustian bargain, please consider carefully the qualities of Go that you’ll be giving up in return.

Are Go maps sensitive to data races ?

Panic messages from unexpected program crashes are often reported on the Go issue tracker. An overwhelming number of these panics are caused by data races, and an overwhelming number of those reports centre around Go’s built in map type.

unexpected fault address 0x0
fatal error: fault
[signal 0x7 code=0x80 addr=0x0 pc=0x40873b]

goroutine 97699 [running]:
runtime.throw(0x17f5cc0, 0x5)
    /usr/local/go/src/runtime/panic.go:527
runtime.sigpanic()
    /usr/local/go/src/runtime/sigpanic_unix.go:21
runtime.mapassign1(0x12c6fe0, 0xc88283b998, 0xc8c9b63c68, 0xc8c9b63cd8)
    /usr/local/go/src/runtime/hashmap.go:446

Why is this so ? Why is a map commonly involved with a crash ? Is Go’s map implementation inherently fragile ?

To cut to the chase: no, there is nothing wrong with Go’s map implementation. But if there is nothing wrong with the implementation, why do maps and panic reports commonly find themselves in close proximity ?

There are three reasons that I can think of.

Maps are often used for shared state

Maps are fabulously useful data structures and this makes them perfect for tasks such as a shared cache of precomputed data or a lookup table of outstanding requests. The common theme here is the map is being used to store data shared across multiple goroutines.

Maps are more complex structures

Compared to the other built in data types like channels and slices, Go maps are more complex — they aren’t just views onto a backing array of elements. Go maps contain significant internal state, and map iterators (for k, v := range m) contain even more.

Go maps are not goroutine safe, you must use a sync.Mutex, sync.RWMutex or other memory barrier primitive to ensure reads and writes are properly synchronised. Getting your locking wrong will corrupt the internal structure of the map.
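As a minimal sketch of that advice, a map shared between goroutines can be wrapped in a small type that owns its lock. The Cache type and its method names are hypothetical, chosen only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical shared lookup table guarded by a sync.RWMutex.
type Cache struct {
	mu sync.RWMutex
	m  map[string]int
}

// NewCache returns an empty, ready to use Cache.
func NewCache() *Cache {
	return &Cache{m: make(map[string]int)}
}

// Set takes the write lock, so no other goroutine can read or write the
// map while its internal structure may be changing.
func (c *Cache) Set(k string, v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

// Get takes the read lock; any number of readers may hold it at once.
func (c *Cache) Get(k string) (int, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}

// Len reports the number of entries under the read lock.
func (c *Cache) Len() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.m)
}

// fill writes n distinct keys from n goroutines and returns the final size.
func fill(n int) int {
	c := NewCache()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.Set(fmt.Sprintf("key-%d", i), i)
		}(i)
	}
	wg.Wait()
	return c.Len()
}

func main() {
	fmt.Println("entries after concurrent writes:", fill(100))
}
```

Removing the calls to Lock and Unlock from Set turns this into exactly the kind of program that produces the crash reports described above.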

Maps move things

Of all of Go’s built in data structures, maps are the only ones that move data internally. When you insert or delete entries, the map may need to rebalance itself to retain its O(1) guarantee. This is why map values are not addressable.
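One practical consequence of values not being addressable is that a struct stored in a map cannot be modified in place. A small sketch of the usual copy, modify, store workaround, using a hypothetical point type:

```go
package main

import "fmt"

// point is a hypothetical value type stored directly in a map.
type point struct{ x, y int }

// bump increments the x field of the value stored under k.
func bump(m map[string]point, k string) {
	// m[k].x++ and &m[k] are both rejected by the compiler,
	// because the map is free to move the value at any time.
	p := m[k] // copy the value out of the map
	p.x++     // modify the copy
	m[k] = p  // store the copy back
}

func main() {
	m := map[string]point{"a": {x: 1, y: 2}}
	bump(m, "a")
	fmt.Println(m["a"].x)
}
```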

Without proper synchronisation different CPUs will have different representations of the map’s internal structure in their caches. Although the language lawyers will tell you that a program with a data race exhibits undefined behaviour, it’s easy to see how having a stale copy of a map’s internal structure can lead to following a stale pointer to oblivion.

Please use the race detector

Go ships with a data race detector that works on Windows, Linux, FreeBSD and OSX. The race detector will spot this issue, and many more.

Please use it when testing your code.

How will you be programming in a decade ?

What does the computing landscape look like in a decade ?

In a word, bifurcated.

At the individual level there will be a range of battery powered devices; watches, mobile phones, tablets with removable keyboards, and those without. They will be numerous, at a wide range of price points, allowing them to be dedicated to the individual. A personal computer if you will.

Of course these devices will have to always be connected to a network and the eponymous cloud, and thus the other half of the puzzle. If you think you’re going to be able to walk downstairs in a decade and touch the hardware your software runs on — you’re in for a rude shock.

What happened to the middle ?

Well, Steve Jobs blew up the desktop market, and with it the outlook for PC shipments.

Desktop PC shipments, 2010-2019

But, but, I hear you say. You, the reader, might have a desktop computer for gaming, or enjoy software development on a workstation, rather than a laptop, or your phone.

That’s fine, nobody said you’re wrong, but you are increasingly a minority, and the economies of scale are not working in your favour.

What about us developers ?

Yongsan Electronics Market. Korea’s strategic reserve of whitebox desktop PCs stretches to the horizon.

So we know where the hardware is going, but a16z says software is eating the world. Who’s going to write all this software ? And if there are no desktop computers, how ?

Maybe, companies like Nitrous.io and Koding are right, and we’ll all be using online tools. In which case tablets with all day battery life and WiFi are the ticket — the market is certainly betting on that.

But I think there are serious and persistent problems with the idea of always on that cannot be fixed with money.

Broadband cellular or WiFi data has an upper limit. Sure, you can stack channels to make a single TCP flow go faster, but when everyone wants fast flows, and good upload bandwidth, will that scale to cities with tens of millions of individuals ? Will it scale to people who don’t want to live in said Megalopolis ?

Probably not.

The other outcome is the developer PC continues to exist, in an increasingly rarified (and expensive) form as workstations migrate to the economies of scale that drive server chip sets.

Video editing suite

This is what video editing looks like today, part PC, mostly custom packaged single use solution. Imagine what it would look like if this was what was required to produce software ?

How will you be programming in a decade ?

LISP Machine

A whirlwind tour of Go’s runtime environment variables

Introduction

The Go runtime, in addition to providing the usual services of garbage collection, goroutine scheduling, timers, network polling and so forth, contains facilities to enable extra debugging output and even alter the behaviour of the runtime itself.

These facilities are controlled by environment variables passed to the Go program. This post describes the function of the major environment variables supported by the runtime.

GOGC

GOGC is one of the oldest environment variables supported by the Go runtime. It’s possibly older than GOROOT, but nowhere near as well known.

GOGC controls the aggressiveness of the garbage collector. By default this value is assumed to be 100, which means garbage collection will not be triggered until the heap has grown by 100% since the previous collection. Effectively GOGC=100 (the default) means the garbage collector will run each time the live heap doubles.

Setting this value higher, say GOGC=200, will delay the start of a garbage collection cycle until the live heap has grown by 200% of the previous size. Setting the value lower, say GOGC=20, will cause the garbage collector to be triggered more often as less new data can be allocated on the heap before triggering a collection.

Setting GOGC=off will disable garbage collection entirely.

With the introduction of the low latency collector in Go 1.5, phrases like “trigger a garbage collection cycle” become more fluid, but the underlying message that values of GOGC greater than 100 mean the garbage collector will run less often, and for values of GOGC less than 100, more often, remains the same.
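The arithmetic can be sketched as follows, reading GOGC as the percentage of growth over the live heap. The heapGoal and swapGCPercent helpers are hypothetical and deliberately ignore the details of the collector's pacing. debug.SetGCPercent from the runtime/debug package is the in-program counterpart to the environment variable, and returns the previous setting.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// heapGoal returns the approximate heap size, in MB, at which the next
// collection would be triggered, given the live heap left by the previous
// collection and a GOGC percentage. This is a simplification.
func heapGoal(liveMB, gogc int) int {
	return liveMB + liveMB*gogc/100
}

// swapGCPercent sets the GC percentage, then restores the previous value.
// It returns what the runtime reports as replaced on restore, which is
// the value that was just set.
func swapGCPercent(percent int) int {
	prev := debug.SetGCPercent(percent)
	return debug.SetGCPercent(prev)
}

func main() {
	for _, gogc := range []int{20, 100, 200} {
		fmt.Printf("GOGC=%d: 10 MB live heap, next collection near %d MB\n",
			gogc, heapGoal(10, gogc))
	}
	fmt.Println("value set via debug.SetGCPercent:", swapGCPercent(200))
}
```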

GOTRACEBACK

GOTRACEBACK controls the level of detail when a panic hits the top of your program. In Go 1.5 GOTRACEBACK has four valid values.

  • GOTRACEBACK=0 will suppress all tracebacks, you only get the panic message.
  • GOTRACEBACK=1 is the default behaviour, stack traces for all goroutines are shown, but stack frames related to the runtime are suppressed.
  • GOTRACEBACK=2 is the same as the previous value, but frames related to the runtime are also shown, this will reveal goroutines started by the runtime itself.
  • GOTRACEBACK=crash is the same as the previous value, but rather than calling os.Exit, the runtime will cause the process to segfault, triggering a core dump if permitted by the operating system.

The effect of GOTRACEBACK can be seen with a simple program.

package main

func main() {
        panic("kerboom")
}

Compiling and running this program with GOTRACEBACK=0 shows the suppression of all goroutine stack traces.

% env GOTRACEBACK=0 ./crash 
panic: kerboom
% echo $?
2

Experimentation with the other possible values of GOTRACEBACK is left as an exercise to the reader.

Changes to GOTRACEBACK coming in Go 1.6

For Go 1.6 the interpretation of GOTRACEBACK is changing. The new values of GOTRACEBACK will be:

  • GOTRACEBACK=none will suppress all tracebacks, you only get the panic message.
  • GOTRACEBACK=single is the new default behaviour that prints only the goroutine believed to have caused the panic.
  • GOTRACEBACK=all causes stack traces for all goroutines to be shown, but stack frames related to the runtime are suppressed.
  • GOTRACEBACK=system is the same as the previous value, but frames related to the runtime are also shown, this will reveal goroutines started by the runtime itself.
  • GOTRACEBACK=crash is unchanged from Go 1.5.

For compatibility with Go 1.5, a value of 0 maps to none, 1 maps to all, and 2 maps to system.

The major take away from this change is, by default in Go 1.6, panic messages will only print the stack trace for the faulting goroutine. This change is detailed in issue 12366 and CL 16512.
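The compatibility mapping is small enough to write down. In this sketch normalizeTraceback is a hypothetical helper, not part of the runtime; debug.SetTraceback, added to runtime/debug in Go 1.6, accepts the same values as the environment variable.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// normalizeTraceback maps the numeric Go 1.5 values of GOTRACEBACK onto
// their Go 1.6 names. Any other value is returned unchanged.
func normalizeTraceback(v string) string {
	switch v {
	case "0":
		return "none"
	case "1":
		return "all"
	case "2":
		return "system"
	}
	return v
}

func main() {
	for _, v := range []string{"0", "1", "2", "crash"} {
		fmt.Printf("GOTRACEBACK=%s is read as %s\n", v, normalizeTraceback(v))
	}
	// Ask for the Go 1.5 default level of detail from inside the program.
	debug.SetTraceback(normalizeTraceback("1"))
}
```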

GOMAXPROCS

GOMAXPROCS is the well known (and cargo culted via its runtime.GOMAXPROCS counterpart) value that controls the number of operating system threads allocated to goroutines in your program.

As of Go 1.5, the default value of GOMAXPROCS is the number of CPUs (whatever your operating system considers to be a CPU) visible to the program at startup.

note: the number of operating system threads in use by a Go program includes threads servicing cgo calls and threads blocked on operating system calls, and may be larger than the value of GOMAXPROCS.
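To see what a program ended up with, the value can be queried without changing it: passing 0 to runtime.GOMAXPROCS returns the current setting and leaves it alone. The currentProcs helper is a hypothetical name used for this sketch.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentProcs reports the current GOMAXPROCS value. An argument of 0
// queries the setting without modifying it.
func currentProcs() int {
	return runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Printf("NumCPU=%d GOMAXPROCS=%d\n", runtime.NumCPU(), currentProcs())
}
```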

GODEBUG

Saving the best for last is GODEBUG. The contents of GODEBUG are interpreted as a list of name=value pairs separated by commas, where each name is a runtime debugging facility. Here is an example invoking godoc with garbage collection and schedule tracing enabled:

% env GODEBUG=gctrace=1,schedtrace=1000 godoc -http=:8080

The remainder of this post will discuss the GODEBUG debugging facilities that I find useful for diagnosing Go programs.

gctrace

Of all the GODEBUG facilities, gctrace is the one I find most useful. Here is the output of the first few milliseconds of a godoc -http server with gctrace debugging enabled:

% env GODEBUG=gctrace=1 godoc -http=:8080 -index
gc #1 @0.042s 4%: 0.051+1.1+0.026+16+0.43 ms clock, 0.10+1.1+0+2.0/6.7/0+0.86 ms cpu, 4->32->10 MB, 4 MB goal, 4 P
gc #2 @0.062s 5%: 0.044+1.0+0.017+2.3+0.23 ms clock, 0.044+1.0+0+0.46/2.0/0+0.23 ms cpu, 4->12->3 MB, 8 MB goal, 4 P
gc #3 @0.067s 6%: 0.041+1.1+0.078+4.0+0.31 ms clock, 0.082+1.1+0+0/2.8/0+0.62 ms cpu, 4->6->4 MB, 8 MB goal, 4 P
gc #4 @0.073s 7%: 0.044+1.3+0.018+3.1+0.27 ms clock, 0.089+1.3+0+0/2.9/0+0.54 ms cpu, 4->7->4 MB, 6 MB goal, 4 P

The format of this output changes with every version of Go, but you will always find commonalities like the amount of time of the various gc phases; 0.051+1.1+0.026+16+0.43 ms clock, and the various heap sizes during the garbage collection cycle; 4->6->4 MB. This trace also includes the timestamp the gc cycle completed, relative to the start time of the program, however older versions of Go omit this information.

The individual output lines may be useful for analysis, but I find it more useful to view them in aggregate. For example, if you enable gc tracing and the output is continuous, it’s a clear sign that the program is allocation bound. Likewise if the reported size of the heap continues to grow over time, that is a clear sign of a memory leak where references that are expected to be freed are being retained in some global structure.
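One way to look at the trace in aggregate is to pull the heap sizes out of each line and watch the last figure over time. The sketch below assumes the #->#-># MB layout shown above, which the runtime package documentation describes as heap size at GC start, at GC end, and live heap; since the format changes between releases, treat the pattern as an assumption. The heapSizes and liveHeaps helpers are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// heapRE matches the "4->32->10 MB" portion of a gctrace line.
var heapRE = regexp.MustCompile(`(\d+)->(\d+)->(\d+) MB`)

// heapSizes extracts the three heap sizes, in MB, from one gctrace line.
// ok is false if the line does not contain the expected pattern.
func heapSizes(line string) (sizes [3]int, ok bool) {
	m := heapRE.FindStringSubmatch(line)
	if m == nil {
		return sizes, false
	}
	for i := range sizes {
		sizes[i], _ = strconv.Atoi(m[i+1]) // the pattern guarantees digits
	}
	return sizes, true
}

// liveHeaps returns the last of the three sizes for every line that parses.
func liveHeaps(lines []string) []int {
	var live []int
	for _, l := range lines {
		if s, ok := heapSizes(l); ok {
			live = append(live, s[2])
		}
	}
	return live
}

func main() {
	lines := []string{
		"gc #1 @0.042s 4%: 0.051+1.1+0.026+16+0.43 ms clock, 0.10+1.1+0+2.0/6.7/0+0.86 ms cpu, 4->32->10 MB, 4 MB goal, 4 P",
		"gc #2 @0.062s 5%: 0.044+1.0+0.017+2.3+0.23 ms clock, 0.044+1.0+0+0.46/2.0/0+0.23 ms cpu, 4->12->3 MB, 8 MB goal, 4 P",
	}
	fmt.Println("live heap per cycle (MB):", liveHeaps(lines))
}
```

A live heap figure that climbs steadily across many cycles is the leak signature described above.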

The overhead of enabling gctrace is effectively zero for production deployments as these statistics are always being collected, but are normally suppressed. I recommend that you enable it at least for some representative sample of your application’s production deployment.

note: setting gctrace to values larger than 1 causes each garbage collection cycle to be run twice. This exercises some aspects of finalisation that require two garbage collection cycles to complete. You should not use this as a mechanism to alter finalisation performance in your programs because you should not write programs whose correctness depends on finalisation.

The heap scavenger

By far the most useful piece of output enabled by gctrace=1 is the output of the heap scavenger.

scvg143: inuse: 8, idle: 104, sys: 113, released: 104, consumed: 8 (MB)

The scavenger’s job is to periodically sweep the heap looking for unused operating system pages. The scavenger then releases them by notifying the operating system that these memory pages from the heap are not in use. There is no facility to force the operating system to take back the pages and many operating systems choose to ignore this advice, or at least defer taking any action until a time when the machine is starved for free memory.

The output from the scavenger is the best way I know of to tell how much virtual address space is in use by your Go program. It is expected that these values will vary significantly from what tools like free(1) and top(1) report. You should trust the values reported by the scavenger.

schedtrace

Because the Go runtime manages the allocation of a large set of goroutines onto a smaller set of operating system threads, observing your program externally may not give sufficient detail to understand its performance. You may need to investigate the operation of the runtime scheduler directly.  This output is controlled with the schedtrace value:

% env GODEBUG=schedtrace=1000 godoc -http=:8080 -index
SCHED 0ms: gomaxprocs=4 idleprocs=2 threads=4 spinningthreads=1 idlethreads=0 runqueue=0 [0 0 0 0]
SCHED 1001ms: gomaxprocs=4 idleprocs=0 threads=8 spinningthreads=0 idlethreads=2 runqueue=0 [189 197 231 142]
SCHED 2004ms: gomaxprocs=4 idleprocs=0 threads=9 spinningthreads=0 idlethreads=1 runqueue=0 [54 45 38 86]
SCHED 3011ms: gomaxprocs=4 idleprocs=0 threads=9 spinningthreads=0 idlethreads=2 runqueue=2 [85 0 67 111]
SCHED 4018ms: gomaxprocs=4 idleprocs=3 threads=9 spinningthreads=0 idlethreads=4 runqueue=0 [0 0 0 0]

A detailed discussion of the schedtrace output is available in Dmitry Vyukov’s excellent blog post from the Intel DeveloperZone.

Appending scheddetail=1 will cause the runtime to output the state of each individual goroutine in addition to the summary, producing very verbose output.

% env GODEBUG=scheddetail=1,schedtrace=1000 godoc -http=:8080 -index
SCHED 0ms: gomaxprocs=4 idleprocs=3 threads=3 spinningthreads=0 idlethreads=0 runqueue=0 gcwaiting=0 nmidlelocked=0 stopwait=0 sysmonwait=0
  P0: status=1 schedtick=0 syscalltick=0 m=0 runqsize=0 gfreecnt=0
  P1: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P2: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  P3: status=0 schedtick=0 syscalltick=0 m=-1 runqsize=0 gfreecnt=0
  M2: p=-1 curg=-1 mallocing=0 throwing=0 preemptoff= locks=1 dying=0 helpgc=0 spinning=false blocked=false lockedg=-1
  M1: p=-1 curg=17 mallocing=0 throwing=0 preemptoff= locks=0 dying=0 helpgc=0 spinning=false blocked=false lockedg=17
  M0: p=0 curg=1 mallocing=0 throwing=0 preemptoff= locks=2 dying=0 helpgc=0 spinning=false blocked=false lockedg=1
  G1: status=2(stack growth) m=0 lockedm=0
  G17: status=3() m=1 lockedm=1
  G2: status=1() m=-1 lockedm=-1

This output may be useful for debugging leaking goroutines, but other facilities like net/http/pprof are likely to be more useful.
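As a rough sketch of the net/http/pprof route: importing the package for its side effects registers handlers under /debug/pprof/ on http.DefaultServeMux. The example serves that mux on a throwaway httptest server so it is self-contained; the goroutineDump and hasGoroutineHeader helpers are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/ handlers on http.DefaultServeMux
	"strings"
)

// goroutineDump serves the default mux on a temporary test server and
// fetches the goroutine profile in its human readable form.
func goroutineDump() (string, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/debug/pprof/goroutine?debug=1")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// hasGoroutineHeader reports whether dump looks like a goroutine profile.
func hasGoroutineHeader(dump string) bool {
	return strings.HasPrefix(dump, "goroutine profile:")
}

func main() {
	dump, err := goroutineDump()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print only the summary line; the full dump lists every stack.
	fmt.Println(strings.SplitN(dump, "\n", 2)[0])
}
```

In a long running program the same import is usually paired with an http.ListenAndServe call on a private address, rather than a test server.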

Further reading

All the environment variables available for your version of Go are detailed in the godoc for the runtime package.

Wednesday pop quiz: spot the race

The following program contains a data race:

package main

import (
        "fmt"
        "time"
)

type RPC struct {
        result int
        done   chan struct{}
}

func (rpc *RPC) compute() {
        time.Sleep(time.Second) // strenuous computation intensifies
        rpc.result = 42
        close(rpc.done)
}

func (RPC) version() int {
        return 1 // never going to need to change this
}

func main() {
        rpc := &RPC{done: make(chan struct{})}

        go rpc.compute()         // kick off computation in the background
        version := rpc.version() // grab some other information while we're waiting
        <-rpc.done               // wait for computation to finish
        result := rpc.result

        fmt.Printf("RPC computation complete, result: %d, version: %d\n", result, version)
}

Where is the data race, and what is the smallest change that will fix it ?

Answer: the smallest change I know that will solve the race in this program is to change the receiver of the version method from RPC to *RPC.
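The reason is that version has a value receiver, so the call rpc.version() is shorthand for (*rpc).version(), which copies the whole RPC value, including the result field, while compute may be writing to it. A sketch of the program with the fix applied follows; the sleep is shortened, and the body of main is moved into a hypothetical call function so the result can be checked.

```go
package main

import (
	"fmt"
	"time"
)

type RPC struct {
	result int
	done   chan struct{}
}

func (rpc *RPC) compute() {
	time.Sleep(10 * time.Millisecond) // shortened stand-in for the computation
	rpc.result = 42
	close(rpc.done)
}

// version now has a pointer receiver, so calling it no longer copies the
// RPC value, and with it the result field.
func (*RPC) version() int {
	return 1
}

// call runs the computation in the background and waits for the result.
func call() (result, version int) {
	rpc := &RPC{done: make(chan struct{})}

	go rpc.compute()        // kick off computation in the background
	version = rpc.version() // only the pointer is passed, nothing is read
	<-rpc.done              // wait for computation to finish
	return rpc.result, version
}

func main() {
	result, version := call()
	fmt.Printf("RPC computation complete, result: %d, version: %d\n", result, version)
}
```

The read of rpc.result after the receive is safe, because closing the channel happens before the receive that observes the close.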

Postscript

The example above is derived from a larger, and more confusing example. You may be interested in the original race report.

The Legacy of Go

In October this year I had the privilege of speaking at the GothamGo conference in New York City. As I talk quite softly, and there were a few problems with the recording, I decided to write up my slide notes and present them here.

If you want to see the video of this presentation, you can find it on youtube.


I want to open with a question — How will Go be remembered ? 

In posing this question I do not wish to appear morbid, or to suggest that Go’s best days are already behind it. Rather, in the spirit of this conference’s theme, I want to speculate on what will be the biggest mark Go will leave on our profession.

So to restate the question —  What will be the legacy of Go ?

To set the stage to answer this question, let’s look at some historical examples.

C

C, the progenitor of an entire family of curly brace languages. If you were to briefly summarise C’s legacy, how would you describe it ?

One of my favourite descriptions of C is a portable high level assembly language. C certainly wasn’t the first high level language, but it was one of the first to take portability seriously. Additionally C has the distinction of being the language used to write the operating system of every computer in this room, and probably many of the micro controllers as well.

Before C there was assembly language, and there is no indication that we are in the period that could be classified as after C.

This is C’s legacy.

C++

Let’s try another. How would you describe C++’s legacy ?

I think for many C++ is considered the Rolls Royce of mainstream object orientation.

C++ also codified the ideas of zero cost abstractions. In fact, the only abstractions C++ developers prefer to use are the ones that come with no cost, assuming of course that you don’t consider compile time a cost.

Ruby on Rails

For my third example, something a little different. Maybe you haven’t used Ruby on Rails, but I’d wager you’ve probably heard of it. How would you describe the legacy of Ruby on Rails ?

My take away from Rails was the standardised project layout. Every Rails project had models in the models directory, controllers in the controllers directory, and views in the views directory. This was Rails’ mantra of convention over configuration.

Compare this to the previous generation of web frameworks, all different in ways that are at best described as belligerent.

Now, post Rails, all web frameworks look alike. Words like routing, controllers, middleware, assets, are codified in our lexicon because of Rails.  This is the sort of legacy I’m talking about.

So, now I’ve established a framework, let’s turn to the conference’s namesake.

A simple programming language

At the start of the year I had the opportunity to give a talk in India entitled Simplicity and Collaboration.  As I was lucky enough to be giving the closing keynote this gave me an opportunity to try a real table thumping call to action.

I mean, who doesn’t want to be simple ? And what better way to frame a debate than simple: good, complexity: obviously bad. Could we say then that simplicity will be Go’s lasting legacy ?

Perhaps, but perhaps our frame of reference is a little skewed. In preparing this talk, I found numerous anecdotes from language designers, who, reflecting back on their own achievements, lamented complexity’s siren song.

Some argued that the solution to complexity was abstraction, others felt abstraction itself was the root of the problem. However, unanimously these luminaries believed that complexity must be avoided, and cautioned others to strive for simplicity in their designs.

Is Go the language which delivers this long sought after promise ? I think it’s probably too early to say. The signs are positive, but this is probably not going to be the thing that I believe people will remember most about Go.

Fitness for purpose

Go will certainly not be remembered as an academic language, it breaks only the minimum of new ground, preferring instead to consolidate on a corpus of proven ideas.

One aspect which is contributing to our language’s success is what I term its fitness for purpose.

As Rob Pike wrote in 2012, Go is a language designed to integrate with an existing software development ecosystem. I believe Go’s popularity is due in large part to the care its designers gave to every aspect of the language’s interaction with the complete software development life cycle.

But sadly, this is not what I believe people will remember our language for, because few outside the Go community appear to appreciate the holistic nature of Go’s design.

However, in discussing the motivations that drove the design of the language, we see a clue to its possible legacy.

Tooling

I think it is the tooling that has grown, not in spite of the language, but in deliberate symbiosis, which deserves recognition.

In his opening keynote at Gophercon this year Russ Cox spoke about the need for mechanical refactoring and code generation to be indistinguishable from code written by a human. In particular I feel go fmt deserves most of the credit here. While more powerful translations are possible, the low barrier to entry for using gofmt has ensured its ubiquity.

It’s not enough that the code is well formatted according to local custom; there must be precisely one way Go code should be formatted. This cannot be overstated.

The result is nowadays all Go code is go formatted, and the little which is not is viewed with deep suspicion.

Just as no web framework will call a controller something different, I believe that no future language will be considered complete without a canonical style, and a tool to enforce it.
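To make the "precisely one way" property concrete, here is a minimal sketch using the standard library's go/format package, which applies the same formatting rules as gofmt. The two input snippets and the helper name canonical are my own illustrations, not something from the talk.

```go
package main

import (
	"fmt"
	"go/format"
)

// Two hypothetical snippets: the same program, typed with different habits.
const cramped = `package main
import "fmt"
func   main( ){
fmt.Println( "hello" )
    }
`

const spacious = `package main


import "fmt"


func main() {
        fmt.Println("hello")
}
`

// canonical runs src through go/format and returns the formatted source.
func canonical(src string) (string, error) {
	out, err := format.Source([]byte(src))
	return string(out), err
}

func main() {
	a, err := canonical(cramped)
	if err != nil {
		fmt.Println("format error:", err)
		return
	}
	b, err := canonical(spacious)
	if err != nil {
		fmt.Println("format error:", err)
		return
	}
	// Both inputs should collapse to the same canonical text.
	fmt.Println("identical after formatting:", a == b)
	fmt.Print(a)
}
```

Whatever the author's habits, the formatter leaves no room for a house style, which is the property that makes machine-written and human-written code hard to tell apart.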

Pop Culture

Earlier this year my colleague Katherine Cox-Buday gave a talk at Gophercon where she built upon a conjecture by Alan Kay to illustrate some concerning dogma that she observed amongst Go developers.

I enjoyed this talk very much, especially Kay’s rebuttal that our profession is not a science but a pop culture.

We live in what has been described as the information age. An age of digitisation. An age of the transistor and the computer. These are pervasive forces reshaping our society. Thus I think it is impossible to separate the role of those who program, from the impact of the computer itself.

Accepting Kay’s critique of our industry, pop culture is responsible for some of the most iconic ideas in our society, and in answering the question of Go’s own legacy, it makes good sense to investigate the role of the programmer within a wider social context.

To explain what I mean, permit me to digress for a moment.

Denim, sunglasses and portable music

In 1870 Jacob Davis, a Californian tailor, was asked by one of his customers to create a hard wearing pair of trousers for her husband.

By reinforcing the weak points on the seams and pockets with copper rivets, Davis created a durable garment that became an overnight sensation.

Within a year he was unable to keep up with demand and approached his supplier of denim, one Levi Strauss, for financial support in patenting his design.

Today we know them simply as Levi jeans. Although Davis’ signature rivet was later removed over concerns that it was damaging the furniture, the signature Levi 501 became a cultural icon.

A symbol of rugged individualism, and smart-casual iconoclasm.

In 1936, Bausch and Lomb, a medical instrument company from Rochester, working under contract for the US army, produced a pair of sunglasses designed to aid pilots suffering from eye strain and migraines.

Initially called the “anti-glare”, the glasses were renamed a year later when the eyewear division was spun out into a subsidiary. The “Ray Ban” company, as we now know it, released its signature Ray Ban 3025 Aviator.

With the help of strong promotional efforts, Ray Ban’s Aviators became synonymous with action and adventure. The hero, and the maverick.

In 1979, the Sony Corporation released the TPS-L2. Better known in most markets as the Walkman, Sony had created a new genre of music consumption.

The Walkman arrived at the perfect time to ride a wave of lifestyle advertising, allowing everyone to be a music enthusiast, not just the audiophile shut in.

Sony effectively created the idea of personal music, liberating it from the tyranny of the radio disc jockeys, and making it private, personal, and portable. But Sony’s hubris, and a misplaced focus on the music producers, not consumers, caused them to stumble with the mini disc. On that last point, I’ll leave you to draw your own parallels with the software industry.

These are three unrelated tales which weave a story of cultural memes in our society.

Aviator sunglasses, created for a utilitarian purpose have continued to represent an heroic, self confident, ideal. The same could be said for denim jeans. Both have continued to be remixed and riffed on in a way that would have pleased Warhol.

We see similar patterns in language design. Languages are not immune to the whim of popular fashion, unless of course they are Lisp, the perennial tie dye of languages.

A new language has to be sufficiently different

A new language, to be successful, has to be sufficiently different. Being only a minor improvement on a theme makes it too hard to capture mindshare.

Go epitomises this, it is a language that is heavily informed by the past, and at the same time, different enough to justify overcoming the opportunity cost of change.

A trend toward static typing

Types, like denim jeans, are the mainstay of our industry. Types have been decomposed, algebratised, templated, rejected, inferred, pushed to the side, then rediscovered. Types, like kitschy sunglasses or ripped jeans, may go out of style for a time, but never for very long.

Go sits atop the crest of a number of popular waves, one of these is a movement towards static type checking which other languages are now retrofitting. Retrofitting, not for performance, mind you — type inference has pretty much solved that problem, but instead for the productivity of their programmers.

But, I think it is unlikely that Go will be the poster child of this movement, it is merely a participant, and few medals are given out for simply being present.

Interfaces

For me, it is Go’s interfaces, which represent a refinement of C++’s virtual base classes, an evolution through Java’s interface type, and have reached a point where they are divorced completely from both value and hierarchy.

Interfaces represent pure behaviour. In some ways Go’s interfaces are closer to Smalltalk’s vision of objects; defined not by class membership, but instead by behaviour.

I believe interfaces are the iconic feature of Go. They represent a refinement of many previous attempts but are themselves unique among mainstream languages.
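A small sketch of what "divorced from both value and hierarchy" looks like in practice. The Speaker interface and the Dog and Robot types are hypothetical names chosen for illustration; the point is only that neither type declares any relationship to the interface.

```go
package main

import "fmt"

// Speaker describes behaviour only: no fields, no parent type.
type Speaker interface {
	Speak() string
}

// Dog and Robot are unrelated types. Neither mentions Speaker.
type Dog struct{ Name string }

func (d Dog) Speak() string { return d.Name + " says woof" }

type Robot int

func (r Robot) Speak() string { return fmt.Sprintf("unit %d says beep", int(r)) }

// Announce accepts any value that has a Speak method.
func Announce(s Speaker) string { return "> " + s.Speak() }

func main() {
	for _, s := range []Speaker{Dog{Name: "Rex"}, Robot(7)} {
		fmt.Println(Announce(s))
	}
}
```

A struct and an integer type satisfy the same interface simply by having the right method, with no declaration of intent and no shared ancestor.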

So that’s two possible ideas for the legacy of Go, I hope you will indulge me one more.

The things Go took away

In his essay The last programming language, Robert C. Martin asks:

Are languages successful because they offer programmers more choice, or are they successful for the opposite reason, they remove choice ?

And while I’ll have to disagree with Martin’s conclusion that Clojure represents the last programming language, he does make a compelling argument.

In hindsight, the programming languages which have been successful, which have been remembered, and which have established a legacy, are the languages which have successfully removed a commonly accepted tenet of the programming establishment.

Languages like FORTRAN and COBOL removed the requirement to program directly in assembly language, despite howls that efficient programs could never be written in a high level language.

A decade later, languages like Pascal, PL/I and Algol removed direct transfer of control, replacing it with the pillars of structured programming; sequence, selection and iteration, and they did so to cries that a real programmer would have goto prised from their cold dead fingers.

Can we find supporting evidence of Go’s legacy in the things that it chose to remove ?

Inheritance

I think a strong contender could be a lack of inheritance; Go took away subtypes. Everyone knows composition is more powerful than inheritance, Go just makes this non optional.

In the cacophony of hand wringing over a lack of templated types, nobody seems to be complaining that a lack of inheritance is hurting their ability to write programs in Go. It seems that nobody missed inheritance much after all.

However it also seems to me that people are not going to remember Go for taking away something that they never missed in the first place.

Semicolons;

Go is a curly braced, block structured language, but with a cute trick of the lexer. The semicolons are still there, we just hide them from the author. But, this is also not a new trick. JavaScript made semicolons optional, and sometimes it works.

Go’s implementation is also not an original idea, it was taken directly from Martin Richards’ BCPL.

So while Go finally removed semicolons, I don’t think it can claim credit for the idea.

Threads

In my mind, of all the possible candidates that Go has removed, it is the removal of threads that will be its most profound contribution.

This is not to say that Go programs do not use threads, any more than you can say structured programs are not compiled into branch and jump instructions.

But Go programmers no longer have to concern themselves with thread management, or as Uncle Bob would say, Go programmers are restricted from directly controlling the thread their code runs on.

Coincidentally, the removal of threads from the Go programmers’ model means the removal of a requirement to care about the stack, unlocking the much older technique of recursion as an alternative to state machines or mutable state.

Goroutines are cheap, so cheap that we can structure our programs in a straightforward imperative fashion without having to worry about the overhead of one operating system thread per goroutine.

Go took away many things, but in removing threads and a need to care about the stack, replacing them instead with a more coherent idea of goroutines and channels, I think it has made its most powerful mark.
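As a rough illustration of how cheap goroutines change the shape of a program, the sketch below starts one goroutine per unit of work and collects the answers over a channel. The function sumSquares is a made-up example; nothing in it manages a thread, a pool, or a stack size.

```go
package main

import "fmt"

// sumSquares starts one goroutine per number and gathers the results
// over a channel. The runtime decides which threads run them.
func sumSquares(n int) int {
	results := make(chan int)
	for i := 1; i <= n; i++ {
		go func(i int) { results <- i * i }(i)
	}

	total := 0
	for i := 0; i < n; i++ {
		total += <-results // receive exactly n results, in any order
	}
	return total
}

func main() {
	// A thousand goroutines for a thousand multiplications is wasteful
	// arithmetic, but it is not something we need to engineer around.
	fmt.Println(sumSquares(1000))
}
```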

gofmt, interfaces, goroutines

To recap, go fmt, interfaces, and goroutines. Some of these are new ideas, others are not.

The goal of this presentation was not to identify three new innovations in Go, but rather to attempt to predict how a future generation of programmers will look back upon Go’s legacy.

Go is still young, with a long productive life ahead of it, and that means that almost all of the Go code that will be written, has yet to be written.

Similarly, while the community of Go users is growing, compared to the number of people who will use Go during its lifetime, we are but a tiny fraction. Therefore we should optimise for this larger group of people who have yet to write any Go code.

This means more examples, more blog posts, and more books.

This means more education and more training — I think Go is a fantastic language to teach the theory and the practice of programming, and we’ve barely scratched the surface here.

This means more user groups, more conferences, and more diversity — there is so much more to be done here, and we should look to established language communities, like Python for example, for guidance.

And this is the part where you come in. This is the time for you to lean in. This is the time for you to get involved.

Because this is your opportunity to decide how you want Go to be remembered.

Thank you.

Let’s talk about logging

This is a post inspired by a thread that Nate Finch started on the Go Forum. This post focuses on Go, but if you can see your way past that, I think the ideas presented here are widely applicable.

Why no love ?

Go’s log package doesn’t have leveled logs, you have to manually add prefixes like debug, info, warn, and error, yourself. Also, Go’s logger type doesn’t have a way to turn these various levels on or off on a per package basis. By way of comparison let’s look at a few replacements from third parties.

glog from Google provides the following levels:

  • Info
  • Warning
  • Error
  • Fatal (which terminates the program)

Another library, loggo, which we developed for Juju, provides the following levels:

  • Trace
  • Debug
  • Info
  • Warning
  • Error
  • Critical

Loggo also provides the ability to adjust the verbosity of logging on a per package basis.

So here are two examples, clearly influenced by other logging libraries in other languages. In fact their lineage can be traced back to syslog(3), maybe even earlier. And I think they are wrong.

I want to take a contrary position. I think that all logging libraries are bad because they offer too many features; a bewildering array of choice that dazzles the programmer right at the point they must be thinking clearly about how to communicate with the reader from the future; the one who will be consuming their logs.

I posit that successful logging packages need far fewer features, and certainly fewer levels.

Let’s talk about warnings

Let’s start with the easiest one. Nobody needs a warning log level.

Nobody reads warnings, because by definition nothing went wrong. Maybe something might go wrong in the future, but that sounds like someone else’s problem.

Furthermore, if you’re using some kind of leveled logging then why would you set the level at warning ? You’d set the level at info, or error. Setting the level to warning is an admission that you’re probably logging errors at warning level.

Eliminate the warning level, it’s either an informational message, or an error condition.

Let’s talk about fatal

Fatal level is effectively logging the message, then calling os.Exit(1). In principle this means:

  • defer statements in other goroutines don’t run.
  • buffers aren’t flushed.
  • temporary files and directories aren’t removed.

In effect, log.Fatal is less verbose than, but semantically equivalent to, panic.

It is commonly accepted that libraries should not use panic[1], but if calling log.Fatal[2] has the same effect, surely this should also be outlawed.

Suggestions that this cleanup problem can be solved by registering shutdown handlers with the logging system introduce tight coupling between your logging system and every place where cleanup operations happen; they also violate the separation of concerns.

Don’t log at fatal level, prefer instead to return an error to the caller. If the error bubbles all the way up to main.main then that is the right place to handle any cleanup actions before exiting.
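Here is a minimal sketch of that shape. The function doWork and its fail parameter are hypothetical, invented so the example can be run both ways; the only thing it is meant to show is that deferred cleanup runs because the error is returned, and that main is the one place that decides to exit.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// doWork is a hypothetical task that creates a scratch file and then,
// if fail is set, gives up. It returns the scratch file's name so the
// caller can confirm the cleanup happened.
func doWork(fail bool) (scratch string, err error) {
	f, err := os.CreateTemp("", "work-*.tmp")
	if err != nil {
		return "", err
	}
	// Deferred calls run in reverse order: close the file, then remove it.
	// They run on every return path, which os.Exit would skip.
	defer os.Remove(f.Name())
	defer f.Close()

	if fail {
		return f.Name(), errors.New("something was too hard")
	}
	return f.Name(), nil
}

// exists reports whether path is still present on disk.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	// Pass any argument to simulate a failure.
	fail := len(os.Args) > 1
	name, err := doWork(fail)
	fmt.Println("scratch file left behind:", exists(name))
	if err != nil {
		// The only place in the program that exits.
		fmt.Fprintf(os.Stderr, "FATAL: %v\n", err)
		os.Exit(1)
	}
}
```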

Let’s talk about error

Error handling and logging are closely related, so on the face of it, logging at error level should be easily justifiable. I disagree.

In Go, if a function or method call returns an error value, realistically you have two options:

  • handle the error.
  • return the error to your caller. You may choose to gift wrap the error, but that is not important to this discussion.

If you choose to handle the error by logging it, by definition it’s not an error any more — you handled it. The act of logging an error handles the error, hence it is no longer appropriate to log it as an error.

Let me try to convince you with this code fragment:

err := somethingHard()
if err != nil {
        log.Error("oops, something was too hard", err)
        return err // what is this, Java ?
}

You should never be logging anything at error level because you should either handle the error, or pass it back to the caller.

To be clear, I am not saying you should not log that a condition occurred

if err := planA(); err != nil {
        log.Infof("couldn't open the foo file, continuing with plan b: %v", err)
        planB()
}

but in effect log.Info and log.Error have the same purpose.

I am not saying DO NOT LOG ERRORS! Instead the question is, what is the smallest possible logging API ? And when it comes to errors, I believe that an overwhelming proportion of items logged at error level are simply done that way because they are related to an error. They are in fact, just informational, hence we can remove logging at error level from our API.

What’s left ?

We’ve ruled out warnings, argued that nothing should be logged at error level, and shown that only the top level of the application should have some kind of log.Fatal behaviour. What’s left ?

I believe that there are only two things you should log:

  1. Things that developers care about when they are developing or debugging software.
  2. Things that users care about when using your software.

Obviously these are debug and info levels, respectively.

log.Info should simply write that line to the log output. There should not be an option to turn it off as the user should only be told things which are useful for them. If an error that cannot be handled occurs, it should bubble up to main.main where the program terminates. The minor inconvenience of having to insert the FATAL prefix in front of the final log message, or writing directly to os.Stderr with fmt.Fprintf, is not sufficient justification for a logging package growing a log.Fatal method.

log.Debug is an entirely different matter. It is for the developer or support engineer to control. During development, debugging statements should be plentiful, without resorting to trace or debug2 (you know who you are) level. The log package should support fine grained control to enable or disable debug, and only debug, statements at the package or possibly even finer scope.
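To show how small such an API could be, here is one possible sketch. The Logger type, its method names, and the idea of keying debug output by a package name string are all my own assumptions for illustration; this is not the API of the standard log package or of any existing library.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"sync"
)

// Logger is a hypothetical two-method logger: Info is always on,
// Debug is off until it is enabled for a named package.
type Logger struct {
	mu    sync.Mutex
	out   io.Writer
	debug map[string]bool
}

// New returns a Logger that writes to out.
func New(out io.Writer) *Logger {
	return &Logger{out: out, debug: make(map[string]bool)}
}

// SetDebug turns debug output for pkg on or off.
func (l *Logger) SetDebug(pkg string, on bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.debug[pkg] = on
}

// Info always writes the line; there is deliberately no switch for it.
func (l *Logger) Info(format string, args ...any) {
	l.mu.Lock()
	defer l.mu.Unlock()
	fmt.Fprintf(l.out, format+"\n", args...)
}

// Debug writes the line only if debugging is enabled for pkg.
func (l *Logger) Debug(pkg, format string, args ...any) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if !l.debug[pkg] {
		return
	}
	fmt.Fprintf(l.out, "DEBUG %s: ", pkg)
	fmt.Fprintf(l.out, format+"\n", args...)
}

// demo returns what a short session writes to the log.
func demo() string {
	var sb strings.Builder
	l := New(&sb)
	l.Info("listening on %s", ":8080")
	l.Debug("db", "opening pool") // dropped: debug is off for "db"
	l.SetDebug("db", true)
	l.Debug("db", "pool size %d", 4)  // written
	l.Debug("http", "parsed headers") // dropped: "http" was never enabled
	return sb.String()
}

func main() {
	fmt.Print(demo())
}
```

There is no level type, no threshold, and no Fatal. The only knob is which packages get to be chatty.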

Wrapping up

If this were a Twitter poll, I’d ask you to choose between

  • logging is important
  • logging is hard

But the fact is, logging is both. The solution to this problem must be to deconstruct and ruthlessly pare down unnecessary distraction.

What do you think ? Is this just crazy enough to work, or just plain crazy ?


Notes

  1. Some libraries may use panic/recover as an internal control flow mechanism, but the overriding mantra is they must not let these control flow operations leak outside the package boundary.
  2. Ironically, while it lacks a debug level output, the Go standard log package has both Fatal and Panic functions. In this package the functions that cause a program to exit abruptly outnumber those that do not.

Programming language markets

Last night at the Sydney Go Users’ meetup, Jason Buberel, product manager for the Go project, gave an excellent presentation on a product manager’s perspective on the Go project.

As part of his presentation, Buberel broke down the marketplace for a programming language into seven segments.

Programming Language market for Go, courtesy Jason Buberel

As a thought experiment, I’ve taken Buberel’s market segments and applied them across a bunch of contemporary languages.

Disclaimer: I’m not a product manager, I’ve just seen one on stage.

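| Language | Embedded and devices [1] | Systems and drivers [2] | Server and infrastructure [3] | Web and mobile [4] | Big data and scientific computing | Desktop applications [5] | Mobile applications |
|---|---|---|---|---|---|---|---|
| Go | 0 | 0 | 3 | 2 | 1 | 1 [6] | 1 |
| Rust | 1 | 1 | 0 | 2 | 0 | 2 [6, 11] | 0 |
| Java | 2 [14] | 0 | 2 | 3 | 3 | 2 [7] | 3 |
| Python | 1 | 0 | 3 [12] | 3 | 3 | 2 [6, 10] | 0 |
| Ruby | 0 | 0 | 3 | 3 | 0 | 1 [6] | 0 |
| Node.js (Javascript / v8) | 1 [13] | 0 | 0 | 2 | 0 | 0 | 2 [8] |
| Objective-C / Swift | 0 | 3 | 2 | 2 [9] | 0 | 3 | 3 |
| C/C++ | 3 | 3 | 3 | 2 | 3 | 3 | 2 |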

Is your favourite language missing ? Feel free to print this table out and draw in the missing row.

Scoring system: 0 – no presence, lack of interest or technical limitation. 1 – emerging presence or proof of concept. 2 – active competitor. 3 – market leader.

Conclusion

If there is a conclusion to be drawn from this rather unscientific study, it is that every language is in competition to be the language of the backend. As for the other market segments, everyone competes with C and C++, even Java.


Notes:

  1. The internet of things that are too small to run Linux; microcontrollers, Arduino, ESP8266, etc.
  2. Can you write a kernel, kernel module, or operating system in it ?
  3. Monitoring systems, databases, configuration management systems, that sort of thing.
  4. Web application backends, REST APIs, microservices of all sorts.
  5. Desktop applications, including games, because the mobile applications category would certainly include games.
  6. OpenGL libraries or SDL bindings.
  7. Swing, ugh.
  8. Phonegap, React Native.
  9. Who remembers WebObjects ?
  10. Python is a popular scripting language for games.
  11. Servo, the browser rendering engine is targeting Firefox.
  12. Openstack.
  13. Technically Lua pretending to be Javascript, but who’s counting.
  14. Thanks to @rakyll for reminding me about the Blu Ray drives, and j2me running in everyone’s credit cards.

Bootstrapping Go 1.5 on non Intel platforms

This post is a continuation of my previous post on bootstrapping Go 1.5 on the Raspberry Pi.

Now that Go 1.5 is written entirely in Go there is a bootstrapping problem — you need Go to build Go. For most people running Windows, Mac or Linux, this isn’t a big issue as the Go project provides installers for Go 1.4. However, if you’re using one of the new platforms added in Go 1.5; ARM64 or PPC64, there is no pre built installer as Go 1.4 did not support those platforms.

This post explains how to bootstrap Go 1.5 onto a platform that has no pre built version of Go 1.4. The process assumes you are building on a Darwin or Linux host, if you’re on Windows, sorry you’re out of luck.

We use the names host and target to describe the machine preparing the bootstrap installation and the non Intel machine receiving the bootstrap installation respectively.

As always, please uninstall any version of Go you may have on your workstation before beginning; check your $PATH and do not set $GOROOT.

Build Go 1.4 on your host

After uninstalling any previous version of Go from the host, fetch the source for Go 1.4 and build it locally.

% git clone https://go.googlesource.com/go $HOME/go1.4
% cd $HOME/go1.4/src
% git checkout release-branch.go1.4
% ./make.bash

This process will build Go 1.4 into the directory $HOME/go1.4.

Notes:

  • This procedure assumes that Go 1.4 is built in $HOME/go1.4, if you choose to use another path, please adjust accordingly.
  • We use ./make.bash to skip running the full unit tests, you can use ./all.bash to run the unit tests if you prefer.
  • Do not add $HOME/go1.4/bin to your $PATH.

Build Go 1.5 on your host

Now you have Go 1.4 on your host, you can use that to bootstrap Go 1.5 on your host.

% git clone https://go.googlesource.com/go $HOME/go
% cd $HOME/go/src
% git checkout release-branch.go1.5
% env GOROOT_BOOTSTRAP=$HOME/go1.4 ./make.bash

This process will build Go 1.5 into the directory $HOME/go.

Notes:

  • Again, we use ./make.bash to skip running the full unit tests, you can use ./all.bash to run the unit tests if you prefer.
  • You should add $HOME/go/bin to your $PATH to use the version of Go 1.5 you just built as your Go install.

Build a bootstrap distribution for your target

From your Go 1.5 installation, build a bootstrap version of Go for your target.

The process is similar to cross compiling and uses the same GOOS and GOARCH environment variables. In this example we’ll build a bootstrap for linux/ppc64. To build for other architectures, adjust accordingly.

% cd $HOME/go/src
% env GOOS=linux GOARCH=ppc64 ./bootstrap.bash
...
Bootstrap toolchain for linux/ppc64 installed in /home/dfc/go-linux-ppc64-bootstrap.
Building tbz.
-rw-rw-r-- 1 dfc dfc 46704160 Oct 16 10:39 /home/dfc/go-linux-ppc64-bootstrap.tbz

The bootstrap script is hard coded to place the output tarball two directories above ./bootstrap.bash, which will be $HOME/go-linux-ppc64-bootstrap.tbz in this case.

Now scp go-linux-ppc64-bootstrap.tbz to the target, and unpack it to $HOME.

Notes:

  • This bootstrap distribution should only be used for bootstrapping Go 1.5 on the target.

Build Go 1.5 on your target

On the target you should have the bootstrap distribution in your home directory, i.e. $HOME/go-linux-ppc64-bootstrap. We’ll use that as our GOROOT_BOOTSTRAP and build Go 1.5 on the target.

% git clone https://go.googlesource.com/go $HOME/go
% cd $HOME/go/src
% git checkout release-branch.go1.5
% env GOROOT_BOOTSTRAP=$HOME/go-linux-ppc64-bootstrap ./all.bash

Now you’ll have Go 1.5 built natively on your target, in this case linux/ppc64, but this procedure has been tested on linux/arm64 and also linux/ppc64le.
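As a quick sanity check, and this is only a suggestion rather than part of the procedure above, you could compile a small program with the freshly built toolchain and confirm that it reports the platform you expect. The helper name platform is mine; runtime.GOOS, runtime.GOARCH and runtime.Version are standard.

```go
package main

import (
	"fmt"
	"runtime"
)

// platform reports the OS/architecture pair this binary was built for.
func platform() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	// On the target from this post you would hope to see linux/ppc64
	// alongside a go1.5 version string.
	fmt.Println(runtime.Version(), platform())
}
```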