
I’m speaking at GopherChina and GopherCon Singapore

In April and May I’ll be speaking at GopherChina and GopherCon Singapore, respectively. This post is a teaser for the talks that were selected by the organisers. If you’re in the area, I hope you’ll come and hear me speak.


GopherChina is the third event in this conference series and this year will return to Shanghai. I was lucky to attend the event in 2016 and am looking forward to 2017.

The hidden #pragmas of Go

Go isn’t like C. It doesn’t have a preprocessor, it doesn’t have macros, and it certainly doesn’t have #define, but Go does have pragmas.

What are pragmas? The name comes from the #pragma declaration that tells C compilers to alter their interpretation of a piece of code. Now, Go doesn’t have a #pragma directive, but it does have ways of altering the operation of the Go compiler via directive syntax hidden in comments.

This talk will explore the history of these directives, how and why they are used, and how you can, but probably shouldn’t, use them in your own code.
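As a small illustration of what such a directive looks like, the sketch below uses //go:noinline, a compiler directive that asks the compiler not to inline the function that follows it. The function name add is only an example. The directive changes how the code is compiled, not what it computes.

```go
package main

import "fmt"

// add is a trivial function the compiler would normally inline.
// The directive below must be written with no space between the
// slashes and "go:", directly above the function declaration.
//
//go:noinline
func add(a, b int) int {
	return a + b
}

func main() {
	fmt.Println(add(1, 2))
}
```

Running go build -gcflags=-m on a program like this is one way to see the compiler's inlining decisions with and without the directive.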

GopherCon Singapore

GopherCon Singapore is the latest in the GopherCon franchise, and as flight times go, relatively close to home. I’m delighted to have the opportunity to present at their inaugural conference in May.

Concurrency made Easy

In my experience, many people who come to Go do so because they have a problem where being able to run more than one task at a time in their program would be beneficial. Ruby and Python programmers come to Go because the concurrency story is much better. The same is true of Node programmers, because the event loop is still inherently single threaded.

But, most programmers who stick with Go for a while tend to look back on their early efforts and say things like “wow, I really went overboard with channels” or “I went crazy with goroutines when I started writing Go. It was impossible to understand what the program did”. For people who learn Go formally from an instructor or a book, the concurrency section is always the last section they cover.

So there is a dichotomy here. Go’s headline feature is simple, lightweight concurrency. As a product the language sells itself on that feature alone. On the other hand, there is a narrative that concurrency isn’t actually that easy to use, otherwise people wouldn’t make it the last thing in their books or classes, or perhaps more accurately, concurrency is not the solution to every problem.

With this as a background, I’d like to explore some strategies for using concurrency in Go without the pitfalls of convoluted code, the importance of memory ownership, and the best way to structure a Go program using goroutines.

Context is for cancelation

In my previous post I suggested that the best way to break the compile time coupling between the logger and the loggee was passing in a logger interface when constructing each major type in your program. The suggestion has been floated several times that logging is context specific, so maybe a logger can be passed around via a context.Context. I think this suggestion is flawed (as are most uses of context.Value, but that’s another story). This post explains why.

context.Value() is goroutine thread local storage

Using context.Context to pass a logger into a function is a poor design pattern. In effect context.Context is being used as a conduit to arbitrarily extend the API of any method that takes a context.Context value. It’s like Python’s **kwargs, or whatever the name is for that Ruby pattern of always passing a hash. Using context.Context in this way avoids an API break by smuggling data in the unstructured bag of values attached to the context. It’s thread local storage in a cheap suit.

It’s not just that values are boxed into an interface{} inside context.WithValue that I object to. The far more serious concern is there is no schema to this data, so there is no way for a method that takes a context to ensure that it contains the specific key required to complete the operation. context.Value returns nil if the key is not found, which means any code doing the naïve

log := ctx.Value("logger").(log.Logger)
log.Warn("something you'll ignore later")

will blow up if the "logger" key is not present.

Sure, you can check that the assertion succeeded, but I feel pretty confident that if this pattern were to become popular then people would eschew the two arg form of type assertion and just expect that the key always returned a valid logger. This would be especially true as logging in error paths is rarely tested, so you’ll hit this when you need it the most.

In my opinion passing loggers inside context.Context would be the worst solution to the problem of decoupling loggers from implementations. We’d have gone from an explicit compile time dependency to an implicit run time dependency, one that could not be enforced by the compiler.

To quote @freeformz

Loggers should be injected into dependencies. Full stop.

It’s verbose, but it’s the only way to achieve decoupled design.

The package level logger anti pattern

This post is a spin-off from various conversations around improving (I’m trying not to say standardising, otherwise I’ll have to link to XKCD) the way logging is performed in Go projects.

Consider this familiar pattern for establishing a package level log variable.

package foo

import "mylogger"

var log = mylogger.GetLogger("")

What’s wrong with this pattern?

The first problem with declaring a package level log variable is the tight coupling between package foo and package mylogger. Package foo now depends directly on package mylogger at compile time.

The second problem is that the tight coupling between package foo and package mylogger is transitive. Any package that consumes package foo is itself dependent on mylogger at compile time.

This leads to a third problem: Go projects composed of packages using multiple logging libraries, or fiefdoms of projects that can only consume packages that use their particular logging library.

Avoid source level coupling

The solution to this anti pattern is to delay the binding between the type that does the logging, and the type that needs to log, until it is needed. That is, until the variable is declared.

package foo

import ""

type T struct {
        logger log.Logger
        // other fields
}

Now, the consumer of type T supplies a value of type log.Logger when constructing new Ts, and the methods on T use the logger they were provided when they want to log.

Interfaces to the rescue

The eagle eyed reader will note that the previous section removed the package level log variable, but the coupling between package foo and package log remains.

However, this can be remedied by the consumer of the logger type declaring its own interface for the behaviour it expects.

package foo

type logger interface {
        Printf(string, ...interface{})
}

type T struct {
        logger logger
        // other fields
}

As long as the type assigned to foo.T.logger implements foo.logger the decision for which specific type to use can be deferred until run time in the same way that io.Copy escapes any knowledge of the io.Reader and io.Writer implementations in use until it is invoked.
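To make the late binding concrete, here is a minimal runnable sketch. It uses package main rather than package foo so it can be run directly, and the NewT constructor, the Frob method, and the recorder type are hypothetical names added for illustration. Two unrelated implementations satisfy the same small interface: *log.Logger from the standard library and a recorder of the kind a test might use.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// logger is the behaviour this code expects from whatever logs for it.
type logger interface {
	Printf(string, ...interface{})
}

// T holds the logger it was constructed with.
type T struct {
	logger logger
}

// NewT is a hypothetical constructor; the caller chooses the concrete logger.
func NewT(l logger) *T {
	return &T{logger: l}
}

// Frob is a hypothetical method that logs through the injected logger.
func (t *T) Frob() {
	t.logger.Printf("frobbing %d widgets", 3)
}

// recorder is a second implementation that keeps messages in memory.
type recorder struct {
	lines []string
}

func (r *recorder) Printf(format string, args ...interface{}) {
	r.lines = append(r.lines, fmt.Sprintf(format, args...))
}

func main() {
	// *log.Logger has a matching Printf method, so it satisfies logger.
	NewT(log.New(os.Stderr, "", 0)).Frob()

	// The same T works unchanged with a completely different logger.
	r := &recorder{}
	NewT(r).Frob()
	fmt.Println(r.lines)
}
```

Neither T nor its methods import a logging package; only main, which wires things together, knows which implementation is in use.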

It’s not just logging

Logging is a cross cutting concern, but the anti patterns associated with it also apply to other common areas like metrics, telemetry, and auditing.

Get involved

The Go 1.9 development window is opening next month. If this topic is important to you, get involved.

Never start a goroutine without knowing how it will stop

In Go, goroutines are cheap to create and efficient to schedule. The Go runtime has been written for programs with tens of thousands of goroutines as the norm; hundreds of thousands are not unexpected. But goroutines do have a finite cost in terms of memory footprint; you cannot create an infinite number of them.

Every time you use the go keyword in your program to launch a goroutine, you must know how, and when, that goroutine will exit. If you don’t know the answer, that’s a potential memory leak.

Consider this trivial code snippet:

ch := somefunction()
go func() {
        for range ch { }
}()

This code obtains a channel of int from somefunction and starts a goroutine to drain it. When will this goroutine exit? It will only exit when ch is closed. When will that occur? It’s hard to say, ch is returned by somefunction. So, depending on the state of somefunction, ch might never be closed, causing the goroutine to quietly leak.
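By contrast, here is a minimal sketch in which the answer to "when will it stop?" is visible in the code. The produce function is a hypothetical stand-in for somefunction that guarantees it closes the channel, and drain waits for its goroutine to exit before returning.

```go
package main

import (
	"fmt"
	"sync"
)

// produce is a hypothetical stand-in for somefunction. It guarantees the
// returned channel is closed once n values have been sent.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // this goroutine stops after n sends
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

// drain starts a goroutine to consume ch and waits for it to exit.
// That goroutine stops when ch is closed.
func drain(ch <-chan int) int {
	var wg sync.WaitGroup
	count := 0
	wg.Add(1)
	go func() {
		defer wg.Done()
		for range ch {
			count++
		}
	}()
	wg.Wait() // after this returns, the goroutine is known to have exited
	return count
}

func main() {
	fmt.Println(drain(produce(5)))
}
```

Both go statements have an obvious exit condition: the producer stops after its loop, and the consumer stops because the producer closes the channel.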

In your design, some goroutines may run until the program exits, for example a background goroutine watching a configuration file, or the main conn.Accept loop in your server. However, these goroutines are rare enough I don’t consider them an exception to this rule.

Every time you write the statement go in a program, you should consider the question of how, and under what conditions, the goroutine you are about to start, will end.

Thinking about $GOPATH

This is a short blog post about my thoughts on using Go in anger through several workplaces, as a developer and an advocate.

What is $GOPATH?

Back when Go was first announced we used Makefiles to compile Go code. These Makefiles referenced some shared logic stored in the Go distribution. This is where $GOROOT comes from.

Back then, if you wrote Go code, you’d probably also have used these Makefiles, and while you could check out your source code anywhere, most people would put their own Go code in what today we’d call $GOROOT/src, as you must’ve compiled Go from source, so this directory was always going to be present.

Towards the 1.0 release goinstall, then go get, solidified the use of domain names in import paths to provide a globally unique namespace. These tools introduced a new location into which Go code would be fetched. This location was separate from $GOROOT to make clear the distinction between code provided by the Go project, and code written by the developer. By the time Go 1.1 was released in 2013, $GOROOT was removed as a fallback option.

Why does $GOPATH exist?

$GOPATH exists for two main reasons:

  1. In Go, the import declaration references a package via its fully qualified import path. $GOPATH exists so that from any directory inside $GOPATH/src the go tool can compute the absolute import path of the package in question.1
  2. A location to store dependencies fetched by go get.

Having a per user $GOPATH environment variable also means developers could use the go tool from any directory on their system to build, test and install code, but I suspect only a minority utilise this feature.

What’s wrong with $GOPATH?

In my experience, many newcomers to Go are frustrated with the single workspace $GOPATH model. They are confused that $GOPATH doesn’t let them check out the source of a project in a directory of their choice like they are used to with other languages. Additionally, $GOPATH does not let the developer have more than one copy of a project (or its dependencies) checked out at the same time without having to update $GOPATH constantly.

I think it is important to recognise that these issues are legitimate points of confusion for many newcomers (including those on the Go team) and act as a drag on Go adoption. As we’re on the cusp of a blessed dependency management tool for Go, I think it’s equally important to continue to question the base assumptions that this new tool will build on, namely requiring a $GOPATH.

In my opinion, any Go build tool needs to provide (in addition to actually building and testing code) a way for Go code checked out in an arbitrary location on disk to recover its intended fully qualified import path; the path other code will import it as.

The $GOPATH model answers this question by subtracting the prefix of $GOPATH/src from the path to the directory of the current package; the remainder is the package’s fully qualified import path. This is why if you check out a package outside a $GOPATH workspace, the go tool cannot figure out the package’s fully qualified import path and everything falls apart.
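The directory arithmetic can be sketched in a few lines. This is a simplified illustration, not the go tool's actual implementation: the importPath function is hypothetical, it handles a single $GOPATH entry, and the paths in main are made up.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// importPath subtracts the gopath/src prefix from dir. What remains is the
// import path. It fails if dir is not inside gopath/src.
func importPath(gopath, dir string) (string, error) {
	src := filepath.Join(gopath, "src")
	rel, err := filepath.Rel(src, dir)
	if err != nil {
		return "", err
	}
	// A result of "." or one that climbs out with ".." means dir is not a
	// package directory below src.
	if rel == "." || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("%s is not inside %s", dir, src)
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	// Inside the workspace: the remainder is the import path.
	path, err := importPath("/home/dave/go", "/home/dave/go/src/github.com/pkg/errors")
	fmt.Println(path, err)

	// Outside the workspace: there is nothing to subtract from.
	path, err = importPath("/home/dave/go", "/tmp/errors")
	fmt.Println(path, err)
}
```

The second call shows the failure mode described above: once the directory is outside the workspace, there is no prefix to remove and no way to recover the import path from the location alone.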

What are some alternatives to $GOPATH?

I attempted to address both issues with gb, which gives developers the ability to check out a project anywhere they want, but gb has no solution for libraries, and gb projects were not go gettable. However, gb showed that a new build tool that did not wrap the go tool was not forced to reorganise the world to fit into the $GOPATH model. This allowed gb users to include the source of all their dependencies in their project without the pitfalls of Go 1.6’s vendor/ directory.

Recently, on a suggestion from Bill Kennedy, I built an experimental build tool that recorded the expected import prefix in a manifest file. That prefix, rather than one computed by $GOPATH directory arithmetic, is used to determine the fully qualified import path.

I’m working on a similar tool (unfinished) based on a suggestion from Brad Fitzpatrick that uses the .git directory as a sentinel to determine the root of the project and hopefully infer the full import path from the git remote configuration.

While these experiments are unfinished, both demonstrate that you can avoid the $GOPATH restrictions and retain compatibility with the go get ecosystem. In the case of Kodos, it may even be possible to avoid a manifest file.


Kang and Kodos use a lot of forked code from gb, which I hope to rectify over the New Year’s break. If you are interested in contributing or, better yet, building your own Go tool to explore this problem space, Kang, Kodos, and gb are permissively licensed.


  1. This is notably different from the way imports work in scripting languages like Python and Ruby, which rely on directly scanning source code directories and inserting them onto a global search path.

Declaration scopes in Go

This post is about declaration scopes and shadowing in Go.

package main

import "fmt"

func f(x int) {
	for x := 0; x < 10; x++ {
		fmt.Println(x)
	}
}

var x int

func main() {
	var x = 200
	f(x)
}

This program declares x four times. All four are different variables because they exist in different scopes.

package main

import "fmt"

func f() {
	x := 200
	fmt.Println("inside f: x =", x)
}

func main() {
	x := 100
	fmt.Println("inside main: x =", x)
	f()
	fmt.Println("inside main: x =", x)
}

In Go the scope of a declaration is bound to the closest pair of curly braces, { and }. In this example, we declare x to be 100 inside main, and 200 inside f.

What do you expect this program will print?

package main

import "fmt"

func main() {
	x := 100
	for i := 0; i < 5; i++ {
		x := i
		fmt.Println(x)
	}
	fmt.Println(x)
}

There are several scopes in a Go program; block scope, function scope, file scope, package scope, and universe scope. Each scope encompasses the previous. What you are seeing is called shadowing.

var x = 100

func main() {
        var x = 200
        fmt.Println(x)
}

Most developers are comfortable with a function scoped variable shadowing a package scoped variable.

func f() {
        var x = 99
        if x > 90 {
                x := 60
                fmt.Println(x)
        }
}

But a block scoped variable shadowing a function scoped variable may be surprising.

The justification for a declaration in one scope shadowing another is consistency; prohibiting only block scoped declarations from shadowing another scope would be inconsistent.

Go 1.8 toolchain improvements

This is a progress report on the Go toolchain improvements during the 1.8 development cycle.

Now that we’re well into November, the 1.8 development window is closing fast on the few remaining in-flight change lists, with the remainder being told to wait until the 1.9 development season opens when Go 1.8 ships in February 2017.

For more in this series, read my previous post on the Go 1.8 toolchain improvements from September, and my post on the improvements to the Go toolchain in the 1.7 development cycle.

Faster compilation

Since Go 1.5, released in August 2015, compile times have been significantly slower than Go 1.4. Work on addressing this slow down started in earnest in the Go 1.7 cycle, and is still ongoing.

Robert Griesemer and Matthew Dempsky worked on rewriting the parser to make it faster and remove many of the package level variables inherited from the previous yacc based parser. This parser produces a new abstract syntax tree while the rest of the compiler expects the previous yacc syntax tree. For 1.8 the new parser must transform its output into the previous syntax tree for consumption by the rest of the compiler. Even with this extra transformation step the new parser is no slower than the previous version and plans are being made to remove this transformation requirement in Go 1.9.

Compile time for full build relative to Go 1.4.3


The take away is Go 1.8 is on target to improve compile times by an average of 15% over Go 1.7. Compared to the 3-5% improvements reported two months prior, it’s nice to know that there is still blood in this stone.

Note: The benchmark scripts for jujud, kube-controller-manager, and gogs are online. Please try them yourself and report your findings.

Code generation improvements

The big feature of the previous 1.7 cycle was the new SSA backend for 64 bit Intel. In Go 1.8 the SSA backend has been rolled out to all the other architectures that Go supports and the old backend code has been deleted.

amd64, by virtue of being the most popular production architecture, has always been the fastest. As I reported a few months ago, the results comparing Go 1.8 to Go 1.7 on Intel architectures show middling improvement driven equally by improvements to code generation, escape analysis improvements, and optimisations to the std library.

name                     old time/op    new time/op    delta
BinaryTree17-4              3.04s ± 1%     3.03s ± 0%     ~     (p=0.222 n=5+5)
Fannkuch11-4                3.27s ± 0%     3.39s ± 1%   +3.74%  (p=0.008 n=5+5)
FmtFprintfEmpty-4          60.0ns ± 3%    58.3ns ± 1%   -2.70%  (p=0.008 n=5+5)
FmtFprintfString-4          177ns ± 2%     164ns ± 2%   -7.47%  (p=0.008 n=5+5)
FmtFprintfInt-4             169ns ± 2%     157ns ± 1%   -7.22%  (p=0.008 n=5+5)
FmtFprintfIntInt-4          264ns ± 1%     243ns ± 1%   -8.10%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt-4     254ns ± 2%     244ns ± 1%   -4.02%  (p=0.008 n=5+5)
FmtFprintfFloat-4           357ns ± 1%     348ns ± 2%   -2.35%  (p=0.032 n=5+5)
FmtManyArgs-4              1.10µs ± 1%    0.97µs ± 1%  -11.03%  (p=0.008 n=5+5)
GobDecode-4                9.85ms ± 1%    9.31ms ± 1%   -5.51%  (p=0.008 n=5+5)
GobEncode-4                8.75ms ± 1%    8.17ms ± 1%   -6.67%  (p=0.008 n=5+5)
Gzip-4                      282ms ± 0%     289ms ± 1%   +2.32%  (p=0.008 n=5+5)
Gunzip-4                   50.9ms ± 1%    51.7ms ± 0%   +1.67%  (p=0.008 n=5+5)
HTTPClientServer-4          195µs ± 1%     196µs ± 1%     ~     (p=0.095 n=5+5)
JSONEncode-4               21.6ms ± 6%    19.8ms ± 3%   -8.37%  (p=0.008 n=5+5)
JSONDecode-4               70.2ms ± 3%    71.0ms ± 1%     ~     (p=0.310 n=5+5)
Mandelbrot200-4            5.20ms ± 0%    4.73ms ± 1%   -9.05%  (p=0.008 n=5+5)
GoParse-4                  4.38ms ± 3%    4.28ms ± 2%     ~     (p=0.056 n=5+5)
RegexpMatchEasy0_32-4      96.7ns ± 2%    98.1ns ± 0%     ~     (p=0.127 n=5+5)
RegexpMatchEasy0_1K-4       311ns ± 1%     313ns ± 0%     ~     (p=0.214 n=5+5)
RegexpMatchEasy1_32-4      97.9ns ± 2%    89.8ns ± 2%   -8.33%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K-4       519ns ± 0%     510ns ± 2%   -1.70%  (p=0.040 n=5+5)
RegexpMatchMedium_32-4      158ns ± 2%     146ns ± 0%   -7.71%  (p=0.016 n=5+4)
RegexpMatchMedium_1K-4     46.3µs ± 1%    47.8µs ± 2%   +3.12%  (p=0.008 n=5+5)
RegexpMatchHard_32-4       2.53µs ± 3%    2.46µs ± 0%   -2.91%  (p=0.008 n=5+5)
RegexpMatchHard_1K-4       76.1µs ± 0%    74.5µs ± 2%   -2.12%  (p=0.008 n=5+5)
Revcomp-4                   563ms ± 2%     531ms ± 1%   -5.78%  (p=0.008 n=5+5)
Template-4                 86.7ms ± 1%    82.2ms ± 1%   -5.16%  (p=0.008 n=5+5)
TimeParse-4                 433ns ± 3%     399ns ± 4%   -7.90%  (p=0.008 n=5+5)
TimeFormat-4                467ns ± 2%     430ns ± 1%   -7.76%  (p=0.008 n=5+5)

name                     old speed      new speed      delta
GobDecode-4              77.9MB/s ± 1%  82.5MB/s ± 1%   +5.84%  (p=0.008 n=5+5)
GobEncode-4              87.7MB/s ± 1%  94.0MB/s ± 1%   +7.15%  (p=0.008 n=5+5)
Gzip-4                   68.8MB/s ± 0%  67.2MB/s ± 1%   -2.27%  (p=0.008 n=5+5)
Gunzip-4                  381MB/s ± 1%   375MB/s ± 0%   -1.65%  (p=0.008 n=5+5)
JSONEncode-4             89.9MB/s ± 5%  98.1MB/s ± 3%   +9.11%  (p=0.008 n=5+5)
JSONDecode-4             27.6MB/s ± 3%  27.3MB/s ± 1%     ~     (p=0.310 n=5+5)
GoParse-4                13.2MB/s ± 3%  13.5MB/s ± 2%     ~     (p=0.056 n=5+5)
RegexpMatchEasy0_32-4     331MB/s ± 2%   326MB/s ± 0%     ~     (p=0.151 n=5+5)
RegexpMatchEasy0_1K-4    3.29GB/s ± 1%  3.27GB/s ± 0%     ~     (p=0.222 n=5+5)
RegexpMatchEasy1_32-4     327MB/s ± 2%   357MB/s ± 2%   +9.20%  (p=0.008 n=5+5)
RegexpMatchEasy1_1K-4    1.97GB/s ± 0%  2.01GB/s ± 2%   +1.76%  (p=0.032 n=5+5)
RegexpMatchMedium_32-4   6.31MB/s ± 2%  6.83MB/s ± 1%   +8.31%  (p=0.008 n=5+5)
RegexpMatchMedium_1K-4   22.1MB/s ± 1%  21.4MB/s ± 2%   -3.01%  (p=0.008 n=5+5)
RegexpMatchHard_32-4     12.6MB/s ± 3%  13.0MB/s ± 0%   +2.98%  (p=0.008 n=5+5)
RegexpMatchHard_1K-4     13.4MB/s ± 0%  13.7MB/s ± 2%   +2.19%  (p=0.008 n=5+5)
Revcomp-4                 451MB/s ± 2%   479MB/s ± 1%   +6.12%  (p=0.008 n=5+5)
Template-4               22.4MB/s ± 1%  23.6MB/s ± 1%   +5.43%  (p=0.008 n=5+5)

The big improvements from the switch to the SSA backend show up on non-Intel architectures. Here are the results for Arm64:

name                     old time/op    new time/op     delta
BinaryTree17-8              10.6s ± 0%       8.1s ± 1%  -23.62%  (p=0.016 n=4+5)
Fannkuch11-8                9.19s ± 0%      5.95s ± 0%  -35.27%  (p=0.008 n=5+5)
FmtFprintfEmpty-8           136ns ± 0%      118ns ± 1%  -13.53%  (p=0.008 n=5+5)
FmtFprintfString-8          472ns ± 1%      331ns ± 1%  -29.82%  (p=0.008 n=5+5)
FmtFprintfInt-8             388ns ± 3%      273ns ± 0%  -29.61%  (p=0.008 n=5+5)
FmtFprintfIntInt-8          640ns ± 2%      438ns ± 0%  -31.61%  (p=0.008 n=5+5)
FmtFprintfPrefixedInt-8     580ns ± 0%      423ns ± 0%  -27.09%  (p=0.008 n=5+5)
FmtFprintfFloat-8           823ns ± 0%      613ns ± 1%  -25.57%  (p=0.008 n=5+5)
FmtManyArgs-8              2.69µs ± 0%     1.96µs ± 0%  -27.12%  (p=0.016 n=4+5)
GobDecode-8                24.4ms ± 0%     17.3ms ± 0%  -28.88%  (p=0.008 n=5+5)
GobEncode-8                18.6ms ± 0%     15.1ms ± 1%  -18.65%  (p=0.008 n=5+5)
Gzip-8                      1.20s ± 0%      0.74s ± 0%  -38.02%  (p=0.008 n=5+5)
Gunzip-8                    190ms ± 0%      130ms ± 0%  -31.73%  (p=0.008 n=5+5)
HTTPClientServer-8          205µs ± 1%      166µs ± 2%  -19.27%  (p=0.008 n=5+5)
JSONEncode-8               50.7ms ± 0%     41.5ms ± 0%  -18.10%  (p=0.008 n=5+5)
JSONDecode-8                201ms ± 0%      155ms ± 1%  -22.93%  (p=0.008 n=5+5)
Mandelbrot200-8            13.0ms ± 0%     10.1ms ± 0%  -22.78%  (p=0.008 n=5+5)
GoParse-8                  11.4ms ± 0%      8.5ms ± 0%  -24.80%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-8       271ns ± 0%      225ns ± 0%  -16.97%  (p=0.008 n=5+5)
RegexpMatchEasy0_1K-8      1.69µs ± 0%     1.92µs ± 0%  +13.42%  (p=0.008 n=5+5)
RegexpMatchEasy1_32-8       292ns ± 0%      255ns ± 0%  -12.60%  (p=0.000 n=4+5)
RegexpMatchEasy1_1K-8      2.20µs ± 0%     2.38µs ± 0%   +8.38%  (p=0.008 n=5+5)
RegexpMatchMedium_32-8      411ns ± 0%      360ns ± 0%  -12.41%  (p=0.000 n=5+4)
RegexpMatchMedium_1K-8      118µs ± 0%      104µs ± 0%  -12.07%  (p=0.008 n=5+5)
RegexpMatchHard_32-8       6.83µs ± 0%     5.79µs ± 0%  -15.27%  (p=0.016 n=4+5)
RegexpMatchHard_1K-8        205µs ± 0%      176µs ± 0%  -14.19%  (p=0.008 n=5+5)
Revcomp-8                   2.01s ± 0%      1.43s ± 0%  -29.02%  (p=0.008 n=5+5)
Template-8                  259ms ± 0%      158ms ± 0%  -38.93%  (p=0.008 n=5+5)
TimeParse-8                 874ns ± 1%      733ns ± 1%  -16.16%  (p=0.008 n=5+5)
TimeFormat-8               1.00µs ± 1%     0.86µs ± 1%  -13.88%  (p=0.008 n=5+5)

name                     old speed      new speed       delta
GobDecode-8              31.5MB/s ± 0%   44.3MB/s ± 0%  +40.61%  (p=0.008 n=5+5)
GobEncode-8              41.3MB/s ± 0%   50.7MB/s ± 1%  +22.92%  (p=0.008 n=5+5)
Gzip-8                   16.2MB/s ± 0%   26.1MB/s ± 0%  +61.33%  (p=0.008 n=5+5)
Gunzip-8                  102MB/s ± 0%    150MB/s ± 0%  +46.45%  (p=0.016 n=4+5)
JSONEncode-8             38.3MB/s ± 0%   46.7MB/s ± 0%  +22.10%  (p=0.008 n=5+5)
JSONDecode-8             9.64MB/s ± 0%  12.49MB/s ± 0%  +29.54%  (p=0.016 n=5+4)
GoParse-8                5.09MB/s ± 0%   6.78MB/s ± 0%  +33.02%  (p=0.008 n=5+5)
RegexpMatchEasy0_32-8     118MB/s ± 0%    142MB/s ± 0%  +20.29%  (p=0.008 n=5+5)
RegexpMatchEasy0_1K-8     605MB/s ± 0%    534MB/s ± 0%  -11.85%  (p=0.016 n=5+4)
RegexpMatchEasy1_32-8     110MB/s ± 0%    125MB/s ± 0%  +14.23%  (p=0.029 n=4+4)
RegexpMatchEasy1_1K-8     465MB/s ± 0%    430MB/s ± 0%   -7.72%  (p=0.008 n=5+5)
RegexpMatchMedium_32-8   2.43MB/s ± 0%   2.77MB/s ± 0%  +13.99%  (p=0.016 n=5+4)
RegexpMatchMedium_1K-8   8.68MB/s ± 0%   9.87MB/s ± 0%  +13.71%  (p=0.008 n=5+5)
RegexpMatchHard_32-8     4.68MB/s ± 0%   5.53MB/s ± 0%  +18.08%  (p=0.016 n=4+5)
RegexpMatchHard_1K-8     5.00MB/s ± 0%   5.83MB/s ± 0%  +16.60%  (p=0.008 n=5+5)
Revcomp-8                 126MB/s ± 0%    178MB/s ± 0%  +40.88%  (p=0.008 n=5+5)
Template-8               7.48MB/s ± 0%  12.25MB/s ± 0%  +63.74%  (p=0.008 n=5+5)

These are pretty big improvements from just recompiling your binary.

Defer and cgo improvements

The question of whether defer can be used in hot code paths remains open, but during the 1.8 cycle Austin reduced the overhead of using defer by half, according to some benchmarks.

The runtime package benchmarks are a little less rosy.

name         old time/op  new time/op  delta
Defer-4       101ns ± 1%    66ns ± 0%  -34.73%  (p=0.000 n=20+20)
Defer10-4    93.2ns ± 1%  62.5ns ± 8%  -33.02%  (p=0.000 n=20+20)
DeferMany-4   148ns ± 3%   131ns ± 3%  -11.42%  (p=0.000 n=19+19)

According to them defer improved by a third in most common circumstances where the statement closes over no more than a single variable.
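The figures above come from the runtime package's own benchmarks. To get a rough feel for what defer costs in your own code, one option is to time the same critical section with and without defer using testing.Benchmark, which can be called from an ordinary program. This is only a sketch: the counter type and its methods are invented for the example, and the numbers will vary with Go version and hardware.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// counter is a hypothetical type with a mutex-protected field.
type counter struct {
	mu sync.Mutex
	n  int
}

// incDefer releases the lock with defer.
func (c *counter) incDefer() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// incDirect releases the lock with an explicit call.
func (c *counter) incDirect() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

func main() {
	var c counter

	withDefer := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			c.incDefer()
		}
	})
	direct := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			c.incDirect()
		}
	})

	fmt.Println("defer: ", withDefer)
	fmt.Println("direct:", direct)
}
```

A single run like this is noisy; the tables in this post compare several runs of each benchmark for that reason.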

Additionally, an optimisation by David Crawshaw reduced the overhead of defer in the cgo path by nearly half.

name       old time/op  new time/op  delta
CgoNoop-8  93.5ns ± 0%  51.1ns ± 1%  -45.34%  (p=0.016 n=4+5)

One more thing

Go 1.7 supported 64 bit mips platforms, thanks to the work of Minux and Cherry. However, the less powerful but plentiful 32 bit mips platforms were not supported. As a bonus, thanks to the work of Vladimir Stefanovic, Go 1.8 will ship with support for 32 bit mips.

% env GOARCH=mips go build -o godoc.mips
% file godoc.mips 
godoc.mips: ELF 32-bit MSB  executable, MIPS, MIPS32 version 1 (SYSV), statically linked, not stripped

While 32 bit mips hosts are probably too small to compile Go programs natively, you can always cross compile from your development workstation for linux/mips.

Do not fear first class functions

This is the text of my dotGo 2016 presentation. A recording and slide deck are also available.


Hello, welcome to dotGo.

Two years ago I stood on a stage, not unlike this one, and told you my opinion for how configuration options should be handled in Go. The cornerstone of my presentation was Rob Pike’s blog post, Self-referential functions and the design of options.

Since then it has been wonderful to watch this idea mature from Rob’s original blog post, to the gRPC project, who in my opinion have continued to evolve this design pattern into its best form to date.

But, when talking to Gophers at a conference in London a few months ago, several of them expressed a concern that while they understood the notion of a function that returns a function, the technique that powers functional options, they worried that other Go programmers—I suspect they meant less experienced Go programmers—wouldn’t be able to understand this style of programming.

And this made me a bit sad because I consider Go’s support of first class functions to be a gift, and something that we should all be able to take advantage of. So I’m here today to show you, that you do not need to fear first class functions.

Functional options recap

To begin, I’ll very quickly recap the functional options pattern

type Config struct{ ... }

func WithReticulatedSplines(c *Config) { ... }

type Terrain struct {
        config Config
}

func NewTerrain(options ...func(*Config)) *Terrain {
        var t Terrain
        for _, option := range options {
                option(&t.config)
        }
        return &t
}

func main() {
        t := NewTerrain(WithReticulatedSplines)
        // [ simulation intensifies ]
}

We start with some options, expressed as functions which take a pointer to a structure to configure. We pass those functions to a constructor, and inside the body of that constructor each option function is invoked in order, passing in a reference to the Config value. Finally, we call NewTerrain with the options we want, and away we go.

Okay, everyone should be familiar with this pattern. Where I believe the confusion comes from, is when you need an option function which takes a parameter. For example, we have WithCities, which lets us add a number of cities to our terrain model.

// WithCities adds n cities to the Terrain model
func WithCities(n int) func(*Config) { ... }

func main() {
        t := NewTerrain(WithCities(9))
        // ...
}

Because WithCities takes an argument, we cannot simply pass WithCities to NewTerrain; its signature does not match. Instead we evaluate WithCities, passing in the number of cities to create, and use the result as the value to pass to NewTerrain.

Functions as first class values

What’s going on here? Let’s break it down. Fundamentally, evaluating a function returns a value. We have functions that take two numbers and return a number.

package math

func Min(a, b float64) float64

We have functions that take a slice, and return a pointer to a structure.

package bytes

func NewReader(b []byte) *Reader

and now we have a function which returns a function.

func WithCities(n int) func(*Config)

The type of the value that is returned from WithCities is a function which takes a pointer to a Config. This ability to treat functions as regular values leads to their name: first class functions.
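To make the returned function concrete, here is a minimal runnable sketch of WithCities written as a closure. The cities field on Config is an assumption made for this example; the real contents of Config are elided in the snippets above.

```go
package main

import "fmt"

// Config's single field is an assumption made for this sketch.
type Config struct {
	cities int
}

type Terrain struct {
	config Config
}

// WithCities returns a function value. The anonymous function it returns
// captures n, so n is still available when the option is applied later.
func WithCities(n int) func(*Config) {
	return func(c *Config) {
		c.cities = n
	}
}

// NewTerrain applies each option, in order, to the Terrain's Config.
func NewTerrain(options ...func(*Config)) *Terrain {
	var t Terrain
	for _, option := range options {
		option(&t.config)
	}
	return &t
}

func main() {
	t := NewTerrain(WithCities(9))
	fmt.Println(t.config.cities) // prints 9
}
```

WithCities(9) runs immediately and produces a value; the code inside the returned function does not run until NewTerrain calls it.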


Another way to think about what is going on here is to try to rewrite the functional option pattern using an interface.

type Option interface {
        Apply(*Config)
}

Rather than a function type we declare an interface, we’ll call it Option, and give it a single method, Apply, which takes a pointer to a Config.

func NewTerrain(options ...Option) *Terrain {
        var config Config
        for _, option := range options {
                option.Apply(&config)
        }
        // ...
}

Whenever we call NewTerrain we pass in one or more values that implement the Option interface. Inside NewTerrain, just as before, we loop over the slice of options and call the Apply method on each.

This doesn’t look too different to the previous example. Rather than ranging over a slice of functions and calling them, we range over a slice of interface values and call a method on each. Let’s take a look at the other side, declaring the WithReticulatedSplines option.

type splines struct{}

func (s *splines) Apply(c *Config) { ... }

func WithReticulatedSplines() Option {
        return new(splines)
}

Because we’re passing around interface implementations, we need to declare a type to hold the Apply method. We also need to declare a constructor function to return our splines option implementation–you can already see that this is going to be more code.

To write WithCities using our Option interface we need to do a bit more work.

type cities struct {
        cities int
}

func (c *cities) Apply(cfg *Config) { ... }

func WithCities(n int) Option {
        return &cities{
                cities: n,
        }
}

In the previous, functional, version the value of n, the number of cities to create, was captured lexically for us in the declaration of the anonymous function. Because we’re using an interface we need to declare a type to hold the count of cities and we need a constructor to assign the field during construction.

func main() {
        t := NewTerrain(WithReticulatedSplines(), WithCities(9))
        // ...
}

Putting it all together, we call NewTerrain with the results of evaluating WithReticulatedSplines and WithCities.

At GopherCon last year Tomás Senart spoke about the duality of a first class function and an interface with one method. You can see this duality play out in our example; an interface with one method and a function are equivalent.

But, you can also see that using functions as first class values involves much less code.
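To make the comparison concrete, here is a minimal, self-contained sketch that writes the same cities option both ways. The fields of Config and Terrain are stand-ins invented for this illustration, and the names FuncOption, WithCitiesFunc, and NewTerrainFunc are hypothetical, chosen only so both flavours can live in one file.

```go
package main

import "fmt"

// Config and Terrain are stand-ins for this sketch; the real types
// would carry many more fields.
type Config struct {
        Cities int
}

type Terrain struct {
        config Config
}

// Interface flavour: a named type holds the state and the method.
type Option interface {
        Apply(*Config)
}

type cities struct {
        n int
}

func (c *cities) Apply(cfg *Config) { cfg.Cities = c.n }

func WithCities(n int) Option {
        return &cities{n: n}
}

func NewTerrain(options ...Option) *Terrain {
        var config Config
        for _, option := range options {
                option.Apply(&config)
        }
        return &Terrain{config: config}
}

// Function flavour: the closure captures n, so no extra type is needed.
// FuncOption, WithCitiesFunc, and NewTerrainFunc are hypothetical names.
type FuncOption func(*Config)

func WithCitiesFunc(n int) FuncOption {
        return func(cfg *Config) { cfg.Cities = n }
}

func NewTerrainFunc(options ...FuncOption) *Terrain {
        var config Config
        for _, option := range options {
                option(&config)
        }
        return &Terrain{config: config}
}

func main() {
        a := NewTerrain(WithCities(9))
        b := NewTerrainFunc(WithCitiesFunc(9))
        fmt.Println(a.config.Cities, b.config.Cities) // 9 9
}
```

Both constructors end up with the same Config; the function flavour simply lets the closure do the job of the cities struct.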

Encapsulating behaviour

Let’s leave interfaces for a moment and talk about some other properties of first class functions.

When we invoke a function or a method, we do so passing around data. The job of that function is often to interpret that data and take some action. Function values allow you to pass behaviour to be executed, rather than data to be interpreted. In effect, passing a function value allows you to declare code that will execute later, perhaps in a different context.

To illustrate this, here is a simple calculator.

type Calculator struct {
        acc float64
}

const (
        OP_ADD = 1 << iota
        OP_SUB
        OP_MUL
)

It has a set of operations it understands.

func (c *Calculator) Do(op int, v float64) float64 {
        switch op {
        case OP_ADD:
                c.acc += v
        case OP_SUB:
                c.acc -= v
        case OP_MUL:
                c.acc *= v
        default:
                panic("unhandled operation")
        }
        return c.acc
}

It has one method, Do, which takes an operation and an operand, v. For convenience, Do also returns the value of the accumulator after the operation is applied.

func main() {
        var c Calculator
        fmt.Println(c.Do(OP_ADD, 100))     // 100
        fmt.Println(c.Do(OP_SUB, 50))      // 50
        fmt.Println(c.Do(OP_MUL, 2))       // 100
}

Our calculator only knows how to add, subtract, and multiply. If we wanted to implement division, we’d have to allocate an operation constant, then open up the Do method and add the code to implement division. Sounds reasonable, it’s only a few lines, but what if we wanted to add square root and exponentiation?

Each time we did this, Do would grow longer and become harder to follow, because each time we add an operation we have to encode into Do knowledge of how to interpret that operation.

Let’s rewrite our calculator a little.

type Calculator struct {
        acc float64
}

type opfunc func(float64, float64) float64

func (c *Calculator) Do(op opfunc, v float64) float64 {
        c.acc = op(c.acc, v)
        return c.acc
}

As before we have a Calculator, which manages its own accumulator. The Calculator has a Do method, which this time takes a function as the operation, and a value as the operand. Whenever Do is called, it calls the operation we pass in, using its own accumulator and the operand we provide.

So, how do we use this new Calculator? You guessed it, by writing our operations as functions.

func Add(a, b float64) float64 { return a + b }

This is the code for Add. What about the other operations? It turns out they aren’t too hard either.

func Sub(a, b float64) float64 { return a - b }
func Mul(a, b float64) float64 { return a * b }

func main() {
        var c Calculator
        fmt.Println(c.Do(Add, 5))       // 5
        fmt.Println(c.Do(Sub, 3))       // 2
        fmt.Println(c.Do(Mul, 8))       // 16
}

As before we construct a Calculator and call it passing operations and an operand.
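Assembled into one file, the function-based calculator might look like the sketch below. Div is a hypothetical extra operation, not part of the original example; it is included to show that a new operation needs no new constant and no change to Do.

```go
package main

import "fmt"

type Calculator struct {
        acc float64
}

type opfunc func(float64, float64) float64

// Do applies op to the accumulator and v, then stores the result.
func (c *Calculator) Do(op opfunc, v float64) float64 {
        c.acc = op(c.acc, v)
        return c.acc
}

func Add(a, b float64) float64 { return a + b }
func Sub(a, b float64) float64 { return a - b }
func Mul(a, b float64) float64 { return a * b }

// Div is a hypothetical addition for this sketch.
func Div(a, b float64) float64 { return a / b }

func main() {
        var c Calculator
        fmt.Println(c.Do(Add, 5)) // 5
        fmt.Println(c.Do(Sub, 3)) // 2
        fmt.Println(c.Do(Mul, 8)) // 16
        fmt.Println(c.Do(Div, 4)) // 4
}
```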

Extending the calculator

Now we can describe operations as functions, we can try to extend our calculator to handle square root.

func Sqrt(n, _ float64) float64 {
        return math.Sqrt(n)
}

But, it turns out there is a problem. math.Sqrt takes one argument, not two. However our Calculator’s Do method’s signature requires an operation function that takes two arguments.

func main() {
        var c Calculator
        c.Do(Add, 16)
        c.Do(Sqrt, 0) // operand ignored
}

Maybe we just cheat and ignore the operand. That’s a bit gross, I think we can do better.

Let’s redefine Add from a function that is called with two values and returns a third, to a function which returns a function that takes a value and returns a value.

func Add(n float64) func(float64) float64 {
        return func(acc float64) float64 {
                return acc + n
        }
}

func (c *Calculator) Do(op func(float64) float64) float64 {
        c.acc = op(c.acc)
        return c.acc
}

Do now invokes the operation function passing in its own accumulator and recording the result back in the accumulator.

func main() {
        var c Calculator
        c.Do(Add(10))   // 10
        c.Do(Add(20))   // 30
}

Now in main we call Do not with the Add function itself, but with the result of evaluating Add(10). The type of the result of evaluating Add(10) is a function which takes a value, and returns a value, matching the signature that Do requires.

func Sub(n float64) func(float64) float64 {
        return func(acc float64) float64 {
                return acc - n
        }
}

func Mul(n float64) func(float64) float64 {
        return func(acc float64) float64 {
                return acc * n
        }
}

Subtraction and multiplication are similarly easy to implement. But what about square root?

func Sqrt() func(float64) float64 {
        return func(n float64) float64 {
                return math.Sqrt(n)
        }
}

func main() {
        var c Calculator
        c.Do(Add(2))   // 2
        c.Do(Sqrt())   // 1.41421356237
}

This implementation of square root avoids the awkward syntax of the previous calculator’s operation function, as our revised calculator now operates on functions which take and return only one value.

Hopefully you’ve noticed that the signature of our Sqrt function is the same as math.Sqrt, so we can make this code smaller by reusing any function from the math package that takes a single argument.

func main() {
        var c Calculator
        c.Do(Add(2))      // 2
        c.Do(math.Sqrt)   // 1.41421356237
        c.Do(math.Cos)    // ≈ 0.15594 (math.Cos takes radians)
}

We started with a model of hard coded, interpreted logic. We moved to a more functional model, where we pass in the behaviour we want. Then, by taking it a step further, we generalised our calculator to work for operations regardless of their number of arguments.
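As a runnable recap, the final calculator can be put together as in the sketch below. The operand values are chosen for this illustration so that every intermediate result is exact; they do not come from the original example.

```go
package main

import (
        "fmt"
        "math"
)

type Calculator struct {
        acc float64
}

// Do replaces the accumulator with the result of op(accumulator).
func (c *Calculator) Do(op func(float64) float64) float64 {
        c.acc = op(c.acc)
        return c.acc
}

// Add and Mul capture their operand n in the returned closure.
func Add(n float64) func(float64) float64 {
        return func(acc float64) float64 { return acc + n }
}

func Mul(n float64) func(float64) float64 {
        return func(acc float64) float64 { return acc * n }
}

func main() {
        var c Calculator
        fmt.Println(c.Do(Add(16)))    // 16
        fmt.Println(c.Do(math.Sqrt))  // 4
        fmt.Println(c.Do(Mul(2.5)))   // 10
        fmt.Println(c.Do(Add(-0.5)))  // 9.5
        fmt.Println(c.Do(math.Floor)) // 9
}
```

Any function with the signature func(float64) float64 fits, whether it is a closure we built ourselves or a function from the math package passed by name.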

Let’s talk about actors

Let’s change tracks a little and talk about why most of us are here at a Go conference; concurrency, specifically actors. To give due credit, the examples here are inspired by Bryan Boreham’s talk from GolangUK, you should check it out.

Suppose we’re building a chat server, we plan to be the next Hipchat or Slack, but we’ll start small for the moment.

type Mux struct {
        mu    sync.Mutex
        conns map[net.Addr]net.Conn
}

func (m *Mux) Add(conn net.Conn) {
        m.mu.Lock()
        defer m.mu.Unlock()
        m.conns[conn.RemoteAddr()] = conn
}

We have a way to register new connections.

func (m *Mux) Remove(addr net.Addr) {
        m.mu.Lock()
        defer m.mu.Unlock()
        delete(m.conns, addr)
}

Remove old connections.

func (m *Mux) SendMsg(msg string) error {
        m.mu.Lock()
        defer m.mu.Unlock()
        for _, conn := range m.conns {
                _, err := io.WriteString(conn, msg)
                if err != nil {
                        return err
                }
        }
        return nil
}

And a way to send a message to all the registered connections. Because this is a server, all of these methods will be called concurrently, so we need to use a mutex to protect the conns map and prevent data races. Is this what you’d call idiomatic Go code?

Don’t communicate by sharing memory, share memory by communicating.

Our first proverb–don’t mediate access to shared memory with locks and mutexes, instead share that memory by communicating. So let’s apply this advice to our chat server.

Rather than using a mutex to serialise access to the Mux’s conns map, we can give that job to a goroutine, and communicate with that goroutine via channels.

type Mux struct {
        add     chan net.Conn
        remove  chan net.Addr
        sendMsg chan string
}

func (m *Mux) Add(conn net.Conn) {
        m.add <- conn
}

Add sends the connection to add to the add channel.

func (m *Mux) Remove(addr net.Addr) {
        m.remove <- addr
}

Remove sends the address of the connection to the remove channel.

func (m *Mux) SendMsg(msg string) error {
        m.sendMsg <- msg
        return nil
}

And SendMsg sends the message to be transmitted to each connection to the sendMsg channel.

func (m *Mux) loop() {
        conns := make(map[net.Addr]net.Conn)
        for {
                select {
                case conn := <-m.add:
                        conns[conn.RemoteAddr()] = conn
                case addr := <-m.remove:
                        delete(conns, addr)
                case msg := <-m.sendMsg:
                        for _, conn := range conns {
                                io.WriteString(conn, msg)
                        }
                }
        }
}

Rather than using a mutex to serialise access to the conns map, loop will wait until it receives an operation in the form of a value sent over one of the add, remove, or sendMsg channels and apply the relevant case. We don’t need a mutex anymore because the shared state, our conns map, is local to the loop function.
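The snippets above don’t show how the channels are created or how loop gets started. One way to wire it up is sketched below. NewMux and demo are hypothetical helpers added for this illustration, and net.Pipe stands in for a real client connection.

```go
package main

import (
        "fmt"
        "io"
        "net"
)

type Mux struct {
        add     chan net.Conn
        remove  chan net.Addr
        sendMsg chan string
}

// NewMux is a hypothetical constructor for this sketch. It allocates the
// channels and starts the one goroutine that owns the conns map.
func NewMux() *Mux {
        m := &Mux{
                add:     make(chan net.Conn),
                remove:  make(chan net.Addr),
                sendMsg: make(chan string),
        }
        go m.loop()
        return m
}

func (m *Mux) Add(conn net.Conn)    { m.add <- conn }
func (m *Mux) Remove(addr net.Addr) { m.remove <- addr }

func (m *Mux) SendMsg(msg string) error {
        m.sendMsg <- msg
        return nil
}

func (m *Mux) loop() {
        conns := make(map[net.Addr]net.Conn) // only loop touches this map
        for {
                select {
                case conn := <-m.add:
                        conns[conn.RemoteAddr()] = conn
                case addr := <-m.remove:
                        delete(conns, addr)
                case msg := <-m.sendMsg:
                        for _, conn := range conns {
                                io.WriteString(conn, msg)
                        }
                }
        }
}

// demo registers one end of an in-memory pipe, broadcasts a message,
// and returns what the other end received.
func demo() string {
        m := NewMux()
        client, server := net.Pipe()
        m.Add(server)
        m.SendMsg("hello")

        buf := make([]byte, len("hello"))
        io.ReadFull(client, buf) // lets the blocked write inside loop finish
        return string(buf)
}

func main() {
        fmt.Println(demo()) // hello
}
```

In this sketch the loop goroutine never exits; a real server would also need a way to stop it.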

But, there’s still a lot of hard coded logic here. loop only knows how to do three things; add, remove and broadcast a message. As with the previous example, adding new features to our Mux type will involve:

  • creating a channel.
  • adding a helper to send the data over the channel.
  • extending the select logic inside loop to process that data.

Just like our Calculator example we can rewrite our Mux to use first class functions to pass around behaviour we want to be executed, not data to be interpreted. Now, each method sends an operation to be executed in the context of the loop function, using our single ops channel.

type Mux struct {
        ops chan func(map[net.Addr]net.Conn)
}

func (m *Mux) Add(conn net.Conn) {
        m.ops <- func(m map[net.Addr]net.Conn) {
                m[conn.RemoteAddr()] = conn
        }
}

In this case the signature of the operation is a function which takes a map of net.Addr’s to net.Conn’s. In a real program you’d probably have a much more complicated type to represent a client connection, but it’s sufficient for the purpose of this example.

func (m *Mux) Remove(addr net.Addr) {
        m.ops <- func(m map[net.Addr]net.Conn) {
                delete(m, addr)
        }
}

Remove is similar, we send a function that deletes its connection’s address from the supplied map.

func (m *Mux) SendMsg(msg string) error {
        m.ops <- func(m map[net.Addr]net.Conn) {
                for _, conn := range m {
                        io.WriteString(conn, msg)
                }
        }
        return nil
}

SendMsg is a function which iterates over all connections in the supplied map and calls io.WriteString to send each a copy of the message.

func (m *Mux) loop() {
        conns := make(map[net.Addr]net.Conn)
        for op := range m.ops {
                op(conns)
        }
}

You can see that we’ve moved the logic from the body of loop into anonymous functions created by our helpers. So the job of loop is now to create a conns map, wait for an operation to be provided on the ops channel, then invoke it, passing in its map of connections.

But there are a few problems still to fix. The most pressing is the lack of error handling in SendMsg; an error writing to a connection will not be communicated back to the caller. So let’s fix that now.

func (m *Mux) SendMsg(msg string) error {
        result := make(chan error, 1)
        m.ops <- func(m map[net.Addr]net.Conn) {
                for _, conn := range m {
                        _, err := io.WriteString(conn, msg)
                        if err != nil {
                                result <- err
                                return
                        }
                }
                result <- nil
        }
        return <-result
}

To handle the error being generated inside the anonymous function we pass to loop we need to create a channel to communicate the result of the operation. This also creates a point of synchronisation, the last line of SendMsg blocks until the function we passed into loop has been executed.

func (m *Mux) loop() {
        conns := make(map[net.Addr]net.Conn)
        for op := range m.ops {
                op(conns)
        }
}

Note that we didn’t have to change the body of loop at all to incorporate this error handling. And now we know how to do this, we can easily add a new function to Mux to send a private message to a single client.

func (m *Mux) PrivateMsg(addr net.Addr, msg string) error {
        result := make(chan net.Conn, 1)
        m.ops <- func(m map[net.Addr]net.Conn) {
                result <- m[addr]
        }
        conn := <-result
        if conn == nil {
                // Errorf is not in the standard errors package; github.com/pkg/errors provides it.
                return errors.Errorf("client %v not registered", addr)
        }
        _, err := io.WriteString(conn, msg)
        return err
}

To do this we pass a “lookup function” to loop via the ops channel, which will look in the map provided to it (this is loop’s conns map) and return the value for the address we want on the result channel.

In the rest of the function we check to see if the result was nil—the zero value from the map lookup implies that the client is not registered. Otherwise we now have a reference to the client and we can call io.WriteString to send them a message.

And just to reiterate, we did this all without changing the body of loop, or affecting any of the other operations.
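To see the pieces working together, here is a self-contained sketch of the ops-channel Mux. NewMux, Count, and demo are hypothetical additions for this illustration: Count is a second query written in the same style as PrivateMsg, again without touching loop. The sketch uses fmt.Errorf in place of errors.Errorf so that it needs only the standard library, and net.Pipe stands in for a real client connection.

```go
package main

import (
        "fmt"
        "io"
        "net"
)

type Mux struct {
        ops chan func(map[net.Addr]net.Conn)
}

// NewMux is a hypothetical constructor that starts the loop goroutine.
func NewMux() *Mux {
        m := &Mux{ops: make(chan func(map[net.Addr]net.Conn))}
        go m.loop()
        return m
}

func (m *Mux) loop() {
        conns := make(map[net.Addr]net.Conn)
        for op := range m.ops {
                op(conns)
        }
}

func (m *Mux) Add(conn net.Conn) {
        m.ops <- func(conns map[net.Addr]net.Conn) {
                conns[conn.RemoteAddr()] = conn
        }
}

// Count is a hypothetical query; the answer comes back over a result
// channel, just like the lookup in PrivateMsg.
func (m *Mux) Count() int {
        result := make(chan int, 1)
        m.ops <- func(conns map[net.Addr]net.Conn) {
                result <- len(conns)
        }
        return <-result
}

func (m *Mux) PrivateMsg(addr net.Addr, msg string) error {
        result := make(chan net.Conn, 1)
        m.ops <- func(conns map[net.Addr]net.Conn) {
                result <- conns[addr]
        }
        conn := <-result
        if conn == nil {
                return fmt.Errorf("client %v not registered", addr)
        }
        _, err := io.WriteString(conn, msg)
        return err
}

// demo registers one in-memory connection, sends it a private message,
// and returns the connection count and the text that arrived.
func demo() (int, string) {
        m := NewMux()
        client, server := net.Pipe()
        m.Add(server)

        received := make(chan string, 1)
        go func() {
                buf := make([]byte, len("hello"))
                io.ReadFull(client, buf)
                received <- string(buf)
        }()

        if err := m.PrivateMsg(server.RemoteAddr(), "hello"); err != nil {
                return m.Count(), err.Error()
        }
        return m.Count(), <-received
}

func main() {
        fmt.Println(demo()) // 1 hello
}
```

Because loop ranges over the ops channel, closing that channel would be one way to let the goroutine exit; this sketch leaves shutdown out.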


In summary

  • First class functions bring you tremendous expressive power. They let you pass around behaviour, not just dead data that must be interpreted.
  • First class functions aren’t new or novel. Many older languages have offered them, even C. In fact, it was only somewhere along the line of removing pointers that programmers in the OO stream of languages lost access to first class functions. If you’re a JavaScript programmer, you’ve probably spent the last 15 minutes wondering what the big deal is.
  • First class functions, like the other features Go offers, should be used with restraint. Just as it is possible to make an overcomplicated program with the overuse of channels, it’s possible to make an impenetrable program with an overuse of first class functions. But that does not mean you shouldn’t use them at all; just use them in moderation.
  • First class functions are something that I believe every Go programmer should have in their toolbox. First class functions aren’t unique to Go, and Go programmers shouldn’t be afraid of them.
  • If you can learn to use interfaces, you can learn to use first class functions. They aren’t hard, just a little unfamiliar, and unfamiliarity is something that I believe can be overcome with time and practice.

So next time you define an API that has just one method, ask yourself, shouldn’t it really just be a function?

Introducing Go 2.0

Just so we’re clear, this post is a thought experiment, not any form of commitment to deliver Go 2.0 in any time frame. While I personally believe there will be a Go 2.0 in the future, I’m in no position to influence its creation; hence, this post is mere speculation.

Why introduce a new major version of Go?

Go 1.0 was released over 4 years ago, and since then the Go 1 compatibility contract has been a boon to anyone investing in Go as the language to build their product. So, why introduce a new version of Go?

By the time that Go 1.8 is released at the start of 2017, the standard library will have accumulated cruft and hacks for five years, and if you consider that Go started life in 2007, it’s closer to ten. An opportunity to address this cruft and remove some of the packages which are now understood to be a bad idea would make the standard library more consistent and approachable to newcomers.

It is possible the language itself could become smaller. Rob Pike noted in 2014 that there are too many ways to declare a variable in Go, and this could be rationalised. Similarly the incongruence between make and new might be resolved. Then there is the problem of non-Latin characters not being considered upper case. So, lots of little cleanups to do.

Obviously some kind of solution for templated types would have to be part of any Go 2.0 discussion and, as David Symonds pointed out several years ago, they would have to be used to rewrite the standard library, both causing, and justifying, the compatibility break.

Backward compatibility

Backwards compatibility is not about syntax or features, backwards compatibility is about investment. Investment in the language; both at a technical and career level. Investment in libraries. Investment in backends that generate machine code. Investment in the mid part of the compiler that transforms and optimises code. Investment in build scripts and toolchains that embed one piece of compiled code into another.

Brian Goetz, the Java language architect, describes the commitment to backward compatibility as the “central park effect”. This is something our cousins in the hardware world have long understood–never let the customer unbolt your product from the rack, ’cos they might take the opportunity to use that space for your competition.

The lessons of Python 3000 are prescient; ignore backward compatibility at your peril. No matter how compelling the new version of your language, if you make it incompatible with the investment in the previous version, you are launching a new product which is in direct competition with itself. And just to make it clear, I’m not picking on Python specifically, there are plenty of other examples; D 2.0 and Perl 6 also come to mind.

All of these examples show the danger of creating a new version of a language that requires its users to rewrite all the source of their program, including all their dependencies (which may be non-trivial), before it will compile and run.

A plausible implementation

So, how to create a new Go 2.0 language, with a new syntax and a new standard library, without making it incompatible with every piece of Go code written to date? How could we avoid the all or nothing stand-off in which other languages place their users?

What if we could combine code written in Go 1.0 and a proposed Go 2.0 in one program using the package level as the boundary between language versions? Go 2.0 would be a new language, with a new standard library built upon a runtime shared between itself and Go 1.0, thereby allowing users to work outwards from their Go 2.0 main package to the limbs of their dependency graph, one package at a time.

A Go 2.0 package would be able to call down to Go 1.0, but not the other way around. Go 2.0 types would be able to interoperate with Go 1.0 types, but Go 1.0 types would be unaware of Go 2.0 constructed code. Perhaps calling from Go 2.0 to Go 1.0 looks conceptually like using cgo to call C code, except without the overhead as both languages would be compiled to the same intermediary form.

The key is both language versions would be compiled to a single intermediate representation, one that can represent the superset of both syntaxes. This has been done before; in the first few versions of Go, C code and Go code were compiled to an intermediate representation, Ken Thompson’s universal assembly language, then converted to machine code at link time. Now with Keith Randall’s SSA compiler, there is a single low level intermediate representation (similar to gcc’s GIMPLE and LLVM’s IR) that describes all the things that make Go programs Go [1].

There is a strong precedent for this; the ~Sun~ Oracle JVM. For more than a decade the JVM has hosted byte-code that was not compiled from a .java source file. Combined with a version of gofix that could automate some of the effort in migrating a package to Go 2.0 syntax, this could be a plausible way to introduce a new version of Go without abrogating the investment in code written for Go 1.0.

  1. This also raises the possibility of developing other language front-ends using the Go toolchain. If you look at what LLVM has done for projects like Pony, Crystal, and Rust, think of what a portable, cross platform, optimising compiler, with user space concurrency built in, and written in Go, not C++, would mean for language experimentation.

Go 1.8 performance improvements, one month in

Sunday September the 18th marks a month since the Go 1.8 cycle opened officially. I’m passionate about the performance of Go programs, and of the compiler itself. This post is a brief look at the state of play, roughly halfway into the development cycle for Go 1.8 [1].

Note: these results are of course preliminary and represent only a point in time, not the performance of the final Go 1.8 release.

Compile times

Nothing much to report here. Using the methodology from my previous Go 1.7 benchmarks, there is a 3.22%–5.11% improvement in full compile time compared to Go 1.7.

[Chart: compile times for Go 1.4.3, Go 1.7, and Go tip]

Performance improvements

Intel amd64

Better code generation and small improvements to the runtime and standard library show some small improvements for amd64 [2], but really nothing to write home about yet.

name                       old time/op    new time/op  delta
BinaryTree17-4              3.07s ± 2%     3.06s ± 2%    ~      (p=0.661 n=10+9)
Fannkuch11-4                3.23s ± 1%     3.22s ± 0%  -0.43%   (p=0.008 n=9+10)
FmtFprintfEmpty-4          64.4ns ± 0%    61.8ns ± 4%  -4.17%   (p=0.005 n=9+10)
FmtFprintfString-4          162ns ± 0%     162ns ± 0%    ~      (p=0.065 n=10+9)
FmtFprintfInt-4             142ns ± 0%     142ns ± 0%    ~      (p=0.137 n=8+10)
FmtFprintfIntInt-4          220ns ± 0%     217ns ± 0%  -1.18%   (p=0.000 n=9+10)
FmtFprintfPrefixedInt-4     224ns ± 0%     224ns ± 1%    ~       (p=0.206 n=9+9)
FmtFprintfFloat-4           313ns ± 0%     312ns ± 0%  -0.26%   (p=0.001 n=10+9)
FmtManyArgs-4               906ns ± 0%     894ns ± 0%  -1.32%    (p=0.000 n=7+6)
GobDecode-4                8.88ms ± 1%    8.81ms ± 0%  -0.81%  (p=0.003 n=10+10)
GobEncode-4                7.93ms ± 1%    7.88ms ± 0%  -0.66%   (p=0.008 n=9+10)
Gzip-4                      272ms ± 1%     277ms ± 0%  +1.95%   (p=0.000 n=10+9)
Gunzip-4                   47.4ms ± 0%    47.4ms ± 0%    ~      (p=0.720 n=9+10)
HTTPClientServer-4          201µs ± 4%     202µs ± 2%    ~     (p=0.631 n=10+10)
JSONEncode-4               19.3ms ± 0%    19.3ms ± 0%    ~     (p=0.063 n=10+10)
JSONDecode-4               61.0ms ± 0%    61.2ms ± 0%  +0.33%   (p=0.000 n=10+8)
Mandelbrot200-4            5.20ms ± 0%    5.20ms ± 0%    ~      (p=0.475 n=10+7)
GoParse-4                  3.95ms ± 1%    3.97ms ± 1%  +0.65%    (p=0.003 n=9+9)
RegexpMatchEasy0_32-4      88.4ns ± 0%    88.7ns ± 0%  +0.34%   (p=0.001 n=10+9)
RegexpMatchEasy0_1K-4      1.14µs ± 0%    1.14µs ± 0%    ~       (p=0.369 n=9+6)
RegexpMatchEasy1_32-4      82.6ns ± 0%    82.0ns ± 0%  -0.70%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4       469ns ± 0%     463ns ± 0%  -1.23%    (p=0.000 n=6+9)
RegexpMatchMedium_32-4      138ns ± 1%     136ns ± 0%  -1.38%   (p=0.000 n=10+9)
RegexpMatchMedium_1K-4     43.6µs ± 1%    42.0µs ± 0%  -3.74%    (p=0.000 n=9+9)
RegexpMatchHard_32-4       2.25µs ± 1%    2.23µs ± 0%  -0.57%    (p=0.000 n=8+8)
RegexpMatchHard_1K-4       68.8µs ± 0%    68.6µs ± 0%  -0.37%    (p=0.000 n=8+8)
Revcomp-4                   477ms ± 1%     472ms ± 0%  -1.03%    (p=0.000 n=8+8)
Template-4                 76.1ms ± 0%    76.4ms ± 0%  +0.35%    (p=0.000 n=9+9)
TimeParse-4                 367ns ± 0%     366ns ± 0%  -0.16%   (p=0.003 n=10+8)
TimeFormat-4                386ns ± 0%     384ns ± 0%  -0.58%    (p=0.000 n=9+9)

name                     old speed      new speed      delta
GobDecode-4              86.4MB/s ± 1%  87.1MB/s ± 0%  +0.81%  (p=0.003 n=10+10)
GobEncode-4              96.7MB/s ± 1%  97.4MB/s ± 0%  +0.66%   (p=0.007 n=9+10)
Gzip-4                   71.4MB/s ± 1%  70.0MB/s ± 0%  -1.91%   (p=0.000 n=10+9)
Gunzip-4                  409MB/s ± 0%   410MB/s ± 0%    ~      (p=0.703 n=9+10)
JSONEncode-4              101MB/s ± 0%   100MB/s ± 0%    ~     (p=0.084 n=10+10)
JSONDecode-4             31.8MB/s ± 0%  31.7MB/s ± 0%  -0.33%   (p=0.000 n=10+8)
GoParse-4                14.7MB/s ± 1%  14.6MB/s ± 1%  -0.67%    (p=0.002 n=9+9)
RegexpMatchEasy0_32-4     362MB/s ± 0%   361MB/s ± 0%  -0.36%   (p=0.000 n=10+9)
RegexpMatchEasy0_1K-4     898MB/s ± 0%   898MB/s ± 0%    ~       (p=0.762 n=9+8)
RegexpMatchEasy1_32-4     387MB/s ± 0%   390MB/s ± 0%  +0.70%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4    2.18GB/s ± 0%  2.21GB/s ± 0%  +1.20%    (p=0.000 n=9+9)
RegexpMatchMedium_32-4   7.23MB/s ± 1%  7.32MB/s ± 0%  +1.19%   (p=0.000 n=10+9)
RegexpMatchMedium_1K-4   23.5MB/s ± 1%  24.4MB/s ± 0%  +3.88%    (p=0.000 n=9+9)
RegexpMatchHard_32-4     14.2MB/s ± 1%  14.3MB/s ± 0%  +0.58%    (p=0.000 n=8+8)
RegexpMatchHard_1K-4     14.9MB/s ± 0%  14.9MB/s ± 0%  +0.34%    (p=0.000 n=8+7)
Revcomp-4                 533MB/s ± 1%   539MB/s ± 0%  +1.04%    (p=0.000 n=8+8)
Template-4               25.5MB/s ± 0%  25.4MB/s ± 0%  -0.36%    (p=0.000 n=9+9)


The major improvement that landed recently in the development branch is the conversion of the remaining architecture backends to use the compiler’s SSA form. This has brought a substantial improvement in generated code for non-Intel architectures, like ARM [3].

name                       old time/op    new time/op    delta
BinaryTree17-4              33.8s ± 1%      27.7s ± 0%  -18.06%  (p=0.000 n=10+10)
Fannkuch11-4                42.0s ± 0%      19.3s ± 0%  -54.10%  (p=0.000 n=10+10)
FmtFprintfEmpty-4           670ns ± 1%      581ns ± 1%  -13.30%  (p=0.000 n=10+10)
FmtFprintfString-4         2.04µs ± 1%     1.65µs ± 0%  -19.09%  (p=0.000 n=10+10)
FmtFprintfInt-4            1.71µs ± 0%     1.21µs ± 0%  -29.39%   (p=0.000 n=10+9)
FmtFprintfIntInt-4         2.69µs ± 1%     1.94µs ± 0%  -27.77%  (p=0.000 n=10+10)
FmtFprintfPrefixedInt-4    2.70µs ± 0%     1.85µs ± 0%  -31.41%   (p=0.000 n=10+9)
FmtFprintfFloat-4          5.15µs ± 0%     3.65µs ± 0%  -29.01%   (p=0.000 n=9+10)
FmtManyArgs-4              11.3µs ± 0%      8.5µs ± 0%  -24.79%   (p=0.000 n=10+9)
GobDecode-4                 112ms ± 0%       77ms ± 1%  -31.04%    (p=0.000 n=9+9)
GobEncode-4                88.5ms ± 1%     77.2ms ± 1%  -12.78%  (p=0.000 n=10+10)
Gzip-4                      4.79s ± 0%      3.34s ± 0%  -30.18%    (p=0.000 n=9+9)
Gunzip-4                    702ms ± 0%      463ms ± 0%  -34.05%  (p=0.000 n=10+10)
HTTPClientServer-4          645µs ± 3%      571µs ± 3%  -11.45%  (p=0.000 n=10+10)
JSONEncode-4                227ms ± 0%      186ms ± 0%  -18.16%  (p=0.000 n=10+10)
JSONDecode-4                845ms ± 0%      618ms ± 0%  -26.81%  (p=0.000 n=10+10)
Mandelbrot200-4            59.3ms ± 0%     40.0ms ± 0%  -32.47%  (p=0.000 n=10+10)
GoParse-4                  45.0ms ± 0%     37.0ms ± 0%  -17.68%    (p=0.000 n=9+9)
RegexpMatchEasy0_32-4       974ns ± 0%      878ns ± 0%   -9.81%   (p=0.000 n=10+9)
RegexpMatchEasy0_1K-4      4.60µs ± 0%     4.48µs ± 0%   -2.57%  (p=0.000 n=10+10)
RegexpMatchEasy1_32-4      1.02µs ± 0%     0.94µs ± 0%   -8.08%   (p=0.000 n=8+10)
RegexpMatchEasy1_1K-4      6.92µs ± 0%     6.08µs ± 0%  -12.10%  (p=0.000 n=10+10)
RegexpMatchMedium_32-4     1.61µs ± 0%     1.27µs ± 0%  -20.98%    (p=0.000 n=9+6)
RegexpMatchMedium_1K-4      447µs ± 0%      317µs ± 0%  -29.05%   (p=0.000 n=10+9)
RegexpMatchHard_32-4       24.9µs ± 0%     18.4µs ± 0%  -25.89%  (p=0.000 n=10+10)
RegexpMatchHard_1K-4        740µs ± 0%      552µs ± 0%  -25.36%  (p=0.000 n=10+10)
Revcomp-4                  81.0ms ± 1%     65.2ms ± 0%  -19.53%    (p=0.000 n=9+9)
Template-4                  1.17s ± 0%      0.81s ± 0%  -31.28%    (p=0.000 n=9+9)
TimeParse-4                5.52µs ± 0%     3.79µs ± 0%  -31.42%   (p=0.000 n=10+9)
TimeFormat-4               10.6µs ± 0%      8.5µs ± 0%  -19.14%  (p=0.000 n=10+10)

name                     old speed      new speed        delta
GobDecode-4              6.86MB/s ± 0%   9.95MB/s ± 1%  +45.00%    (p=0.000 n=9+9)
GobEncode-4              8.67MB/s ± 1%   9.94MB/s ± 1%  +14.69%  (p=0.000 n=10+10)
Gzip-4                   4.05MB/s ± 0%   5.81MB/s ± 0%  +43.32%   (p=0.000 n=10+9)
Gunzip-4                 27.6MB/s ± 0%   41.9MB/s ± 0%  +51.63%  (p=0.000 n=10+10)
JSONEncode-4             8.53MB/s ± 0%  10.43MB/s ± 0%  +22.20%  (p=0.000 n=10+10)
JSONDecode-4             2.30MB/s ± 0%   3.14MB/s ± 0%  +36.39%   (p=0.000 n=9+10)
GoParse-4                1.29MB/s ± 0%   1.56MB/s ± 0%  +20.93%   (p=0.000 n=9+10)
RegexpMatchEasy0_32-4    32.8MB/s ± 0%   36.4MB/s ± 0%  +10.87%  (p=0.000 n=10+10)
RegexpMatchEasy0_1K-4     222MB/s ± 0%    228MB/s ± 0%   +2.64%  (p=0.000 n=10+10)
RegexpMatchEasy1_32-4    31.3MB/s ± 0%   34.0MB/s ± 0%   +8.75%   (p=0.000 n=9+10)
RegexpMatchEasy1_1K-4     148MB/s ± 0%    168MB/s ± 0%  +13.76%  (p=0.000 n=10+10)
RegexpMatchMedium_32-4    620kB/s ± 0%    790kB/s ± 0%  +27.42%   (p=0.000 n=10+8)
RegexpMatchMedium_1K-4   2.29MB/s ± 0%   3.23MB/s ± 0%  +41.05%  (p=0.000 n=10+10)
RegexpMatchHard_32-4     1.29MB/s ± 0%   1.74MB/s ± 0%  +34.88%   (p=0.000 n=9+10)
RegexpMatchHard_1K-4     1.38MB/s ± 0%   1.85MB/s ± 0%  +34.06%  (p=0.000 n=10+10)
Revcomp-4                31.4MB/s ± 1%   39.0MB/s ± 0%  +24.26%    (p=0.000 n=9+9)
Template-4               1.65MB/s ± 0%   2.41MB/s ± 0%  +45.71%   (p=0.000 n=10+9)


  1. Despite the Go 1.8 development cycle opening 18 days late, in order to keep to the 6 month cadence, the feature freeze for this cycle will still occur on the 1st of November.
  2. Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz, 3.13.0-95-generic #142-Ubuntu
  3. Freescale i.MX6, 3.14.77-1-ARCH