Gophers, please tag your releases

What do we want? Version management for Go packages! When do we want it? Yesterday!

What does everyone want? We want our Go build tool of choice to fetch the latest stable version when you start using the package in your project. We want them to grab security updates and bug fixes automatically, but not upgrade to a version where the author deleted a method you were using.

But as it stands, today, in 2016, there is no way for a human, or a tool, to look at an arbitrary git (or mercurial, or bzr, etc) repository of Go code and ask questions like:

  • What versions of this project have been released?
  • What is the latest stable release of this software?
  • If I have version 1.2.3, is there a bugfix or security update that I should apply?

The reason for this is that Go projects (repositories of Go packages) do not have versions, at least not in the way that our friends in other languages use that word. Go projects do not have versions because there is no formalised release process.

But there’s vendor/ right?

Arguing about tools to manage your vendor/ directory, or which markup format a manifest file should be written in is eating the elephant from the wrong end.

Before you can argue about the format of a file that records the version of a package, you have to have some way of actually knowing what that version is. A version number has to be sortable, so you can ask, “is there a newer version available than the one you have on disk?” Ideally the version number should give you a clue to how large the jump between versions is, perhaps even give a clue to backwards or forwards compatibility between two versions.

SemVer is no one’s favourite, yet having one format is everyone’s favourite.

I recommend that Go projects adopt SemVer 2.0.0. It’s a sound standard, it is well understood by many, not just Go programmers, and semantic versioning will let people write tools to build a dependency management ecosystem on top of a minimal release process.

Following the lead of the big three Go projects, Docker, Kubernetes, and CoreOS (and GitHub’s own releases page), the format of the tag must be:

v<SemVer>

That is, the letter v followed by a string which is SemVer 2.0.0 compliant. Here are some examples:

git tag -a v1.2.3
git tag -a v0.1.0
git tag -a v1.0.0-rc.1

Here are some incorrect examples:

git tag -a 1.2.3        // missing v prefix
git tag -a v1.0         // 1.0 is not SemVer compliant
git tag -a v2.0.0beta3  // also not SemVer compliant

Of course, if you’re using hg, bzr, or another version control system, please adjust as appropriate. This isn’t just for git or GitHub repos.
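For example, the equivalent tags in Mercurial and Bazaar (note that hg tag records the tag in a new changeset):

hg tag v1.2.3
bzr tag v1.2.3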

What do you get for this?

Imagine if godoc.org could show you the documentation for the version of the package you’re using, not just the latest from HEAD.

Now, imagine if godoc.org could not just show you the documentation, but also serve you a tarball or zip file of the source code of that version. Imagine not having to install mercurial just to go get that one dependency that is still on google code (rest in peace), or bitbucket in hg form.

Establishing a single release process for Go projects and adopting semantic versioning will let your favourite Go package management or vendoring tool provide you things like a real upgrade command. Instead of letting you figure out which revision to switch to, SemVer gives tool writers the ability to do things like upgrade a dependency to the latest patch release of version 1.2.
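To make that concrete, here is a minimal sketch, not any real tool’s implementation, of what such an upgrade command could do with a set of SemVer tags. latestPatch and the sample tags are invented for illustration:

package main

import (
        "fmt"
        "sort"
        "strconv"
        "strings"
)

// latestPatch returns the highest patch release of major.minor
// among tags of the form vMAJOR.MINOR.PATCH.
func latestPatch(tags []string, major, minor int) (string, bool) {
        prefix := fmt.Sprintf("v%d.%d.", major, minor)
        var patches []int
        for _, tag := range tags {
                if !strings.HasPrefix(tag, prefix) {
                        continue
                }
                patch, err := strconv.Atoi(strings.TrimPrefix(tag, prefix))
                if err != nil {
                        continue // skip pre-releases like v1.2.3-rc.1
                }
                patches = append(patches, patch)
        }
        if len(patches) == 0 {
                return "", false
        }
        sort.Ints(patches)
        return fmt.Sprintf("%s%d", prefix, patches[len(patches)-1]), true
}

func main() {
        tags := []string{"v1.2.0", "v1.2.1", "v1.3.0", "v1.2.2", "v1.2.3-rc.1"}
        if tag, ok := latestPatch(tags, 1, 2); ok {
                fmt.Println(tag) // v1.2.2
        }
}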

Build it and they will come

Tagging releases is pointless if people don’t write tools to consume the information, just as writing tools that can, at the moment, record only git hashes is pointless.

Here’s the deal: if you release your Go projects with correctly formatted tags, then there are a host of developers working on dependency management tools for Go packages who want to consume this information.

How can I declare which versions of other packages my project depends on?

If you’ve read this far you are probably wondering how tagging releases in your own repository is going to help specify the versions of your Go project’s dependencies.

The Go import statement doesn’t contain this version information; all it has is the import path. But whether you’re in the camp that wants to add version information to the import statement, to a comment inside the source file, or would prefer to put that information in a metadata file, everyone needs version information, and that starts with tagging the releases of your Go projects.

No version information, no tools, and the situation never improves. It’s that simple.

Automatically run your package’s tests with inotifywait

This is a short post to illustrate how I use the inotifywait command as a cheap and cheerful way to run my tests automatically on save.

Note: inotify is only available on linux, sorry OS X users.

Step 1. Install inotify-tools

On Debian/Ubuntu, inotifywait and friends live in the inotify-tools package.

% sudo apt-get install inotify-tools

If you live in an RPM universe the package name will hopefully be similar.

Step 2. Create a helper function

Remembering the full inotifywait incantation can be taxing, so save yourself some effort and define a function in .bashrc (or your shell of choice’s startup script).

watch() { while inotifywait --exclude .swp -e modify -r .; do "$@"; done; }

If you use /usr/bin/watch frequently, you might want to pick another name for this function.

Step 3. Run a command on save

Using tmux (you do use tmux, right?), split the window and run

% watch go test .

Any time that a file in the current working directory is modified, inotifywait will return, which runs the command you provided, then loops back around.

watch will trigger on a modification to anything in the current working directory or below it. The command that runs when inotifywait detects a modification can be anything you like. For example you could be working in one package inside your project, and have watch rebuild all the commands any time you save, like this:

% cd $GOPATH/src/github.com/you/yourproject/pkg/pkg/pkg
% watch go install -v github.com/you/yourproject/cmd/...
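One caveat: some editors save by writing a temporary file and renaming it, which can fire modify several times per save. inotifywait also supports the close_write event, which may suit better; a variant, renamed so it doesn’t clobber the original:

watchw() { while inotifywait --exclude .swp -e close_write -r .; do "$@"; done; }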

Stack traces and the errors package

A few months ago I gave a presentation on my philosophy for error handling. In the talk I introduced a small errors package designed to support the ideas presented in the talk.

This post is an update to my previous blog post which reflects the changes in the errors package as I’ve put it into service in my own projects.

Wrapping and stack traces

In my April presentation I gave examples of using the Wrap function to produce an annotated error that could be unwrapped for inspection, yet mirrored the recommendations from Kernighan and Donovan’s book.

package main

import "fmt"
import "github.com/pkg/errors"

func main() {
        err := errors.New("error")
        err = errors.Wrap(err, "open failed")
        err = errors.Wrap(err, "read config failed")

        fmt.Println(err) // read config failed: open failed: error
}

Wrapping an error added context to the underlying error and recorded the file and line at which the error occurred. This file and line information could be retrieved via a helper function, Fprint, to give a trace of the execution path leading away from the error. More on that later.

However, when I came to integrate the errors package into my own projects, I found that using Wrap at each call site in the return path often felt redundant. For example:

func readconfig(file string) error {
        if err := openfile(file); err != nil {
                return errors.Wrap(err, "read config failed")
        }
        // ...
}

If openfile failed it would likely annotate the error it returned with open failed, and that error would also include the file and line of the openfile function. Similarly, readconfig’s wrapped error would be annotated with read config failed as well as the file and line of the call to errors.Wrap inside the readconfig function.

I realised that, at least in my own code, it is likely that the name of the function contains sufficient information to frequently make the additional context passed to Wrap redundant. But as Wrap requires a message, even if I had nothing useful to add, I’d still have to pass something:

if err != nil {
        return errors.Wrap(err, "") // ewww
}

I briefly considered making Wrap variadic–to make the second parameter optional–before realising that rather than forcing the user to manually annotate each stack frame in the return path, I can just record the entire stack trace at the point that an error is created by the errors package.

I believe that for 90% of the use cases, this natural stack trace–that is the trace collected at the point New or Errorf are called–is correct with respect to the information required to investigate the error’s cause. In the other cases, Wrap and Wrapf can be used to add context when needed.
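In practice this means that, inside your own code, intermediate frames can simply return the error. A sketch, reusing the hypothetical openfile from the earlier example:

func readconfig(file string) error {
        if err := openfile(file); err != nil {
                // the stack trace was captured when openfile's error
                // was created with errors.New, so no annotation is needed
                return err
        }
        // ...
        return nil
}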

This led to a large internal refactor of the package to collect and expose this natural stack trace.

Fprint and Print have been removed

As mentioned earlier, the mechanism for printing not just the err.Error() text of an error, but also its stack trace, has also changed with feedback from early users.

The first attempts were a pair of functions: Print(err error), which printed the detailed error to os.Stderr, and Fprint(w io.Writer, err error), which did the same but allowed the caller to control the destination. Neither was very popular.

Print was removed in version 0.4.0 because it was just a wrapper around Fprint(os.Stderr, err) and was hard to test, harder to write an example test for, and didn’t feel like its three lines paid their way. However, with Print gone, users were unhappy that Fprint required you to pass an io.Writer, usually a bytes.Buffer, just to retrieve a string form of the error’s trace.

So, Print and Fprint were the wrong API. They were too opinionated, without it being a useful opinion. Fprint has been slowly gutted over the period of 0.5, 0.6 and now has been replaced with a much more powerful facility inspired by Chris Hines’ go-stack/stack package.

The errors package now leverages the powerful fmt.Formatter interface to allow it to customise its output when any error generated, or wrapped by this package, is passed to fmt.Printf. This extended format is activated by the %+v verb. For example,

func main() {
        err := parseArgs(os.Args[1:])
        fmt.Printf("%v\n", err)
}

Prints, as expected,

not enough arguments, expected at least 3, got 0

However if we change the formatting verb to %+v,

func main() {
        err := parseArgs(os.Args[1:])
        fmt.Printf("%+v\n", err)
}

the same error value now results in

not enough arguments, expected at least 3, got 0
main.parseArgs
        /home/dfc/src/github.com/pkg/errors/_examples/wrap/main.go:12
main.main
        /home/dfc/src/github.com/pkg/errors/_examples/wrap/main.go:18
runtime.main
        /home/dfc/go/src/runtime/proc.go:183
runtime.goexit
        /home/dfc/go/src/runtime/asm_amd64.s:2059

For those who need more control, the Cause and StackTrace behaviours return values which have their own fmt.Formatter implementations. The latter is an alias for a slice of Frame values which represent each frame in a call stack. Again, Frame implements several fmt.Formatter verbs that allow its output to be customised as required.
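For example, a caller could recover the frames with something like this sketch (the stackTracer interface is asserted by the caller, not exported by the package):

package main

import (
        "fmt"

        "github.com/pkg/errors"
)

// stackTracer describes the StackTrace behaviour of errors
// created by github.com/pkg/errors.
type stackTracer interface {
        StackTrace() errors.StackTrace
}

func main() {
        err := errors.New("whoops")
        if st, ok := err.(stackTracer); ok {
                // %+v prints each Frame as function, file and line
                fmt.Printf("%+v\n", st.StackTrace()[:2]) // top two frames
        }
}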

Putting it all together

With the changes to the errors package, some guidelines on how to use the package are in order.

  • In your own code, use errors.New or errors.Errorf at the point an error occurs.
    func parseArgs(args []string) error {
            if len(args) < 3 {
                    return errors.Errorf("not enough arguments, expected at least 3, got %d", len(args))
            }
            // ...
    }
  • If you receive an error from another function, it is often sufficient to simply return it.
    if err != nil {
           return err
    }
  • If you interact with a package from another repository, consider using errors.Wrap or errors.Wrapf to establish a stack trace at that point. This advice also applies when interacting with the standard library.
    f, err := os.Open(path)
    if err != nil {
            return errors.Wrapf(err, "failed to open %q", path)
    }
  • Always return errors to their caller rather than logging them throughout your program.
  • At the top level of your program, or worker goroutine, use %+v to print the error with sufficient detail.
    func main() {
            err := app.Run()
            if err != nil {
                    fmt.Printf("FATAL: %+v\n", err)
                    os.Exit(1)
            }
    }
  • If you want to exclude some classes of error from printing, use errors.Cause to unwrap errors before inspecting them.

Conclusion

The errors package, from the point of view of its four package level functions, New, Errorf, Wrap, and Wrapf, is done. Their signatures are well tested, and now that the package has been integrated into over 100 other packages, they are unlikely to change at this point.

The extended stack trace format, %+v, is still very new and I encourage you to try it and leave feedback via an issue.

Test fixtures in Go

This is a quick post to describe how you can use test fixtures, data files on disk, with the Go testing package. Using fixtures with the Go testing package is quite straightforward because of two convenience features built into the go tool.

First, when you run go test, for each package in scope, the test binary will be executed with its working directory set to the source directory of the package under test. Consider this test in the example package:

package example

import (
        "os"
        "testing"
)

func TestWorkingDirectory(t *testing.T) {
        wd, _ := os.Getwd()
        t.Log(wd)
}

Running this from a random directory, and remembering that go test takes an import path resolved inside your $GOPATH, results in:

% pwd
/tmp
% go test -v github.com/davecheney/example
=== RUN TestWorkingDirectory
--- PASS: TestWorkingDirectory (0.00s)
        example_test.go:10: /Users/dfc/src/github.com/davecheney/example
PASS
ok      github.com/davecheney/example   0.013s

Second, the Go tool will ignore any directory in your $GOPATH that starts with a period, an underscore, or matches the word testdata.

Putting this together, locating a fixture from your test code is as simple as

f, err := os.Open("testdata/somefixture.json")

(technically this code should use filepath.Join but in these simple cases Windows copes fine with the forward slash). Here are some random examples from the standard library:

  1. debug/elf
  2. net/http
  3. image
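Putting the pieces together, a complete test that consumes a fixture might look like the following sketch (ParseConfig stands in for whatever your package actually exports):

package example

import (
        "io/ioutil"
        "path/filepath"
        "testing"
)

func TestParseConfigFixture(t *testing.T) {
        path := filepath.Join("testdata", "somefixture.json")
        data, err := ioutil.ReadFile(path)
        if err != nil {
                t.Fatalf("could not read fixture: %v", err)
        }
        if _, err := ParseConfig(data); err != nil {
                t.Fatalf("ParseConfig(%s): %v", path, err)
        }
}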

Happy testing!

Don’t just check errors, handle them gracefully

This post is an extract from my presentation at the recent GoCon spring conference in Tokyo, Japan.



Errors are just values

I’ve spent a lot of time thinking about the best way to handle errors in Go programs. I really wanted there to be a single way to do error handling, something that we could teach all Go programmers by rote, just as we might teach mathematics, or the alphabet.

However, I have concluded that there is no single way to handle errors. Instead, I believe Go’s error handling can be classified into three core strategies.

Sentinel errors

The first category of error handling is what I call sentinel errors.

if err == ErrSomething { … }

The name descends from the practice in computer programming of using a specific value to signify that no further processing is possible. So too with Go: we use specific values to signify an error.

Examples include values like io.EOF or low level errors like the constants in the syscall package, like syscall.ENOENT.

There are even sentinel errors that signify that an error did not occur, like go/build.NoGoError, and path/filepath.SkipDir from path/filepath.Walk.

Using sentinel values is the least flexible error handling strategy, as the caller must compare the result to a predeclared value using the equality operator. This presents a problem when you want to provide more context, as returning a different error will break the equality check.

Even something as well meaning as using fmt.Errorf to add some context to the error will defeat the caller’s equality test. Instead the caller will be forced to look at the output of the error’s Error method to see if it matches a specific string.
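The result is callers writing fragile checks like this sketch of the anti-pattern (not a recommendation):

// brittle: breaks the moment the message is reworded
if err != nil && strings.Contains(err.Error(), "no such file") {
        // handle missing file
}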

Never inspect the output of error.Error

As an aside, I believe you should never inspect the output of the error.Error method. The Error method on the error interface exists for humans, not code.

The contents of that string belong in a log file, or displayed on screen. You shouldn’t try to change the behaviour of your program by inspecting it.

I know that sometimes this isn’t possible, and as someone pointed out on twitter, this advice doesn’t apply to writing tests. Nevertheless, comparing the string form of an error is, in my opinion, a code smell, and you should try to avoid it.

Sentinel errors become part of your public API

If your public function or method returns an error of a particular value then that value must be public, and of course documented. This adds to the surface area of your API.

If your API defines an interface which returns a specific error, all implementations of that interface will be restricted to returning only that error, even if they could provide a more descriptive error.

We see this with io.Reader. Functions like io.Copy require a reader implementation to return exactly io.EOF to signal to the caller that there is no more data, even though reaching the end of the stream isn’t really an error.
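The canonical read loop shows why the value must match exactly; a minimal sketch:

// consume drains r. The comparison err == io.EOF only works
// because conforming readers return exactly the io.EOF value.
func consume(r io.Reader, process func([]byte)) error {
        buf := make([]byte, 4096)
        for {
                n, err := r.Read(buf)
                if n > 0 {
                        process(buf[:n])
                }
                if err == io.EOF {
                        return nil // end of stream, not a failure
                }
                if err != nil {
                        return err
                }
        }
}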

Sentinel errors create a dependency between two packages

By far the worst problem with sentinel error values is they create a source code dependency between two packages. As an example, to check if an error is equal to io.EOF, your code must import the io package.

This specific example does not sound so bad, because it is quite common, but imagine the coupling that exists when many packages in your project export error values, which other packages in your project must import to check for specific error conditions.

Having worked in a large project that toyed with this pattern, I can tell you that the spectre of bad design–in the form of an import loop–was never far from our minds.

Conclusion: avoid sentinel errors

So, my advice is to avoid using sentinel error values in the code you write. There are a few cases where they are used in the standard library, but this is not a pattern that you should emulate.

If someone asks you to export an error value from your package, you should politely decline and instead suggest an alternative method, such as the ones I will discuss next.

Error types

Error types are the second form of Go error handling I want to discuss.

if err, ok := err.(SomeType); ok { … }

An error type is a type that you create that implements the error interface. In this example, the MyError type tracks the file and line, as well as a message explaining what happened.

type MyError struct {
        Msg  string
        File string
        Line int
}

func (e *MyError) Error() string {
        return fmt.Sprintf("%s:%d: %s", e.File, e.Line, e.Msg)
}

return &MyError{"Something happened", "server.go", 42}

Because MyError is a type, callers can use a type assertion to extract the extra context from the error.

err := something()
switch err := err.(type) {
case nil:
        // call succeeded, nothing to do
case *MyError:
        fmt.Println("error occurred on line:", err.Line)
default:
        // unknown error
}

A big improvement of error types over error values is their ability to wrap an underlying error to provide more context.

An excellent example of this is the os.PathError type which annotates the underlying error with the operation it was trying to perform, and the file it was trying to use.

// PathError records an error and the operation
// and file path that caused it.
type PathError struct {
        Op   string
        Path string
        Err  error // the cause
}

func (e *PathError) Error() string
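A caller recovers that context with a type assertion; for example:

_, err := os.Open("/no/such/file")
if perr, ok := err.(*os.PathError); ok {
        fmt.Println("op:", perr.Op)     // open
        fmt.Println("path:", perr.Path) // /no/such/file
        fmt.Println("cause:", perr.Err) // no such file or directory
}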

Problems with error types

Because the caller must use a type assertion or type switch to inspect them, error types must be made public.

If your code implements an interface whose contract requires a specific error type, all implementors of that interface need to depend on the package that defines the error type.

This intimate knowledge of a package’s types creates a strong coupling with the caller, making for a brittle API.

Conclusion: avoid error types

While error types are better than sentinel error values, because they can capture more context about what went wrong, error types share many of the problems of error values.

So again my advice is to avoid error types, or at least, avoid making them part of your public API.

Opaque errors

Now we come to the third category of error handling. In my opinion this is the most flexible error handling strategy as it requires the least coupling between your code and caller.

I call this style opaque error handling, because while you know an error occurred, you don’t have the ability to see inside the error. As the caller, all you know about the result of the operation is that it worked, or it didn’t.

This is all there is to opaque error handling–just return the error without assuming anything about its contents. If you adopt this position, then error handling can become significantly more useful as a debugging aid.

import "github.com/quux/bar"

func fn() error {
        x, err := bar.Foo()
        if err != nil {
                return err
        }
        // use x
}

For example, Foo’s contract makes no guarantees about what it will return in the context of an error. The author of Foo is now free to annotate errors that pass through it with additional context without breaking its contract with the caller.

Assert errors for behaviour, not type

In a small number of cases, this binary approach to error handling is not sufficient.

For example, interactions with the world outside your process, like network activity, require that the caller investigate the nature of the error to decide if it is reasonable to retry the operation.

In this case rather than asserting the error is a specific type or value, we can assert that the error implements a particular behaviour. Consider this example:

type temporary interface {
        Temporary() bool
}
 
// IsTemporary returns true if err is temporary.
func IsTemporary(err error) bool {
        te, ok := err.(temporary)
        return ok && te.Temporary()
}

We can pass any error to IsTemporary to determine if the error could be retried.

If the error does not implement the temporary interface, that is, it does not have a Temporary method, then the error is not temporary.

If the error does implement Temporary, then perhaps the caller can retry the operation if Temporary returns true.

The key here is that this logic can be implemented without importing the package that defines the error, or indeed knowing anything about err’s underlying type; we’re simply interested in its behaviour.
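A sketch of how a caller might use it; dial and the retry policy here are invented for illustration:

func dialWithRetry(dial func() error) error {
        for attempt := 0; attempt < 3; attempt++ {
                err := dial()
                if err == nil {
                        return nil
                }
                if !IsTemporary(err) {
                        return err // permanent failure, give up immediately
                }
                time.Sleep(100 * time.Millisecond) // crude backoff
        }
        return errors.New("giving up after 3 attempts")
}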

Don’t just check errors, handle them gracefully

This brings me to a second Go proverb that I want to talk about; don’t just check errors, handle them gracefully. Can you suggest some problems with the following piece of code?

func AuthenticateRequest(r *Request) error {
        err := authenticate(r.User)
        if err != nil {
                return err
        }
        return nil
}

An obvious suggestion is that the five lines of the function could be replaced with

return authenticate(r.User)

But this is the simple stuff that everyone should be catching in code review. More fundamentally, the problem with this code is that I cannot tell where the original error came from.

If authenticate returns an error, then AuthenticateRequest will return the error to its caller, who will probably do the same, and so on. At the top of the program the main body of the program will print the error to the screen or a log file, and all that will be printed is:

no such file or directory

There is no information about the file and line where the error was generated. There is no stack trace of the call stack leading up to the error. The author of this code will be forced into a long session of bisecting their code to discover which code path triggered the file not found error.

Donovan and Kernighan’s The Go Programming Language recommends that you add context to the error path using fmt.Errorf:

func AuthenticateRequest(r *Request) error {
        err := authenticate(r.User)
        if err != nil {
                return fmt.Errorf("authenticate failed: %v", err)
        }
        return nil
}

But as we saw earlier, this pattern is incompatible with the use of sentinel error values or type assertions, because converting the error value to a string, merging it with another string, then converting it back to an error with fmt.Errorf breaks equality and destroys any context in the original error.

Annotating errors

I’d like to suggest a method to add context to errors, and to do that I’m going to introduce a simple package. The code is online at github.com/pkg/errors. The errors package has two main functions:

// Wrap annotates cause with a message.
func Wrap(cause error, message string) error

The first function is Wrap, which takes an error, and a message and produces a new error.

// Cause unwraps an annotated error.
func Cause(err error) error

The second function is Cause, which takes an error that has possibly been wrapped, and unwraps it to recover the original error.

Using these two functions, we can now annotate any error, and recover the underlying error if we need to inspect it. Consider this example of a function that reads the content of a file into memory.

func ReadFile(path string) ([]byte, error) {
        f, err := os.Open(path)
        if err != nil {
                return nil, errors.Wrap(err, "open failed")
        } 
        defer f.Close()
 
        buf, err := ioutil.ReadAll(f)
        if err != nil {
                return nil, errors.Wrap(err, "read failed")
        }
        return buf, nil
}

We’ll use this function to write a function to read a config file, then call that from main.

func ReadConfig() ([]byte, error) {
        home := os.Getenv("HOME")
        config, err := ReadFile(filepath.Join(home, ".settings.xml"))
        return config, errors.Wrap(err, "could not read config")
}
 
func main() {
        _, err := ReadConfig()
        if err != nil {
                fmt.Println(err)
                os.Exit(1)
        }
}

If the ReadConfig code path fails, because we used errors.Wrap, we get a nicely annotated error in the K&D style.

could not read config: open failed: open /Users/dfc/.settings.xml: no such file or directory

Because errors.Wrap produces a stack of errors, we can inspect that stack for additional debugging information. This is the same example again, but this time we replace fmt.Println with errors.Print:

func main() {
        _, err := ReadConfig()
        if err != nil {
                errors.Print(err)
                os.Exit(1)
        }
}

We’ll get something like this:

readfile.go:27: could not read config
readfile.go:14: open failed
open /Users/dfc/.settings.xml: no such file or directory

The first line comes from ReadConfig, the second comes from the os.Open part of ReadFile, and the remainder comes from the os package itself, which does not carry location information.

Now we’ve introduced the concept of wrapping errors to produce a stack, we need to talk about the reverse, unwrapping them. This is the domain of the errors.Cause function.

// IsTemporary returns true if err is temporary.
func IsTemporary(err error) bool {
        te, ok := errors.Cause(err).(temporary)
        return ok && te.Temporary()
}

In operation, whenever you need to check that an error matches a specific value or type, you should first recover the original error using the errors.Cause function.
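For example, to test for a sentinel defined in another package, unwrap first. A sketch assuming a database/sql lookup:

// errors.Cause returns the innermost error for comparison
if errors.Cause(err) == sql.ErrNoRows {
        // no matching row; handle the absence rather than failing
}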

Only handle errors once

Lastly, I want to mention that you should only handle errors once. Handling an error means inspecting the error value, and making a decision.

func Write(w io.Writer, buf []byte) {
        w.Write(buf)
}

If you make less than one decision, you’re ignoring the error. As we see here, the error from w.Write is being discarded.

But making more than one decision in response to a single error is also problematic.

func Write(w io.Writer, buf []byte) error {
        _, err := w.Write(buf)
        if err != nil {
                // annotated error goes to log file
                log.Println("unable to write:", err)
 
                // unannotated error returned to caller
                return err
        }
        return nil
}

In this example, if an error occurs during Write, a line will be written to a log file, noting the file and line where the error occurred, and the error is also returned to the caller, who will possibly log it and return it, all the way back up to the top of the program.

So you get a stack of duplicate lines in your log file, but at the top of the program you get the original error without any context. Java anyone?

Instead, wrap the error once with context and return it:

func Write(w io.Writer, buf []byte) error {
        _, err := w.Write(buf)
        return errors.Wrap(err, "write failed")
}

Using the errors package gives you the ability to add context to error values, in a way that is inspectable by both a human and a machine.

Conclusion

In conclusion, errors are part of your package’s public API; treat them with as much care as you would any other part of your public API.

For maximum flexibility I recommend that you try to treat all errors as opaque. In the situations where you cannot do that, assert errors for behaviour, not type or value.

Minimise the number of sentinel error values in your program and convert errors to opaque errors by wrapping them with errors.Wrap as soon as they occur.

Finally, use errors.Cause to recover the underlying error if you need to inspect it.

The value of TDD

What is the value of test driven development?

Is the value writing tests at the same time as you write the code? Sure, I like that property. It means that at any time you’re one control-Z away from your tests passing: either revert your test change, or fix the code so the tests pass. The nice property of this method is that once you’ve implemented your feature, by definition, it’s already tested. Push that branch and lean in for the code review.

Another important property of TDD is it forces you to think about writing code that is testable, as a first class citizen. You don’t add testing after the fact, in the same way you don’t add performance or security after the code is “done” — right?

But for me, the most important property of TDD is it forces you to write your tests as a consumer of your own code, making you think about its API, continuously.

Many times people have said to me that they like the idea of TDD in principle, but have found they felt slower when they tried it. I understand completely. TDD does slow you down if you don’t have a design to work from. TDD doesn’t relieve you of the responsibility of designing your code first.

How much design you do is really up to you, but if you find yourself in the situation where you find TDD is slowing you down because you’re fighting the double whammy of changing the code and the tests at the same time, that’s a sure fire sign that you’ve run off the edge of your design map.

Robert Martin says you should not write a line of production code without a failing unit test–the key word is production code. It’s 100% OK to skip writing tests while you’re exploring the design space, just remember to budget time to rewrite this code in a TDD fashion. The good news is it won’t take you very long; you’ve already designed the code, and built one to throw away.

Constant errors

This is a thought experiment about sentinel error values in Go.

Sentinel errors are bad; they introduce strong source and run time coupling, but are sometimes necessary. io.EOF is one of these sentinel values. Ideally a sentinel value should behave as a constant; that is, it should be immutable and fungible.

The first problem is io.EOF is a public variable–any code that imports the io package could change the value of io.EOF. It turns out that most of the time this isn’t a big deal, but it could be a very confusing problem to debug.

fmt.Println(io.EOF == io.EOF) // true
x := io.EOF
fmt.Println(io.EOF == x)      // true
	
io.EOF = fmt.Errorf("whoops")
fmt.Println(io.EOF == io.EOF) // true
fmt.Println(x == io.EOF)      // false

The second problem is io.EOF behaves like a singleton, not a constant. Even if we follow the exact procedure used by the io package to create our own EOF value, the two values are not equal.

err := errors.New("EOF")   // io/io.go line 38
fmt.Println(io.EOF == err) // false

Combine these properties and you have a set of weird behaviours stemming from the fact that sentinel error values in Go, those traditionally created with errors.New or fmt.Errorf, are not constants.

Constant errors

Before I introduce my solution, let’s recap how the error interface works in Go. Any type with an Error() string method fulfils the error interface. This includes types whose underlying type is a primitive like string, which means their values can be constants.

With that background, consider this error implementation.

type Error string

func (e Error) Error() string { return string(e) }

It looks similar to the errors.errorString implementation that powers errors.New. However, unlike errors.errorString, values of this type can be constant expressions.

const err = Error("EOF") 
const err2 = errorString{"EOF"} // const initializer errorString literal is not a constant

As constants of the Error type are not variables, they are immutable.

const err = Error("EOF") 
err = Error("not EOF") // error, cannot assign to err

Additionally, two constant strings are always equal if their contents are equal, which means two Error values with the same contents are equal.

const err = Error("EOF") 
fmt.Println(err == Error("EOF")) // true

Said another way, equal Error values are the same, in the way that the constant 1 is the same as every other constant 1.

const eof = Error("eof")

type Reader struct{}

func (r *Reader) Read([]byte) (int, error) {
        return 0, eof
}

func main() {
        var r Reader
        _, err := r.Read([]byte{})
        fmt.Println(err == eof) // true
}

Could we change the definition of io.EOF to be a constant? It turns out that this compiles just fine and passes all the tests, but it’s probably a stretch for the Go 1 contract.
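For the record, the hypothetical change would look something like this sketch, not the actual io package source:

// in a hypothetical io package

type ioError string

func (e ioError) Error() string { return string(e) }

// EOF is now a constant rather than a variable.
const EOF = ioError("EOF")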

However this does not prevent you from using this idiom in your own code. Although, you really shouldn’t be using sentinel errors anyway.

Go 1.7 toolchain improvements

This is a progress report on the Go toolchain improvements during the 1.7 development cycle. All measurements were taken using a Thinkpad x220, Core i5-2520M, running Ubuntu 14.04 linux.

Faster compilation

Since Go 1.5, when the compiler itself was translated from C to Go, compile times are slower than they used to be. Everyone knows it, nobody is happy about it, and we’re working on fixing it.

A huge amount of effort in the 1.7 cycle has gone into reducing the amount of memory and the wall time the compiler uses for various benchmark jobs. The results so far are:

[Figure: Compile time for full build relative to Go 1.4.3]

Previously I reported the current 1.7 compiler was 2x slower than Go 1.4.3 for the Jujud test. After re-benchmarking everything for this post, the slowdown is closer to 2.2x. Jujud is the largest of the three benchmarks–512 packages, vs 304 and 102 packages respectively–and shows the largest slowdown.

The cmd/go result is a little misleading as the code being compiled changes with every release, vs the fixed codebases of the other benchmarks.

Note: The benchmark scripts for jujud, kube-controller-manager, and gogs are online. Please try them yourself and report your findings.

Improved linker performance

A significant part of the build time improvements observed above come from improvements to the linker. Relative to Go 1.6, the linker is now roughly 66% faster.

[Figure: Link time relative to Go 1.4.3]

Relative to Go 1.4.3, linking is 10% faster for any non trivial binary, and up to 30% faster for large binaries like jujud. These figures are for ELF targets only; Mach-O and PE targets have not improved as much.

This isn’t just useful for building final binaries. A faster linker improves the edit/compile/test cycle as each test is itself a program that must be linked and run. Anecdotally the linker now uses a third less memory, which is valuable when linking large binaries.

Smaller binaries

With code generation and linker improvements, binaries produced by the tip compiler are substantially smaller than Go 1.6. This work has been spearheaded by David Crawshaw.

[Figure: Binary size in bytes]

At this point, with the exception of its own cmd/go tool, Go 1.7 produces smaller binaries than Go 1.4.3.

Code generation improvements

The big feature for the Go 1.7 cycle is the new SSA backend for 64-bit Intel.

While not the focus of this post, it would be remiss not to include some information about performance improvements in compiled code (not just the compiler’s own code). Stressing, of course, that these are preliminary figures, as there are still four months to go before the new backend becomes the default for amd64.

[Figure: Go 1 benchmarks, Go 1.6 vs 683448a]

These numbers match the figures reported by Keith Randall a few weeks ago, and are in line with his thesis in the SSA design doc.

I think it would be fairly easy to make the generated programs 20% smaller and 10% faster. — khr

These improvements are not just the work of the SSA backend. The standard library and garbage collector continue to see improvements, including a 20% improvement to the fmt package by Martin Möhrmann. These benefits flow to all the platforms that Go supports.

The sole regression above is caused by a current limitation in the register optimiser, which manages to registerise one fewer variable in the Mandelbrot inner loop.

Looking ahead

According to the release schedule, approximately one month remains before the 1.7 change window closes and the dev cycle enters the bug fix phase. There is still lots of work to do, but the improvements so far will easily make Go 1.7 the best Go release to date.

Threads are a strange abstraction

When you think about it, threads are a strange abstraction.

From the programmer’s point of view, threads are great. It’s as if you can create virtual CPUs, on the fly, and the operating system will take care of simulating these virtual CPUs on top of real ones.

But on an implementation level, what is a simple abstraction can lead you, the programmer, into a trap. You don’t have an infinite number of CPUs; applying threaded abstractions in an unbounded manner invites you to overwhelm the real CPUs if you try to actually use all of these virtual CPUs simultaneously, or to exhaust the address space, thanks to the overhead needed to maintain the illusion, if they sit idle.

Whenever threads are used as a concurrency primitive in anger, careful tuning of the number of threads in use by the program, the amount of memory each thread is allocated, or both, is required. So much for abstraction.

Go’s goroutines sit right in the middle: the same programmer interface as threads, a nice imperative coding model, but also an efficient implementation model based on coroutines.

Go project contributors by the numbers

In December 2014 the Go project moved from Google Code to GitHub. Along with the move to GitHub, the Go project moved from Mercurial to Git, which necessitated a move away from Rietveld to Gerrit for code review.

A healthy open source project lives and dies by its contributors. People come and people go as time, circumstance, their jobs, and their interests change. I wanted to investigate if these moves were a net positive for the project.

Go project contributors

As of writing 744 people have contributed to the Go project. This number is slightly lower than OpenHub‘s count, which also includes the golang.org/x sub-repositories that this analysis ignores. The number is slightly higher than GitHub‘s count for reasons which are unclear.

[Figure: Go project contributors]

This graph shows contributors over time. As a new contributor lands their first commit, a line representing the current count of contributors is placed on the graph. The graph also records the number of contributors whose email addresses end in golang.org or google.com as a proxy for the Go team and Google employees respectively.
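The graphs were generated from the project’s commit history. This isn’t the script that produced them, but a rough approximation of the raw counts is possible with git alone:

% git log --format='%aE' | sort -u | wc -l
% git log --format='%aE' | sort -u | grep -cE '@(golang\.org|google\.com)$'

The first command counts distinct commit author addresses, a proxy for contributors; the second counts those ending in golang.org or google.com.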

Did the move to Git, Gerrit, and GitHub increase the number of contributors, and thereby contributions? Almost certainly. The graph shows a clear up-tick in the number of new contributors to the project from January 2015 onwards. However with so many factors in play it is not possible to identify a single cause.

It should also be noted that even prior to the move, the project has always attracted a healthy stream of new contributors.

Runtime contributors

[Figure: Runtime contributors]

From June 2014 to December 2014, the Go runtime was rewritten (with the help of some automated tools) from C to Go. This graph records the number of contributors to the parts of the standard library considered the runtime. This has changed over time as paths have been renamed, but is currently what we think of as the tree rooted at $GOROOT/src/runtime.

Did the rewrite of the runtime from a dialect of C, compiled with the project’s own C compiler, to Go increase the number of individual contributors willing to work on the runtime? Unfortunately not, although the number of Googlers contributing to the runtime did increase slightly. An increase in individual runtime contributions did not occur until the following January, once the move to GitHub was complete.

Compiler contributors

[Figure: Compiler contributors]

After the move to GitHub, the Go compiler was translated from C to Go for the Go 1.5 release. This process was completed by May 2015. Did this have an impact on the number of new contributors specifically targeting the compiler? Possibly; there was a short-lived spurt of new contributions around July 2015. The reason could be the strong message from the Go team that the 1.5 release was focused on the runtime, specifically the garbage collector, and that the then-current compiler was to be replaced with the SSA back end being developed for the 1.6 time frame (since delayed until 1.7).

This latter point is supported by the clear spike in contributors after February 2016, when the 1.7 tree opened for development with the SSA back end attracting a number of talented new contributors.

An open source project

At the moment there is no question that the largest contributors by total commits to Go are the Go team themselves. This stands to reason as they are sponsored by Google itself to develop the language. However, of the current top 16 contributors to the project, the 7th, 9th, 11th, 14th, and 16th are neither members of the Go team nor employed by Google.

The charge is commonly levelled at Google that Go is not an open source project. This analysis shows that claim to be false. The number of contributors from the Go team, or Google, continues to be a dwindling fraction of the total number of contributors to the project.