Category Archives: Go

Lost in translation

Over the last year I have had the privilege of travelling to meet Go communities in Japan, Korea and India. In every instance I have met experienced, passionate, pragmatic programmers ready to accept Go for what it can do for them.

At the same time the message from each of these communities was the same: where is the documentation, where are the examples, where are the tutorials ? Travelling to these communities has been a humbling experience and has made me realise my privileged position as a native English speaker.

The documentation, and the tutorials, and the examples will come, slowly; this is open source after all. But what I can offer is the fact that all the content on this blog is licensed under a Creative Commons licence.

In short, I don’t have the skills, but if you do, you are welcome to translate any content on this site, and I’ll help you in any way I can.

 

Thanks Brainman

This is a short post to recognise the incredible contribution Alex Brainman has made to the Go project.

Alex was responsible for the port of Go to Windows way back before Go 1 was even released. Since that time he has virtually single-handedly supported Go and Go users on Windows. It’s no wonder that he is the 10th most active contributor to the project.

The Windows build is consistently the most popular download from the official go site.

While I may not use Windows, and you may not use Windows, spare a thought for the large body of developers who do use Windows to develop Go programs and are able to do so because of Alex’s efforts.

Even if your entire business doesn’t use Windows, consider the moment when your product manager comes to you and asks “so, we’ve got a request from a big customer to port our product to Windows, that’s not going to be hard, right ?”. Your answer is directly attributable to Alex’s contributions.

Alex, every Go programmer owes you a huge debt of gratitude. So let me be the first to say it, thank you for everything you have done for Go. None of us would be as successful as we are today without your work.

Errors and Exceptions, redux

In my previous post, I doubled down on my claim that Go’s error handling strategy is, on balance, the best.

In this post, I wanted to take this a bit further, and prove that multiple returns and error values are the best.

When I say best, I obviously mean, of the set of choices available to programmers that write real world programs — because real world programs have to handle things going wrong.

The language we have

I am only going to use the Go language that we have today, not any version of the language which might be available in the future; it simply isn’t practical to hold my breath for that long. As I will show, additions to the language like, dare I say, exceptions, would not change the outcome.

A simple problem

For this discussion, I’m going to start with a made up, but very simple function, which demonstrates the requirement for error handling.

package main

import "fmt"

// Positive returns true if the number is positive, false if it is negative.
func Positive(n int) bool {
        return n > -1
}

func Check(n int) {
        if Positive(n) {
                fmt.Println(n, "is positive")
        } else {
                fmt.Println(n, "is negative")
        }
}

func main() {
	Check(1)
	Check(0)
	Check(-1)
}

If you run this code, you get the following output

1 is positive
0 is positive
-1 is negative

which is wrong.

How can this single line function be wrong ? It is wrong because zero is neither positive nor negative, and that cannot be accurately captured by the boolean return value from Positive.

This is a contrived example, but hopefully one that can be adapted to discuss the costs and benefits of the various methods of error handling.

Preconditions

No matter what solution is determined to be the best, a check will have to be added to Positive to test the non zero precondition. Here is an example with the precondition added

// Positive returns true if the number is positive, false if it is negative.
// The second return value indicates if the result is valid, which in the case
// of n == 0, is not valid.
func Positive(n int) (bool, bool) {
        if n == 0 {
                return false, false
        }
        return n > -1, true
}

func Check(n int) {
        pos, ok := Positive(n)
        if !ok {
                fmt.Println(n, "is neither")
                return
        }
        if pos {
                fmt.Println(n, "is positive")
        } else {
                fmt.Println(n, "is negative")
        }
}

Running this program we see that the bug is fixed,

1 is positive
0 is neither
-1 is negative

albeit in an ungainly way. For those interested, I also tried a version using a switch which was harder to read for the saving of one line of code.

This then is the baseline to compare other solutions.

Error

Returning a boolean is uncommon; it’s far more common to return an error value, even if the set of errors is fixed. For completeness, and because this simple example is supposed to hold up in more complex circumstances, here is an example using a value that conforms to the error interface.

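// Positive returns true if the number is positive, false if it is negative.
// It returns a non-nil error when n is 0. This version also needs the
// errors package added to the imports.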
func Positive(n int) (bool, error) {
        if n == 0 {
                return false, errors.New("undefined")
        }
        return n > -1, nil
}

func Check(n int) {
        pos, err := Positive(n)
        if err != nil {
                fmt.Println(n, err)
                return
        }
        if pos {
                fmt.Println(n, "is positive")
        } else {
                fmt.Println(n, "is negative")
        }
}

The result is a function which performs the same, and the caller must check the result in a near identical way.

If anything, this underlines the flexibility of Go’s errors are values methodology. When the result only needs to indicate success or failure (think of the two result form of map lookup), a boolean can be substituted for an interface value, which removes any confusion arising from typed nils and the nilness of interface values.
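As a minimal sketch of the typed nil confusion mentioned above, the following program contrasts an error interface holding a nil pointer with the plain boolean of a two result map lookup. The names myErr, typedNil and lookup are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// myErr is a hypothetical error type used only for this illustration.
type myErr struct{}

func (*myErr) Error() string { return "my error" }

// typedNil returns a nil *myErr stored in an error interface value.
// The interface now holds a type (*myErr) and a nil pointer, so it does
// not compare equal to nil.
func typedNil() error {
	var e *myErr // nil pointer
	return e
}

// lookup uses the two result form of map lookup. ok is a plain bool, so
// there is no interface value to be confused about.
func lookup(m map[string]int, k string) (int, bool) {
	v, ok := m[k]
	return v, ok
}

func main() {
	// Reports that the error is non-nil even though the pointer inside is nil.
	fmt.Println("typedNil() != nil:", typedNil() != nil)

	_, ok := lookup(map[string]int{"a": 1}, "b")
	fmt.Println("found b:", ok)
}
```

A caller testing the boolean cannot get this wrong, whereas a caller testing the interface value has to trust that the callee never returned a typed nil.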

More boolean

Here is an example which allows Positive to return three states, true, false, and nil (Anyone with a background in set theory or SQL will be twitching at this point).

// If the result is not nil, the result is true if the number is
// positive, false if it is negative.
func Positive(n int) *bool {
        if n == 0 {
                return nil
        }
        r := n > -1
        return &r
}

func Check(n int) {
        pos := Positive(n)
        if pos == nil {
                fmt.Println(n, "is neither")
                return
        }
        if *pos {
                fmt.Println(n, "is positive")
        } else {
                fmt.Println(n, "is negative")
        }
}

Positive has grown another line, because of the requirement to capture the address of the result of the comparison.

Worse, now before the return value can be used anywhere, it must be checked to make sure that it points to a valid address. This is the situation that Java developers face constantly and leads to deep seated hatred of nil (with good reason). This clearly isn’t a viable solution.

Let’s try panicking

For completeness, let’s look at a version of this code that tries to simulate exceptions using panic.

// Positive returns true if the number is positive, false if it is negative.
// In the case that n is 0, Positive will panic.
func Positive(n int) bool {
        if n == 0 {
                panic("undefined")
        }
        return n > -1
}

func Check(n int) {
        defer func() {
                if recover() != nil {
                        fmt.Println(n, "is neither")
                }
        }()
        if Positive(n) {
                fmt.Println(n, "is positive")
        } else {
                fmt.Println(n, "is negative")
        }
}

… this is just getting worse.

Not exceptional

For the truly exceptional cases, the ones that represent either unrecoverable programming mistakes, like index out of bounds, or unrecoverable environmental problems, like running out of stack, we have panic.

All of the remaining cases, any error conditions that you will encounter in a Go program, are by definition not exceptional. You expect them because, whether you return a boolean, return an error, or panic, the condition is the result of a test in your code.

Forgetting to check

I consider the argument that “developers forget to check error codes” to be cancelled out by the counter argument that “developers forget to handle exceptions”. Either may be true, depending on the language you are basing your argument on, but neither commands a winning position.

With that said, you only need to check the error value if you care about the result.
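One way to read this, sketched under my own assumptions rather than taken from the post: when the caller goes on to use the returned value, the error must be checked first; when a call is made only for a best-effort side effect, there may be nothing useful to do with its error. parsePort is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parsePort is a hypothetical helper. The caller needs the number, so the
// error from strconv.Atoi is checked before port is used.
func parsePort(s string) (int, error) {
	port, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %v", s, err)
	}
	return port, nil
}

func main() {
	port, err := parsePort("8080")
	if err != nil {
		// A best-effort diagnostic: if writing to stderr fails there is
		// nothing sensible left to do, so its results are ignored.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("listening on port", port)
}
```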

Knowing the difference between which errors to ignore and which to check is why we’re paid as professionals.

Conclusion

I have shown in this article that multiple returns and error values are the simplest, and most reliable, to use. They are easier to use than any other form of error handling, including ones that do not even exist in Go as it stands today.

A challenge

So this is the best demonstration I can come up with, but I expect others can do better, particularly where the monadic style is used. I look forward to your feedback.

Inspecting errors

The common contract for functions which return a value of the interface type error, is the caller should not presume anything about the state of the other values returned from that call without first checking the error.

In the majority of cases, error values returned from functions should be opaque to the caller. That is to say, a test that error is nil indicates if the call succeeded or failed, and that’s all there is to it.

A small number of cases, generally revolving around interactions with the world outside your process, like network activity, require that the caller investigate the nature of the error to decide if it is reasonable to retry the operation.

A common request for package authors is to return errors of a known public type, so the caller can type assert and inspect them. I believe this practice leads to a number of undesirable outcomes:

  • Public error types increase the surface area of the package’s API.
  • New implementations must only return types specified in the interface’s declaration, even if they are a poor fit.
  • The error type cannot be changed or deprecated after introduction without breaking compatibility, making for a brittle API.

Callers should feel no more comfortable asserting an error is a particular type than they would be asserting the string returned from Error() matches a particular pattern.

Instead I present a suggestion that permits package authors and consumers to communicate about their intention, without having to overly couple their implementation to the caller.

Assert errors for behaviour, not type

Don’t assert an error value is a specific type, but rather assert that the value implements a particular behaviour.

This suggestion fits the “has a” nature of Go’s implicit interfaces, rather than the “is a [subtype of]” nature of inheritance based languages. Consider this example:

func isTimeout(err error) bool {
        type timeout interface {
                Timeout() bool
        }
        te, ok := err.(timeout)
        return ok && te.Timeout()
}

The caller can use isTimeout() to determine whether the error is timeout related, via its implementation of the timeout interface, all without knowing anything about the type, or the original source, of the error value.

Gift wrapping errors, usually by libraries that annotate the error path, is enabled by this method, provided that the wrapped error types also implement the interfaces of the error they wrap.

This may seem like an insoluble problem, but in practice there are relatively few interface methods that are in common use, so Timeout() bool and Temporary() bool would cover a large set of the use cases.

In conclusion

Don’t assert errors for type, assert for behaviour.

For package authors, if your package generates errors of a temporary nature, ensure you return error types that implement the respective interface methods. If you wrap error values on the way out, ensure that your wrappers respect the interface(s) that the underlying error value implemented.

For package users, if you need to inspect an error, use interfaces to assert the behaviour you expect, not the error’s type. Don’t ask package authors for public error types; ask that they make their types conform to common interfaces by supplying Timeout() or Temporary() methods as appropriate.

Friday pop quiz: the size of things

In this program, the size of variables of type x and y in memory varies by platform.

package main

func main() {
        const n = 4
        type x [n]uint
        type y [n]int
}

By changing only one line can you ensure that variables of type x and y always consume 16 bytes on all platforms that Go 1.4 supports ?

Rules

The code must continue to be correctly formatted.

Bonus points will be awarded for the most creative solution.

Points will be deducted for arguing with the judge (me).


Answers

The solution obviously involved setting n to 4 on 32 bit platforms, and 2 on 64 bit. There was a wide range of variations on this, involving a menagerie of subtraction, shifting and multiplication. The solution I came up with used only one operator:

const n = 4 >> (^uint(0) >> 63)

^uint(0) gives you a number whose bits are all 1, then >> 63 shifts the number 63 binary places to the right. On a 64 bit platform this evaluates to 1, and shifting 4 one place to the right leaves 2. On a 32 bit platform, 32 ones shifted 63 places to the right gives zero, and 4 shifted right zero times is still 4.
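A quick way to check this on whatever platform you have to hand is to ask unsafe.Sizeof. This is a small sketch; the sizes helper is hypothetical and the types are moved to package level only so the helper can see them.

```go
package main

import (
	"fmt"
	"unsafe"
)

// n is 2 where uint is 64 bits wide and 4 where it is 32 bits wide.
const n = 4 >> (^uint(0) >> 63)

type x [n]uint
type y [n]int

// sizes is a hypothetical helper returning the size in bytes of a
// variable of type x and of type y.
func sizes() (uintptr, uintptr) {
	var a x
	var b y
	return unsafe.Sizeof(a), unsafe.Sizeof(b)
}

func main() {
	sx, sy := sizes()
	fmt.Println(sx, sy) // 16 16
}
```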

So I was feeling rather chuffed with myself until Paul Hankin quietly dropped this solution:

const n = ^uint(6) % 7

Paul, my hat is off to you Sir.
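To see why Paul's version works: ^uint(6) flips every bit of 6, giving 2^w - 7 for a w bit uint. Because 2^3 leaves a remainder of 1 when divided by 7, 2^32 leaves 4 and 2^64 leaves 2, and subtracting 7 does not change the remainder. The sketch below uses the fixed width types uint32 and uint64 to stand in for uint on each kind of platform; on32 and on64 are hypothetical helpers.

```go
package main

import "fmt"

// on32 evaluates the expression as it would be on a platform with a
// 32 bit uint.
func on32() uint32 { return ^uint32(6) % 7 }

// on64 evaluates the expression as it would be on a platform with a
// 64 bit uint.
func on64() uint64 { return ^uint64(6) % 7 }

func main() {
	fmt.Println(on32()) // 4
	fmt.Println(on64()) // 2
}
```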

Minimum one liner followup

It’s a little unfair to announce winners in some kind of order as I did post the quiz at an unfriendly hour of the day for most of the planet.

With that said, Tim and William came up with a great map based solution at roughly the same time. You’ll have to split the winnings between yourselves.

Gary came up with an interesting solution that works for almost all the integers.

Gustavo Niemeyer takes double points for demonstrating that the first version of this problem could be defeated easily, and then proceeded to demonstrate his very mathy solution to fix Gary’s proposal. Several others also proposed some great shift tricks.

Honourable mentions go to Charlie Somerville, for playing the man and not the ball, and Francesc, who proved that even with two attempts I couldn’t make the problem sufficiently watertight.

Although the prohibition on adding more than one line was lost on Brendan Tracey, I think this proposal deserves to be highlighted.

So with sensible, workable, and sometimes beautiful solutions out in the open, the race was on for the bonus points for the most creative.

The first was my entry, which was the genesis for this quiz and goes to show, this is why we cannot have nice things.

func f(a int, b uint) {
        var min = 0
        min = copy(make([]struct{}, a), make([]struct{}, b))
        fmt.Printf("The min of %d and %d is %d\n", a, b, min)
}

Props for figuring this out go to Arnaud Porterie and Gustavo Niemeyer, who were both good sports and deleted their answers.
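For anyone puzzling over the trick: the built-in copy returns the number of elements copied, which is the smaller of the two slice lengths, and a slice of struct{} has zero sized elements. A minimal sketch, with a hypothetical helper name, and assuming both arguments are non-negative (make panics on a negative length):

```go
package main

import "fmt"

// minViaCopy is a hypothetical helper that returns the smaller of a and b
// by copying between two slices of zero sized elements. Both arguments
// must be non-negative.
func minViaCopy(a, b int) int {
	return copy(make([]struct{}, a), make([]struct{}, b))
}

func main() {
	fmt.Println(minViaCopy(9000, 314)) // 314
	fmt.Println(minViaCopy(3, 7))      // 3
}
```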

I was feeling rather pleased with myself until Paul Hankin emailed me this fabulously creative effort. After that others caught on to the loophole that I had inadvertently left open by importing the fmt package.

Congratulations to the winners, and thank you all for contributing.

Friday pop quiz: minimum one liner

This program is incorrect

package main

import "fmt"

func f(a, b int) {
        var min = 0
        fmt.Printf("The min of %d and %d is %d\n", a, b, min)
}

func main() {
        f(9000, 314)
}

By adding only one line can you make it print the correct answer ?

The code must continue to be correctly formatted.

Bonus points will be awarded for the most creative solution.

Points will be deducted for arguing with the judge (me).

Update: thanks to Gustavo Niemeyer who pointed out the first version of the quiz made it way too easy.

Update: a few people have provided some very sound, rational solutions. Good job, give yourself a code review gold star.

The bonus points for the most creative solution are still on the table. As a hint my solution will also work for this variant of the problem.


The answer(s) will be posted tomorrow.

Five suggestions for setting up a Go project

The question of how to set up a new Go project appears commonly on the golang-nuts mailing list.

Normally the advice for how to structure Go code centres around “read the standard library”, but the standard library is not a great deal of use to newcomers in this respect, as:

  • You don’t go get packages from the standard library, they’re always present as part of your Go installation
  • The standard library doesn’t live in your $GOPATH so its layout is less useful as an example.

This article attempts to illustrate common patterns for structuring Go projects using real life packages as examples.

Creating a package

A package is a directory inside your $GOPATH/src directory containing, amongst other things, .go source files.

The name of the package should match the name of the directory in which its source is located. If your package is called logger, then its source files may be located in

$GOPATH/src/github.com/yourname/logger

Package names should be all lower case. Sorry, it’s 2014, and there are still operating systems that can’t cope with mixed case.

Package names, and thus the package’s directory, should contain only letters, numbers if you must, but absolutely no punctuation.

The name of a package is part of the name of every type, constant, variable, or function exported by that package. It may look odd when inside the package, but always consider what it looks like to the caller.

Avoid repetition. bytes.Buffer not bytes.BytesBuffer, strings.Reader not strings.StringReader, etc.

For more advice on naming, see Andrew Gerrand’s excellent talk on Go naming.

All the files in a package’s directory must have the same package declaration, with one exception.

For testing, your test files, those ending with _test.go, may declare themselves to be in the same package, but with _test appended to the package declaration. This is known as an external test. For now, just accept that you can’t put the code for multiple packages into one directory.

Main packages

Some packages are actually commands; these carry the declaration package main.

Main packages deviate from the previous statement about the package declaration and the package’s directory being the same. In the case of commands, the name of the command is taken from the name of the package’s directory.

This obviates the need to use flags like -o when building or installing Go programs — the name of the command is automatically inferred from the name of the directory containing the program.

Everything in Go works with packages.

The go commands, go build, go install, go test, and go get, all work with packages, not individual files.

go run is the exception to this rule. It is intended only to be a local version of the go playground. Avoid using it for anything more complex than a program you would otherwise run in the playground.

The import path

All packages exist inside a directory tree rooted at $GOPATH/src. Because of this, a package’s import path and a package’s name are often different.

Don’t confuse this with the previous statement that a package’s name, its package declaration, should match the directory in which the package’s files live.

The import path is effectively the full path to your package. It is what differentiates your logger package from the dozens of others that are also named logger.

Note: There is no concept of sub packages in Go. This is why the ioutil package is called ioutil, not util with an import path of io/util. This avoids local namespace collisions.
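A small sketch of the difference between import path and package name, using only the standard library. unicode/utf8 is imported by its full path but referred to by its package name, utf8. The lastElement helper is hypothetical; it shows that, by convention, the package name is the final element of the import path.

```go
package main

import (
	"fmt"
	"path"         // import path and package name are both path
	"unicode/utf8" // import path is unicode/utf8, package name is utf8
)

// lastElement is a hypothetical helper returning the final element of an
// import path, which by convention matches the package name.
func lastElement(importPath string) string {
	return path.Base(importPath)
}

func main() {
	fmt.Println(lastElement("github.com/yourname/logger")) // logger
	fmt.Println(lastElement("io/ioutil"))                  // ioutil

	// The package is referred to by its name, not its import path.
	fmt.Println(utf8.RuneLen('é')) // 2
}
```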

VCS names in import paths

In other languages it is quite common to ensure your package has a unique namespace by prefixing it with your company name, say com.sun.misc.Unsafe. If everyone only writes packages corresponding to domains that they control, then there is little possibility of a collision.

In Go, the convention is to include the location of the source code in the package’s import path, ie

$GOPATH/src/github.com/golang/glog

However there are several important points to remember:

  1. This is not required by the language, it is just a feature of go get.
    go get recognises paths that start with known code hosting sites, GitHub, Bitbucket, Google Code, and knows how to convert the import path of the package (not the name) into the correct command to check out the code.
  2. By following this convention you can point go get at some source code you have in your $GOPATH and it will recursively fetch any required packages. You can even have it fetch all the source code by calling go get import/path.
    This has turned out to be a very simple way of distributing Go programs.
  3. This does not mean that you need to be online to use the Go compiler, or that you need to have made your project public. Remember, the naming of packages is only an aid to go get, and go get is an optional command.

Sample repositories

With this background in place, I’m going to walk through some examples of the various types, or styles, of Go projects. Hopefully by studying them you will understand how to structure your projects in a way that interoperates well with others.

A single package

The simplest example of a Go project is a repository that contains only one package. The example I have chosen is Keith Rarick’s fs package, https://github.com/kr/fs.

This is the simplest Go project, a single package with the code at the root of the repository. The import path for this project would be

import "github.com/kr/fs"

Multiple packages

The next logical step after creating a repository containing a single package is a more complicated project with multiple packages in a single repository.

I’ve chosen my own term project, https://github.com/pkg/term, which contains two packages, github.com/pkg/term, and github.com/pkg/term/termios, the latter containing wrappers for the various termios(3) syscalls.

Even though Go does not have a notion of sub packages, term and term/termios live in the same repository. I could have created two projects, https://github.com/pkg/term and https://github.com/pkg/termios, but as they are closely related, it made sense to place the source for both packages in the same Github repository.

To use this project, you would import it with

import "github.com/pkg/term"

A command

godep, https://github.com/tools/godep, is an example of a repository containing one command package at its root.

Because the source for this package declares it to be in package main, when compiled the program will appear as $GOPATH/bin/godep.

% go get -v github.com/tools/godep
github.com/tools/godep (download)
github.com/kr/fs
golang.org/x/tools/go/vcs
github.com/tools/godep

A command and a package

The fourth example shows how to structure a Go project that includes shared logic in a package, and a command which uses that logic. The project I have chosen is the platinum searcher by Monochromegane, https://github.com/monochromegane/the_platinum_searcher, an excellent replacement for ack or ag written in pure Go.

At the root of the project is the the_platinum_searcher package (this does break the prohibition on punctuation in package names) containing the logic. In the cmd/pt subdirectory is the main package. Using the globbing feature of go get installing pt is simply

% go get -u github.com/monochromegane/the_platinum_searcher/...

This is not the only way to lay out this style of package. Other examples may place the command, package main, at the root of the repository and the packages containing the logic of the project in a subdirectory. An example of this is Steve Francia’s Hugo, https://github.com/spf13/hugo.

In both examples the intention is to keep as much logic as possible out of the command, as commands cannot be imported by other packages, limiting the reuse of code inside main packages.

Multiple commands and multiple packages

The final example, the go.tools subrepo, https://code.google.com/p/go/source/browse/?repo=tools, combines all of the above.

The tools repo contains many Go packages, and a burgeoning cmd subdirectory of Go programs. As a resource of well written, contemporary, Go code, you could do far worse.

Visualising dependencies

Juju is a pretty large project. Some of our core packages have large, complex dependency graphs. This is undesirable because the packages which import those core packages inherit these dependencies, raising the spectre of an inadvertent import loop.

Reducing the coupling between our core packages has been a side project for some time for me. I’ve written several tools to try to help with this, including a recapitulation of the venerable graphviz wrapper.

However none of these tools were particularly useful. In fact the graphical tools I wrote became unworkable visually well before their textual counterparts; at least you can grep the output of go list to verify if a package is part of the import set or not.

Visualising the import graph

I’d long had a tab in my browser open reminding me to find an excuse to play with d3. After a few false starts I came up with a tool that took a package import path and produced a graph of the imports.

math/rand tree graph

In this simple example showing math/rand importing its set of five packages, the problem with my naive approach is readily apparent — unsafe is present three times.

This repetition is both correct, each time unsafe is mentioned it is because its parent package directly imports it, and incorrect as unsafe does not appear three times in the final binary.

After sharing some samples on twitter, rf and Russ Cox suggested that if an import was mentioned at several levels of the tree it could be pushed down to the lowest limb without significant loss of information. This is what the same graph looks like with a simple attempt to implement this push down logic.

math/rand pushdown tree

This approach, or at least my implementation of it, was successful in removing some duplication. You can see that the import of unsafe by sync has been pruned as sync imports sync/atomic which in turn imports unsafe.

However, there remains the second occurrence of unsafe rooted in an unrelated part of the tree which has not been eliminated. Still, it was better than the original method, so I kept it and moved on to graphing more complex trees.
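The pushdown idea can be sketched as follows. This is not the code from my tool, only an illustration under stated assumptions: the graph type, the reaches and pushdown methods, and the hand-written, simplified import graph are all hypothetical. A direct import is dropped from a package's limb when it is also reachable through one of that package's other direct imports.

```go
package main

import (
	"fmt"
	"sort"
)

// graph maps a package to the packages it imports directly.
type graph map[string][]string

// reaches reports whether to can be reached from from by following one
// or more imports. seen guards against visiting a package twice.
func (g graph) reaches(from, to string, seen map[string]bool) bool {
	for _, dep := range g[from] {
		if dep == to {
			return true
		}
		if !seen[dep] {
			seen[dep] = true
			if g.reaches(dep, to, seen) {
				return true
			}
		}
	}
	return false
}

// pushdown returns pkg's direct imports, minus any import that is also
// reachable through a sibling import.
func (g graph) pushdown(pkg string) []string {
	var kept []string
	for _, dep := range g[pkg] {
		viaSibling := false
		for _, sib := range g[pkg] {
			if sib != dep && g.reaches(sib, dep, map[string]bool{}) {
				viaSibling = true
				break
			}
		}
		if !viaSibling {
			kept = append(kept, dep)
		}
	}
	sort.Strings(kept)
	return kept
}

func main() {
	// A simplified, hand-written approximation of math/rand's imports.
	g := graph{
		"math/rand":   {"math", "sync"},
		"math":        {"unsafe"},
		"sync":        {"sync/atomic", "unsafe"},
		"sync/atomic": {"unsafe"},
	}
	for _, pkg := range []string{"math/rand", "math", "sync", "sync/atomic"} {
		fmt.Println(pkg, "->", g.pushdown(pkg))
	}
}
```

With this graph, sync keeps only sync/atomic, while unsafe still appears under both math and sync/atomic, which is the remaining duplication described above.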

crypto/rand pushdown tree

In this example, crypto/rand, though the pushdown transformation has been applied, the number of duplicated imports is overwhelming. What I realised looking at this graph was even though pushdown was pruning single limbs, there are entire forks of the import graph repeated many times. The clusters starting at sync, or io are heavily duplicated.

While it might be possible to enhance pushdown to prune duplicated branches of imports, I decided to abandon this method because this aggressive pruning would ultimately reduce the import graph to a trunk with individual imports jutting out as singular limbs.

While an interesting idea, I felt that it would obscure the information I needed to unclutter the Juju dependency hierarchy.

However, before moving on I wanted to show an example of a radial visualisation which I played with briefly

crypto/rand radial graph

Force graphs

Although I had known intuitively that the set of imports of a package are not strictly a tree, it wasn’t clear to me until I started to visualise them what this meant. In practice, the set of imports of a package will fan out, then converge onto a small set of root packages in the standard library. A tree was the wrong method for visualising this.

While perusing the d3 examples I came across another visualisation which I felt would be useful to apply, the force directed graph. Technically this is a directed acyclic graph, but the visualisation applies a repulsion algorithm that forces nodes in the graph to move away from each other, hopefully forming a useful display. Here is a small example using the math/rand package

math/rand force graph

Comparing this to the tree examples above, the force graph has dealt with the convergence on the unsafe package well. All three import paths are faithfully represented without pruning.

But, when applied to larger examples, the results are less informative.

crypto/rand force graph

I’m pretty sure that part of the issue with this visualisation is my lack of ability with d3. With that said, after playing with this approach for a while it was clear to me that the force graph is not the right tool for complex import graphs.

Compared to this example, applying force graph techniques to Go import graphs is unsuccessful because the heavily connected core packages gravitate towards the center of the graph, rather than the edge.

Chord graphs

The third type I investigated is called a chord graph, or at least that is what it is called in the d3 examples. The chord graph focuses on the interrelationship between nodes, rather than the node itself, and has proved to be the best, or at least the most appealing, way of visualising dependencies so far.

crypto/rand chord graph

While initially overwhelming, the chord graph is aided by d3’s ability to disable rendering of parts of the graph as you mouse over them. In addition the edges have tool tips for each limb.

crypto/rand chord graph highlighting bufio

In this image I’ve highlighted bufio. All the packages that bufio imports directly are indicated by lines of the same color leading away from bufio. Likewise the packages that import bufio directly are highlighted, in a different color and in a different direction; in this example there is only one, crypto/rand itself.

The size of the segments around the circumference of the circle is somewhat arbitrary, indicating the number of packages that each package directly imports.
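Chord layouts are typically driven by a square matrix of relationships between nodes. The sketch below is an assumption about how import data could be shaped for such a layout, not a description of my tool: the matrix function and the simplified import graph are hypothetical. Each row sum is the number of packages that package imports directly, which corresponds to the segment size described above.

```go
package main

import (
	"fmt"
	"sort"
)

// matrix builds a square adjacency matrix from a map of direct imports.
// names[i] labels row and column i, and m[i][j] is 1 when names[i]
// directly imports names[j].
func matrix(imports map[string][]string) (names []string, m [][]int) {
	seen := map[string]bool{}
	for pkg, deps := range imports {
		seen[pkg] = true
		for _, d := range deps {
			seen[d] = true
		}
	}
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names) // stable ordering around the circle

	index := map[string]int{}
	for i, name := range names {
		index[name] = i
	}
	m = make([][]int, len(names))
	for i := range m {
		m[i] = make([]int, len(names))
	}
	for pkg, deps := range imports {
		for _, d := range deps {
			m[index[pkg]][index[d]] = 1
		}
	}
	return names, m
}

func main() {
	// A simplified, hand-written approximation of math/rand's imports.
	names, m := matrix(map[string][]string{
		"math/rand":   {"math", "sync"},
		"math":        {"unsafe"},
		"sync":        {"sync/atomic", "unsafe"},
		"sync/atomic": {"unsafe"},
	})
	for i, row := range m {
		total := 0
		for _, v := range row {
			total += v
		}
		fmt.Printf("%-12s %v direct imports: %d\n", names[i], row, total)
	}
}
```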

For some packages in the standard library, their import graph is small enough to be interpreted directly. Here is a shot of fmt which shows all the packages that are needed to provide fmt.Println("Hello world!").

fmt chord graph

Application to large projects

In the examples I’ve shown so far I’ve been careful to always show small packages from the standard library, and this is for a good reason. This is best explained with the following image

github.com/juju/juju/state chord graph

This is a graph of the code that I spend most of my time in, the data model of Juju.

I’m hesitant to say that chord graphs work at this size, although it is more successful than the tree or force graph attempts. If you are patient, and have a large enough screen, you can use the chord graph method to trace from any package on the circumference to discover how it relates to the package at the root of the graph.

What did I discover ?

I think I’ve made some pretty pictures, but it’s not clear that chord graphs can replace the shell scripts I’ve been using up until this point. As I am neither a graph theorist, nor a visual designer, this isn’t much of a surprise to me.

Next steps

The code is available in the usual place, but at this stage I don’t intend to release or support it; caveat emptor.

Alan Donovan’s work writing tools for semantic analysis of Go programs is fascinating and it would be interesting to graph the use of symbols from one package by another, or the call flow between packages.