Monthly Archives: May 2014

On declaring variables

Go has several ways to declare a variable. Possibly there are more ways than are strictly required but with the Go 1 contract in effect it’s not going to change.

This short post gives examples of how I decide which variable declaration syntax to use. These are just suggestions, they make sense to me, but I’m sure others have equally strong justifications for alternative arrangements.

When declaring (but not initialising) a variable consider using this syntax

var num int

As Go does not permit uninitialised variables, num will be initialised to the zero value.

Some other examples of this form might be

var things []Thing // an empty slice of Things
for t := range ThingCreator() {
        things = append(things, t)
}

var thing Thing // empty Thing struct 
json.Unmarshall(reader, &thing)

The key is that var acts as a clue that the variable has been deliberately declared as the zero value of the indicated type.

When declaring and initialising, consider using the short declaration syntax. For example

num := rand.Int()

The lack of var is a signal that this variable has been initialised. I also find that by avoiding declaring the type of the variable and infering it from the right hand side of the assignment makes re-factoring easier in the future.

Of course, with any rule of thumb, there are exceptions.

min, max := 0, 1000

But maybe in this case min and max are really constants.

var length uint32 = 0x80

Here length may be being used with a library which requires a specific numeric type and this is probably more readable than

length := uint32(0x80)

Go 1.3 linker improvements

Go obtains much of its compilation speed from the Plan 9 compiler, of which it is a direct descendant. The Plan 9 toolchain deferred much of the work traditionally performed by a compiler to the linking stage and its performance was summarised in section 8 of this paper

The new compilers compile quickly, load slowly, and produce medium quality object code.

Because of the similar division of labour between Go’s compiler and linker, linking is commonly more expensive that compilation. This leads to several problems

Linking cannot benefit from incremental compilation, each link pass starts afresh even if only a tiny part of program has changed.
Linking speed is at least linear (often worse) with the number of packages being linked into the final executable – larger programs link slower.
While multiple commands may be linked in parallel, each individual link is single threaded – as CPU speeds stall, or go backwards, favouring additional cores, linking gets slower in real terms.

It should be noted that while linking is considered slow in terms of the other parts of the Go toolchain, it remains much faster than linking a traditional C or C++ application.

Linking speed has long been recognised as an issue by the Go developers and during the 1.3 development cycle the linker underwent major changes to improve its performance.

What does the Go linker do ?

To understand the change, Russ Cox wrote in his proposal at the beginning of the cycle

The current linker performs two separable tasks.

First, it translates an input stream of pseudo-instructions into executable code and data blocks, along with a list of relocations.

Second, it deletes dead code, merges what’s left into a single image, resolves relocations, and generates a few whole-program data structures such as the runtime symbol table.

In a nutshell, the change has moved the first task of the linker, the pseudo instruction translation, from the linker into the compiler.

What are pseudo instructions ?

Prior to Go 1.3, the output of the compiler, the .a files in your $GOPATH/pkg directory was not directly executable machine code. It was, as Russ describes, a stream of pseudo instructions, which, while not machine independent, it was also not directly executable.

During the linking phase, the linker itself was responsible for turning this pseudo instruction stream into real machine code. Dealing with pseudo instructions during the compilation phase makes the compiler simpler to write.

An example of a pseudo instructions are the unified MOV instruction available to the compiler and assembler

MOV R0, R1          // move the contents of R0 into R1
MOV $80000001, R0   // move the the constant 80000001  into R0
MOV R0, 0           // move the value into R0 into address 0

On a processor like ARM, when translated to machine code, this becomes three separate operations. The first MOV R0, R1 is a simple move from one register to another and the final opcode is MOV.

The second form stores the large constant, 80000001 into R0. ARM processors cannot encode such a large constant directly into the instruction so the linker would store the constant at the end of the function and replace the MOV with a load from an address relative to the program counter into a scratch register (R11 is reserved for the linker) then move the value from R11 to R0.

The final form also needs help from the linker as ARM processors cannot use a constant as an address to store a value. The linker will arrange to load 0 into R11 then store the contents with an instruction like MOV R0, (R11)

The work required for X86 is similar, although the specific restrictions differ.

The results

To evaluate the performance improvements of the change to the linker I’m going to compare building and linking two programs, the venerable godoc and the jujud server binary which depends on almost every line of code in the Juju repo.

In preparation I checked out copies of Go 1.2.1 and Go 1.3beta1 (this data was collected some time ago, but the changes in 1.3beta2 are unrelated to the linker).

`godoc`

% rm -rf ~/pkg
% time /tmp/go1.2.1/bin/go install code.google.com/p/go.tools/cmd/godoc
real    0m3.239s
user    0m3.307s
sys     0m0.595s

The time to compile and link godoc from scratch on this machine with Go 1.2.1 was 3.2 seconds. Let’s look at the time just to recompile the main package and link godoc

% touch ~/src/code.google.com/p/go.tools/cmd/godoc/play.go
% time /tmp/go1.2.1/bin/go install code.google.com/p/go.tools/cmd/godoc
real    0m1.578s
user    0m1.434s
sys     0m0.146s

With most of the compilation avoided, the total time drops to 1.5 seconds. Let’s look at how the linker change in Go 1.3 affects the results.

% rm -rf ~/pkg
% time /tmp/go1.3beta1/bin/go install code.google.com/p/go.tools/cmd/godoc
real    0m3.193s
user    0m3.441s
sys     0m0.530s

Under Go 1.3beta1 the time to compile and link from scratch is roughly the same as Go 1.2.1. There is perhaps a hint that more work is being done in parallel. Let’s compare the results from an incremental compilation.

% touch ~/src/code.google.com/p/go.tools/cmd/godoc/play.go
% time /tmp/go1.3beta1/bin/go install code.google.com/p/go.tools/cmd/godoc
real    0m0.996s
user    0m0.881s
sys     0m0.118s

Under Go 1.3beta1 the time to recompile godoc‘s main package and link has dropped from 1.5 seconds to just under a second. A saving of half a second, or 30% compared to the performance of Go 1.2.1.

`jujud`

% rm -rf ~/pkg
% time /tmp/go1.2.1/bin/go install  launchpad.net/juju-core/cmd/jujud
real    0m8.247s
user    0m18.110s
sys     0m3.861s

Time to compile and link jujud from scratch, roughly 220 packages at this time, was 8.2 seconds using Go 1.2.1. Let’s look at the incremental performance.

% touch ~/src/launchpad.net/juju-core/cmd/jujud/main.go
% time /tmp/go1.2.1/bin/go install launchpad.net/juju-core/cmd/jujud
real    0m3.139s
user    0m2.831s
sys     0m0.305s

Which shows the time to recompile the main package and link the executable is 3.2 seconds. You can also see that the process is almost entirely single threaded as the sum of user and sys is equal to the wall clock time, real.

Let’s turn to Go 1.3beta1

% rm -rf ~/pkg
% time /tmp/go1.3beta1/bin/go install launchpad.net/juju-core/cmd/jujud
real    0m8.107s
user    0m20.533s
sys     0m3.574s

The full rebuild times are comparable to the Go 1.2.1 results, possibly a hair faster. The real improvements show themselves in the incremental compilation scenario.

% touch ~/src/launchpad.net/juju-core/cmd/jujud/main.go
% time /tmp/go1.3beta1/bin/go install launchpad.net/juju-core/cmd/jujud
real    0m2.219s
user    0m1.929s
sys     0m0.290s

Under Go 1.3beta1 the time to recompile the main package and link has dropped from 3.2 seconds to 2.2 seconds, a saving of one second, again roughly 30%.

In conclusion

In my tests, the linker improvements coming in Go 1.3 deliver approximately a 30% reduction in link time. Depending on the size of your final executable this could be a small amount, or a significant amount of time.

Overall the linking change has several important benefits:

Because it’s performed in the compile stage, the result is stored in your $GOPATH/pkg directory. Effectively the instruction selection pass is cached, whereas previously it was recomputed for each binary linked even if they shared many common packages.
Because more work is done in the compile stage, it is done in parallel if you have more than one CPU. If you have one CPU the end result is unchanged, the same amount of work is done, just at different phases, although you will benefit from point 1 regardless
Because more work is done in the compilation stage, less work is done in the linker, so the files passed to the linker are smaller and the work it has to do, which is effectively a mark, sweep and compact over all the symbols presented, is less.

In short, the 1.3 linker rocks.

Accidental method value

This is a quick post to discuss an interesting bug that was recently unearthed by go vet.

The following code is a simplified reduction of a larger piece of code. In the original code the if statement was much larger, encompassing several complicated conditions, making the bug hard to spot visually.

package main

import "fmt"
import "io"

type Thing struct {
        Reader_ io.Reader
}

func (t *Thing) Reader() io.Reader { return t.Reader_ }

func main() {
        t := Thing{Reader_: nil}
        if t.Reader != nil {
                fmt.Println("wait a second")
        }
}

Running this code gives the result

% go run thing.go
wait a second

But … what is going on ? Thing.Reader_ is explicitly set to nil (even though this is unnecessary, the zero value of an interface field is nil), so how can the check for nil on the very next line fail?

Let’s look at what go vet thinks.

% go vet thing.go
thing.go:14: comparison of function Reader != nil is always true
exit status 1

The mistake in the original code was the author had intended to write t.Reader(), but perhaps forgot the parenthesis. The uncommon use of the underscore suffix possibly contributed to the bug.

So, a quick fix and a code review later and the bug was closed. But, why was this code valid in the first place ? The answer is, since Go 1.1, the expression t.Reader is no longer a syntax error, instead it evaluates to a Method Value.

Let’s look a little closer

t := Thing{Reader_: nil}
fmt.Printf("Reader_: %T\n", t.Reader_)
fmt.Printf("Reader(): %T\n", t.Reader())
fmt.Printf("Reader: %T\n", t.Reader)

gives the following output

Reader_: <nil>
Reader(): <nil>
Reader: func() io.Reader

The first line says t.Reader_ evaluates to nil, as expected. t.Reader() also evaluates to nil. However, on the third line t.Reader evaluates to a value whose type is func() io.Reader, and as we see from the initial example, because this method value is derived from a method defined at compile time, it cannot be nil, so the comparison is always true.

Unofficial Go 1.2.2 and 1.3beta1 tarballs for ARM now available

I have updated my unofficial ARM tarball distributions page with prebuilt Go 1.2.2 and Go 1.3beta1 tarballs. You can find them by following the link in the main header of this page.

term: low level serial with a high level interface

I have several projects on the hop at the moment which require control over a serial port, actually a serial port emulated over USB. So for the last few days I’ve let myself be distracted by writing yet another serial package for Go.

github.com/pkg/term

term is built on a lower level package, called termios which provides access to the POSIX terimos(3) functions for fine grained control of the serial and terminal settings. As termios mirrors the POSIX interface, it should be reasonably portable. Anything which differs, such as supported baud rates, can be papered over in the higher level term package.

github.com/pkg/term/termios

term and termios have been tested on Linux and OS X, and should work for the other BSDs.

Suggestions for additional features via issue or pull request are most welcome.

autobench-next updated for Go 1.3

Now that go1.3beta1 has been released I’ve updated the autobench-next branch to track Go 1.2 vs tip (go1.3beta1).

Using autobench is very simple, clone the repository and run make to produce a benchmark on your machine.

% cd devel
% git clone -b autobench-next https://github.com/davecheney/autobench.git
% cd autobench
% make

You can stay up to date with the update target

% git pull 
% make update
% make

Contributions and benchmark results are always welcome. As the Go 1.3 cycle draws to a close I will merge this branch back into master replacing the older 1.1 vs 1.2 comparisons.

Dave Cheney

The acme of foolishness