Internets of interest #2: John Ousterhout discusses a Philosophy of Software Design

Ousterhout’s opus is tearing up tech twitter at the moment. But for those outside the North American prime shipping service area, we’re shit out of luck until a digital version is available. Until then, here’s Ousterhout’s Google Tech talk:


Internets of interest #1: Brian Kernighan on the Elements of Programming Style

“It turns out that style matters in programming for the same reason that it matters in writing. It makes for better reading.”
— Douglas Crockford

I stumbled across this old (in internet years) presentation a few weeks ago and it’s been on high rotation since. If you can look past the recording difficulties (and the evacuation siren) this presentation is chock full of sound advice applicable to all programmers.


Maybe adding generics to Go IS about syntax after all

This is a short response to the recently announced Go 2 generics draft proposals

Update: This proposal is incomplete. It cannot replace two common use cases. The first is ensuring that several formal parameters are of the same type:

contract comparable(t T) {
        t > t

func max(type T comparable)(a, b T) T

Here a, and b must be the same parameterised type — my suggestion would only assert that they had at least the same contract.

Secondly the it would not be possible to parameterise the type of return values:

contract viaStrings(t To, f From) {
	var x string = f.String()

func SetViaStrings(type To, From viaStrings)(s []From) []To

Thanks to Ian Dawes and Sam Whited for their insight.


My lasting reaction to the Generics proposal is the proliferation of parenthesis in function declarations.

Although several of the Go team suggested that generics would probably be used sparingly, and the additional syntax would only be burden for the writer of the generic code, not the reader, I am sceptical that this long requested feature will be sufficiently niche as to be unnoticed by most Go developers.

It is true that type parameters can be inferred from their arguments, the declaration of generic functions and methods require a clumsy (type parameter declaration in place of the more common <T> syntaxes found in C++ and Java.

The reason for (type, it was explained to me, is Go is designed to be parsed without a symbol table. This rules out both <T> and [T] syntaxes as the parser needs ahead of time what kind of declaration a T is to avoid interpreting the angle or square braces as comparison or indexing operators respectively.

Contract as a superset of interfaces

The astute Roger Peppe quickly identified that contracts represent a superset of interfaces

Any behaviour you can express with an interface, you can do so and more, with a contract.

The remainder of this post are my suggestions for an alternative generic function declaration syntax that avoids add additional parenthesis by leveraging Roger’s observation.

Contract as a kind of type

The earlier Type Functions proposal showed that a type declaration can support a parameter. If this is correct, then the proposed contract declaration could be rewritten from

contract stringer(x T) {
        var s string = x.String()


type stringer(x T) contract {
	var s string = x.String()

This supports Roger’s observation that a contract is a superset of an interface. type stringer(x T) contract { ... } introduces a new contract type in the same way type stringer interface { ... }introduces a new interface type.

If you buy my argument that a contract is a kind of type is debatable, but if you’re prepared to take it on faith then the remainder of the syntax introduced in the generics proposal could be further simplified.

If contracts are types, use them as types

If a contract is an identifier then we can use a contract anywhere that a built-in type or interface is used. For example

func Stringify(type T stringer)(s []T) (ret []string) {
	for _, v := range s {
		ret = append(ret, v.String())
	return ret

Could be expressed as

func Stringify(s []stringer) (ret []string) {
	for _, v := range s {
		ret = append(ret, v.String())
	return ret

That is, in place of explicitly binding T to the contract stringer only for T to be referenced seven characters later, we bind the formal parameter s to a slice of stringer s directly. The similarity with the way this would previously be done with a stringer interface emphasises Roger’s observation.1

Unifying unknown type parameters

The first example in the design proposal introduces an unknown type parameter.

func Print(type T)(s []T) {
	for _, v := range s {

The operations on unknown types are limited, they are in some senses values that can only be read. Again drawing on Roger’s observation above, the syntax could potentially be expressed as:

func Print(s []contract{}) {
	for _, v := range s {

Or maybe even

type T contract {}

func Print(s []T) {
	for _, v := range s {

In essence the literal contract{} syntax defines an anonymous unknown type analogous to interface{}‘s anonymous interface type.


The great irony is, after years of my bloviation that “adding generics to Go has nothing to do with the syntax”2, it turns out that, actually, yes, the syntax is crucial.

Internets of interest #0: The future of Microprocessors

This weekend I’ve been freshening up the introductory material for a workshop that Francesc Campoy and I are teaching at GopherCon this month. As part of my research, these videos have been on high rotation.

The first video by Sophie Wilson, the designer of the first ARM chip from which both the company and the line of RISC microprocessors we know today were born.

The second presentation by John Hennessy of Computer Architecture: A Quantitative Approach fame is a reprisal of the 2017 Turing Award lecture he gave with his co-author David Patterson.

The theme of both presentations is the same; the end of Dennard scaling and the tremendous technical and economic challenges in bringing extreme UV lithography to the commercial processor production will cap the growth in processor performance to 2-3% per year for the foreseeable future.

Source: John Hennessy and David Patterson, Computer Architecture: A Quantitative Approach, 6/e. 2018

Both Wilson and Hennessy see the future of processor design as a gestalt of CPUs, GPUs, DSPs and VLIW architectures. Issue of adapting mainstream imperative programming languages to these architectures remains very much an open question.

Using Go modules with Travis CI

In my previous post I converted httpstat to use Go 1.11’s upcoming module support. In this post I continue to explore integrating Go modules into a continuous integration workflow via Travis CI.

Life in mixed mode

The first scenario is probably the most likely for existing Go projects, a library or application targeting Go 1.10 and Go 1.11. httpstat has an existing CI story–I’m using Travis CI for my examples, if you use something else, please blog about your experience–and I wanted to test against the current and development versions of Go.


The straddling of two worlds is best accomplished via the GO111MODULE environment variable. GO111MODULE dictates when the Go module behaviour will be preferred over the Go 1.5-1.10’s vendor/ directory behaviour. In Go 1.11 the Go module behaviour is disabled by default for packages within $GOPATH (this also includes the default $GOPATH introduced in Go1.8). Thus, without additional configuration, Go1.11 inside Travis CI will behave like Go 1.10.

In my previous post I chose the working directory ~/devel/httpstat to ensure I was not working within a $GOPATH workspace. However CI vendors have worked hard to make sure that their CI bots always check out of the branch under test inside a working $GOPATH.

Fortunately there is a simple workaround for this, add env GO111MODULE=on before any go buildor test invocations in your .travis.yml to force Go module behaviour and ignore any vendor/ directories that may be present inside your repo.1

language: go
  - 1.10.x
  - master

  - linux
  - osx

dist: trusty
sudo: false

install: true

  - env GO111MODULE=on go build 
  - env GO111MODULE=on go test

Creating a go.mod on the fly

You’ll note that I didn’t check in the go.mod module manifest I created in my previous post. This was initially an accident on my part, but one that turned out to be beneficial. By not checking in the go.mod file, the source of truth for dependencies remained httpstat’s Gopkg.toml file. When the call to env GO111MODULE=on go build executes on the Travis CI builder, the go tool converts my Gopkg.toml on the fly, then uses it to fetch dependencies before building.

$ env GO111MODULE=on go build
go: creating new go.mod: module
go: copying requirements from Gopkg.lock
go: finding v1.5.0
go: finding v0.0.0-20170922123423-429f518978ab
go: finding v0.0.0-20170922011244-0744d001aa84
go: finding v0.0.0-20170915090833-1cbadb444a80
go: finding v0.0.9
go: finding v0.0.3
go: downloading v1.5.0
go: downloading v0.0.9
go: downloading v0.0.3
go: downloading v0.0.0-20170922011244-0744d001aa84
go: downloading v0.0.0-20170915090833-1cbadb444a80

If you’re not using a dependency management tool that go mod knows how to convert from this advice may not work for you and you may have to maintain a go.mod manifest in parallel with you previous dependency management solution.

A clean slate

The second option I investigated, but ultimately did not pursue, was to treat the Travis CI builder, like my fresh Ubuntu 18.04 install, as a blank canvas. Rather than working around Travis CI’s attempts to check the branch out inside a working $GOPATH I experimented with treating the build as a C project2 then invoking gimme directly. This also required me to check in my go.mod file as without Travis’ language: go support, the checkout was not moved into a $GOPATH folder. The latter seems like a reasonable approach if your project doesn’t intend to be compatible with Go 1.10 or earlier.

language: c

  - linux
  - osx

dist: trusty
sudo: false

  - eval "$(curl -sL | GIMME_GO_VERSION=master bash)"

  - go build 
  - go test

You can see the output from this branch here.

Sadly when run in this mode gimme is unable to take advantage of the caching provided by the language: go environment and must build Go 1.11 from source, adding three to four minutes delay to the install phase of the build. Once Go 1.11 is released and gimme can source a binary distribution this will hopefully address the setup latency.

Ultimately this option may end up being redundant if GO111MODULE=on becomes the default behaviour in Go 1.12 and the location Travis places the checkout becomes immaterial.

Taking Go modules for a spin

Update: Since this post was written, Go 1.11beta2 has been released. I’ve updated the setup section to reflect this. Russ Cox kindly wrote to me to explain the reasoning behind storing the Go module cache in $GOPATH. I’ve included his response inline.

This weekend I wanted to play with Ubuntu 18.04 on a spare machine. This gave me a perfect excuse to try out the modules feature recently merged into the Go 1.11 development branch.

TL;DR: When Go 1.11 ships you’ll be able to download the tarball and unpack it anywhere you like. When Go 1.11 ships you’ll be able to write Go modules anywhere you like. 


The recently released Go 1.11beta2 has support for Go modules.

% curl | \
        tar xz --transform=s/^go/go1.11/g
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  169M  100  169M    0     0  23.6M      0  0:00:07  0:00:07 --:--:-- 21.2M
% go1.11/bin/go version
go version go1.11beta2 linux/amd64

That’s all you need to do to install Go 1.11beta2. Out of shot, I’ve added $HOME/go1.11/bin to my $PATH.

Kicking the tires

Now we have a version of Go with module support installed, I wanted to try to use it to manage the dependencies for httpstat, a clone of the Python tool of the same name that many collaborators swarmed on to build in late 2016.

To show that Go 1.11 won’t need you to declare a $GOPATH or use a specific directly layout for the location of your project, I’m going to use my favourite directory for source code, ~/devel.

% git clone devel/httpstat
Cloning into 'devel/httpstat'...
remote: Counting objects: 2326, done.
remote: Total 2326 (delta 0), reused 0 (delta 0), pack-reused 2326
Receiving objects: 100% (2326/2326), 8.73 MiB | 830.00 KiB/s, done.
Resolving deltas: 100% (673/673), done.
Checking out files: 100% (1361/1361), done.
% cd devel/httpstat
% go mod -init -module
go: creating new go.mod: module
go: copying requirements from Gopkg.lock

Nice, go mod -init translated my existing Gopkg.lock file into its own go.mod format.

% cat go.mod

require ( v1.5.0 v0.0.9 v0.0.3 v0.0.0-20170922011244-0744d001aa84 v0.0.0-20170922123423-429f518978ab v0.0.0-20170915090833-1cbadb444a80

Let’s give it a try

% go build 
go: finding v0.0.0-20170922011244-0744d001aa84
go: finding v0.0.9
go: finding v0.0.3
go: finding v0.0.0-20170922123423-429f518978ab
go: finding v1.5.0
go: finding v0.0.0-20170915090833-1cbadb444a80
go: downloading v1.5.0
go: downloading v0.0.3
go: downloading v0.0.0-20170922011244-0744d001aa84
go: downloading v0.0.9
go: downloading v0.0.0-20170915090833-1cbadb444a80

Very nice, go build ignored the vendor/ folder in this repository (because we’re outside $GOPATH) and fetched the revisions it needed. Let’s try out the binary and make sure it works.

% ./httpstat

Connected to

HTTP/2.0 200 OK
Server: Google Frontend
Alt-Svc: quic=":443"; ma=2592000; v="44,43,39,35"
Cache-Control: private
Content-Type: text/html; charset=utf-8
Date: Sat, 14 Jul 2018 08:20:43 GMT
Strict-Transport-Security: max-age=31536000; preload
Vary: Accept-Encoding
X-Cloud-Trace-Context: 323cd59570cc084fed506f7e85d79d9f

Body discarded

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    171ms  |          18ms  |        559ms  |            226ms  |             5ms  ]
            |                |               |                   |                  |
   namelookup:171ms          |               |                   |                  |
                       connect:189ms         |                   |                  |
                                   pretransfer:749ms             |                  |
                                                     starttransfer:976ms            |

Move along, nothing to see here.

Go module source cache

In the previous version of this article I included a footnote mentioning that go get in module mode stored its downloaded source in $GOPATH/src/mod not the cache added in Go 1.10. Russ Cox kindly wrote to me to explain the rational behind this choice and also copied this to a recent thread on golang-dev. For completeness, here is his response:

The build cache ($GOCACHE, defaulting to $HOME/.cache/go-build) is for storing recent compilation results, so that if you need to do that exact compilation again, you can just reuse the file. The build cache holds entries that are like “if you run this exact compiler on these exact inputs. this is the output you’d get.” If the answer is not in the cache, your build uses a little more CPU to run the compiler nstead of reusing the output. But you are guaranteed to be able to run the compiler instead, since you have the exact inputs and the compiler binary (or else you couldn’t even look up the answer in the cache).

The module cache ($GOPATH/src/mod, defaulting to $HOME/go/src/mod) is for storing downloaded source code, so that every build does not redownload the same code and does not require the network or the original code to be available. The module cache holds entries that are like “if you need to download mymodule@v1.2.3, here are the files you’d get.” If the answer is not in the cache, you have to go out to the network. Maybe you don’t have a network right now. Maybe the code has been deleted. It’s not anywhere near guaranteed that you can redownload the sources and also get the same result. Hopefully you can, but it’s not an absolute certainty like for the build cache. (The go.sum file will detect if you get a different answer on re-download, but knowing you got the wrong bits doesn’t help you make progress on actually building your code. Also these paths end up in file-line information in binaries, so they show up in stack traces, and the like and feed into tools like text editors or debuggers that don’t necessarily know how to trigger the right cache refresh.)

Wrap up

You can build Go 1.11 from source right now anywhere you like. You don’t need to set an environment variable or follow a predefined location.

With Go 1.11 and modules you can write your Go modules anywhere you like. You’re no longer forced into having one copy of a project checked out in a specific sub directory of your $GOPATH.

Slices from the ground up

This blog post was inspired by a conversation with a co-worker about using a slice as a stack. The conversation turned into a wider discussion on the way slices work in Go, so I thought it would be useful to write it up.


Every discussion of Go’s slice type starts by talking about something that isn’t a slice, namely, Go’s array type. Arrays in Go have two relevant properties:

  1. They have a fixed size; [5]int is both an array of 5 ints and is distinct from [3]int.
  2. They are value types. Consider this example:
    package main
    import "fmt"
    func main() {
            var a [5]int
            b := a
            b[2] = 7
            fmt.Println(a, b) // prints [0 0 0 0 0] [0 0 7 0 0]

    The statement b := a declares a new variable, b, of type [5]int, and copies the contents of a to b. Updating b has no effect on the contents of a because a and b are independent values.1


Go’s slice type differs from its array counterpart in two important ways:

  1. Slices do not have a fixed length. A slice’s length is not declared as part of its type, rather it is held within the slice itself and is recoverable with the built-in function len.2
  2. Assigning one slice variable to another does not make a copy of the slices contents. This is because a slice does not directly hold its contents. Instead a slice holds a pointer to its underlying array3 which holds the contents of the slice.

As a result of the second property, two slices can share the same underlying array. Consider these examples:

  1. Slicing a slice:
    package main
    import "fmt"
    func main() {
            var a = []int{1,2,3,4,5}
            b := a[2:]
            b[0] = 0
            fmt.Println(a, b) // prints [1 2 0 4 5] [0 4 5]

    In this example a and b share the same underlying array–even though b starts at a different offset in that array, and has a different length. Changes to the underlying array via b are thus visible to a.

  2. Passing a slice to a function:
    package main
    import "fmt"
    func negate(s []int) {
            for i := range s {
                    s[i] = -s[i]
    func main() {
            var a = []int{1, 2, 3, 4, 5}
            fmt.Println(a) // prints [-1 -2 -3 -4 -5]

    In this example a is passed to negateas the formal parameter s. negate iterates over the elements of s, negating their sign. Even though negate does not return a value, or have any way to access the declaration of a in main, the contents of a are modified when passed to negate.

Most programmers have an intuitive understanding of how a Go slice’s underlying array works because it matches how array-like concepts in other languages tend to work. For example, here’s the first example of this section rewritten in Python:

Python 2.7.10 (default, Feb  7 2017, 00:08:15) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = [1,2,3,4,5]
>>> b = a
>>> b[2] = 0
>>> a
[1, 2, 0, 4, 5]

And also in Ruby:

irb(main):001:0> a = [1,2,3,4,5]
=> [1, 2, 3, 4, 5]
irb(main):002:0> b = a
=> [1, 2, 3, 4, 5]
irb(main):003:0> b[2] = 0
=> 0
irb(main):004:0> a
=> [1, 2, 0, 4, 5]

The same applies to most languages that treat arrays as objects or reference types.4

The slice header value

The magic that makes a slice behave both as a value and a pointer is to understand that a slice is actually a struct type. This is commonly referred to as a slice header after its counterpart in the reflect package. The definition of a slice header looks something like this:

package runtime

type slice struct {
        ptr   unsafe.Pointer
        len   int
        cap   int

This is important because unlike map and chan types slices are value types and are copied when assigned or passed as arguments to functions.

To illustrate this, programmers instinctively understand that square‘s formal parameter v is an independent copy of the v declared in main.

package main

import "fmt"

func square(v int) {
        v = v * v

func main() {
        v := 3
        fmt.Println(v) // prints 3, not 9

So the operation of square on its v has no effect on main‘s v. So too the formal parameter s of double is an independent copy of the slice s declared in mainnot a pointer to main‘s s value.

package main

import "fmt"

func double(s []int) {
        s = append(s, s...)

func main() {
        s := []int{1, 2, 3}
        fmt.Println(s, len(s)) // prints [1 2 3] 3

The slightly unusual nature of a Go slice variable is it’s passed around as a value, not than a pointer. 90% of the time when you declare a struct in Go, you will pass around a pointer to values of that struct.5 This is quite uncommon, the only other example of passing a struct around as a value I can think of off hand is time.Time.

It is this exceptional behaviour of slices as values, rather than pointers to values, that can confuses Go programmer’s understanding of how slices work. Just remember that any time you assign, subslice, or pass or return, a slice, you’re making a copy of the three fields in the slice header; the pointer to the underlying array, and the current length and capacity.

Putting it all together

I’m going to conclude this post on the example of a slice as a stack that I opened this post with:

package main

import "fmt"

func f(s []string, level int) {
        if level > 5 {
        s = append(s, fmt.Sprint(level))
        f(s, level+1)
        fmt.Println("level:", level, "slice:", s)

func main() {
        f(nil, 0)

Starting from main we pass a nil slice into f as level 0. Inside f we append to s the current level before incrementing level and recursing. Once level exceeds 5, the calls to f return, printing their current level and the contents of their copy of s.

level: 5 slice: [0 1 2 3 4 5]
level: 4 slice: [0 1 2 3 4]
level: 3 slice: [0 1 2 3]
level: 2 slice: [0 1 2]
level: 1 slice: [0 1]
level: 0 slice: [0]

You can see that at each level the value of s was unaffected by the operation of other callers of f, and that while four underlying arrays were created 6 higher levels of f in the call stack are unaffected by the copy and reallocation of new underlying arrays as a by-product of append.

Further reading

If you want to find out more about how slices work in Go, I recommend these posts from the Go blog:


How the Go runtime implements maps efficiently (without generics)

This post discusses how maps are implemented in Go. It is based on a presentation I gave at the GoCon Spring 2018 conference in Tokyo, Japan.

What is a map function?

To understand how a map works, let’s first talk about the idea of the map function. A map function maps one value to another. Given one value, called a key, it will return a second, the value.

map(key) → value

Now, a map isn’t going to be very useful unless we can put some data in the map. We’ll need a function that adds data to the map

insert(map, key, value)

and a function that removes data from the map

delete(map, key)

There are other interesting properties of map implementations like querying if a key is present in the map, but they’re outside the scope of what we’re going to discuss today. Instead we’re just going to focus on these properties of a map; insertion, deletion and mapping keys to values.

Go’s map is a hashmap

The specific map implementation I’m going to talk about is the hashmap, because this is the implementation that the Go runtime uses. A hashmap is a classic data structure offering O(1) lookups on average and O(n) in the worst case. That is, when things are working well, the time to execute the map function is a near constant.

The size of this constant is part of the hashmap design and the point at which the map moves from O(1) to O(n) access time is determined by its hash function.

The hash function

What is a hash function? A hash function takes a key of an unknown length and returns a value with a fixed length.

hash(key) → integer

this hash value is almost always an integer for reasons that we’ll see in a moment.

Hash and map functions are similar. They both take a key and return a value. However in the case of the former, it returns a value derived from the key, not the value associated with the key.

Important properties of a hash function

It’s important to talk about the properties of a good hash function as the quality of the hash function determines how likely the map function is to run near O(1).

When used with a hashmap, hash functions have two important properties. The first is stabilityThe hash function must be stable. Given the same key, your hash function must return the same answer. If it doesn’t you will not be able to find things you put into the map.

The second property is good distributionGiven two near identical keys, the result should be wildly different. This is important for two reasons. Firstly, as we’ll see, values in a hashmap should be distributed evenly across buckets, otherwise the access time is not O(1). Secondly as the user can control some of the aspects of the input to the hash function, they may be able to control the output of the hash function, leading to poor distribution which has been a DDoS vector for some languages. This property is also known as collision resistance.

The hashmap data structure

The second part of a hashmap is the way data is stored.
The classical hashmap is an array of buckets each of which contains a pointer to an array of key/value entries. In this case our hashmap has eight buckets (as this is the value that the Go implementation uses) and each bucket can hold up to eight entries each (again drawn from the Go implementation). Using powers of two allows the use of cheap bit masks and shifts rather than expensive division.

As entries are added to a map, assuming a good hash function distribution, then the buckets will fill at roughly the same rate. Once the number of entries across each bucket passes some percentage of their total size, known as the load factor, then the map will grow by doubling the number of buckets and redistributing the entries across them.

With this data structure in mind, if we had a map of project names to GitHub stars, how would we go about inserting a value into the map?

We start with the key, feed it through our hash function, then mask off the bottom few bits to get the correct offset into our bucket array. This is the bucket that will hold all the entries whose hash ends in three (011 in binary). Finally we walk down the list of entries in the bucket until we find a free slot and we insert our key and value there. If the key was already present, we’d just overwrite the value.

Now, lets use the same diagram to look up a value in our map. The process is similar. We hash the key as before, then masking off the lower 3 bits, as our bucket array contains 8 entries, to navigate to the fifth bucket (101 in binary). If our hash function is correct then the string "moby/moby" will always hash to the same value, so we know that the key will not be in any other bucket. Now it’s a case of a linear search through the bucket comparing the key provided with the one stored in the entry.

Four properties of a hash map

That was a very high level explanation of the classical hashmap. We’ve seen there are four properties you need to implement a hashmap;

    1. You need a hash function for the key.
    2. You need an equality function to compare keys.
    3. You need to know the size of the key and,
    4. You need to know the size of the value because these affect the size of the bucket structure, which the compiler needs to know, as you walk or insert into that structure, how far to advance in memory.

Hashmaps in other languages

Before we talk about the way Go implements a hashmap, I wanted to give a brief overview of how two popular languages implement hashmaps. I’ve chosen these languages as both offer a single map type that works across a variety of key and values.


The first language we’ll discuss is C++. The C++ Standard Template Library (STL) provides std::unordered_map which is usually implemented as a hashmap.

This is the declaration for std::unordered_map. It’s a template, so the actual values of the parameters depend on how the template is instantiated.

    class Key,                             // the type of the key
    class T,                               // the type of the value
    class Hash = std::hash<Key>,
           // the hash function
    class KeyEqual = std::equal_to<Key>,
   // the key equality function
    class Allocator = std::allocator< std::pair<const Key, T> >

> class unordered_map;

There is a lot here, but the important things to take away are;

  • The template takes the type of the key and value as parameters, so it knows their size.
  • The template takes a std::hash function specialised on the key type, so it knows how to hash a key passed to it.
  • And the template takes an std::equal_to function, also specialised on key type, so it knows how to compare two keys.

Now we know how the four properties of a hashmap are communicated to the compiler in C++’s std::unordered_map, let’s look at how they work in practice.

First we take the key, pass it to the std::hash function to obtain the hash value of the key. We mask and index into the bucket array, then walk the entries in that bucket comparing the keys using the std::equal_to function.


The second language we’ll discuss is Java. In java the hashmap type is called, unsurprisingly, java.util.Hashmap.

In java, the java.util.Hashmap type can only operate on objects, which is fine because in Java almost everything is a subclass of java.lang.Object. As every object in Java descends from java.lang.Object they inherit, or override, a hashCode and an equals method.

However, you cannot directly store the eight primitive types; boolean, int, short, long, byte, char, float, and double, because they are not subclasss of java.lang.Object. You cannot use them as a key, you cannot store them as a value. To work around this limitation, those types are silently converted into objects representing their primitive values. This is known as boxing.

Putting this limitation to one side for the moment, let’s look at how a lookup in Java’s hashmap would operate.

First we take the key and call its hashCode method to obtain the hash value of the key. We mask and index into the bucket array, which in Java is a pointer to an Entry, which holds a key and value, and a pointer to the next Entry in the bucket forming a linked list of entries.


Now that we’ve seen how C++ and Java implement a Hashmap, let’s compare their relative advantages and disadvantages.

C++ templated std::unordered_map


  • Size of the key and value types known at compile time.
  • Data structure are always exactly the right size, no need for boxing or indiretion.
  • As code is specialised at compile time, other compile time optimisations like inlining, constant folding, and dead code elimination, can come into play.

In a word, maps in C++ can be as fast as hand writing a custom map for each key/value combination, because that is what is happening.


  • Code bloat. Each different map are different types. For N map types in your source, you will have N copies of the map code in your binary.
  • Compile time bloat. Due to the way header files and template work, each file that mentions a std::unordered_map the source code for that implementation has to be generated, compiled, and optimised.

Java util Hashmap


  • One implementation of a map that works for any subclass of java.util.Object. Only one copy of java.util.HashMap is compiled, and its referenced from every single class.


  • Everything must be an object, even things which are not objects, this means maps of primitive values must be converted to objects via boxing. This adds gc pressure for wrapper objects, and cache pressure because of additional pointer indirections (each object is effective another pointer lookup)
  • Buckets are stored as linked lists, not sequential arrays. This leads to lots of pointer chasing while comparing objects.
  • Hash and equality functions are left as an exercise to the author of the class. Incorrect hash and equals functions can slow down maps using those types, or worse, fail to implement the map behaviour.

Go’s hashmap implementation

Now, let’s talk about how the hashmap implementation in Go allows us to retain many of the benfits of the best map implementations we’ve seen, without paying for the disadvantages.

Just like C++ and just like Java, Go’s hashmap written in Go. But–Go does not provide generic types, so how can we write a hashmap that works for (almost) any type, in Go?

Does the Go runtime use interface{}

No, the Go runtime does not use interface{} to implement its hashmap. While we have the container/{list,heap} packages which do use the empty interface, the runtime’s map implementation does not use interface{}.

Does the compiler use code generation?

No, there is only one copy of the map implementation in a Go binary. There is only one map implementation, and unlike Java, it doesn’t use interface{} boxing. So, how does it work?

There are two parts to the answer, and they both involve co-operation between the compiler and the runtime.

Compile time rewriting

The first part of the answer is to understand that map lookups, insertion, and removal, are implemented in the runtime package. During compilation map operations are rewritten to calls to the runtime. eg.

v := m["key"]     → runtime.mapaccess1(m, ”key", &v)
v, ok := m["key"] → runtime.mapaccess2(m, ”key”, &v, &ok)
m["key"] = 9001   → runtime.mapinsert(m, ”key", 9001)
delete(m, "key")  → runtime.mapdelete(m, “key”)

It’s also useful to note that the same thing happens with channels, but not with slices.

The reason for this is channels are complicated data types. Send, receive, and select have complex interactions with the scheduler so that’s delegated to the runtime. By comparison slices are much simpler data structures, so the compiler natively handles operations like slice access, len and cap while deferring complicated cases in copy and append to the runtime.

Only one copy of the map code

Now we know that the compiler rewrites map operations to calls to the runtime. We also know that inside the runtime, because this is Go, there is only one function called mapaccess, one function called mapaccess2, and so on.

So, how can the compiler can rewrite this

v := m[“key"]

into this

runtime.mapaccess(m, ”key”, &v)

without using something like interface{}? The easiest way to explain how map types work in Go is to show you the actual signature of runtime.mapaccess1.

func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer

Let’s walk through the parameters.

  • key is a pointer to the key, this is the value you provided as the key.
  • h is a pointer to a runtime.hmap structure. hmap is the runtime’s hashmap structure that holds the buckets and other housekeeping values 1.
  • t is a pointer to a maptype, which is odd.

Why do we need a *maptype if we already have a *hmap? *maptype is the special sauce that makes the generic *hmap work for (almost) any combination of key and value types. There is a maptype value for each unique map declaration in your program. There will be one that describes maps from strings to ints, from strings to http.Headers, and so on.

Rather than having, as C++ has, a complete map implementation for each unique map declaration, the Go compiler creates a maptype during compilation and uses that value when calling into the runtime’s map functions.

type maptype struct {

        typ           _type

        key         *_type
       elem        *_type

        bucket        *_type // internal type representing a hash bucket

        hmap          *_type // internal type representing a hmap

        keysize       uint8  // size of key slot

        indirectkey   bool   // store ptr to key instead of key itself

        valuesize     uint8  // size of value slot

        indirectvalue bool   // store ptr to value instead of value itself

        bucketsize    uint16 // size of bucket

        reflexivekey  bool   // true if k==k for all keys

        needkeyupdate bool   // true if we need to update key on overwrite


Each maptype contains details about properties of this kind of map from key to elem. It contains infomation about the key, and the elements. maptype.key contains information about the pointer to the key we were passed. We call these type descriptors.

type _type struct {

        size       uintptr

        ptrdata    uintptr // size of memory prefix holding all pointers

        hash       uint32

        tflag      tflag

        align      uint8

        fieldalign uint8

        kind       uint8

        alg       *typeAlg

        // gcdata stores the GC type data for the garbage collector.

        // If the KindGCProg bit is set in kind, gcdata is a GC program.

        // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.

        gcdata    *byte

        str       nameOff

        ptrToThis typeOff


In the _type type, we have things like it’s size, which is important because we just have a pointer to the key value, but we need to know how large it is, what kind of a type it is; it is an integer, is it a struct, and so on. We also need to know how to compare values of this type and how to hash values of that type, and that is what the _type.alg field is for.

type typeAlg struct {

        // function for hashing objects of this type

        // (ptr to object, seed) -> hash

        hash func(unsafe.Pointer, uintptr) uintptr

        // function for comparing objects of this type

        // (ptr to object A, ptr to object B) -> ==?

        equal func(unsafe.Pointer, unsafe.Pointer) bool


There is one typeAlg value for each type in your Go program.

Putting it all together, here is the (slightly edited for clarity) runtime.mapaccess1 function.

// mapaccess1 returns a pointer to h[key].  Never returns nil, instead

// it will return a reference to the zero object for the value type if

// the key is not in the map.

func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {

        if h == nil || h.count == 0 {

                return unsafe.Pointer(&zeroVal[0])


        alg := t.key.alg

        hash := alg.hash(key, uintptr(h.hash0))

        m := bucketMask(h.B)

        b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize)))

One thing to note is the h.hash0 parameter passed into alg.hash. h.hash0 is a random seed generated when the map is created. It is how the Go runtime avoids hash collisions.

Anyone can read the Go source code, so they could come up with a set of values which, using the hash ago that go uses, all hash to the same bucket. The seed value adds an amount of randomness to the hash function, providing some protection against collision attack.


I was inspired to give this presentation at GoCon because Go’s map implementation is a delightful compromise between C++’s and Java’s, taking most of the good without having to accomodate most of the bad.

Unlike Java, you can use scalar values like characters and integers without the overhead of boxing. Unlike C++, instead of N runtime.hashmap implementations in the final binary, there are only N runtime.maptype values, a substantial saving in program space and compile time.

Now I want to be clear that I am not trying to tell you that Go should not have generics. My goal today was to describe the situation we have today in Go 1 and how the map type in Go works under the hood.  The Go map implementation we have today is very fast and provides most of the benefits of templated types, without the downsides of code generation and compile time bloat.

I see this as a case study in design that deserves recognition.

Containers versus Operating Systems

What does a distro provide?

The most popular docker base container image is either busybox, or scratch. This is driven by a movement that is equal parts puritanical and pragmatic. The puritan asks “Why do I need to run init(1) just to run my process?” The pragmatist asks “Why do I need a 700 meg base image to deploy my application?” And both, seeking immutable deployment units ask “Is it a good idea that I can ssh into my container?” But let’s step back for a second and look at the history of how we got to the point where questions like this are even a thing.

In the very beginnings, there were no operating systems. Programs ran one at a time with the whole machine at their disposal. While efficient, this created a problem for the keepers of these large and expensive machines. To maximise their investment, the time between one program finishing and another starting must be kept to an absolute minimum; hence monitor programs and batch processing was born.

Monitors started as barely more than watchdog timers. They knew how to load the next program off tape, then set an alarm if the program ran too long. As time went on, monitors became job control–quasi single user operating systems where the operators could schedule batch jobs with slightly more finesse than the previous model of concatenating them in the card reader.1

In response to the limitations of batch processing, and with the help of increased computing resources, interactive computing was born. Interactive computing allowing multiple users to interact with the computer directly, time slicing, or time sharing, the resources between users to present the illusion of each program having a whole computer to itself.

“The UNIX kernel is an I/O multiplexer more than a complete operating system. This is as it should be.” — Ken Thompson, BSTJ, 1978.

interactive computing in raw terms was less efficient than batch, however it recognised that the potential to deliver programs faster outweighed a less than optimal utilisation of the processor; a fact borne out by the realisation that programming time was not benefiting from the same economies of scale that Moore’s law was delivering for hardware. Job control evolved to became what we know as the kernel, a supervisor program which sits above the raw hardware, portioning it out and mediating access to hardware devices.

With interactive users came the shell, a place to start programs, and return once they completed. The shell presented an environment, a virtual work space to organise your work, communicate with others, and of course customise. Customise with programs you wrote, programs you got from others, and programs that you collaborated with your coworkers on.

Interactive computing, multi user systems and then networking gave birth to the first wave of client/server computing–the server was your world, your terminal was just a pane of glass to interact with it. Thus begat userspace, a crowded bazaar of programs, written in many languages, traded, sold, swapped and sometimes stolen. A great inter breeding between the UNIX vendors produced a whole far larger than the sum of its parts.

Each server was an island, lovingly tended by operators, living for years, slowly patched and upgraded, becoming ever more unique through the tide of software updates and personnel changes.

Skip forward to Linux and the GNU generation, a kernel by itself does not serve the market, it needs a user space of tools to attract and nurture users accustomed to the full interactive environment.

But that software was hard, and messy, and spread across a million ftp, tucows, sourceforge, and cvs servers. Their installation procedures are each unique, their dependencies are unknown or unmanaged–in short, a job for an expert. Thus distributions became experts at packaging open source software to work together as a coherent interactive userspace story.

Container sprawl

We used to just have lots of servers, drawing power, running old software, old operating systems, hidden under people’s desks, and sometimes left running behind dry wall. Along came virtualisation to sweep away all the old, slow, flaky, out of warranty hardware. Yet the software remained, and multiplied.

Vmsprawl, it was called. Now free from a purchase order and a network switch port, virtual machines could spawn faster than rabbits. But their lifespan would be much longer.

Back when physical hardware existed, you could put labels on things, assign them to operators, have someone to blame, or at least ask if the operating system was up to date, but virtual machines became ephemeral, multitudinous, and increasingly, redundant,

Now that a virtual machines’ virtual bulk has given way to containers, what does that mean for the security and patching landscape? Surely it’s as bad, if not worse. Containers can multiply even faster than VMs and at such little cost compared to their bloated cousins that the problem could be magnified many times over. Or will it?

The problem is maintaining the software you didn’t write. Before containers that was everything between you and the hardware; obviously a kernel, that is inescapable, but the much larger surface area (in recent years ballooning to a DVD’s girth) was the userland. The gigabytes of software that existed to haul the machine onto the network, initialise its device drivers, scrub its /tmp partition, and so on.

But what if there was no userland? What if the network was handled for you, truly virtualised at layer 3, not layer 1. Your volumes were always mounted and your local storage was fleeting, so nothing to scrub. What would be the purpose of all those decades of lovingly crafted userland cruft?

If interactive software goes unused, was it ever installed at all?

Immutable images

Netflix tells us that immutable images are the path to enlightenment. Built it once, deploy it often. If there is a problem, an update, a software change, a patch, or a kernel fix, then build another image and roll it out. Never change something in place. This mirrors the trend towards immutability writ large by the functional programming tidal wave.

While Netflix use virtual machines, and so need software to configure their (simulated) hardware and software to plumb their (simulated) network interfaces to get to the point of being able to launch the application, containers leave all these concerns to the host. A container is spawned with any block devices or network interfaces required already mounted or plumbed a priori.

So, if you remove the requirement, and increasingly, the ability, to change the contents of the running image, and you remove the requirement to prepare the environment before starting the application, because the container is created with all its facilities already prepared, why do you need a userland inside a container?

Debugging? Possibly.

Today there are many of my generation who would feel helpless without being about to ssh to a host, run their favourite (and disparate) set of inspection tools. But a container is just a process inside a larger host operating system, so do you diagnosis there instead. Unlike virtual machines, these are not black boxes, the host operating system has far more capability to inspect and diagnose a guest than the guest itself–so leave your diagnosis tools on the host. And your ssh daemon, for good measure.

Updates, updates. Updates!

Why do we have operating system distros? In a word, outsourcing.

Sure, every admin could subscribe to the mailing lists of all the software packages installed on the servers they maintain (you do know all the software installed on the machines you are responsible for, right?) and then download, test, certify, upgrade the software promptly after being notified. Sound’s simple. Any admin worth hiring should be able to do this.

Sure, assuming you can find an admin who wants to do this grunt work, and that they can keep up with the workload, and that they can service more than a few machines before they’re hopelessly chasing their tails.

No, of course not, we outsource this to operating system vendor. In return for using outdated versions of software, distros will centralise the triage, testing and preparation of upgrades and patches.

This is the reason that a distro and its package management tool of choice are synonymous. Without a tool to automate the dissemination, installation and upgrade of packaged software, distro vendors would have no value. And without someone to marshal unique snowflake open source software into a unified form, package management software would have no value.

No wonder that the revenue model for all open source distro vendors centers around tooling that automates the distribution of update packages.

The last laugh

Ironically, the last laugh in this tale may be the closed source operating system vendors. It was Linux and open source that destroyed the proprietary UNIX market after the first dot com crash.

Linux rode Moore’s law to become the server operating system for the internet, and made kings of the operating system distributors. But it’s Linux that is driving containers, at least in their current form, and Linux containers, or more specifically a program that communicates directly with the kernel syscall api inside a specially prepared process namespace, is defining the new normal for applications.

Containers are eating the very Linux distribution market which enabled their creation.

OSX and Windows may be relegated to second class citizens–the clients in the client/server or client/container equation–but at least nobody is asking difficult questions about the role of their userspace.

Whither distros

What is the future of operating system distributions? Their services, while mature, scalable, well integrated, and expertly staffed, will unfortunately be priced out of the market. History tells us this.

In the first dot com bust, companies retreated from expensive proprietary software, not because it wasn’t good, not because it wasn’t extensible or changeable, but because it was too expensive. With no money coming in, thousands of dollars of opex walking out the door in licence fees was unsustainable.

The companies that survived the crash, or were born in its wreckage, chose open source software. Software that was arguably less mature, less refined, at the time, but with a price tag that was much more approachable. They kept their investors money in the bank, rode the wave of hardware improvements, and by pulling together in a million loosely organised software projects created a free (as in free puppy) platform to build their services on top–trading opex for some risk that they may have no-one to blame if their free software balloon sprang a leak.

Now, it is the Linux distributors who are chasing the per seat or per cpu licence fees. Offering scaled out professional services in the form of a stream of software updates and patches, well tested and well integrated.

But, with the exception of the kernel–which is actually provided by the host operating system, not the container–all those patches and updates are for software that is not used by the container, and in the case of our opening examples, busybox and scratch. not present. The temptation to go it alone, cut out the distro vendors, backed by the savings in licence fees is overwhelming.

What can distros do?

What would you do if you woke up one day to find that you owned the best butchers shop in a town that had decided to become vegetarian en mass?