Category Archives: Go

How to find out which Go version built your binary

This is a short post describing the procedure for discovering which version of Go was used to compile a Go binary.

This procedure relies on the fact that each Go program includes a copy of the version string reported by runtime.Version() . Linker magic ensures that this value will be present in the final binary irrespective of whether runtime.Version() is called by the resulting program. The value in question is stored in the runtime.buildVersion variable and can be recovered by a debugger.

The rest of this post describes the mechanisms for recovering the contents of runtime.buildVersion on various platforms.

Linux/FreeBSD/OpenBSD/NetBSD

If you’re on a Linux or *BSD platform, you can recover the binary build version with gdb.

% gdb $HOME/bin/godoc
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
(gdb) p 'runtime.buildVersion'
$1 = 0xa9ceb8 "go1.8.3"

Darwin

The debugging situation on OS X isn’t great, but here are several options.

gdb

gdb was removed from the XCode toolchain following the switch from gcc to llvm. If you are running a version of XCode that has gdb, you should used the instructions from the previous section.

Delve

Delve can be used to print the value of runtime.buildVersion.

% dlv exec $HOME/bin/godoc
Type 'help' for list of commands.
(dlv) b main.main
Breakpoint 1 set at 0x15596eb for main.main() ./golang.org/x/tools/cmd/godoc/main.go:156
(dlv) c
> main.main() ./golang.org/x/tools/cmd/godoc/main.go:156 (hits goroutine(1):1 total:1) (PC: 0x15596eb)
   151:                 }
   152:         }
   153:         log.Fatalf("too many redirects")
   154: }
   155:
=> 156: func main() {
   157:         flag.Usage = usage
   158:         flag.Parse()
   159:
   160:         playEnabled = *showPlayground
   161:
(dlv) p runtime.buildVersion
"go1.8.1"

lldb

Christian Witts reports on Twitter that XCode 8.3.3 ships with a version of lldb, version 370.0.42, that can interpret the Go string syntax.

$ lldb $HOME/bin/godoc
(lldb) b main.main
(lldb) run
(lldb) p runtime.buildVersion

I’ve tested earlier versions of lldb and found they do not work. Instread, use delve

Windows

Good news, everyone. Brian Ketelsen of GopherCon and GoTime.fm fame, reports that delve works perfectly on Windows for recovering this binaries’ build version.

PS C:\Users\bkete\go\src\http://github.com \derekparker\delve\cmd\dlv> dlv exec C:\Users\bkete\go\bin\dlv.exe
Type 'help' for list of commands.
(dlv) b main.main
Breakpoint 1 set at 0x8ec666 for main.main() c:/Users/bkete/go/src/github.com/derekparker/delve/cmd/dlv/main.go:11
(dlv) c
> main.main() c:/Users/bkete/go/src/github.com/derekparker/delve/cmd/dlv/main.go:11 (hits goroutine(1):1 total:1) (PC: 0x8ec666)
     6: )
     7:
     8: // Build is the git sha of this binaries build.
     9: var Build string
    10:
=>  11: func main() {
    12:         http://version.DelveVersion.Build  = Build
    13:         http://cmds.New ().Execute()
    14: }
(dlv) p runtime.buildVersion
"go1.8.1"

If someone wants to figure out the correct WinDbg or Visual Studio Debugger incantation, please let me know and I’ll link to you from this post.

Simplicity Debt Redux

In my previous post I discussed my concerns the additional complexity adding generics or immutability would bring to a future Go 2.0. As it was an opinion piece, I tried to keep it around 500 words. This post is an exploration of the most important (and possibly overlooked) point of that post.

Indeed, the addition of [generics and/or immutability] would have a knock-on effect that would profoundly alter the way error handling, collections, and concurrency are implemented. 

Specifically, what I believe would be the possible knock-on effect of adding generics or immutability to the language.

Error handling

A powerful motivation for adding generic types to Go is to enable programmers to adopt a monadic error handling pattern. My concerns with this approach have little to do with the notion of the maybe monad itself. Instead I want to explore the question of how this additional form of error handling might be integrated into the stdlib, and thus the general population of Go programmers.

Right now, to understand how io.Reader works you need to know how slices work, how interfaces work, and know how nil works. If the if err != nil { return err } idiom was replaced by an option type or maybe monad, then everyone who wanted to do basic things like read input or write output would have to understand how option types or maybe monads work in addition to discussion of what templated types are, and how they are implemented in Go.

Obviously it’s not impossible to learn, but it is more complex than what we have today. Newcomers to the language would have to integrate more concepts before they could understand basic things, like reading from a file.

The next question is, would this monadic form become the single way errors are handled? It seems confusing, and gives unclear guidiance to newcomers to Go 2.0, to continue to support both the error interface model and a new monadic maybe type. Also, if some form of templated maybe type was added, would it be a built in, like error, or would it have to be imported in almost every package. Note: we’ve been here before with os.Error.

What began as the simple request to create the ability to write a templated maybe or option type has ballooned into a set of question that would affect every single Go package ever written.

Collections

Another reason to add templated types to Go is to facilitate custom collection types without the need for interface{} boxing and type assertions.

On the surface this sounds like a grand idea, especially as these types are leaking into the standard library anyway. But that leaves the question of what to do with the built in slice and map types. Should slices and maps co-exist with user defined collections, or should they be removed in favour of defining everything as a generic type?

To keep both sounds redundant and confusing, as all Go developers would have to be fluent in both and develop a sophisticated design sensibility about when and where to choose one over the other. But to remove slices and maps in favour of collection types provided by a library raises other questions.

Slicing

For example, if there is no slice type, only types like a vector or linked list, what happens to slicing? Does it go away, if so, how would that impact common operations like handling the result a call to io.Reader.Read? If slicing doesn’t go away, would that require the addition of operator overloading so that user defined collection types can implement a slice operator?

Then there are questions on how to marry the built in map type with a user defined map or set. Should user defined maps support the index and assignment operators? If so, how could a user defined map offer both the one and two return value forms of lookup without requiring polymophic dispatch based on the number of return arguments? How would those operators work in the presence of set operations which have no value, only a key?

Which types could use the delete function? Would delete need to be modified to work with types that implement some kind of Deleteable interface? The same questions apply to append, lencap, and copy.

What about addressability? Values in the built in map type are not addressable, but should that be permitted or disallowed for user defined map types? How would that interact with operator overloading designed to make user defined maps look more like the built in map?

What sounded like a good idea on paper—make it possible for programmers to define their own efficient collection data types—has highlighted how deeply integrated the built in map and slice are and spawned not only a requirement for templated types, but operator overloading, polymorphic dispatch, and some kind of return value addressability semantics.

How could you implement a vector?

So, maybe you make the argument that now we have templated types we can do away with the built in slice and map, and replace them with a Java-esque list of collection types.

Go’s Pascal-like array type has a fixed size known at compile time. How could you implement a growable vector without resorting to unsafe hacks? I’ll leave that as an exercise to the reader. But I put it to you that if you cannot implement simple templated vector type with the memory safety we enjoy today with slices, then that is a very strong design smell.

Iteration

I’ll admit that the inability to use the for ... range statement over my own types was something that frustrated me for a long time when I came to Go, as I was accustomed to the flexibility of the iterator types in the Java collections library.

But iterating over in-memory data structures is boring—what you really want to be able to do is compose iterators over database results and network requests. In short, data from outside your process—and when data is outside your process, retrieving it might fail. In that case you have a choice, does your Iterable interface return a value, a value and an error, or perhaps you go down the option type route. Each would require a new form of range loop semantic sugar in an area which already contains its share of footguns.

You can see that adding the ability to write template collection types sounds great on paper, but in practice it would perpetuate a situation where the built in collection types live on in addition to their user defined counterparts. Each would have their strengths and weaknesses, and a Go developer would have to become proficient in both. This is something that Go developers just don’t have to think about today as slices and maps are practically ubiquitous.

Immutability

Russ wrote at the start of the year that a story for reference immutability was an important area of exploration for the future of Go. Having surveyed hundreds of Go packages and found few which are written with an understanding of the problem of data races—let alone actually tried running their tests under the race detector—it is tempting to agree with Russ that the ‘after the fact’ model of checking for races at run time has some problems.

On balance, after thinking about the problems of integrating templated types into Go, I think if I had to choose between generics and immutability, I’d choose the latter.

But the ability to mark a function parameter as const is insufficient, because while it restricts the receiver from mutating the value, it does not prohibit the caller from doing so, which is the majority of the data races I see in Go programs today. Perhaps what Go needs is not immutability, but ownership semantics.

While the Rust ownership model is undoubtedly correctiff your program complies, it has no data races—nobody can argue that the ownership model is simple or easy for newcomers. Nor would adding an extra dimension of immutability to every variable declaration in Go be simple as it would force every user of the language to write their programs from the most pessimistic standpoint of assuming every variable will be shared and will be mutated concurrently.

In conclusion

These are some of the knock on effects that I see of adding generics or immutability to Go. To be clear, I’m not saying that it should not be done, in fact in my previous post I argued the opposite.

What I want to make clear is adding generics or immutability has nothing to do with the syntax of those features, little to do with their underlying implementation, and everything to do with the impact on the overall complexity budget of the language and its libraries, that these features would unlock.

David Symonds argued years ago that there would be no benefit in adding generics to Go if they were not used heavily in the stdlib. The question, and concern, I have is; would the result be more complex than what we have today with our quaint built in slice, map, and error types?

I think it is worth keeping in mind the guiding principals of the language—simplicity and readability. The design of Go does not follow the accretive model of C++ or Java The goal is not to reinvent those languages, minus the semicolons.

Simplicity Debt

Fifteen years ago Python’s GIL wasn’t a big issue. Concurrency was something dismissed as probably unnecessary. What people really was needed was a faster interpreter, after all, who had more than one CPU? But, slowly, as the requirement for concurrency increased, the problems with the GIL increased.

By the time this decade rolled around, Node.js and Go had arrived on the scene, highlighting the need for concurrency as a first class concept. Various async contortions papered over the single threaded cracks of Python programs, but it was too late. Other languages had shown that concurrency must be a built-in facility, and Python had missed the boat.

When Go launched in 2009, it didn’t have a story for templated types. First we said they were important, but we didn’t know how to implement them. Then we argued that you probably didn’t need them, instead Go programmers should focus on interfaces, not types. Meanwhile Rust, Nim, Pony, Crystal, and Swift showed that basic templated types are a useful, and increasingly, expected feature of any language—just like concurrency.

There is no question that templated types and immutability are on their way to becoming mandatory in any modern programming language. But there is equally no question that adding these features to Go would make it more complex.

Just as efforts to improve Go’s dependency management situation have made it easier to build programs that consume larger dependency graphs, producing larger and more complex pieces of software, efforts to add templated types and immutability to the language would unlock the ability to write more complex, less readable software. Indeed, the addition of these features would have a knock on effect that would profoundly alter the way error handling, collections, and concurrency are implemented.

I have no doubt that adding templated types to Go will make it a more complicated language, just as I have no doubt that not adding them would be a mistake–lest Go find itself, like Python, on the wrong side of history. But, no matter how important and useful templated types and immutability would be, integrating them into a hypothetical Go 2 would decrease its readability and increase compilation times—two things which Go was designed to address. They would, in effect, impose a simplicity debt.

If you want generics, immutability, ownership semantics, option types, etc, those are already available in other languages. There is a reason Go programmers choose to program in Go, and I believe that reason stems from our core tenets of simplicity and readability. The question is, how can we pay down the cost in complexity of adding templated types or immutability to Go?

Go 2 isn’t here yet, but its arrival is a lot more certain than previously believed. As it stands now, generics or immutability can’t just be added to Go and still call it simple. As important as the discussions on how to add these features to Go 2 would be, equal weight must be given to the discussion of how to first offset their inherent complexity.

We have to build up a bankroll to spend on the complexity generics and immutability would add, otherwise Go 2 will start its life in simplicity debt.

Next: Simplicity Debt Redux

Go, without package scoped variables

This is a thought experiment, what would Go look like if we could no longer declare variables at the package level? What would be the impact of removing package scoped variable declarations, and what could we learn about the design of Go programs?

I’m only talking about expunging var, the other five top level declarations would still be permitted as they are effectively constant at compile time. You can, of course, continue to declare variables at the function or block scope.

Why are package scoped variables bad?

But first, why are package scoped variables bad? Putting aside the problem of globally visible mutable state in a heavily concurrent language, package scoped variables are fundamentally singletons, used to smuggle state between unrelated concerns, encourage tight coupling and makes the code that relies on them hard to test.

As Peter Bourgon wrote recently:

tl;dr: magic is bad; global state is magic → [therefore, you want] no package level vars; no func init.

Removing package scoped variables, in practice

To put this idea to the test I surveyed the most popular Go code base in existence; the standard library, to see how package scoped variables were used, and assessed the effect applying this experiment would have.

Errors

One of the most frequent uses of public package level var declarations are errors; io.EOF,
sql.ErrNoRowscrypto/x509.ErrUnsupportedAlgorithm, and so on. Removing the use of package scoped variables would remove the ability to use public variables for sentinel error values. But what could be used to replace them?

I’ve written previously that you should prefer behaviour over type or identity when inspecting errors. Where that isn’t possible, declaring error constants removes the potential for modification which retaining their identity semantics.

The remaining error variables are private declarations which give a symbolic name to an error message. These error values are unexported so they cannot be used for comparison by callers outside the package. Declaring them at the package level, rather than at the point they occur inside a function negates the opportunity to add additional context to the error. Instead I recommend using something like pkg/errors to capture a stack trace at the point the error occurs.

Registration

A registration pattern is followed by several packages in the standard library such as net/http, database/sql, flag, and to a lesser extent log. It commonly involves a package scoped private map or struct which is mutated by a public function—a textbook singleton.

Not being able to create a package scoped placeholder for this state would remove the side effects in the image, database/sql, and crypto packages to register image decoders, database drivers and cryptographic schemes. However, this is precisely the magic that Peter is referring to–importing a package for the side effect of changing some global state of your program is truly spooky action at a distance.

Registration also promotes duplicated business logic. The net/http/pprof package registers itself, via a side effect with net/http.DefaultServeMux, which is both a potential security issue—other code cannot use the default mux without exposing the pprof endpoints—and makes it difficult to convince the net/http/pprof package to register its handlers with another mux.

If package scoped variables were no longer used, packages like net/http/pprof could provide a function that registers routes on a supplied http.ServeMux, rather than relying on side effects to altering global state.

Removing the ability to apply the registry pattern would also solve the issues encountered when multiple copies of the same package are imported in the final binary and try to register themselves during startup.

Interface satisfaction assertions

The interface satisfaction idiom

var _ SomeInterface = new(SomeType)

occurred at least 19 times in the standard library. In my opinion these assertions are tests. They don’t need to be compiled, only to be eliminated, every time you build your package. Instead they should be moved to the corresponding _test.go file. But if we’re prohibiting package scoped variables, this prohibition also applies to tests, so how can we keep this test?

One option is to move the declaration from package scope to function scope, which will still fail to compile if SomeType stop implementing SomeInterface

func TestSomeTypeImplementsSomeInterface(t *testing.T) {
       // won't compile if SomeType does not implement SomeInterface
       var _ SomeInterface = new(SomeType)
}

But, as this is actually a test, it’s not hard to rewrite this idiom as a standard Go test.

func TestSomeTypeImplementsSomeInterface(t *testing.T) {
       var i interface{} = new(SomeType)
       if _, ok := i.(SomeInterface); !ok {
               t.Fatalf("expected %t to implement SomeInterface", i)
       }
}

As a side note, because the spec says that assignment to the blank identifier must fully evaluate the right hand side of the expression, there are probably a few suspicious package level initialisation constructs hidden in those var declarations.

It’s not all beer and skittles

The previous sections showed that avoiding package scoped variables might be possible, but there are some areas of the standard library which have proved more difficult to apply this idea.

Real singletons

While I think that the singleton pattern is generally overplayed, especially in its registration form, there are always some real singleton values in every program. A good example of this is  os.Stdout and friends.

package os 

var (
        Stdin  = NewFile(uintptr(syscall.Stdin), "/dev/stdin")
        Stdout = NewFile(uintptr(syscall.Stdout), "/dev/stdout")
        Stderr = NewFile(uintptr(syscall.Stderr), "/dev/stderr")
)

There are a few problems with this declaration. Firstly Stdin, Stdout, and Stderr are of type *os.File, not their respective io.Reader or io.Writer interfaces. This makes replacing them with alternatives problematic. However the notion of replacing them is exactly the kind of magic that this experiment seeks to avoid.

As the previous constant error example showed, we can retain the singleton nature of the standard IO file descriptors, such that packages like log and fmt can address them directly, but avoid declaring them as mutable public variables with something like this:

package main

import (
        "fmt"
        "syscall"
)

type readfd int

func (r readfd) Read(buf []byte) (int, error) {
        return syscall.Read(int(r), buf)
}

type writefd int

func (w writefd) Write(buf []byte) (int, error) {
        return syscall.Write(int(w), buf)
}

const (
        Stdin  = readfd(0)
        Stdout = writefd(1)
        Stderr = writefd(2)
)

func main() {
        fmt.Fprintf(Stdout, "Hello world")
}

Caches

The second most common use of unexported package scoped variables are caches. These come in two forms; real caches made out of maps (see the registration pattern above) and sync.Pool, and quasi constant variables that ameliorate the cost of a compilation.

As an example the crypto/ecsda package has a zr type whose Read method zeros any buffer passed to it. The package keeps a single instance of zr around because it is embedded in other structs as an io.Reader, potentially escaping to the heap each time it is instantiated.

package ecdsa 

type zr struct {
        io.Reader
}

// Read replaces the contents of dst with zeros.
func (z *zr) Read(dst []byte) (n int, err error) {
        for i := range dst {
                dst[i] = 0
        }
        return len(dst), nil
}

var zeroReader = &zr{}

However zr doesn’t embed an io.Reader, it is an io.Reader, so the unused zr.Reader field could be eliminated, giving zr a width of zero. In my testing this modified type can be created directly where it is used without performance regression.

        csprng := cipher.StreamReader{
                R: zr{},
                S: cipher.NewCTR(block, []byte(aesIV)),
        }

Perhaps some of the caching decision could be revisited as the inlining and escape analysis options available to the compiler have improved significantly since the standard library was first written.

Tables

The last major use of  common use of private package scoped variables is for tables, as seen in the unicode, crypto/*, and math packages. These tables either encode constant data in the form of arrays of integer types, or less commonly simple structs and maps.

Replacing package scoped variables with constants would require a language change along the lines of #20443. So, fundamentally, providing there is no way to modify those tables at run time, they are probably a reasonable exception to this proposal.

A bridge too far

Even though this post was just a thought experiment, it’s clear that forbidding all package scoped variables is too draconian to be workable as a language precept. Addressing the bespoke uses of private var usage may prove impractical from a performance standpoint, would be akin to pinning a “kick me” sign to ones back and inviting all the Go haters to take a free swing.

However, I believe there are a few concrete recommendations that can be drawn from this exercise, without going to the extreme of changing the language spec.

  • Firstly, public var declarations should be eschewed. This is not a controversial conclusion and not one that is unique to Go. The singleton pattern is discouraged, and an unadorned public variable that can be changed at any time by any party that knows its name should be a design, and concurrency, red flag.
  • Secondly, where public package var declarations are used, the type of those variables should be carefully constructed to expose as little surface area as possible. It should not be the default to take a type expected to be used on a per instance basis, and assign it to a package scoped variable.

Private variable declarations are more nuanced, but certain patterns can be observed:

  • Private variables with public setters, which I labelled registries, have the same effect on the overall program design as their public counterparts. Rather than registering dependencies globally, they should instead be passed in during declaration using a constructor function, compact literal, config structure, or option function.
  • Caches of []byte vars can often be expressed as consts at no performance cost.  Don’t forget the compiler is pretty good at avoiding string([]byte) conversions where they don’t escape the function call.
  • Private variables that hold tables, like the unicode package, are an unavoidable consequence of the lack of a constant array type. As long as they are unexported, and do not expose any way to mutate them, they can be considered effectively constant for the purpose of this discussion.

The bottom line; think long and hard about adding package scoped variables that are mutated during the operation of your program. It may be a sign that you’ve introduced magic global state.

If a map isn’t a reference variable, what is it?

In my previous post I showed that Go maps are not reference variables, and are not passed by reference. This leaves the question, if maps are not references variables, what are they?

For the impatient, the answer is:

A map value is a pointer to a runtime.hmap structure.

If you’re not satisfied with this explanation, read on.

What is the type of a map value?

When you write the statement

m := make(map[int]int)

The compiler replaces it with a call to runtime.makemap, which has the signature

// makemap implements a Go map creation make(map[k]v, hint)
// If the compiler has determined that the map or the first bucket
// can be created on the stack, h and/or bucket may be non-nil.
// If h != nil, the map can be created directly in h.
// If bucket != nil, bucket can be used as the first bucket.
func makemap(t *maptype, hint int64, h *hmap, bucket unsafe.Pointer) *hmap

As you see, the type of the value returned from runtime.makemap is a pointer to a runtime.hmap structure. We cannot see this from normal Go code, but we can confirm that a map value is the same size as a uintptr–one machine word.

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	var m map[int]int
	var p uintptr
	fmt.Println(unsafe.Sizeof(m), unsafe.Sizeof(p)) // 8 8 (linux/amd64)
}

If maps are pointers, shouldn’t they be *map[key]value?

It’s a good question that if maps are pointer values, why does the expression make(map[int]int) return a value with the type map[int]int. Shouldn’t it return a *map[int]int? Ian Taylor answered this recently in a golang-nuts thread1.

In the very early days what we call maps now were written as pointers, so you wrote *map[int]int. We moved away from that when we realized that no one ever wrote `map` without writing `*map`.

Arguably renaming the type from *map[int]int to map[int]int, while confusing because the type does not look like a pointer, was less confusing than a pointer shaped value which cannot be dereferenced.

Conclusion

Maps, like channels, but unlike slices, are just pointers to runtime types. As you saw above, a map is just a pointer to a runtime.hmap structure.

Maps have the same pointer semantics as any other pointer value in a Go program. There is no magic save the rewriting of map syntax by the compiler into calls to functions in runtime/hmap.go.


Notes

  1. If you look far enough back in the history of Go repository, you can find examples of maps created with the new operator.

There is no pass-by-reference in Go

My post on pointers provoked a lot of debate about maps and pass by reference semantics. This post is a response to those debates.

To be clear, Go does not have reference variables, so Go does not have pass-by-reference function call semantics.

What is a reference variable?

In languages like C++ you can declare an alias, or an alternate name to an existing variable. This is called a reference variable.

#include <stdio.h>

int main() {
        int a = 10;
        int &b = a;
        int &c = b;

        printf("%p %p %p\n", &a, &b, &c); // 0x7ffe114f0b14 0x7ffe114f0b14 0x7ffe114f0b14
        return 0;
}

You can see that a, b, and c all refer to the same memory location. A write to a will alter the contents of b and c. This is useful when you want to declare reference variables in different scopes–namely function calls.

Go does not have reference variables

Unlike C++, each variable defined in a Go program occupies a unique memory location.

package main

import "fmt"

func main() {
        var a, b, c int
        fmt.Println(&a, &b, &c) // 0x1040a124 0x1040a128 0x1040a12c
}

It is not possible to create a Go program where two variables share the same storage location in memory. It is possible to create two variables whose contents point to the same storage location, but that is not the same thing as two variables who share the same storage location.

package main

import "fmt"

func main() {
        var a int
        var b, c = &a, &a
        fmt.Println(b, c)   // 0x1040a124 0x1040a124
        fmt.Println(&b, &c) // 0x1040c108 0x1040c110
}

In this example, b and c hold the same value–the address of a–however, b and c themselves are stored in unique locations. Updating the contents of b would have no effect on c.

But maps and channels are references, right?

Wrong. Maps and channels are not references. If they were this program would print false.

package main

import "fmt"

func fn(m map[int]int) {
        m = make(map[int]int)
}

func main() {
        var m map[int]int
        fn(m)
        fmt.Println(m == nil)
}

If the map m was a C++ style reference variable, the m declared in main and the m declared in fn would occupy the same storage location in memory. But, because the assignment to m inside fn has no effect on the value of m in main, we can see that maps are not reference variables.

Conclusion

Go does not have pass-by-reference semantics because Go does not have reference variables.

Next: If a map isn’t a reference variable, what is it?

Understand Go pointers in less than 800 words or your money back

This post is for programmers coming to Go who are unfamiliar with the idea of pointers or a pointer type in Go.

What is a pointer?

Simply put, a pointer is a value which points to the address of another. This is the textbook explanation, but if you’re coming from a language that doesn’t let you talk about address of a variable, it could very well be written in Cuneiform.

Let’s break this down.

What is memory?

Computer memory, RAM, can be thought of as a sequence of boxes, placed one after another in a line. Each box, or cell, is labeled with a unique number, which increments sequentially; this is the address of the cell, its memory location.

Each cell holds a single value. If you know the memory address of a cell, you can go to that cell and read its contents. You can place a value in that cell; replacing anything that was in there previously.

That’s all there is to know about memory. Everything the CPU does is expressed as fetching and depositing values into memory cells.

What is a variable?

To write a program that retrieves the value stored in memory location 200, multiples it by 3 and deposits the result into memory location 201, we could write something like this in pseudocode:

  • retrieve the value stored in address 200 and place it in the CPU.
  • multiple the value stored in the CPU by 3.
  • deposit the value stored in the CPU into memory location 201.


This is exactly how early programs were written; programmers would keep a list of memory locations, who used it, when, and what the value stored there represented.

Obviously this was tedious and error prone, and meant every possible value stored in memory had to be assigned an address during the construction of the program. Worse, this arrangement made it difficult to allocate storage to variables dynamically as the program ran– just imagine if you had to write large programs using only global variables.

To address this, the notion of a variable was created. A variable is just a convenient, alphanumeric pseudonym for a memory location; a label, or nickname.

Now, rather than talking about memory locations, we can talk about variables, which are convenient names we give to memory locations. The previous program can now be expressed as:

  • Retrieve the value stored in variable a and place it in the CPU.
  • multiple it by 3
  • deposit the value into the variable b.

This is the same program, with one crucial improvement–because we no longer need to talk about memory locations directly, we no longer need to keep track of them–that drudgery is left to the compiler.

Now we can write a program like

var a = 6
var b = a * 3

And the compiler will make sure that the variables a and b are assigned unique memory locations to hold their value for as long as needed.

What is a pointer?

Now that we know that memory is just a series of numbered cells, and variables are just nicknames for a memory location assigned by the compiler, what is a pointer?

A pointer is a value that points to the memory address of another variable.

The pointer points to memory address of a variable, just as a variable represents the memory address of value.

Let’s have a look at this program fragment

func main() {
	a := 200
	b := &a
	*b++
	fmt.Println(a)
}

On the first line of main we declare a new variable a and assign it the value 200.

Next we declare a variable b and assign it the address a. Remember that we don’t know the exact memory location where a is stored, but we can still store a‘s address in b.

The third line is probably the most confusing, because of the strongly typed nature of Go. b contains the address of variable a, but we want to increment the value stored in a. To do this we must dereference b, follow the pointer from b to a.

Then we add one the value, and store it back in the memory location stored in b.

The final line prints the value of a, showing that it has increased to 201.

Conclusion

If you are coming from a language with no notion of pointers, or where every variable is implicitly a pointer don’t panic, forming a mental model of how variables and pointers relate takes time and practice. Just remember this rule:

A pointer is a value that points to the memory address of another variable.

Next: There is no pass-by-reference in Go

Why Go?

A few weeks ago I was asked by a friend, “why should I care about Go”? They knew that I was passionate about Go, but wanted to know why I thought other people should care. This article contains three salient reasons why I think Go is an important programming language.

Safety

As individuals, you and I may be perfectly capable of writing a program in C that neither leaks memory or reuses it unsafely. However, with more than 40 years of experience, it is clear that collectively, programmers working in C are unable to reliably do so en masse.

Despite static code analysis, valgrind, tsan, and -Werror being available for a decades, there is scant evidence those tools have achieved widespread acknowledgement, let alone widespread adoption. In aggregate, programmers have shown they simply cannot safely manage their own memory. It’s time to move away from C.

Go does not rely on the programmer to manage memory directly, instead all memory allocation is managed by the language runtime, initialized before use, and bounds checked when necessary. It’s certainly not the first mainstream language that offered these safety guarantees, Java (1995) is probably a contender for that crown. The point being, the world has no appetite for unsafe programming languages, thus Go is memory safe by default.

Developer productivity

The point at which developer time became more expensive than hardware time was crossed back in the late 1970s. Developer productivity is a sprawling topic but it boils down to this; how much time do you spend doing useful work vs waiting for the compiler or hopelessly lost in a foreign codebase.

The joke goes that Go was developed while waiting for a C++ program to compile. Fast compilation is a key feature of Go and a key recruiting tool to attract new developers. While compilation speed remains a constant battleground, it is fair to say that compilations which take minutes in other languages, take seconds in Go.

More fundamental to the question of developer productivity, Go programmers realise that code is written to be read and so place the act of reading code above the act of writing it. Go goes so far as to enforce, via tooling and custom, that all code by formatted in a specific style. This removes the friction of learning a project specific language sub-dialect and helps spot mistakes because they just look incorrect.

Due to a focus on analysis and mechanical assistance, a growing set of tools that exist to spot common coding errors have been adopted by Go developers in a way that never struck a chord with C programmers—Go developers want tools to help them keep their code clean.

Concurrency

For more than a decade, chip designers have been warning that the free lunch is over. Hardware parallelism, from the lowliest mobile phone to the most power hungry server, in the form of more, slower, cpu cores, is only available if your language can utilise them. Therefore, concurrency needs to be built into the software we write to run on today’s hardware.

Go takes a step beyond languages that expose the operating system’s multi-process or multi-threading parallelism models by offering a lightweight concurrency model based on coroutines, or goroutines as they are known in Go. Goroutines allows the programmer to eschew convoluted callback styles while the language runtime makes sure that there will be just enough threads to keep your cores active.

The rule of three

These were my three reasons for recommending Go to my friend; safety, productivity, and concurrency. Individually, there are languages that cover one, possibly two of these domains, but it is the combination of all three that makes Go an excellent choice for mainstream programmers today.

I’m speaking at GopherChina and GopherCon Singapore

In April and May I’ll be speaking at GopherChina and GopherCon Singapore, respectively. This post is a teaser for the talks that were selected by the organisers. If you’re in the area, I hope you’ll come and hear me speak.

GopherChina

GopherChina is the third event in this conference series and this year will return to Shanghai. I was lucky to attend the event in 2016 and am looking forward to 2017.

The hidden #pragmas of Go

Go isn’t like C. It doesn’t have a preprocessor, it doesn’t have macros, and it certainly doesn’t have #define, but Go does have pragmas.

What are pragmas? The name come from the #pragma declaration that tells C compilers to alter their interpretation of a piece of code. Now, Go doesn’t have a #pragma directive, but it does have ways of altering the operation of the Go compiler via directive syntax hidden in comments.

This talk will explore the history of these directives, how and why they are used, and how you can, but probably shouldn’t, use them in your own code.

GopherCon Singapore

GopherCon Singapore is the latest in the GopherCon franchise, and as flight times go, relatively close to home. I’m delighted to have the opportunity to present at their inaugural conference in May.

Concurrency made Easy

In my experience, many people who come to Go do so because they have a problem where being able to run more than one task at a time in their program would be beneficial. Ruby and Python programmers come to Go because the concurrency story is much better, the same is true of Node programmers; the event loop is still inherently single threaded.

But, most programmers who stick with Go for a while tend to look back on their early efforts and say things like “wow, I really went overboard with channels” or “I went crazy with goroutines when I started writing Go. It was impossible to understand what the program did”. For people who learn Go formally from an instructor or a book, the concurrency section is always the last section they cover.

So there is a dichotomy here. Go’s headline feature is simple, lightweight concurrency. As a product the language sells itself on that feature alone. On the other hand, there is a narrative that concurrency isn’t actually that easy to use, otherwise people wouldn’t make it the last things in their books or classes, or perhaps more accurately, concurrency is not the solution to every problem.

With this as a background, I’d like to explore some strategies for using concurrency in Go without the pitfalls of convoluted code, the importance of memory ownership, and the best way to structure a Go program using goroutines.

Context is for cancelation

In my previous post I suggested that the best way to break the compile time coupling between the logger and the loggee was passing in a logger interface when constructing each major type in your program. The suggestion has been floated several times that logging is context specific, so maybe a logger can be passed around via a context.Context. I think this suggestion is flawed (as are most uses of context.Value, but that’s another story). This post explains why.

context.Value() is goroutine thread local storage

Using context.Context to pass a logger into a function is a poor design pattern. In effect context.Context is being used as a conduit to arbitrarily extend the API of any method that takes a context.Context value. It’s like Python’s **kwargs, or whatever the name is for that Ruby pattern of always passing a hash. Using context.Context in this way avoids an API break by smuggling data in the unstructured bag of values attached to the context. It’s thread local storage in a cheap suit.

It’s not just that values are boxed into an interface{} inside context.WithValue that I object to. The far more serious concern is there is no schema to this data, so there is no way for a method that takes a context to ensure that it contains the specific key required to complete the operation. context.Value returns nil if the key is not found, which means any code doing the naïve

log := ctx.Value("logger").(log.Logger)
log.Warn("something you'll ignore later")

will blow up if the "logger" key is not present.

Sure, you can check that the assertion succeeded, but I feel pretty confident that if this pattern were to become popular then people would eschew the two arg form of type assertion and just expect that the key always returned a valid logger. This would be especially true as logging in error paths is rarely tested, so you’ll hit this when you need it the most.

In my opinion passing loggers inside context.Context would be the worst solution to the problem of decoupling loggers from implementations. We’d have gone from an explicit compile time dependency to an implicit run time dependency, one that could not be enforced by the compiler.

To quote @freeformz

Loggers should be injected into dependencies. Full stop.

It’s verbose, but it’s the only way to achieve decoupled design.