Category Archives: Programming

Use internal packages to reduce your public API surface

In the beginning, before the go tool, before Go 1.0, the Go distribution stored the standard library in a subdirectory called pkg/ and the commands which built upon it in cmd/. This wasn’t so much a deliberate taxonomy but a by product of the original make based build system. In September 2014, the Go distribution dropped the pkg/ subdirectory, but then this tribal knowledge had set root in large Go projects and continues to this day.

I tend to view empty directories inside a Go project with suspicion. Often they are a hint that the module’s author may be trying to create a taxonomy of packages rather than ensuring each package’s name, and thus its enclosing directory, uniquely describes its purpose. While the symmetry with cmd/ for package main commands is appealing, a directory that exists only to hold other packages is a potential design smell.

More importantly, the boilerplate of an empty pkg/ directory distracts from the more useful idiom of an internal/ directory. internal/ is a special directory name recognised by the go tool which will prevent one package from being imported by another unless both share a common ancestor. Packages within an internal/ directory are therefore said to be internal packages.

To create an internal package, place it within a directory named internal/. When the go command sees an import of a package with internal/ in the import path, it verifies that the importing package is within the tree rooted at the parent of the internal/ directory.

For example, a package /a/b/c/internal/d/e/f can only be imported by code in the directory tree rooted at /a/b/c. It cannot be imported by code in /a/b/g or in any other repository.

If your project contains multiple packages you may find you have some exported symbols which are intended to be used by other packages in your project, but are not intended to be part of your project’s public API. Although Go has limited visibility modifiers–public, exported, symbols and private, non exported, symbols–internal packages provide a useful mechanism for controlling visibility to parts of your project which would otherwise be considered part of its public versioned API.

You can, of course, promote internal packages later if you want to commit to supporting that API; just move them up a directory level or two. The key is this process is opt-in. As the author, internal packages give you control over which symbols in your project’s public API without being forced to glob concepts together into unwieldy mega packages to avoid exporting them.

Be wary of functions which take several parameters of the same type

APIs should be easy to use and hard to misuse.

— Josh Bloch

A good example of a simple looking, but hard to use correctly, API is one which takes two or more parameters of the same type. Let’s compare two function signatures:

func Max(a, b int) int
func CopyFile(to, from string) error

What’s the difference between these functions? Obviously one returns the maximum of two numbers, the other copies a file, but that’s not the important thing.

Max(8, 10) // 10
Max(10, 8) // 10

Max is commutative; the order of its parameters does not matter. The maximum of eight and ten is ten regardless of if I compare eight and ten or ten and eight.

However, this property does not hold true for CopyFile.

CopyFile("/tmp/backup", "presentation.md")
CopyFile("presentation.md", "/tmp/backup")

Which one of these statements made a backup of your presentation and which one overwrite your presentation with last week’s version? You can’t tell without consulting the documentation. A code reviewer cannot know if you’ve got the order correct without consulting the documentation.

The general advice is to try to avoid this situation. Just like long parameter lists, indistinct parameter lists are a design smell.

A challenge

When this situation is unavoidable my solution to this class of problem is to introduce a helper type which will be responsible for calling CopyFile correctly.

type Source string

func (src Source) CopyTo(dest string) error {
	return CopyFile(dest, string(src))
}

func main() {
	var from Source = "presentation.md"
	from.CopyTo("/tmp/backup")
}

In this way CopyFile is always called correctly and, given its poor API can possibly be made private, further reducing the likelihood of misuse.

Can you suggest a better solution?

Don’t force allocations on the callers of your API

This is a post about performance. Most of the time when worrying about the performance of a piece of code the overwhelming advice should be (with apologies to Brendan Gregg) don’t worry about it, yet. However there is one area where I counsel developers to think about the performance implications of a design, and that is API design.

Because of the high cost of retrofitting a change to an API’s signature to address performance concerns, it’s worthwhile considering the performance implications of your API’s design on its caller.

A tale of two API designs

Consider these two Read methods:

func (r *Reader) Read(buf []byte) (int, error)
func (r *Reader) Read() ([]byte, error)

The first method takes a []byte buffer and returns the number of bytes read into that buffer and possibly an error that occurred while reading. The second takes no arguments and returns some data as a []byte or an error.

This first method should be familiar to any Go programmer, it’s io.Reader.Read. As ubiquitous as io.Reader is, it’s not the most convenient API to use. Consider for a moment that io.Reader is the only Go interface in widespread use that returns both a result and an error. Meditate on this for a moment. The standard Go idiom, checking the error and iff it is nil is it safe to consult the other return values, does not apply to Read. In fact the caller must do the opposite. First they must record the number of bytes read into the buffer, reslice the buffer, process that data, and only then, consult the error. This is an unusual API for such a common operation and one that frequently catches out newcomers.

A trap for young players?

Why is it so? Why is one of the central APIs in Go’s standard library written like this? A superficial answer might be io.Reader‘s signature is a reflection of the underlying read(2) syscall, which is indeed true, but misses the point of this post.

If we compare the API of io.Reader to our alternative, func Read() ([]byte, error), this API seems easier to use. Each call to Read() will return the data that was read, no need to reslice buffers, no need to remember the special case to do this before checking the error. Yet this is not the signature of io.Reader.Read. Why would one of Go’s most pervasive interfaces choose such an awkward API? The answer, I believe, lies in the performance implications of the APIs signature on the caller.

Consider again our alternative Read function, func Read() ([]byte, error). On each call Read will read some data into a buffer1 and return the buffer to the caller. Where does this buffer come from? Who allocates it? The answer is the buffer is allocated inside Read. Therefore each call to Read is guaranteed to allocate a buffer which would escape to the heap. The more the program reads, the faster it reads data, the more streams of data it reads concurrently, the more pressure it places on the garbage collector.

The standard libraries’ io.Reader.Read forces the caller to supply a buffer because if the caller is concerned with the number of allocations their program is making this is precisely the kind of thing they want to control. Passing a buffer into Read puts the control of the allocations into the caller’s hands. If they aren’t concerned about allocations they can use higher level helpers like ioutil.ReadAll to read the contents into a []byte, or bufio.Scanner to stream the contents instead.

The opposite, starting with a method like our alternative func Read() ([]byte, error) API, prevents callers from pooling or reusing allocations–no amount of helper methods can fix this. As an API author, if the API cannot be changed you’ll be forced to add a second form to your API taking a supplied buffer and reimplementing your original API in terms of the newer form. Consider, for example, io.CopyBuffer. Other examples of retrofitting APIs for performance reasons are the fmt package and the net/http package which drove the introduction of the sync.Pool type precisely because the Go 1 guarantee prevented the APIs of those packages from changing.


If you want to commit to an API for the long run, consider how its design will impact the size and frequency of allocations the caller will have to make to use it.

Clear is better than clever

This article is based on my GopherCon Singapore 2019 presentation. In the presentation I referenced material from my post on declaring variables and my GolangUK 2017 presentation on SOLID design. For brevity those parts of the talk have been elided from this article. If you prefer, you can watch the recording of the talk.


Readability is often cited as one of Go’s core tenets, I disagree. In this article I’ll discuss the differences between clarity and readability, show you what I mean by clarity and how it applies to Go code, and argue that Go programmers should strive for clarity–not just readability–in their programs.

Why would I read your code?

Before I pick apart the difference between clarity and readability, perhaps the question to ask is, “why would I read your code?” To be clear, when I say I, I don’t mean me, I mean you. And when I say your code I also mean you, but in the third person. So really what I’m asking is, “why would you read another person’s code?”

I think Russ Cox, paraphrasing Titus Winters, put it best:

Software engineering is what happens to programming when you add time and other programmers.

Russ Cox, GopherCon Singapore 2018

The answer to the question, “why would I read your code” is, because we have to work together. Maybe we don’t work in the same office, or live in the same city, maybe we don’t even work at the same company, but we do collaborate on a piece of software, or more likely consume it as a dependency.

This is the essence of Russ and Titus’ observation; software engineering is the collaboration of software engineers over time. I have to read your code, and you read mine, so that I can understand it, so that you can maintain it, and in short, so that any programmer can change it.

Russ is making the distinction between software programming and software engineering. The former is a program you write for yourself, the latter is a program, ​a project, a service, a product, ​that many people will contribute to over time. Engineers will come and go, teams will grow and shrink, requirements will change, features will be added and bugs fixed. This is the nature of software engineering.

We don’t read code, we decode it

It was sometime after that presentation that I finally realized the obvious: Code is not literature. We don’t read code, we decode it.

Peter Seibel

The author Peter Seibel suggests that programs are not read, but are instead decoded. In hindsight this is obvious, after all we call it source code, not source literature. The source code of a program is an intermediary form, somewhere between our concept–what’s inside our heads–and the computer’s executable notation.

In my experience, the most common complaint when faced with a foreign codebase written by someone, or some team, is the code is unreadable. Perhaps you agree with me?

But readability as a concept is subjective. Readability is nit picking about line length and variable names. Readability is holy wars about brace position. Readability is the hand to hand combat of style guides and code review guidelines that regulate the use of whitespace.

Clarity ≠ Readability

Clarity, on the other hand, is the property of the code on the page. Clear code is independent of the low level details of function names and indentation because clear code is concerned with what the code is doing, not just how it is written down.

When you or I say that a foreign codebase is unreadable, what I think what we really mean is, I don’t understand it. For the remainder of this article I want to try to explore the difference between clear code and code that is simply readable, because the goal is not how quickly you can read a piece of code, but how quickly you can grasp its meaning.

Keep to the left

Go programs are traditionally written in a style that favours guard clauses and preconditions. This encourages the successful path to proceed down the page rather than indented inside a conditional block. Mat Ryer calls this line of sight coding, because, the active part of your function is not at risk of sliding out of sight beyond the right hand margin of your screen.

By keeping conditional blocks short, and for the exceptional condition, we avoid nested blocks and potentially complex value shadowing. The successful flow of control continues down the page. At every point in the sequence of statements, if you’ve arrived at that point, you are confident that a growing set of preconditions holds true.

func ReadConfig(path string) (*Config, error) {
        f, err := os.Open(path)
        if err != nil {
                return nil, err
        }
        defer f.Close()
        // ...
 } 

The canonical example of this is the classic Go error check idiom; if err != nil then return it to the caller, else continue with the function. We can generalise this pattern a little and in pseudocode we have:

if some condition {
        // true: cleanup
        return
 }
 // false: continue 

If some condition is true, then return to the caller, else continue onwards towards the end of the function. 

This form holds true for all preconditions, error checks, map lookups, length checks, and so forth. The exact form of the precondition’s check changes, but the pattern is always the same; the cleanup code is inside the block, terminating with a return, the success condition lies outside the block, and is only reachable if the precondition is false.

Even if you are unsure what the preceding and succeeding code does, how the precondition is formed, and how the cleanup code works, it is clear to the reader that this is a guard clause.

Structured programming

Here we have a comp function that takes two ints and returns an int;

func comp(a, b int) int {
        if a < b {
                return -1
        }
        if a > b {
                return 1
        }
        return 0
} 

The comp function is written in a similar form to guard clauses from earlier. If a is less than b, the return -1 path is taken. If a is greater than b, the return 1 path is taken. Else, a and b are by induction equal, so the final return 0 path is taken.

func comp(a, b int) int {
        if condition A {
                body A
        }
        if condition B {
                 body B
        }
        return 0
} 

The problem with comp as written is, unlike the guard clause, someone maintaining this function has to read all of it. To understand when 0 is returned, the reader has to consult the conditions and the body of each clause. This is reasonable when you’re dealing with functions which fit on a slide, but in the real world complicated functions–​the ones we’re paid for our expertise to maintain–are rarely slide sized, and their conditions and bodies are rarely simple.

Let’s address the problem of making it clear under which condition 0 will be returned:

func comp(a, b int) int {
        if a < b {
                return -1
        } else if a > b {
                return 1
        } else {
                return 0
        }
} 

Now, although this code is not what anyone would argue is readable–​long chains of if else if statements are broadly discouraged in Go–​it is clearer to the reader that zero is only returned if none of the conditions are met.

How do we know this? The Go spec declares that each function that returns a value must end in a terminating statement. This means that the body of all conditions must return a value. Thus, this does not compile:

func comp(a, b int) int {
        if a > b {
                a = b // does not compile
        } else if a < b {
                return 1
        } else {
                return 0
        }
}

Further, it is now clear to the reader that this code isn’t actually a series of conditions. This is an example of selection. Only one path can be taken regardless of the operation of the condition blocks. Based on the inputs one of -1, 0, or 1 will always be returned. 

func comp(a, b int) int {
        if a < b {
                return -1
        } else if a > b {
                return 1
        } else {
                return 0
        }
} 

However this code is hard to read as each of the conditions is written differently, the first is a simple if a < b, the second is the unusual else if a > b, and the last conditional is actually unconditional.

But it turns out there is a statement which we can use to make our intention much clearer to the reader; switch.

func comp(a, b int) int {
        switch {
        case a < b:
                return -1
        case a > b:
                return 1
        default:
                return 0
        }
} 

Now it is clear to the reader that this is a selection. Each of the selection conditions are documented in their own case statement, rather than varying else or else if clauses.

By moving the default condition inside the switch, the reader only has to consider the cases that match their condition, as none of the cases can fall out of the switch block because of the default clause.1

Structured programming submerges structure and emphasises behaviour

–Richard Bircher, The limits of software

I found this quote recently and I think it is apt. My arguments for clarity are in truth arguments intended to emphasise the behaviour of the code, rather than be side tracked by minutiae of the structure itself. Said another way, what is the code trying to do, not how is it is trying to do it.

Guiding principles

I opened this article with a discussion of readability vs clarity and hinted that there were other principles of well written Go code. It seems fitting to close on a discussion of those other principles.

Last year Bryan Cantrill gave a wonderful presentation on operating system principles, wherein he highlighted that different operating systems focus on different principles. It is not that they ignore the principles that differ between their competitors, just that when the chips are down, they prioritise a core set. So what is that core set of principles for Go?

Clarity

If you were going to say readability, hopefully I’ve provided you with an alternative.

Programs must be written for people to read, and only incidentally for machines to execute.

–Hal Abelson and Gerald Sussman. Structure and Interpretation of Computer Programs

Code is read many more times than it is written. A single piece of code will, over its lifetime, be read hundreds, maybe thousands of times. It will be read hundreds or thousands of times because it must be understood. Clarity is important because all software, not just Go programs, is written by people to be read by other people. The fact that software is also consumed by machines is secondary.

The most important skill for a programmer is the ability to effectively communicate ideas.

–Gastón Jorquera

Legal documents are double spaced to aide the reader, but to the layperson that does nothing to help them comprehend what they just read. Readability is a property of how easy it was to read the words on the screen. Clarity, on the other hand, is the answer to the question “did you understand what you just read?”.

If you’re writing a program for yourself, maybe it only has to run once, or you’re the only person who’ll ever see it, then do what ever works for you. But if this is a piece of software that more than one person will contribute to, or that will be used by people over a long enough time that requirements, features, or the environment it runs in may change, then your goal must be for your program to be maintainable.

The first step towards writing maintainable code is making sure intent of the code is clear.

Simplicity

The next principle is obviously simplicity. Some might argue the most important principle for any programming language, perhaps the most important principle full stop.

Why should we strive for simplicity? Why is important that Go programs be simple?

The ability to simplify means to eliminate the unnecessary so that the necessary may speak

–Hans Hofmann

We’ve all been in a situation where we say “I can’t understand this code”. We’ve all worked on programs we were scared to make a change because we worried that it’ll break another part of the program; a part you don’t understand and don’t know how to fix. 

This is complexity. Complexity turns reliable software in unreliable software. Complexity is what leads to unmaintainable software. Complexity is what kills software projects. Clarity and simplicity are interlocking forces that lead to maintainable software.

Productivity

The last Go principle I want to highlight is productivity. Developer productivity boils down to this; how much time do you spend doing useful work verses waiting for your tools or hopelessly lost in a foreign code-base? Go programmers should feel that they can get a lot done with Go.

“I started another compilation, turned my chair around to face Robert, and started asking pointed questions. Before the compilation was done, we’d roped Ken in and had decided to do something.”

–Rob Pike, Less is Exponentially more

The joke goes that Go was designed while waiting for a C++ program to compile. Fast compilation is a key feature of Go and a key recruiting tool to attract new developers. While compilation speed remains a constant battleground, it is fair to say that compilations which take minutes in other languages, take seconds in Go. This helps Go developers feel as productive as their counterparts working in dynamic languages without the maintenance issues inherent in those languages.

Design is the art of arranging code to work today, and be changeable  forever.

–Sandi Metz

More fundamental to the question of developer productivity, Go programmers realise that code is written to be read and so place the act of reading code above the act of writing it. Go goes so far as to enforce, via tooling and custom, that all code be formatted in a specific style. This removes the friction of learning a project specific dialect and helps spot mistakes because they just look incorrect.

Go programmers don’t spend days debugging inscrutable compile errors. They don’t waste days with complicated build scripts or deploying code to production. And most importantly they don’t spend their time trying to understand what their coworker wrote.

Complexity is anything that makes software hard to understand or to modify.

–John Ousterhout, A Philosophy of Software Design

Something I know about each of you reading this post is you will eventually leave your current employer. Maybe you’ll be moving on to a new role, or perhaps a promotion, perhaps you’ll move cities, or follow your partner overseas. Whatever the reason, we must all consider the succession of the maintainership of the programs we create.

If we strive to write programs that are clear, programs that are simple, and to focus on the productivity of those working with us that will set all Go programmers in good stead.

Because if we don’t, as we move from job to job, we’ll leave behind programs which cannot be maintained. Programs which cannot be changed. Programs which are too hard to onboard new developers, and programs which feel like career digression for those that work on them.

If software cannot be maintained, then it will be rewritten; and that could be the last time your company invests in Go.

Constant Time

This essay is a derived from my dotGo 2019 presentation about my favourite feature in Go.


Many years ago Rob Pike remarked,

“Numbers are just numbers, you’ll never see 0x80ULL in a .go source file”.

—Rob Pike, The Go Programming Language

Beyond this pithy observation lies the fascinating world of Go’s constants. Something that is perhaps taken for granted because, as Rob noted, is Go numbers–constants–just work.
In this post I intend to show you a few things that perhaps you didn’t know about Go’s const keyword.

What’s so great about constants?

To kick things off, why are constants good? Three things spring to mind:

  • Immutability. Constants are one of the few ways we have in Go to express immutability to the compiler.
  • Clarity. Constants give us a way to extract magic numbers from our code, giving them names and semantic meaning.
  • Performance. The ability to express to the compiler that something will not change is key as it unlocks optimisations such as constant folding, constant propagation, branch and dead code elimination.

But these are generic use cases for constants, they apply to any language. Let’s talk about some of the properties of Go’s constants.

A Challenge

To introduce the power of Go’s constants let’s try a little challenge: declare a constant whose value is the number of bits in the natural machine word.

We can’t use unsafe.Sizeof as it is not a constant expression1. We could use a build tag and laboriously record the natural word size of each Go platform, or we could do something like this:

const uintSize = 32 << (^uint(0) >> 32 & 1)

There are many versions of this expression in Go codebases. They all work roughly the same way. If we’re on a 64 bit platform then the exclusive or of the number zero–all zero bits–is a number with all bits set, sixty four of them to be exact.

1111111111111111111111111111111111111111111111111111111111111111

If we shift that value thirty two bits to the right, we get another value with thirty two ones in it.

0000000000000000000000000000000011111111111111111111111111111111

Anding that with a number with one bit in the final position give us, the same thing, 1,

0000000000000000000000000000000011111111111111111111111111111111 & 1 = 1

Finally we shift the number thirty two one place to the right, giving us 642.

32 << 1 = 64

This expression is an example of a constant expression. All of these operations happen at compile time and the result of the expression is itself a constant. If you look in the in runtime package, in particular the garbage collector, you’ll see how constant expressions are used to set up complex invariants based on the word size of the machine the code is compiled on.

So, this is a neat party trick, but most compilers will do this kind of constant folding at compile time for you. Let’s step it up a notch.

Constants are values

In Go, constants are values and each value has a type. In Go, user defined types can declare their own methods. Thus, a constant value can have a method set. If you’re surprised by this, let me show you an example that you probably use every day.

const timeout = 500 * time.Millisecond
fmt.Println("The timeout is", timeout) // 500ms

In the example the untyped literal constant 500 is multiplied by time.Millisecond, itself a constant of type time.Duration. The rule for assignments in Go are, unless otherwise declared, the type on the left hand side of the assignment operator is inferred from the type on the right.500 is an untyped constant so it is converted to a time.Duration then multiplied with the constant time.Millisecond.

Thus timeout is a constant of type time.Duration which holds the value 500000000.
Why then does fmt.Println print 500ms, not 500000000?

The answer is time.Duration has a String method. Thus any time.Duration value, even a constant, knows how to pretty print itself.

Now we know that constant values are typed, and because types can declare methods, we can derive that constant values can fulfil interfaces. In fact we just saw an example of this. fmt.Println doesn’t assert that a value has a String method, it asserts the value implements the Stringer interface.

Let’s talk a little about how we can use this property to make our Go code better, and to do that I’m going to take a brief digression into the Singleton pattern.

Singletons

I’m generally not a fan of the singleton pattern, in Go or any language. Singletons complicate testing and create unnecessary coupling between packages. I feel the singleton pattern is often used not to create a singular instance of a thing, but instead to create a place to coordinate registration. net/http.DefaultServeMux is a good example of this pattern.

package http

// DefaultServeMux is the default ServeMux used by Serve.
var DefaultServeMux = &defaultServeMux

var defaultServeMux ServeMux

There is nothing singular about http.defaultServerMux, nothing prevents you from creating another ServeMux. In fact the http package provides a helper that will create as many ServeMux‘s as you want.

// NewServeMux allocates and returns a new ServeMux.
func NewServeMux() *ServeMux { return new(ServeMux) }

http.DefaultServeMux is not a singleton. Never the less there is a case for things which are truely singletons because they can only represent a single thing. A good example of this are the file descriptors of a process; 0, 1, and 2 which represent stdin, stdout, and stderr respectively.

It doesn’t matter what names you give them, 1 is always stdout, and there can only ever be one file descriptor 1. Thus these two operations are identical:

fmt.Fprintf(os.Stdout, "Hello dotGo\n")
syscall.Write(1, []byte("Hello dotGo\n"))

So let’s look at how the os package defines Stdin, Stdout, and Stderr:

package os

var (
        Stdin  = NewFile(uintptr(syscall.Stdin), "/dev/stdin")
        Stdout = NewFile(uintptr(syscall.Stdout), "/dev/stdout")
        Stderr = NewFile(uintptr(syscall.Stderr), "/dev/stderr")
)

There are a few problems with this declaration. Firstly their type is *os.File not the respective io.Reader or io.Writer interfaces. People have long complained that this makes replacing them with alternatives problematic. However the notion of replacing these variables is precisely the point of this digression. Can you safely change the value of os.Stdout once your program is running without causing a data race?

I argue that, in the general case, you cannot. In general, if something is unsafe to do, as programmers we shouldn’t let our users think that it is safe, lest they begin to depend on that behaviour.

Could we change the definition of os.Stdout and friends so that they retain the observable behaviour of reading and writing, but remain immutable? It turns out, we can do this easily with constants.

type readfd int

func (r readfd) Read(buf []byte) (int, error) {
       return syscall.Read(int(r), buf)
}

type writefd int

func (w writefd) Write(buf []byte) (int, error) {
        return syscall.Write(int(w), buf)
}

const (
        Stdin  = readfd(0)
        Stdout = writefd(1)
        Stderr = writefd(2)
)

func main() {
        fmt.Fprintf(Stdout, "Hello world")
}

In fact this change causes only one compilation failure in the standard library.3

Sentinel error values

Another case of things which look like constants but really aren’t, are sentinel error values. io.EOF, sql.ErrNoRows, crypto/x509.ErrUnsupportedAlgorithm, and so on are all examples of sentinel error values. They all fall into a category of expected errors, and because they are expected, you’re expected to check for them.

To compare the error you have with the one you were expecting, you need to import the package that defines that error. Because, by definition, sentinel errors are exported public variables, any code that imports, for example, the io package could change the value of io.EOF.

package nelson

import "io"

func init() {
        io.EOF = nil // haha!
}

I’ll say that again. If I know the name of io.EOF I can import the package that declares it, which I must if I want to compare it to my error, and thus I could change io.EOF‘s value. Historically convention and a bit of dumb luck discourages people from writing code that does this, but technically there is nothing to prevent you from doing so.

Replacing io.EOF is probably going to be detected almost immediately. But replacing a less frequently used sentinel error may cause some interesting side effects:

package innocent

import "crypto/rsa"

func init() {
        rsa.ErrVerification = nil // 🤔
}

If you were hoping the race detector will spot this subterfuge, I suggest you talk to the folks writing testing frameworks who replace os.Stdout without it triggering the race detector.

Fungibility

I want to digress for a moment to talk about the most important property of constants. Constants aren’t just immutable, its not enough that we cannot overwrite their declaration,
Constants are fungible. This is a tremendously important property that doesn’t get nearly enough attention.

Fungible means identical. Money is a great example of fungibility. If you were to lend me 10 bucks, and I later pay you back, the fact that you gave me a 10 dollar note and I returned to you 10 one dollar bills, with respect to its operation as a financial instrument, is irrelevant. Things which are fungible are by definition equal and equality is a powerful property we can leverage for our programs.

var myEOF = errors.New("EOF") // io/io.go line 38
fmt.Println(myEOF == io.EOF)  // false

Putting aside the effect of malicious actors in your code base the key design challenge with sentinel errors is they behave like singletons, not constants. Even if we follow the exact procedure used by the io package to create our own EOF value, myEOF and io.EOF are not equal. myEOF and io.EOF are not fungible, they cannot be interchanged. Programs can spot the difference.

When you combine the lack of immutability, the lack of fungibility, the lack of equality, you have a set of weird behaviours stemming from the fact that sentinel error values in Go are not constant expressions. But what if they were?

Constant errors

Ideally a sentinel error value should behave as a constant. It should be immutable and fungible. Let’s recap how the built in error interface works in Go.

type error interface {
        Error() string
}

Any type with an Error() string method fulfils the error interface. This includes user defined types, it includes types derived from primitives like string, and it includes constant strings. With that background, consider this error implementation:

type Error string

func (e Error) Error() string {
        return string(e)
}

We can use this error type as a constant expression:

const err = Error("EOF")

Unlike errors.errorString, which is a struct, a compact struct literal initialiser is not a constant expression and cannot be used.

const err2 = errors.errorString{"EOF"} // doesn't compile

As constants of this Error type are not variables, they are immutable.

const err = Error("EOF")
err = Error("not EOF")   // doesn't compile

Additionally, two constant strings are always equal if their contents are equal:

const str1 = "EOF"
const str2 = "EOF"
fmt.Println(str1 == str2) // true

which means two constants of a type derived from string with the same contents are also equal.

type Error string

const err1 = Error("EOF")
const err2 = Error("EOF")
fmt.Println(err1 == err2) // true```

Said another way, equal constant Error values are the same, in the way that the literal constant 1 is the same as every other literal constant 1.

Now we have all the pieces we need to make sentinel errors, like io.EOF, and rsa.ErrVerfication, immutable, fungible, constant expressions.

% git diff
diff --git a/src/io/io.go b/src/io/io.go
index 2010770e6a..355653b4b8 100644
--- a/src/io/io.go
+++ b/src/io/io.go
@@ -35,7 +35,12 @@ var ErrShortBuffer = errors.New("short buffer")
 // If the EOF occurs unexpectedly in a structured data stream,
 // the appropriate error is either ErrUnexpectedEOF or some other error
 // giving more detail.
-var EOF = errors.New("EOF")
+const EOF = ioError("EOF")
+
+type ioError string
+
+func (e ioError) Error() string { return string(e) }

This change is probably a bit of a stretch for the Go 1 contract, but there is no reason you cannot adopt a constant error pattern for your sentinel errors in the packages that you write.

In summary

Go’s constants are powerful. If you only think of them as immutable numbers, you’re missing out. Go’s constants let us compose programs that are more correct and harder to misuse.

Today I’ve outlined three ways to use constants that are more than your typical immutable number.

Now it’s over to you, I’m excited to see where you can take these ideas.

Why bother writing tests at all?

In previous posts and presentations I talked about how to test, and when to test. To conclude this series of I’m going to ask the question, why test at all?

Even if you don’t, someone will test your software

I’m sure no-one reading this post thinks that software should be delivered without being tested first. Even if that were true, your customers are going to test it, or at least use it. If nothing else, it would be good to discover any issues with the code before your customers do. If not for the reputation of your company, at least for your professional pride.

So, if we agree that software should be tested, the question becomes: who should do that testing?

The majority of testing should be performed by development teams

I argue that the majority of the testing should be done by development groups. Moreover, testing should be automated, and thus the majority of these tests should be unit style tests.

To be clear, I am not saying you shouldn’t write integration, functional, or end to end tests. I’m also not saying that you shouldn’t have a QA group, or integration test engineers. However at a recent software conference, in a room of over 1,000 engineers, nobody raised their hand when I asked if they considered themselves in a pure quality assurance role.

You might argue that the audience was self selecting, that QA engineers did not feel a software conference was relevant–or welcoming–to them. However, I think this proves my point, the days of one developer to one test engineer are gone and not coming back.

If development teams aren’t writing the majority of tests, who is?

Manual testing should not be the majority of your testing because manual testing is O(n)

Thus, if individual contributors are expected to test the software they write, why do we need to automate it? Why is a manual testing plan not good enough?

Manual testing of software or manual verification of a defect is not sufficient because it does not scale. As the number of manual tests grows, engineers are tempted to skip them or only execute the scenarios they think are could be affected. Manual testing is expensive in terms of time, thus dollars, and it is boring. 99.9% of the tests that passed last time are expected to pass again. Manual testing is looking for a needle in a haystack, except you don’t stop when you find the first needle.

This means that your first response when given a bug to fix or a feature to implement should be to write a failing test. This doesn’t need to be a unit test, but it should be an automated test. Once you’ve fixed the bug, or added the feature, now have the test case to prove it worked–and you can check them in together.

Tests are the critical component that ensure you can always ship your master branch

As a development team, you are judged on your ability to deliver working software to the business. No, seriously, the business could care less about OOP vs FP, CI/CD, table tennis or limited run La Croix.

Your super power is, at any time, anyone on the team should be confident that the master branch of your code is shippable. This means at any time they can deliver a release of your software to the business and the business can recoup its investment in your development R&D.

I cannot emphasise this enough. If you want the non technical parts of the business to believe you are heros, you must never create a situation where you say “well, we can’t release right now because we’re in the middle of an important refactoring. It’ll be a few weeks. We hope.”

Again, I’m not saying you cannot refactor, but at every stage your product must be shippable. Your tests have to pass. It may not have all the desired features, but the features that are there should work as described on the tin.

Tests lock in behaviour

Your tests are the contract about what your software does and does not do. Unit tests should lock in the behaviour of the package’s API. Integration tests do the same for complex interactions. Tests describe, in code, what the program promises to do.

If there is a unit test for each input permutation, you have defined the contract for what the code will do in code, not documentation. This is a contract anyone on your team can assert by simply running the tests. At any stage you know with a high degree of confidence that the behaviour people relied on before your change continues to function after your change.

Tests give you confidence to change someone else’s code

Lastly, and this is the biggest one, for programmers working on a piece of code that has been through many hands. Tests give you the confidence to make changes.

Even though we’ve never met, something I know about you, the reader, is you will eventually leave your current employer. Maybe you’ll be moving on to a new role, or perhaps a promotion, perhaps you’ll move cities, or follow your partner overseas. Whatever the reason, the succession of the maintenance of programs you write is key.

If people cannot maintain our code then as you and I move from job to job we’ll leave behind programs which cannot be maintained. This goes beyond advocacy for a language or tool. Programs which cannot be changed, programs which are too hard to onboard new developers, or programs which feel like career digression to work on them will reach only one end state–they are a dead end. They represent a balance sheet loss for the business. They will be replaced.

If you worry about who will maintain your code after you’re gone, write good tests.

Prefer table driven tests

I’m a big fan of testing, specifically unit testing and TDD (done correctly, of course). A practice that has grown around Go projects is the idea of a table driven test. This post explores the how and why of writing a table driven test.

Let’s say we have a function that splits strings:

// Split slices s into all substrings separated by sep and
// returns a slice of the substrings between those separators.
func Split(s, sep string) []string {
var result []string
i := strings.Index(s, sep)
for i > -1 {
result = append(result, s[:i])
s = s[i+len(sep):]
i = strings.Index(s, sep)
}
return append(result, s)
}

In Go, unit tests are just regular Go functions (with a few rules) so we write a unit test for this function starting with a file in the same directory, with the same package name, strings.

package split

import (
"reflect"
"testing"
)

func TestSplit(t *testing.T) {
got := Split("a/b/c", "/")
want := []string{"a", "b", "c"}
if !reflect.DeepEqual(want, got) {
t.Fatalf("expected: %v, got: %v", want, got)
}
}

Tests are just regular Go functions with a few rules:

  1. The name of the test function must start with Test.
  2. The test function must take one argument of type *testing.T. A *testing.T is a type injected by the testing package itself, to provide ways to print, skip, and fail the test.

In our test we call Split with some inputs, then compare it to the result we expected.

Code coverage

The next question is, what is the coverage of this package? Luckily the go tool has a built in branch coverage. We can invoke it like this:

% go test -coverprofile=c.out
PASS
coverage: 100.0% of statements
ok split 0.010s

Which tells us we have 100% branch coverage, which isn’t really surprising, there’s only one branch in this code.

If we want to dig in to the coverage report the go tool has several options to print the coverage report. We can use go tool cover -func to break down the coverage per function:

% go tool cover -func=c.out
split/split.go:8: Split 100.0%
total: (statements) 100.0%

Which isn’t that exciting as we only have one function in this package, but I’m sure you’ll find more exciting packages to test.

Spray some .bashrc on that

This pair of commands is so useful for me I have a shell alias which runs the test coverage and the report in one command:

cover () {
local t=$(mktemp -t cover)
go test $COVERFLAGS -coverprofile=$t $@ \
&& go tool cover -func=$t \
&& unlink $t
}

Going beyond 100% coverage

So, we wrote one test case, got 100% coverage, but this isn’t really the end of the story. We have good branch coverage but we probably need to test some of the boundary conditions. For example, what happens if we try to split it on comma?

func TestSplitWrongSep(t *testing.T) {
got := Split("a/b/c", ",")
want := []string{"a/b/c"}
if !reflect.DeepEqual(want, got) {
t.Fatalf("expected: %v, got: %v", want, got)
}
}

Or, what happens if there are no separators in the source string?

func TestSplitNoSep(t *testing.T) {
got := Split("abc", "/")
want := []string{"abc"}
if !reflect.DeepEqual(want, got) {
t.Fatalf("expected: %v, got: %v", want, got)
}
}

We’re starting build a set of test cases that exercise boundary conditions. This is good.

Introducing table driven tests

However the there is a lot of duplication in our tests. For each test case only the input, the expected output, and name of the test case change. Everything else is boilerplate. What we’d like to to set up all the inputs and expected outputs and feel them to a single test harness. This is a great time to introduce table driven testing.

func TestSplit(t *testing.T) {
type test struct {
input string
sep string
want []string
}

tests := []test{
{input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
{input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
{input: "abc", sep: "/", want: []string{"abc"}},
}

for _, tc := range tests {
got := Split(tc.input, tc.sep)
if !reflect.DeepEqual(tc.want, got) {
t.Fatalf("expected: %v, got: %v", tc.want, got)
}
}
}

We declare a structure to hold our test inputs and expected outputs. This is our table. The tests structure is usually a local declaration because we want to reuse this name for other tests in this package.

In fact, we don’t even need to give the type a name, we can use an anonymous struct literal to reduce the boilerplate like this:

func TestSplit(t *testing.T) {
tests := []struct {
input string
sep string
want []string
}{
{input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
{input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
{input: "abc", sep: "/", want: []string{"abc"}},
}

for _, tc := range tests {
got := Split(tc.input, tc.sep)
if !reflect.DeepEqual(tc.want, got) {
t.Fatalf("expected: %v, got: %v", tc.want, got)
}
}
}

Now, adding a new test is a straight forward matter; simply add another line the tests structure. For example, what will happen if our input string has a trailing separator?

{input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
{input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
{input: "abc", sep: "/", want: []string{"abc"}},
{input: "a/b/c/", sep: "/", want: []string{"a", "b", "c"}}, // trailing sep

But, when we run go test, we get

% go test
--- FAIL: TestSplit (0.00s)
split_test.go:24: expected: [a b c], got: [a b c ]

Putting aside the test failure, there are a few problems to talk about.

The first is by rewriting each test from a function to a row in a table we’ve lost the name of the failing test. We added a comment in the test file to call out this case, but we don’t have access to that comment in the go test output.

There are a few ways to resolve this. You’ll see a mix of styles in use in Go code bases because the table testing idiom is evolving as people continue to experiment with the form.

Enumerating test cases

As tests are stored in a slice we can print out the index of the test case in the failure message:

func TestSplit(t *testing.T) {
tests := []struct {
input string
sep . string
want []string
}{
{input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
{input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
{input: "abc", sep: "/", want: []string{"abc"}},
{input: "a/b/c/", sep: "/", want: []string{"a", "b", "c"}},
}

for i, tc := range tests {
got := Split(tc.input, tc.sep)
if !reflect.DeepEqual(tc.want, got) {
t.Fatalf("test %d: expected: %v, got: %v", i+1, tc.want, got)
}
}
}

Now when we run go test we get this

% go test
--- FAIL: TestSplit (0.00s)
split_test.go:24: test 4: expected: [a b c], got: [a b c ]

Which is a little better. Now we know that the fourth test is failing, although we have to do a little bit of fudging because slice indexing—​and range iteration—​is zero based. This requires consistency across your test cases; if some use zero base reporting and others use one based, it’s going to be confusing. And, if the list of test cases is long, it could be difficult to count braces to figure out exactly which fixture constitutes test case number four.

Give your test cases names

Another common pattern is to include a name field in the test fixture.

func TestSplit(t *testing.T) {
tests := []struct {
name string
input string
sep string
want []string
}{
{name: "simple", input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
{name: "wrong sep", input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
{name: "no sep", input: "abc", sep: "/", want: []string{"abc"}},
{name: "trailing sep", input: "a/b/c/", sep: "/", want: []string{"a", "b", "c"}},
}

for _, tc := range tests {
got := Split(tc.input, tc.sep)
if !reflect.DeepEqual(tc.want, got) {
t.Fatalf("%s: expected: %v, got: %v", tc.name, tc.want, got)
}
}
}

Now when the test fails we have a descriptive name for what the test was doing. We no longer have to try to figure it out from the output—​also, now have a string we can search on.

% go test
--- FAIL: TestSplit (0.00s)
split_test.go:25: trailing sep: expected: [a b c], got: [a b c ]

We can dry this up even more using a map literal syntax:

func TestSplit(t *testing.T) {
tests := map[string]struct {
input string
sep string
want []string
}
{
"simple": {input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
"wrong sep": {input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
"no sep": {input: "abc", sep: "/", want: []string{"abc"}},
"trailing sep": {input: "a/b/c/", sep: "/", want: []string{"a", "b", "c"}},
}

for name, tc := range tests {
got := Split(tc.input, tc.sep)
if !reflect.DeepEqual(tc.want, got) {
t.Fatalf("%s: expected: %v, got: %v", name, tc.want, got)
}
}
}

Using a map literal syntax we define our test cases not as a slice of structs, but as map of test names to test fixtures. There’s also a side benefit of using a map that is going to potentially improve the utility of our tests.

Map iteration order is undefined 1 This means each time we run go test, our tests are going to be potentially run in a different order.

This is super useful for spotting conditions where test pass when run in statement order, but not otherwise. If you find that happens you probably have some global state that is being mutated by one test with subsequent tests depending on that modification.

Introducing sub tests

Before we fix the failing test there are a few other issues to address in our table driven test harness.

The first is we’re calling t.Fatalf when one of the test cases fails. This means after the first failing test case we stop testing the other cases. Because test cases are run in an undefined order, if there is a test failure, it would be nice to know if it was the only failure or just the first.

The testing package would do this for us if we go to the effort to write out each test case as its own function, but that’s quite verbose. The good news is since Go 1.7 a new feature was added that lets us do this easily for table driven tests. They’re called sub tests.

func TestSplit(t *testing.T) {
tests := map[string]struct {
input string
sep string
want []string
}{
"simple": {input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
"wrong sep": {input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
"no sep": {input: "abc", sep: "/", want: []string{"abc"}},
"trailing sep": {input: "a/b/c/", sep: "/", want: []string{"a", "b", "c"}},
}

for name, tc := range tests {
t.Run(name, func(t *testing.T) {
got := Split(tc.input, tc.sep)
if !reflect.DeepEqual(tc.want, got) {
t.Fatalf("expected: %v, got: %v", tc.want, got)
}
})

}
}

As each sub test now has a name we get that name automatically printed out in any test runs.

% go test
--- FAIL: TestSplit (0.00s)
--- FAIL: TestSplit/trailing_sep (0.00s)
split_test.go:25: expected: [a b c], got: [a b c ]

Each subtest is its own anonymous function, therefore we can use t.Fatalft.Skipf, and all the other testing.Thelpers, while retaining the compactness of a table driven test.

Individual sub test cases can be executed directly

Because sub tests have a name, you can run a selection of sub tests by name using the go test -run flag.

% go test -run=.*/trailing -v
=== RUN TestSplit
=== RUN TestSplit/trailing_sep
--- FAIL: TestSplit (0.00s)
--- FAIL: TestSplit/trailing_sep (0.00s)
split_test.go:25: expected: [a b c], got: [a b c ]

Comparing what we got with what we wanted

Now we’re ready to fix the test case. Let’s look at the error.

--- FAIL: TestSplit (0.00s)
--- FAIL: TestSplit/trailing_sep (0.00s)
split_test.go:25: expected: [a b c], got: [a b c ]

Can you spot the problem? Clearly the slices are different, that’s what reflect.DeepEqual is upset about. But spotting the actual difference isn’t easy, you have to spot that extra space after c. This might look simple in this simple example, but it is any thing but when you’re comparing two complicated deeply nested gRPC structures.

We can improve the output if we switch to the %#v syntax to view the value as a Go(ish) declaration:

got := Split(tc.input, tc.sep)
if !reflect.DeepEqual(tc.want, got) {
t.Fatalf("expected: %#v, got: %#v", tc.want, got)
}

Now when we run our test it’s clear that the problem is there is an extra blank element in the slice.

% go test
--- FAIL: TestSplit (0.00s)
--- FAIL: TestSplit/trailing_sep (0.00s)
split_test.go:25: expected: []string{"a", "b", "c"}, got: []string{"a", "b", "c", ""}

But before we go to fix our test failure I want to talk a little bit more about choosing the right way to present test failures. Our Split function is simple, it takes a primitive string and returns a slice of strings, but what if it worked with structs, or worse, pointers to structs?

Here is an example where %#v does not work as well:

func main() {
type T struct {
I int
}
x := []*T{{1}, {2}, {3}}
y := []*T{{1}, {2}, {4}}
fmt.Printf("%v %v\n", x, y)
fmt.Printf("%#v %#v\n", x, y)
}

The first fmt.Printfprints the unhelpful, but expected slice of addresses; [0xc000096000 0xc000096008 0xc000096010] [0xc000096018 0xc000096020 0xc000096028]. However our %#v version doesn’t fare any better, printing a slice of addresses cast to *main.T;[]*main.T{(*main.T)(0xc000096000), (*main.T)(0xc000096008), (*main.T)(0xc000096010)} []*main.T{(*main.T)(0xc000096018), (*main.T)(0xc000096020), (*main.T)(0xc000096028)}

Because of the limitations in using any fmt.Printf verb, I want to introduce the go-cmp library from Google.

The goal of the cmp library is it is specifically to compare two values. This is similar to reflect.DeepEqual, but it has more capabilities. Using the cmp pacakge you can, of course, write:

func main() {
type T struct {
I int
}
x := []*T{{1}, {2}, {3}}
y := []*T{{1}, {2}, {4}}
fmt.Println(cmp.Equal(x, y)) // false
}

But far more useful for us with our test function is the cmp.Diff function which will produce a textual description of what is different between the two values, recursively.

func main() {
type T struct {
I int
}
x := []*T{{1}, {2}, {3}}
y := []*T{{1}, {2}, {4}}
diff := cmp.Diff(x, y)
fmt.Printf(diff)
}

Which instead produces:

% go run
{[]*main.T}[2].I:
-: 3
+: 4

Telling us that at element 2 of the slice of Ts the Ifield was expected to be 3, but was actually 4.

Putting this all together we have our table driven go-cmp test

func TestSplit(t *testing.T) {
tests := map[string]struct {
input string
sep string
want []string
}{
"simple": {input: "a/b/c", sep: "/", want: []string{"a", "b", "c"}},
"wrong sep": {input: "a/b/c", sep: ",", want: []string{"a/b/c"}},
"no sep": {input: "abc", sep: "/", want: []string{"abc"}},
"trailing sep": {input: "a/b/c/", sep: "/", want: []string{"a", "b", "c"}},
}

for name, tc := range tests {
t.Run(name, func(t *testing.T) {
got := Split(tc.input, tc.sep)
diff := cmp.Diff(tc.want, got)
if diff != "" {
t.Fatalf(diff)
}

})
}
}

Running this we get

% go test
--- FAIL: TestSplit (0.00s)
--- FAIL: TestSplit/trailing_sep (0.00s)
split_test.go:27: {[]string}[?->3]:
-: <non-existent>
+: ""
FAIL
exit status 1
FAIL split 0.006s

Using cmp.Diff our test harness isn’t just telling us that what we got and what we wanted were different. Our test is telling us that the strings are different lengths, the third index in the fixture shouldn’t exist, but the actual output we got an empty string, “”. From here fixing the test failure is straight forward.

Talk, then code

The open source projects that I contribute to follow a philosophy which I describe as talk, then code. I think this is generally a good way to develop software and I want to spend a little time talking about the benefits of this methodology.

Avoiding hurt feelings

The most important reason for discussing the change you want to make is it avoids hurt feelings. Often I see a contributor work hard in isolation on a pull request only to find their work is rejected. This can be for a bunch of reasons; the PR is too large, the PR doesn’t follow the local style, the PR fixes an issue which wasn’t important to the project or was recently fixed indirectly, and many more.

The underlying cause of all these issues is a lack of communication. The goal of the talk, then code philosophy is not to impede or frustrate, but to ensure that a feature lands correctly the first time, without incurring significant maintenance debt, and neither the author of the change, or the reviewer, has to carry the emotional burden of dealing with hurt feelings when a change appears out of the blue with an implicit “well, I’ve done the work, all you have to do is merge it, right?”

What does discussion look like?

Every new feature or bug fix should be discussed with the maintainer(s) of the project before work commences. It’s fine to experiment privately, but do not send a change without discussing it first.

The definition of talk for simple changes can be as little as a design sketch in a GitHub issue. If your PR fixes a bug, you should link to the bug it fixes. If there isn’t one, you should raise a bug and wait for the maintainers to acknowledge it before sending a PR. This might seem a little backward–who wouldn’t want a bug fixed–but consider the bug could be a misunderstanding in how the software works or it could be a symptom of a larger problem that needs further investigation.

For more complicated changes, especially feature requests, I recommend that a design document be circulated and agreed upon before sending code. This doesn’t have to be a full blown document, a sketch in an issue may be sufficient, but the key is to reach agreement using words, before locking it in stone with code.

In all cases you shouldn’t proceed to send code until there is a positive agreement from the maintainer that the approach is one they are happy with. A pull request is for life, not just for Christmas.

Code review, not design by committee

A code review is not the place for arguments about design. This is for two reasons. First, most code review tools are not suitable for long comment threads, GitHub’s PR interface is very bad at this, Gerrit is better, but few have a team of admins to maintain a Gerrit instance. More importantly, disagreements at the code review stage suggests there wasn’t agreement on how the change should be implemented.


Talk about what you want to code, then code what you talked about. Please don’t do it the other way around.

Maybe adding generics to Go IS about syntax after all

This is a short response to the recently announced Go 2 generics draft proposals

Update: This proposal is incomplete. It cannot replace two common use cases. The first is ensuring that several formal parameters are of the same type:

contract comparable(t T) {
t > t
}

func max(type T comparable)(a, b T) T

Here a, and b must be the same parameterised type — my suggestion would only assert that they had at least the same contract.

Secondly the it would not be possible to parameterise the type of return values:

contract viaStrings(t To, f From) {
var x string = f.String()
t.Set(string(""))
}

func SetViaStrings(type To, From viaStrings)(s []From) []To

Thanks to Ian Dawes and Sam Whited for their insight.

Bummer.


My lasting reaction to the Generics proposal is the proliferation of parenthesis in function declarations.

Although several of the Go team suggested that generics would probably be used sparingly, and the additional syntax would only be burden for the writer of the generic code, not the reader, I am sceptical that this long requested feature will be sufficiently niche as to be unnoticed by most Go developers.

It is true that type parameters can be inferred from their arguments, the declaration of generic functions and methods require a clumsy (type parameter declaration in place of the more common <T> syntaxes found in C++ and Java.

The reason for (type, it was explained to me, is Go is designed to be parsed without a symbol table. This rules out both <T> and [T] syntaxes as the parser needs ahead of time what kind of declaration a T is to avoid interpreting the angle or square braces as comparison or indexing operators respectively.

Contract as a superset of interfaces

The astute Roger Peppe quickly identified that contracts represent a superset of interfaces

Any behaviour you can express with an interface, you can do so and more, with a contract.

The remainder of this post are my suggestions for an alternative generic function declaration syntax that avoids add additional parenthesis by leveraging Roger’s observation.

Contract as a kind of type

The earlier Type Functions proposal showed that a type declaration can support a parameter. If this is correct, then the proposed contract declaration could be rewritten from

contract stringer(x T) {
var s string = x.String()
}

to

type stringer(x T) contract {
var s string = x.String()
}

This supports Roger’s observation that a contract is a superset of an interface. type stringer(x T) contract { ... } introduces a new contract type in the same way type stringer interface { ... }introduces a new interface type.

If you buy my argument that a contract is a kind of type is debatable, but if you’re prepared to take it on faith then the remainder of the syntax introduced in the generics proposal could be further simplified.

If contracts are types, use them as types

If a contract is an identifier then we can use a contract anywhere that a built-in type or interface is used. For example

func Stringify(type T stringer)(s []T) (ret []string) {
for _, v := range s {
ret = append(ret, v.String())
} return ret
}

Could be expressed as

func Stringify(s []stringer) (ret []string) {
for _, v := range s {
ret = append(ret, v.String())
} return ret
}

That is, in place of explicitly binding T to the contract stringer only for T to be referenced seven characters later, we bind the formal parameter s to a slice of stringer s directly. The similarity with the way this would previously be done with a stringer interface emphasises Roger’s observation.

1

Unifying unknown type parameters

The first example in the design proposal introduces an unknown type parameter.

func Print(type T)(s []T) {
for _, v := range s {
fmt.Println(v)
}
}

The operations on unknown types are limited, they are in some senses values that can only be read. Again drawing on Roger’s observation above, the syntax could potentially be expressed as:

func Print(s []contract{}) {
for _, v := range s {
fmt.Println(v)
}
}

Or maybe even

type T contract {} func Print(s []T) {
for _, v := range s {
fmt.Println(v)
}
}

In essence the literal contract{} syntax defines an anonymous unknown type analogous to interface{}‘s anonymous interface type.

Conclusion

The great irony is, after years of my bloviation that “adding generics to Go has nothing to do with the syntax”

2

, it turns out that, actually, yes, the syntax is crucial.

Using Go modules with Travis CI

In my previous post I converted httpstat to use Go 1.11’s upcoming module support. In this post I continue to explore integrating Go modules into a continuous integration workflow via Travis CI.

Life in mixed mode

The first scenario is probably the most likely for existing Go projects, a library or application targeting Go 1.10 and Go 1.11. httpstat has an existing CI story–I’m using Travis CI for my examples, if you use something else, please blog about your experience–and I wanted to test against the current and development versions of Go.

GO111MODULE

The straddling of two worlds is best accomplished via the GO111MODULE environment variable. GO111MODULE dictates when the Go module behaviour will be preferred over the Go 1.5-1.10’s vendor/ directory behaviour. In Go 1.11 the Go module behaviour is disabled by default for packages within $GOPATH (this also includes the default $GOPATH introduced in Go1.8). Thus, without additional configuration, Go1.11 inside Travis CI will behave like Go 1.10.

In my previous post I chose the working directory ~/devel/httpstat to ensure I was not working within a $GOPATH workspace. However CI vendors have worked hard to make sure that their CI bots always check out of the branch under test inside a working $GOPATH.

Fortunately there is a simple workaround for this, add env GO111MODULE=on before any go buildor test invocations in your .travis.yml to force Go module behaviour and ignore any vendor/ directories that may be present inside your repo.

1
language: go
go:
- 1.10.x
- master
os:
- linux
- osx
dist: trusty
sudo: false
install: true
script:
- env GO111MODULE=on go build
- env GO111MODULE=on go test

Creating a go.mod on the fly

You’ll note that I didn’t check in the go.mod module manifest I created in my previous post. This was initially an accident on my part, but one that turned out to be beneficial. By not checking in the go.mod file, the source of truth for dependencies remained httpstat’s Gopkg.toml file. When the call to env GO111MODULE=on go build executes on the Travis CI builder, the go tool converts my Gopkg.toml on the fly, then uses it to fetch dependencies before building.

$ env GO111MODULE=on go build
go: creating new go.mod: module github.com/davecheney/httpstat
go: copying requirements from Gopkg.lock
go: finding github.com/fatih/color v1.5.0
go: finding golang.org/x/sys v0.0.0-20170922123423-429f518978ab
go: finding golang.org/x/net v0.0.0-20170922011244-0744d001aa84
go: finding golang.org/x/text v0.0.0-20170915090833-1cbadb444a80
go: finding github.com/mattn/go-colorable v0.0.9
go: finding github.com/mattn/go-isatty v0.0.3
go: downloading github.com/fatih/color v1.5.0
go: downloading github.com/mattn/go-colorable v0.0.9
go: downloading github.com/mattn/go-isatty v0.0.3
go: downloading golang.org/x/net v0.0.0-20170922011244-0744d001aa84
go: downloading golang.org/x/text v0.0.0-20170915090833-1cbadb444a80

If you’re not using a dependency management tool that go mod knows how to convert from this advice may not work for you and you may have to maintain a go.mod manifest in parallel with you previous dependency management solution.

A clean slate

The second option I investigated, but ultimately did not pursue, was to treat the Travis CI builder, like my fresh Ubuntu 18.04 install, as a blank canvas. Rather than working around Travis CI’s attempts to check the branch out inside a working $GOPATH I experimented with treating the build as a C project

2

then invoking gimme directly. This also required me to check in my go.mod file as without Travis’ language: go support, the checkout was not moved into a $GOPATH folder. The latter seems like a reasonable approach if your project doesn’t intend to be compatible with Go 1.10 or earlier.

language: c
os:
- linux
- osx
dist: trusty
sudo: false
install:
- eval "$(curl -sL https://raw.githubusercontent.com/travis-ci/gimme/master/gimme | GIMME_GO_VERSION=master bash)"
script:
- go build
- go test

You can see the output from this branch here.

Sadly when run in this mode gimme is unable to take advantage of the caching provided by the language: go environment and must build Go 1.11 from source, adding three to four minutes delay to the install phase of the build. Once Go 1.11 is released and gimme can source a binary distribution this will hopefully address the setup latency.

Ultimately this option may end up being redundant if GO111MODULE=on becomes the default behaviour in Go 1.12 and the location Travis places the checkout becomes immaterial.