This is an experience report about a gotcha in Go that catches every Go programmer at least once. The following program is extracted from a larger version that caused my co-workers to lose several hours today.
package main

import "fmt"

type T struct{}

func (t T) F() {}

type P interface {
	F()
}

func newT() *T { return new(T) }

type Thing struct {
	P
}

func factory(p P) *Thing {
	return &Thing{P: p}
}

const ENABLE_FEATURE = false

func main() {
	t := newT()
	t2 := t
	if !ENABLE_FEATURE {
		t2 = nil
	}
	thing := factory(t2)
	fmt.Println(thing.P == nil)
}
This distilled version of the program in question, while nonsensical, contains all the attributes of the original. Take some time to study the program and ask yourself: does the program print true or false?
nil != nil
Not to spoil the surprise, but the program prints false. The reason is that, while nil is assigned to t2, when t2 is passed to factory it is “boxed” into a variable of type P, an interface. Thus, thing.P does not equal nil because, while the value of P was nil, its concrete type was *T.
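One way to see the boxing for yourself is to print the dynamic type and value that an interface holds. The following is a minimal sketch that reuses T and P from the program above; the describe helper is a name introduced here for illustration and is not part of the original program.

```go
package main

import "fmt"

type T struct{}

func (t T) F() {}

type P interface {
	F()
}

// describe (a hypothetical helper) reports the dynamic type and value
// held by p, plus the result of comparing p to nil.
func describe(p P) string {
	return fmt.Sprintf("type=%T value=%v isNil=%t", p, p, p == nil)
}

func main() {
	var t2 *T // a nil pointer, like t2 in the program above

	fmt.Println(describe(nil)) // type=<nil> value=<nil> isNil=true
	fmt.Println(describe(t2))  // type=*main.T value=<nil> isNil=false
}
```

The second call shows the boxed value: the pointer inside is nil, but the interface still records *main.T as its type, so the comparison with nil fails.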
Typed nil
You’ve probably realised the cause of this problem is the dreaded typed nil, a gotcha that has its own entry in the Go FAQ. The typed nil emerges as a result of the definition of an interface type: a structure which contains the concrete type of the value stored in the interface, and the value itself. This structure can’t be expressed in pure Go, but can be visualised with this example:
var n int = 200
var i interface{} = n
The interface value i is assigned a copy of the value of n, so i’s type slot holds n’s type, int, and its data slot holds the value 200. We can write this more concisely as (int, 200).
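As a small sketch of the two slots, the fmt verbs %T and %v can be used to print an interface value in the (type, value) notation. The slots helper is a name made up for this example; it also shows that the interface holds a copy, since changing n afterwards does not change what i holds.

```go
package main

import "fmt"

// slots (a hypothetical helper) formats an interface value in the
// (type, value) notation used in the text.
func slots(i interface{}) string {
	return fmt.Sprintf("(%T, %v)", i, i)
}

func main() {
	var n int = 200
	var i interface{} = n

	n = 300 // changing n afterwards does not affect the copy stored in i

	fmt.Println(slots(i)) // (int, 200)
	fmt.Println(slots(n)) // (int, 300)
}
```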
In the original program we effectively have the following:
var t2 *T = nil
var p P = t2
Which results in p, using our nomenclature, holding the value (*T, nil). So then, why does the expression p == nil evaluate to false? The explanation I prefer is:
- nil is a compile time constant which is converted to whatever type is required, just as constant literals like 200 are converted to the required integer type automatically.
- Given the expression p == nil, both arguments must be of the same type, therefore nil is converted to the same type as p, which is an interface type. So we can rewrite the expression as (*T, nil) == (nil, nil).
- As equality in Go almost always operates as a bitwise comparison, it is clear that the memory bits which hold the interface value (*T, nil) are different to the bits that hold (nil, nil), thus the expression evaluates to false.
Put simply, an interface value is only equal to nil if both the type and the value stored inside the interface are nil.
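The rule can be checked with a short sketch that compares the same interface value against an untyped nil and against a nil of the concrete type. It reuses T and P from the original program; isNilInterface and isNilT are helper names invented for this example.

```go
package main

import "fmt"

type T struct{}

func (t T) F() {}

type P interface {
	F()
}

// isNilInterface reports whether p holds (nil, nil).
// Here nil is converted to the interface type P before comparing.
func isNilInterface(p P) bool { return p == nil }

// isNilT reports whether p holds (*T, nil).
// Here the right-hand side is converted to P, giving (*T, nil).
func isNilT(p P) bool { return p == (*T)(nil) }

func main() {
	var p P // zero value of an interface: (nil, nil)
	fmt.Println(isNilInterface(p)) // true
	fmt.Println(isNilT(p))         // false

	var t2 *T // nil pointer
	p = t2    // p now holds (*T, nil)
	fmt.Println(isNilInterface(p)) // false: (*T, nil) != (nil, nil)
	fmt.Println(isNilT(p))         // true:  (*T, nil) == (*T, nil)
}
```

Comparing against (*T)(nil) only works when the concrete type is known in advance, which is rarely the case for code that accepts an arbitrary P.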
For a detailed explanation of the mechanics behind Go’s interface implementation, Russ Cox has a great post on his blog.
The future of typed nils in Go 2
Typed nils are an entirely logical result of the way dynamic types, aka interfaces, are implemented, but are almost never what the programmer wanted. To tie this back to Russ’s GopherCon keynote, I believe typed nils are an example where Go fails to scale for programming teams.
It has taken 700 words, and several hours over chat today, to explain this, and in the end my co-workers were left with a bad taste in their mouths. The clarity of interfaces was soured by a suspicion that gotchas like this were lurking in their codebase. As an experienced Go programmer I’ve learnt to be wary of the possibility of a typed nil during code review, but it is unfortunate that they remain something that each Go programmer has to learn the hard way.
For Go 2.0 I’d like to start the discussion of what it would mean if comparing an interface value to nil considered the value portion of the interface such that the following evaluated to true:
var b *bytes.Buffer
var r io.Reader = b
fmt.Println(r == nil)
There are obviously some subtleties that this pithy demand fails to capture, but making this seemingly straightforward comparison less error-prone would, at least in my mind, make Go 2 easier to scale to larger development teams.
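For illustration only, the behaviour asked for above can be approximated in today's Go with the reflect package. This is a rough sketch, not a proposal for how the language would implement it: isNil is a hypothetical helper, not a standard library function, and it pays the cost of reflection on every call.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"reflect"
)

// isNil (a hypothetical helper) reports whether i is a nil interface or
// holds a nil value of a kind that can be nil.
func isNil(i interface{}) bool {
	if i == nil {
		return true // (nil, nil)
	}
	v := reflect.ValueOf(i)
	switch v.Kind() {
	case reflect.Chan, reflect.Func, reflect.Map,
		reflect.Ptr, reflect.Slice, reflect.UnsafePointer:
		return v.IsNil() // e.g. (*bytes.Buffer, nil)
	}
	return false // ints, structs, strings and so on can never be nil
}

func main() {
	var b *bytes.Buffer
	var r io.Reader = b

	fmt.Println(r == nil) // false
	fmt.Println(isNil(r)) // true

	r = new(bytes.Buffer)
	fmt.Println(isNil(r)) // false
}
```

One of the subtleties is visible here: a nil map or slice stored in an interface is also reported as nil, and a nil pointer whose methods are safe to call on a nil receiver would be treated the same as one whose methods are not.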