This is a post about data races. The code for this post lives on GitHub, github.com/davecheney/benandjerry.
The example program simulates two ice cream makers, Ben and Jerry, who greet their customers randomly.
package main

import "fmt"

type IceCreamMaker interface {
	// Hello greets a customer
	Hello()
}

type Ben struct {
	name string
}

func (b *Ben) Hello() {
	fmt.Printf("Ben says, \"Hello my name is %s\"\n", b.name)
}

type Jerry struct {
	name string
}

func (j *Jerry) Hello() {
	fmt.Printf("Jerry says, \"Hello my name is %s\"\n", j.name)
}

func main() {
	var ben = &Ben{"Ben"}
	var jerry = &Jerry{"Jerry"}
	var maker IceCreamMaker = ben

	var loop0, loop1 func()

	loop0 = func() {
		maker = ben
		go loop1()
	}

	loop1 = func() {
		maker = jerry
		go loop0()
	}

	go loop0()

	for {
		maker.Hello()
	}
}
It’s a data race, silly
Most programmers should easily spot the data race in this program.
The loop functions are changing the value of maker without using a lock, so it is undefined which implementation of Hello will be called when maker.Hello() is executed by the for loop in the main function.
Some programmers appear to be happy with this data race; either Ben or Jerry will greet the customer, it doesn’t matter which.
Let's run this code and see what happens.
% env GOMAXPROCS=2 go run main.go
...
Ben says, "Hello my name is Ben"
Jerry says, "Hello my name is Jerry"
Jerry says, "Hello my name is Jerry"
Ben says, "Hello my name is Jerry"
Ben says, "Hello my name is Ben"
...
What! Hold up. Ben sometimes thinks that he is Jerry. How is this possible?
Interface values
The key to understanding this race is to understand how interface values are represented in memory.
An interface is conceptually a struct with two fields.
If we were to describe an interface in Go, it would look something like this.
type interface struct {
	Type uintptr // points to the type of the interface implementation
	Data uintptr // holds the data for the interface's receiver
}
Type points to a structure that describes the type of the value that implements this interface. Data points to the value of the implementation itself. The contents of Data are passed as the receiver of any method called via the interface.
For the statement var maker IceCreamMaker = ben, the compiler will generate code that does the following.
The interface's Type field is set to point to the definition of the *Ben type, and the Data field contains a copy of ben, that is, a pointer to a Ben value.
When loop1() executes the statement maker = jerry, both fields of the interface value must be updated.
Type now points to the definition of *Jerry, and Data contains a pointer to an instance of Jerry.
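The two-field picture can be poked at directly. What follows is only a sketch: ifaceWords and describe are names made up for this illustration, and the struct mirrors how the standard gc toolchain happens to lay out an interface value, which is an implementation detail rather than anything the language promises.

```go
package main

import (
	"fmt"
	"unsafe"
)

type IceCreamMaker interface {
	Hello()
}

type Ben struct{ name string }

func (b *Ben) Hello() { fmt.Printf("Ben says, \"Hello my name is %s\"\n", b.name) }

type Jerry struct{ name string }

func (j *Jerry) Hello() { fmt.Printf("Jerry says, \"Hello my name is %s\"\n", j.name) }

// ifaceWords is a hypothetical mirror of the two-word layout described above.
// Assumption: the gc toolchain's current representation of interface values.
type ifaceWords struct {
	typ  unsafe.Pointer // the Type word
	data unsafe.Pointer // the Data word
}

// words copies the two words out of an interface value.
func words(m *IceCreamMaker) ifaceWords {
	return *(*ifaceWords)(unsafe.Pointer(m))
}

// describe reports what the assignments in the text do to the two words.
func describe() (twoWords, dataWasBen, dataIsJerry, typeChanged bool) {
	ben, jerry := &Ben{"Ben"}, &Jerry{"Jerry"}

	var maker IceCreamMaker = ben
	before := words(&maker)

	maker = jerry
	after := words(&maker)

	twoWords = unsafe.Sizeof(maker) == 2*unsafe.Sizeof(uintptr(0))
	dataWasBen = before.data == unsafe.Pointer(ben)
	dataIsJerry = after.data == unsafe.Pointer(jerry)
	typeChanged = before.typ != after.typ
	return
}

func main() {
	twoWords, dataWasBen, dataIsJerry, typeChanged := describe()
	fmt.Println("interface is two words wide:", twoWords)
	fmt.Println("Data held ben before the assignment:", dataWasBen)
	fmt.Println("Data holds jerry after the assignment:", dataIsJerry)
	fmt.Println("Type word changed as well:", typeChanged)
}
```

With the gc toolchain all four lines should report true: one assignment in the source, two words changed in memory.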
The Go memory model treats a write to a single machine word as indivisible, but an interface is a two-word value, so assigning to it takes two separate word-sized writes. It is possible that another goroutine may observe the contents of the interface value while it is being changed. In this case it may see a half-updated value: Type already points to *Jerry while Data still points to ben.
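And so Jerry's Hello() function is called with ben as the receiver. The mirror-image tear, Ben's Hello() called with jerry as the receiver, is what produced the "Ben says, "Hello my name is Jerry"" line in the output above.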
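The racy program only hits this state by luck, but it can be forged on purpose. The sketch below builds the half-updated value by hand, with no goroutines involved. The tear helper and the ifaceWords struct are hypothetical names invented for this illustration, and the trick leans on the gc toolchain's interface layout, so treat it as a demonstration and not something to ship.

```go
package main

import (
	"fmt"
	"unsafe"
)

type IceCreamMaker interface {
	Hello()
}

type Ben struct{ name string }

func (b *Ben) Hello() { fmt.Printf("Ben says, \"Hello my name is %s\"\n", b.name) }

type Jerry struct{ name string }

func (j *Jerry) Hello() { fmt.Printf("Jerry says, \"Hello my name is %s\"\n", j.name) }

// ifaceWords is a hypothetical mirror of the two-word interface layout.
// Assumption: the gc toolchain's current representation of interface values.
type ifaceWords struct {
	typ  unsafe.Pointer // the Type word
	data unsafe.Pointer // the Data word
}

// tear returns an interface value whose Type word comes from typeFrom and
// whose Data word comes from dataFrom: the state a reader can observe when
// it catches an assignment half way through.
func tear(typeFrom, dataFrom IceCreamMaker) IceCreamMaker {
	torn := typeFrom
	(*ifaceWords)(unsafe.Pointer(&torn)).data = (*ifaceWords)(unsafe.Pointer(&dataFrom)).data
	return torn
}

func main() {
	ben, jerry := &Ben{"Ben"}, &Jerry{"Jerry"}

	// Jerry's method, ben as the receiver.
	tear(jerry, ben).Hello() // Jerry says, "Hello my name is Ben"

	// Ben's method, jerry as the receiver.
	tear(ben, jerry).Hello() // Ben says, "Hello my name is Jerry"
}
```

This only behaves politely because Ben and Jerry happen to share a layout; the method simply reinterprets whatever the Data word points at.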
Conclusion
There is no such thing as a safe data race. Your program either has no data races, or its operation is undefined.
In this example, the layouts of the Ben and Jerry structs were identical in memory, so they were in some sense compatible. Imagine the chaos that would occur if they had different memory representations (this is left as an exercise to the reader).
The Go race detector will spot this error, and many others, and is as simple to use as adding the -race flag to your go test, build, or install command.
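The race here comes from touching maker without a lock, so one way to remove it is to add one. The following is a minimal sketch, not the code from the repository: guardedMaker and mismatches are hypothetical names, the writers are plain loops instead of the original mutually spawning goroutines, and the loops are bounded so the program terminates.

```go
package main

import (
	"fmt"
	"sync"
)

type IceCreamMaker interface {
	Hello()
}

type Ben struct{ name string }

func (b *Ben) Hello() { fmt.Printf("Ben says, \"Hello my name is %s\"\n", b.name) }

type Jerry struct{ name string }

func (j *Jerry) Hello() { fmt.Printf("Jerry says, \"Hello my name is %s\"\n", j.name) }

// guardedMaker is a hypothetical wrapper that serialises every read and
// write of the interface value behind a mutex.
type guardedMaker struct {
	mu    sync.Mutex
	maker IceCreamMaker
}

func (g *guardedMaker) Set(m IceCreamMaker) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.maker = m
}

func (g *guardedMaker) Get() IceCreamMaker {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.maker
}

// mismatches flips the maker from two goroutines while the caller reads it,
// and counts the reads where the type and the receiver disagree.
func mismatches(rounds int) int {
	ben, jerry := &Ben{"Ben"}, &Jerry{"Jerry"}
	g := &guardedMaker{maker: ben}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			g.Set(ben)
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < rounds; i++ {
			g.Set(jerry)
		}
	}()

	bad := 0
	for i := 0; i < rounds; i++ {
		switch m := g.Get().(type) {
		case *Ben:
			if m.name != "Ben" {
				bad++
			}
		case *Jerry:
			if m.name != "Jerry" {
				bad++
			}
		}
	}
	wg.Wait()
	return bad
}

func main() {
	fmt.Println("mismatched greetings:", mismatches(100000))
}
```

Because both words of the interface are only ever read or written while the mutex is held, a reader can no longer see Ben's type paired with Jerry's data, and running this sketch with -race should stay quiet.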
Bonus question
In the example code, the Hello method is declared on Ben or Jerry's pointer receiver. If this were instead declared as a method on the Ben or Jerry value, would this solve the data race?
Further reading
- Russ Cox’s original piece on Go interfaces, and while you’re there, you should read the piece Russ wrote explaining this issue.
- The Go race detector blog post.