This post is a continuation of a suggestion I made on twitter a few days ago.
In Go, for any type T, there exists a type *T which is the result of an expression that takes the address of a variable of type T [1]. For example:
    type T struct {
        a int
        b bool
    }

    var t T    // t's type is T
    var p = &t // p's type is *T
These two types, T and *T, are distinct, and *T is not substitutable for T [2].
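To make the "distinct and not substitutable" point concrete, here is a minimal sketch. The helper functions takesValue and takesPointer are hypothetical names introduced only for this illustration; the commented-out calls are the ones the compiler rejects.

```go
package main

import "fmt"

// T mirrors the struct from the example above.
type T struct {
	a int
	b bool
}

// takesValue and takesPointer are hypothetical helpers, not part of the post.
func takesValue(t T) int    { return t.a }
func takesPointer(p *T) int { return p.a }

func main() {
	t := T{a: 1}
	p := &t

	fmt.Println(takesValue(t))   // a T where a T is expected
	fmt.Println(takesPointer(p)) // a *T where a *T is expected
	fmt.Println(takesValue(*p))  // an explicit dereference yields a T

	// Neither of these compiles, because T and *T are different types:
	// takesValue(p)
	// takesPointer(t)
}
```

Converting between the two always requires an explicit & or *, which is what it means for one not to be substitutable for the other.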
You can declare a method on any type that you own; that is, a type that you declare in your package [3]. It follows that you can declare methods on both the type you declare, T, and its corresponding derived pointer type, *T. Another way to say this is that methods on a type are declared to take either a copy of their receiver’s value or a pointer to their receiver’s value [4]. So the question becomes: which is the more appropriate form to use?
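The difference between receiving a copy and receiving a pointer can be seen in a small sketch. Counter, IncValue and IncPointer are hypothetical names used only here. The last two calls use Go's method expressions, which expose the receiver as an ordinary first parameter.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for this illustration.
type Counter struct {
	n int
}

// IncValue has a value receiver: it increments a copy of the caller's Counter.
func (c Counter) IncValue() { c.n++ }

// IncPointer has a pointer receiver: it increments the caller's Counter.
func (c *Counter) IncPointer() { c.n++ }

func main() {
	var c Counter

	c.IncValue()   // operates on a copy; c.n is still 0
	c.IncPointer() // shorthand for (&c).IncPointer(); c.n is now 1

	// Method expressions: the receiver becomes the first argument.
	Counter.IncValue(c)       // has type func(Counter); again a copy
	(*Counter).IncPointer(&c) // has type func(*Counter); c.n is now 2

	fmt.Println(c.n) // prints 2
}
```

Only the pointer-receiver calls are visible to the caller; the value-receiver calls modify a copy that is discarded when the method returns.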
Obviously, if your method mutates its receiver, it should be declared on *T. However, if the method does not mutate its receiver, is it safe to declare it on T instead [5]?
It turns out that the cases where it is safe to do so are very limited. For example, it is well known that you should not copy a sync.Mutex value, as that breaks the invariants of the mutex. As mutexes control access to other things, they are frequently wrapped up in a struct with the value they control:
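    package counter

    import "sync"

    // Val is a counter guarded by a mutex. Its zero value is ready to use.
    type Val struct {
        mu  sync.Mutex
        val int
    }

    // Get returns the current value.
    func (v *Val) Get() int {
        v.mu.Lock()
        defer v.mu.Unlock()
        return v.val
    }

    // Add adds n to the current value.
    func (v *Val) Add(n int) {
        v.mu.Lock()
        defer v.mu.Unlock()
        v.val += n
    }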
Most Go programmers know that it is a mistake to forget to declare the Get or Add methods on the pointer receiver *Val. However, any type that embeds a Val to utilise its zero value must also declare its methods only on its pointer receiver; otherwise it may inadvertently copy the contents of its embedded type’s values.
    type Stats struct {
        a, b, c counter.Val
    }

    func (s Stats) Sum() int {
        return s.a.Get() + s.b.Get() + s.c.Get() // whoops
    }
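For comparison, here is a sketch of the same type with Sum declared on *Stats. To keep it compilable as a single file, Val is reproduced in package main rather than imported from the counter package; that restructuring and the main function are assumptions of this sketch, not part of the original code.

```go
package main

import (
	"fmt"
	"sync"
)

// Val is the counter from the post, copied here so the file is self-contained.
type Val struct {
	mu  sync.Mutex
	val int
}

func (v *Val) Get() int {
	v.mu.Lock()
	defer v.mu.Unlock()
	return v.val
}

func (v *Val) Add(n int) {
	v.mu.Lock()
	defer v.mu.Unlock()
	v.val += n
}

type Stats struct {
	a, b, c Val
}

// Sum has a pointer receiver, so s.a, s.b and s.c are the caller's own
// counters and no sync.Mutex is copied when Sum is called.
func (s *Stats) Sum() int {
	return s.a.Get() + s.b.Get() + s.c.Get()
}

func main() {
	var s Stats // the zero value is ready to use
	s.a.Add(1)
	s.b.Add(2)
	s.c.Add(3)
	fmt.Println(s.Sum()) // prints 6
}
```

With the value receiver, every call to Sum would copy all three mutexes along with the struct. The copylocks check in go vet is intended to report copies of this kind, so it is worth running on code that holds locks by value.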
A similar pitfall can occur with types that maintain slices of values, and of course there is the possibility for an unintended data race.
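One reading of the slice pitfall is sketched below, assuming a hypothetical Log type that is not in the original post. A value receiver copies the slice header, so an append made inside the method is lost to the caller.

```go
package main

import "fmt"

// Log is a hypothetical type that maintains a slice of values.
type Log struct {
	entries []string
}

// AppendValue has a value receiver. The append updates the slice header
// held by the copy, so the caller's Log never sees the new entry.
func (l Log) AppendValue(s string) {
	l.entries = append(l.entries, s)
}

// AppendPointer has a pointer receiver and updates the caller's Log.
func (l *Log) AppendPointer(s string) {
	l.entries = append(l.entries, s)
}

func main() {
	var l Log

	l.AppendValue("lost")
	fmt.Println(len(l.entries)) // prints 0

	l.AppendPointer("kept")
	fmt.Println(len(l.entries)) // prints 1
}
```

This sketch is single-goroutine, so it does not show the data race mentioned above; it only shows how a copied receiver can silently drop an update.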
In short, I think that you should prefer declaring methods on *T unless you have a strong reason to do otherwise.
1. We say T, but that is just a placeholder for a type that you declare.
2. This rule is recursive: taking the address of a variable of type *T returns a result of type **T.
3. This is why nobody can declare methods on primitive types like int.
4. Methods in Go are just syntactic sugar for a function which passes the receiver as the first formal parameter.
5. If the method does not mutate its receiver, does it need to be a method?