Should methods be declared on T or *T

This post is a continuation of a suggestion I made on twitter a few days ago.

In Go, for any type T, there exists a type *T which is the result of an expression that takes the address of a variable of type T¹. For example:

type T struct { a int; b bool }
var t T    // t's type is T
var p = &t // p's type is *T

These two types, T and *T are distinct, but *T is not substitutable for T².

You can declare a method on any type that you own; that is, a type that you declare in your package³. Thus it follows that you can declare methods on both the type you declare, T, and its corresponding derived pointer type, *T. Another way to talk about this is to say methods on a type are declared to take a copy of their receiver’s value, or a pointer to their receiver’s value ⁴. So the question becomes, which is the most appropriate form to use?

Obviously if your method mutates its receiver, it should be declared on *T. However, if the method does not mutate its receiver, is it safe to declare it on T instead⁵?

It turns out that the cases where it is safe to do so are very limited. For example, it is well known that you should not copy a sync.Mutex value as that breaks the invariants of the mutex. As mutexes control access to other things, they are frequently wrapped up in a struct with the value they control:

package counter

type Val struct {
        mu  sync.Mutex
        val int
}

func (v *Val) Get() int {
        v.mu.Lock()
        defer v.mu.Unlock()
        return v.val
}

func (v *Val) Add(n int) {
        v.mu.Lock()
        defer v.mu.Unlock()
        v.val += n
}

Most Go programmers know that it is a mistake to forget to declare the Get or Add methods on the pointer receiver *Val. However any type that embeds a Val to utilise its zero value, must also only declare methods on its pointer receiver otherwise it may inadvertently copy the contents of its embedded type’s values.

type Stats struct {
        a, b, c counter.Val
}

func (s Stats) Sum() int {
        return s.a.Get() + s.b.Get() + s.c.Get() // whoops
}

A similar pitfall can occur with types that maintain slices of values, and of course there is the possibility for an unintended data race.

In short, I think that you should prefer declaring methods on *T unless you have a strong reason to do otherwise.

We say T but that is just a place holder for a type that you declare.
This rule is recursive, taking the address of a variable of type *T returns a result of type **T.
This is why nobody can declare methods on primitive types like int.
Methods in Go are just syntactic sugar for a function which passes the receiver as the first formal parameter.
If the method does not mutate its receiver, does it need to be a method?

Dave Cheney

The acme of foolishness

Should methods be declared on T or *T

Related posts: