Understand Go pointers in less than 800 words or your money back

This post is for programmers coming to Go who are unfamiliar with the idea of pointers or a pointer type in Go.

What is a pointer?

Simply put, a pointer is a value which points to the address of another. This is the textbook explanation, but if you’re coming from a language that doesn’t let you talk about the address of a variable, it could very well be written in Cuneiform.

Let’s break this down.

What is memory?

Computer memory, RAM, can be thought of as a sequence of boxes, placed one after another in a line. Each box, or cell, is labeled with a unique number, which increments sequentially; this is the address of the cell, its memory location.

Each cell holds a single value. If you know the memory address of a cell, you can go to that cell and read its contents. You can also place a value in that cell, replacing anything that was in there previously.

That’s all there is to know about memory. Everything the CPU does is expressed as fetching and depositing values into memory cells.

What is a variable?

To write a program that retrieves the value stored in memory location 200, multiplies it by 3 and deposits the result into memory location 201, we could write something like this in pseudocode:

  • Retrieve the value stored in address 200 and place it in the CPU.
  • Multiply the value stored in the CPU by 3.
  • Deposit the value stored in the CPU into memory location 201.
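
The three steps above can be sketched in Go by modelling memory as an array of cells indexed by address. This is only an illustration: the memory array, the tripleInto helper and the starting value 6 are assumptions made for the example, not how a CPU or the Go runtime actually exposes memory.

```go
package main

import "fmt"

// memory is a hypothetical model of RAM: a row of numbered cells.
// The index of each cell plays the role of its address.
var memory [256]int

// tripleInto reads the cell at address src, multiplies the value by 3,
// and stores the result in the cell at address dst.
func tripleInto(src, dst int) {
	register := memory[src] // retrieve the value and place it in the "CPU"
	register *= 3           // multiply the value held in the CPU by 3
	memory[dst] = register  // deposit the result into the destination cell
}

func main() {
	memory[200] = 6 // assumed starting value, chosen for illustration
	tripleInto(200, 201)
	fmt.Println(memory[201]) // prints 18
}
```

Note that the program has to know the numbers 200 and 201 up front; nothing stops another part of the program from using the same cells.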


This is exactly how early programs were written; programmers would keep a list of memory locations, who used them, when, and what the value stored in each one represented.

Obviously this was tedious and error prone, and meant every possible value stored in memory had to be assigned an address during the construction of the program. Worse, this arrangement made it difficult to allocate storage to variables dynamically as the program ran; just imagine if you had to write large programs using only global variables.

To address this, the notion of a variable was created. A variable is just a convenient, alphanumeric pseudonym for a memory location; a label, or nickname.

Now, rather than talking about memory locations, we can talk about variables, which are convenient names we give to memory locations. The previous program can now be expressed as:

  • Retrieve the value stored in variable a and place it in the CPU.
  • Multiply it by 3.
  • Deposit the value into the variable b.

This is the same program, with one crucial improvement: because we no longer need to talk about memory locations directly, we no longer need to keep track of them. That drudgery is left to the compiler.

Now we can write a program like this:

var a = 6
var b = a * 3

And the compiler will make sure that the variables a and b are assigned unique memory locations to hold their value for as long as needed.
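
A small runnable sketch of the same idea follows. The triple helper is a name invented here, and the & operator (explained in the next section) is used only to show that a and b ended up in two distinct locations that the compiler picked for us.

```go
package main

import "fmt"

// triple is a hypothetical helper that mirrors "var b = a * 3".
func triple(a int) int {
	return a * 3
}

func main() {
	var a = 6
	var b = triple(a)

	fmt.Println(b)        // prints 18
	fmt.Println(&a != &b) // prints true: a and b live at different addresses
}
```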

What is a pointer?

Now that we know that memory is just a series of numbered cells, and variables are just nicknames for a memory location assigned by the compiler, what is a pointer?

A pointer is a value that points to the memory address of another variable.

The pointer points to the memory address of a variable, just as a variable represents the memory address of a value.

Let’s have a look at this program fragment:

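package main

import "fmt"

func main() {
        a := 200       // a holds the value 200
        b := &a        // b holds the address of a
        *b++           // increment the value that b points to
        fmt.Println(a) // prints 201
}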

On the first line of main we declare a new variable a and assign it the value 200.

Next we declare a variable b and assign it the address of a. Remember that we don’t know the exact memory location where a is stored, but we can still store a’s address in b.

The third line is probably the most confusing, because of the strongly typed nature of Go. b contains the address of variable a, but we want to increment the value stored in a. To do this we must dereference b, that is, follow the pointer from b to a.

Then we add one to the value, and store it back in the memory location stored in b.

The final line prints the value of a, showing that it has increased to 201.
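
The same mechanism is what lets a function change a variable owned by its caller. The sketch below uses a made-up increment function that receives the address of an int; the names are illustrative only.

```go
package main

import "fmt"

// increment adds one to the int that p points to.
// p holds an address; *p is the value stored at that address.
func increment(p *int) {
	*p++
}

func main() {
	a := 200
	b := &a // b holds the address of a

	increment(b)  // changes a through the pointer
	increment(&a) // the address can also be taken at the call site

	fmt.Println(a)       // prints 202
	fmt.Println(*b == a) // prints true: *b and a name the same memory cell
}
```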

Conclusion

If you are coming from a language with no notion of pointers, or where every variable is implicitly a pointer, don’t panic. Forming a mental model of how variables and pointers relate takes time and practice. Just remember this rule:

A pointer is a value that points to the memory address of another variable.

Should methods be declared on T or *T

This post is a continuation of a suggestion I made on Twitter a few days ago.

In Go, for any type T, there exists a type *T which is the result of an expression that takes the address of a variable of type T [1]. For example:

type T struct { a int; b bool }
var t T    // t's type is T
var p = &t // p's type is *T

These two types, T and *T, are distinct, and *T is not substitutable for T [2].
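
A minimal sketch of what "not substitutable" means in practice. The byValue and byPointer functions are hypothetical names introduced for this example; T is the struct declared above.

```go
package main

import "fmt"

type T struct {
	a int
	b bool
}

// byValue accepts a T; the function receives its own copy.
func byValue(t T) int { return t.a }

// byPointer accepts a *T; the function receives the address of a T.
func byPointer(p *T) int { return p.a }

func main() {
	var t T
	t.a = 42
	var p = &t

	fmt.Println(byValue(t))   // prints 42
	fmt.Println(byPointer(p)) // prints 42

	// The compiler rejects byValue(p) and byPointer(t): the types differ.
	// Moving between them has to be spelled out explicitly.
	fmt.Println(byValue(*p))   // prints 42: dereference to get a T
	fmt.Println(byPointer(&t)) // prints 42: take the address to get a *T
}
```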

You can declare a method on any type that you own; that is, a type that you declare in your package [3]. Thus it follows that you can declare methods on both the type you declare, T, and its corresponding derived pointer type, *T. Another way to say this is that methods on a type are declared to take either a copy of their receiver’s value or a pointer to their receiver’s value [4]. So the question becomes, which is the most appropriate form to use?

Obviously if your method mutates its receiver, it should be declared on *T. However, if the method does not mutate its receiver, is it safe to declare it on T instead [5]?
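
To make the first half of that concrete, here is a small sketch with a hypothetical Counter type (not from the original post). It shows that a mutation made in a method declared on T is applied to a copy and silently lost, while the same mutation in a method declared on *T reaches the caller’s value.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for this illustration.
type Counter struct {
	n int
}

// IncValue is declared on Counter, so it operates on a copy of the receiver.
func (c Counter) IncValue() {
	c.n++ // modifies the copy; the caller's Counter is untouched
}

// IncPointer is declared on *Counter, so it operates on the caller's value.
func (c *Counter) IncPointer() {
	c.n++
}

func main() {
	var c Counter

	c.IncValue()
	fmt.Println(c.n) // prints 0: the increment was applied to a copy

	c.IncPointer()   // shorthand for (&c).IncPointer()
	fmt.Println(c.n) // prints 1
}
```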

It turns out that the cases where it is safe to do so are very limited. For example, it is well known that you should not copy a sync.Mutex value as that breaks the invariants of the mutex. As mutexes control access to other things, they are frequently wrapped up in a struct with the value they control:

package counter

import "sync"

type Val struct {
        mu  sync.Mutex
        val int
}

func (v *Val) Get() int {
        v.mu.Lock()
        defer v.mu.Unlock()
        return v.val
}

func (v *Val) Add(n int) {
        v.mu.Lock()
        defer v.mu.Unlock()
        v.val += n
}

Most Go programmers know that it is a mistake to forget to declare the Get or Add methods on the pointer receiver *Val. However, any type that embeds a Val to utilise its zero value must also declare its methods only on its pointer receiver; otherwise it may inadvertently copy the contents of its embedded type’s values.

type Stats struct {
        a, b, c counter.Val
}

func (s Stats) Sum() int {
        return s.a.Get() + s.b.Get() + s.c.Get() // whoops
}
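
One way to avoid the copy, sketched under the assumption that everything lives in a single package main so the example is self-contained (Val is inlined here rather than imported from counter): declare Sum on *Stats, so that calling it never copies the mutexes held inside a, b and c.

```go
package main

import (
	"fmt"
	"sync"
)

// Val mirrors counter.Val from the post, inlined so the example stands alone.
type Val struct {
	mu  sync.Mutex
	val int
}

func (v *Val) Get() int {
	v.mu.Lock()
	defer v.mu.Unlock()
	return v.val
}

func (v *Val) Add(n int) {
	v.mu.Lock()
	defer v.mu.Unlock()
	v.val += n
}

type Stats struct {
	a, b, c Val
}

// Sum is declared on *Stats, so the receiver is an address and the three
// mutexes inside a, b and c are not copied when Sum is called.
func (s *Stats) Sum() int {
	return s.a.Get() + s.b.Get() + s.c.Get()
}

func main() {
	var s Stats // the zero value is ready to use
	s.a.Add(1)
	s.b.Add(2)
	s.c.Add(3)
	fmt.Println(s.Sum()) // prints 6
}
```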

A similar pitfall can occur with types that maintain slices of values, and of course there is the possibility for an unintended data race.
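
The post does not spell out the slice pitfall, so the following is one plausible instance rather than the author’s own example. Log, AddValue and AddPointer are hypothetical names. Appending inside a method declared on T updates the slice header held by the copy, and the caller never sees the new element.

```go
package main

import "fmt"

// Log is a hypothetical type that maintains a slice of entries.
type Log struct {
	entries []string
}

// AddValue is declared on Log. append returns an updated slice header, but it
// is stored in the method's copy of the receiver and lost on return.
func (l Log) AddValue(s string) {
	l.entries = append(l.entries, s)
}

// AddPointer is declared on *Log, so the updated slice header is stored in
// the caller's Log.
func (l *Log) AddPointer(s string) {
	l.entries = append(l.entries, s)
}

func main() {
	var l Log

	l.AddValue("lost")
	fmt.Println(len(l.entries)) // prints 0

	l.AddPointer("kept")
	fmt.Println(len(l.entries)) // prints 1
}
```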

In short, I think that you should prefer declaring methods on *T unless you have a strong reason to do otherwise.


  1. We say T but that is just a placeholder for a type that you declare.
  2. This rule is recursive; taking the address of a variable of type *T returns a result of type **T.
  3. This is why nobody can declare methods on primitive types like int.
  4. Methods in Go are just syntactic sugar for a function which passes the receiver as the first formal parameter.
  5. If the method does not mutate its receiver, does it need to be a method?
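
Footnote 4 can be seen directly through Go’s method expressions, which expose a method as an ordinary function whose first parameter is the receiver. The Point type and its methods below are hypothetical and exist only for this sketch.

```go
package main

import "fmt"

// Point is a hypothetical type used only for this illustration.
type Point struct {
	x, y int
}

// Sum takes a copy of its receiver.
func (p Point) Sum() int { return p.x + p.y }

// Scale takes a pointer to its receiver.
func (p *Point) Scale(k int) {
	p.x *= k
	p.y *= k
}

func main() {
	p := Point{x: 1, y: 2}

	// Method expressions: the receiver becomes the first parameter.
	sum := Point.Sum        // has type func(Point) int
	scale := (*Point).Scale // has type func(*Point, int)

	scale(&p, 10)
	fmt.Println(sum(p))  // prints 30
	fmt.Println(p.Sum()) // prints 30: the usual method call syntax
}
```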

Pointers in Go

This blog post was originally a comment on a Google Plus page, but apparently one cannot create a href to a comment so it was suggested I rewrite it as a blog post.


Go pointers, like C pointers, are values that, uh, point to other values. This is a tremendously important concept and shouldn’t be considered dangerous or something to get hung up on.

Here are several ways that Go improves over C pointers, and C++, for that matter.

  1. There is no pointer arithmetic. You cannot write in Go
    var p *int
    p++

    That is, you cannot alter the address p points to unless you assign another address to it.

  2. This means there is no pointer/array duality in Go. If you don’t know what I’m talking about, read this book. Even if you have no intention of programming in C or Go, it will enrich your life.
  3. Once a value is assigned to a pointer, with the exception of nil which I’ll cover in the next point, Go guarantees that the thing being pointed to will continue to be valid for the lifetime of the pointer. So
    func f() *int { 
            i := 1
            return &i
    }

    is totally safe to do in Go. The compiler will arrange for the memory location holding the value of i to be valid after f() returns.

  4. Nil pointers. Yes, you can still have nil pointers and panics because of them. However, in my experience the general level of hysteria generated by nil pointer errors, and the amount of defensive programming present in other languages like Java, is not present in Go.

    I believe this is for three reasons:

    1. Multiple return values: nil is not used as a sentinel for “something went wrong”. Obviously this leaves the question of programmers not checking their errors, but this is simply a matter of education.
    2. Strings are value types, not pointers; pointer-typed strings are, IMO, the number one cause of null pointer exceptions in languages like Java and C++.
      var s string // the zero value of s is "", not nil
    3. In fact, most of the built-in data types (maps, slices, channels, and arrays) have a sensible default if they are left uninitialized. Thanks to Dustin Sallings for pointing this out.
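
Points 3 and 4 above can be exercised in one short program. This is a sketch, not part of the original comment: newInt is a renamed copy of f() from point 3, and find is a hypothetical lookup that reports failure through a second return value instead of a nil sentinel.

```go
package main

import (
	"errors"
	"fmt"
)

// newInt returns the address of a local variable. Go keeps the variable
// alive for as long as the returned pointer can reach it.
func newInt() *int {
	i := 1
	return &i
}

// find is a hypothetical lookup that signals failure with an error value
// rather than by returning a nil pointer.
func find(m map[string]int, key string) (int, error) {
	v, ok := m[key]
	if !ok {
		return 0, errors.New("key not found")
	}
	return v, nil
}

func main() {
	p := newInt()
	*p++
	fmt.Println(*p) // prints 2

	var s string         // zero value is "", not nil
	var xs []int         // a nil slice has length 0 and can be appended to
	var m map[string]int // reading from a nil map yields the zero value

	xs = append(xs, 1)
	fmt.Println(len(s), len(xs), m["missing"]) // prints 0 1 0

	if _, err := find(m, "missing"); err != nil {
		fmt.Println("error:", err) // prints error: key not found
	}
}
```

One caveat on the "sensible default": reading from a nil map is fine, but writing to one panics, so a map still has to be created with make or a literal before it is assigned to.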