Go, without package scoped variables

This is a thought experiment, what would Go look like if we could no longer declare variables at the package level? What would be the impact of removing package scoped variable declarations, and what could we learn about the design of Go programs?

I’m only talking about expunging var, the other five top level declarations would still be permitted as they are effectively constant at compile time. You can, of course, continue to declare variables at the function or block scope.

Why are package scoped variables bad?

But first, why are package scoped variables bad? Putting aside the problem of globally visible mutable state in a heavily concurrent language, package scoped variables are fundamentally singletons, used to smuggle state between unrelated concerns, encourage tight coupling and makes the code that relies on them hard to test.

As Peter Bourgon wrote recently:

tl;dr: magic is bad; global state is magic → [therefore, you want] no package level vars; no func init.

Removing package scoped variables, in practice

To put this idea to the test I surveyed the most popular Go code base in existence; the standard library, to see how package scoped variables were used, and assessed the effect applying this experiment would have.

Errors

One of the most frequent uses of public package level var declarations are errors; io.EOF,
sql.ErrNoRowscrypto/x509.ErrUnsupportedAlgorithm, and so on. Removing the use of package scoped variables would remove the ability to use public variables for sentinel error values. But what could be used to replace them?

I’ve written previously that you should prefer behaviour over type or identity when inspecting errors. Where that isn’t possible, declaring error constants removes the potential for modification which retaining their identity semantics.

The remaining error variables are private declarations which give a symbolic name to an error message. These error values are unexported so they cannot be used for comparison by callers outside the package. Declaring them at the package level, rather than at the point they occur inside a function negates the opportunity to add additional context to the error. Instead I recommend using something like pkg/errors to capture a stack trace at the point the error occurs.

Registration

A registration pattern is followed by several packages in the standard library such as net/http, database/sql, flag, and to a lesser extent log. It commonly involves a package scoped private map or struct which is mutated by a public function—a textbook singleton.

Not being able to create a package scoped placeholder for this state would remove the side effects in the image, database/sql, and crypto packages to register image decoders, database drivers and cryptographic schemes. However, this is precisely the magic that Peter is referring to–importing a package for the side effect of changing some global state of your program is truly spooky action at a distance.

Registration also promotes duplicated business logic. The net/http/pprof package registers itself, via a side effect with net/http.DefaultServeMux, which is both a potential security issue—other code cannot use the default mux without exposing the pprof endpoints—and makes it difficult to convince the net/http/pprof package to register its handlers with another mux.

If package scoped variables were no longer used, packages like net/http/pprof could provide a function that registers routes on a supplied http.ServeMux, rather than relying on side effects to altering global state.

Removing the ability to apply the registry pattern would also solve the issues encountered when multiple copies of the same package are imported in the final binary and try to register themselves during startup.

Interface satisfaction assertions

The interface satisfaction idiom

var _ SomeInterface = new(SomeType)

occurred at least 19 times in the standard library. In my opinion these assertions are tests. They don’t need to be compiled, only to be eliminated, every time you build your package. Instead they should be moved to the corresponding _test.go file. But if we’re prohibiting package scoped variables, this prohibition also applies to tests, so how can we keep this test?

One option is to move the declaration from package scope to function scope, which will still fail to compile if SomeType stop implementing SomeInterface

func TestSomeTypeImplementsSomeInterface(t *testing.T) {
       // won't compile if SomeType does not implement SomeInterface
       var _ SomeInterface = new(SomeType)
}

But, as this is actually a test, it’s not hard to rewrite this idiom as a standard Go test.

func TestSomeTypeImplementsSomeInterface(t *testing.T) {
       var i interface{} = new(SomeType)
       if _, ok := i.(SomeInterface); !ok {
               t.Fatalf("expected %t to implement SomeInterface", i)
       }
}

As a side note, because the spec says that assignment to the blank identifier must fully evaluate the right hand side of the expression, there are probably a few suspicious package level initialisation constructs hidden in those var declarations.

It’s not all beer and skittles

The previous sections showed that avoiding package scoped variables might be possible, but there are some areas of the standard library which have proved more difficult to apply this idea.

Real singletons

While I think that the singleton pattern is generally overplayed, especially in its registration form, there are always some real singleton values in every program. A good example of this is  os.Stdout and friends.

package os 

var (
        Stdin  = NewFile(uintptr(syscall.Stdin), "/dev/stdin")
        Stdout = NewFile(uintptr(syscall.Stdout), "/dev/stdout")
        Stderr = NewFile(uintptr(syscall.Stderr), "/dev/stderr")
)

There are a few problems with this declaration. Firstly Stdin, Stdout, and Stderr are of type *os.File, not their respective io.Reader or io.Writer interfaces. This makes replacing them with alternatives problematic. However the notion of replacing them is exactly the kind of magic that this experiment seeks to avoid.

As the previous constant error example showed, we can retain the singleton nature of the standard IO file descriptors, such that packages like log and fmt can address them directly, but avoid declaring them as mutable public variables with something like this:

package main

import (
        "fmt"
        "syscall"
)

type readfd int

func (r readfd) Read(buf []byte) (int, error) {
        return syscall.Read(int(r), buf)
}

type writefd int

func (w writefd) Write(buf []byte) (int, error) {
        return syscall.Write(int(w), buf)
}

const (
        Stdin  = readfd(0)
        Stdout = writefd(1)
        Stderr = writefd(2)
)

func main() {
        fmt.Fprintf(Stdout, "Hello world")
}

Caches

The second most common use of unexported package scoped variables are caches. These come in two forms; real caches made out of maps (see the registration pattern above) and sync.Pool, and quasi constant variables that ameliorate the cost of a compilation.

As an example the crypto/ecsda package has a zr type whose Read method zeros any buffer passed to it. The package keeps a single instance of zr around because it is embedded in other structs as an io.Reader, potentially escaping to the heap each time it is instantiated.

package ecdsa 

type zr struct {
        io.Reader
}

// Read replaces the contents of dst with zeros.
func (z *zr) Read(dst []byte) (n int, err error) {
        for i := range dst {
                dst[i] = 0
        }
        return len(dst), nil
}

var zeroReader = &zr{}

However zr doesn’t embed an io.Reader, it is an io.Reader, so the unused zr.Reader field could be eliminated, giving zr a width of zero. In my testing this modified type can be created directly where it is used without performance regression.

        csprng := cipher.StreamReader{
                R: zr{},
                S: cipher.NewCTR(block, []byte(aesIV)),
        }

Perhaps some of the caching decision could be revisited as the inlining and escape analysis options available to the compiler have improved significantly since the standard library was first written.

Tables

The last major use of  common use of private package scoped variables is for tables, as seen in the unicode, crypto/*, and math packages. These tables either encode constant data in the form of arrays of integer types, or less commonly simple structs and maps.

Replacing package scoped variables with constants would require a language change along the lines of #20443. So, fundamentally, providing there is no way to modify those tables at run time, they are probably a reasonable exception to this proposal.

A bridge too far

Even though this post was just a thought experiment, it’s clear that forbidding all package scoped variables is too draconian to be workable as a language precept. Addressing the bespoke uses of private var usage may prove impractical from a performance standpoint, would be akin to pinning a “kick me” sign to ones back and inviting all the Go haters to take a free swing.

However, I believe there are a few concrete recommendations that can be drawn from this exercise, without going to the extreme of changing the language spec.

  • Firstly, public var declarations should be eschewed. This is not a controversial conclusion and not one that is unique to Go. The singleton pattern is discouraged, and an unadorned public variable that can be changed at any time by any party that knows its name should be a design, and concurrency, red flag.
  • Secondly, where public package var declarations are used, the type of those variables should be carefully constructed to expose as little surface area as possible. It should not be the default to take a type expected to be used on a per instance basis, and assign it to a package scoped variable.

Private variable declarations are more nuanced, but certain patterns can be observed:

  • Private variables with public setters, which I labelled registries, have the same effect on the overall program design as their public counterparts. Rather than registering dependencies globally, they should instead be passed in during declaration using a constructor function, compact literal, config structure, or option function.
  • Caches of []byte vars can often be expressed as consts at no performance cost.  Don’t forget the compiler is pretty good at avoiding string([]byte) conversions where they don’t escape the function call.
  • Private variables that hold tables, like the unicode package, are an unavoidable consequence of the lack of a constant array type. As long as they are unexported, and do not expose any way to mutate them, they can be considered effectively constant for the purpose of this discussion.

The bottom line; think long and hard about adding package scoped variables that are mutated during the operation of your program. It may be a sign that you’ve introduced magic global state.