A short talk about unit testing that I gave at the Go London User Group last month.
Links:
The name of a variable should describe its contents, not the type of the contents. Consider this example:
var usersMap map[string]*User
What are some good properties of this declaration? We can see that it’s a map, and it has something to do with the *User type, so that’s probably good. But usersMap is a map and Go, being a statically typed language, won’t let us accidentally use a map where a different type is required, so the Map suffix as a safety precaution is redundant.
Now, consider what happens if we declare other variables using this pattern:
var (
companiesMap map[string]*Company
productsMap map[string]*Products
)
Now we have three map type variables in scope, usersMap, companiesMap, and productsMap, all mapping strings to different struct types. We know they are maps, and we also know that their declarations prevent us from using one in place of another; the compiler will throw an error if we try to use companiesMap where the code is expecting a map[string]*User. In this situation it’s clear that the Map suffix does not improve the clarity of the code, it’s just extra boilerplate to type.
My suggestion is to avoid any suffix that resembles the type of the variable. Said another way, if users isn’t descriptive enough, then usersMap won’t be either.
This advice also applies to function parameters. For example:
type Config struct {
//
}
func WriteConfig(w io.Writer, config *Config)
Naming the *Config parameter config is redundant. We know it’s a pointer to a Config, it says so right there in the declaration. Instead consider if conf will do, or maybe just c if the lifetime of the variable is short enough.
This advice is more than just a desire for brevity. If there is more than one *Config in scope at any one time, calling them config1 and config2 is less descriptive than calling them original and updated. The latter are less likely to be accidentally transposed (something the compiler won’t catch), while the former differ only in a one character suffix.
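As a minimal sketch of that point, here is a function that takes two values of the same type and names them by role. The Config fields and the Changed function are hypothetical, invented only for this illustration.

```go
package main

import "fmt"

// Config is a hypothetical configuration type used only for this illustration.
type Config struct {
	Addr    string
	Verbose bool
}

// Changed reports whether updated differs from original.
// The parameter names describe the role each value plays; the type
// already tells the reader that both are *Config.
func Changed(original, updated *Config) bool {
	return *original != *updated
}

func main() {
	original := &Config{Addr: ":8080"}
	updated := &Config{Addr: ":8080", Verbose: true}
	fmt.Println(Changed(original, updated)) // prints true
}
```

Swapping the two arguments at a call site still compiles, but with role-based names the mistake is much easier to spot in review than it would be with config1 and config2.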
Finally, don’t let package names steal good variable names. The name of an imported identifier includes its package name. For example, the Context type in the context package will be known as context.Context when imported into another package. This makes it impossible to use context as a variable or type, unless of course you rename the import, but that’s throwing good after bad. This is why the local declaration for context.Context types is traditionally ctx, e.g.
func WriteLog(ctx context.Context, message string)
A variable’s name should be independent of its type. You shouldn’t name your variables after their types for the same reason you wouldn’t name your pets “dog” or “cat”. You shouldn’t include the name of your type in the name of your variable for the same reason.
Go 2 aims to improve the overhead of error handling, but do you know what is better than an improved syntax for handling errors? Not needing to handle errors at all. Now, I’m not saying “delete your error handling code”, instead I’m suggesting changing your code so you don’t have as many errors to handle.
This article draws inspiration from a chapter in John Ousterhout’s A Philosophy of Software Design, “Define Errors Out of Existence”. I’m going to try to apply his advice to Go.
Here’s a function to count the number of lines in a file,
func CountLines(r io.Reader) (int, error) {
var (
br = bufio.NewReader(r)
lines int
err error
)
for {
_, err = br.ReadString('\n')
lines++
if err != nil {
break
}
}
if err != io.EOF {
return 0, err
}
return lines, nil
}
We construct a bufio.Reader, then sit in a loop calling the ReadString method, incrementing a counter until we reach the end of the file, then we return the number of lines read. That’s the code we wanted to write; instead CountLines is made more complicated by its error handling. For example, there is this strange construction:
_, err = br.ReadString('\n')
lines++
if err != nil {
break
}
We increment the count of lines before checking the error, which looks odd. The reason we have to write it this way is that ReadString will return an error if it encounters an end-of-file (io.EOF) before hitting a newline character. This can happen if there is no trailing newline.
To address this problem, we rearrange the logic to increment the line count, then see if we need to exit the loop.
But we’re not done checking errors yet. ReadString will return io.EOF when it hits the end of the file. This is expected; ReadString needs some way of saying stop, there is nothing more to read. So before we return the error to the caller of CountLines, we need to check if the error was not io.EOF, and in that case propagate it up, otherwise we return nil to say that everything worked fine. This is why the final line of the function is not simply
return lines, err
I think this is a good example of Russ Cox’s observation that error handling can obscure the operation of the function. Let’s look at an improved version.
func CountLines(r io.Reader) (int, error) {
sc := bufio.NewScanner(r)
lines := 0
for sc.Scan() {
lines++
}
return lines, sc.Err()
}
This improved version switches from using bufio.Reader to bufio.Scanner. Like bufio.Reader, bufio.Scanner buffers its input, but it adds a layer of abstraction which helps remove the error handling which obscured the operation of our previous version of CountLines.
The method sc.Scan() returns true if the scanner has matched a line of text and has not encountered an error. So, the body of our for loop will be called only when there is a line of text in the scanner’s buffer. This means our revised CountLines correctly handles the case where there is no trailing newline. It also correctly handles the case where the file is empty.
Secondly, as sc.Scan returns false once an error is encountered, our for loop will exit when the end-of-file is reached or an error is encountered. The bufio.Scanner type memoises the first error it encounters and we recover that error once we’ve exited the loop using the sc.Err() method.
Lastly, bufio.Scanner takes care of handling io.EOF and will convert it to a nil if the end of file was reached without encountering another error.
My second example is inspired by Rob Pike’s Errors are values blog post.
When dealing with opening, writing and closing files, the error handling is present but not overwhelming, as the operations can be encapsulated in helpers like ioutil.ReadFile and ioutil.WriteFile. However, when dealing with low level network protocols it often becomes necessary to build the response directly using I/O primitives, thus the error handling can become repetitive. Consider this fragment of an HTTP server which is constructing an HTTP/1.1 response.
type Header struct {
Key, Value string
}
type Status struct {
Code int
Reason string
}
func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
_, err := fmt.Fprintf(w, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)
if err != nil {
return err
}
for _, h := range headers {
_, err := fmt.Fprintf(w, "%s: %s\r\n", h.Key, h.Value)
if err != nil {
return err
}
}
if _, err := fmt.Fprint(w, "\r\n"); err != nil {
return err
}
_, err = io.Copy(w, body)
return err
}
First we construct the status line using fmt.Fprintf, and check the error. Then for each header we write the header key and value, checking the error each time. Lastly we terminate the header section with an additional \r\n, check the error, and copy the response body to the client. Finally, although we don’t need to check the error from io.Copy, we do need to translate it from the two return value form that io.Copy returns into the single return value that WriteResponse expects.
Not only is this a lot of repetitive work, each operation (fundamentally writing bytes to an io.Writer) has a different form of error handling. But we can make it easier on ourselves by introducing a small wrapper type.
type errWriter struct {
io.Writer
err error
}
func (e *errWriter) Write(buf []byte) (int, error) {
if e.err != nil {
return 0, e.err
}
var n int
n, e.err = e.Writer.Write(buf)
return n, e.err
}
errWriter fulfils the io.Writer contract so it can be used to wrap an existing io.Writer. errWriter passes writes through to its underlying writer until an error is detected. From that point on, it discards any writes and returns the previous error.
func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
ew := &errWriter{Writer: w}
fmt.Fprintf(ew, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)
for _, h := range headers {
fmt.Fprintf(ew, "%s: %s\r\n", h.Key, h.Value)
}
fmt.Fprint(ew, "\r\n")
io.Copy(ew, body)
return ew.err
}
Applying errWriter to WriteResponse dramatically improves the clarity of the code. Each of the operations no longer needs to bracket itself with an error check. Reporting the error is moved to the end of the function by inspecting the ew.err field, avoiding the annoying translation from io.Copy’s return values.
When you find yourself faced with overbearing error handling, try to extract some of the operations into a helper type.
Writing a good Go package starts with its name. Think of your package’s name as an elevator pitch, you have to describe what it does using just one word.
A common cause of poor package names is utility packages. These are packages where helpers and utility code congeal. These packages contain an assortment of unrelated functions, as such their utility is hard to describe in terms of what the package provides. This often leads to a package’s name being derived from what the package contains: utilities.
Package names like utils or helpers are commonly found in projects which have developed deep package hierarchies and want to share helper functions without introducing import loops. Extracting utility functions to a new package breaks the import loop, but as the package stems from a design problem in the project, its name doesn’t reflect its purpose, only its function in breaking the import cycle.
[A little] duplication is far cheaper than the wrong abstraction.
– Sandi Metz
My recommendation to improve the name of utils or helpers packages is to analyse where they are imported and move the relevant functions into the calling package. Even if this results in some code duplication this is preferable to introducing an import dependency between two packages. In the case where utility functions are used in many places, prefer multiple packages, each focused on a single aspect with a correspondingly descriptive name.
Packages with names like base or common are often found when functionality common to two or more related facilities, for example common types between a client and server or a server and its mock, has been refactored into a separate package. Instead the solution is to reduce the number of packages by combining client, server, and common code into a single package named after the facility the package provides.
For example, the net/http package does not have client and server packages, instead it has client.go and server.go files, each holding their respective types. transport.go holds the common message transport code used by both HTTP clients and servers.
Name your packages after what they provide, not what they contain.
Garbage collection is a field with its own terminology. Concepts like mutators, card marking, and write barriers create a hurdle to understanding how garbage collectors work. Here’s an analogy to explain the operations of a concurrent garbage collector using everyday items found in the workplace.
Before we discuss the operation of concurrent garbage collection, let’s introduce the dramatis personae. In offices around the world you’ll find one of these:
In the workplace coffee is a natural resource. Employees visit the break room and fill their cups as required. That is, until the point someone goes to fill their cup only to discover the pot is empty!
Immediately the office is thrown into chaos. Meetings are called. Investigations are held. The perpetrator who took the last cup without refilling the machine is found and reprimanded. Despite many passive aggressive notes the situation keeps happening, thus a committee is formed to decide if a larger coffee pot should be requisitioned. Once the coffee maker is again full office productivity slowly returns to normal.
This is the model of stop the world garbage collection. The various parts of your program proceed through their day consuming memory, or in our analogy coffee, without a care about the next allocation that needs to be made. Eventually one unlucky attempt to allocate memory is made only to find the heap, or the coffee pot, exhausted, triggering a stop the world garbage collection.
Down the road at a more enlightened workplace, management have adopted a different strategy for mitigating their break room’s coffee problems. Their policy is simple: if the pot is more than half full, fill your cup and be on your way. However, if the pot is less than half full, before filling your cup, you must add a little coffee and a little water to the top of the machine. In this way, by the time the next person arrives for their re-up, the level in the pot will hopefully have risen higher than when the first person found it.
This policy does come at a cost to office productivity. Rather than filling their cup and hoping for the best, each worker may, depending on the aggregate level of consumption in the office, have to spend a little time refilling the percolator and topping up the water. However, this is time spent by a person who was already heading to the break room. It costs a few extra minutes to maintain the coffee machine, but does not impact their officemates who aren’t in need of caffeination. If several people take a break at the same time, they will all find the level in the pot below the half way mark and all proceed to top up the coffee maker–the more consumption, the greater the rate the machine will be refilled, although this takes a little longer as the break room becomes congested.
This is the model of concurrent garbage collection as practiced by the Go runtime (and probably other language runtimes with concurrent collectors). Rather than each heap allocation proceeding blindly until the heap is exhausted, leading to a long stop the world pause, concurrent collection algorithms spread the work of walking the heap to find memory which is no longer reachable over the parts of the program allocating memory. In this way the parts of the program which allocate memory each pay a small cost–in terms of latency–for those allocations rather than the whole program being forced to halt when the heap is exhausted.
Lastly, in keeping with the office coffee model, if the rate of coffee consumption in the office is so high that management discovers that their staff are always in the break room trying desperately to refill the coffee machine, it’s time to invest in a machine with a bigger pot–or in garbage collection terms, grow the heap.
As the tech lead on a non-SaaS product I spend a lot of my time worrying about testing. Specifically we have tests that cover code, but what is covering the tests? Tests are important to give you certainty that what your product says on the tin is what it will do when people take it home and unwrap it, but what’s backstopping the tests? Testing lets you refactor with impunity, but what if you want to refactor your tests?
This presentation by Ian Cooper takes a little while to get going but is worth persisting with. Cooper’s observations that the unit of the unit test is not a type, or a class, but the API–in Go terms, the public API of a package–was revelatory for me.
Bonus: Michael Feathers’ YOW! 2016 presentation, Testing Patience.
This is a short response to the recently announced Go 2 generics draft proposals.
Update: This proposal is incomplete. It cannot replace two common use cases. The first is ensuring that several formal parameters are of the same type:
contract comparable(t T) {
t > t
}
func max(type T comparable)(a, b T) T
Here a and b must be the same parameterised type; my suggestion would only assert that they had at least the same contract.
Secondly, it would not be possible to parameterise the type of return values:
contract viaStrings(t To, f From) {
var x string = f.String()
t.Set(string(""))
}
func SetViaStrings(type To, From viaStrings)(s []From) []To
Thanks to Ian Dawes and Sam Whited for their insight.
Bummer.
My lasting reaction to the Generics proposal is the proliferation of parentheses in function declarations.
Although several of the Go team suggested that generics would probably be used sparingly, and the additional syntax would only be a burden for the writer of the generic code, not the reader, I am sceptical that this long requested feature will be sufficiently niche as to be unnoticed by most Go developers.
While it is true that type parameters can be inferred from their arguments, the declaration of generic functions and methods requires a clumsy (type parameter declaration in place of the more common <T> syntaxes found in C++ and Java.
The reason for (type, it was explained to me, is that Go is designed to be parsed without a symbol table. This rules out both <T> and [T] syntaxes, as the parser needs to know ahead of time what kind of declaration a T is to avoid interpreting the angle or square brackets as comparison or indexing operators respectively.
The astute Roger Peppe quickly identified that contracts represent a superset of interfaces:
Any behaviour you can express with an interface, you can do so and more, with a contract.
The remainder of this post is my suggestion for an alternative generic function declaration syntax that avoids adding additional parentheses by leveraging Roger’s observation.
The earlier Type Functions proposal showed that a type declaration can support a parameter. If this is correct, then the proposed contract declaration could be rewritten from
contract stringer(x T) {
var s string = x.String()
}
to
type stringer(x T) contract {
var s string = x.String()
}
This supports Roger’s observation that a contract is a superset of an interface. type stringer(x T) contract { ... } introduces a new contract type in the same way type stringer interface { ... } introduces a new interface type.
Whether you buy my argument that a contract is a kind of type is debatable, but if you’re prepared to take it on faith then the remainder of the syntax introduced in the generics proposal could be further simplified.
If a contract is an identifier then we can use a contract anywhere that a built-in type or interface is used. For example
func Stringify(type T stringer)(s []T) (ret []string) {
for _, v := range s {
ret = append(ret, v.String())
}
return ret
}
Could be expressed as
func Stringify(s []stringer) (ret []string) {
for _, v := range s {
ret = append(ret, v.String())
}
return ret
}
That is, in place of explicitly binding T to the contract stringer only for T to be referenced seven characters later, we bind the formal parameter s to a slice of stringers directly. The similarity with the way this would previously be done with a stringer interface emphasises Roger’s observation.
The first example in the design proposal introduces an unknown type parameter.
func Print(type T)(s []T) {
for _, v := range s {
fmt.Println(v)
}
}
The operations on unknown types are limited, they are in some senses values that can only be read. Again drawing on Roger’s observation above, the syntax could potentially be expressed as:
func Print(s []contract{}) {
for _, v := range s {
fmt.Println(v)
}
}
Or maybe even
type T contract {}
func Print(s []T) {
for _, v := range s {
fmt.Println(v)
}
}
In essence the literal contract{} syntax defines an anonymous unknown type analogous to interface{}’s anonymous interface type.
The great irony is, after years of my bloviation that “adding generics to Go has nothing to do with the syntax”, it turns out that, actually, yes, the syntax is crucial.
In my previous post I converted httpstat to use Go 1.11’s upcoming module support. In this post I continue to explore integrating Go modules into a continuous integration workflow via Travis CI.
The first scenario is probably the most likely for existing Go projects, a library or application targeting Go 1.10 and Go 1.11. httpstat has an existing CI story–I’m using Travis CI for my examples, if you use something else, please blog about your experience–and I wanted to test against the current and development versions of Go.
The straddling of two worlds is best accomplished via the GO111MODULE environment variable. GO111MODULE dictates when the Go module behaviour will be preferred over the Go 1.5-1.10 vendor/ directory behaviour. In Go 1.11 the Go module behaviour is disabled by default for packages within $GOPATH (this also includes the default $GOPATH introduced in Go 1.8). Thus, without additional configuration, Go 1.11 inside Travis CI will behave like Go 1.10.
In my previous post I chose the working directory ~/devel/httpstat to ensure I was not working within a $GOPATH workspace. However CI vendors have worked hard to make sure that their CI bots always check out the branch under test inside a working $GOPATH.
Fortunately there is a simple workaround for this: add env GO111MODULE=on before any go build or test invocations in your .travis.yml to force Go module behaviour and ignore any vendor/ directories that may be present inside your repo.
language: go
go:
- 1.10.x
- master
os:
- linux
- osx
dist: trusty
sudo: false
install: true
script:
- env GO111MODULE=on go build
- env GO111MODULE=on go test
You’ll note that I didn’t check in the go.mod module manifest I created in my previous post. This was initially an accident on my part, but one that turned out to be beneficial. By not checking in the go.mod file, the source of truth for dependencies remained httpstat’s Gopkg.toml file. When the call to env GO111MODULE=on go build executes on the Travis CI builder, the go tool converts my Gopkg.toml on the fly, then uses it to fetch dependencies before building.
$ env GO111MODULE=on go build
go: creating new go.mod: module github.com/davecheney/httpstat
go: copying requirements from Gopkg.lock
go: finding github.com/fatih/color v1.5.0
go: finding golang.org/x/sys v0.0.0-20170922123423-429f518978ab
go: finding golang.org/x/net v0.0.0-20170922011244-0744d001aa84
go: finding golang.org/x/text v0.0.0-20170915090833-1cbadb444a80
go: finding github.com/mattn/go-colorable v0.0.9
go: finding github.com/mattn/go-isatty v0.0.3
go: downloading github.com/fatih/color v1.5.0
go: downloading github.com/mattn/go-colorable v0.0.9
go: downloading github.com/mattn/go-isatty v0.0.3
go: downloading golang.org/x/net v0.0.0-20170922011244-0744d001aa84
go: downloading golang.org/x/text v0.0.0-20170915090833-1cbadb444a80
If you’re not using a dependency management tool that go mod knows how to convert from, this advice may not work for you and you may have to maintain a go.mod manifest in parallel with your previous dependency management solution.
The second option I investigated, but ultimately did not pursue, was to treat the Travis CI builder, like my fresh Ubuntu 18.04 install, as a blank canvas. Rather than working around Travis CI’s attempts to check the branch out inside a working $GOPATH I experimented with treating the build as a C project then invoking gimme directly. This also required me to check in my go.mod file as without Travis’ language: go support, the checkout was not moved into a $GOPATH folder. The latter seems like a reasonable approach if your project doesn’t intend to be compatible with Go 1.10 or earlier.
language: c
os:
- linux
- osx
dist: trusty
sudo: false
install:
- eval "$(curl -sL https://raw.githubusercontent.com/travis-ci/gimme/master/gimme | GIMME_GO_VERSION=master bash)"
script:
- go build
- go test
You can see the output from this branch here.
Sadly when run in this mode gimme is unable to take advantage of the caching provided by the language: go environment and must build Go 1.11 from source, adding three to four minutes delay to the install phase of the build. Once Go 1.11 is released and gimme can source a binary distribution this will hopefully address the setup latency.
Ultimately this option may end up being redundant if GO111MODULE=on becomes the default behaviour in Go 1.12 and the location Travis places the checkout becomes immaterial.
Update: Since this post was written, Go 1.11beta2 has been released. I’ve updated the setup section to reflect this. Russ Cox kindly wrote to me to explain the reasoning behind storing the Go module cache in $GOPATH. I’ve included his response inline.
This weekend I wanted to play with Ubuntu 18.04 on a spare machine. This gave me a perfect excuse to try out the modules feature recently merged into the Go 1.11 development branch.
TL;DR: When Go 1.11 ships you’ll be able to download the tarball and unpack it anywhere you like. When Go 1.11 ships you’ll be able to write Go modules anywhere you like.
The recently released Go 1.11beta2 has support for Go modules.
% curl https://dl.google.com/go/go1.11beta2.linux-amd64.tar.gz | \
tar xz --transform=s/^go/go1.11/g
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 169M 100 169M 0 0 23.6M 0 0:00:07 0:00:07 --:--:-- 21.2M
% go1.11/bin/go version
go version go1.11beta2 linux/amd64
That’s all you need to do to install Go 1.11beta2. Out of shot, I’ve added $HOME/go1.11/bin to my $PATH.
Now we have a version of Go with module support installed, I wanted to try to use it to manage the dependencies for httpstat, a clone of the Python tool of the same name that many collaborators swarmed on to build in late 2016.
To show that Go 1.11 won’t need you to declare a $GOPATH or use a specific directory layout for the location of your project, I’m going to use my favourite directory for source code, ~/devel.
% git clone https://github.com/davecheney/httpstat devel/httpstat
Cloning into 'devel/httpstat'...
remote: Counting objects: 2326, done.
remote: Total 2326 (delta 0), reused 0 (delta 0), pack-reused 2326
Receiving objects: 100% (2326/2326), 8.73 MiB | 830.00 KiB/s, done.
Resolving deltas: 100% (673/673), done.
Checking out files: 100% (1361/1361), done.
% cd devel/httpstat
% go mod -init -module github.com/davecheney/httpstat
go: creating new go.mod: module github.com/davecheney/httpstat
go: copying requirements from Gopkg.lock
Nice, go mod -init translated my existing Gopkg.lock file into its own go.mod format.
% cat go.mod
module github.com/davecheney/httpstat
require (
github.com/fatih/color v1.5.0
github.com/mattn/go-colorable v0.0.9
github.com/mattn/go-isatty v0.0.3
golang.org/x/net v0.0.0-20170922011244-0744d001aa84
golang.org/x/sys v0.0.0-20170922123423-429f518978ab
golang.org/x/text v0.0.0-20170915090833-1cbadb444a80
)
Let’s give it a try
% go build
go: finding golang.org/x/net v0.0.0-20170922011244-0744d001aa84
go: finding github.com/mattn/go-colorable v0.0.9
go: finding github.com/mattn/go-isatty v0.0.3
go: finding golang.org/x/sys v0.0.0-20170922123423-429f518978ab
go: finding github.com/fatih/color v1.5.0
go: finding golang.org/x/text v0.0.0-20170915090833-1cbadb444a80
go: downloading github.com/fatih/color v1.5.0
go: downloading github.com/mattn/go-isatty v0.0.3
go: downloading golang.org/x/net v0.0.0-20170922011244-0744d001aa84
go: downloading github.com/mattn/go-colorable v0.0.9
go: downloading golang.org/x/text v0.0.0-20170915090833-1cbadb444a80
Very nice, go build ignored the vendor/ folder in this repository (because we’re outside $GOPATH) and fetched the revisions it needed. Let’s try out the binary and make sure it works.
% ./httpstat golang.org
Connected to 216.58.196.145:443
HTTP/2.0 200 OK
Server: Google Frontend
Alt-Svc: quic=":443"; ma=2592000; v="44,43,39,35"
Cache-Control: private
Content-Type: text/html; charset=utf-8
Date: Sat, 14 Jul 2018 08:20:43 GMT
Strict-Transport-Security: max-age=31536000; preload
Vary: Accept-Encoding
X-Cloud-Trace-Context: 323cd59570cc084fed506f7e85d79d9f
Body discarded
Move along, nothing to see here.
In the previous version of this article I included a footnote mentioning that go get in module mode stored its downloaded source in $GOPATH/src/mod not the cache added in Go 1.10. Russ Cox kindly wrote to me to explain the rationale behind this choice and also copied this to a recent thread on golang-dev. For completeness, here is his response:
The build cache ($GOCACHE, defaulting to $HOME/.cache/go-build) is for storing recent compilation results, so that if you need to do that exact compilation again, you can just reuse the file. The build cache holds entries that are like “if you run this exact compiler on these exact inputs, this is the output you’d get.” If the answer is not in the cache, your build uses a little more CPU to run the compiler instead of reusing the output. But you are guaranteed to be able to run the compiler instead, since you have the exact inputs and the compiler binary (or else you couldn’t even look up the answer in the cache).
The module cache ($GOPATH/src/mod, defaulting to $HOME/go/src/mod) is for storing downloaded source code, so that every build does not redownload the same code and does not require the network or the original code to be available. The module cache holds entries that are like “if you need to download mymodule@v1.2.3, here are the files you’d get.” If the answer is not in the cache, you have to go out to the network. Maybe you don’t have a network right now. Maybe the code has been deleted. It’s not anywhere near guaranteed that you can redownload the sources and also get the same result. Hopefully you can, but it’s not an absolute certainty like for the build cache. (The go.sum file will detect if you get a different answer on re-download, but knowing you got the wrong bits doesn’t help you make progress on actually building your code. Also these paths end up in file-line information in binaries, so they show up in stack traces, and the like and feed into tools like text editors or debuggers that don’t necessarily know how to trigger the right cache refresh.)
You can build Go 1.11 from source right now anywhere you like. You don’t need to set an environment variable or follow a predefined location.
With Go 1.11 and modules you can write your Go modules anywhere you like. You’re no longer forced into having one copy of a project checked out in a specific sub directory of your $GOPATH.
This blog post was inspired by a conversation with a co-worker about using a slice as a stack. The conversation turned into a wider discussion on the way slices work in Go, so I thought it would be useful to write it up.
Every discussion of Go’s slice type starts by talking about something that isn’t a slice, namely, Go’s array type. Arrays in Go have two relevant properties:
First, they have a fixed size: [5]int is both an array of 5 ints and is distinct from [3]int.
Second, they are value types. Consider this example:
package main

import "fmt"

func main() {
	var a [5]int
	b := a
	b[2] = 7
	fmt.Println(a, b) // prints [0 0 0 0 0] [0 0 7 0 0]
}
The statement b := a declares a new variable, b, of type [5]int, and copies the contents of a to b. Updating b has no effect on the contents of a because a and b are independent values.
Go’s slice type differs from its array counterpart in two important ways:
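First, a slice’s length is not part of its type; it is held within the slice itself and is recoverable with the built-in function len.
Second, assigning one slice variable to another does not copy the slice’s contents, because a slice holds a pointer to an underlying array rather than the contents themselves.
As a result of the second property, two slices can share the same underlying array. Consider these examples: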
package main

import "fmt"

func main() {
	var a = []int{1, 2, 3, 4, 5}
	b := a[2:]
	b[0] = 0
	fmt.Println(a, b) // prints [1 2 0 4 5] [0 4 5]
}
In this example a and b share the same underlying array, even though b starts at a different offset in that array, and has a different length. Changes to the underlying array via b are thus visible to a.
package main

import "fmt"

func negate(s []int) {
	for i := range s {
		s[i] = -s[i]
	}
}

func main() {
	var a = []int{1, 2, 3, 4, 5}
	negate(a)
	fmt.Println(a) // prints [-1 -2 -3 -4 -5]
}
In this example a is passed to negate as the formal parameter s. negate iterates over the elements of s, negating their sign. Even though negate does not return a value, or have any way to access the declaration of a in main, the contents of a are modified when passed to negate.
Most programmers have an intuitive understanding of how a Go slice’s underlying array works because it matches how array-like concepts in other languages tend to work. For example, here’s the first example of this section rewritten in Python:
Python 2.7.10 (default, Feb 7 2017, 00:08:15)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = [1,2,3,4,5]
>>> b = a
>>> b[2] = 0
>>> a
[1, 2, 0, 4, 5]
And also in Ruby:
irb(main):001:0> a = [1,2,3,4,5]
=> [1, 2, 3, 4, 5]
irb(main):002:0> b = a
=> [1, 2, 3, 4, 5]
irb(main):003:0> b[2] = 0
=> 0
irb(main):004:0> a
=> [1, 2, 0, 4, 5]
The same applies to most languages that treat arrays as objects or reference types.4
The key to understanding how a slice can behave both as a value and as a pointer is that a slice is actually a struct type. This is commonly referred to as a slice header after its counterpart in the reflect package. The definition of a slice header looks something like this:
package runtime

type slice struct {
	ptr unsafe.Pointer
	len int
	cap int
}
This is important because, unlike map and chan types, slices are value types and are copied when assigned or passed as arguments to functions.
To illustrate this, programmers instinctively understand that square’s formal parameter v is an independent copy of the v declared in main.
package main

import "fmt"

func square(v int) {
	v = v * v
}

func main() {
	v := 3
	square(v)
	fmt.Println(v) // prints 3, not 9
}
So the operation of square on its v has no effect on main’s v. So too the formal parameter s of double is an independent copy of the slice s declared in main, not a pointer to main’s s value.
package main

import "fmt"

func double(s []int) {
	s = append(s, s...)
}

func main() {
	s := []int{1, 2, 3}
	double(s)
	fmt.Println(s, len(s)) // prints [1 2 3] 3
}
The slightly unusual nature of a Go slice variable is that it’s passed around as a value, rather than a pointer. 90% of the time when you declare a struct in Go, you will pass around a pointer to values of that struct.5 Passing a struct around as a value is quite uncommon; the only other example I can think of offhand is time.Time.
It is this exceptional behaviour of slices as values, rather than pointers to values, that can confuse Go programmers’ understanding of how slices work. Just remember that any time you assign, subslice, pass, or return a slice, you’re making a copy of the three fields in the slice header: the pointer to the underlying array, and the current length and capacity.
I’m going to conclude this post with the example of a slice as a stack that I opened with:
package main

import "fmt"

func f(s []string, level int) {
	if level > 5 {
		return
	}
	s = append(s, fmt.Sprint(level))
	f(s, level+1)
	fmt.Println("level:", level, "slice:", s)
}

func main() {
	f(nil, 0)
}
Starting from main we pass a nil slice into f as level 0. Inside f we append to s the current level before incrementing level and recursing. Once level exceeds 5, the calls to f return, printing their current level and the contents of their copy of s.
level: 5 slice: [0 1 2 3 4 5]
level: 4 slice: [0 1 2 3 4]
level: 3 slice: [0 1 2 3]
level: 2 slice: [0 1 2]
level: 1 slice: [0 1]
level: 0 slice: [0]
You can see that at each level the value of s was unaffected by the operation of other callers of f, and that while four underlying arrays were created,6 higher levels of f in the call stack are unaffected by the copy and reallocation of new underlying arrays as a by-product of append.
If you want to find out more about how slices work in Go, I recommend these posts from the Go blog: