Associative commentary follow up

This post is a follow up to Friday’s post on comments in Go.

Keith Rarick and Nate Finch pointed out that I had neglected to include two important practical use cases.

Build tags

I’ve previously written about how to use // +build tags to perform conditional compilation. In light of the previous post it’s probably worth recapping them here.

  • Build tags must use the // form.
    // +build right
    /* +build wrong */
  • Build tags must be their own comment, they must not be associated with a declaration.
    // Copyright Microsoft 1981
    
    // +build !darwin
    
    // Package basic implements Dartmouth's BASIC interpreter.  
    package basic
  • Build tags must occur early in the file. Only the first few lines of the file are scanned when filtering files by build tags.
    package wrong
    
    import "io"
    
    // +build whoops too,late

Copyright headers

The second is managing procedural issues if your licence requires you to include a copyright block at the top of source.

This was also briefly covered in the conditional compilation article. To recap

  • Most licences that recommend copyright headers require them to be at the top of the file, this means they must come before a package declaration, and its comment.
  • You probably don’t want the copyright header being part of your godoc, so the comment block holding the copyright header and the package declaration should be separated by a newline.
  • If you have any build tags, they should also appear between the copyright block and the package declaration. As all three are separate comment, they should be separated by a newline.
    // Copyright Commodore Inc, 1982
    
    // +build 6502
    
    // Package c64 is the computer for the masses, not the classes.
    package c64
  • If this leads to a verbose combination of copyright header, build tag, and package comment for godoc, consider moving the comment on the package declaration to a separate file. This is traditionally named doc.go and contains only the package declaration and its commentary.

Associative commentary

This is a quick post to discuss the rules of comments in Go.

To quickly recap, Go comments come in two forms

// everything from the double slash to the end of line is a comment
/* everything from the opening slash star, to the closing one is a comment */

As the first form declares that the remainder of the line is a comment, if you want to comment out more than one line, you need to do this

// this comment form is useful for
// commenting out sections of your code
// as you are working

The second form is also useful, and generally preferred, for large blocks of commentary

/*
The generally accepted rule when writing large
comment blocks in this form is to leave a newline
at the start and the end of your comment
*/

One important thing to note is that comments do not nest, thus

// // This is fine because everything from the double 
// // slash to the end of line is ignored

/* 
But, if you you were to start a new
/* comment inside an old one 
the closing star slash would close the comment block and */
this line would generate a syntax error 
*/

Association

A feature of the tools that consume Go source code, not the language, is the convention that a comment which directly preceeds a declaration is associated with that declaration.

// A Foo is a Fooid in the class of Endofoos.
func Foo() { .... }

Conversely, a comment followed by a newline stands alone, it’s just a comment.

package foo

// utility foos

func Quxx() { ... }

godoc allows comments to be associated with any of the top level declarations; package, var, const, type, and func.

import “C”

The trap that catches people out when they are using cgo is they don’t realise the significance of the newline, or more correctly, the lack of newline between their block of C code and the import "C" declaration.

package main

/*
#include "stdio.h"
*/
import "C"

func main() {
        C.printf(C.CString("Hello world\n"))
}

In this example the import "C" declaration is immediately preceded by the comment block containing our C code, in this case including stdio.h to obtain a reference to the printf function.

If there was a newline between the comment block and import "C" then the cgo preprocessor would not associate the comment with the import declaration and act as if #include "stdio.h" was never there.

% go run cgo.go
# command-line-arguments
37: error: use of undeclared identifier 'printf'

Update don’t forget to read the follow up to this post.

The empty struct

Introduction

This post explores the properties of my favourite Go data type, the empty struct.

The empty struct is a struct type that has no fields. Here are a few examples in named and anonymous forms

type Q struct{}
var q struct{}

So, if an empty struct contains no fields, contains no data, what use is it ? What can we do with it ?

Width

Before diving into the empty struct itself, I wanted to take a brief detour to talk about width.

The term width comes, like most terms, from the gc compiler, although its etymology probably goes back decades.

Width describes the number of bytes of storage an instance of a type occupies. As a process’s address space is one dimensional, I think width is a more apt than size.

Width is a property of a type. As every value in a Go program has a type, the width of the value is defined by its type and is always a multiple of 8 bits.

We can discover the width of any value, and thus the width of its type using the unsafe.Sizeof() function.

var s string
var c complex128
fmt.Println(unsafe.Sizeof(s))	 // prints 8
fmt.Println(unsafe.Sizeof(c))	 // prints 16

http://play.golang.org/p/4mzdOKW6uQ

The width of an array type is a multiple of its element type.

var a [3]uint32
fmt.Println(unsafe.Sizeof(a)) // prints 12

http://play.golang.org/p/YC97xsGG73

Structs provide a more flexible way of defining composite types, whose width is the sum of the width of the constituent types, plus padding

type S struct {
        a uint16
        b uint32
}
var s S
fmt.Println(unsafe.Sizeof(s)) // prints 8, not 6

The example above demonstrates one aspect of padding, that a value must be aligned in memory to a multiple of its width. In this case there are two bytes of padding added by the compiler between a and b.

Update Russ Cox has kindly written to explain that width is unrelated to alignment. You can read his comment below.

An empty struct

Now that we’ve explored width it should be evident that the empty struct has a width of zero. It occupies zero bytes of storage.

var s struct{}
fmt.Println(unsafe.Sizeof(s)) // prints 0

As the empty struct consumes zero bytes, it follows that it needs no padding. Thus a struct comprised of empty structs also consumes no storage.

type S struct {
        A struct{}
        B struct{}
}
var s S
fmt.Println(unsafe.Sizeof(s)) // prints 0

http://play.golang.org/p/PyGYFmPmMt

What can you do with an empty struct

True to Go’s orthogonality, an empty struct is a struct type like any other. All the properties you are used to with normal structs apply equally to the empty struct.

You can declare an array of structs{}s, but they of course consume no storage.

var x [1000000000]struct{}
fmt.Println(unsafe.Sizeof(x)) // prints 0

http://play.golang.org/p/0lWjhSQmkc

Slices of struct{}s consume only the space for their slice header. As demonstrated above, their backing array consumes no space.

var x = make([]struct{}, 1000000000)
fmt.Println(unsafe.Sizeof(x)) // prints 12 in the playground

http://play.golang.org/p/vBKP8VQpd8

Of course the normal subslice, len, and cap builtins work as expected.

var x = make([]struct{}, 100)
var y = x[:50]
fmt.Println(len(y), cap(y)) // prints 50 100

http://play.golang.org/p/8cO4SbrWVP

You can take the address of a struct{} value, when it is addressable, just like any other value.

var a struct{}
var b = &a

Interestingly, the address of two struct{} values may be the same.

var a, b struct{}
fmt.Println(&a == &b) // true

http://play.golang.org/p/uMjQpOOkX1

This property is also observable for []struct{}s.

a := make([]struct{}, 10)
b := make([]struct{}, 20)
fmt.Println(&a == &b)       // false, a and b are different slices
fmt.Println(&a[0] == &b[0]) // true, their backing arrays are the same

http://play.golang.org/p/oehdExdd96

Why is this? Well if you think about it, empty structs contain no fields, so can hold no data. If empty structs hold no data, it is not possible to determine if two struct{} values are different. They are in effect, fungible.

a := struct{}{} // not the zero value, a real new struct{} instance
b := struct{}{}
fmt.Println(a == b) // true

http://play.golang.org/p/K9qjnPiwM8

note: this property is not required by the spec, but it does note that Two distinct zero-size variables may have the same address in memory.

struct{} as a method receiver

Now we’ve demonstrated that empty structs behave just like any other type, it follows that we may use them as method receivers.

type S struct{}

func (s *S) addr() { fmt.Printf("%p\n", s) }

func main() {
        var a, b S
        a.addr() // 0x1beeb0
        b.addr() // 0x1beeb0
}

http://play.golang.org/p/YSQCczP-Pt

In this example it shows that the address of all zero sized values is 0x1beeb0. The exact address will probably vary for different versions of Go.

Wrapping up

Thank you for reading this far. At close to 800 words this article turned out to be longer than expected, and there was still more I was planning to write.

While this article concentrated on language obscura, there is one important practical use of empty structs, and that is the chan struct{} construct used for signaling between go routines

I’ve talked about the use of chan struct{} in my Curious Channels article.

Translations


Update Damian Gryski pointed out that I had omitted Brad Fitzpatrick’s iter package. I’ll leave it as an exercise to the reader to explore the profound implications of Brad’s contribution.

Thoughts on Go package management six months on

It has been roughly six months since I wrote about the problems I saw with package management in Go.

In the intervening months there has been lots of discussion; the issue continues to be one of the two most continually and hotly debated on the golang-nuts and go-pm mailing lists. No prizes for guessing what the other topic is.

In the current climate of mistrust there appears to be little support for delegating the problem of package management to a central repository. Informed by the soap box drama of the npm repository it looks like the days of standing up a central repository for a new language are over. Perl, Python, Java; enjoy it while it lasts.

In lieu of this, two camps have formed around complementary ideas.

The first camp, popularised by tools like Gustavo Niemeyer’s gopkg.in redirector, places stability of an import path, a versioned API if you like, as paramount. However this arrangement does not adequately address issues of build reproducibility or multiple revisions of a package being present in your final executable.

This camp also expresses a profound dislike for any sort of manifest file or other metadata in their repo. I find this position odd as most Go repos I find on GitHub are sprayed with Dockerfiles, Makefiles, Gruntfiles, Travisfiles, and all manner of CI or build metadata.

The second camp, informed by the statements of the Go team themselves, choose to vendor, or bring into their own repo the source of packages they depend on. The leading tool in this area is Keith Rarick’s godep.

This model should be very familiar to anyone in the Java community who used Ant. It is hard to argue that it was not a success for Java, at a cost of jar files committed to your repo (Hi SVN!). At least with godep you always carry around the source of your package, not some binary jar.

If this then is the current state of Go package management, so be it.

Channel Axioms

Most new Go programmers quickly grasp the idea of a channel as a queue of values and are comfortable with the notion that channel operations may block when full or empty.

This post explores four of the less common properties of channels:

  • A send to a nil channel blocks forever
  • A receive from a nil channel blocks forever
  • A send to a closed channel panics
  • A receive from a closed channel returns the zero value immediately

A send to a nil channel blocks forever

The first case which is a little surprising to newcomers is a send on a nil channel blocks forever.

This example program will deadlock on line 5 because the zero value for an uninitalised channel is nil.

package main

func main() {
        var c chan string
        c <- "let's get started" // deadlock
}

http://play.golang.org/p/1i4SjNSDWS

A receive from a nil channel blocks forever

Similarly receiving from a nil channel blocks the receiver forever.

package main

import "fmt"

func main() {
        var c chan string
        fmt.Println(<-c) // deadlock
}

http://play.golang.org/p/tjwSfLi7x0

So why does this happen ? Here is one possible explanation

  • The size of a channel’s buffer is not part of its type declaration, so it must be part of the channel’s value.
  • If the channel is not initalised then its buffer size will be zero.
  • If the size of the channel’s buffer is zero, then the channel is unbuffered.
  • If the channel is unbuffered, then a send will block until another goroutine is ready to receive.
  • If the channel is nil then the sender and receiver have no reference to each other; they are both blocked waiting on independent channels and will never unblock.

A send to a closed channel panics

The following program will likely panic as the first goroutine to reach 10 will close the channel before its siblings have time to finish sending their values.

package main

import "fmt"

func main() {
        var c = make(chan int, 100)
        for i := 0; i < 10; i++ {
                go func() {
                        for j := 0; j < 10; j++ {
                                c <- j
                        }
                        close(c)
                }()
        }
        for i := range c {
                fmt.Println(i)
        }
}

http://play.golang.org/p/hxUVqy7Qj-

So why isn’t there a version of close() that lets you check if a channel is closed ?

if !isClosed(c) {
        // c isn't closed, send the value
        c <- v
}

But this function would have an inherent race. Someone may close the channel after we checked isClosed(c) but before the code gets to c <- v.

Solutions for dealing with this fan in problem are discussed in the 2nd article linked at the bottom of this post.

A receive from a closed channel returns the zero value immediately

The final case is the inverse of the previous. Once a channel is closed, and all values drained from its buffer, the channel will always return zero values immediately.

package main

import "fmt"

func main() {
        	c := make(chan int, 3)
	        c <- 1
        	c <- 2
	        c <- 3
	        close(c)
	        for i := 0; i < 4; i++ {
		                fmt.Printf("%d ", <-c) // prints 1 2 3 0
	        }
}

http://play.golang.org/p/ajtVMsu8EO

The correct solution to this problem is to use a for range loop.

for v := range c {
        	// do something with v
}

for v, ok := <- c; ok ; v, ok = <- c {
	        // do something with v
}

These two statements are equivalent in function, and demonstrate what for range is doing under the hood.

Further reading

Pointers in Go

This blog post was originally a comment on a Google Plus page, but apparently one cannot create a href to a comment so it was suggested I rewrite it as a blog post.


Go pointers, like C pointers, are values that, uh, point to other values. This is a tremendously important concept and shouldn’t be considered dangerous or something to get hung up on.

Here are several ways that Go improves over C pointers, and C++, for that matter.

  1. There is no pointer arithmetic. You cannot write in Go
    var p *int
    p++

    That is, you cannot alter the address p points to unless you assign another address to it.

  2. This means there is no pointer/array duality in Go. If you don’t know what I’m talking about, read this book. Even if you have no intention of programming in C or Go, it will enrich your life.
  3. Once a value is assigned to a pointer, with the exception of nil which I’ll cover in the next point, Go guarantees that the thing being pointed to will continue to be valid for the lifetime of the pointer. So
    func f() *int { 
            i := 1
            return &i
    }

    is totally safe to do in Go. The compiler will arrange for the memory location holding the value of i to be valid after f() returns.

  4. Nil pointers. Yes, you can still have nil pointers and panics because of them, however in my experience the general level of hysteria generated by nil pointer errors, and the amount of defensive programming present in other languages like Java is not present in Go.

    I believe this is for two three reasons

    1. multiple return values, nil is not used as a sentinel for something went wrong. Obviously this leaves the question of programmers not checking their errors, but this is simply a matter of education.
    2. Strings are value types, not pointers, which is the, IMO, the number one cause of null pointer exceptions in languages like Java and C++.
      var s string // the zero value of s is "", not nil
    3. In fact, most of the built in data types, maps, slices, channels, and arrays, have a sensible default if they are left uninitialized. Thanks to Dustin Sallings for pointing this out.