Monthly Archives: October 2015

Programming language markets

Last night at the Sydney Go Users’ meetup, Jason Buberel, product manager for the Go project, gave an excellent presentation on a product manager’s perspective on the Go project.

As part of his presentation, Buberel broke down the marketplace for a programming language into seven segments.

Programming language market for Go, courtesy Jason Buberel

As a thought experiment, I’ve taken Buberel’s market segments and applied them across a bunch of contemporary languages.

Disclaimer: I’m not a product manager, I’ve just seen one on stage.

Language	Embedded and devices¹	Systems and drivers²	Server and infrastructure³	Web and mobile⁴	Big data and scientific computing	Desktop applications⁵	Mobile applications
Go	0	0	3	2	1	1⁶	1
Rust	1	1	0	2	0	2^{6, 11}	0
Java	2¹⁴	0	2	3	3	2⁷	3
Python	1	0	3¹²	3	3	2^{6, 10}	0
Ruby	0	0	3	3	0	1⁶	0
Node.js (Javascript / v8)	1¹³	0	0	2	0	0	2⁸
Objective-C / Swift	0	3	2	2⁹	0	3	3
C/C++	3	3	3	2	3	3	2

Is your favourite language missing ? Feel free to print this table out and draw in the missing row.

Scoring system: 0 – no presence, lack of interest or technical limitation. 1 – emerging presence or proof of concept. 2 – active competitor. 3 – market leader.

Conclusion

If there is a conclusion to be drawn from this rather unscientific study, every language is in competition to be the language of the backend. As for the other market segments, everyone competes with C and C++, even Java.

Notes:

The internet of things that are too small to run linux; micrcontrollers, arduino, esp8266, etc.
Can you write a kernel, kernel module, or operating system in it ?
Monitoring systems, databases, configuration management systems, that sort of thing.
Web application backends, REST APIs, microservices of all sorts.
Desktop applications, including games, because the mobile applications category would certainly include games.
OpenGL libraries or SDL bindings.
Swing, ugh.
Phonegap, React Native.
Who remembers WebObjects ?
Python is a popular scripting language for games.
Servo, the browser rendering engine is targeting Firefox.
Openstack.
Technically Lua pretending to be Javascript, but who’s counting.
Thanks to @rakyll for reminding me about the Blu Ray drives, and j2me running in everyone’s credit cards.

Bootstrapping Go 1.5 on non Intel platforms

This post is a continuation of my previous post on bootstrapping Go 1.5 on the Raspberry Pi.

Now that Go 1.5 is written entirely in Go there is a bootstrapping problem — you need Go to build Go. For most people running Windows, Mac or Linux, this isn’t a big issue as the Go project provides installers for Go 1.4. However, if you’re using one of the new platforms added in Go 1.5; ARM64 or PPC64, there is no pre built installer as Go 1.4 did not support those platforms.

This post explains how to bootstrap Go 1.5 onto a platform that has no pre built version of Go 1.4. The process assumes you are building on a Darwin or Linux host, if you’re on Windows, sorry you’re out of luck.

We use the names host and target to describe the machine preparing the bootstrap installation and the non Intel machine receiving the bootstrap installation respectively.

As always, please uninstall any version of Go you may have on your workstation before beginning; check your $PATH and do not set $GOROOT.

Build Go 1.4 on your host

After uninstalling any previous version of Go from the host, fetch the source for Go 1.4 and build it locally.

% git clone https://go.googlesource.com/go $HOME/go1.4
% cd $HOME/go1.4/src
% git checkout release-branch.go1.4
% ./make.bash

This process will build Go 1.4 into the directory $HOME/go1.4.
Notes:

This procedure assumes that Go 1.4 is built in $HOME/go1.4, if you choose to use another path, please adjust accordingly.
We use ./make.bash to skip running the full unit tests, you can use ./all.bash to run the unit tests if you prefer.
Do not add $HOME/go1.4/bin to your $PATH.

Build Go 1.5 on your host

Now you have Go 1.4 on your host, you can use that to bootstrap Go 1.5 on your host.

% git clone https://go.googlesource.com/go $HOME/go
% cd $HOME/go/src
% git checkout release-branch.go1.5
% env GOROOT_BOOTSTRAP=$HOME/go1.4 ./make.bash

This process will build Go 1.5 into the directory $HOME/go.
Notes:

Again, we use ./make.bash to skip running the full unit tests, you can use ./all.bash to run the unit tests if you prefer.
You should add $HOME/go/bin to your $PATH to use the version of Go 1.5 you just built as your Go install.

Build a bootstrap distribution for your target

From your Go 1.5 installation, build a bootstrap version of Go for your target.

The process is similar to cross compiling and uses the same GOOS and GOARCH environment variables. In this example we’ll build a bootstrap for linux/ppc64. To build for other architectures, adjust accordingly.

% cd $HOME/go/src
% env GOOS=linux GOARCH=ppc64 ./bootstrap.bash
...
Bootstrap toolchain for linux/ppc64 installed in /home/dfc/go-linux-ppc64-bootstrap.
Building tbz.
-rw-rw-r-- 1 dfc dfc 46704160 Oct 16 10:39 /home/dfc/go-linux-ppc64-bootstrap.tbz

The bootstrap script is hard coded to place the output tarball two directories above ./bootstrap.bash, which will be $HOME/go-linux-ppc64-bootstrap.tbz in this case.

Now scp go-linux-ppc64-bootstrap.tbz to the target, and unpack it to $HOME.
Notes:

This bootstrap distribution should only be used for bootstrapping Go 1.5 on the target.

Build Go 1.5 on your target

On the target you should have the bootstrap distribution in your home directory, ie $HOME/go-linux-ppc64-bootstrap. We’ll use that as our GOROOT_BOOTSTRAP and build Go 1.5 on the target.

% git clone https://go.googlesource.com/go $HOME/go
% cd $HOME/go/src
% git checkout release-branch.go1.5
% env GOROOT_BOOTSTRAP=$HOME/go-linux-ppc64-bootstrap ./all.bash

Now you’ll have Go 1.5 built natively on your target, in this case linux/ppc64, but this procedure has been tested on linux/arm64 and also linux/ppc64le.

Padding is hard

We all know that the empty struct consumes no storage, right ? Here is a curious case where this turns out to not be true.

This shows the “after” graph, prior to CL 15522 each allocation was 320 bytes.

This is a story about trying to speed up the Go compiler. Since Go 1.5 we’ve had the great concurrent GC, which reduces the cost of garbage collection, but no matter how efficient a garbage collector is, it doesn’t make allocations free of cost.

The specific allocation I was targeting was the obj.Prog structure which represents a quasi machine code instruction late in the compile cycle. As you can see, for this compilation (which was cmd/compile/internal/gc) obj.Prog values constitute 34.72% of the total allocations for this program—reducing the size of obj.Prog will pay off in real terms.

obj.Prog itself contains a field of type obj.ProgInfo, which is what this blog post will focus on today. Here is the definition for obj.ProgInfo

type ProgInfo struct {
        Flags    uint32   // flag bits
        Reguse   uint64   // registers implicitly used by this instruction
        Regset   uint64   // registers implicitly set by this instruction
        Regindex uint64   // registers used by addressing mode
        _        struct{} // to prevent unkeyed literals
}

How much space will values of this type consume ? TL;DR, here is a table of the results:

	`GOARCH=386`	`GOARCH=amd64`
Go 1.4.x	28 bytes	32 bytes
Go 1.5.x	32 bytes	40 bytes

Go 1.4 32 bit says the value is 28 bytes large, but Go 1.4 64 bit says it’s 32 bytes large. Worse, Go 1.5 32 bit says the value is 32 bytes, and Go 1.5 64 bit says it’s 40 bytes! This type uses fixed sized integer types, but its total size changes every time we compile it. What the heck is going on here ?

Alignment and padding

The reason for the difference between the 32 bit and 64 bit answers is alignment. The Go spec says the address of a struct’s fields must be naturally aligned. So on a 64 bit machine, the address of any uint64 field must be a multiple of 8 bytes. In effect the compiler sees this:

type ProgInfo struct {
        Flags    uint32   // flag bits
        _        [4]byte  // padding added by compiler
        Reguse   uint64   // registers implicitly used by this instruction
        Regset   uint64   // registers implicitly set by this instruction
        Regindex uint64   // registers used by addressing mode
        _        struct{} // to prevent unkeyed literals
}

On 32 bit machines, word boundaries occur every 4 bytes, so there is no requirement to pad for alignment. Note that operations on 64 bit quantities may still require the value to be properly aligned, this is the infamous issue 599.

So, that explains the difference in results between 32 bit and 64 bit machines, Flags (4 bytes) + Reguse (8 bytes) + Regset (8 bytes) + Regindex (8 bytes) == 28 bytes on 32 bit machines, plus another 4 bytes for padding for 64 bit machines.

Well, except that Go 1.5 seems to be consuming another word over and above the Go 1.4 requirements, where is this memory going ?

To give you a hint, I’ll rearrange the definition slightly:

type ProgInfo struct {
        Reguse   uint64   // registers implicitly used by this instruction
        Regset   uint64   // registers implicitly set by this instruction
        Regindex uint64   // registers used by addressing mode
        Flags    uint32   // flag bits
        _        struct{} // to prevent unkeyed literals
}

Now the results are:

	`GOARCH=386`	`GOARCH=amd64`
Go 1.4.x	28 bytes	32 bytes
Go 1.5.x	32 bytes	32 bytes

Hmm, so some progress. We’ve managed to reduce the storage in the Go 1.5 64 bit case, but the others appear unchanged, especially Go 1.5’s 32 bit case.

For the Go 1.5 64 bit case, the padding that we saw above is still there, but it has moved. Effectively what the compiler is seeing is this:

type ProgInfo struct {
        Reguse   uint64   // registers implicitly used by this instruction
        Regset   uint64   // registers implicitly set by this instruction
        Regindex uint64   // registers used by addressing mode
        Flags    uint32   // flag bits
        _        [4]byte  // padding for alignment of ProgInfo
        _        struct{} // to prevent unkeyed literals
}

The reason this padding is still present is to preserve the alignment of the other fields in this structure. Remember that uint64 fields always have to be naturally aligned, every 4 bytes on 32 bit platforms, and every 8 bytes on 64 bit platforms. Consider this small fragment:

var pp [4]ProgInfo	
fmt.Println(&pp[1].Reguse) // address must be a multiple of 8 on 64 bit platforms

The compiler adds padding at the end of the structure to bring its size up to a multiple of the largest element in the structure.

So, that explains everything except for the Go 1.5 32 bit result. For 32 bit platforms, the structure needs to be aligned on a 4 byte boundary, not 8, so no padding should be required. The size under both Go 1.4 and Go 1.5 should be 28 bytes, not 32 bytes … what is going on ?

That empty struct

You’ve probably figured out by now that the only place the additional storage could come from is the unnamed struct{} field. What is it about the presence of an empty struct at the bottom of the type that causes it to increase the size of the struct ?

The answer is that while empty struct{} values consume no storage, you can take their address. That is, if you have a type

type T struct {
      X uint32
      Y struct{}
}

var t T

It is perfectly valid to take the address of t.Y, and as such, the address of t.Y would point beyond the end of the struct!¹

Now, Go code cannot do anything useful with this invalid pointer, but as it is a pointer, the garbage collector has to follow it, and that may lead it to access unmapped memory or dereference an invalid pointer.

The fix for this is detailed in issue 9401 and delivered in Go 1.5. In summary, zero sized values, like struct{} and [0]byte occurring at the end of a structure are assumed to have a size of one byte². With this knowledge, we can now explain the behaviour of the compiler for the rearranged type.

type ProgInfo struct {
        Reguse   uint64   // registers implicitly used by this instruction
        Regset   uint64   // registers implicitly set by this instruction
        Regindex uint64   // registers used by addressing mode
        Flags    uint32   // flag bits
        _        struct{} // to prevent unkeyed literals
        _        [1]byte  // to prevent unkeyed literals  
}

For a 64 bit compiler, it would observe that the largest field inside ProgInfo is 8 bytes, so it would add three additional bytes of padding after the [1]byte to round up to 32 bytes. This is fine, it was already planning to add 4 bytes after Flags.

For a 32 bit compiler, it would observe the largest field inside ProgInfo is 8 bytes, however the natural alignment of 32 bit machines is 4 bytes and the sum of the fields inside the type is 28 bytes, which is what Go 1.4 reported. However because the final field is of zero length, the compiler has replaced it with a [1]byte, causing the total size of the structure to be 29 bytes, forcing the compiler to add three more bytes of padding to round up to the next word boundary, 32 bytes.

Fortunately this last issue can be corrected by moving the zero field to the top of the structure, restoring the original sizes we observed in Go 1.4.

Wrapping up

Do Go programmers need to care about this minutiae ? Probably not. This is the stuff of brain teasers.

Certainly if you are looking to optimise the memory usage of your program, you need to be looking closely at the definitions of all your data structures, but it is unlikely that you’ll encounter the perfect storm of all these factors.

Not to be outdone by this morning’s excitement, Matthew Dempsky put together a quick tool to spot this inefficiency, and has found two other candidates.

One more thing

At the start of this piece I mentioned that CL 15522 resulted in a saving of 32 bytes, yet the biggest saving demonstrated above was 8 bytes. How did CL 15522 achieve such a reduction ? The answer is a thing known inside the runtime as size classes.

For small allocations, that is, less than a few kilobytes, the overhead of tracking the allocation becomes a significant percentage of the object itself. Imagine if every allocation had one word of overhead, every time you put an int into an interface³ value, you’d incur 100% overhead!

To reduce this overhead, rather than tracking every object on the heap individually, the runtime allocates things of the same size together. This leads to lower overhead—one bit per object of a certain size, and excellent space efficiency—all objects are of the same size, eliminating fragmentation.

However, as Go types can be any size, potentially the runtime would have to allocate a pool for every possible size (obviously not until they are actually allocated), and this would undo the efficiency of the mechanism as each pool is a number of machine pages, yet most pools would be largely unused.

The solution to this problem is to not create pools for every possible size, but to start to round up the allocation to the next largest size class, for example an allocation of 40 bytes may be rounded up to 48 bytes. The details of this mechanism are left to the curiosity of the reader⁴, but suffice to say, as allocations get larger, the distance between size classes increases.

Hence, by shaving 8 bytes from obj.ProgInfo, the size of the enclosing obj.Prog type decreased from 296 bytes to 288 bytes, bringing it down to from the previous 320 byte size class, to the 288 byte size class, a saving of 32 bytes on 64 bit platforms.

The explanation for why this is not a violation of Go’s memory safety guarantee is left as an exercise for the reader.
The explanation for why this does not break the semantics of Go programs is left as an exercise for the reader.
The explanation for why storing an int into an interface causes a heap allocation is left as an exercise for the reader.
runtime/msize.go

Dave Cheney

The acme of foolishness