Tag Archives: generics

Maybe adding generics to Go IS about syntax after all

This is a short response to the recently announced Go 2 generics draft proposals.

Update: this proposal is incomplete. It cannot express two common use cases. The first is ensuring that several formal parameters are of the same type:

contract comparable(t T) {
    t > t
}

func max(type T comparable)(a, b T) T

Here a and b must be of the same parameterised type; my suggestion would only assert that they both satisfied the same contract.

Secondly, it would not be possible to parameterise the type of return values:

contract viaStrings(t To, f From) {
    var x string = f.String()
    t.Set(string(""))
}

func SetViaStrings(type To, From viaStrings)(s []From) []To

Thanks to Ian Dawes and Sam Whited for their insight.

Bummer.


My lasting reaction to the generics proposal is the proliferation of parentheses in function declarations.

Although several of the Go team suggested that generics would probably be used sparingly, and that the additional syntax would only be a burden for the writer of the generic code, not the reader, I am sceptical that this long requested feature will be sufficiently niche as to go unnoticed by most Go developers.

While it is true that type parameters can be inferred from their arguments, the declaration of generic functions and methods requires a clumsy (type parameter declaration in place of the more common <T> syntax found in C++ and Java.

The reason for (type, it was explained to me, is that Go is designed to be parsed without a symbol table. This rules out both <T> and [T] syntaxes, as the parser needs to know ahead of time what kind of declaration a T is to avoid interpreting the angle or square brackets as comparison or indexing operators respectively.

Contract as a superset of interfaces

The astute Roger Peppe quickly identified that contracts represent a superset of interfaces:

Any behaviour you can express with an interface, you can do so and more, with a contract.

The remainder of this post presents my suggestions for an alternative generic function declaration syntax that avoids additional parentheses by leveraging Roger’s observation.

Contract as a kind of type

The earlier Type Functions proposal showed that a type declaration can support a parameter. If this is correct, then the proposed contract declaration could be rewritten from

contract stringer(x T) {
    var s string = x.String()
}

to

type stringer(x T) contract {
    var s string = x.String()
}

This supports Roger’s observation that a contract is a superset of an interface. type stringer(x T) contract { ... } introduces a new contract type in the same way type stringer interface { ... } introduces a new interface type.

Whether you buy my argument that a contract is a kind of type is debatable, but if you’re prepared to take it on faith, then the remainder of the syntax introduced in the generics proposal can be further simplified.

If contracts are types, use them as types

If a contract is a kind of type, then we can use a contract anywhere that a built-in type or interface is used. For example

func Stringify(type T stringer)(s []T) (ret []string) {
    for _, v := range s {
        ret = append(ret, v.String())
    }
    return ret
}

Could be expressed as

func Stringify(s []stringer) (ret []string) {
    for _, v := range s {
        ret = append(ret, v.String())
    }
    return ret
}

That is, in place of explicitly binding T to the contract stringer only for T to be referenced seven characters later, we bind the formal parameter s to a slice of stringers directly. The similarity with the way this would previously be done with a stringer interface emphasises Roger’s observation.
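For comparison, here is how the same function can be written today against the fmt.Stringer interface; the body is identical, only the element type of s differs. This is ordinary Go 1 code, shown only to underline the similarity.

package main

import (
    "fmt"
    "net"
)

// Stringify written against the existing fmt.Stringer interface. The body is
// the same as the contract version above; only the parameter type differs.
func Stringify(s []fmt.Stringer) (ret []string) {
    for _, v := range s {
        ret = append(ret, v.String())
    }
    return ret
}

func main() {
    // net.IP implements fmt.Stringer, so a slice of addresses can be stringified.
    ips := []fmt.Stringer{net.IPv4(127, 0, 0, 1), net.IPv4(192, 168, 0, 1)}
    fmt.Println(Stringify(ips)) // [127.0.0.1 192.168.0.1]
}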


Unifying unknown type parameters

The first example in the design proposal introduces an unknown type parameter.

func Print(type T)(s []T) {
    for _, v := range s {
        fmt.Println(v)
    }
}

The operations on unknown types are limited; they are, in some sense, values that can only be read. Again drawing on Roger’s observation above, the syntax could potentially be expressed as:

func Print(s []contract{}) {
    for _, v := range s {
        fmt.Println(v)
    }
}

Or maybe even

type T contract {}

func Print(s []T) {
    for _, v := range s {
        fmt.Println(v)
    }
}

In essence, the literal contract{} syntax defines an anonymous unknown type, analogous to interface{}'s anonymous interface type.

Conclusion

The great irony is, after years of my bloviation that “adding generics to Go has nothing to do with the syntax”, it turns out that, actually, yes, the syntax is crucial.

How the Go runtime implements maps efficiently (without generics)

This post discusses how maps are implemented in Go. It is based on a presentation I gave at the GoCon Spring 2018 conference in Tokyo, Japan.

What is a map function?

To understand how a map works, let’s first talk about the idea of the map function. A map function maps one value to another. Given one value, called a key, it will return a second, the value.

map(key) → value

Now, a map isn’t going to be very useful unless we can put some data in the map. We’ll need a function that adds data to the map

insert(map, key, value)

and a function that removes data from the map

delete(map, key)

There are other interesting properties of map implementations, like querying if a key is present in the map, but they’re outside the scope of what we’re going to discuss today. Instead we’re just going to focus on these three operations: insertion, deletion, and mapping keys to values.
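In Go, those three operations map directly onto the built-in map type; the following snippet is just a reminder of the surface syntax that the rest of this post explains.

package main

import "fmt"

func main() {
    m := make(map[string]int)

    m["moby/moby"] = 42     // insert(map, key, value); the value is arbitrary
    v := m["moby/moby"]     // map(key) → value
    delete(m, "moby/moby")  // delete(map, key)
    _, ok := m["moby/moby"] // the presence check mentioned above

    fmt.Println(v, ok) // 42 false
}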

Go’s map is a hashmap

The specific map implementation I’m going to talk about is the hashmap, because this is the implementation that the Go runtime uses. A hashmap is a classic data structure offering O(1) lookups on average and O(n) in the worst case. That is, when things are working well, the time to execute the map function is a near constant.

The size of this constant is part of the hashmap design and the point at which the map moves from O(1) to O(n) access time is determined by its hash function.

The hash function

What is a hash function? A hash function takes a key of an unknown length and returns a value with a fixed length.

hash(key) → integer

this hash value is almost always an integer for reasons that we’ll see in a moment.

Hash and map functions are similar. They both take a key and return a value. However in the case of the former, it returns a value derived from the key, not the value associated with the key.

Important properties of a hash function

It’s important to talk about the properties of a good hash function as the quality of the hash function determines how likely the map function is to run near O(1).

When used with a hashmap, hash functions have two important properties. The first is stability: the hash function must be stable. Given the same key, your hash function must return the same answer. If it doesn’t, you will not be able to find the things you put into the map.

The second property is good distribution: given two near-identical keys, the result should be wildly different. This is important for two reasons. Firstly, as we’ll see, values in a hashmap should be distributed evenly across buckets, otherwise the access time is not O(1). Secondly, as the user can control some aspects of the input to the hash function, they may be able to control the output of the hash function, leading to poor distribution, which has been a denial of service vector for some languages. This property is also known as collision resistance.
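As a rough illustration of both properties, here is a small sketch using the standard library’s hash/fnv package (chosen for convenience; it is not the hash the runtime uses): the same key always produces the same sum, while near-identical keys produce very different sums.

package main

import (
    "fmt"
    "hash/fnv"
)

// hash returns the 64-bit FNV-1a sum of key.
func hash(key string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(key))
    return h.Sum64()
}

func main() {
    // Stability: the same key always hashes to the same value.
    fmt.Println(hash("moby/moby") == hash("moby/moby")) // true

    // Distribution: near-identical keys produce wildly different sums.
    fmt.Printf("%#x\n%#x\n", hash("key1"), hash("key2"))
}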

The hashmap data structure

The second part of a hashmap is the way data is stored.
The classical hashmap is an array of buckets, each of which contains a pointer to an array of key/value entries. In this case our hashmap has eight buckets (as this is the value that the Go implementation uses) and each bucket can hold up to eight entries (again drawn from the Go implementation). Using powers of two allows the use of cheap bit masks and shifts rather than expensive division.

As entries are added to a map, assuming a good hash function distribution, then the buckets will fill at roughly the same rate. Once the number of entries across each bucket passes some percentage of their total size, known as the load factor, then the map will grow by doubling the number of buckets and redistributing the entries across them.
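As a sketch, the growth decision boils down to a comparison like the one below. The names are illustrative rather than the runtime’s, but the 6.5 average-entries-per-bucket threshold is the load factor the Go runtime uses.

package main

import "fmt"

// loadFactor is the average number of entries per bucket beyond which the
// map grows. B is log2 of the bucket count, so 1<<B is the number of buckets.
const loadFactor = 6.5

func overLoadFactor(count int, B uint8) bool {
    return float64(count) > loadFactor*float64(uint(1)<<B)
}

func main() {
    // 53 entries spread across 8 buckets exceeds 6.5 * 8 = 52, so the map
    // would grow to 16 buckets and redistribute its entries.
    fmt.Println(overLoadFactor(53, 3)) // true
}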

With this data structure in mind, if we had a map of project names to GitHub stars, how would we go about inserting a value into the map?

We start with the key, feed it through our hash function, then mask off the bottom few bits to get the correct offset into our bucket array. This is the bucket that will hold all the entries whose hash ends in three (011 in binary). Finally we walk down the list of entries in the bucket until we find a free slot and we insert our key and value there. If the key was already present, we’d just overwrite the value.

Now, let’s use the same map to look up a value. The process is similar. We hash the key as before, then mask off the lower three bits, as our bucket array contains eight buckets, to navigate to the fifth bucket (101 in binary). If our hash function is stable then the string "moby/moby" will always hash to the same value, so we know that the key will not be in any other bucket. Now it’s a case of a linear search through the bucket, comparing the key provided with the one stored in each entry.
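The hash-then-mask step at the heart of both insertion and lookup can be sketched in a few lines. With a power-of-two bucket count, the mask is simply the bucket count minus one; the names here are illustrative, not the runtime’s.

package main

import (
    "fmt"
    "hash/fnv"
)

// bucketIndex selects a bucket by masking off the low bits of the key's hash.
// With 8 buckets the mask is 0b111, so only the bottom three bits matter.
func bucketIndex(key string, nbuckets uint64) uint64 {
    h := fnv.New64a()
    h.Write([]byte(key))
    return h.Sum64() & (nbuckets - 1) // nbuckets must be a power of two
}

func main() {
    fmt.Println(bucketIndex("moby/moby", 8)) // always the same bucket, in the range 0-7
}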

Four properties of a hash map

That was a very high level explanation of the classical hashmap. We’ve seen there are four properties you need to implement a hashmap:

    1. You need a hash function for the key.
    2. You need an equality function to compare keys.
    3. You need to know the size of the key and,
    4. You need to know the size of the value, because the key and value sizes determine the size of the bucket structure, which the compiler needs to know so that, as you walk or insert into that structure, it knows how far to advance in memory.
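To make these four properties concrete, here is a toy, hand-written hashmap for one specific key/value combination (string to int). It is a sketch for illustration only (no growth, no bound on bucket size), but each of the four properties is visible in it.

package main

import (
    "fmt"
    "hash/fnv"
)

const nbuckets = 8 // a power of two, so we can mask rather than divide

// entry is a single key/value pair. Because the key and value types are
// fixed, the compiler knows the exact size of an entry (properties 3 and 4).
type entry struct {
    key   string
    value int
}

// toymap is a hand-written map from string to int: an array of buckets,
// each holding a list of entries.
type toymap struct {
    buckets [nbuckets][]entry
}

// hashKey is property 1: a hash function for the key type.
func hashKey(k string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(k))
    return h.Sum64()
}

func (m *toymap) insert(k string, v int) {
    b := &m.buckets[hashKey(k)&(nbuckets-1)]
    for i := range *b {
        if (*b)[i].key == k { // property 2: key equality
            (*b)[i].value = v
            return
        }
    }
    *b = append(*b, entry{key: k, value: v})
}

func (m *toymap) lookup(k string) (int, bool) {
    for _, e := range m.buckets[hashKey(k)&(nbuckets-1)] {
        if e.key == k {
            return e.value, true
        }
    }
    return 0, false
}

func main() {
    var m toymap
    m.insert("moby/moby", 42) // the value is arbitrary
    fmt.Println(m.lookup("moby/moby")) // 42 true
}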

Hashmaps in other languages

Before we talk about the way Go implements a hashmap, I wanted to give a brief overview of how two popular languages implement hashmaps. I’ve chosen these languages as both offer a single map type that works across a variety of key and value types.

C++

The first language we’ll discuss is C++. The C++ Standard Template Library (STL) provides std::unordered_map which is usually implemented as a hashmap.

This is the declaration for std::unordered_map. It’s a template, so the actual values of the parameters depend on how the template is instantiated.

template<
    class Key,                                   // the type of the key
    class T,                                     // the type of the value
    class Hash = std::hash<Key>,                 // the hash function
    class KeyEqual = std::equal_to<Key>,         // the key equality function
    class Allocator = std::allocator<std::pair<const Key, T>>
> class unordered_map;

There is a lot here, but the important things to take away are:

  • The template takes the type of the key and value as parameters, so it knows their size.
  • The template takes a std::hash function specialised on the key type, so it knows how to hash a key passed to it.
  • And the template takes an std::equal_to function, also specialised on key type, so it knows how to compare two keys.

Now we know how the four properties of a hashmap are communicated to the compiler in C++’s std::unordered_map, let’s look at how they work in practice.

First we take the key, pass it to the std::hash function to obtain the hash value of the key. We mask and index into the bucket array, then walk the entries in that bucket comparing the keys using the std::equal_to function.

Java

The second language we’ll discuss is Java. In Java the hashmap type is called, unsurprisingly, java.util.HashMap.

In Java, the java.util.HashMap type can only operate on objects, which is fine because in Java almost everything is a subclass of java.lang.Object. As every object in Java descends from java.lang.Object they inherit, or override, a hashCode and an equals method.

However, you cannot directly store the eight primitive types: boolean, int, short, long, byte, char, float, and double, because they are not subclasses of java.lang.Object. You cannot use them as a key, and you cannot store them as a value. To work around this limitation, those types are silently converted into objects representing their primitive values. This is known as boxing.

Putting this limitation to one side for the moment, let’s look at how a lookup in Java’s hashmap would operate.

First we take the key and call its hashCode method to obtain the hash value of the key. We mask and index into the bucket array, which in Java gives us a pointer to an Entry, which holds a key and a value, and a pointer to the next Entry in the bucket, forming a linked list of entries. We then walk that list, comparing the key provided with each stored key using its equals method.

Tradeoffs

Now that we’ve seen how C++ and Java implement a Hashmap, let’s compare their relative advantages and disadvantages.

C++ templated std::unordered_map

Advantages

  • Size of the key and value types known at compile time.
  • Data structures are always exactly the right size; no need for boxing or indirection.
  • As code is specialised at compile time, other compile time optimisations like inlining, constant folding, and dead code elimination, can come into play.

In a word, maps in C++ can be as fast as hand writing a custom map for each key/value combination, because that is what is happening.

Disadvantages

  • Code bloat. Each different map is a different type. For N map types in your source, you will have N copies of the map code in your binary.
  • Compile time bloat. Due to the way header files and templates work, in each file that mentions a std::unordered_map the source code for that implementation has to be generated, compiled, and optimised.

Java util Hashmap

Advantages

  • One implementation of a map that works for any subclass of java.lang.Object. Only one copy of java.util.HashMap is compiled, and it’s referenced from every single class.

Disadvantages

  • Everything must be an object, even things which are not objects. This means maps of primitive values must have those values converted to objects via boxing. This adds GC pressure for the wrapper objects, and cache pressure because of the additional pointer indirections (each object is effectively another pointer lookup).
  • Buckets are stored as linked lists, not sequential arrays. This leads to lots of pointer chasing while comparing objects.
  • Hash and equality functions are left as an exercise to the author of the class. Incorrect hash and equals functions can slow down maps using those types, or worse, fail to implement the map behaviour.

Go’s hashmap implementation

Now, let’s talk about how the hashmap implementation in Go allows us to retain many of the benefits of the best map implementations we’ve seen, without paying for the disadvantages.

Just like C++ and just like Java, Go’s hashmap is written in Go. But Go does not provide generic types, so how can we write a hashmap that works for (almost) any type in Go?

Does the Go runtime use interface{}?

No, the Go runtime does not use interface{} to implement its hashmap. While we have the container/{list,heap} packages which do use the empty interface, the runtime’s map implementation does not use interface{}.

Does the compiler use code generation?

No, there is only one copy of the map implementation in a Go binary, and unlike Java, it doesn’t use interface{} boxing. So, how does it work?

There are two parts to the answer, and they both involve co-operation between the compiler and the runtime.

Compile time rewriting

The first part of the answer is to understand that map lookup, insertion, and removal are implemented in the runtime package. During compilation, map operations are rewritten to calls to the runtime, e.g.

v := m["key"]     → runtime.mapaccess1(m, ”key", &v)
v, ok := m["key"] → runtime.mapaccess2(m, ”key”, &v, &ok)
m["key"] = 9001   → runtime.mapinsert(m, ”key", 9001)
delete(m, "key")  → runtime.mapdelete(m, “key”)

It’s also useful to note that the same thing happens with channels, but not with slices.

The reason for this is that channels are complicated data types. Send, receive, and select have complex interactions with the scheduler, so that’s delegated to the runtime. By comparison, slices are much simpler data structures, so the compiler natively handles operations like slice access, len and cap, while deferring the complicated cases of copy and append to the runtime.

Only one copy of the map code

Now we know that the compiler rewrites map operations to calls to the runtime. We also know that inside the runtime, because this is Go, there is only one function called mapaccess1, one function called mapaccess2, and so on.

So, how can the compiler rewrite this

v := m["key"]

into this


runtime.mapaccess1(m, "key", &v)

without using something like interface{}? The easiest way to explain how map types work in Go is to show you the actual signature of runtime.mapaccess1.

func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer

Let’s walk through the parameters.

  • key is a pointer to the key; this is the value you provided as the key.
  • h is a pointer to a runtime.hmap structure. hmap is the runtime’s hashmap structure that holds the buckets and other housekeeping values.
  • t is a pointer to a maptype, which is odd.

Why do we need a *maptype if we already have a *hmap? *maptype is the special sauce that makes the generic *hmap work for (almost) any combination of key and value types. There is a maptype value for each unique map declaration in your program. There will be one that describes maps from strings to ints, from strings to http.Headers, and so on.

Rather than having, as C++ has, a complete map implementation for each unique map declaration, the Go compiler creates a maptype during compilation and uses that value when calling into the runtime’s map functions.

type maptype struct {
        typ           _type
        key           *_type
        elem          *_type
        bucket        *_type // internal type representing a hash bucket
        hmap          *_type // internal type representing a hmap
        keysize       uint8  // size of key slot
        indirectkey   bool   // store ptr to key instead of key itself
        valuesize     uint8  // size of value slot
        indirectvalue bool   // store ptr to value instead of value itself
        bucketsize    uint16 // size of bucket
        reflexivekey  bool   // true if k==k for all keys
        needkeyupdate bool   // true if we need to update key on overwrite
}

Each maptype contains details about the properties of this kind of map, from key to elem. It contains information about the key and the elements. maptype.key contains information about the pointer to the key we were passed. We call these type descriptors.

type _type struct {
        size       uintptr
        ptrdata    uintptr // size of memory prefix holding all pointers
        hash       uint32
        tflag      tflag
        align      uint8
        fieldalign uint8
        kind       uint8
        alg        *typeAlg
        // gcdata stores the GC type data for the garbage collector.
        // If the KindGCProg bit is set in kind, gcdata is a GC program.
        // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
        gcdata    *byte
        str       nameOff
        ptrToThis typeOff
}

In the _type type we have things like its size, which is important because we only have a pointer to the key value, but we need to know how large it is, and what kind of a type it is; is it an integer, is it a struct, and so on. We also need to know how to compare values of this type and how to hash values of this type, and that is what the _type.alg field is for.

type typeAlg struct {
        // function for hashing objects of this type
        // (ptr to object, seed) -> hash
        hash func(unsafe.Pointer, uintptr) uintptr

        // function for comparing objects of this type
        // (ptr to object A, ptr to object B) -> ==?
        equal func(unsafe.Pointer, unsafe.Pointer) bool
}

There is one typeAlg value for each type in your Go program.
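Before looking at the real function, the pattern itself, one shared implementation that is handed a descriptor value instead of being specialised per type, can be sketched in ordinary Go. Everything below is invented for illustration; none of these names are the runtime’s.

package main

import (
    "fmt"
    "unsafe"
)

// typeDesc plays the role of maptype/_type in this sketch: it tells a single,
// shared implementation how big a key is, and how to hash and compare keys
// that are handed to it as raw pointers.
type typeDesc struct {
    size  uintptr
    hash  func(p unsafe.Pointer, seed uintptr) uintptr
    equal func(a, b unsafe.Pointer) bool
}

// One descriptor per key type, created once. Here, a descriptor for string
// keys using a simple FNV-style mix, purely for illustration.
var stringDesc = typeDesc{
    size: unsafe.Sizeof(""),
    hash: func(p unsafe.Pointer, seed uintptr) uintptr {
        s := *(*string)(p)
        h := seed
        for i := 0; i < len(s); i++ {
            h = (h ^ uintptr(s[i])) * 16777619
        }
        return h
    },
    equal: func(a, b unsafe.Pointer) bool {
        return *(*string)(a) == *(*string)(b)
    },
}

// keysEqual stands in for a shared "runtime" helper: it works for any key
// type because the descriptor, not the code, carries the type information.
func keysEqual(t *typeDesc, a, b unsafe.Pointer) bool {
    return t.equal(a, b)
}

func main() {
    k1, k2 := "moby/moby", "moby/moby"
    fmt.Println(keysEqual(&stringDesc, unsafe.Pointer(&k1), unsafe.Pointer(&k2)))                   // true
    fmt.Println(stringDesc.hash(unsafe.Pointer(&k1), 0) == stringDesc.hash(unsafe.Pointer(&k2), 0)) // true
}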

Putting it all together, here is the (slightly edited for clarity) runtime.mapaccess1 function.

// mapaccess1 returns a pointer to h[key].  Never returns nil, instead
// it will return a reference to the zero object for the value type if
// the key is not in the map.
func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
        if h == nil || h.count == 0 {
                return unsafe.Pointer(&zeroVal[0])
        }
        alg := t.key.alg
        hash := alg.hash(key, uintptr(h.hash0))
        m := bucketMask(h.B)
        b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize)))

One thing to note is the h.hash0 parameter passed into alg.hash. h.hash0 is a random seed generated when the map is created. It is part of how the Go runtime defends against hash collision attacks.

Anyone can read the Go source code, so they could come up with a set of values which, using the hash algorithm that Go uses, all hash to the same bucket. The seed value adds an amount of randomness to the hash function, providing some protection against collision attacks.
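The hash/maphash package, which was added to the standard library after this post was written, exposes the same idea to Go code: sums are only repeatable within a process for a given seed, because every seed is generated randomly.

package main

import (
    "fmt"
    "hash/maphash"
)

func main() {
    // MakeSeed returns a new random seed; maps built on different seeds will
    // hash the same key to different values, frustrating attempts to
    // precompute a set of colliding keys.
    seed := maphash.MakeSeed()

    var h maphash.Hash
    h.SetSeed(seed)
    h.WriteString("moby/moby")
    fmt.Printf("%#x\n", h.Sum64()) // differs from run to run
}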

Conclusion

I was inspired to give this presentation at GoCon because Go’s map implementation is a delightful compromise between C++’s and Java’s, taking most of the good without having to accommodate most of the bad.

Unlike Java, you can use scalar values like characters and integers without the overhead of boxing. Unlike C++, instead of N complete map implementations in the final binary, there is only one map implementation plus N runtime.maptype values, a substantial saving in program space and compile time.

Now I want to be clear that I am not trying to tell you that Go should not have generics. My goal today was to describe the situation we have today in Go 1 and how the map type in Go works under the hood.  The Go map implementation we have today is very fast and provides most of the benefits of templated types, without the downsides of code generation and compile time bloat.

I see this as a case study in design that deserves recognition.

Should Go 2.0 support generics?

A long time ago, someone–I normally attribute this to David Symonds, but I can’t be sure he was the first to say it–said that the reason for adding generics to Go would be the reason for calling it Go 2.0. That is to say, adding generics to the language would be half baked if they were not used throughout the standard library. I wrote about this in a series of blog posts where I explored what I felt would be the repercussions of integrating templated types into Go.

Do I think Go should have generics? Well, there are really two answers to that question.

As I argued in my Simplicity Debt posts, mainstream programmers in 2017 expect a set of features in their languages. Many of us work in polyglot environments. Even if we want to be writing in Go as much as possible, there’s usually some Javascript, some CSS, some Python, maybe some Java, Swift, C#, PHP or even C++ in the project. Maybe this will change in the future, but right now, if you’re a commercial programmer working for a crust, every day you’ll touch a bunch of languages, so their differences tend to rub against one another.

  • Mainstream programmers expect static typing, not for performance, but for readability and maintainability–just look at what Typescript and Dart are bringing to Javascript, and Python’s formative efforts with optional typing.
  • Mainstream programmers expect concurrency. They expect to be able to do more than one thing at a time–just look at node.js and the compromises programmers were prepared to make to move away from heavy-weight thread per connection models. Go is obviously well positioned here.
  • Mainstream programmers expect some form of templated types because they’re used to it in the other languages they interact with alongside Go.

So my first answer is: Go should have some form of generics because it is a mainstream, imperative, block scoped language and it is expected these days.

My second answer is this: the designers of the language may choose not to add templated types or parameterised functions (keep in mind that I am not one of the language designers, only an exuberant fan) because, as I wrote in my series of posts, the repercussions for the simplicity and readability of the language may prove too jarring. If that were to happen, my recommendation would be that Go should own that decision.

What do I mean by that? Well, the best explanation I can give is a counterexample. Let’s look at Haskell. Haskell is what most functional programmers consider to be the baseline for a real FP language, and thus it looks pretty much like nothing programmers schooled in side effect ridden, block structured, imperative languages are used to. But Haskell programmers own that, they own their difference. A Haskell programmer doesn’t see a reason to make their language work more like PHP, or C++, or Rust, or even Go, and they are happy to explain the Haskell way of doing things to anyone who asks. My point is that if Go is not going to have a story for templated types, then we need to own it, just like Haskell programmers own their decisions.

This isn’t simply a case of saying “nope, sorry, no generics for Go 2.0, maybe in another 5 years”, but a more fundamental statement that they are not something that will be implemented in Go because we believe there is a better way to solve the underlying problem. Note that I did not say a better way to implement a templated type or parameterised function, but a better way to solve the underlying business problem. There is a difference.

This isn’t without precedent. Go was one of the first C style languages to eschew type inheritance, a decision which led to a radical simplification of the language and a focus on the mantras of communicating intent via interfaces, and encapsulation over inheritance. Before Go, it was assumed that a mainstream language would have classes and a type hierarchy; nowadays that is less true.

So, should Go 2.0 have generics? If the decision is to add them then I’m sure it can be done, after all the syntax is the least important part of the decision, and there is a wealth of prior art in other languages to guide us. However, if the decision is not to add templated types, then it should be made so explicitly. Then it is incumbent upon all Go programmers to explain the Go Way of solving problems.

Simplicity Debt Redux

In my previous post I discussed my concerns about the additional complexity that adding generics or immutability would bring to a future Go 2.0. As it was an opinion piece, I tried to keep it around 500 words. This post is an exploration of the most important (and possibly overlooked) point of that post.

Indeed, the addition of [generics and/or immutability] would have a knock-on effect that would profoundly alter the way error handling, collections, and concurrency are implemented. 

Specifically, this post explores what I believe would be the possible knock-on effects of adding generics or immutability to the language.

Error handling

A powerful motivation for adding generic types to Go is to enable programmers to adopt a monadic error handling pattern. My concerns with this approach have little to do with the notion of the maybe monad itself. Instead I want to explore the question of how this additional form of error handling might be integrated into the stdlib, and thus the general population of Go programmers.

Right now, to understand how io.Reader works you need to know how slices work, how interfaces work, and how nil works. If the if err != nil { return err } idiom were replaced by an option type or maybe monad, then everyone who wanted to do basic things like read input or write output would have to understand how option types or maybe monads work, in addition to understanding what templated types are and how they are implemented in Go.
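To make the comparison concrete, here is a hypothetical sketch of such a type, written in the parameterised syntax that eventually shipped in Go 1.18, long after this post. The design is invented purely to illustrate the extra machinery a newcomer would have to absorb; it is not a proposal.

package main

import (
    "fmt"
    "strconv"
)

// Result is a hypothetical maybe/either type: it holds either a value or an error.
type Result[T any] struct {
    value T
    err   error
}

func Ok[T any](v T) Result[T]      { return Result[T]{value: v} }
func Err[T any](e error) Result[T] { return Result[T]{err: e} }

// Unwrap forces the caller to confront both cases, much as `if err != nil` does today.
func (r Result[T]) Unwrap() (T, error) { return r.value, r.err }

// parse wraps strconv.Atoi in the hypothetical style.
func parse(s string) Result[int] {
    n, err := strconv.Atoi(s)
    if err != nil {
        return Err[int](err)
    }
    return Ok(n)
}

func main() {
    n, err := parse("9001").Unwrap()
    fmt.Println(n, err) // 9001 <nil>
}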

Obviously it’s not impossible to learn, but it is more complex than what we have today. Newcomers to the language would have to integrate more concepts before they could understand basic things, like reading from a file.

The next question is, would this monadic form become the single way errors are handled? It seems confusing, and gives unclear guidance to newcomers to Go 2.0, to continue to support both the error interface model and a new monadic maybe type. Also, if some form of templated maybe type were added, would it be a built-in, like error, or would it have to be imported in almost every package? Note: we’ve been here before with os.Error.

What began as the simple request for the ability to write a templated maybe or option type has ballooned into a set of questions that would affect every single Go package ever written.

Collections

Another reason to add templated types to Go is to facilitate custom collection types without the need for interface{} boxing and type assertions.

On the surface this sounds like a grand idea, especially as these types are leaking into the standard library anyway. But that leaves the question of what to do with the built in slice and map types. Should slices and maps co-exist with user defined collections, or should they be removed in favour of defining everything as a generic type?

Keeping both sounds redundant and confusing, as all Go developers would have to be fluent in both and develop a sophisticated design sensibility about when and where to choose one over the other. But removing slices and maps in favour of collection types provided by a library raises other questions.

Slicing

For example, if there is no slice type, only types like a vector or a linked list, what happens to slicing? Does it go away? If so, how would that impact common operations like handling the result of a call to io.Reader.Read? If slicing doesn’t go away, would that require the addition of operator overloading so that user defined collection types can implement a slice operator?

Then there are questions on how to marry the built in map type with a user defined map or set. Should user defined maps support the index and assignment operators? If so, how could a user defined map offer both the one and two return value forms of lookup without requiring polymorphic dispatch based on the number of return arguments? How would those operators work in the presence of set operations which have no value, only a key?

Which types could use the delete function? Would delete need to be modified to work with types that implement some kind of Deleteable interface? The same questions apply to append, len, cap, and copy.

What about addressability? Values in the built in map type are not addressable, but should that be permitted or disallowed for user defined map types? How would that interact with operator overloading designed to make user defined maps look more like the built in map?

What sounded like a good idea on paper—make it possible for programmers to define their own efficient collection data types—has highlighted how deeply integrated the built in map and slice are and spawned not only a requirement for templated types, but operator overloading, polymorphic dispatch, and some kind of return value addressability semantics.

How could you implement a vector?

So, maybe you make the argument that now we have templated types we can do away with the built in slice and map, and replace them with a Java-esque list of collection types.

Go’s Pascal-like array type has a fixed size known at compile time. How could you implement a growable vector without resorting to unsafe hacks? I’ll leave that as an exercise to the reader. But I put it to you that if you cannot implement a simple templated vector type with the memory safety we enjoy today with slices, then that is a very strong design smell.

Iteration

I’ll admit that the inability to use the for ... range statement over my own types was something that frustrated me for a long time when I came to Go, as I was accustomed to the flexibility of the iterator types in the Java collections library.

But iterating over in-memory data structures is boring—what you really want to be able to do is compose iterators over database results and network requests. In short, data from outside your process—and when data is outside your process, retrieving it might fail. In that case you have a choice: does your Iterable interface return a value, a value and an error, or do you go down the option type route? Each would require a new form of range loop semantic sugar in an area which already contains its share of footguns.

You can see that adding the ability to write template collection types sounds great on paper, but in practice it would perpetuate a situation where the built in collection types live on in addition to their user defined counterparts. Each would have their strengths and weaknesses, and a Go developer would have to become proficient in both. This is something that Go developers just don’t have to think about today as slices and maps are practically ubiquitous.

Immutability

Russ wrote at the start of the year that a story for reference immutability was an important area of exploration for the future of Go. Having surveyed hundreds of Go packages and found few which are written with an understanding of the problem of data races—let alone actually tried running their tests under the race detector—it is tempting to agree with Russ that the ‘after the fact’ model of checking for races at run time has some problems.

On balance, after thinking about the problems of integrating templated types into Go, I think if I had to choose between generics and immutability, I’d choose the latter.

But the ability to mark a function parameter as const is insufficient, because while it restricts the receiver from mutating the value, it does not prohibit the caller from doing so, which is the majority of the data races I see in Go programs today. Perhaps what Go needs is not immutability, but ownership semantics.

While the Rust ownership model is undoubtedly correct (if your program compiles, it has no data races), nobody can argue that the ownership model is simple or easy for newcomers. Nor would adding an extra dimension of immutability to every variable declaration in Go be simple, as it would force every user of the language to write their programs from the most pessimistic standpoint of assuming every variable will be shared and mutated concurrently.

In conclusion

These are some of the knock on effects that I see of adding generics or immutability to Go. To be clear, I’m not saying that it should not be done, in fact in my previous post I argued the opposite.

What I want to make clear is that adding generics or immutability has nothing to do with the syntax of those features, little to do with their underlying implementation, and everything to do with the impact that the capabilities they unlock would have on the overall complexity budget of the language and its libraries.

David Symonds argued years ago that there would be no benefit in adding generics to Go if they were not used heavily in the stdlib. The question, and concern, I have is: would the result be more complex than what we have today with our quaint built in slice, map, and error types?

I think it is worth keeping in mind the guiding principles of the language—simplicity and readability. The design of Go does not follow the accretive model of C++ or Java. The goal is not to reinvent those languages, minus the semicolons.

Simplicity Debt

Fifteen years ago Python’s GIL wasn’t a big issue. Concurrency was something dismissed as probably unnecessary. What people really needed was a faster interpreter; after all, who had more than one CPU? But, slowly, as the requirement for concurrency increased, the problems with the GIL increased.

By the time this decade rolled around, Node.js and Go had arrived on the scene, highlighting the need for concurrency as a first class concept. Various async contortions papered over the single threaded cracks of Python programs, but it was too late. Other languages had shown that concurrency must be a built-in facility, and Python had missed the boat.

When Go launched in 2009, it didn’t have a story for templated types. First we said they were important, but we didn’t know how to implement them. Then we argued that you probably didn’t need them, instead Go programmers should focus on interfaces, not types. Meanwhile Rust, Nim, Pony, Crystal, and Swift showed that basic templated types are a useful, and increasingly, expected feature of any language—just like concurrency.

There is no question that templated types and immutability are on their way to becoming mandatory in any modern programming language. But there is equally no question that adding these features to Go would make it more complex.

Just as efforts to improve Go’s dependency management situation have made it easier to build programs that consume larger dependency graphs, producing larger and more complex pieces of software, efforts to add templated types and immutability to the language would unlock the ability to write more complex, less readable software. Indeed, the addition of these features would have a knock on effect that would profoundly alter the way error handling, collections, and concurrency are implemented.

I have no doubt that adding templated types to Go will make it a more complicated language, just as I have no doubt that not adding them would be a mistake–lest Go find itself, like Python, on the wrong side of history. But, no matter how important and useful templated types and immutability would be, integrating them into a hypothetical Go 2 would decrease its readability and increase compilation times—two things which Go was designed to address. They would, in effect, impose a simplicity debt.

If you want generics, immutability, ownership semantics, option types, etc, those are already available in other languages. There is a reason Go programmers choose to program in Go, and I believe that reason stems from our core tenets of simplicity and readability. The question is, how can we pay down the cost in complexity of adding templated types or immutability to Go?

Go 2 isn’t here yet, but its arrival is a lot more certain than previously believed. As it stands now, generics or immutability can’t just be added to Go and still call it simple. As important as the discussions on how to add these features to Go 2 would be, equal weight must be given to the discussion of how to first offset their inherent complexity.

We have to build up a bankroll to spend on the complexity generics and immutability would add, otherwise Go 2 will start its life in simplicity debt.

Next: Simplicity Debt Redux