What Have We Learned from the PDP-11?

This post is a slightly edited version of my November presentation to the San Francisco chapter of Papers We Love.


C. G. Bell, W. D. Strecker, “Computer What Have We Learned from the PDP-11,” The 3rd Annual Symposium on Computer Architecture Conference Proceedings, pp. l-14, 1976.

The paper I have chosen tonight is a retrospective on a computer design. It is one of a series of papers by Gordon Bell, and various co-authors, spanning the design, growth, and eventual replacement of the companies iconic line of PDP-11 mini computers.

  • C. G. Bell, R. Cady, H. McFarland, B. Delagi, J. O’Laughlin, R. Noonan and W. Wulf, “A New Architecture for Mini-Computers – The DEC PDP- 11,” Proceedings of the Sprint Joint Computer Conference, pp. 657-675, AFIPS Press, 1970.
  • C. G. Bell, W. D. Strecker, “Computer What Have We Learned from the PDP-11,” The 3rd Annual Symposium on Computer Architecture Conference Proceedings, pp. l-14, 1976.
  • W. D. Strecker, “VAX-11/780: A Virtual Address Extension to the DEC PDP-11 Family,” Proceedings of the National Computer Conference, pp. 967-980, AFIPS Press, 1978.
  • C. G. Bell, W. D. Strecker, “Retrospective: what have we learned from the PDP-11—what we have learned from VAX and Alpha”, Proceedings of the 25th Annual International Symposium on Computer Architecture, pp. 6-10, 1998.

This year represents the 60th anniversary of the founding of the company that produced the PDP-11. It is also 40 years since this paper was written, so I thought it would be entertaining to review Bell’s retrospective through the lens of our own 20/20 hindsight.

Image credit: saccade.com

Who were Digital Equipment Corporation?

To set the scene for this paper, first we should talk a little about the company that produced the PDP-11, the Digital Equipment Corporation of Maynard, Massachusetts. Better known as DEC.

Image Credit: boston.com

DEC was founded in 1957 by Ken Olsen, and Harlan Anderson. Olsen and Anderson had worked together at MIT’s Lincoln Laboratory where they noticed that students would queue for hours to use the TX-0, an experimental interactive computer designed by Wes Clarke1.

Image credit: computerhistory.org

This is the TX-0. What do you notice about it?

Image credit: computerhistory.org

Let’s compare it to something like the IBM 704, a contemporary mainframe, which the MIT students largely ignored2. What Oslen and Anderson recognised was the desire for an interactive computer experience was so strong that there was a market for “small” computers dedicated to this role.

Image credit: computer-history.info

DEC’s first offering was the PDP-1, effectively a commercial version of the TX-0.


Correction: Michael Cheponis, the lead on the PDP-1 restoration project, kindly wrote to me with the following information.

Whirlwind and PDP-1 have 5 bit operation codes, TX-0 started with a 2 or 3 bit operation code, but gradually grew more after it lost the large memory to TX-2. A comparison of the PDP-1 and Whirlwind order codes makes it clear that the PDP-1 is a cost reduced and slightly improved version of the Whirlwind architecture.

    • Improvement: add indirect addressing.
    • Cost reduction: eliminate the live register by adding the subroutine call instructions, and eliminate the shift counter by replacing integer multiply and divide with multiply step and divide step instructions and making the shift instructions shift once for each bit in the address field.The biggest cost reduction was due to taking advantage of the advancement in logic design and packaging from Whirlwind to the PDP-1. (Using the Barta building for packaging versus 4 standard electronics racks.)

The contrast between the PDP-1 and the IBM 704 was visible at MIT, but the small, slow interactive computers, the Librascope LGP-30 and the Bendix G-15 preceded the PDP-1 by several years and sold in similar quantities.


It’s also worth noting that the name PDP is an acronym for “Programmed Data Processor”, as at the time, computers had a reputation of being large, complicated, and expensive machines, and DEC’s venture capitalists would not support them if they built a “computer”3.

Following the success of the PDP-1, DEC’s offerings blossomed into several families of computers, many of which were designed at least in part by Gordon Bell, the author of tonight’s paper.

Introduction

A computer is not solely determined by its architecture; it reflects the technological, economic, and human aspects of the environment in which it was designed and built. […] The finished computer is a product of the total design environment.

Right from the get go, Bell is letting us know that the success of any computer project is not abstractly building the best computer but building the right computer, and that takes context.

In this chapter, we reflect on the PDP-11: its goals, its architecture, its various implementations, and the people who designed it. We examine the design, beginning with the architectural specifications, and observe how it was affected by technology, by the development organization, the sales, application, and manufacturing organizations, and the nature of the final users.

By this time, 1976, Bell has been the VP of engineering at DEC for nearly four years. It’s clear that he’s considering the success of the PDP-11 in the wider context of the market which it was both developed to serve, and which would later influence the evolution of the PDP-11 line.

In keeping with the spirit of Bell’s words, this presentation focuses on two aspects of the paper; the technology and the people.

Background: Thoughts Behind The Design

Bell opens with this observation

It is the nature of computer engineering to be goal-oriented, with pressure to produce deliverable products. It is therefore difficult to plan for an extensive lifetime.

This is your agile mindset, right here. This is 25 years before Snowbird and the birth of the Agile manifesto. When this was written DEC weren’t a scrappy startup desperate to make their bones, they were an established company with several successful product lines in the market, and Bell is talking about the pressure to ship a minimum viable product trampling all over notions of being able to plan elaborately.

Like the IBM/360, the PDP-11 was designed not just as a single model of computer, but a range of models, for which software written for a small PDP-11 would be compatible with the larger one.

“The term architecture is used here to describe the attributes of a system as seen by the programmer, i.e., the conceptual structure and functional behaviour, as distinct from the organisation of the data flow and controls, the logical design, and the physical implementation.”  — G. M. Amdahl G. A. Blaauw and F. P. Brooks Jr. Architecture of the IBM System/360, 1964

Because of the open nature of the PDP-11, anything which interpreted the instructions according to the processor specification, was a PDP-11, so there had been a rush within DEC, once it was clear that the PDP-11 market was heating up, to build implementations; you had different groups building fast, expensive ones and cost reduced slower ones.

Despite its evolutionary planning, the PDP-11 has been quite successful in the marketplace: over 20,000 have been sold in the six years that it has been on the market (197Cb1975). It is not clear how rigorous a test (aside from the marketplace) we have given the design, since a large and aggressive marketing organization, armed with software to correct architectural inconsistencies and omissions, can save almost any design.

Here Bell is introducing his hypothesis for the paper; was the PDP-11 a great design, or was it simply the beneficiary of a hyperactive marketing department? To answer his own question Bell begins to reflect on the product that emerged and evaluate it against the design criteria that he and his fellow authors identified six years earlier.

Address space

The first weakness of minicomputers was their limited addressing capability. The biggest (and most common) mistake that can be made in a computer design is that of not providing enough address bits for memory addressing and management.

Minicomputers of the era often came with a 12 bit address space, offering just 4096 addresses each holding a 12 bit value called a word.

It’s worth taking an aside to note that the word minicomputer, which later came to be understood to be an indication of their physical size, was originally a contraction of the words minimal computer. The canonical example was the PDP-8, DEC’s previous offering, which offered just eight instructions.

Image credit: wikipedia

The reason for this tiny address spaces was cost. Memory was extremely expensive in the 60’s and early 70’s as each bit of storage consisted of a tiny iron doughnut, or core, woven into a mesh of control wires.

Cores were arranged into a plane, in this case representing 4096 bits, then stacked to produce words, so a 4096 word memory contained 16 million cores, all of which were at least partially hand assembled. You can see why memory was expensive.

Image credit: bitsavers.org

Bell and the other PDP-11 designers knew that core memory prices were continuing to fall, and that semiconductor memory, while not cost effective at that time, would continue to drive down the price per bit of storage. Thus the amount of memory that customers could afford to buy would increase over time as “… users tend to buy “constant dollar” systems”. But alas,

The PDP-11 followed this hallowed tradition of skimping on address bits, but it was saved by the principle that a good design can evolve through at least one major change.

Even with the designers foresight, Bell noted that not two years after its introduction, the PDP-11 had to be redesigned to include a memory management unit to allow access to a larger 18 bit address space at the cost of increased programming complexity. A few years later a further 4 bits were added.

In fairness, while Bell chastised himself for not seeing this coming, this pattern of insufficient address bits continues to this day. Who remembers the dos 640k limit? Who remembers fighting with himem.sys? Who has struggled with 32 bit programs that need more than 2gb of heap?

So, none of us are perfect.

Insufficient registers

A second weakness of minicomputers was their tendency not to have enough registers. This was corrected for the PDP-11 by providing eight 16-bit registers. Later, six 32-bit registers were added for floating-point arithmetic.  […] More registers would increase the multiprogramming context switch time and confuse the user.

It was not uncommon, even for mainframes of the day, to offer only a single register—the accumulator. If additional registers were provided, they would be specifically for use in indexing operations, they were not general purpose.

It’s also interesting to note Bell’s concern that additional registers would confuse the user. In the early 1970’s the assumption that the machine would be programmed directly in assembly was still the prevailing mindset.

There is a strong interplay between the number of registers in an architecture, the number of address bits, and the size of the instruction. All of these factors are rooted in the scarcity of memory, so it’s worth taking a little detour into instruction set design.

In Von Neumann machines (of which almost all computers of the 60’s were) the program and its data share the same, limited, address space, so inefficient programs didn’t just waste computing time, they wasted memory. A slow program is somewhat tolerable, but a program that is too large to fit in memory is a fatal condition, so you want your instruction encoding to be as efficient as possible.

Let’s consider the very common case of moving an value from one location in memory to another. How many bits would you need to describe that operation? One implementation might be:

MOV <addr> <addr>

Which would take 16 bits for the source, and a further 16 for the destination, and the some bits to encode the MOV instruction itself. Let’s call it 40 bits; this is not a multiple of 16, which would mean a complicated 2.5 word instruction encoding. However, what if we were to load that address into a register?

MOV (R0), (R1)

Then we’d only need to describe the register, the PDP had 8 registers, so only 3 bits per register, plus some for the operator. This could easily fit into a 16 bit word, rather than requiring a complex variable length encoding.

Image credit: bitsavers.org

It actually turns out that the PDP-11 uses 6 bits per register, but that still left 4 bits for the operation when two operands were present.

Hardware stack

A third weakness of minicomputers was their lack of hardware stack capability. In the PDP-11, this was solved with the autoincrement/autodecrement addressing mechanism. This solution is unique to the PDP-11 and has proven to be exceptionally useful. (In fact, it has been copied by other designers.)

Nowadays it’s hard to imagine hardware that doesn’t have a notion of a stack, but consider that a stack isn’t important if you don’t need recursion.

The design for the PDP-11 was laid down in 1969 and if we look at the programming languages of the time, FORTRAN and COBOL, neither supported recursive function calls. The function call sequence would often store the return address at a blank word at the start of the procedure making recursion impossible.

Image credit: bitsavers.org

The PDP-11 defined a stack pointer as we understand today, a register that controlled the operation of PUSH and POP style instructions, but the PDP-11 went one better and permitted any register to operate as a stack pointer by the addition of an auto increment/decrement modifier on all register operands.

For example, this single instruction:

MOV R4, -(R6)

will decrement the value stored in R6 by two, then store the value in R4 into the address stored in R6. This is the how you push a value onto the stack in PDP-11 assembler. If anyone has done any ARM programming, this will be very familiar to you.

This means there is no need for a dedicated PUSH or POP instruction, saving instruction encoding space, and allowing any register to be used as a stack pointer, although traditionally  R6 was assumed by the hardware if you used the native subroutine call instruction.

Interrupt latency

A fourth weakness, limited interrupt capability and slow context switching, was essentially solved with the device of UNIBUS interrupt vectors, which direct device interrupts.

At this point in DEC’s lifetime, almost all of its products, save the PDP-10 which was DEC’s mainframe offering, were aimed at interactive, laboratory, or process control applications. Interrupt responsiveness, the delay between an interrupt signal being raised, and the computer being ready to process the interrupt, is key to real time performance.

The PDP-11 addressed this by permitting the device that raised the interrupt to supply the address to service the interrupt. Bell proudly reported that:

The basic mechanism is very fast, requiring only four memory cycles from the time an interrupt request is issued until the first instruction of the interrupt routine begins execution.

Character handling

A fifth weakness of prior minicomputers, inadequate character-handling capability, was met in the PDP-11 by providing direct byte addressing capability.

Strings and character handling were of increasing importance during the 1960’s as scientific and business computing converged. The predominant character encodings at the time were 6 bit character sets which provided just enough space for upper case letters, the digits 0 to 9, space, and a few punctuation characters sufficient for printing financial reports.

DEC Sixbit encoding

Because memory was so expensive, placing one 6 bit character into a 12 or 18 bit word was simply unacceptable so characters would be packed into words.

This proved efficient for storage, but complex for operations like move, compare, and concatenate, which had to account for a character appearing in the top or bottom of the word, expending valuable words of program storage to cope.

The problem was addressed in the PDP-11 by allowing the machine to operate on memory as both a 16-bit word, and the increasingly popular 8-bit byte. The expenditure of 2 additional bits per character was felt to be worth it for simpler string handling, and also eased the adoption of the increasingly popular 7-bit ASCII standard of which DEC were a proponent at the time. Bell concludes this point with the throw away line:

Although string instructions are not yet provided in the hard- ware, the common string operations (move, compare, concatenate) can be programmed with very short loops.

And indeed they can. One can write a string copy routine using two instructions, assuming that the source and destination are already in registers.

loop:   MOVB (src)+, (dst)+
        BNE loop

The routine takes full advantage of the fact that MOV updates the processor flag. The loop will continue until the value at the source address is zero, at which point the branch will fall through to the next instruction. This is why C strings are terminated with zeros.

Read only memory

A sixth weakness, the inability to use read-only memories, was avoided in the PDP-11. Most code written for the PDP-11 tends to be pure and reentrant without special effort by the programmer, allowing a read-only memory (ROM) to be used directly.

In process control applications, where the program is relatively fixed, having to load the program each time from magnetic or punched tape is expensive. You have to buy and maintain infrequently used I/O devices. It would be far more convenient if the program could always be present in the computer at startup. However, because of the extreme memory shortage of early minicomputers, and the lack of notion of a hardware stack, self modifying code was often unavoidable, which seriously limited the use of read only memory. Bell is justifiably proud that the PDP-11 design knocked that one out of the park.

Primitive I/O Capabilities

A seventh weakness, one common to many minicomputers, was primitive I/O capabilities.

During the late 60’s when the PDP-11 was being designed, input/output was very expensive. Mainframes of the time use a model called channel I/O, where the main CPU sent a small program to a channel controller, which would execute the program and report the result. The program would usually instruct a tape drive to load a record, or a punch to punch a card.

Channel I/O was important because it allowed the mainframe to offload the oversight of the I/O operation to another processor, freeing its valuable cycles, and permitting overlapping I/O, which in turn increased processor utilisation. The downside was channel I/O required a smaller CPU inside each channel controller which increased the cost of the installation substantially.

In the minicomputer world, I/O was usually performed directly by the CPU, usually with specialised instructions hard coded for a specific device, eg. the paper tape reader or console printer.

The PDP-11 introduced something unique, memory mapped I/O. This isn’t memory mapped I/O that you might be used to with the mmap(2) system call, but rather the convention that specific addresses in memory weren’t just dumb storage, instead their contents were mapped onto cards plugged into the backplane, which DEC called the UNIBUS.

Image credit: bitsavers.org

For example, a value written to 7775664 would be written to device attached to the console, usually a hard copy terminal.

Image credit: bitsavers.org

If you read the value at address 777570, you get the value entered on the front panel switches. This was often used as an early form of bootstrap configuration.

Image credit: bitsavers.org

Similarly talking to the RK05 disk drive was accomplished by writing the sector number you wanted to access to 777412, the address to transfer that sector to address 777410, and the number of words to 777406. Then by setting bit zero at address 777404 to 1 (the GO bit), the drive will transfer the number of words you asked for directly into memory.

Image credit: bitsavers.org

My favorite has got to be the KW11-L line clock. A write to bit 6 of address 777546 starts an interrupt firing every 20ms. What’s special about 20ms? Because that’s the reciprocal of the frequency of the AC waveform here in the US5. Yup, just like a bedside clock, the PDP-11 told time by counting the number of power line cycles.

High programming costs

A ninth weakness of minicomputers was the high cost of programming them. Many users program in assembly language, without the comfortable environment of editors, file systems, and debuggers available on bigger systems. The PDP-11 does not seem to have overcome this weakness, although it appears that more complex systems are being built successfully with the PDP-11 than with its predecessors, the PDP-8 and PDP-15.

Because of their minimal nature, mini computers were not pleasant environments on which to write programs. Often this would involve tedious switch flipping, or perhaps editing and assembling a program on another, larger, computer and then transferring it via paper tape.

This is very much akin to how those of us who work with micro controllers are still programming today; editing on a large workstation, compiling a target hex file, then transferring that hex to the micro controller’s flash storage.

Image credit: Dennis Ritchie

However, it seems Bell was unaware of the work of Thompson and Ritchie, who were busy constructing their own programming environment on their PDP-11 in New Jersey.

People: Builders of the Design

Now we move to the second section; the people.

As an amateur historian this section is the most interesting to me, because the study of the history of computing, or really any historical subject, is fundamentally a study of people, and the context surrounding the decisions they made.

Bell recognises that while computers are built from technology, they are built by people and thus he dedicates this section to describing the group dynamics at DEC during the development of the PDP-11.

The problems faced by computer designers can usually be attributed to one of two causes: inexperience or second-systemitis.

Here Bell recalls the words of Fred Brooks from his recent (at the time) book The Mythical Man-Month.

Brooks, the lead for the OS/360 project struggled for years to build a single general purpose operating system that would run across the entire range of IBM/360 modes, this was after all the goal of the 360 project. Brooks’ words must have been fresh in Bell’s mind when he wrote this paragraph.

Chronology of the design

This section is a perfect window into the operation of DEC in the late 1960’s

The internal organization of DEC design groups has through the years oscillated between market orientation and product orientation. Since the company has been growing at a rate of 30 to 40% a year, there has been a constant need for reorganization. At any given time, one third of the staff has been with the company less than a year.

Raise your hand if this sounds familiar to you.

At the time of the PDP-11 design, the company was structured along product lines. The design talent in the company was organized into tight groups: the PDP-10 group, the PDP-15 (an 18-bit machine) group, the PDP-8 group, an ad hoc PDP-8/S subgroup, and the LINC-8 group. Each group included marketing and engineering people responsible for designing a product, software and hardware. As a result of this organization, architectural experience was diffused among the groups, and there was little understanding of the notion of a range of products.

Here Bell spends some time iterating through each group, listing their strengths and weaknesses as sponsors for the PDP-11. I won’t recount all the choices, with one exception.

The PDP-10 group was the strongest group in the company. They built large powerful time-shared machines. It was essentially a separate division of the company, with little or no interaction with the other groups. Although the PDP-10 group as a whole had the best understanding of system architectural controls, they had no notion of system range, and were only interested in building higher-performance computers.

Having recently worked for a software company where one or two of the oldest products made almost all the company profit I have some sympathy for Bell’s position. The PDP-10 was DEC’s version of a mainframe; enormously powerful, but was only available at one price point.

The first design work for a 16-bit computer was carried out under the eye of the PDP-15 manager, a marketing person with engineering background. This first design was called PDP-X, and included specification for a range of machines. As a range architecture, it was better designed than the later PDP-11, but was not otherwise particularly innovative. Unfortunately, this group managed to convince management that their design was potentially as complex as the PDP-10 (which it was not), and thus ensured its demise, since no one wanted another large computer unrelated to the company’s main large computer.

And here Bell teaches us an important lesson; when your competition is in the same reporting chain, they have effective tools to ensure your project is killed before it reaches the market.

In retrospect, the people involved in designing PDP-X were apparently working simultaneously on the design of Data General.

This shade might go unnoticed by the casual reader, but it’s a reference to a defection to rival that of Shockely’s traitorous eight a decade earlier6.

Edson de Castro, the product manager of the PDP-8, and lead on the PDP-X project had left DEC, along with several of his team to form Data General. The record is not clear if de Castro left because the PDP-X was canceled or if his departure was the final straw for the faltering project. In either case, the result was obvious, as Bell writes.

As the PDP-X project folded, the DCM (Desk Calculator Machine, a code name chosen for security) was started. Design and planning were in disarray, as Data General had been formed and was competing with the PDP-8, using a very small 16-bit computer.

Data General were now competing with DEC with their 16 bit Nova in the space that the PDP-8 had defined and de Castro knew like the back of his hand; rack mounted laboratory equipment.

Image credit: vintagecomputer.net

The 12 bit PDP-8 versus Data General’s 16 bit Nova

By MBlairMartin (Own work) [CC BY-SA 4.0], via Wikimedia Commons

The PDP-11: An Evaluation

The last section of the paper, having evaluated the PDP-11 against its predecessors, Bell proceeds to evaluate the PDP-11 against itself. The biggest take away was the UNIBUS.

In general, the UNIBUS has behaved beyond all expectations. Several hundred types of memories and peripherals have been interfaced to it; it has become a standard architectural component of systems in the $3K to $100K price range (1975).

What is the UNIBUS?

Image credit: wikipedia

The earliest commercial computers were designed with plugin modules connected by a wired backplane. In the days of vacuum tubes this was a necessity because of the unreliability of tubes and the need to replace modules quickly.

Image credit: wikipedia

Later a desire to build computers out of a standardized modules resulted in the generalised logic blocks interconnected by a sophisticated backplane. This is an example of DEC’s early flip-chip offerings.

Image credit: piercefuller.com

You’d mount them on a complex wire wrapped backplane to build a computer, in this case a PDP-8.

The UNIBUS represented an evolution of the previous DEC designs as an abstraction of an idealised control plane. The availability of medium scale integrated components moved the complexity from the backplane to the modules that populated it. This in turn created a standard way for additional modules to be attached to the computer.

The UNIBUS, as a standard, has provided an architectural component for easily configuring systems. Any company, not just DEC, can easily build components that interface to the bus. Good buses make good engineering neighbours, since people can concentrate on structured design. Indeed, the UNIBUS has created a secondary industry providing alternative sources of supply for memories and peripherals. With the exception of the IBM 360 Multiplexor/Selector bus, the UNIBUS is the most widely used computer interconnection standard.

Prior to the UNIBUS, the I/O devices a minicomputer could support were dictated by the designers. You almost had to encode how to talk to them into the CPU logic. After the UNIBUS, the field for end user customisation and experimentation was cracked wide open.

What have we learned from the PDP-11?

Bell’s retrospective ended when the paper was written in 1976/77, but from our vantage point, forty years later, the impact that the PDP-11 has been huge.

RISC

First of all, while the PDP-11 was not designed, or even understood to be a RISC processor–that term would not be coined util 1976 by John Cocke and the IBM 801. However to anyone with experience with processors like the ARM, a contemporary RISC microprocessor, the similarities between the two instruction sets are striking. Just as programming language design is a process of evolution and cultural poaching, so too it seems is instruction set design.

The PDP-11 also drove a stake firmly through the heart of dedicated I/O instructions, it cemented the model of memory mapped I/O as the predominant control mechanism to this day. The only processors post the PDP that offered seperate input/output instructions that I can think of were the Intel 8080, and its cousin, the Z80.

UNIX

Next is the PDP-11’s impact on software and operating systems. The PDP-11 is the machine that Ken Thompson and Dennis Ritchie developed UNIX at Bell Labs7.

Before the PDP-11, there was no UNIX. Before the PDP-11, there was no C, this is the computer that C was designed on. If you want to know why the classical C int is 16 bits wide, it’s because of the PDP-11.

UNIX bought us ideas such as pipes, everything is a file, and interactive computing.

VAX-11/780

In interactive computing, memory usage is effectively unbounded, and while the PDP was perfect for process control applications, this demand for interactive computing was the hallmark and driver for the PDP-11’s replacement.

The year this paper was released, 1977, the PDP-11’s successor, the VAX-11, which stood for “virtual address extension”–you can see Bell was not going to address space mistake again–was released.

BSD

UNIX, which had arrived at Berkley in 1974 aboard a tape carried by Ken Thompson, would evolve into the west coast flavoured Berkley Systems Distribution.

Berkeley UNIX had been ported to the VAX by the start of the 1980’s and was thriving as the counter cultural alternative to DEC’s own VMS operating system. Berkeley UNIX spawned a new generation of hackers who would go on to form companies like Sun micro systems, and languages like Self, which lead directly to the development of Java.

UNIX was ported to a bewildering array of computer systems during the 80’s and the fallout from the UNIX wars gave us the various BSD operating systems who continue to this day.

NeXT

4BSD, a descendant of the original Berkeley distribution became the basis of the operating system for Steve Job’s NeXT line of computers. And when Apple purchased NeXT in 1997, NextSTEP and its BSD derived user space, became the foundations for Darwin, OSX, and iOS.

Windows NT

As we say earlier with Edson de Castro, DEC was no stranger to breakups.

Dave Cutler, the architect of the VAX VMS operating system, after a failed attempt to start a new combined operating system and hardware project, designed to succeed the VAX, decamped to Microsoft in 1988 bringing with him his team and lead the development of Windows NT. Those with a knowledge of Windows’ internals and VMS will perhaps spot the similarities.

Xerox Alto

To close the loop on the de Castro story, the Data General Nova series provided inspiration to Charles Thacker and Butler Lampson, the designers of the Xerox Alto, which itself was the fabled inspiration for the look and feel of the Apple Macintosh.

Data General Nova

The rivalry between Data General and DEC continued into the 32 bit era, the story of which is told in Tracy Kidder’s 1981 Pulitzer winner, The Soul of a New Machine.

What have we learned from the PDP-11?

While its development was sometimes chaotic, and not without its flaws, the PDP-11 is at the intersection of many threads of history.

Hardware, software, programming languages, operating systems, have all been influenced by the PDP-11. I wager there is not a single person in this room who cannot trace the lineage of the language they work with, the computer they use, or the operating system it runs, back to the PDP-11.

And that is worth celebrating.

Never edit a method, always rewrite it

At a recent RubyConf, Chad Fowler presented his ideas for writing software systems that mirror the process of continual replacement observed in biological systems.

The first principal of this approach is, unsurprisingly, to keep the components of the software system small–just as complex organisms like human beings are constituted from billions of tiny cells which are constantly undergoing a process of renewal.

Following from that Fowler proposed this idea:

What would happen if you had a rule on your team that said you never edit a method after it was written, you only rewrote it again from scratch?

Fowler quickly walked back this suggestion as possibly not a good idea, nevertheless the idea has stuck in my head all day. What would happen if we developed software this way? What benefits could it bring?

  • Would it have benefits for software reuse? Opening up a method to add another branch condition or switch clause would become more expensive, and having rewritten the same function over and over again, the author might be tempted to make it more generalisable over a class of problems.
  • Would it have an impact on function complexity? If you knew that changing a long, complex, function required writing it again from scratch, would it encourage you to make is smaller? Perhaps you would pull non critical setup or checking logic into other functions to limit the amount you had to rewrite.
  • Would it have an impact on the tests you write? Some functions are truly complex, they contain a core algorithm that can’t be reduced any further. If you had to rewrite them, how would you know you got it right? Are there tests? Do they cover the edge cases? Are there benchmarks so you could ensure your version ran comparably to the previous?
  • Would it have an impact on the name you chose? Is the name of the current function sufficient to describe how to re-implement it? Would a comment help? Does the current comment give you sufficient guidance?

I agree with Fowler that the idea of immutable source code is likely unworkable. But even if you never actually followed this rule in practice, what would be the impact on the quality, reliability, and usability of your programs if you always wrote your functions with the mindset of it being immutable?


Note: Fowler talks about methods, because in Ruby, everything is a method. I prefer to talk about functions, because in Go, methods are a syntactic sugar over functions. For the purpose of this article, please treat functions and methods as interchangable.

Please, vote Yes for marriage equality in Australia

I wanted to write a few words about the postal survey on marriage law currently underway in Australia.

As an Australian, our country and our government do so many things that make me ashamed as a citizen; the poverty of our indigenous population, the inhumane treatment of refugees on Manus Island, and the maniacal desire to burn every last ounce of coal in the country, come hell and high water, to name just a few.

It is, quite frankly, overwhelming how institutionally cruel our government, which is after all a representation of the majority of Australians, can be, and nothing has sharpened this meanness to a point than the way the Liberal government have approached this survey.

With everything that is wrong in the world right now; climate change, the threat of nuclear war, and an unqualified narcissist running the White House, voting yes to the survey’s simple question is, quite literally, the smallest thing you could do to bring joy to two people.

So please, when you get your postal survey, vote yes.

Thank you.

Why I joined Heptio

Everyone gets the same set of tools

Something that had long puzzled me was the question “Why do some people [in the organisation] have root, and others do not?” It seemed to me that the reason the sysadmins had the root passwords, and everyone else had to raise tickets, was a tooling problem. Giving everyone root would permit anyone in the organisation to fix their own problems, deploy their own software, or, less charitably, cowboy things or be downright naughty. And while everyone had root, it usually turned out that only the operations team had the on call pager.

After the wholesale failure of organisations to understand Devops, I’m a big fan of the “You build it, you run it” movement. So when George Barnett and I built the Atlassian OnDemand Cloud we made a deliberate decision that everyone would get the same tools, and (modulo permissions and audit logs) be empowered to use the platform to the full extent. There wouldn’t be one set of tools for regular users, and a super set of “power tools” reserved for operators.

To me you build it, you run it, means if you have a problem, we’ll help you learn to use the tools better, not fix your problem for you.

Virtualise the operating system, not the hardware

I remember playing with VMware in 1999 or early 2000. I thought it was an amazing trick, especially as the drivers for my sound card worked way better in virtualized Windows than the real thing.

Fast forward a few years and I was using VMware to maintain a fleet of foreign language Windows installations for testing. Skip forward a few more and the industry had figured out that virtualisation was a solution to the sprawl of single use Windows servers that cluttered up wiring cupboards and data centres.

Virtualisation is a neat trick taken well beyond the point of a joke, but it did shine a light on the dark corners of systems administration. Back when turning up a server involved purchase orders, waiting for hardware to be shipped, contract negotiations, and trips to the data centre, what was a few hours spent installing the operating system? But when virtual hardware could be conjured out of thin air in seconds, it cast a long shadow over the need to automate operating system installation and management.

This was the age of Puppet and Chef, who re-plowed the ground sowed a decade earlier by CFEngine. Now sysadmins could configure and manage servers at the speed they could be virtually provisioned. I remember, thinking back to when I started to use Puppet, and imagining about what it would have been like to have those tools in previous jobs, where automation involved SVN repositories full of perl scripts, and crontab entries lovingly copy pasta’d between machines. And so everything was good for a time in the age of configuration as code.

But, simulating the entirety of an x86 host on another, just so people can share a computer, is a ridiculous waste. This shouldn’t be a surprise, FreeBSD Jails and Solaris Zones (rest in peace) had been coughing loudly about this for decades. Bryan Cantrill said it best when he exclaimed that we should “virtualise the operating system, not the hardware“, or as we’ve come to know them: containers.

The death of the operating system

I remember where I saw Docker for the first time. The product wasn’t even a year old and they were carpet bombing any meetup that would have them to promote it. Canonical were sprinting at a hotel near SFO and I convinced several of my teammates to squeeze into a taxi for the first meetup in San Mateo. What I saw that night shook me to my core. It wasn’t just the speed–oh the speed, after spending two years waiting for EC2 and slow apt mirrors–it was the clarity of that Californian mindset. What would happen if I checked my entire application deployment into git?

It was clear to me that night that virtual machines were virtualising stuff that people didn’t care about; virtual video cards, virtual floppy drives, virtual ram that swapped to virtual disks. What people wanted was a virtual kernel–their own pid 1. Orchestration tools like Chef, Puppet, and Juju were trying to orchestrate an entire operating system when what developers really wanted was a way to take a single program, the one that they had written, and deploy it to a server. Filesystems, crontabs, init/upstart/systemd, apt-get and dpkg-reconfigure, weren’t just someone else’s problem, they were irrelevant.

Anyone who’s endured to my rants about product knows my unwavering belief in the Innovator’s Dilemma. Through the window of Christensen’s logic, it was clear that the server orchestration market had been upended in that moment. Squeezed between Docker images at the low end and Netflix’s “everything is an AMI” model at the top end, was a large middle ground filled with orchestration tools that expected to be given a running operating system to configure. The Chefs and Puppets and whatnots would be desperately trying to convince the biggest orchestration users–the Netflixs of the world, with their CI/CD pipelines that pooped AMIs–to adopt agent based tooling, while all the while each developer faced with the question “How should I deploy my application?” would default to docker push.

Orchestration as table stakes

If you’re building your own orchestration layer, then you are betting on the wrong horse–I say this as someone who’s built a bespoke container based PaaS.

Within the next year or two you’ll be able to buy access to a Kubernetes API server at every price point; on your laptop, shared as a VPS, in your own VPC, or even as an appliance. Building on top of the Kubernetes primitives is where the value lies. Building on top of the shared tooling the Kubernetes API provides the level playing field that every development team who is responsible for supporting their own software in production is entitled to.

Why did I join Heptio? Because I believe that the administration of operating systems has reached its endgame. Kubernetes is going to revolutionise the way software is developed, and deployed, and I’m honoured to be given the opportunity to join the company that is going to make that happen.

I’m talking about Go at DevFest Siberia 2017

In September i’ll be speaking about Go at events in Russia and Taiwan.

DevFest Siberia 2017, September 23rd and 24th

I’ve been accepted to give two presentations at the GDG Novosibirsk DevFest Siberia 2017 event in Russia.

High performance servers without the event loop

Conventional wisdom suggests that the key to high performance servers are native threads, or more recently event loops. Neither solution is without downside. Threads carry a high overhead in terms of scheduling cost and memory footprint. Event loops lessen those costs, but introduce their own requirements for a complex callback driven style.

Go is a general purpose programming language in use in a wide range of domains and is well suited to writing network software. Go was introduced in 2009 with the explicit goal of helping programmers write programs that could solve problems of Google’s scale, and that means writing high performance servers.

This talk will focus on the features of the Go language and runtime environment, that allow programmers to write simple, high performance network services without resorting to native threads or event loop-driven callbacks.

Workshop: Exploring the Go execution tracer

As a complement to my conference talk I’ll be teaching a workshop on the Go execution tracer. This workshop follows on from my GolangUK presentation from last year and my High Performance Go workshop, and specifically focuses on the Go execution tracer,

The execution tracer is a new profiling and tracing facility integrated into Go since version 1.5. Unlike “external” profiling tools like pprof, valgrind, or perf, the execution tracer is integrated directly into the Go runtime, giving it detailed knowledge of the scheduler, the network poller, and the garbage collector.

In this workshop I will explain the operation of the execution tracer, how to collect, then analyse, the results of a trace. The audience will step through a set of problems, framed as the trace output of unknown programs to learn how to interpret the results from the execution tracer, improve our code to address performance or scalability bottlenecks, and verify the results.

You can find more information and purchase tickets for the event at the DevFest 2017 website.

Go Taiwan Meetup, Taipei, September 26th

I’ll be visiting the Go meetup in Taipei, Taiwan on the 26th of September. You can find details of the meetup soon on the GolangTW website.


Russian translation by Elena Grahovac

В сентябре я расскажу о Go на мероприятиях в России и Тайване.

DevFest Siberia 2017, Новосибирск, 23-24 сентября

Оргкомитет конференции DevFest Siberia 2017, которая пройдет в Новосибирске (Россия), принял мои заявки на два выступления.

Высокопроизводительные серверы без цикла событий

Бытует мнение, что ключом к написанию высокопроизводительных серверов является использование собственных потоков (native threads), место которых в последнее время занимают циклы событий (event loops). Однако, у обоих этих решений есть свои недостатки. Потоки, с точки зрения затрат на планирование и объем памяти, несут высокие накладные расходы. Циклы событий уменьшают эти затраты, но ставят определенные требования к витиеватым принципам разработки, основанной на callback’ах.

Go – это универсальный язык программирования, который используется в широком диапазоне областей и отлично подходит для написания сетевого программного обеспечения. Go был представлен в 2009 году, его цель – помочь разработчикам писать программы, которые могли бы решать задачи масштаба Google, то есть задачи написания высокопроизводительных серверов.

В этом докладе будут рассмотрены особенности языка и среды выполнения (runtime) Go, которые позволяют программистам писать простые высокопроизводительные сетевые сервисы, не прибегая к собственным потокам или callback’ам, связанным с циклом событий.

Мастер-класс: Изучаем трассировщик выполнения Go

В качестве дополнения к докладу я проведу мастер-класс по трассировщику выполнения (execution tracer) Go. Этот мастер-класс вытекает из моего доклада «Семь способов профилирования программы, написанной на Go» с прошлогодней конференции GolangUK и из моего мастер-класса «Высокая производительность Go». Новый мастер-класс фокусируется на трассировщике выполнения Go.

Трассировщик выполнения – это новое средство профилирования и трассировки, интегрированное в Go, начиная с версии 1.5. В отличие от «внешних» инструментов профилирования, таких как pprof, valgrind или perf, трассировщик выполнения интегрируется непосредственно в среду выполнения Go, предоставляя подробные сведения о планировщике (scheduler), сетевом поллере (network poller) и сборщике мусора (garbage collector).

В рамках мастер-класса я объясню, как работает трассировщик выполнения, и расскажу о том, как собрать, а затем проанализировать результаты трассировки. Шаг за шагом участники пройдут через набор задач, оформленных как вывод трассировки неизвестных программ, и узнают, как интерпретировать результаты трассировщика, улучшить код, устранить узкие места производительности или масштабируемости и проверить результаты.

Найти больше информации и приобрести билеты можно на сайте DevFest Siberia 2017.

Go Taiwan Meetup, Тайбэй, 26-е сентября

Я приеду на Go-митап в Тайбее (Тайвань) 26-го сентября. Детали мероприятия скоро появятся на сайте GolangTW.

The HERE IS key

The Lear Siegler ADM-3A terminal is a very important artefact in computing history.

ADM-3A keyboard (image credit vintagecomputer.ca)

If you want to know why your shell abbreviates $HOME to ~, it’s because of the label on the ~ key on the ADM-3A. If you want to know why hjkl are the de facto cursor keys in vi, look at the symbols above the letters. The ADM-3A was the “dumb terminal” which Bill Joy used to develop vi.

Recently the ADM-3A came up in a twitter discussion about the wretched Apple touch bar when Bret Victor dropped this tweet:

Which settled the argument until Paul Brousseau asked:

Indeed, what does the HERE IS1 key do? Its prominent position adjacent to the RETURN key implies whatever it does, it is important.

Fortunately the answer to Paul’s question was easy to find. The wonderful BitSavers archive has the user manual for the ADM-3A available (cached to avoid unnecessary bandwidth costs to BitSavers). On page 29 we find this diagram

Page 29, ADM-3A Users Manual (courtesy bitsavers.org)

So HERE IS, when pressed, transmits a predefined identification message. But what do to the words “message is displayed in half-duplex” mean? The answer to that riddle lies in the ADM-3A’s Answerback facility.

Scanning forward to page 36, section 3.3.6 describes the configuration of the Answerback facility–programming the identification message transmitted when HERE IS is pressed.

Section 3.3.6, page 36, ADM-3A Users Manual (courtesy bitsavers.org)

Pressing the HERE IS key or receiving an ENQ from the host … causes the answerback message to be transmitted to the host and to be displayed if the terminal is in half duplex mode.

This is interesting, the remote side can ask the terminal “who are you?”.

The HERE IS key is a vestige of a an older facility called Enquiry. Enquiry allowed one end of the connection to query if the remote side was still connected, and if it was, exactly who was connected.

ANSWERBACK Message
Answerback is a question and answer sequence where the host computer asks the terminal to identify itself. The VT100 answerback feature provides the terminal with the capability to identify itself by sending a message to the host. The entire answerback sequence takes place automatically without affecting the screen or requiring operator action. The answerback message may also be transmitted by typing CTRL-BREAK.

This description is from the 1978 Digital VT100 user guide. It was certainly a simpler time when the server could ask a terminal to identify itself, and trust the answer.


Notes

  1. I’ve chosen to write the name of the key in all caps as the base model of the ADM-3A was only capable of displaying upper case letters. If you wanted lower case (above 0x5F hex), that was an optional extra.

Context isn’t for cancellation

This is an experience report about the use of, and difficulties with, the context.Context facility in Go.

Many authors, including myself, have written about the use of, misuse of, and how they would changecontext.Context in a future iteration of Go. While opinions differs on many aspects of context.Context, one thing is clear–there is almost unanimous agreement that the Context.WithValue method on the context.Context interface is orthogonal to the type’s role as a mechanism to control the lifetime of request scoped resources.

Many proposals have emerged to address this apparent overloading of context.Context with a copy on write bag of values. Most approximate thread local storage so are unlikely to be accepted on ideological grounds.

This post explores the relationship between context.Context and lifecycle management and asks the question, are attempts to fix Context.WithValue solving the wrong problem?

Context is a request scoped paradigm

The documentation for the context package strongly recommends that context.Context is only for request scoped values:

Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx:

func DoSomething(ctx context.Context, arg Arg) error {
        // ... use ctx ...
}

Specifically context.Context values should only live in function arguments, never stored in a field or global. This makes context.Context applicable only to the lifetime of resources in a request’s scope. Given Go’s lineage on the server, this is a compelling use case. However, there exist other use cases for cancellation where the lifetime of the resource extends beyond a single request. For example, a background goroutine as part of an agent or pipeline.

Context as a hook for cancellation

The stated goal of the context package is:

Package context defines the Context type, which carries deadlines, cancelation signals, and other request-scoped values across API boundaries and between processes.

Which sounds great, but belies its catch-all nature. context.Context is used in three independent, yet sometimes conflated, scenarios:

  • Cancellation via context.WithCancel.
  • Timeout via context.WithDeadline.
  • A bag of values via context.WithValue.

At any point, a context.Context value can represent any one, or all three of these independent concerns. However, context.Context‘s most important facility, broadcasting a cancellation signal, is incomplete as there is no way to wait for the signal to be acknowledged.

Looking to the past

As this is an experience report, it would be germane to highlight some actual experience. In 2012 Gustavo Niemeyer wrote a package for goroutine lifecycle management called tomb which is used by Juju for the management of the worker goroutines within the various agents in the Juju system.

tomb.Tombs are concerned only with lifecycle management. Importantly, this is a generic notion of a lifecycle, not tied exclusively to a request, or a goroutine. The scope of the resource’s lifetime is defined simply by holding a reference to the tomb value.

A tomb.Tomb value has three properties:

  1. The ability to signal the owner of the tomb to shut down.
  2. The ability to wait until that signal has been acknowledged.
  3. A way to capture a final error value.

However, tomb.Tombs have one drawback, they cannot be shared across multiple goroutines. Consider this prototypical network server where a tomb.Tomb cannot replace the use of sync.WaitGroup.

func serve(l net.Listener) error {
        var wg sync.WaitGroup
        var conn net.Conn
        var err error
        for {
                conn, err = l.Accept()
                if err != nil {
                        break
                }
                wg.Add(1)
                go func(c net.Conn) {
                        defer wg.Done()
                        handle(c)
                }(conn)
        }
        wg.Wait()
        return err
}

To be fair, context.Context cannot do this either as it provides no built in mechanism to acknowledge cancellation. What is needed is a form of sync.WaitGroup that allows cancellation, as well as waiting for its participants to call wg.Done.

Context should become, well, just context

The purpose of the context.Context type is in it’s name:

context /kɒntɛkst/ noun
The circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood.

I propose context.Context becomes just that; a request scoped association list of copy on write values.

Decoupling lifetime management from context.Context as a store of request scoped values will hopefully highlight that request context and lifecycle management are orthogonal concerns.

Best of all, we don’t need to wait til Go 2.0 to explore these ideas like Gustavo’s tomb package.

Typed nils in Go 2

This is an experience report about a gotcha in Go that catches every Go programmer at least once. The following program is extracted from a larger version that caused my co-workers to lose several hours today.

package main

import "fmt"

type T struct{}

func (t T) F() {}

type P interface {
        F()
}

func newT() *T { return new(T) }

type Thing struct {
        P
}

func factory(p P) *Thing { 
        return &Thing{P: p}
}

const ENABLE_FEATURE = false

func main() {
        t := newT()
        t2 := t
        if !ENABLE_FEATURE {
                t2 = nil
        }
        thing := factory(t2)
        fmt.Println(thing.P == nil)
}

This distilled version of the program in question, while non-sensical, contains all the attributes of the original. Take some time to study the program and ask yourself, does the program print true or false?

nil != nil

Not to spoil the surprise, but the program prints false. The reason is, while nil is assigned to t2, when t2 is passed to factory it is “boxed” into an variable of type P; an interface. Thus, thing.P does not equal nil because while the value of P was nil, its concrete type was *T.

Typed nil

You’ve probably realised the cause of this problem is the dreaded typed nil, a gotcha that has its own entry in the Go FAQ. The typed nil emerges as a result of the definition of a interface type; a structure which contains the concrete type of the value stored in the interface, and the value itself. This structure can’t be expressed in pure Go, but can be visualised with this example:

var n int = 200 
var i interface{} = n

The interface value i is assigned a copy of the value of n, so i‘s type slot holds n‘s type; int, and it’s data slot holds the value 200. We can write this more concisely as (int, 200).

In the original program we effectively have the following:

var t2 *T = nil
var p P = t2

Which results in p, using our nomenclature, holding the value (*T, nil). So then, why does the expression p == nil evaluate to false? The explanation I prefer is:

  • nil is a compile time constant which is converted to whatever type is required, just as constant literals like 200 are converted to the required integer type automatically.
  • Given the expression p == nil, both arguments must be of the same type, therefore nil is converted to the same type as p, which is an interface type. So we can rewrite the expression as (*T, nil) == (nil, nil).
  • As equality in Go almost always operates as a bitwise comparison it is clear that the memory bits which hold the interface value (*T, nil) are different to the bits that hold (nil, nil) thus the expression evaluates to false.

Put simply, an interface value is only equal to nil if both the type and the value stored inside the interface are both nil.

For a detailed explanation of the mechanics behind Go’s interface implementation, Russ Cox has a great post on his blog.

The future of typed nils in Go 2

Typed nils are an entirely logical result of the way dynamic types, aka interfaces, are implemented, but are almost never what the programmer wanted. To tie this back to Russ’s GopherCon keynote, I believe typed nils are an example where Go fails to scale for programming teams.

This explanation has consumed 700 words–and several hours over chat today–to explain, and in the end my co-workers were left with a bad taste in their mouths. The clarity of interfaces was soured by a suspicion that gotchas like this were lurking in their codebase. As an experienced Go programmer I’ve learnt to be wary of the possibility of a typed nil during code review, but it is unfortunate that they remain something that each Go programmer has to learn the hard way.

For Go 2.0 I’d like to start the discussion of what it would mean if comparing an interface value to nil considered the value portion of the interface such that the following evaluated to true:

var b *bytes.Buffer
var r io.Reader = b
fmt.Println(r == nil)

There are obviously some subtleties that this pithy demand fails to capture, but a desire to make this seemingly straight forward comparison less error prone would, at least in my mind, make Go 2 easier to scale to larger development teams.

Should Go 2.0 support generics?

A long time ago, someone–I normally attribute this to David Symonds, but I can’t be sure he was the first to say it–said that the reason for adding generics to Go would be the reason for calling it Go 2.0. That is to say, adding generics to the language would be half baked if they were not used throughout the standard library. I wrote about this in a series of blog posts where I explored what I felt would be the repercussions of integrating templated types into Go.

Do I think Go should have generics? Well, there are really two answers to that question.

As I argued in my Simplicity Debt posts, mainstream programmers in 2017 expect a set of features in their languages. Many of us work in polyglot environments. Even if we want to be writing in Go as much as possible, there’s usually some Javascript, some CSS, some Python, maybe some Java, Swift, C#, PHP or even C++ in the project. Maybe this will change in the future, but right now, if you’re a commercial programmer working for a crust, every day you’ll touch a bunch of languages, so their differences tend to rub against one another.

  • Mainstream programmers expect static typing, not for performance, but for readability and maintainability–just look at what Typescript and Dart are bringing to Javascript, and Python’s formative efforts with optional typing.
  • Mainstream programmers expect concurrency. They expect to be able to do more than one thing at a time–just look at node.js and the compromises programmers were prepared to make to move away from heavy-weight thread per connection models. Go is obviously well positioned here.
  • Mainstream programmers expect some form of templated types because they’re used it in the other languages they interact with alongside Go.

So my first answer is: Go should have some form of generics because it is a mainstream, imperative, block scoped language and it is expected these days.

My second answer is if the designers of the language choose not to add templated types or parameterised functions–and keep in mind that I am not one of the language designers, only an exuberant fan–because, as I wrote in my series of posts, the repercussions for the simplicity and readability of the language may prove too jarring. If that were to happen, my recommendation would be that Go should own that decision.

What do I mean by that? Well, the best explanation I can give is a counterexample. Let’s look at Haskell. Haskell is what most functional programmers consider to be the baseline for a real FP language, and thus it looks pretty much like nothing programmers schooled in imperative, side effect ridden, block structured, languages are used to. But Haskell programmers own that. They own their difference, they don’t see it as a reason to make their language work more like PHP, or C++, or Rust, or even Go, and they are happy to explain the Haskell way of doing things to anyone who asks. My point is that if Go is not going to have a story for templated types, then we need to own it, just like Haskell programmers own their decisions.

This isn’t simply a case of saying “nope, sorry, no generics for Go 2.0, maybe in another 5 years”, but a more fundamental statement that they are not something that will be implemented in Go because we believe there is a better way to solve the underlying problem. Note that I did not say a better way to implement a templated type or parameterised function, but a better way to solve the underlying business problem. There is a difference.

This isn’t without precedent, Go was one of the first C style languages to eschew type inheritance, a decision which lead to a radical simplification of the language and a focus on the mantras of communicating intent via interfaces, and encapsulation over inheritance. Before Go, it was assumed that a mainstream language would have classes and a type hierarchy, nowadays that is less true.

So, should Go 2.0 have generics? If the decision is to add them then I’m sure it can be done, after all the syntax is the least important part of the decision, and there is a wealth of prior art in other languages to guide us. However, if the decision is not to add templated types, then it should be made so explicitly. Then it is incumbent upon all Go programmers to explain the Go Way of solving problems.

How to find out which Go version built your binary

This is a short post describing the procedure for discovering which version of Go was used to compile a Go binary.

This procedure relies on the fact that each Go program includes a copy of the version string reported by runtime.Version() . Linker magic ensures that this value will be present in the final binary irrespective of whether runtime.Version() is called by the resulting program. The value in question is stored in the runtime.buildVersion variable and can be recovered by a debugger.

The rest of this post describes the mechanisms for recovering the contents of runtime.buildVersion on various platforms.

Linux/FreeBSD/OpenBSD/NetBSD

If you’re on a Linux or *BSD platform, you can recover the binary build version with gdb.

% gdb $HOME/bin/godoc
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
(gdb) p 'runtime.buildVersion'
$1 = 0xa9ceb8 "go1.8.3"

Darwin

The debugging situation on OS X isn’t great, but here are several options.

gdb

gdb was removed from the XCode toolchain following the switch from gcc to llvm. If you are running a version of XCode that has gdb, you should used the instructions from the previous section.

Delve

Delve can be used to print the value of runtime.buildVersion.

% dlv exec $HOME/bin/godoc
Type 'help' for list of commands.
(dlv) b main.main
Breakpoint 1 set at 0x15596eb for main.main() ./golang.org/x/tools/cmd/godoc/main.go:156
(dlv) c
> main.main() ./golang.org/x/tools/cmd/godoc/main.go:156 (hits goroutine(1):1 total:1) (PC: 0x15596eb)
   151:                 }
   152:         }
   153:         log.Fatalf("too many redirects")
   154: }
   155:
=> 156: func main() {
   157:         flag.Usage = usage
   158:         flag.Parse()
   159:
   160:         playEnabled = *showPlayground
   161:
(dlv) p runtime.buildVersion
"go1.8.1"

lldb

Christian Witts reports on Twitter that XCode 8.3.3 ships with a version of lldb, version 370.0.42, that can interpret the Go string syntax.

$ lldb $HOME/bin/godoc
(lldb) b main.main
(lldb) run
(lldb) p runtime.buildVersion

I’ve tested earlier versions of lldb and found they do not work. Instread, use delve

Windows

Good news, everyone. Brian Ketelsen of GopherCon and GoTime.fm fame, reports that delve works perfectly on Windows for recovering this binaries’ build version.

PS C:\Users\bkete\go\src\http://github.com \derekparker\delve\cmd\dlv> dlv exec C:\Users\bkete\go\bin\dlv.exe
Type 'help' for list of commands.
(dlv) b main.main
Breakpoint 1 set at 0x8ec666 for main.main() c:/Users/bkete/go/src/github.com/derekparker/delve/cmd/dlv/main.go:11
(dlv) c
> main.main() c:/Users/bkete/go/src/github.com/derekparker/delve/cmd/dlv/main.go:11 (hits goroutine(1):1 total:1) (PC: 0x8ec666)
     6: )
     7:
     8: // Build is the git sha of this binaries build.
     9: var Build string
    10:
=>  11: func main() {
    12:         http://version.DelveVersion.Build  = Build
    13:         http://cmds.New ().Execute()
    14: }
(dlv) p runtime.buildVersion
"go1.8.1"

If someone wants to figure out the correct WinDbg or Visual Studio Debugger incantation, please let me know and I’ll link to you from this post.