Tag Archives: retrocomputing

Make your own Apple 1 replica

Woot! This project was featured on Hackaday.

mega6502, a big mess of wires

No Apple 1 under the tree on Christmas Day ? Never mind, with a 6502 and an Arduino Mega 2560 you can make your own.

The Apple 1 was essentially a 6502 computer with 4k of RAM and 256 bytes of ROM. The inclusion of a 6821 PIA and a Signetics video encoder meant that the Apple 1 shipped with its own 2400 baud dumb terminal built in. Just supply your own keyboard, composite monitor, and you were in business.

The good news is we can emulate the RAM, ROM, PIA, and all the glue logic with an Arduino.

The hardware

To validate the idea that an Ardunio could provide a stable clock for the 6502, I started by breadboarding the project.

6502 strapped for $EA

The result was a success, with a tight assembly loop I was able to generate a 1Mhz clock with a roughly 50% duty cycle. So it was on to a prototype.

Prototype 6502 “sidecar”

The protoshield has 0.1 inch connectors for the 40 pins on the 6502 and the 40 something pins on the Ardunio Mega’s expansion header allowing me to jumper between the 6502 and the Arduino. The strange jumper block presents $EA on the data bus unconditionally, this is called free running mode.

Mega6502 prototype

Because I wanted to use an LCD panel for debugging and the patch wires on the protoshield would not fit under the LCD shield I mounted the shield backwards and upside down, which retained the same pin outs (including 5v on the top). I called this prototype design the “sidecar”.

Sidecar wiring in detail

The schematic for wiring the sidecar to the Arduino is detailed in the README file.

The software

At the moment the software is a simple Arduino IDE sketch, you can find it on Github.

Clock

The Arduino provides the ϕ0 clock signal as part of the main loop() function. The 6502 interacts with the outside world on the falling edge of this clock (actually a few ns after ϕ0, the falling edge of ϕ2). It produces the address and read/write signals on the rising edge of ϕ0.

Different 6502 models have different requirements for the minimum and maximum length of each phase of ϕ0. The original NMOS 6502 required a clock of at least 100khz to avoid loosing internal CPU state, which made single stepping more complicated. With the Rockwell 65c02 I am using the ϕ0 low phase must not exceed 5μs, but the clock signal can remain high indefinitely (the fully static WDC 6502 removes any restriction on a minimum clock).

We can use this property to generate a stable ϕ0 low around 500 ns (the minimum instruction time on a 16Mhz Atmega is 62.5ns), then raise ϕ0 and do our processing, even take an interrupt. Because I have the 4Mhz 65c02 version, we can even make the ϕ0 low period shorter, to allow our high pulse to take longer in an effort to reach the 1Mhz clock target.

Laughton Electronics has published a fantastic page if you want to learn more about the 6502 timings.

Ram

The Apple 1 divided the 6502’s address space into 16 4k banks which could be assigned to RAM, ROM, or I/O.

The 2560 includes 8kb of SRAM, so we dedicate 4k to be the bottom bank of ram starting at $0000, which is more than enough for a usable replica. For Apple 1’s with 8kb of ram, the second bank of ram was usually strapped to $E000 and used for BASIC. The nice property of this is we can replace the $E000 bank with a ROM image mapped to that location (BASIC did not expect to be able to write to memory at $E000) and achieve the same effect without providing another 4k of RAM.

ROM

The original 256 byte Woz monitor rom is provided at $FF00. For simplicities sake the ROM is mirrored at every page in the $F000 address space.

I have tested a few of the popular ROM images like A1Assembler and Applesoft-lite but only include the Woz monitor rom in the source distribution. Enterprising readers should have little difficulty modifying the source code to include additional ROM images.

Input and output

The Apple 1 interfaces to the keyboard and screen via four registers from the 6821 PIA chip mapped into the address space at $D000.

When a key is pressed on the keyboard, the high bit of $D011 is latched high, this can be detected by the 6502 ROM monitor which then reads $D010 to fetch the keycode, which is conveniently encoded as 7bit ASCII.

Output is similar, the 6502 polls $D013 until the PIA reports that the video encoder is not busy then the character to write to the screen is placed in $D012.

It is straight forward to map these reads and writes to these addresses to the Arduino serial port. Again for simplicity, the PIA is mirrored to every page in $D000.

The speed

Like my previous projects, performance is always a problem. Assuming a 50% duty cycle for the ϕ0 clock, a 16Mhz Atmel has 8 cycles to decode the address and read/write the data. This is basically impossible. However, as I am using a Rockwell 65c02 cpu, which is CMOS, and a higher speed grade than the original NMOS based 6502, we can cheat and shorten the ϕ2 low, trading that time for a longer ϕ2 high pulse.

Just shy of 300khz

Using my trusty Bitscope Micro, I can probe the ϕ2 clock. You can see the asymetry between the high and low phases. The high phase is currently 2.8μs, or around 45 cycles for the Arduino. This equates to a clock speed of just under 300khz, which is very usable.

Demo time

Here is a short video showing the mega6502 running a short BASIC program in debug mode.

Here is a screen capture showing David Schmenk’s 30th birthday demo for the Apple 1.

Next steps

More tweaking of the decode logic to try to reduce the ϕ2 high period.
Implement a faux cassette interface possibly using the SD card for cassette storage
A new design using an Atmega1284P — a minimalistic two chip SBC 6502 solution, assuming I can find a bootloader that works.

Resources

If you liked this project, check out some other fantastic 6502 projects.

Project:65
Quinn Dunki’s fantastic Veronica. We are not worthy!
PDA6502, Paul has designed his own 6502 solution from scratch.

avr11: simulating minicomputers on microcontrollers

Introduction

The avr11, a atmega2560 clone with a custom SPI 256kb memory.

The avr11, an atmega2560 clone with a custom SPI 256kb memory.

It all started with Javascript.

In April of 2011 Julius Schmidt wrote a PDP-11 emulator that ran in a browser. I thought that this was one of the most amazing thing I had ever seen.

Late last year I ran across the link again in my Pocket backlog and spent a little time poking around the code that powered the simulator.

The PDP-11 architecture is very interesting. All the the major system components work asynchronously, co-ordinating access via the shared UNIBUS backplane. Julius’ simulator used this property and callbacks on timers to simulate components like the disk and console operating asynchronously.

I’ve written previously about the suitability of Go for writing emulators and so I thought it would be a fun project to port Julius’ work to Go. This port is going well and provided the base for porting the code to C++ for this project.

Around the same time I ran across several other projects which convinced me to try an Atmel port.

The first was the winner of last years IOCCC, a 4k Intel 8086 simulator¹ which showed me that the core of a CPU simulator could be small. Ignoring main memory, you need only simulate the small number of registers defined by the architecture. The PDP-11 has 8 16bit registers, a few service registers and 32 16 bit mmu registers² to hold mappings. This would fit even in a small microcontroller like the 328p or 32u4.

The second project was a hardware implementation of a 32bit ARM CPU running linux on an Atmel 128p. Probably built as a dare, this project showed me that reliable simulators can be built on the 8bit Atmel platform and an external device could be used to represent the larger main memory expected by the CPU being simulated.

Maybe this wasn’t as crazy as it sounded.

Why the PDP-11 ?

PDP-11/45 console. Image courtesy of John Holden’s PDP-11 page.

The PDP-11 is the most important minicomputer of the 1970’s.

The cost of the PDP-11 was low enough that it could be dedicated to one person or small group when mainframe computers cost so much to run that time was billed by the hour to recoup their phenomenal cost. DEC machines quickly became the platform for research and experimentation.

The PDP-11 was the machine that Ken Thompson and Dennis Ritchie developed Unix and the C programming language. As a historical artefact it has tremendous importance for anyone who is interested in computer programming or retro computing.

If you want to know why the int datatype in C defaults to 16 bits, look at the PDP-11, it was a 16 bit computer. If you want to know why a char is called a char not a byte look to way the PDP-11 stored two characters in a word.

So, what better way to learn about the PDP-11, and the history of C and Unix, than to build a simulator of the machine that started it all ?

Why build a simulator on a microcontroller?

Let’s be honest, the world doesn’t need another Arduino weather station.

Apart from wiring up LED and LCD displays I haven’t really found anything that really excites me about using microcontrollers. In some ways, the way that microcontrollers are coded and used feels very batch oriented — write some code, compile it, load it onto the micro, wait quietly to see it it worked, rinse, repeat.

I had heard a Podcast interview with one of the makers of the Pebble smart watch and learnt about the Pebble OS, a fork of FreeRTOS.

FreeRTOS looked like as a way to write interactive programs on microcontrollers, and those ideas were percolating inside my head while I was porting Julius’ Javascript simulator to Go over Christmas.

One of the nice features of Julius’ simulator was the very clean separation between the CPU logic and the memory logic. In the PDP-11 memory is just another device on the UNIBUS bus and so I began to think that if I could find some way of connecting the required 128 kilowords of memory to an Atmel the rest of the simulator should fit within the onboard SRAM.

What works today

A screenshot of one of the first successful boot attempts.

Today the simulator boots V6 unix and can execute some simple commands, there are some remaining bugs in the mmu which cause the simulator to fail when larger programs (/usr/bin/cc and /usr/games/chess for example) are executed

The hardware emulated is somewhere between a PDP11/40 and PDP11/45. The EIS option (MUL and DIV) is properly emulated, but FIS (floating point is not).

Only a single RK05 drive is simulated, backed by a file on the micro SD card.

I want to improve the accuracy of the simulator so that it can run V7 unix, 2.9/2.11 BSD, RSX-11M and most importantly, the DEC diagnostics.

All these improvements are first developed on the Go port then will be integrated into the Atmel port.

How is the performance?

Right now, not great. I don’t have a real PDP-11 for comparison, but looking at some videos on Youtube I’d have to say the simulator is at least 10x slower than an original 11/40.

I was never expecting to be amazed with the speed of this simulator, especially at this early stage. However, on a performance per watt basis, I think it’s hard to beat avr11.

The PDP-11 that this simulator models is spartan, even by the standards of the early 70s, yet still consumed over 2 kilowatts of power for the CPU and Memory (256kb). The 2.5 megabyte RK05 boot drive was another 600 watts. Real unix installations would have 3 or more drives, so there goes another 1200-1800 watts.

Compared to that, the avr11 draws well under the 500ma limit of a USB port. Although I lack equipment to measure the current draw I estimate it to be around 100ma at 5 volts which is 0.5 watts. Tell that to your data center manager.

Some specific performance issues I am aware of are:

32 bit register operands. Everything in the CPU is treated as a 32 bit integer during the simulation of the individual instructions. This is a hold over from the original Javascript simulation. The Atmel is internally an 8 bit CPU. There are provisions for 16 bit, double byte operations, but they come at a cost of 5x over their 8 bit counterparts. Using 32 bit operands are even more costly again.
SPI memory. The overhead of the SPI transaction to read a word from memory is quite high. As most instructions generate at least 2 memory cycles, and can be as high as 6, fetching operands from memory consumes a lot of wall time.

Getting the code

Julius’ original Javascript implementation, from which both my Go and Atmel ports derive, is licensed under the liberal WTFPL licence. As I share similar views on software licensing, both of my projects will be similarly licensed.

The code itself is on Github. As of today it is frankly a dog’s breakfast of babies first C++ programming and Arduino hacks. As the project progresses I hope this will improve.

Ethermega and SPI SRAM shield (unpopulated).

However the code itself isn’t much use without the hardware. Again this is under heavy flux and I hope to improve the number platforms the simulator can run on.

The specific hardware requirements are an Atmega2560 (the older 1280 will also work, I just don’t have one to experiment with). I’m using the Freetronics Ethermega, mainly because it was what I could buy at Jaycar, but also because it has a built in micro SD card reader onboard.

The SPI SRAM shield is custom, and I’ll be writing about it in my next post.

The author of 8086tiny has recently released a de-obfuscated version of their emulator.
Actually Julius’ emulator cheated, there are actually 3 banks of 16 mmu registers in a real KT11 mmu, but V6 unix doesn’t use supervisor mode, so we can cheat a little here.

Go, the language for emulators

So, I hear you like emulators. It turns out that Go is a great language for writing retro-computing emulators. Here are the ones that I have tried so far:

trs80 by Lawrence Kesteloot

I really liked this one because it avoids the quagmire of OpenGL or SDL dependencies and runs in your web browser. I had a little trouble getting it going so if you run into problems remember to execute the trs80 command in the source directory itself. If you’ve used go get github.com/lkesteloot/trs80 then it will be $GOPATH/src/github.com/lkesteloot/trs80.

GoSpeccy by Andrea Fazzi

GoSpeccy was the first emulator written in Go that I am aware of, Andrea has been quietly hacking away well before Go hit 1.0. I’ve even been able to get GoSpeccy running on a Raspberry Pi, X forwarded back to my laptop. Here is a screenshot running the Fire104b intro by Andrew Gerrand

Fergulator by Scott Ferguson

Like GoSpeccy, Fergulator shows the power of Go as a language for writing complex emulators, and the power of go get to handle packages with complex dependencies. Here are the two commands that took me from having no NES emulation on my laptop, to full NES emulation on my laptop.

lucky(~) sudo apt-get install libsdl1.2-dev libsdl-gfx1.2-dev libsdl-image1.2-dev libglew1.6-dev libxrandr-dev
lucky(~) % go get github.com/scottferg/Fergulator

sms by Andrea Fazzi

What’s this? Another emulator for Andrea Fazzi ? Why, yes it is. Again, super easy to install with go get -v github.com/remogatto/sms. ~~Sadly there are no sample roms included with sms due to copyright restrictions, so no screenshot.~~ Update: Andrea has included an open source ROM so we can have a screenshot.

Update: Several Gophers from the wonderful Go+ community commented that there are still more emulators that I haven’t mentioned.

dcpu by Jim Teeuwen, an implementation of Notch’s DCPU specification.
chip-8 by Yves Junqueira, which implements the CHIP-8 virtual machine used in some mid 1970’s home consoles.

Dave Cheney

The acme of foolishness