UNIX V4 tape successfully recovered: First ever version of UNIX written in C is running again
Computer History Museum software curator Al Kossow has successfully retrieved the contents of the more-than-half-a-century-old tape found at the University of Utah last month.
UNIX V4, the first ever version of the UNIX operating system in which the kernel was written in the then-new C programming language, has been successfully recovered from a 1970s nine-track tape. You can download it from the Internet Archive, and run it in SimH. On Mastodon, "Flexion" posted a screenshot of it running under SGI IRIX.
Last month, we wrote about the remarkable discovery of a forgotten tape with a lost early version of Unix, found by Professor Robert Ricci at the Kahlert School of Computing at the University of Utah. At the time, we quoted the redoubtable Kossow, who also runs Bitsavers, as saying that it "has a pretty good chance of being recoverable." Well, he was right, and at the end of last week, he did it. Ricci also shared a video clip on Mastodon.
There's a video of the recovery here. It's only slightly over five minutes long, but then, UNIX V4 wasn't very big yet: for instance, the kernel was some 27 kB of code.
The data was recovered using the readtape program by the Computer History Museum's Len Shustek. Since few of us in the microcomputer world deal with magtapes much, the Greaseweazle tool for archiving old floppy disks might be a more familiar comparison: rather than trying to copy the bytes or sectors from media – in other words, the processed digital data – both readtape and Greaseweazle sample and record the raw magnetic flux variations. Those can then be used to reconstruct the digital data, making some error recovery possible. In this case, only two blocks wouldn't read properly, and enough data survived to reconstruct their contents.
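To make the idea of decoding from raw flux concrete, here is a minimal Python sketch. It is a toy, not how readtape or Greaseweazle work internally: it assumes a single track, an NRZI-style encoding in which a flux transition inside a bit cell means 1 and no transition means 0, and transitions that fall nominally mid-cell. The function name `decode_nrzi` and its parameters are hypothetical.

```python
def decode_nrzi(transition_times, cell, n_bits):
    """Toy decoder: map flux-transition timestamps to bits.

    Assumption: a transition anywhere inside bit cell k means
    bit k is 1; a cell with no transition is 0.
    """
    bits = [0] * n_bits
    for t in transition_times:
        index = int(t // cell)      # which bit cell this transition fell in
        if 0 <= index < n_bits:     # ignore anything outside the record
            bits[index] = 1
    return bits


# Transitions recorded with a little timing jitter, nominal cell length 1.0
clean = [0.48, 1.55, 3.52, 6.41]
print(decode_nrzi(clean, 1.0, 8))    # [1, 1, 0, 1, 0, 0, 1, 0]

# The same data read 10 percent slow: every timestamp is stretched
slow = [0.528, 1.705, 3.872, 7.051]
print(decode_nrzi(slow, 1.0, 8))     # wrong: [1, 1, 0, 1, 0, 0, 0, 1]
print(decode_nrzi(slow, 1.1, 8))     # right: [1, 1, 0, 1, 0, 0, 1, 0]
```

The last two calls show why keeping the raw timings matters: a bad first decode is not final, because the same capture can be decoded again with a better estimate of the cell length. The price is that the capture is far larger than the data it encodes.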
So, in the recovered files on the Internet Archive, you can see there's a 1.6 gigabyte file created from a tape that only held 40 MB or so of data. You probably don't want to download that.
Fortunately, Angelo Papenhoff offers a processed version, complete with a README telling you how to run it. For further guidance, on Reddit, drop_table_allusers suggests:
It's very small: it contains around 55,000 lines of code, of which about 25,000 lines are in C, with under 1,000 lines of comments. But then, the late Dennis M. Ritchie and co-creator Ken Thompson were very definitely Real Programmers, and as is recorded in ancient wisdom:
Thompson is still active, and recently did his second oral history interview with the CHM, aptly titled A Computing Legend Speaks.
We've seen a lot of misunderstanding of this and what it represents online, so we'll try to set it in some kind of context.
The very first version of Unix, later known as the "Zeroth edition", was hand-coded in assembly language by Thompson in 1969. He wrote it for a spare PDP-7 at Bell Labs, a Digital Equipment Corporation minicomputer from 1965. The PDP-7 was an 18-bit machine: it handled memory in 18-bit words. This was so long ago that things like the eight-bit byte had not yet been standardized. PDP-7 UNIX was reconstructed from printouts between 2016 and 2019.
It did well enough that a few years later, Thompson got his hands on a PDP-11. Thompson rewrote his OS for this 16-bit machine – still in assembly language – to create UNIX First Edition. At first, the machine had a single RS11 hard disk, for a grand total of half a megabyte of storage, although the rebuilt source code is from a later machine with a second hard disk.
UNIX V4 in all its minuscule glory, running inside SimH on macOS 12.
It was followed later the same year by UNIX V2, still on a PDP-11/20. As we reported earlier this year, something like a beta version of UNIX V2 has recently been reconstructed.
UNIX V3, which introduced the new feature of pipes, followed in 1972. This was the version on which the then-new C programming language was first written.
Now, the long-lost UNIX V4 has been found and its files recovered. This was the first version where much of the kernel was rewritten in C. UNIX V4 only ran on a higher-end model of PDP-11, the PDP-11/45.
These days, the nature and evolution of UNIX in its early stages is not well understood. For instance, some things happened simply because Ken and Dennis were working on very limited hardware. At one point, they had a single DEC RK05 hard disk, which held a massive 1.5 MB. When they got access to a second hard drive, they moved all the home directories to it. These were held in /usr – it was short for "users," and it contained the home directories of both ken and dmr, meaning that it also contained most of the binary files – the actual programs constituting the OS itself. This caused them a problem: how do you mount the second hard disk when the mount command itself is on that disk? The solution: a special /sbin directory on the first hard disk which contained tools needed to, among other things, access any additional hard disks.
Rob Landley wrote an excellent explanation of the history of the split on the BusyBox mailing list 15 years ago: Understanding the bin, sbin, usr/bin, usr/sbin split. Landley knows his stuff: he's the author of Toybox, which is a replacement for Bruce Perens's BusyBox multi-command binary – as used in Alpine Linux.
Today, it's part of Unix lore that there's an important functional distinction between the binaries in the root directory (/bin, /sbin, /lib and so on) and those kept under the /usr tree (/usr/bin, /usr/sbin, /usr/lib and so on). Trying to reconcile this split is a process called the usr merge, and amusingly, the latest Alpine Linux 3.23 has not yet completed it, although it was planned to.
UNIX started out as a quick hack by two geniuses in their spare time, so that they could use a spare computer – an extremely rare thing in the 1960s – to run a simulation game, written by one of them, that flew the player around a 2D model of the Solar System. It was called Space Travel [PDF].
The project they were working on for their day jobs, the MULTICS operating system, became unfairly famous for being huge and overcomplicated. In fact, it was used for years, and is missed by its former users.
Those two geniuses, Ken and Dennis, wrote something tiny and simple, and in keeping with that, they used tiny cryptic abbreviations and very short file and directory names. Their colleague, the great Brian Kernighan – the "K" in "K&R C" and "AWK" – even suggested the name UNICS as a joke.
The problem is that this tiny, experimental OS escaped the lab. Version 6 got out and became the basis of the famous Lions book, the source of probably the most famous code comment of all time:
/*
 * If the new process paused because it was
 * swapped out, set the stack level to the last call
 * to savu(u_ssav). This means that the return
 * which is executed immediately after the call to aretu
 * actually returns from the last routine which did
 * the savu.
 *
 * You are not expected to understand this.
 */
(It's at line 2238 in the annotated source code, if you're curious. The comment itself inspired a book.)
Unix V7 is the one that did the real damage: it went viral and its descendants, offshoots, and rewrites were widely adopted by industry and academia.
Now, it has grown into a bloated mess millions of times bigger than the OS which inspired it. Those jokey cryptic filenames in cryptic folders are now enshrined as holy writ, and the people maintaining the systems have forgotten their origins.
Meanwhile, the original developers kept working away, improving it and refactoring it and simplifying the design, all the way up to the Tenth Edition – then, it was radically rewritten to become the network-aware Plan 9 from Bell Labs. Today, work on that continues as 9front, which we've written about, including its place in history.
Now, though, a crucial early evolutionary step has been found and imaged, and it works. It's almost as if it were a Christmas miracle! ®
