Stack limit

Some weeks ago I was involved in a discussion about how the leds on the 11/70 should work – more specifically, how to deal with the differences between the ‘real’ hardware and Oscar Vermeulen’s PiDP-11 console. I had what I thought was a rather nice idea, but it wasn’t received very well – and, given the lack of real and working 11/70 systems about, it wasn’t very easy to find out what was ‘right’. So I thought about it for a bit, and decided to focus on something else for a while. The idea I came up with was to revisit the old 11/70 test programs. The first one in the set, EKBA, I had already managed to run flawlessly, but I hadn’t spent much time with the second one, EKBB.

Maybe it was my subconscious playing a trick, because EKBB is – as far as I know, at least – the only program that includes a console test… and that very quickly proved that my nice idea for the console wasn’t working at all. But after reverting it, I then decided to have a look at all the other error messages that EKBB was throwing at me. Mainly these were in two categories – DIV condition codes and stack limit behaviour.

The DIV condition codes caused a bit of a headache – again, I should probably say, but because the last time I looked at this bit of the core was already 8 years ago, the details had slipped a bit. I did manage to fix a couple of the errors – error messages, I should probably add, because functionally it already worked correctly; what the tests do is determine at which point the DIV instruction is aborted in a number of edge cases such as divide by zero.

More interesting really were the errors about the stack limit.

Originally, PDP2011 was going to be something like an 11/94 – since that seemed the most interesting model, and also I had the listings for the ZKDJ test that showed the actual workings of the machine in much greater detail than the manuals did. When that worked, I added the other models mostly by selectively disabling bits of the core for 11/94. The 11/94 may have the highest number, but it isn’t in all respects the most complex machine – one exception is the stack limit, which is a lot more complex, and different, in the 11/70 (and the 11/45 too, for that matter). The version that I had implemented might pass the tests for the J-11 cpus, 11/34 and 11/44, but it certainly did not pass the tests for the 11/70.

Turns out I had misunderstood quite a bit of the way it is supposed to work. For instance, I had assumed that the initial (after reset) value of the stack limit register should be 400. But actually, it should be 0 – and the limit of 376 and 340 above that is arranged in logic. And, even more of a surprise, the stack limit mechanism also protects the PSW from being overwritten by the stack – and in the 11/45, even more internal registers are.
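As a rough illustration of the zones described above, here is a minimal sketch in Python. It assumes that only the high byte of the stack limit register is used, that references below the limit plus 400 are at least a yellow violation, and that references below the limit plus 340 are red; the function name is hypothetical, and the VHDL in PDP2011 is structured differently.

```python
# Hypothetical model of the stack limit zones; all constants are octal.
YELLOW_TOP = 0o400   # assumption: references below limit + 400 are at least yellow
RED_TOP = 0o340      # assumption: references below limit + 340 are red


def stack_zone(address, slr=0):
    """Classify a stack reference against the stack limit register."""
    limit = slr & 0o177400          # assumption: only the high byte is used
    if address < limit + RED_TOP:
        return "red"
    if address < limit + YELLOW_TOP:
        return "yellow"
    return "ok"


# With the register at its reset value of 0, the stack is fine from 400 up.
for addr in (0o400, 0o376, 0o336):
    print(oct(addr), stack_zone(addr))
```

With the register left at 0, this gives the same protected area as a register value of 400 would in a simpler scheme, which may be why the wrong reset value went unnoticed for so long.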

So as usual the interesting question is, how can it be that PDP2011 systems of all possible configurations – including quite a few 11/70 – have run since November 2011 and this bug wasn’t ever found? I’d speculate that the operating systems usually don’t run into yellow or red stack traps – and if they do, it is probably a secondary effect of something else going wrong. If that is the case, the errors wouldn’t be recoverable, and then it doesn’t matter so much if the trap mechanism doesn’t work entirely as it should.

Anyway, it’s fixed now. Not that EKBB now passes flawlessly – there are a few puzzles left in it. But first up now is to go back to the console and the leds. And oh, what I almost forgot – thanks to Jörg Hoppe for providing the listings of these tests, and obviously Al Kossow too for everything Bitsavers. None of this would be possible without freely accessible documentation.

Booting 11/70

Booting a PDP-11 is something that I have maybe overlooked a bit, in trying to approach the ‘real thing’. In the history of PDP2011, it’s not that long ago really that I added the M9312 boot roms option – and I really still prefer my own boot code. Partly because PDP2011 can be different models with different peripherals according to the configuration, and my own boot code shows what has been configured. Easy to keep track of what kind of system is in an FPGA that way. And also faster.
However, the M9312 boot roms and the monitor included in them have certainly proved their value. For one, the monitor works with Jörg Hoppe’s PDP11GUI, allowing loading and dumping of memory and disks – which, if nothing else, opens up the possibility of easily copying disk images to and from a PDP2011 system without physically removing SD cards – to the point that I’m now actually considering other flash memory media besides SD cards. For instance because those other flash media could potentially be a lot cheaper. Very interesting for designing a shim board for connecting to Oscar Vermeulen’s console, potentially adding a couple of low cost flash components for the disks instead of the SD card connectors…
And that brings me to the subject that’s been on my mind. Because Oscar’s console is very definitely an 11/70, and 11/70 is one of the models of PDP11 that had a very specific different way of booting – even if it had the same regular M9312 in it, there was a specific boot rom for it, with its specific tests, without the monitor, and with a specific method of booting.
In general, PDP11 hardware would ‘boot’ using the power fail trap vector 024, and 11/70 follows this pattern. The M9312 would then detect the boot, and during the memory cycles to fetch the trap vector and PS the M9312 would ‘wire-or’ the values set on its micro switches to make the processor load the vector from its boot roms – one bigger 512 word rom for diagnostics and the monitor, or in case of the 11/70 for diagnostics only; and 4 smaller 128 word roms for booting from many different peripherals – at least 20 different types of disks supported by DEC alone, but also magnetic tape units, paper tape readers, punch card readers, and network cards – serial and Ethernet. So the most important thing about booting for an 11/70 administrator would be to choose the device boot roms, and set the switches (the ‘address offset switch bank S1’) on the M9312 card to the preferred boot medium – or to the diagnostics.
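The wire-or step can be pictured as a bitwise OR on the bus lines. A minimal sketch in Python, assuming a rom word of 173000 and a switch offset of 004; both values are purely illustrative and not taken from the M9312 documentation, and the function name is made up:

```python
def wired_or(*drivers):
    """Model a wired-or bus: the result is the OR of everything driven onto it."""
    value = 0
    for driven in drivers:
        value |= driven
    return value & 0o177777          # keep the result to 16 bits


rom_word = 0o173000        # illustrative value supplied by the boot rom
switch_offset = 0o000004   # illustrative value set on the switch bank
start_pc = wired_or(rom_word, switch_offset)
print(oct(start_pc))
```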

The switches on the M9312 card could be used to make the cpu start the boot code in one of the small roms directly – most likely booting your preferred operating system from one of the hard disks. But you could also set it to start at the diagnostics code in the lower rom, and then use the console switches to determine which location in the smaller device specific boot roms to jump to.

And that’s what I’ve been thinking about – how to do that for PDP2011. With Oscar’s console, it is obviously possible to let the system boot in the 11/70 console style, have it run the specific 11/70 rom with its diagnostics, and then boot from one of the roms according to the setting on the console switches. But… well, I’m not sure what that would add. Booting the fpga PDP11 isn’t really going to be ‘really’ the same as the ‘real thing’, ever – for one, the power fail vector doesn’t really make sense like it did in the days of core memory. Nor can an fpga do wired-or – sure, I can work my way around that with another multiplexer implied from another if statement in the code, but it’d seem a bit awkward, and especially in the address calculation, I can really do without another layer of complexity. It wouldn’t make the system run any faster, that’s for sure.
So, what is on my mind is that while it isn’t really the way that a ‘real’ 11/70 would boot, the M9312 monitor is actually probably better in terms of flexibility and user friendliness. And for those that would rather directly boot from a specific device without bothering with console interaction, it’s fairly easy to do so by reconfiguring the start address in the PDP2011 code – basically, today’s equivalent of setting switches on the M9312 card. And, well, obviously it would be possible to add physical switches on a shim card, or use switches that are on the fpga board, or some combination – but on the target board that I’m working on now, the de0nano, there aren’t enough switches and I’d rather use them for something else probably.
In short, I’m a bit hesitant to add the special /70 boot rom – I’m not saying that I never will, but I think the monitor rom is far easier and more flexible to use, and for fixed setups the old style pdp2011 roms will probably work better, maybe in combination with a changed start address in the top source file. If I’m going to add it, it’ll probably not be the default. That should be the monitor, I think.


After the summer, I’ve picked up work on the interface to Oscar Vermeulen’s PiDP11 console – what was left to do was the virtual settings on the address rotary switch and the actual values on the address and data lights. It mostly works now, and I’ve come to the point that I need to take a step back from it, let it rest for a while and come back to it in a couple of days, maybe a week or so – to avoid getting blind to the things that aren’t right yet. Meantime I’ve sent a preview to a beta tester, and I’m anxiously awaiting his comments…
So now it’s time to just play with the machine! And the first thing on my mind to dive into was Ingres. One of the oldest real relational database systems, and with a long and rich history. I knew it was included in 2.11BSD, but when I tried it out years ago when I first got 2.11BSD to run, it didn’t work… all the commands core dumped. So it needed a bit more work – and after quite a bit of tinkering and experimenting, it turned out to be quite easy – as usual if you know the answer. At first, I tried rebuilding the Ingres sources as root, but that doesn’t work quite right – it can be done, but it’s a lot easier to run the make as the ingres user.
So, what needs to be done is this:

  1. Reconfigure the kernel to include the Ingres lock driver – in other words, the INGRES option (on the last line of the config file) should be set to YES. And obviously then recompile the kernel, install it and reboot the machine – and all of that using root, as usual.
  2. Log in as the ingres user, change into the source directory, and run make – if you thought the kernel took a bit to recompile, well, this takes a bit longer.
  3. Change into the demo directory, and create the demo database by running ./demodb demo

And after that, the famous ‘emp’ tables are ready for use. One surprise though – I must have known this in the day, but I forgot – this version of Ingres doesn’t use SQL, but its own language: QUEL. So ‘select * from emp’ doesn’t work, I had to use some of the examples from the manual.

* range of e is employee
* retrieve (e.all)
* \g
Executing . . .
|number|name                |salary|manage|birthd|startd|
|   157|Jones, Tim          | 12000|   199|  1940|  1960|
|  1110|Smith, Paul         |  6000|    33|  1952|  1973|
|    35|Evans, Michael      |  5000|    32|  1952|  1974|
|   129|Thomas, Tom         | 10000|   199|  1941|  1962|
|    13|Edwards, Peter      |  9000|   199|  1928|  1958|
|   215|Collins, Joanne     |  7000|    10|  1950|  1971|
|    55|James, Mary         | 12000|   199|  1920|  1969|
|    26|Thompson, Bob       | 13000|   199|  1930|  1970|
|    98|Williams, Judy      |  9000|   199|  1935|  1969|
|    32|Smythe, Carol       |  9050|   199|  1929|  1967|
|    33|Hayes, Evelyn       | 10100|   199|  1931|  1963|
|   199|Bullock, J.D.       | 27000|     0|  1920|  1920|
|  4901|Bailey, Chas M.     |  8377|    32|  1956|  1975|
|   843|Schmidt, Herman     | 11204|    26|  1936|  1956|
|  2398|Wallace, Maggie J.  |  7880|    26|  1940|  1959|
|  1639|Choy, Wanda         | 11160|    55|  1947|  1970|
|  5119|Ferro, Tony         | 13621|    55|  1939|  1963|
|    37|Raveen, Lemont      | 11985|    26|  1950|  1974|
|  5219|Williams, Bruce     | 13374|    33|  1944|  1959|
|  1523|Zugnoni, Arthur A.  | 19868|   129|  1928|  1949|
|   430|Brunet, Paul C.     | 17674|   129|  1938|  1959|
|   994|Iwano, Masahiro     | 15641|   129|  1944|  1970|
|  1330|Onstad, Richard     |  8779|    13|  1952|  1971|
|    10|Ross, Stanley       | 15908|   199|  1927|  1945|
|    11|Ross, Stuart        | 12067|     0|  1931|  1932|

A little bit more complex example: calculating the average salary for the employees working for each manager:

* range of e is employee
* retrieve (e.manager, avgsal=avg(e.salary by e.manager))
* \g
Executing . . .
|manage|avgsal    |
|    10|  7000.000|
|     0| 19533.500|
|    32|  6688.500|
|    33|  9687.000|
|    13|  8779.000|
|    55| 12390.500|
|    26| 10356.333|
|   199| 11117.556|
|   129| 17727.667|

And then of course it would be nice to add another column with the name of the manager. Simple, add another range variable on the same table and match the number to the manager id:

* range of e is employee
* range of m is employee
* retrieve (m.name, e.manager, avgsal=avg(e.salary by e.manager)) where e.manager=m.number
* \g
Executing . . .
|name                |manage|avgsal    |
|Ross, Stanley       |    10|  7000.000|
|Smythe, Carol       |    32|  6688.500|
|Hayes, Evelyn       |    33|  9687.000|
|Edwards, Peter      |    13|  8779.000|
|James, Mary         |    55| 12390.500|
|Thompson, Bob       |    26| 10356.333|
|Bullock, J.D.       |   199| 11117.556|
|Thomas, Tom         |   129| 17727.667|

But, oops. Now we’ve lost manager 0 – because there isn’t a row for manager 0 in the table. Maybe 0 means that there isn’t one, and that it’s the big boss who has manager 0 in the table? That would seem right for J.D. Bullock – he fits all the stereotypes, being the oldest, and earning the most of all employees – and he started working in the company the day he was born. But there’s also Stuart Ross, who started a year later, and earns a lot less. So, I’m not sure – maybe the sample data is intentionally confusing.
Anyway, this case of missing rows in the last query is a nice example of what would be easy to lift out of the data with an outer join, but I have no clue how to do that in QUEL, or if it’s even possible. Nothing to be found in the manuals I’ve seen so far.
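For comparison, this is roughly what the outer join looks like in SQL. The sketch below uses Python with SQLite rather than Ingres, so it says nothing about what QUEL can or cannot do; the rows are copied from the employee table above (without the birth and start years), and the column name ‘manage’ is spelled out as ‘manager’. The function name is made up for the example.

```python
import sqlite3

# Rows copied from the employee table above: (number, name, salary, manager)
EMPLOYEES = [
    (157, "Jones, Tim", 12000, 199), (1110, "Smith, Paul", 6000, 33),
    (35, "Evans, Michael", 5000, 32), (129, "Thomas, Tom", 10000, 199),
    (13, "Edwards, Peter", 9000, 199), (215, "Collins, Joanne", 7000, 10),
    (55, "James, Mary", 12000, 199), (26, "Thompson, Bob", 13000, 199),
    (98, "Williams, Judy", 9000, 199), (32, "Smythe, Carol", 9050, 199),
    (33, "Hayes, Evelyn", 10100, 199), (199, "Bullock, J.D.", 27000, 0),
    (4901, "Bailey, Chas M.", 8377, 32), (843, "Schmidt, Herman", 11204, 26),
    (2398, "Wallace, Maggie J.", 7880, 26), (1639, "Choy, Wanda", 11160, 55),
    (5119, "Ferro, Tony", 13621, 55), (37, "Raveen, Lemont", 11985, 26),
    (5219, "Williams, Bruce", 13374, 33), (1523, "Zugnoni, Arthur A.", 19868, 129),
    (430, "Brunet, Paul C.", 17674, 129), (994, "Iwano, Masahiro", 15641, 129),
    (1330, "Onstad, Richard", 8779, 13), (10, "Ross, Stanley", 15908, 199),
    (11, "Ross, Stuart", 12067, 0),
]


def manager_averages():
    """Average salary per manager, keeping managers that have no row of their own."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table employee (number int, name text, salary int, manager int)"
    )
    db.executemany("insert into employee values (?, ?, ?, ?)", EMPLOYEES)
    # The left outer join keeps the group for manager 0, with an empty name.
    rows = db.execute(
        """
        select m.name, e.manager, avg(e.salary)
        from employee e left outer join employee m on e.manager = m.number
        group by e.manager, m.name
        order by e.manager
        """
    ).fetchall()
    db.close()
    return rows


for name, manager, avgsal in manager_averages():
    print(name, manager, round(avgsal, 3))
```

The row for manager 0 comes back with an empty name instead of disappearing, which is exactly what the QUEL query above loses.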

Things are moving!

Blinkenlights, for instance.
It’s so hard to believe that it’s already almost been 3 years since my last post here.
Well, I did have to hack my own site – I didn’t remember the admin password. Still, it’s not really like nothing did happen in the meantime, just nothing that I felt was finished enough to merit a post – like, the experimental work I did on the faster cpu. Or the ideas I had for adding sdhc support. And then there were the preliminary discussions on Oscar’s PiDP11, and whether or not I could interface my vhdl pdp11 to that. Somehow all of that was still in the not-quite-ready-for-posting stage until now… and maybe it’s showing my age, I like to make things public when they’re finished and real, even though the current fashion is to start shouting when you’ve just got a plan but can’t be sure if it’ll ever fly yet.
Anyway, it’s real enough now, I’ve got a few setups blinking their lights at me now. No, the vhdl for the console interface isn’t quite ready yet, but the tricky bits are done. Since the pinouts of Oscar’s PiDP11, and the Raspberry Pi interface to that, don’t quite match the pinouts of the fpga boards, it looks like there’ll have to be a converter board – a ‘shim’, we’re calling it for now. And since there will be a couple of pins left on the 40-pin interface, I’ll most likely add the bare essential peripherals to that shim too – most likely it’ll work out to just enough to connect a serial console and an sd card. Center of development now is the DE0-NANO board from Terasic – a big fpga with lots of IO connectors, a very good build quality, and widely available for about USD 80 – it’s unbeatable. But probably most of Terasic’s other boards will do fine as well, if they have two of the 40-pin connectors and if the fpga is big enough – I’m not sure that DE0 (without the -NANO) will still be big enough. Why two of the 40-pin connectors, you might ask, if the console is clearly using just one? Well, I’m planning for a peripheral board and that would use the other connector.
Development version of the console
No, that isn’t what the console will look like when it’s finished – it’s one of the pre-production boards that Oscar gave me to start development on, and what we call the ‘lab test animal’. Oscar fished it out of the bin for me to play with – the holes are not aligned correctly so it won’t fit nicely in the case, and the leds are not the right colour either, obviously – but it’s just fine for the development I’m now doing. Just to show you the setup that I’m now working on 😉
Functionally, most of the lights and switches already work. Most of the work still to be done is around the rotary switches (with the mmu console modes) and some of the lights that the fpga PDP11 never needed – such as the run/pause/master, for instance. That might still take lots of time, but it’s getting there.
The finished console of course has the nice switches and rich red leds, and the beautiful panel to hide the PCB behind. And of course the custom injection molded case… check Oscar’s site to see more.

That’s my setup that’s now running Oscar’s PiDP11 for comparison and for playing, obviously! And, note how the white lamp test switch is not quite aligned with the rest of the switches, that’s entirely my fault in being in too much of a hurry to build the kit…
So that’s it for today, and I’ll try to post a bit more regularly to keep you up to speed on what’s going on with the PDP2011. Over the next weeks I’ll be working to get the vhdl for the front panel working correctly, and also to get the design of the shim finalised. Hope to get that done before the summer comes 😉


Finally, I’ve managed to find the time to finish up the new boot code, test everything, generate new bitstreams for the download page, and update the site.
There are now two different sets of boot roms to choose from. The sources in m9312l46.mac and m9312h46.mac are the – now almost unchanged – DEC M9312 boot roms, as described in the K-SP-M9312 documents you can find on Bitsavers. The only change I made is a tiny one that will allow you to use lower case input – even though the roms are completely filled up by the original code, I found some room by removing a couple of instructions that read the switch settings on the original hardware, which selected whether or not diagnostics would be run before booting. The PDP2011 does not have these switches – diagnostics will always run.
The second set of boot roms is in the sources m9312l47.mac and m9312h47.mac – I used the additional rom space to make the original boot code a bit more elaborate. It now lists what is in the device space of the system, before going on to the original way of booting – i.e., boot from the first disk of the first controller it finds, in the order RK, RL, RH.
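The boot order can be sketched as a simple first-match search. This is a Python illustration, not the MACRO-11 code in the roms; the function name is made up, and the controller base addresses are the conventional ones, to be treated as assumptions for any particular configuration.

```python
# Conventional controller base addresses in the I/O page (octal); assumptions.
BOOT_ORDER = [("rk", 0o177400), ("rl", 0o174400), ("rh", 0o176700)]


def pick_boot_device(present):
    """Return the first controller in boot order whose base address responds."""
    for name, base in BOOT_ORDER:
        if base in present:
            return name
    return None


# A system with both an RL and an RH controller boots from RL.
print(pick_boot_device({0o174400, 0o176700}))
```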
Which of the two sets to choose depends a bit on which kind of configuration you run, and what you’re going to do with it. The DEC version is more flexible, it allows you to boot from whichever disk is in the system – which is very useful if you have made a configuration with more than one disk controller in it. And the load and store commands are very easy to use if you are debugging the interface between the PDP2011 core and memory chips, as you would do when porting the PDP2011 to an FPGA board that I don’t support. On the other hand, if you’re using a simple configuration and are booting it a lot, then the ‘old’ style core is easier – nothing to do, it just boots.
As a side effect of adding the second M9312 boot rom at 165000, in addition to the one already there in the older PDP2011 versions at 173000, the internal bus structure of the system has become larger. No problem for all of the existing board setups that I distribute, except for one – the de0, that was already used to the max of its capacity with the older 1170-rpxunofp setup, now gets seriously cramped for resources. As a consequence, I’ve had to decrease the clock speed somewhat – it now runs at 6.25MHz at the cpu, instead of 10MHz. Still remarkable, if you consider that the 1170-rpxunofp has 3 actual PDP-11 cpus in it – one for the system itself, one for the DEUNA, and one for the embedded terminal. Interestingly, it seems to be snappier despite the lower clock speed when I access it over the network – it might be that the original clock speeds caused some kind of interference between the DEUNA code and the ENC chip.
I’ve also updated the site in several places, and added a couple of how-to pages to explain how to get started, how to run the system, and how to make your own configuration.
Next thing on the agenda is to redo the sd card core in the disk controllers – clean up the old core, and add sdhc support – which I’ve postponed for a long time already, but since regular sd cards are becoming increasingly difficult to find (and my own stock is also rapidly depleting) this is becoming a priority. I don’t have a plan yet when it will be finished though – as I’ve often said, PDP2011 is something to do in winter, and the only reason that I’ve just now found time to work on it is because of a spell of bad weather in The Netherlands.
Finishing up for today, I thought to give an example of what the device space list looks like with the new boot code. Here it is:

Hello, world [t47]: cpu 11/45 fpu
177776           psw
177774           slr
177772           pirq
177770           mbr
177676 - 177640  par
177636 - 177600  pdr
177576 - 177572  mmu
177570           sdr
177566 - 177560  kl
177546           kw
174406 - 174400  rl
173776 - 173000  m9312
172516           mmu
172376 - 172340  par
172336 - 172300  pdr
172276 - 172240  par
172236 - 172200  pdr
165776 - 165000  m9312
boot from rl:

which of course will look slightly different depending on the configuration.


Last November, Scott Swazey asked why I made my own boot loader instead of using the original M9312 code.
I knew that the sources for M9312 were available, and I did have a look at them a long time ago. At that point, I was not sure I would ever get the CPU running, let alone booting from disks. And the code looked, well, complex and unlikely to run unless the hardware would mimic the original exactly. Also that seemed hardly possible at that time.
Later, when I got the first disk controller working, I just copied the boot loader from the simh sources – which I studied to get an idea of which parts of the disk controller were essential and which I could skip. And after the second and third disk controllers came into being, I just followed that pattern. Eventually that turned into T44 – the boot loader so far, the one that will announce itself with ‘Hello world’ and then proceed to boot from the first disk on the first available controller it knows about – RK05, RL02, RP06, in that order. Since in most cases the systems have one SD card only, and thus only one controller, that conveniently works for most cases. But Scott was building a system with both an RL and RH controller, so wanting to boot from a specific disk made total sense. So we looked into the challenge of making the original M9312 code work.
The first issue was that the M9312 code used absolute psects – as in, code to be fixed at a specific address. I knew there was an issue with that in my macro11 toolchain, but I never found what it was. Scott found it quickly though, it was a rather embarrassing mistake I made in the replacement of the macro11 linker that I modified to output the VHDL source for the boot roms.
After that, it was surprisingly simple. Just a question of adding the secondary boot rom at 165000, and I restructured the original device boot roms into one source – so it will fit into a single rom image. A bit later, I also changed the interpreter to accept lower case input – the original only works with upper case, which is a bit awkward.
M9312 commands

  • L <octal value> : set address
  • D <octal value> : deposit value at address
  • E <space> : examine data at address
  • S : start program

It also accepts the names of the device boot roms as commands:

  • DL<#> : boot from RL disk #
  • DK<#> : boot from RK disk #
  • DB<#> : boot from RP disk #
  • ZZ : run diagnostic
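To give an idea of how the L, D, E and S commands fit together, here is a toy model of the interpreter in Python. It is only a sketch: the class and its behaviour are simplified assumptions, the real rom code is MACRO-11, and details such as the prompt, error handling and the device boot commands are left out.

```python
class ToyMonitor:
    """Simplified stand-in for the L, D, E and S commands; not the real rom code."""

    def __init__(self):
        self.memory = {}        # word address -> 16-bit value
        self.address = 0
        self.started_at = None

    def command(self, line):
        line = line.strip().upper()            # accept lower case input as well
        op, _, arg = line.partition(" ")
        if op == "L":
            self.address = int(arg, 8) & 0o177776
        elif op == "D":
            self.memory[self.address] = int(arg, 8) & 0o177777
        elif op == "E":
            return "%06o %06o" % (self.address, self.memory.get(self.address, 0))
        elif op == "S":
            self.started_at = self.address
        else:
            raise ValueError("unknown command: " + line)
        return None


mon = ToyMonitor()
mon.command("l 1000")      # set address
mon.command("d 12345")     # deposit value at address
print(mon.command("e "))   # examine data at address
```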

To make space for the lower case input, I had to remove some of the code from the original interpreter source – a bit of diagnostic code that would run on first boot. That also leaves some leftover room to reintroduce the ‘Hello world’ message – I’ve become used to that, and I’m missing it now. Or maybe some more user friendliness in the command interpreter, it’s very historic in the original state – and although it is somewhat fun to have it work in that way, it also makes for a lot of typing mistakes.
Next to his work on the booting stuff, Scott also found a mistake in the RL controller. The adders for the sector address were not wide enough, so an access to the fourth disk could wrap around to the first. That’s fixed now. He also made a suggestion to offset the disk images on the card, and use a standard MBR to address those images. After some long and hard thought, I decided not to include this – it may be convenient in some cases, but it also conflicts with the future plans I have for the disk controllers.
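The wrap-around is easy to reproduce with a bit of arithmetic. The sketch below assumes that the four RL02 images are stored back to back on the card in 512-byte blocks and that the faulty adder was 16 bits wide; both are assumptions for illustration, not the actual numbers from the controller source, and the function name is hypothetical.

```python
# RL02 geometry: 512 cylinders, 2 heads, 40 sectors of 256 bytes each.
RL02_BYTES = 512 * 2 * 40 * 256
BLOCKS_PER_DISK = RL02_BYTES // 512    # assumption: images stored in 512-byte blocks


def card_block(unit, block, adder_bits):
    """Card block address for a block on a unit, truncated to the adder width."""
    return (unit * BLOCKS_PER_DISK + block) & ((1 << adder_bits) - 1)


last = BLOCKS_PER_DISK - 1
print(card_block(3, last, 16))   # too narrow: lands inside the first image
print(card_block(3, last, 17))   # wide enough: last block of the fourth image
```

Under these assumptions the fourth image starts just below the 16-bit limit, so only accesses further into that disk wrap around, which would explain why a bug like this can stay hidden for a long time.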
I haven’t decided yet if I will include the new boot loader into all prebuilt bitstreams. For the simple setups at least, the old boot loader scheme still makes sense. What I’ll definitely do is integrate both to use a joint code base for the device boot roms.
Updated sources will be published in a couple of weeks, I’m currently working to include Terasic’s C5G board into the distribution. After that is finished, I’ll post the new sources.

RSTS and J-11

Some time ago, Paul Koning contacted me about the issue that RSTS did not correctly detect the CPU type when the cpu was configured as a J-11 type – 11/84 or 11/94. He had already identified a problem in the cpu sources: the MFPT instruction would set the CPU code in the primary register set, instead of the currently active register set according to the PSW.
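In model form, the bug and the fix look like this. The Python below is only an illustration of the register selection, assuming two sets of R0 to R5 selected by bit 11 of the PSW and a processor type code of 5 for the J-11; the names are hypothetical and the real implementation is VHDL.

```python
PSW_REGISTER_SET = 1 << 11   # assumption: PSW bit 11 selects the alternate set
J11_TYPE = 5                 # processor type code returned by MFPT on a J-11


def mfpt(regs, psw, buggy=False):
    """Write the processor type into R0 of set 0 (the bug) or the active set (the fix)."""
    active = 1 if psw & PSW_REGISTER_SET else 0
    target = 0 if buggy else active
    regs[target][0] = J11_TYPE
    return regs


regs = mfpt([[0] * 6, [0] * 6], psw=PSW_REGISTER_SET)
print(regs[1][0])   # the running program sees the type code in its own R0
```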

I built the core for the MFPT instruction a long time ago, at the point where I was working with a copy of the ZKDJ test to verify that the regular instructions were working correctly. I added the MFPT mainly because I liked the idea of sticking as close as possible to the original ZKDJ source – at the time, I did not anticipate the system becoming as complete as it is now. Why I chose to write the CPU type value in the primary register set I don’t really remember – it seems illogical now.

Anyway. The fix did solve the CPU type detection problem, but immediately revealed another: the startup code in RSTS went into a halt. Paul quickly found the reason; RSTS would overwrite its memory sizing code while trying to find out how much memory was available. The cause of this was that the J-11 models have 2044Kw of memory, and do not implement the unibus remap of the top 128K back into low memory – as 11/44 and 11/70 do.
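The memory sizes follow from the 22-bit address space. A quick check in Python, assuming an I/O page of 4K words at the top and, for the models with the unibus remap, a window of 128K words that includes the I/O page; the variable names are just for this example.

```python
WORDS = 1 << 21              # a 22-bit byte address space holds 2^21 words
KW = 1024
IO_PAGE = 4 * KW             # assumption: the top 4K words are the I/O page
UNIBUS_WINDOW = 128 * KW     # assumption: top 128K words, including the I/O page

j11_kw = (WORDS - IO_PAGE) // KW           # models without the remap
remap_kw = (WORDS - UNIBUS_WINDOW) // KW   # models with the remap
print(j11_kw, remap_kw)
```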

After I fixed that issue, yet another appeared: the startup would proceed further, but would now issue the message:

This DCJ11 cannot be used in conjunction with an FPJ11 accelerator.
Contact Field Service for FCO kit EQ-01440-01 to correct the problem.
INIT will continue, but timesharing cannot be started.
RSTS V10.1-L RSTS (DB0) INIT V10.1-0L

Which I could easily suppress by setting the 8th bit of the control register at 17 777 750 to zero – stating no FPJ-11 floating point accelerator is present. Since the J-11 always includes the floating point instruction set in its microcode, functionally there is no difference in whether or not the FPJ-11 is present – it should only speed up the floating point instructions. But then, the message shows that there is a difference…

Diving deeper into the issue, Paul was able to find that the check that produced the message failed on a test involving the ASHC instruction. Sure enough, in the manual for the 11/84 EK-1184E-TM-001_Dec87.pdf – to be found on Bitsavers – page B-17 lists two model differences for the ASH and ASHC instructions, which I had already implemented a long time ago – but incorrectly applied to all models. As a test, I disabled this specific behaviour – and the result was that RSTS booted up, and recognized an FPJ-11 without complaining.

Apparently, the FPJ-11 then played some role in fixing the wrong implementation of ASHC and probably ASH in the J-11. Maybe the accelerator actually executed these instructions? Or maybe its presence implied different microcode, or a different path in the microcode?

I’m not sure there is a way to find out – none of the documentation I’ve found so far includes this level of detail on the original hardware. Whatever the case, the RSTS CPU recognition bug is now fixed. Thanks Paul!

Besides fixing these bugs, I also made the bit setting in 17 777 750 a configurable item – including the corresponding behaviour of the ASHC and ASH instructions. The parameter is called have_fpa, and its default setting is 0 meaning no FPJ-11. I don’t think there is any use for having this, other than looking at the differences in the hardware listing in the RSTS startup…

have_fpa => 0
Start timesharing?  HA
  HARDWR suboption? LI
  Name  Address Vector  Comments
  TT0:   177560   060
  RB0:   176700   254   Units: 0(RP06)
  XE0:   174510   120   DELUA Address: 00-04-A3-1A-70-E1
  KW11L  177546   100   (Write-only)
  SR     177570
  DR     177570
  Hertz = 60.
  Other: FPU, 22-Bit, Data space, J11-E CPU
  HARDWR suboption?
have_fpa => 1
Start timesharing?  HA
  HARDWR suboption? LI
  Name  Address Vector  Comments
  TT0:   177560   060
  RB0:   176700   254   Units: 0(RP06)
  XE0:   174510   120   DELUA Address: 00-04-A3-1A-70-E1
  KW11L  177546   100   (Write-only)
  SR     177570
  DR     177570
  Hertz = 60.
  Other: FPU with FPA, 22-Bit, Data space, J11-E CPU
  HARDWR suboption?

As usual, I’ll post the updated sources to the download page some time later this weekend.

FPU and DEUNA fixes

Since I found the problem with the BAE register in the RH70 that prevented RSX-11MP from running, I’ve been working on straightening out the timing between the CPU and the main memory. It’s nowhere near finished, but the first tests show that when it is, the CPU will be capable of much higher speed – one experiment even ran at 90MHz on the latest FPGA models. Not bad at all, compared to the current baseline of 10MHz, even considering that the new timing needs slightly more cycles per instruction.

In the meantime people have been looking at RSX. Especially Paul Anokhin, who has helped me find several issues. Firstly in bringing the somewhat forgotten de1dram board variant back to life. The DE1 board has two memory chips, an sram of 512KB and a dram of 8MB. Obviously the sram is a bit too small to really bring to life 22-bit CPUs, and very limiting if you want to run the later versions of RSX or Unix on them. Years ago I made the de1dram version as an experiment while I was waiting for my first DE0 board to arrive, but it was never quite finished – the DE0 arrived a bit earlier than that, and I finished the work on that board and forgot about the de1dram. But now it works.

Secondly, Johnny Billquist has been working on BQTCP – a TCP stack for RSX that coexists with Decnet. It is afaik the only case that really requires the buffer chaining in the DEUNA to work correctly – Decnet uses fairly small buffers, but TCP by default uses an MTU of 1500, so if a packet of that size arrives and the buffers are smaller than that, the buffer chaining needs to be correct. This was not a problem before, since all other cases – Decnet itself, TCP on 2.11BSD – appear to use buffer sizes slightly larger than the maximum packet size they expect to receive.

I wrote the DEUNA microcode as a sort of proof-of-concept – meaning, it is not really clean structured code. But it appeared to work well enough, so the somewhat more complex case of correct buffer chaining was not completely finished; it triggered a warning message, and it would also switch to the next buffer when needed. What I forgot was the case where a chunk of data from the ENC424J600 chip – the data is copied from the chip in 16-byte chunks – did not completely fit in the buffer; in that case, the whole 16-byte chunk would be placed in the new buffer instead of filling up the old one first. Obviously, that caused errors for BQTCP. Luckily, it was surprisingly easy to fix, I only needed to restructure the receive flow in the microcode a bit and add a couple of tests to make the buffer chaining work correctly.
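The corrected behaviour can be sketched as follows. This is a Python illustration of the idea, not the DEUNA microcode; the function name and the way buffers are represented are assumptions. Data arrives in 16-byte chunks, and a chunk that does not fit in the current buffer is split so that the current buffer is filled before moving on to the next one.

```python
CHUNK = 16   # the data is copied from the chip in 16-byte chunks


def receive(packet, buffer_sizes):
    """Distribute a packet over a chain of buffers, filling each one completely."""
    buffers = [bytearray() for _ in buffer_sizes]
    current = 0
    for start in range(0, len(packet), CHUNK):
        chunk = packet[start:start + CHUNK]
        while chunk:
            if current >= len(buffers):
                raise OverflowError("packet does not fit in the buffer chain")
            room = buffer_sizes[current] - len(buffers[current])
            if room == 0:
                current += 1                   # this buffer is full, chain to the next
                continue
            buffers[current] += chunk[:room]   # fill up the old buffer first
            chunk = chunk[room:]               # the remainder goes to the next one
    return buffers


parts = receive(bytes(range(60)), [40, 40])
print([len(p) for p in parts])
```

With two 40-byte buffers and a 60-byte packet, the third chunk straddles the boundary: 8 bytes complete the first buffer and the other 8 start the second, so the data stays contiguous across the chain.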

The latest issue Paul reported was slightly more complex to find – he wrote an F77 program to do some floating point calculations, and the results were not correct. The same algorithm in BP2 on RSX also gave wrong results, but translated to C on 2.11BSD it worked correctly. Since I did not have much time to look into this, I asked Paul to look at the instructions generated from the F77 program, and try to find the difference in flow between SIMH – which worked correctly – and the FPGA hardware. What he came up with was that the different flow started near the execution of the ABSF instruction.

That provided a nice clue for me to start chewing on. And soon enough, it became clear that there was an issue in the way that the addressing mode 0 for the group of instructions called ‘fp single operand group 2’ was handled – ABSF, NEGF, TSTF, and CLRF. For mode 0, I implemented a fast path in the instruction sequencer to bypass reading the input operand – because the input operand is in a register, it does not require memory access and thus does not need memory cycles. However, the register read occurred in the same cycle as picking up the output from the ALU – so, in effect, the output of the ALU was not based on the input. For the CLRF instruction, that makes no difference since the input is irrelevant anyway, and I would speculate that the TSTF instruction is not used much – but for the ABSF and NEGF instructions this is obviously not the case.

Apparently the addressing mode 0 ABSF and NEGF instructions are not used much. I checked 2.11BSD; at least the C implementation hides these instructions through library calls, so the compiler does not appear to generate these instructions directly. And the library implementation works with the operands on the stack, so it will never use mode 0. Also the MAINDECs seem to omit checking this part – maybe it did not use a separate data flow in the original machines, so that it would not make sense to specifically test for it. Whatever the case, none of FFPA, FFPB, FFPC, KFPA, KFPB, KFPC, or ZKDL picked up this issue – all of these run quite nicely even when the bug in the CPU is present.

Also here, once I understood the nature of the problem the fix was quite easy: I only needed to advance the register read to the main instruction decode state. Where it should have been in the first place, obviously – the whole point of the fast path was that the register should have been read already during the instruction decode.
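
The timing problem can be sketched as a toy model in Python. This is only an illustration of the ordering issue, not the VHDL of the sequencer; the function name, the `stale` parameter and the two-step structure are assumptions made up for the example.

```python
def run_absf_mode0(ac_value, read_in_decode, stale=0.0):
    """Toy model of ABSF with addressing mode 0 (hypothetical, not the real core).

    ac_value       -- contents of the floating point accumulator
    read_in_decode -- True models the fixed flow, False the buggy one
    stale          -- whatever the ALU input register held before (assumed)
    """
    alu_in = stale  # register feeding the ALU, still holding old data

    # Instruction decode state: in the fixed flow the accumulator is
    # read here, so the ALU input is valid when the fast path starts.
    if read_in_decode:
        alu_in = ac_value

    # Fast-path state: the ALU output is picked up in this cycle, based
    # on what the input register held at the start of the cycle.
    result = abs(alu_in)

    # In the buggy flow the register read happens only now, in the same
    # cycle as the pick-up of the result, which is too late.
    if not read_in_decode:
        alu_in = ac_value

    return result
```

In this model `run_absf_mode0(-2.5, True)` gives 2.5, while `run_absf_mode0(-2.5, False)` returns the absolute value of the stale data instead. A CLRF would give the same result either way, since it ignores its input, which matches why that instruction did not show the problem.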

Anyway, I’ll be posting the updated sources to the download page, and later this weekend I’ll post updated bitstreams as well. Big thanks to Paul for his help in finding and fixing these!

Fixes for DEUNA

Over the last months, I had a couple of occurrences of the problem where 2.11BSD would lose its network connection, reporting that there were no transmit buffers available on the DEUNA. All in all, I’ve seen this problem three or four times over the last year, but maybe ten times in the last month or so. No idea why – the only thing that changed is that I have a new Ethernet switch, which could make for a subtle change in the timing.
Anyway, now that it occurred more often, that also gave me the opportunity to find out what was actually wrong. I enabled the debug code in the if_de.c driver for the DEUNA and added some more debug statements. Next morning I was surprised by debug output – showing that in effect all transmit buffers were free…
So, that got me thinking of the interrupt controller core in the DEUNA. It did contain some strange edge-trigger construction that could potentially result in a deadlock. I changed it, and set up my venerable old 20 MHz oscilloscope to show the interrupt signals – br and bg.
This time I had to wait for three days for the problem to occur again – and to my disappointment, it did, and it still locked up in the same way. However, it was also clear that no interrupts were taking place, so I was definitely looking in the right place – the interrupt controller was maybe not locked up itself, but even so no interrupts were taking place. More evidence against the interrupt controller and the edge-trigger in it.
A couple of experiments showed that an easy solution would be just to generate interrupts on the level instead of the edge. But this would also cause the DEUNA to keep on interrupting until the software disabled interrupts or cleared the originating bit. Not very elegant, but it did work – and after some time, I realised that the software will in all likely scenarios examine the interrupt bits, and most likely reset them. So would it maybe work if I went back to the edge triggering system, and reset the trigger on writes into the PCSR0 register?
Of course it did.
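
A small Python model may help to show why re-arming on a PCSR0 write makes a difference. This is a sketch under assumptions: the class, its method names and the particular sequence of events are invented for illustration, and the lockup scenario shown (a second interrupt bit arriving while the first is still set) is just one way an edge trigger can miss an interrupt, not necessarily what happened in the real controller.

```python
class InterruptLogic:
    """Hypothetical model of an edge-triggered interrupt request."""

    def __init__(self, rearm_on_pcsr0_write):
        self.rearm_on_pcsr0_write = rearm_on_pcsr0_write
        self.bits = 0              # pending interrupt bits in PCSR0
        self.last_pending = False  # level seen at the previous evaluation
        self.requests = 0          # number of interrupts raised so far

    def _evaluate(self):
        pending = self.bits != 0
        # an interrupt is raised only on a rising edge of 'pending'
        if pending and not self.last_pending:
            self.requests += 1
        self.last_pending = pending

    def set_bit(self, mask):
        """The device sets an interrupt bit."""
        self.bits |= mask
        self._evaluate()

    def write_pcsr0(self, clear_mask):
        """The driver writes PCSR0 and clears the bits it has handled."""
        self.bits &= ~clear_mask
        if self.rearm_on_pcsr0_write:
            # reset the trigger, so bits that are still set count as a new edge
            self.last_pending = False
        self._evaluate()


def run(rearm):
    d = InterruptLogic(rearm)
    d.set_bit(0x1)      # first event: interrupt raised
    d.set_bit(0x2)      # second event arrives before the driver reacts
    d.write_pcsr0(0x1)  # driver handles and clears only the first bit
    return d.requests
```

In this model `run(False)` ends with one interrupt raised and the second bit left pending forever, while `run(True)` raises a second interrupt for the bit that is still set. Unlike the level-triggered variant, it does not keep interrupting while the driver has not touched PCSR0.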
And a minor other thing comes to mind: I keep saying DEUNA, but it’s actually a DELUA now. The difference is only in the PCSR1 ID bits; no logic has been changed at all. I did this because Decnet on RSX-11M-Plus tries to load microcode into the DEUNA – which will not work because in reality of course the controller does not look like a real DEUNA at all. But it will leave a DELUA alone. And because all the other software – 2.11BSD and RSTS – does not seem to make a difference between DEUNA and DELUA, there seems to be no reason not to change the thing into a DELUA.
I changed several subtle things in the microcode as well, mostly around buffer chaining and resetting the chip if it becomes disconnected for some reason. Buffer chaining probably still is not correct, but it doesn’t really seem to be used extensively by the operating systems – it’s only when broadcast frames longer than what the network stack expects arrive that the code seems to be triggered.
The updates – including the fix for RSX-11M-Plus – are on the download page now, and several pregenerated bitstreams as well. Enjoy!

RSX11M-Plus. Finally.

A couple of weeks ago someone mentioned that there were some FPGA related articles in the December issue of Circuit Cellar. So I checked it, and one of the articles pointed me to the built-in logic analyzers that the leading tool chains now all seem to have. At least, the Circuit Cellar article is about Chipscope, which is the Xilinx variant, and Altera has something similar called SignalTap.
Since most of my Xilinx stuff has been stored away since last year’s spring cleaning, I decided to go and play with SignalTap. And as usual with the FPGA tooling, the first impression was not that favourable. But a couple of days later I thought to try again, and this time around I started to appreciate some of the things that the software can do. For instance, tap into an enormous lot of signals at a time – at least certainly compared to my old ‘real’ analyzer, which can do only 32 signals. And the amount of capture memory is also decent, provided you’ve some room in your FPGA memories.
But more interesting is the trick where you can let the analyzer capture when some subset of the signals change state. And you can assign names to bit pattern values in a capture. Those two tricks I used to finally find the problem that prevented RSX-11M-Plus from booting – first, I used the address match signal within the RH11 controller logic as a trigger for the analyzer to capture state, and second, I assigned the register names of the control registers within the RH11 to the address signal.
So, I thought that would give me a nice and easy overview of exactly what RSX-11M-Plus was doing to the RH, and what would cause it to get wrong results. And that is exactly what it did – only, not in the way I expected. Took me some time to see something that in retrospect is very obvious; there is a write to a register in the RH11 space, but it isn’t decoded into a register name – even though I added register names for all registers that I knew about.
Aha. So, something going on here… The first thing I checked was whether it could be a controller register or a disk register – in a real setup with RH and RP, some of the registers reside in the disk, others in the controller. I decided to verify all the controller side registers first – and the one that I was consistently missing was BAE, the register that holds the bits 21-16 of the address for the controller. A quick change to the controller source proved that to be correct; if I assigned BAE to this register address, suddenly RSX-11M-Plus would boot happily… And it seems to run quite happily as well, including running complete sysgens, and also running Decnet and other software.
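
To show why a missing BAE matters, here is a minimal sketch of how a 22-bit transfer address is put together from BAE and the lower 16 address bits. The function and parameter names are made up for the example; only the fact that BAE holds bits 21-16 comes from the text above.

```python
def transfer_address(bae, low16):
    """Combine BAE (address bits 21-16) with the lower 16 address bits.

    Both parameter names are hypothetical; 'low16' stands for whatever
    register supplies bits 15-0 of the transfer address.
    """
    return ((bae & 0o77) << 16) | (low16 & 0o177777)


# If the write to BAE is ignored, the extension bits stay at zero
# and every transfer ends up in the lowest 64KB of memory.
intended = transfer_address(0o12, 0o4000)
actual_without_bae = transfer_address(0, 0o4000)
```

In this sketch `intended` is 0o2404000, while `actual_without_bae` is just 0o4000. Software that places its buffers above the first 64KB would then read or write the wrong memory, which would fit with a system that fails to boot.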
A couple of things still need some clarification; mostly, whether other registers also live at different addresses than I would expect. Once that is done, and I’ve completed my usual regression tests, I’ll be posting the new vhdl to the download page.

This output from the SignalTap-II analyzer shows the unexpected address for the BAE register
