While porting an operating system can never be easy, it is a finite task. Bear in mind that originally, the kernel ran as a user-mode program on a VAX under VMS. That is where it was primarily debugged (approx. 1997). Recently (Sept 2000), I had it working on an Alpha under VMS to do some debugging.
Now there is a fully functional Alpha port. The project took approximately 3 months, working weekends for the most part.
With that, here is a list of stuff to do:
How will your memory be laid out for a general user-mode process? For example, in the 486 implementation, memory is set up like this:
                                                                    4m boundary indicator
+---------------------------------------+
|  no-access to catch bad pointers      | <- 0xFFFFF000
+---------------------------------------+
|  process copy of PD                   | <- 0xFFFFE000 PROCMPDVADDR          \  <- physical address PRCTX_L_MPDPA
+---------------------------------------+                                     |
|                                       |                                     |
|                                       |                                     |
|  process page table pages             |                                     |
|  private section 'PRCTX_L_PTSEC'      | enough to map from PROCBASVADDR up  |
|                                       | to and including PROCMPDVADDR       |
|                                       |                                     |
|                                       | <- PROCPTBVADDR (0xFFC7D000)        /
+---------------------------------------+
|                                       |
|                                       |
|  user process space                   |
|  (MAXPROCPAGES*4096 bytes)            |                                     4m
|                                       | <- PROCBASVADDR (0x20000000)
+---------------------------------------+
|                                       |
|  system global memory                 |
|                                       | <- 0x001A0000 (approx, actually wherever the kernel ends)
+---------------------------------------+
|  the kernel image                     | <- 0x00100000
+---------------------------------------+
|  memory hole                          | <- 0x000A0000
+---------------------------------------+
|  kernel stacks for each cpu           | <- 0x00088000
+---------------------------------------+
|  system page table                    | <- 0x00003000
+---------------------------------------+
|  system global PD                     | <- 0x00002000
+---------------------------------------+
|  stupid routine to switch             |
|  alternate cpus to 32-bit mode        |
|  overlaid by local APIC after boot    | <- 0x00001000
+---------------------------------------+
|  no-access to catch null pointers     | <- 0x00000000
+---------------------------------------+

Everything above address 0x20000000 is per-process (that is, it is different for each process). Everything below address 0x20000000 is system-common (that is, it is the same for each process). Note that on the 486, we can get away with having a process's page tables be addressable only within that process, as the cpu uses physical addresses to locate the page tables.
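The split described above can be sketched in a few lines of C. This is only an illustration: PROCBASVADDR and the guard-page addresses are copied from the diagram, while the names NULL_GUARD_END, BAD_GUARD_BASE, region_t, classify_vaddr and pd_index are made up for the example and do not come from the kernel sources. The page directory index assumes standard 32-bit x86 paging with 4K pages, where each page directory entry covers 4M.

```c
#include <stdint.h>

/* Boundaries copied from the 486 layout diagram. */
#define PROCBASVADDR   0x20000000u  /* first per-process address               */
#define NULL_GUARD_END 0x00001000u  /* hypothetical name: end of null page     */
#define BAD_GUARD_BASE 0xFFFFF000u  /* hypothetical name: top no-access page   */

/* Hypothetical classification of a virtual address. */
typedef enum {
    REGION_NO_ACCESS,      /* guard pages that catch null/bad pointers */
    REGION_SYSTEM_COMMON,  /* same mapping in every process            */
    REGION_PER_PROCESS     /* remapped on every process switch         */
} region_t;

static region_t classify_vaddr(uint32_t vaddr)
{
    if (vaddr < NULL_GUARD_END || vaddr >= BAD_GUARD_BASE)
        return REGION_NO_ACCESS;
    return (vaddr >= PROCBASVADDR) ? REGION_PER_PROCESS
                                   : REGION_SYSTEM_COMMON;
}

/* Page directory index: the top 10 bits of a 32-bit address when
   paging with 4K pages, so each entry spans 4M of address space. */
static uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}
```

Under these assumptions, pd_index(PROCBASVADDR) is 128, so the per-process area starts exactly on a 4M boundary, and both PROCPTBVADDR and PROCMPDVADDR fall in the last page directory entry (1023).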
Note that the system occupies low memory and the user process stuff starts in high memory. It made writing the loader easier because it could assume that physical addresses = virtual addresses, and the address the kernel gets linked at is the address it gets loaded at, etc, etc. Who cares what address user mode programs end up at? The reason for the 0x20000000 is that the system page table (SPT) starts at address 0x00003000 and goes up until it runs into the kernel stacks (one per cpu), which are just below the memory hole at 0x000A0000. So the largest the SPT can be is 640k-12k-24k*4=532k: 640k for the memory up to the hole, 12k for the stuff at the bottom of memory, and 24k for each of the possible 4 cpus. Each 4-byte page table entry maps one 4k page, so a 532k SPT can map at most 532Meg, which rounds down to a maximum system global area of 512Meg (0x20000000).
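The SPT sizing arithmetic can be checked with a short C sketch. The sizes (640k, 12k, 24k per cpu, 4 cpus) come from the text; the constant and function names are invented for the example, and the 4-byte entry size and 4k page size are the standard values for 32-bit x86 paging rather than something stated in the kernel sources.

```c
#include <stdint.h>

enum {
    KB               = 1024,
    HOLE_START       = 640 * KB, /* memory hole begins at 0x000A0000      */
    LOW_RESERVED     = 12 * KB,  /* everything below the SPT at 0x3000    */
    STACK_PER_CPU    = 24 * KB,  /* one kernel stack per cpu              */
    MAX_CPUS         = 4,
    SPT_ENTRY_BYTES  = 4,        /* assumed: 32-bit x86 page table entry  */
    SPT_PAGE_BYTES   = 4 * KB    /* assumed: 4k pages                     */
};

/* Largest SPT that fits between 0x00003000 and the kernel stacks. */
static uint32_t spt_max_bytes(void)
{
    return HOLE_START - LOW_RESERVED - STACK_PER_CPU * MAX_CPUS;
}

/* Lowest address of the kernel stacks, just under the memory hole. */
static uint32_t kernel_stacks_base(void)
{
    return HOLE_START - STACK_PER_CPU * MAX_CPUS;
}

/* Megabytes of virtual space that a maximal SPT can map. */
static uint32_t spt_max_mapped_mb(void)
{
    uint32_t entries = spt_max_bytes() / SPT_ENTRY_BYTES;
    return entries * (SPT_PAGE_BYTES / KB) / KB;
}
```

With these numbers the SPT tops out at 532k and can map 532Meg, and the computed stack base of 0x00088000 agrees with the diagram. Picking 0x20000000 (512Meg) as the start of per-process space keeps the system global area within what the SPT can map.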
Alpha memory is set up like this:
+---------------------------------------+
|  I/O sparse access                    | <- 0xFFFFFFFC00000000
+---------------------------------------+
|  Kernel (& PAL, etc)                  | <- 0xFFFFFFFA00000000
+---------------------------------------+
|                                       |
|  Overlapping L1,L2,L3 pagetables      | <- 0xFFFFFFF800000000
+---------------------------------------+
|                                       |
|                                       |
|  user per-process space               |
|                                       |
|                                       | <- 0x0000000000100000
+---------------------------------------+
|                                       |
|  per-process kernel data              |
|                                       | <- 0x00000000000F8000
+---------------------------------------+
|  no-access to catch null pointers     | <- 0x0000000000000000
+---------------------------------------+

Everything below and including the L1 pagetable page is considered per-process. Everything above the L1 pagetable page is common to all processes and does not re-map when switching processes.
This is the more traditional approach, where the user stuff is in the low end and the OS is in the upper end of virtual memory space. This was done because the Alphas have a variable pagesize (8K..64K), and I didn't want the per-process space starting at a different address on different systems, as that would require re-linking user images. Also, different I/O systems could use differing numbers of L1 slots to map their space, which would also cause per-process space to start at different addresses.
You must also work out a boot sequence, and what the memory layouts will be.
Next, you will need a basic console driver. This can be a very simple thing to start with. Just write two routines: oz_hw_putcon, which writes a string to the console, and oz_hw_getcon, which reads a line from the keyboard. They are completely synchronous routines and should be easy to write. They link to the oz_dev_knlcons.c driver to form a complete console driver. The only caveat is that while a keyboard read is going on, NOTHING ELSE will happen, but it's enough to get things going and is perfectly fine for the loader.
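To give a feel for how little is needed, here is a rough sketch of a synchronous write routine and a line-read routine in C. Everything in it is an assumption made for illustration: the names sketch_putcon and sketch_getcon, their parameter lists, and the hw_putc/hw_getc hooks are invented, and the real oz_hw_putcon and oz_hw_getcon must use whatever prototypes oz_dev_knlcons.c expects. The "hardware" here is simulated with memory buffers so the example is self-contained; a real port would poll its UART or keyboard controller instead.

```c
#include <stddef.h>

/* Simulated device, standing in for real console hardware. */
static const char *sim_input = "";   /* characters "typed" by the user */
static char        sim_output[256];  /* characters "shown" on screen   */
static size_t      sim_outlen;

/* Hypothetical hook: block until a character is available. */
static int hw_getc(void)
{
    return *sim_input ? *sim_input++ : '\n';
}

/* Hypothetical hook: send one character to the console. */
static void hw_putc(char c)
{
    if (sim_outlen < sizeof sim_output - 1)
        sim_output[sim_outlen++] = c;
}

/* Write 'size' bytes, expanding newline to carriage return + newline. */
void sketch_putcon(size_t size, const char *buff)
{
    for (size_t i = 0; i < size; i++) {
        if (buff[i] == '\n')
            hw_putc('\r');
        hw_putc(buff[i]);
    }
}

/* Read one line with echo and backspace editing. Busy-waits inside
   hw_getc, so nothing else runs until the line is complete.
   Returns the line length, not counting the terminating null. */
size_t sketch_getcon(size_t size, char *buff)
{
    size_t len = 0;

    if (size == 0)
        return 0;
    for (;;) {
        int c = hw_getc();
        if (c == '\r' || c == '\n')
            break;
        if (c == '\b' || c == 0x7F) {          /* backspace or delete */
            if (len > 0) {
                len--;
                sketch_putcon(3, "\b \b");     /* erase on screen     */
            }
            continue;
        }
        if (len + 1 < size) {                  /* leave room for null */
            char ch = (char)c;
            buff[len++] = ch;
            sketch_putcon(1, &ch);             /* echo                */
        }
    }
    sketch_putcon(1, "\n");
    buff[len] = '\0';
    return len;
}
```

The read loop never returns until a line terminator arrives, which is exactly the "nothing else will happen" behaviour described above, and is acceptable for the loader and early bring-up.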