For those of you who are linking to this page from outside the main OZONE site, it can be seen here. OZONE is an OS I wrote myself that combines my favourite parts of VMS, Linux and Windows NT. I am in the process of porting OZONE to work under XEN.
Warning!
This is simply a description of my porting experience with some hopefully helpful hints.
Don't take any of it as definitive.
If you have any corrections or things to ask about or things to add, please let me know!
Also note that the stuff below pertains to XEN V1.2!
XEN is yet another x86 CPU / PC system emulator program (yaxcpsep). Well, not quite. XEN provides virtual environments of the x86 nature, but they are not PCs. I say that because the virtual machine it gives an OS doesn't have IO ports and won't allow access to ring 0, for example. There are other restrictions.
These restrictions are quite workable, however. Linux has been ported and works nicely. OZONE has a compile option to move most of the kernel into ring 1 anyway, so it shouldn't be too much work to get it to run with XEN.
To perform privileged functions such as altering pagetables, an OS must make what they call 'HYPERVISOR' calls. These are int $0x82 traps that the XEN monitor processes. There are calls for accessing pagetable entries, stuff like that. Doing it this way allows the virtualization to happen without any instruction translation or scanning, so the virtual machines execute mostly at full speed. The downside is that the OS must be ported to run on a new architecture, including device drivers for accessing the virtual devices. Yes, a ring 1 program can still execute SGDT, but since the OS has been ported, it won't, because the value is meaningless. And if an application does an SGDT, the result is just as meaningless as it was under the original OS.
The first thing to do is get XEN running on a PC. The PC I used is an ASUS P2B-L with:
So anyway, here's what I did:
#
# Start XEN virtual ethernet
#
echo "rc.local: starting XEN virtual ethernet eth0"
/usr/bin/xc_dom_control.py vif_ipadd 0 0 192.168.0.151
/sbin/ifconfig eth0 192.168.0.151
/sbin/route add -net 0.0.0.0 netmask 0.0.0.0 gw 192.168.0.1
echo "rc.local: virtual eth0 startup complete"

Substitute your own IP addresses for the 192.168... numbers.
The above gets the stuff going with the control virtual machine running (what they call Domain 0). Now the idea is to get another virtual machine going. Here's what I did:
echo "rc.local: start ethernet..."
/sbin/ifconfig eth0 192.168.0.152
/sbin/route add -net 0.0.0.0 netmask 0.0.0.0 gw 192.168.0.1
echo "rc.local: ifconfig..."
/sbin/ifconfig
echo "rc.local: done."
We figure that once we know how to get a VM to start, we can port an OS. So I started with a hello world program (helloworld.s):
        .text
        .globl  _start
_start:
        cld
                                        # from include/asm-xeno/hypervisor.h
        movl    $2,%eax                 # __HYPERVISOR_console_write (include/asm-xeno/hypervisor-ifs/hypervisor-if.h)
        movl    $hello_message,%ebx     # arg1 = buffer virtual address
        movl    $hello_message_len,%ecx # arg2 = buffer length
        int     $0x82
                                        # from include/asm-xeno/hypervisor.h
        movl    $8,%eax                 # __HYPERVISOR_sched_op
        movl    $1,%ebx                 # SCHEDOP_exit
        int     $0x82
hang:
        jmp     hang                    # shouldn't get here

hello_message:  .ascii  "This is the hello world program\n"
hello_message_len = . - hello_message

The Xen loader also wants a 12-byte header on the image file. So I wrote a little assembler module (xenoguestheader.s) to handle that:
        .text
        .globl  _start
_start:
        .ascii  "XenoGues"      # read_kernel_header (tools/xc/lib/xc_linux_build.c)
        .long   _start          # - the kernel's load address
The final image has to consist of the 12 bytes of object code from xenoguestheader.s followed by the object code from helloworld.s. Here is my makefile to accomplish that:
helloworld.gz: helloworld.s xenoguestheader.raw
	as -o helloworld.o -a=helloworld.l helloworld.s
	ld -Ttext 0x100000 -o helloworld.elf helloworld.o
	objcopy -O binary -S -g helloworld.elf helloworld.raw
	cat xenoguestheader.raw helloworld.raw | gzip > helloworld.gz

xenoguestheader.raw: xenoguestheader.s
	as -o xenoguestheader.o xenoguestheader.s
	ld -Ttext 0x100000 -o xenoguestheader xenoguestheader.o
	objcopy -O binary -S -g xenoguestheader xenoguestheader.raw

Note that both helloworld and xenoguestheader are linked at 0x100000. I first tried putting the 12-byte header at the beginning of helloworld.s and not having a separate xenoguestheader module. The result was that it printed "hello world program" instead of "This is the hello world program". Notice that there were 12 missing characters? Crikey! Had I simply programmed "Hello world", nothing would have printed and I'd still be trying to figure it out!
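Since the header layout is what bit me, here it is spelled out as data: 8 signature bytes followed by the kernel's 32-bit load address, stored little-endian just as the .long directive emits it on x86. A sketch in C (the function name is mine, not part of any Xen tool):

```c
#include <stdint.h>
#include <string.h>

/* build the 12-byte XenoGuest header: 8 signature bytes followed by
   the kernel's 32-bit load address, little-endian (which is what the
   .long directive emits on x86) */
static void make_xeno_header (uint8_t out[12], uint32_t load_address)
{
  memcpy (out, "XenoGues", 8);
  out[ 8] = (uint8_t)(load_address);
  out[ 9] = (uint8_t)(load_address >>  8);
  out[10] = (uint8_t)(load_address >> 16);
  out[11] = (uint8_t)(load_address >> 24);
}
```

Prepending these 12 bytes to the raw kernel image is exactly what the cat in the makefile accomplishes.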
All I had to do in the /etc/xc/vm2 file was to specify:
[root@xenophilia xc]# xc_dom_create.py -D vmid=2 -f /etc/xc/vm2
Parsing config file '/etc/xc/vm2'
VM image           : "/test/helloworld.gz"
VM ramdisk         : ""
VM memory (MB)     : "64"
VM IP address(es)  : "192.168.0.153"
VM block device(s) : ""
VM cmdline         : "ip=192.168.0.153:169.254.1.0:192.168.0.1:255.255.255.0::eth0:off root=/dev/hdb3 ro 4 VMID=2"
VM started in domain 52
[52] This is the hello world program
[root@xenophilia xc]#

So now it is a simple matter of programming to port OZONE.
The next thing to do is to catalog the calls that the hypervisor makes available to guest OSes. We can start by looking in the include/asm-xeno/hypervisor.h file that's in the xenolinux directory, and scanning the linux code to see where they are used. Another place to look is the hypervisor source code; for example, look for do_stack_switch to find the code for HYPERVISOR_stack_switch.
The environment seems to be similar to what the Alpha's console sets up, except that all physical memory is mapped to virtual addresses.
There are three types of addresses used:
Initialization:

(start_info_t *) : startup information, pointed to by %esi on initial jump to kernel
    see include/hypervisor-ifs/hypervisor-if.h
    number of pages, shared info struct pointer, page directory VA, where loaded modules are, command line

    Here's what I get for my "hello world" boot (after inserting printk's):

        si -> nr_pages      16384       <- 16K pages = 64M bytes
        si -> shared_info   0x294000    <- real-world physical address of shared_info struct
        si -> dom_id        3
        si -> flags         0
        si -> pt_base       0x40FF000   <- virtual address of my page directory page
        si -> mod_start     0
        si -> mod_len       0
        si -> cmd_line      ip=192.168.0.153:169.254.1.0:192.168.0.1:255.255.255.0::eth0:off root=/dev/hdb3 ro 4 VMID=2

    I don't know what Xen guarantees about its position in memory, but to be safe, I copy it to a page that is part of my kernel image. That way I know I can use any pages after my kernel for whatever I want.

(shared_info_t *) HYPERVISOR_shared_info : a shared communication struct
    its machine address is given by start_info -> shared_info
    map it to VA space with HYPERVISOR_update_va_mapping
    this page is not mapped as part of your initial VM's physical pages
    -> events, event_mask : bitmask of events to process via hypervisor_callback
    -> various : contains current date/time information (see arch/xeno/kernel/time.c)

HYPERVISOR_set_callbacks (codesegment, (unsigned long)hypervisor_callback, codesegment, (unsigned long)failsafe_callback)
    hypervisor_callback = callback to process async events
        bitmask of events available in HYPERVISOR_shared_info -> events (atomic access)
        return via iret
        (see arch/xeno/kernel/entry.S, arch/xeno/kernel/hypervisor.c)
    failsafe_callback = a pseudo-pagefault happened while serving an int $0x82 request
        (see arch/xeno/kernel/entry.S, arch/xeno/kernel/traps.c)

HYPERVISOR_set_trap_table (trap_table) : set callbacks for traps, faults (accvio, divbyzero, int instrs, etc)
    (see arch/xeno/kernel/traps.c)

HYPERVISOR_stop (virt_to_machine (suspend_record) >> PAGE_SHIFT);

Print a message on console:
    HYPERVISOR_console_write (buf, len)
        buf = virtual address of message string
        len = length in bytes of message string

Let someone else do something (HLT replacement):
    HYPERVISOR_yield ()

Terminate (last step of guest OS shutdown):
    HYPERVISOR_exit ()

Access debug registers:
    HYPERVISOR_set_debugreg (registernumber, value)
    HYPERVISOR_get_debugreg (registernumber)

Thread (stack) switching:
    HYPERVISOR_stack_switch (new_stack_segment, new_stack_pointer)
        Sets the ring 1 stack pointer in the TSS
        arg 1 : stack segment
        arg 2 : stack pointer
    HYPERVISOR_fpu_taskswitch
        Sets CR0's TS bit, then does the usual exception through vector 7 if the FPU is accessed,
        *but* it clears the TS bit for you before calling your exception handler

Process (pagetable) switching:
    (use MMUEXT_NEW_BASEPTR below to load CR3)

Pagetable calls:
    HYPERVISOR_mmu_update (ureqs, count) : perform an array of mmu updates
        arg1 : array of updates, each selected by ureqs[i].ptr & 3 =
            MMU_NORMAL_PT_UPDATE - updates an arbitrary pagetable entry
                V1.2: .ptr = top bits give virtual address of pagetable entry to update
                V1.3: .ptr = top bits give machine address of pagetable entry to update
                .val = pagetable entry value to set it to
            MMU_MACHPHYS_UPDATE - updates a READONLY_MPT_VIRT_START table entry
                .ptr = top bits give machine address
                .val = pseudo-physical page number
            MMU_EXTENDED_COMMAND - subcommand in low bits of .val:
                MMUEXT_PIN_L1_TABLE : validate L1 pagetable page
                    .ptr top bits = page's machine address
                MMUEXT_PIN_L2_TABLE : validate L2 pagetable (pagedirectory) page
                    .ptr top bits = page's machine address
                MMUEXT_UNPIN_TABLE : unpin L1 or L2 pt page
                    .ptr top bits = page's machine address
                MMUEXT_NEW_BASEPTR : set up new pagetable (loads CR3)
                    .ptr top bits = pagedirectory page's machine address
                MMUEXT_TLB_FLUSH : flushes all TLB entries (reloads CR3)
                    .ptr top bits = 0
                MMUEXT_INVLPG : flush one TLB entry
                    .ptr top bits = 0
                    .val top bits = virtual address to invalidate
        arg2 : number of updates
    HYPERVISOR_update_va_mapping (virtualpagenumber, entry, flags) : write a single pagetable entry (for such as servicing pagefaults)
        arg1 : virtual page number that the PTE maps, ie, the VA>>12 that faulted
        arg2 : contents to write to the pagetable entry, using machine address
            no translation is performed on arg2 before writing it to the pagetable,
            but it is checked to be sure you are mapping one of your own pages
        arg3 : UVMF_INVLPG : invalidate the page
               UVMF_FLUSH_TLB : reloads CR3, flushing all TLB entries

Disk IO:
    HYPERVISOR_block_io_op (&op);

Network IO:
    HYPERVISOR_net_io_op (&netop);
    HYPERVISOR_network_op (&op);

Memory layout when control is passed to the guest OS:

HYPERVISOR_VIRT_START = 0xFC000000  // virtual addresses beyond this are not modifiable by guest OSes

Within that space, there is a 4MB table of longs at READONLY_MPT_VIRT_START that maps a machine page number to a pseudo-physical page number.

All physical memory requested in the startup is mapped starting at your kernel's load address. So if you asked for 64M, and your kernel loads at 0xC0000000, Xen will start you with memory mapped at 0xC0000000..0xC3FFFFFF. The kernel is loaded at the low end of that memory. Xen puts a pagedirectory page at the high end and the pagetable pages just below that.

When I start my hello world program, the pagetables are set up like this:

    pde[   0] = 05018067   <- there are enough pagedirectory entries to cover the whole 64Meg
    pde[   1] = 094F7067
    pde[   2] = 0D5AE067
    pde[   3] = 0D5AD067
    pde[   4] = 0D5AC067
    pde[   5] = 0D5AB067
    pde[   6] = 0D5AA067
    pde[   7] = 0D5A9067
    pde[   8] = 0D5A8067
    pde[   9] = 0D5A7067
    pde[   A] = 0D5A6067
    pde[   B] = 0D5A5067
    pde[   C] = 0D5A4067
    pde[   D] = 0D5A3067
    pde[   E] = 0D5A2067
    pde[   F] = 0D5A1067
    pde[  10] = 0D5A0067

    (the first 1Meg is unmapped)

    pte[ 100] = 015A1023                  <- this is where my 'hello world' program is loaded
    pte[ 101] = 095B3023
    pte[ 102] = 095B4063
    pte[ 103.. 105] = 095B5023..095B7023
    pte[ 106] = 05018025                  <- this is the pte I use to look at the pagetable pages
    pte[ 107.. 3FF] = 095B9023..098B1023  \
    pte[ 400.. 7FF] = 098B2023..09CB1023  |
    pte[ 800.. BFF] = 09CB2023..0A0B1023  |
    pte[ C00.. FFF] = 0A0B2023..0A4B1023  |
    pte[1000..13FF] = 0A4B2023..0A8B1023  |
    pte[1400..17FF] = 0A8B2023..0ACB1023  |
    pte[1800..1BFF] = 0ACB2023..0B0B1023  |
    pte[1C00..1FFF] = 0B0B2023..0B4B1023  |
    pte[2000..23FF] = 0B4B2023..0B8B1023  |  Here is the bulk of my usable pages
    pte[2400..27FF] = 0B8B2023..0BCB1023  |
    pte[2800..2BFF] = 0BCB2023..0C0B1023  |
    pte[2C00..2FFF] = 0C0B2023..0C4B1023  |
    pte[3000..33FF] = 0C4B2023..0C8B1023  |
    pte[3400..37FF] = 0C8B2023..0CCB1023  |
    pte[3800..3BFF] = 0CCB2023..0D0B1023  |
    pte[3C00..3FFF] = 0D0B2023..0D4B1023  |
    pte[4000..40ED] = 0D4B2023..0D59F023  /
    pte[40EE..40FC] = 0D5A0021..0D5AE021  <- here are pagetable pages for pde[2..100] = VA 800000..403FFFFF
    pte[40FD] = 094F7021                  <- here is the pagetable page for pde[1] = VA 400000..7FFFFF
    pte[40FE] = 05018021                  <- here is the pagetable page for pde[0] = VA 000000..3FFFFF
    pte[40FF] = 03F8A021                  <- this is the pagedirectory page pointed to by si -> pt_base
                                             not on a 4Meg boundary and not self-referencing

** And that covers the 64Meg I asked for in my config file **

There are also entries starting at pde[3F0] (VA FC000000) and up mapping Hypervisor internal data.

I experimented by moving hello world to link and load at VA 0x1000. Xen set up everything starting at 0x1000 and apparently hides the 'hole.' Likewise, when I link at base 0x400000, it leaves the first 4M unmapped and puts everything after that.
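The pde/pte values in the dump above use the standard x86 encoding: the top 20 bits are the (machine) page number and the low 12 bits are flags (0x067 = present + writable + user + accessed + dirty, for instance). A minimal sketch of packing and unpacking such entries (helper names are mine):

```c
#include <stdint.h>

/* low-order flag bits of an x86 pagetable entry */
#define PTE_PRESENT  0x001
#define PTE_WRITABLE 0x002
#define PTE_USER     0x004
#define PTE_ACCESSED 0x020
#define PTE_DIRTY    0x040

/* compose an entry from a machine page number and flag bits */
static uint32_t pte_make (uint32_t machine_pageno, uint32_t flags)
{
  return (machine_pageno << 12) | (flags & 0xFFF);
}

/* extract the machine page number from an entry */
static uint32_t pte_pageno (uint32_t pte)
{
  return pte >> 12;
}
```

So pde[0] = 05018067 above decodes as machine page 0x05018 with flags 0x067.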
This shall be my pseudo-FAQ section. After all, it's like pseudo-physical pages, and how frequently is this stuff asked, really?
How are pseudo-physical page numbers assigned? Xen assigns each domain a range of pseudo-physical page numbers starting at zero through the requested memory size. Then Xen allocates real physical pages for them all and maps them to your virtual machine, virtually contiguous, starting at your kernel's load address up to where it runs out.
How do I translate pseudo-physical page number to its real-world physical (machine) page number? Look it up in the corresponding initial pagetable entry passed to you on initialization. Since Xen maps every page given you, the initial pagetables serve as a list of machine pagenumbers assigned to your virtual machine.
How do I translate a machine page number to its pseudo-physical page number? Xen provides a read-only table (starting at virtual address 0xFC000000=READONLY_MPT_VIRT_START) that maps each machine page number to the corresponding pseudo-physical page number. It has one entry per real machine physical memory page. Theoretically, you could scan the table and find out translations for other VM's machine pages, but it wouldn't be of any use.
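The two translations above can be sketched like this, faking the initial pagetables with a tiny made-up array (all names and values here are mine, for illustration only):

```c
#include <stdint.h>
#include <stddef.h>

#define NPAGES 4

/* stand-in for the initial pagetables: entry i maps pseudo-physical
   page i, and its top 20 bits are the machine page number (values
   made up for illustration) */
static const uint32_t initial_ptes[NPAGES] =
  { 0x015A1023, 0x095B3023, 0x095B4063, 0x095B5023 };

/* pseudo-physical -> machine: just read the initial pagetable entry */
static uint32_t pseudo_to_machine (uint32_t ppage)
{
  return initial_ptes[ppage] >> 12;
}

/* machine -> pseudo-physical: the real thing just indexes the
   read-only table at READONLY_MPT_VIRT_START by machine page number;
   this sketch gets the same answer by inverting initial_ptes[] */
static int machine_to_pseudo (uint32_t mpage)
{
  for (size_t i = 0; i < NPAGES; i ++) {
    if ((initial_ptes[i] >> 12) == mpage) return (int)i;
  }
  return -1;  /* not one of our pages */
}
```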
Should my pagetable entries contain pseudo-physical or real-world machine page numbers? They should contain real-world machine page numbers. Xen performs no translation on the entries you write, it just validates that you own the page being pointed to. This is consistent with you being able to read the pagetable entries directly.
What is this pagetable pinning about? It is used to tell Xen which of your pages are being used for pagetable pages. Apparently you can port an OS without using it. However, each time you tell Xen to load CR3 with a new set of tables, it would have to verify them all to make sure they point only to your pages and no one else's. By pinning them, you are telling Xen that these are pagetable pages and it should verify them this once and assume they will not be changed except via Xen calls. Of course, as with all pages being used for pagetables, you must mark them read-only first!
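A pin request is just an mmu_update_t with the command in the low 2 bits of .ptr and the subcommand in .val, per the catalog above. Here is a sketch of building one; the numeric constant values below are placeholders I am assuming, so take the real ones from your copy of hypervisor-if.h:

```c
#include <stdint.h>

/* placeholder values -- use the definitions from hypervisor-if.h */
#define MMU_EXTENDED_COMMAND 2   /* selected by the low 2 bits of .ptr */
#define MMUEXT_PIN_L2_TABLE  1   /* subcommand in the low bits of .val */

typedef struct { uint32_t ptr; uint32_t val; } mmu_update_t;

/* build a request to pin the pagedirectory page at machine address ma;
   remember the page must already be mapped read-only */
static mmu_update_t make_pin_l2 (uint32_t ma)
{
  mmu_update_t u;
  u.ptr = (ma & ~3u) | MMU_EXTENDED_COMMAND;  /* top bits = machine address */
  u.val = MMUEXT_PIN_L2_TABLE;
  return u;
}
```

An array of such requests is what gets passed to HYPERVISOR_mmu_update along with the count.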
Do I have to provide a GDT? You can just use the segment registers provided by Xen if you want. Xen provides you with ring 1 and ring 3 code and data segments. They have a base of zero and a limit of FC3FFFFF (the area from FC000000..FC3FFFFF is marked read-only). Thus, if you use the flat memory model, you don't have to have any segment register programming in your OS at all, except a GP fault handler that puts the registers right should some nitwit application mess wit' dem, and maybe some code will test CS or SS for ring 1 vs ring 3.
How are exceptions reported? You tell Xen what exceptions you want to handle by calling HYPERVISOR_set_trap_table as part of your initialization code. You give it a list of (vector,dpl,cs,offset) quadruples describing each vector you want to handle. For my OS port, the cs parameter is always FLAT_RING1_CS. dpl is the privilege level you want to give access to this vector, 0=CPU exceptions only; 1=your ring 1 code; 3=usermode code. If you add 4, it will also disable event delivery (but it doesn't save the prior state). offset is the address of your servicing routine. Exception vectors, especially those with error codes, should be declared with dpl zero to prevent usermode code from doing an int instruction to call your routine, as there would be no error code(s) pushed on the stack.
The servicing routine is entered just like the bare CPU chip would do. If the exception is defined with an error code, it will be pushed on the stack just like the CPU chip does it. There are two exceptions:
What horrors await me in handling events? Event handling is fairly straightforward. It enters your handler with a 'bare' stack ready for an iret, just like real hardware would enter an interrupt handler. You process the events that are tagged in shared_info.events, atomically clearing the bits. When finished, restore shared_info.events_mask, restore registers and iret. One thing that got me was that Xen clears both the individual event enable bit for the events it is delivering and the master enable bit. I suppose this might have been an effort to emulate EFLAGS IF and 8259s needing an EOI, but it made my work a little more difficult. Also, there is no hypervisor call to do an iret and restore the mask at the same time, so you must do some messy stuff when exiting your handler to make sure you don't miss an event while also not eating up your kernel stack.
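The 'process the tagged events, atomically clearing the bits' step can be sketched like this, simulating the shared_info.events word with a plain variable and using gcc's __sync builtin for the atomic grab-and-clear (all names here are mine):

```c
#include <stdint.h>

static uint32_t events;   /* simulated shared_info.events word */
static int handled[32];   /* records which events the test handler saw */

static void record_event (int eventno) { handled[eventno] = 1; }

/* grab all pending event bits and clear them in one atomic step
   (Xen may be setting bits from another CPU), then dispatch each
   set bit; returns how many events were handled */
static int process_events (void (*handler) (int eventno))
{
  uint32_t pending = __sync_fetch_and_and (&events, 0);
  int count = 0;
  for (int i = 0; pending != 0; i ++, pending >>= 1) {
    if (pending & 1) { handler (i); count ++; }
  }
  return count;
}
```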
So now the question becomes, how can OZONE be best fit into this? It seems that Xen maps all requested pages to virtual address space when the virtual machine is booted. So if I statically linked the OZONE kernel at, say, 0xC0000000, that would give me a maximum physical memory size of 1024M-64M=960M (minus some amount for OZONE's paged pool and any dynamic images). That seems reasonable for now anyway. VAX/VMS puts its kernel at 0x80000000 so I could do that with OZONE if there is a system with over 768M virtual machine memory to support.
OZONE does not require memory in general to be mapped to VA space, but it seems to be the way Xen works, so be it. The standard x86 OZONE uses a pair (per CPU) of pagetable entries to dynamically access memory by physical address. To do this under XEN would require making a HYPERVISOR_mmu_update call each time, which would slow it way down. So for now, I am going to just leave all physical memory mapped and access it that way.
OZONE's memory manager is based on what it thinks are physical page numbers using an array starting at zero thru the top physical page number - 1. So there are these practical choices to create and index the array:
Another thing we can probably do is get rid of OZONE's loader. Its primary purpose was so someone could alter the startup parameters. Secondarily, someone could copy some installation files before booting the kernel. Since with Xen you get a fully functioning OS (as Domain 0), these functions can be performed there. So our oz_kernel_xen.s is going to receive control directly from the Xen loader. Also, come to think of it, we don't have a console to read interactive loader commands from anyway!
So far, my memory looks like:
Virtual Address

+------------------------------------------------+
|                                                |
|  XEN Hypervisor                                |
|                                                |  <- FC000000 (4M boundary)
+------------------------------------------------+
|                                                |
|  Used for kernel expansion                     |  <- used for dynamic kernel images, paged pool
|                                                |
+------------------------------------------------+
|                                                |  \
|  Remaining free page mapping                   |  |
|                                                |  |
+------------------------------------------------+  |
|                                                |  |
|  System global pagedirectory and table         |  |  all of the VM's physical memory is mapped
|                                                |  |  in here as given by Xen on startup
+------------------------------------------------+  |
|                                                |  |
|  Kernel image as loaded by XEN                 |  |
|                                                |  /  <- C0000000 (4M boundary)
+------------------------------------------------+
|                                                |
|  Per-process stack                             |
|                                                |
+------------------------------------------------+
|                                                |
|  Per-process code and heap                     |
|                                                |  <- 00800000 (4M boundary)
+------------------------------------------------+
|                                                |
|  Per-process pagetable                         |
|                                                |  <- 00400000 (4M boundary)
+------------------------------------------------+
|                                                |
|  "Requested page protection" table             |  (holds page protection bits requested by application)
|                                                |  <- 003C0000
+------------------------------------------------+
|                                                |
|  Per-process "pdata" array                     |  (holds per-process things like user & kernel malloc listhead)
|                                                |  <- 003BE000
+------------------------------------------------+
|                                                |
|  No Access                                     |
|                                                |  <- 00000000
+------------------------------------------------+

I put my per-process pagetables and other stuff at the low end of virtual memory because...
XEN starts you off with the page directory and table pages at the high-end of virtual memory. I suppose I could have used them there as is, except I don't know if XEN guarantees it will always put them there and if they will always be there in order. So I swap them around (just change the mapping, using the same physical pages). I put the directory immediately after the kernel image followed by the pagetable pages. Then I follow it by some pages of zeroes to pad out to the highest system pagetable pages I want (to map the "Used for kernel expansion" area).
Padding it with the zeroes also makes the upper portion of the page directory static, so all my page directories will have the same upper-end contents forever. I don't have to worry about keeping all the per-process directories updated.
I use spinlock levels 0xA0 through 0xBE to correspond to events 0 through 30. So when I set the spinlock level to 0xA0, it blocks event 0. When it is set to 0xA1 it blocks events 0 and 1; at level 0xA2 it blocks 0, 1 and 2, etc. Any level below 0xA0 will not block any event deliveries, any level at or above 0xBE will block all event deliveries. This is analogous to my using levels 0xE0..0xEF for the irq levels in the bare hardware x86 version of OZONE.
So when I set spinlock 0xA0, I clear mask bit <0>, and the mask is 0x?FFFFFFE. Setting spinlock 0xA9 sets the mask to 0x?FFFFC00. The top bit (master enable), is independent of spinlock level, as in OZONE, you can have spinlocks either with or without hardware interrupts being enabled. So the mask at level 0xA0 might be either 0x7FFFFFFE (master enable off) or 0xFFFFFFFE (master enable on).
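The level-to-mask computation just described can be written as a little function (the name is mine; it assumes the 32-bit mask with bit 31 as the master enable, as described):

```c
#include <stdint.h>

/* events_mask for a given spinlock level: levels 0xA0+i block events
   0..i; below 0xA0 nothing is blocked, at or above 0xBE everything is;
   bit 31 is the master enable and rides along independently */
static uint32_t level_to_mask (uint32_t level, int master_enable)
{
  uint32_t mask;
  if (level < 0xA0) mask = 0x7FFFFFFF;                         /* no events blocked */
  else if (level >= 0xBE) mask = 0;                            /* all events blocked */
  else mask = ~((1u << (level - 0xA0 + 1)) - 1) & 0x7FFFFFFF;  /* block events 0..level-0xA0 */
  if (master_enable) mask |= 0x80000000u;
  return mask;
}
```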
DOM0: (file=memory.c, line=331) Page 01759000 bad type/count (02000000!=01000000) cnt=1

But it turns out that the hypervisor self-corrects this 'error' and pins the page anyway, so it was actually working all along! I also get what I assume is the complementary error when unpinning the page:
DOM0: (file=memory.c, line=367) Bad page type/domain (dom=0) (type 33554432 != expected 16777216)

In both cases, the HYPERVISOR_mmu_update call returns a success status (zero). Also, since I am subsequently able to write-enable the page, I am assuming for now that Xen is actually unpinning it.
My startup sequence looks like:
In V1.2, disk IO is performed by placing requests in a ring buffer then signalling the hypervisor to process them. The hypervisor then replaces the requests with responses and signals their presence by calling your asynchronous event handler with the BLKDEV bit set in events.
The first thing you must do is reset the ring addresses and map the ring buffer to your virtual address space. There is just one ring buffer that is shared among all the virtual disks. To get its machine address:
op.cmd = BLOCK_IO_OP_RESET;
rc = HYPERVISOR_block_io_op (&op);
if (rc != 0) {
  oz_knl_printk ("oz_dev_xendisk_init: error %d resetting ring\n", rc);
  return;
}

op.cmd = BLOCK_IO_OP_RING_ADDRESS;
HYPERVISOR_block_io_op (&op);
machine_address = op.u.ring_mfn << 12;

Do whatever calls in your OS to get an unassigned system pagetable entry suitable for mapping (like you would for accessing memory-mapped IO registers in a real system), then call HYPERVISOR_mmu_update to map it to the machine address of the ring buffer.
Next, you need to find out what virtual disks are set up for you to use in your virtual machine.
memset (&op, 0, sizeof op);
op.cmd = BLOCK_IO_OP_VBD_PROBE;
op.u.probe_params.domain = 0;
op.u.probe_params.xdi.max = MAX_VBDS;
op.u.probe_params.xdi.disks = vbd_info;
op.u.probe_params.xdi.count = 0;
rc = HYPERVISOR_block_io_op (&op);
if (rc != 0) {
  oz_knl_printk ("oz_dev_xendisk_init: error %d probing number of vbds\n", rc);
  return;
}
number_of_defined_disks = op.u.probe_params.xdi.count;

Then loop through the vbd_info array to get the info about each disk. The only two elements I needed were:
There are two ring array indices provided in the ring struct that you mapped:
resp_cons <= resp_prod <= req_prod
When all three indices are equal, it means the ring is empty. You must not increment req_prod all the way 'round to resp_cons, or a full ring would look the same as an empty one, so you must always leave at least one empty slot in the ring. When inserting requests or removing responses, be sure to place memory barriers between accessing the indices and accessing the contents of the slots, as Xen may be accessing them from another CPU:
while (there is a request to queue) {
  indx = blk_ring -> req_prod;
  if (((indx + BLK_RING_SIZE - resp_cons) % BLK_RING_SIZE) >= BLK_RING_SIZE - 1) break;
  ... fill in blk_ring -> ring[indx].req with the request ...
  MB to make sure the hypervisor will see a valid blk_ring -> ring[indx].req
  if (++ indx == BLK_RING_SIZE) indx = 0;
  blk_ring -> req_prod = indx;
}
while (blk_ring -> resp_prod != resp_cons) {
  MB to make sure blk_ring -> ring[resp_cons].resp is valid
  ... read the response from blk_ring -> ring[resp_cons].resp ...
  if (++ resp_cons == BLK_RING_SIZE) resp_cons = 0;
}
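The empty and full tests implied by those loops can be isolated into two small helpers (names are mine; the BLK_RING_SIZE value here is illustrative, use the one from hypervisor-if.h):

```c
/* illustrative size -- use the BLK_RING_SIZE from hypervisor-if.h */
#define BLK_RING_SIZE 64

/* ring is empty when the producer has caught up with the consumer */
static int ring_empty (int resp_cons, int req_prod)
{
  return req_prod == resp_cons;
}

/* ring is full when one more request would wrap req_prod around onto
   resp_cons -- we deliberately waste one slot so full != empty */
static int ring_full (int resp_cons, int req_prod)
{
  return ((req_prod + BLK_RING_SIZE - resp_cons) % BLK_RING_SIZE) >= BLK_RING_SIZE - 1;
}
```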
When you have placed some requests in the ring and have incremented req_prod, do:
op.cmd = BLOCK_IO_OP_SIGNAL;
rc = HYPERVISOR_block_io_op (&op);
if (rc != 0) oz_knl_printk ("oz_dev_xendisk startreq: error %d signalling\n", rc);

The hypervisor will signal the BLKDEV event as each request completes.
My driver is in oz_dev_xendisk.c. There are probably just a few routines of general interest:
Like disk IO, network IO is performed by placing requests in a ring buffer then signalling the hypervisor to process them. The hypervisor then replaces the requests with responses and signals their presence by calling your asynchronous event handler with the EVENT_NET bit set in events.
Unlike disk IO, however, there is a separate ring for each virtual network interface your domain can access, and the receive and transmit rings are separated. This makes sense from the standpoint that if you have one very active interface and one relatively inactive one, you wouldn't want requests from the inactive interface interfering with requests from the active one and vice versa.
Each interface operates independently. So you must set up a probing loop to check all interfaces. There is a symbol MAX_DOMAIN_VIFS set up that you can use to terminate your probing loop. The loop should do a NETOP_RESET_RINGS call followed by a NETOP_GET_VIF_INFO and accept the device if both calls are successful (return zero status). You can retrieve the 6-byte ethernet virtual hardware address from element netop.u.get_vif_info and build your OS-dependent device table from there.
You will need a pagetable entry for each virtual device, to map its ring buffer. The ring buffer's machine pagenumber is returned in netop.u.get_vif_info.ring_mfn after each probe; map an unused pagetable entry to this page using HYPERVISOR_mmu_update or HYPERVISOR_update_va_mapping.
You will also need one page per receive queued to a virtual device. So if you intend to keep 10 receives per virtual device in the receive ring at all times, you will need to allocate 10 pages per virtual device. Each of these pages must have exactly one read/write pagetable entry pointing to them or Xen will reject the entry. I think this is because Xen takes it away from you when you queue the request then may give you back a different page when the request completes. So don't assume you get the same page back. But Xen remaps the new page to your same old pagetable entry, so you don't really know the difference. For OZONE, I simply give it the initial mapping pagetable entry that Xen provided on startup.
The rings work pretty much like the disk queues, except that they are split up into separate rings. There is also another index you must contend with, rx_event or tx_event. With these indices, you tell Xen how often to deliver the EVENT_NET event. The event is delivered when tx_resp_prod is incremented and becomes equal to tx_event (likewise for the rx indices). Suffice it to say that, if you have pending requests, you want to be sure that at some point, you set xx_event greater than xx_resp_prod to be sure Xen will queue an event.
Anyway, the conceptual order of the indices is (mod XX_RING_SIZE):
xx_resp_cons <= xx_resp_prod <= xx_event <= xx_req_prod

Be sure to include memory barriers where appropriate, as Xen may be munching away on your ring with another CPU whilst you are inserting or removing items. The addition of the xx_event indices complicates things a bit as you don't want to miss an event delivery.
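The xx_event rule can be captured in a couple of lines: to guarantee the next completion raises EVENT_NET, park xx_event one ahead of xx_resp_prod. A sketch with made-up names and ring size:

```c
/* illustrative size -- use the RX/TX ring sizes from hypervisor-if.h */
#define RING_SIZE 256

/* value to store in xx_event so the very next response Xen produces
   triggers an EVENT_NET delivery */
static int next_event_index (int resp_prod)
{
  return (resp_prod + 1) % RING_SIZE;
}

/* Xen delivers the event when resp_prod, after being incremented,
   equals xx_event */
static int event_fires (int new_resp_prod, int event)
{
  return new_resp_prod == event;
}
```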
When you place new requests in either the transmit or receive ring, you must call HYPERVISOR_net_op with a function code of NETOP_PUSH_BUFFERS to tell Xen to start working on your new requests.
My driver is in oz_dev_xenetwork.c.
As of Aug 8, 2004, it boots, starts my cli (shell) and runs the startup script. Now I have to implement the network virtual device driver and I 'should' be able to telnet to it.
Here is a list of my Xen-specific source files so far (updated Aug 16, 2004):
 __  __            _   ____
 \ \/ /___ _ __   / | |___ \
  \  // _ \ '_ \  | |   __) |
  /  \  __/ | | | | |_ / __/
 /_/\_\___|_| |_| |_(_)_____|

 http://www.cl.cam.ac.uk/netos/xen
 University of Cambridge Computer Laboratory

Xen version 1.2 (m@n.n) (gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-98)) Thu Aug 5 15:44:10 EDT 2004
Initialised all memory on a 320MB machine
Reading BIOS drive-info tables at 0x9fd90 and 0x9fda0
CPU0: Before vendor init, caps: 0000aa19 00000000 00000000, vendor = 0
CPU caps: 0000aa19 00000000 00000000 00000000
Initialising domains
Initialising schedulers
Initializing CPU#0
Detected 2.383 MHz processor.
Found and enabled local APIC!
CPU0: Before vendor init, caps: 0000aa19 00000000 00000000, vendor = 0
CPU caps: 0000aa19 00000000 00000000 00000000
CPU0 booted
SMP motherboard not detected.
Emmy_X86::lapicwrite: illegal write register 0x030, data 00FB00EF, eip 0808:FC623000
Emmy_X86::lapicwrite: illegal write register 0x020, data 0000000F, eip 0808:FC623000
enabled ExtINT on CPU#0
Emmy_X86::lapicwrite: illegal write register 0x280, data 00000000, eip 0808:FC623000
ESR value before enabling vector: 00000000
Emmy_X86::lapicwrite: illegal write register 0x280, data 00000000, eip 0808:FC623000
ESR value after enabling vector: 00000000
Using local APIC timer interrupts.
Calibrating APIC timer for CPU0...
..... CPU speed is 2.3787 MHz.
..... Bus speed is 1.1781 MHz.
..... bus_scale = 0x00000134
ACT: Initialising Accurate timers
Time init:
.... System Time: 10615906ns
.... cpu_freq: 00000000:00245E78
.... scale: 000001A3:8DFA5203
.... Wall Clock: 1092057892s 0us
Start schedulers
PCI: PCI BIOS revision 2.10 entry at 0xf33b0, last bus=0
PCI: Probing PCI hardware
PCI: device 00:00.0 has unknown header type 7f, ignoring.
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx
hda: xenodisk, ATA DISK drive
hdb: hda3, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 6640704 sectors (3400 MB), CHS=6588/16/63 PIO (slow!)
hdb: 417690 sectors (214 MB), CHS=442/15/63 PIO (slow!)
SCSI subsystem driver Revision: 1.00
Red Hat/Adaptec aacraid driver (1.1.2 Aug 3 2004 09:37:26)
Device dummy opened and ready for use.
DOM0: Guest OS virtual load address is c0000000
DOM0: Guest OS virtual stack address is c3fee000
DOM0: oz_hwxen_start: initializing as domain 0
DOM0: (file=memory.c, line=331) Page 0558f000 bad type/count (02000000!=01000000) cnt=2
DOM0: oz_hwxen_start*: initial event 0x102
DOM0: oz_hwxen_start*: initial events_mask 0x0
DOM0: oz_hwxen_start: CPU frequency 2383480 Hz
DOM0: oz_hwxen_event_vbd_upd*:
DOM0: oz_hwxen_start: boot time 2004-08-09@13:24:56.1475021z
DOM0: oz_ldr_set: parameter signature cannot be changed
DOM0: Copyright (C) 2001,2002,2003,2004 Mike Rieker, Beverly, MA USA
DOM0: Version 2004-01-03, OZONE comes with ABSOLUTELY NO WARRANTY
DOM0: EXPECT it to FAIL when someone's HeALTh or PROpeRTy is at RISk
DOM0:
DOM0: oz_knl_boot_firstcpu:
DOM0:   total number of cpus: 1
DOM0:   page size (in bytes): 4096
DOM0:   total pages of physical memory: 0x4000 (64 Megabytes)
DOM0:   system base virtual address: 0xc0000000
DOM0:   system page table entries: 0x10000 (256 Megabytes)
DOM0:   initial non-paged pool size: 0x400000 (4096 Kilobytes)
DOM0:   first free virt page: 0xC4000
DOM0:   first free phys page: 0xC0
DOM0:
DOM0: oz_knl_debug 0: initialized (cb 0xc0052940, cp 0x0, dc 0xc0079860)
DOM0: oz_knl_boot_firstcpu: initializing physical memory
DOM0: oz_knl_phymem_init: cache modulo: L1 1 page (4 K), L2 1 page (4 K)
DOM0: oz_knl_phymem_init: 1296 pages required for phys mem state table and non-paged pool
DOM0: oz_hw_pool_init: 0x510 pages, ppage 0x3AF0, vpage 0xC3AF0
DOM0: oz_knl_phymem_init: physical memory state array at vaddr 0xc3af0000, phypage 0x3AF0
DOM0: oz_knl_phymem_init: initial non-paged pool size 4282432 (4182 K), base 0xc3bea7c0
DOM0: oz_knl_phymem_init: there are 0x3A30 free pages left (58 Meg)
DOM0: oz_knl_boot_firstcpu: initializing modules
DOM0: oz_knl_idno_init: max 256 at 0xc3beb0a8
DOM0: oz_knl_boot_firstcpu: creating system process
DOM0: oz_knl_user_create: user OZ_Startup logged on at 2004-08-09@13:24:57.7618505z
DOM0: oz_knl_thread_cpuinit: cpu 0 initialization complete
DOM0: oz_knl_boot_firstcpu: defining logical names
DOM0: oz_knl_boot_firstcpu: starting device drivers
DOM0:   oz_dev_timer_init
DOM0:   oz_dev_vdfs_init (oz_dfs)
DOM0:   oz_dev_knlcons_init
DOM0:   oz_dev_xendisk_init
DOM0: oz_dev_xendisk: xenhd_0 totalblocks 6640704
DOM0: oz_dev_xendisk: xenhd_1 totalblocks 417690
DOM0: oz_knl_boot_firstcpu: device driver init complete
DOM0:   console _console1: console via oz_hw_putcon and oz_hw_getcon
DOM0:   xenhd_0 _disk1 : virtual hardisk 0x300
DOM0:   xenhd_1 _disk2 : virtual hardisk 0x340
DOM0:   oz_dfs _fs1 : init and mount template
DOM0:   timer _timer1 : generic timer
DOM0: oz_knl_boot_firstcpu: creating startup process
DOM0: oz_knl_startup: mounting xenhd_1 via oz_dfs
DOM0: oz_dev_dfs: volume oz_dfs mounted at 2004-08-09@13:11:17.8527305z was not dismounted
DOM0: oz_hw_process_initctx*: ppdsa C01C0000, ppdma 01750000
DOM0: (file=memory.c, line=331) Page 01750000 bad type/count (02000000!=01000000) cnt=1
DOM0: oz_hwxen_pte_write*: pin ma 01751C07 for vpn 00400 (oldpte 00000000)
DOM0:
DOM0: *** Reading and validating home block
DOM0:
DOM0: *** Opening sacred files
DOM0:
DOM0: *** Reading bitmaps
DOM0:
DOM0: *** Scanning file headers
DOM0:
DOM0: *** Checking extension header links
DOM0:
DOM0: *** Checking directories
DOM0:
DOM0: *** Writing bitmaps
DOM0: oz_hwxen_pte_write*: unpin ma 01751C67 for vpn 00400 (newpte 00000000)
DOM0: oz_hwxen_pinpdpage*: unpin ma 01750000
DOM0: (file=memory.c, line=367) Bad page type/domain (dom=0) (type 33554432 != expected 16777216)
DOM0:
oz_hw_kstack_delete*: kernel stack depth 1864 (0x748) DOM0: oz_knl_startup: volume mounted on device xenhd_1.oz_dfs DOM0: OZ_SYSTEM_DIRECTORY (kernel) (ref:1) (table) DOM0: OZ_DEFAULT_TBL (kernel) (ref:1) = 'OZ_PROCESS_TABLE' 'OZ_PARENT_TABLE' 'OZ_JOB_TABLE' 'OZ _USER_TABLE' 'OZ_SYSTEM_TABLE' DOM0: OZ_SYSTEM_TABLE (kernel) (ref:1) (table) DOM0: OZ_DEFAULT_DIR (kernel) (ref:0) = 'xenhd_1.oz_dfs:/ozone/binaries/' (terminal) DOM0: OZ_IMAGE_DIR (kernel) (ref:0) = 'xenhd_1.oz_dfs:/ozone/binaries/' (terminal) DOM0: OZ_LOAD_DEV (kernel) (ref:0) = 'xenhd_1' (terminal) DOM0: OZ_LOAD_DIR (kernel) (ref:0) = 'xenhd_1.oz_dfs:/ozone/binaries/' (terminal) DOM0: OZ_LOAD_FS (kernel) (ref:0) = 'xenhd_1.oz_dfs:' (terminal) DOM0: OZ_SYSTEM_PROCESS (nosupersede) (nooutermode) (kernel) (ref:1) = 'C3BECD28:process' (ob ject) DOM0: oz_knl_startup: loading kernel image (oz_kernel_xen.elf) symbol table DOM0: oz_knl_startup: spawning startup process DOM0: oz_hw_process_initctx*: ppdsa C02C9000, ppdma 01859000 DOM0: (file=memory.c, line=331) Page 01859000 bad type/count (02000000!=01000000) cnt=1 DOM0: oz_hwxen_pte_write*: pin ma 0185AC07 for vpn 00400 (oldpte 00000000) DOM0: oz_knl_startup: startup process spawned DOM0: oz_hwxen_pte_write*: pin ma 0195DC07 for vpn 006FF (oldpte 00000000) DOM0: oz_hwxen_pte_write*: pin ma 01961C07 for vpn 00420 (oldpte 00000000) DOM0: oz_hwxen_pte_write*: pin ma 01970C07 for vpn 00402 (oldpte 00000000) DOM0: params: executing startup procedure ... 
DOM0: # DOM0: # Define oz_cli's external commands DOM0: # DOM0: create logical table -kernel OZ_SYSTEM_DIRECTORY%OZ_CLI_TABLES DOM0: create logical name -kernel OZ_CLI_TABLES%cli oz_cli.elf DOM0: create logical name -kernel OZ_CLI_TABLES%cat oz_util_cat.elf DOM0: create logical name -kernel OZ_CLI_TABLES%copy oz_util_copy.elf DOM0: create logical name -kernel OZ_CLI_TABLES%crash oz_util_crash.elf DOM0: create logical name -kernel OZ_CLI_TABLES%credir oz_util_credir.elf DOM0: create logical name -kernel OZ_CLI_TABLES%dd oz_util_dd.elf DOM0: create logical name -kernel OZ_CLI_TABLES%debug oz_util_debug.elf DOM0: create logical name -kernel OZ_CLI_TABLES%delete oz_util_delete.elf DOM0: create logical name -kernel OZ_CLI_TABLES%dir oz_util_dir.elf DOM0: create logical name -kernel OZ_CLI_TABLES%dism oz_util_dismount.elf DOM0: create logical name -kernel OZ_CLI_TABLES%dump oz_util_dump.elf DOM0: create logical name -kernel OZ_CLI_TABLES%edt edt.elf DOM0: create logical name -kernel OZ_CLI_TABLES%elfconv oz_util_elfconv.elf DOM0: create logical name -kernel OZ_CLI_TABLES%format oz_util_diskfmt.elf DOM0: create logical name -kernel OZ_CLI_TABLES%gunzip oz_util_gzip.elf DOM0: create logical name -kernel OZ_CLI_TABLES%gzip oz_util_gzip.elf DOM0: create logical name -kernel OZ_CLI_TABLES%init oz_util_init.elf DOM0: create logical name -kernel OZ_CLI_TABLES%ip oz_util_ip.elf DOM0: create logical name -kernel OZ_CLI_TABLES%ldelf oz_util_ldelf32.elf DOM0: create logical name -kernel OZ_CLI_TABLES%make oz_util_make.elf DOM0: create logical name -kernel OZ_CLI_TABLES%mount oz_util_mount.elf DOM0: create logical name -kernel OZ_CLI_TABLES%partition oz_util_partition.elf DOM0: create logical name -kernel OZ_CLI_TABLES%purge oz_util_delete.elf DOM0: create logical name -kernel OZ_CLI_TABLES%rename oz_util_copy.elf DOM0: create logical name -kernel OZ_CLI_TABLES%scsi oz_util_scsi.elf DOM0: create logical name -kernel OZ_CLI_TABLES%shutdown oz_util_shutdown.elf DOM0: create logical 
name -kernel OZ_CLI_TABLES%sort oz_util_sort.elf DOM0: create logical name -kernel OZ_CLI_TABLES%tailf oz_util_tailf.elf DOM0: create logical name -kernel OZ_CLI_TABLES%telnet oz_util_telnet.elf DOM0: create logical name -kernel OZ_CLI_TABLES%top oz_util_top.elf DOM0: create logical name -kernel OZ_CLI_TABLES%type oz_util_cat.elf DOM0: # DOM0: # Create OZ_ROOT_DIR logical name (parent of kernel image directory) DOM0: # DOM0: create symbol -string def_dir 'oz_lnm_string (oz_lnm_lookup ("OZ_DEFAULT_DIR", "user"), 0)' DOM0: create symbol -string load_dir 'oz_lnm_string (oz_lnm_lookup ("OZ_SYSTEM_TABLE%OZ_LOAD_DIR", "kernel"), 0)' DOM0: set default {load_dir} DOM0: set default ../ DOM0: create symbol -string root_dir 'oz_lnm_string (oz_lnm_lookup ("OZ_DEFAULT_DIR", "user"), 0)' DOM0: set default {def_dir} DOM0: create logical name -kernel OZ_SYSTEM_TABLE%OZ_ROOT_DIR -terminal {root_dir} DOM0: # DOM0: # Add current directory to image path DOM0: # DOM0: create logical name OZ_SYSTEM_TABLE%OZ_IMAGE_DIR OZ_DEFAULT_DIR: -copy OZ_IMAGE_DIR DOM0: # DOM0: # Set up timezone DOM0: # DOM0: create logical name -kernel OZ_SYSTEM_TABLE%OZ_TIMEZONE_DIR {root_dir}timezones/ DOM0: set timezone EST5EDT DOM0: oz_cli: error 28 setting timezone to EST5EDT DOM0: # DOM0: # Define image used to log in and the password file DOM0: # DOM0: create logical name -kernel OZ_SYSTEM_TABLE%OZ_PASSWORD_FILE -terminal {root_dir}startup/pass word.dat DOM0: create logical name -kernel OZ_SYSTEM_TABLE%OZ_LOGON_IMAGE oz_util_logon.elf DOM0: create logical name OZ_SYSTEM_TABLE%OZ_UTIL_LOGON_MSG "OZONE backup server system" "Authorize d access only" DOM0: # DOM0: # Declare debugger executable DOM0: # To activate debugger for a program, use -debug option before the command name DOM0: # DOM0: create logical name OZ_SYSTEM_TABLE%OZ_DEBUG_IMAGE oz_util_debug.elf DOM0: oz_hwxen_pte_write*: unpin ma 0185AC67 for vpn 00400 (newpte 00000000) DOM0: oz_hwxen_pte_write*: unpin ma 01970C67 for vpn 00402 (newpte 00000000) 
DOM0: oz_hwxen_pte_write*: unpin ma 01961C67 for vpn 00420 (newpte 00000000) DOM0: oz_hwxen_pte_write*: unpin ma 0195DC67 for vpn 006FF (newpte 00000000) DOM0: oz_hwxen_pinpdpage*: unpin ma 01859000 DOM0: (file=memory.c, line=367) Bad page type/domain (dom=0) (type 33554432 != expected 16777216) DOM0: oz_hw_kstack_delete*: kernel stack depth 2324 (0x914)
image = "/root/oz_kernel_xen.gz"
ramdisk = ""
mem_size = 32
vbd_list = [ ('phy:hdb3', 'hda', 'w') ]
vbd_expert = 1
cmdline_root = "load_device=xenhd_0"

The commands used to start it were:
xen_nat_enable
xen_read_console &
xc_dom_create.py -D vmid=3 -f /etc/xc/vm3

And the output was:
[root@xenophilia xc]# xc_dom_create.py -D vmid=3 -f /etc/xc/vm3 Parsing config file '/etc/xc/vm3' VM image : "/root/oz_kernel_xen.gz" VM ramdisk : "" VM memory (MB) : "32" VM IP address(es) : "192.168.0.154" VM block device(s) : "phy:hdb3,hda,w" VM cmdline : " load_device=xenhd_0 " Warning: one or more hard disk extents writeable by one domain are also readable by another. [7] oz_hwxen_start: initializing as domain 7 [7] oz_hwxen_start*: initial event 0x102 [7] oz_hwxen_start*: initial events_mask 0x0 [7] oz_hwxen_start: CPU frequency 350800800 Hz VM started in domain 7 [7] oz_hwxen_event_vbd_upd*: [7] oz_hwxen_start: boot time 2004-08-09@16:42:54.3775346z [7] oz_ldr_set: parameter signature cannot be changed [7] Copyright (C) 2001,2002,2003,2004 Mike Rieker, Beverly, MA USA [7] Version 2004-01-03, OZONE comes with ABSOLUTELY NO WARRANTY [root@xenophilia xc]# [7] EXPECT it to FAIL when someone's HeALTh or PROpeRTy is at RISk [7] [7] oz_knl_boot_firstcpu: [7] total number of cpus: 1 [7] page size (in bytes): 4096 [7] total pages of physical memory: 0x2000 (32 Megabytes) [7] system base virtual address: 0xc0000000 [7] system page table entries: 0x10000 (256 Megabytes) [7] initial non-paged pool size: 0x400000 (4096 Kilobytes) [7] first free virt page: 0xC2000 [7] first free phys page: 0xC0 [7] [7] oz_knl_debug 0: initialized (cb 0xc0052940, cp 0x0, dc 0xc0079860) [7] oz_knl_boot_firstcpu: initializing physical memory [7] oz_knl_phymem_init: cache modulo: L1 1 page (4 K), L2 1 page (4 K) [7] oz_knl_phymem_init: 1160 pages required for phys mem state table and non-paged pool [7] oz_hw_pool_init: 0x488 pages, ppage 0x1B78, vpage 0xC1B78 [7] oz_knl_phymem_init: physical memory state array at vaddr 0xc1b78000, phypage 0x1B78 [7] oz_knl_phymem_init: initial non-paged pool size 4273184 (4173 K), base 0xc1becbe0 [7] oz_knl_phymem_init: there are 0x1AB8 free pages left (26 Meg) [7] oz_knl_boot_firstcpu: initializing modules [7] oz_knl_idno_init: max 256 at 0xc1bed4c8 [7] 
oz_knl_boot_firstcpu: creating system process [7] oz_knl_user_create: user OZ_Startup logged on at 2004-08-09@16:42:54.5777125z [7] oz_knl_thread_cpuinit: cpu 0 initialization complete [7] oz_knl_boot_firstcpu: defining logical names [7] oz_knl_boot_firstcpu: starting device drivers [7] oz_dev_timer_init [7] oz_dev_vdfs_init (oz_dfs) [7] oz_dev_knlcons_init [7] oz_dev_xendisk_init [7] oz_dev_xendisk: xenhd_0 totalblocks 417690 [7] oz_knl_boot_firstcpu: device driver init complete [7] console _console1: console via oz_hw_putcon and oz_hw_getcon [7] xenhd_0 _disk1 : virtual hardisk 0x300 [7] oz_dfs _fs1 : init and mount template [7] timer _timer1 : generic timer [7] oz_knl_boot_firstcpu: creating startup process [7] oz_knl_startup: mounting xenhd_0 via oz_dfs [7] oz_dev_dfs: volume oz_dfs mounted at 2004-08-09@13:25:04.2861632z was not dismounted [7] oz_hw_process_initctx*: ppdsa C01C0000, ppdma 0F3EF000 [7] oz_hwxen_pte_write*: pin ma 0F3EEC07 for vpn 00400 (oldpte 00000000) [7] [7] *** Reading and validating home block [7] [7] *** Opening sacred files [7] [7] *** Reading bitmaps [7] [7] *** Scanning file headers [7] [7] *** Checking extension header links [7] [7] *** Checking directories [7] [7] *** Writing bitmaps [7] oz_hwxen_pte_write*: unpin ma 0F3EEC67 for vpn 00400 (newpte 00000000) [7] oz_hwxen_pinpdpage*: unpin ma 0F3EF000 [7] oz_hw_kstack_delete*: kernel stack depth 1740 (0x6cc) [7] oz_knl_startup: volume mounted on device xenhd_0.oz_dfs [7] OZ_SYSTEM_DIRECTORY (kernel) (ref:1) (table) [7] OZ_DEFAULT_TBL (kernel) (ref:1) = 'OZ_PROCESS_TABLE' 'OZ_PARENT_TABLE' 'OZ_JOB_TABLE' 'OZ_USER_TABLE' 'OZ_SYSTEM_TABLE' [7] OZ_SYSTEM_TABLE (kernel) (ref:1) (table) [7] OZ_DEFAULT_DIR (kernel) (ref:0) = 'xenhd_0.oz_dfs:/ozone/binaries/' (terminal) [7] OZ_IMAGE_DIR (kernel) (ref:0) = 'xenhd_0.oz_dfs:/ozone/binaries/' (terminal) [7] OZ_LOAD_DEV (kernel) (ref:0) = 'xenhd_0' (terminal) [7] OZ_LOAD_DIR (kernel) (ref:0) = 'xenhd_0.oz_dfs:/ozone/binaries/' (terminal) [7] 
OZ_LOAD_FS (kernel) (ref:0) = 'xenhd_0.oz_dfs:' (terminal) [7] OZ_SYSTEM_PROCESS (nosupersede) (nooutermode) (kernel) (ref:1) = 'C1BEF148:process' (object) [7] oz_knl_startup: loading kernel image (oz_kernel_xen.elf) symbol table [7] oz_knl_startup: spawning startup process [7] oz_hw_process_initctx*: ppdsa C02C9000, ppdma 0F2E6000 [7] oz_hwxen_pte_write*: pin ma 0F2E5C07 for vpn 00400 (oldpte 00000000) [7] oz_knl_startup: startup process spawned [7] oz_hwxen_pte_write*: pin ma 0F1E2C07 for vpn 006FF (oldpte 00000000) [7] oz_hwxen_pte_write*: pin ma 0F1DEC07 for vpn 00420 (oldpte 00000000) [7] oz_hwxen_pte_write*: pin ma 0F1CFC07 for vpn 00402 (oldpte 00000000) [7] params: executing startup procedure ... [7] # [7] # Define oz_cli's external commands [7] # [7] create logical table -kernel OZ_SYSTEM_DIRECTORY%OZ_CLI_TABLES [7] create logical name -kernel OZ_CLI_TABLES%cli oz_cli.elf [7] create logical name -kernel OZ_CLI_TABLES%cat oz_util_cat.elf [7] create logical name -kernel OZ_CLI_TABLES%copy oz_util_copy.elf [7] create logical name -kernel OZ_CLI_TABLES%crash oz_util_crash.elf [7] create logical name -kernel OZ_CLI_TABLES%credir oz_util_credir.elf [7] create logical name -kernel OZ_CLI_TABLES%dd oz_util_dd.elf [7] create logical name -kernel OZ_CLI_TABLES%debug oz_util_debug.elf [7] create logical name -kernel OZ_CLI_TABLES%delete oz_util_delete.elf [7] create logical name -kernel OZ_CLI_TABLES%dir oz_util_dir.elf [7] create logical name -kernel OZ_CLI_TABLES%dism oz_util_dismount.elf [7] create logical name -kernel OZ_CLI_TABLES%dump oz_util_dump.elf [7] create logical name -kernel OZ_CLI_TABLES%edt edt.elf [7] create logical name -kernel OZ_CLI_TABLES%elfconv oz_util_elfconv.elf [7] create logical name -kernel OZ_CLI_TABLES%format oz_util_diskfmt.elf [7] create logical name -kernel OZ_CLI_TABLES%gunzip oz_util_gzip.elf [7] create logical name -kernel OZ_CLI_TABLES%gzip oz_util_gzip.elf [7] create logical name -kernel OZ_CLI_TABLES%init oz_util_init.elf [7] 
create logical name -kernel OZ_CLI_TABLES%ip oz_util_ip.elf [7] create logical name -kernel OZ_CLI_TABLES%ldelf oz_util_ldelf32.elf [7] create logical name -kernel OZ_CLI_TABLES%make oz_util_make.elf [7] create logical name -kernel OZ_CLI_TABLES%mount oz_util_mount.elf [7] create logical name -kernel OZ_CLI_TABLES%partition oz_util_partition.elf [7] create logical name -kernel OZ_CLI_TABLES%purge oz_util_delete.elf [7] create logical name -kernel OZ_CLI_TABLES%rename oz_util_copy.elf [7] create logical name -kernel OZ_CLI_TABLES%scsi oz_util_scsi.elf [7] create logical name -kernel OZ_CLI_TABLES%shutdown oz_util_shutdown.elf [7] create logical name -kernel OZ_CLI_TABLES%sort oz_util_sort.elf [7] create logical name -kernel OZ_CLI_TABLES%tailf oz_util_tailf.elf [7] create logical name -kernel OZ_CLI_TABLES%telnet oz_util_telnet.elf [7] create logical name -kernel OZ_CLI_TABLES%top oz_util_top.elf [7] create logical name -kernel OZ_CLI_TABLES%type oz_util_cat.elf [7] # [7] # Create OZ_ROOT_DIR logical name (parent of kernel image directory) [7] # [7] create symbol -string def_dir 'oz_lnm_string (oz_lnm_lookup ("OZ_DEFAULT_DIR", "user"), 0)' [7] create symbol -string load_dir 'oz_lnm_string (oz_lnm_lookup ("OZ_SYSTEM_TABLE%OZ_LOAD_DIR", "kernel"), 0)' [7] set default {load_dir} [7] set default ../ [7] create symbol -string root_dir 'oz_lnm_string (oz_lnm_lookup ("OZ_DEFAULT_DIR", "user"), 0)' [7] set default {def_dir} [7] create logical name -kernel OZ_SYSTEM_TABLE%OZ_ROOT_DIR -terminal {root_dir} [7] # [7] # Add current directory to image path [7] # [7] create logical name OZ_SYSTEM_TABLE%OZ_IMAGE_DIR OZ_DEFAULT_DIR: -copy OZ_IMAGE_DIR [7] # [7] # Set up timezone [7] # [7] create logical name -kernel OZ_SYSTEM_TABLE%OZ_TIMEZONE_DIR {root_dir}timezones/ [7] set timezone EST5EDT [7] oz_cli: error 28 setting timezone to EST5EDT [7] # [7] # Define image used to log in and the password file [7] # [7] create logical name -kernel OZ_SYSTEM_TABLE%OZ_PASSWORD_FILE -terminal 
{root_dir}startup/password.dat [7] create logical name -kernel OZ_SYSTEM_TABLE%OZ_LOGON_IMAGE oz_util_logon.elf [7] create logical name OZ_SYSTEM_TABLE%OZ_UTIL_LOGON_MSG "OZONE backup server system" "Authorized access only" [7] # [7] # Declare debugger executable [7] # To activate debugger for a program, use -debug option before the command name [7] # [7] create logical name OZ_SYSTEM_TABLE%OZ_DEBUG_IMAGE oz_util_debug.elf [7] oz_hwxen_pte_write*: unpin ma 0F2E5C67 for vpn 00400 (newpte 00000000) [7] oz_hwxen_pte_write*: unpin ma 0F1CFC67 for vpn 00402 (newpte 00000000) [7] oz_hwxen_pte_write*: unpin ma 0F1DEC67 for vpn 00420 (newpte 00000000) [7] oz_hwxen_pte_write*: unpin ma 0F1E2C67 for vpn 006FF (newpte 00000000) [7] oz_hwxen_pinpdpage*: unpin ma 0F2E6000 [7] oz_hw_kstack_delete*: kernel stack depth 2324 (0x914) [root@xenophilia xc]#
[snip of startup stuff]
Here is the driver probing for the virtual ethernet devices, followed by a partial device listing. It names the device xenet_0:
[63] oz_dev_xenetwork_init
[63] oz_dev_xenetwork_init: found vif 0, address AA-00-00-AB-29-56
[63] oz_dev_xenetwork_init: error -22 resetting vif 1 rings
[63] oz_dev_xenetwork_init: error -22 resetting vif 2 rings
[63] oz_dev_xenetwork_init: error -22 resetting vif 3 rings
[63] oz_dev_xenetwork_init: error -22 resetting vif 4 rings
[63] oz_dev_xenetwork_init: error -22 resetting vif 5 rings
[63] oz_dev_xenetwork_init: error -22 resetting vif 6 rings
[63] oz_dev_xenetwork_init: error -22 resetting vif 7 rings
[63] oz_knl_boot_firstcpu: device driver init complete
[63]   console   _console1: console via oz_hw_putcon and oz_hw_getcon
[63]   ramdisk   _disk1 : mount[K/M]
[63]   xenhd_0   _disk2 : virtual hardisk 0x300
[63]   etherloop _ether1 : ethernet loopback
[63]   xenet_0   _ether2 : virtual ethernet AA-00-00-AB-29-56
[63]   oz_dfs    _fs1 : init and mount template
[63]   oz_dpt    _fs2 : mount template
[snip of more startup stuff]
Here is my equivalent of ifconfig, enabling the device and assigning its IP address:
[63] ip hw add xenet_0
[63] oz_dev_ip: device xenet_0, hw addr AA-00-00-AB-29-56, enabled, mtu 1500
[63] ip hw ipam add xenet_0 192.168.0.154 192.168.0.0 255.255.255.0
[63] #
Here is my ping command (as part of the startup script) pinging the router:
[63] ip ping 192.168.0.1
[63] ip: pinging 192.168.0.1, ip packet length 40, icmp length 16
[63] ip: error 19 enabling ctrl-C detection
[63] 192.168.0.1 seq 0 ttl 254 time 0.0027021
[63] 192.168.0.1 seq 1 ttl 254 time 0.0006668
[63] 192.168.0.1 seq 2 ttl 254 time 0.0006860
[63] 192.168.0.1 seq 3 ttl 254 time 0.0006731
[63] 192.168.0.1 seq 4 ttl 254 time 0.0006683
[63] 192.168.0.1 seq 5 ttl 254 time 0.0006555
[63] 192.168.0.1 seq 6 ttl 254 time 0.0006670
[63] 192.168.0.1 seq 7 ttl 254 time 0.0006726
[63] 192.168.0.1 seq 8 ttl 254 time 0.0006492
[63] 192.168.0.1 seq 9 ttl 254 time 0.0006788
[63] 192.168.0.1 seq 10 ttl 254 time 0.0006455
[63] 192.168.0.1 seq 11 ttl 254 time 0.0006634
[63] 192.168.0.1 seq 12 ttl 254 time 0.0006576
[63] 192.168.0.1 seq 13 ttl 254 time 0.0006416
So at this point I consider it working, though it has not been beaten on heavily. It was about as difficult as I expected. From most difficult to least, these are the problems I remember:
All in all, I say to the Xen developers: nicely done! Even though my guest OS crashed over and over during porting, Xen kept running, and all I had to do was reboot my VM. It pretty much works as advertised. This project also proved Emmy's worth as a development tool (as well as turning up a few bugs)!
I made these changes to the base OZONE code during the Xen porting effort: