
The Current TODO list

Since about Feb 2001, I've been maintaining a TODO list and posting it to the devel list occasionally.
Not done

Packaging

Functionality

Bugs

My attempts to run the Debian install procedure hung UML while it was probing disks. I never figured out what was happening.

Make UML build without warnings

Make sure that each clock tick gets counted

Figure out the hostfs crash that Laurent Bonnaud is seeing

make 'gdb=pty' work

protect kernel memory from userspace

Figure out why the io_thread loses its parent and its connection to the kernel

Disable SIGIO on file descriptors which are already being handled. This will cut off some interrupt recursion.

Figure out why gdb can't use fd chan (blinky@gmx.net)

Figure out why repeated 'tail /dev/zero' with swap causes process segfaults

Set SA_RESTART on SIGPROF
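As an illustration of what this item asks for, here is a minimal userspace sketch that installs a SIGPROF handler with SA_RESTART, so that slow system calls interrupted by the profiling timer are restarted rather than failing with EINTR. The names on_sigprof, install_sigprof_handler and sigprof_restarts are made up for this example and are not taken from the UML source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical handler: a real profiler would record a sample here. */
static void on_sigprof(int sig)
{
    (void)sig;
}

/* Install the SIGPROF handler with SA_RESTART set.
 * Returns 0 on success, -1 on error. */
int install_sigprof_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigprof;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPROF, &sa, NULL);
}

/* Report whether the current SIGPROF disposition has SA_RESTART set. */
int sigprof_restarts(void)
{
    struct sigaction cur;
    int rc = sigaction(SIGPROF, NULL, &cur);

    assert(rc == 0); /* querying a valid signal should not fail */
    (void)rc;
    return (cur.sa_flags & SA_RESTART) != 0;
}
```

Without SA_RESTART, any blocking read or write that the timer interrupts has to be retried by hand at every call site.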

Replace dentry_name with d_name

Dynamically allocate all driver descriptors

Make slip_tramp conditional in slip_close

Running many 'du /' commands causes slab cache corruption

Adding an eth device via the mconsole (and probably the command line) won't necessarily configure the device that was named

Figure out why init/version.c, init/main.c, and arch/um/main.c don't get coverage data.

Fix the ubd rounding bug spotted by Steve Schmidtke and Roman Zippel.

gdb should be the last thing to be shut down so that breakpoints can be set late in the shutdown process.

Have the __uml_exitcall handlers check to see if they need to do anything.

Find the race that causes the kernel to run before the debugger is running, and which then causes a panic and segfault.

Tests that need writing

  • Build and load some modules to check for unexported symbols
  • Swap testing - qsbench, low-memory kernel build
  • Rerun some existing tests through hostfs

Figure out why gdb inside UML sometimes misses breakpoints.

^C doesn't work well in a busy loop in __initcall code. I've seen the process die with a SIGINT, and I've also seen the SIGINT delivered to a sleeping process, panicking the tracing thread.

When setting a conditional breakpoint on __free_pages in free_all_bootmem_core and continuing, UML hangs.

Figure out what to do with __uml_setup in modules. A lot of these should end up being normal __setup, since they don't have to happen before the kernel runs.

Figure out why UML died nastily after ^C-ing it and hitting it in userspace.

Telnetting to a console when in.telnetd is non-executable produces an error and a crash from a sleeping process segfaulting.

Single-stepping when a signal has arrived doesn't work. gdb single-steps, sees the signal, and single-steps again with the signal. The signal is handled immediately, stepping the process into the handler. The original instruction was never executed, so when gdb puts the breakpoint back and the handler returns, the breakpoint is hit again.

The _to_user routines refer to user data without using copy_to_user.

How to cause a strange userspace segfault - run the new UML inside UML under gdb. Put a breakpoint on create_elf_tables. 'finish', 'next', segfault.

With an old port-helper hanging on to a port, running a UML which wants those ports causes all of the consoles and serial lines to output their login prompts to stdout.

Assigning console and serial line devices to files doesn't work cleanly.

Make sure that irqs are deactivated or freed when their descriptors are closed.

Things to audit:

  • Make sure that destructors exactly undo everything the constructor does
  • Set FD_CLOEXEC on any descriptors that don't need to be passed across execs.
  • Make sure any protocols are 64-bit clean - this means not using ints, longs, etc. Also, maybe enums are bad.
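For the FD_CLOEXEC item in the audit list above, the usual pattern is a small read-modify-write of the descriptor flags. The sketch below is illustrative only; set_cloexec is a name chosen for this example, not a function from the UML tree.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>

/* Mark fd close-on-exec while preserving any other descriptor flags.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0)
        return -1;
    return 0;
}
```

Calling this right after each open, socket or pipe call keeps the descriptor from leaking into helper programs that are exec'ed later.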

port_kern.c has port_remove_dev and port_kern_free which do almost the same things.

Make sure that return values are actually used.

skas4 things - remove the extra two context switches per system call, make sure it compiles with CONFIG_PROC_MM off, implement the mm indirector, move PTRACE_LDT to /dev/mm, fix PTRACE_FAULTINFO to return trap type and err, allow the environment, argument, etc pointers of an mm to be set.


Confirm fix

Get either Myrtal Meep or Matt Clay to confirm (or not) that they can no longer crash UML with Apache/Perl/MySQL. This will probably be fixed with the new network drivers.

Done

Packaging

Put together a UML deb builder

Figure out how to ship the UML test suite

Figure out how to make a module tar file with the right structure

It currently sucks. What I want is an RPM and a deb for each release containing:

  • the kernel
  • HOWTO
  • user-mode network tools
  • test harness
  • test suite
  • maybe a small root filesystem like the tomsrtbt
  • modules
  • mkrootfs
Anything else? I also need suggestions for locations for all these things.

Functionality

Add a mechanism for rereading module symbols when the module is reloaded.

Integrate Greg Lonnon's I/O space emulation

Make the socket channel work

Figure out what to do about the network driver. There are duelling drivers from Jim Leu and Lennert Buytenhek, and I need to figure out how to merge the best bits from them into a single driver, or split them into two very distinct drivers.

Extract Patrick Schaaf's cool new block driver from him.

Remove the tty mode save/restore from main and put it in the file descriptor channel code. (Boria Feigin)

Added more stuff to /proc/cpuinfo (mistral)

Allow gdb to be attached to a host device in the same way that consoles and serial lines can

modify_ldt implemented (patch from Lennert Buytenhek)

Add support for root hostfs

Added --help (patch from Yves Rougy)


Bugs

I occasionally see one of the loggers dying from a SIGTERM during boot because of a kill that's somehow directed at the wrong process.

I need to figure out why some people see 'Unexpectedly got signal 4 in signals'. This has been reported a couple of times, but I've never got any information that would help me debug it.

Implement copy_{from,to}_user using setjmp/longjmp.
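The general idea behind a setjmp/longjmp based copy can be sketched in ordinary userspace C: save a jump point before touching the suspect memory, and have the fault handler jump back to it so the copy returns an error instead of crashing. This is only an analogy under stated assumptions, not the UML implementation; guarded_copy, install_fault_handler and the static variables are names invented for this example, and the sketch is single-threaded.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf fault_jmp;              /* where to resume after a fault */
static volatile sig_atomic_t fault_armed; /* nonzero only inside guarded_copy */

static void on_fault(int sig)
{
    if (fault_armed)
        siglongjmp(fault_jmp, 1);
    /* Not our fault to catch: restore the default action and return,
     * so the faulting instruction re-runs and kills the process. */
    signal(sig, SIG_DFL);
}

/* Install the handler for SIGSEGV and SIGBUS. Returns 0 or -1. */
int install_fault_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) < 0)
        return -1;
    return sigaction(SIGBUS, &sa, NULL);
}

/* Copy n bytes from src to dst. Returns 0 on success, or -1 if
 * touching either buffer raised a fault. */
int guarded_copy(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;
    const volatile unsigned char *s = src;
    size_t i;

    assert(dst != NULL);
    /* Saving the signal mask (second argument 1) lets siglongjmp
     * unblock the signal that was blocked while the handler ran. */
    if (sigsetjmp(fault_jmp, 1)) {
        fault_armed = 0;
        return -1;
    }
    fault_armed = 1;
    for (i = 0; i < n; i++)
        d[i] = s[i];
    fault_armed = 0;
    return 0;
}
```

The byte loop through volatile pointers keeps the compiler from replacing the copy with something that assumes the addresses are valid.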

Fix gcov and gprof support

Figure out why the umn device won't serve more than one packet to remote machines.

Figure out how to make -D_FILE_OFFSET_BITS=64 work with modules (fix from Lennert which explicitly uses the *64 interfaces)

Figure out why ^S and ^Q don't work on the console (patch from Livio)

Stop the hangs caused by someone disconnecting from a virtual console

umn needs to destroy the slip device when it's no longer needed (JS)

clean up the umn_set_addr logic

Make sure that readonly hostfs really is readonly (James McMechan)

Figure out the wait_for_page hangs

I used to see syslog occasionally hang, but haven't in a long time. It happened again, and it turns out I was enabling signals when they shouldn't be, allowing the timer to interrupt a sensitive part of do_IRQ.

gdb doesn't find the kernel if it isn't called 'linux' (Brian J. Murrell)

Have the ubd driver figure out the block device size with BLKGETSIZE if necessary (Lennert Buytenhek, marc)

Make sure it boots in the presence of IO redirection and piping (JS)

Rebooting a UML under debug won't kill the old gdb

Remove assumption that argv is contiguous (Boria Feigin)

Have hostfs copy user data with copy_{to,from}_user.

Put in Lennert's utsname patch (Lennert Buytenhek)

Put in Lennert's ubd LFS patch (Lennert Buytenhek)

The kernel shouldn't hang when debug is requested and it wasn't compiled with CONFIG_PT_PROXY.

Make empty_zero_page and empty_bad_page reserved (Rik)

Increase COMMAND_LINE_SIZE to something bigger (JS)

Put breakpoints on panic and stop

Make hostfs figure out external filenames by walking up the dentry->parent tree rather than having inodes store them.
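The parent walk described here can be sketched with a simplified stand-in for the kernel's dentry. The struct toy_dentry type and the toy_host_path function are hypothetical and exist only for this example; the real hostfs code works on struct dentry and has locking concerns that are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct dentry. */
struct toy_dentry {
    const char *name;
    struct toy_dentry *parent; /* the root dentry points to itself */
};

/* Build "<host_root>/<dir>/.../<name>" for d by walking up the parent
 * chain and filling buf from the end towards the front.
 * Returns a pointer into buf, or NULL if buf is too small. */
char *toy_host_path(const struct toy_dentry *d, const char *host_root,
                    char *buf, size_t size)
{
    char *p;
    size_t len;

    assert(d != NULL && host_root != NULL && buf != NULL);
    if (size == 0)
        return NULL;
    p = buf + size;
    *--p = '\0';

    while (d->parent != d) {
        len = strlen(d->name);
        if ((size_t)(p - buf) < len + 1)
            return NULL;
        p -= len;
        memcpy(p, d->name, len);
        *--p = '/';
        d = d->parent;
    }

    len = strlen(host_root);
    if ((size_t)(p - buf) < len)
        return NULL;
    p -= len;
    memcpy(p, host_root, len);
    return p;
}
```

Because the name is derived on demand, nothing stored in the inode can go stale when a file is renamed or moved.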

Figure out why hostfs misses files in large directories

Make sure init can exec (Al Viro)

Fix the problems that mistral is seeing. This is three bugs - the sleep hang, a bug in sigio_handler (both fixed), plus the sigreturn bug.

Reimplement process signal delivery to eliminate the race when the process handler returns

Various fixes and cleanups in hostfs (Al Viro)

Remove real_mm from thread and replace it with some unused pte bits

Fix /proc/cmdline (prose from Henrik Nordstrom, code from Greg Lonnon)

drivers Makefile cleanup (patch from Greg Lonnon)

tty_flip_buffer_push fixlet (patch from Gordon McNutt)

The ubd device should reset openflags between mounts (jfreak)

Figure out the sleep hang. It happens rarely. Somehow, irq_desc[0].flags gets stuck at 5 which tells do_IRQ that the interrupt is already being handled.

Stop ^C in gdb from segfaulting processes if it hits in userspace.

Figure out why swapping causes process segfaults and nasty mm messages.

Fix the build so that modules_install drops modules in the proper hierarchy (fix from Henrik Nordstrom)

Got rid of hostfs_llseek (patch from stewart and Henrik Nordstrom)

gdb now ignores SIGWINCH. (patch from James Stevenson)

Fixed ubd stats (patch from James Stevenson)

UML should run on kernels with a 2G/2G address space split (blinky)

Fix the segv bug where a --x page is never fixed because thread.starting_exec is set (mistral) - mistral hasn't complained about this in ages

Fix 'cat /proc/kmsg' (Torsten Fink)

Fix crash if someone types at the console too soon (patch from mistral)

Change __SMP__ to CONFIG_SMP (Niels Kristian Bech Jensen)

Include config.h where necessary (Niels Kristian Bech Jensen)

Fix a couple allocation buglets in hostfs (Henrik Nordstrom)

Implement internal system calls correctly (Roman Zippel)

Allow the time to be changed (Frank Klingenhoefer) (patch from Livio Soares)

The ptrace proxy should handle wait correctly for non-UML subprocesses of gdb

Figure out why UML flunks the f00f test

Make the build work when CONFIG_MCONSOLE is off

Provide some kind of reasonable error message when the root filesystem isn't writable.

Figure out why hostfs panics when it's on an nfs directory on the host (David Coulson)

Remove the chdir from the umid setup

Figure out why the .deb build fails checksumming - it was because hostfs reads went through the page cache and writes bypassed it

Figure out why diffing identical kernel pools produces diffs - can't reproduce this, so I have to assume that it's fixed somehow

Make sure that gdbs get killed properly

Stop dev_ip_addr from crashing if the interface has no IP addresses

Figure out why the Debian ping is slower than the Slackware ping

set_umid should only set the umid

COW headers should have absolute pathnames in them or backing file path names should be possibly relative to the COW file's directory
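The second option, resolving a relative backing file name against the COW file's directory, could look roughly like the sketch below. resolve_backing_path is a name invented for this example, and the paths in the usage note are made up; the sketch says nothing about how the real COW header stores the name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Resolve the backing file name recorded for a COW file.
 * An absolute name is used as-is. A relative name is interpreted
 * relative to the directory that contains the COW file.
 * Returns 0 on success, -1 if out is too small. */
int resolve_backing_path(const char *cow_path, const char *backing,
                         char *out, size_t out_size)
{
    const char *slash;
    int n;

    assert(cow_path != NULL && backing != NULL && out != NULL);

    slash = strrchr(cow_path, '/');
    if (backing[0] == '/' || slash == NULL) {
        /* Absolute name, or the COW file is in the current directory. */
        n = snprintf(out, out_size, "%s", backing);
    } else {
        /* Keep cow_path up to and including its last '/'. */
        n = snprintf(out, out_size, "%.*s%s",
                     (int)(slash - cow_path + 1), cow_path, backing);
    }
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With this rule, a COW file at /tmp/uml/root.cow that names root_fs as its backing file resolves to /tmp/uml/root_fs, so the pair can be moved together without rewriting the header.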

Figure out why some early breakpoints don't work. For example, setting a bp in do_initcalls or bdflush and continuing from start_kernel misses the bp, but continuing from rest_init hits it.

Print out a decent error message when mcast setup fails because there are no multicast devices on the host

Suppress the error message from uml_net when route fails because there's no eth0 on the host

See if it's possible to have uml_net know when an eth device changes its IP address.

Figure out why apt-get install ntp crashes UML.

Version the mconsole and uml_router interfaces

Fix the error messages when a backing file doesn't exist, and also when the COW file exists, but is empty.

Put -fno-common back

Fix gdb

Fix daemonized operation

Implement thread_saved_pc

Run sysrq stuff in interrupt context

Make it possible to ^Z and bg UML

Make core dumping work

Running many 'du /' commands causes process segfaults

ddd doesn't work with gdb-pid

Removing a block device with the mconsole doesn't remove its entry in /proc/partitions

Removing ethertap or TUN/TAP eth devices produces odd error messages in UML

Figure out why running UML under an identical UML hangs the outer one

Figure out why running a properly built UML under UML doesn't work

gdbs aren't getting killed properly again

Setting up a tuntap interface with no gate address will hang UML when the interface is brought up

Figure out why processes segfault under extreme load, like a 'make -j' kernel build

Figure out why leaning on the space bar when top is running causes an FPE.

Figure out how to let arches ifdef out PTRACE_[SG]ETREGS

Put UML temporary files (except maybe the mmap files) in the user's home directory.

Fix the bogomips calculation.

Figure out how to implement thread private pages.

Figure out why ^S and ^Q don't work on the console (phillips)

When a non-root ubd device is COWed, the bitmap isn't mapped in for some reason.

The mconsole driver should support C-A-D if it doesn't already.

'i reg' to gdb inside UML doesn't print the fp registers

Improve the formatting when an error message is returned from uml_net, e.g. when TUN/TAP is requested, but uml_net doesn't support TUN/TAP.

Get the fp registers in core files correctly.

gdb should be able to see ^C immediately

Byte-swap the ubd bitmap (Roman Zippel)

The register state in the sigcontext passed in to a signal handler should be copied into the process registers when the handler returns.

When UML halts, any external debugger needs to be told that it exited so it doesn't get confused about the child suddenly disappearing.

setup_stack should use copy_to_user.

Using strace as an external debugger doesn't work. strace sees no system calls starting at the delay calibration.

When a new COW file is created and the backing file doesn't exist, the error message implies the COW file doesn't exist, plus the COW file is created with size 0.

mconsole clients need to pass absolute pathnames into UML.

Make sure there are no references to errno in kernel code

Complain when a channel type is requested but not configured in

Get the tracing thread out of the business of running trampolines

disable_chan needs to free up all SIGWINCH info so that logging out of a port console doesn't produce nasty-looking errors. Also the woody filesystem produces nasty errors when logging in through telnet.

close_chan should call free_irq, not free_irq_by_fd

Running the ists debian fs with devfs=nomount produces a 'Freeing free IRQ'.

Look at the initrd dependencies. UML doesn't compile when initrd support is disabled.

Multiple opens of a pty console cause irq registrations to pile up.

Sanity-check arguments in the network *setup routines. Ethertap in particular needs checking. dev->user should remain NULL if there's an error. Also sanity-checking in uml_net is needed.

The uml_net entry points should check their argcs.

Figure out why large pastes into consoles get truncated.

'next' across a statement which segfaults causes wait_for_stop to get confused by the segfault.

Make sure "none" is commutative with the other channel initializations.

It's still possible for disconnecting from a UML console to put that device into an infinite poll loop after it's closed.

Giving the wrong pid to gdb-pid causes a horrible crash.

In gdb inside UML, putting a breakpoint on a int 0x80 causes the system call not to happen.

Breakpoints don't work in the trivial statically linked getpid program. PTRACE_SINGLESTEP appears not to be working.

Figure out why, when there are more telnet connections to UML consoles than there are consoles, when one is freed up, the next telnet connection only wakes up when something else causes an interrupt.

Figure out where the occasional 'end!=nsectors' message is coming from.

Setting up ethertap on 2.2 causes uml_net to spit out an infinite stream of -EBADF.

Configuring a device twice with uml_mconsole apparently succeeds and leaves it unusable.

Configuring a disk to a nonexistent file with uml_mconsole apparently succeeds and leaves it unusable.

I built UML inside itself twice on hostfs, with the output going to a log. The first failed, and when the second ran, the log filled with garbage.

The xtime lock should disable interrupts.


Confirm fix

Someone needs to try booting UML as a diskless client via bootp and make sure that it works.

I have a bug that says that diskless booting doesn't work because it's impossible to assign an address to a network interface from the command line. However, I've heard of someone booting diskless successfully, so this needs to be cleared up.

I have an old bug that claims that UML can be crashed with a ping flood. I haven't seen this happen recently, so if anyone else has, you'd better let me know.

Laurent Bonnaud claimed a while ago that upgrading a Debian filesystem hung the kernel. I need to know if this is still a problem.

Get Rik van Riel to confirm that UML no longer loses characters that he types at the console. Rik says he hasn't seen it in a while, mistral confirms.
