User-mode Linux (http://user-mode-linux.sourceforge.net) is the port of the Linux kernel to itself. That is, it treats Linux as a platform to which it is interesting to port Linux. User-mode Linux (UML) runs entirely in userspace as a set of completely normal processes. All of the low-level facilities required to run Linux are implemented in terms of Linux system calls. UML requires no special kernel hooks and contains no kernel-level pieces.

UML virtualizes system calls, and jails its processes, through the Linux ptrace system call interception mechanism. Whenever a process executes a system call, a special thread, the tracing thread, is notified. The tracing thread reads the system call and its arguments from the process, nullifies the system call in the host kernel, and forces the process to execute the system call in the UML kernel context. As a result, those processes have access only to the resources that the host provided to UML itself.

The virtual memory system, address spaces, and memory protection are implemented with mmap. A file on the host is used as the virtual machine's physical memory. It is mmapped as a single block into an area of the address space that the kernel treats as physical memory. Pages from this area are mmapped again into the kernel and process virtual memory areas as they are assigned as backing pages to virtual memory.

Each thread within UML gets a process on the host. This simplifies context switching, which becomes a matter of stopping one process and continuing another. However, the address space of the incoming process must be updated to reflect pages that have been swapped out or had their protections changed, plus any changes in the kernel's virtual mappings, since that process last ran.

There is a complete set of virtual devices, including consoles, serial lines, a block device, and a network device.
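The physical-memory scheme described above, in which one host file is mapped as a single block and its pages are mapped again wherever they are needed, can be illustrated with a minimal sketch. It is written in Python rather than in UML's own C, and the names PAGE, NUM_PAGES, and demo_physical_memory are hypothetical. It shows only the double mapping of a backing file; it does not place mappings at fixed addresses or change page protections, as a real kernel port would need to.

```python
import mmap
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY  # mapping offsets must be multiples of this
NUM_PAGES = 4                      # size of the toy "physical memory" (assumed)


def demo_physical_memory():
    """Map a host file as 'physical memory', then map one page of it again."""
    # The temporary file plays the role of the virtual machine's RAM.
    with tempfile.TemporaryFile() as backing:
        backing.truncate(PAGE * NUM_PAGES)
        fd = backing.fileno()

        # First mapping: the whole file as one block of "physical memory".
        phys = mmap.mmap(fd, PAGE * NUM_PAGES, flags=mmap.MAP_SHARED)

        # Second mapping: "physical page" 2 mapped again, standing in for a
        # virtual page that has been assigned this page as its backing page.
        virt = mmap.mmap(fd, PAGE, flags=mmap.MAP_SHARED, offset=2 * PAGE)

        # A write through the "virtual" mapping...
        virt[0:5] = b"hello"
        # ...is visible at the corresponding "physical" address.
        seen = bytes(phys[2 * PAGE:2 * PAGE + 5])

        virt.close()
        phys.close()
        return seen
```

Because both mappings are shared mappings of the same file, demo_physical_memory() returns b"hello": the data written through the second mapping is read back through the first.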
The consoles and serial lines can be attached to a number of facilities on the host, such as ttys, ptys, pts terminals, file descriptors, xterms, and host ports. The block driver can provide access to anything on the host that resembles a block device. Normally, UML block devices are assigned to files in the host filesystem, but they can also be attached to physical disks, partitions, CD-ROM drives, and floppies. The network driver has a number of backends providing access to different means of exchanging packets with the host, other physical machines, and other virtual machines. Currently, UML can communicate with the host through ethertap and slip devices. Totally virtual networks may be created with a hub daemon that passes ethernet frames from one virtual machine to another, or with the multicast mechanism, in which a number of UMLs attach to a multicast port to exchange packets with each other.

Device interrupts are implemented with Linux signals. The timer uses SIGALRM and SIGVTALRM, while the other devices use SIGIO. The UML interrupt handlers do whatever is necessary to figure out which device is responsible for a given interrupt and pass that information to the normal kernel IRQ mechanism. Synchronous interrupts, such as page faults, are also implemented with Linux signals. Memory faults are handled by a SIGSEGV handler, which determines whether a fault can be fixed and, if so, maps a page into the address space. Most other handled signals, such as SIGILL, SIGTRAP, and SIGFPE, are passed along to the process in which they occurred.

The result of all this is a Linux virtual machine that is self-contained and secure enough to serve as a jail. This is one of the many uses that have emerged for UML virtual machines. Others include kernel development, virtual hosting, sandboxing, experimenting with new kernels and distributions, network experimentation, and providing a Linux environment for other operating systems.
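The use of host signals as device interrupts, described above for the timer, can be illustrated with a small sketch. It is a Python illustration only, not UML code: the names ticks, timer_irq, and run_clock are hypothetical, and it covers just a periodic SIGALRM standing in for a timer interrupt, not SIGVTALRM, SIGIO, or the hand-off to the kernel IRQ mechanism.

```python
import signal
import time

ticks = 0  # stands in for a guest kernel's tick counter (hypothetical)


def timer_irq(signum, frame):
    """Host signal handler acting as the guest's timer interrupt handler."""
    global ticks
    ticks += 1


def run_clock(wanted_ticks=3, interval=0.01, timeout=5.0):
    """Arm a periodic host timer and wait until enough 'interrupts' arrive.

    Must be called from the main thread, where Python delivers signals.
    """
    global ticks
    ticks = 0
    previous = signal.signal(signal.SIGALRM, timer_irq)
    # Deliver SIGALRM every `interval` seconds of real time.
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    deadline = time.monotonic() + timeout
    try:
        while ticks < wanted_ticks and time.monotonic() < deadline:
            time.sleep(0.001)  # the "guest" idles while signals arrive
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        # Restore the earlier handler (None means it was not set from Python).
        signal.signal(signal.SIGALRM,
                      previous if previous is not None else signal.SIG_DFL)
    return ticks
```

Calling run_clock(3) returns once at least three timer signals have been counted, in the same way that a UML timer handler would be entered once per SIGALRM.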
UML's major use at this point is as a platform for Linux kernel development. Since it runs in Linux userspace, all of the facilities and tools of the underlying Linux platform are available for use in its development. These include gdb, which provides kernel developers with a debugging environment that closely resembles that of a normal userspace application. Much filesystem development is currently being done inside UML, along with some memory management work. In addition, hardware driver development has become a possibility, with a USB host controller being written for UML.

There is strong interest in UML from the hosting industry. The application of UML is obvious: hosting providers could rent out virtual machines running on a large server instead of renting rack space for a large number of small servers. This could transform the economics of the industry by allowing providers to consolidate a large number of small customers onto a small number of medium to large servers, reducing the costs associated with floor space, air conditioning, and power. It would also increase reliability and availability by greatly reducing the amount of hardware in use and replacing it with the more reliable and redundant hardware of a larger server. Administration costs would also go down because virtual machines are more convenient to manage and monitor. UML offers hosting providers new opportunities to sell computing resources in small increments. The maximum amounts of memory, disk space, network bandwidth, and number of simultaneously running processes can all be controlled and adjusted, and therefore sold, by the provider.

UML is also seeing use as a teaching tool. Obviously, it has value in an OS course, where it provides a far better debugging environment than a physical machine. It is also far more convenient to provide each student with a virtual machine than a physical one.
For this reason, it is also being used to teach other areas, such as system and network administration.

More recently, further classes of applications for UML have emerged as the implications of allowing it to be less jail-like and more open to the outside world have become apparent. For example, it is possible to create a specialized view of the outside world by representing outside resources as fundamental Linux abstractions, such as files and processes, inside UML. An external SQL database could be represented as a filesystem inside UML and its contents manipulated as files and directories. Or a number of Apache server processes running on a web farm could be represented as a single process inside UML and manipulated as such, allowing all of them to be restarted by sending the restart signal to the single UML process.

It's possible to take this idea further and use UML to turn the Linux kernel into a normal userspace library. Applications linking against it would be able to use all the facilities of the kernel, including threads, memory management, filesystems, and networking. All of these facilities are carefully coded, tuned, and thoroughly debugged, making them attractive replacements for their current libc counterparts.

The filesystems can be looked at as hierarchical data stores, with the application's internal data stored in such a way that the data's hierarchy is represented as a directory structure and individual items are represented as files. This would be the application's equivalent of /proc. If created on the equivalent of a ramdisk in memory, such filesystems can provide a convenient way to store and retrieve an application's data. Combined with the Linux network stacks and the UML virtual network interfaces, this would allow an application to export its internal data to the outside world through NFS or another remote filesystem.
For example, linking Apache against a UML kernel library would allow it to store its configuration in an internal filesystem and export it to another machine. Doing this with an entire web server farm would allow the configurations of all of the servers to be viewed and manipulated from a central location. This would further allow all of the servers' configurations to be tweaked automatically to maximize the performance of each individual server as well as the farm as a whole. Storing this data in a persistent external form instead of an internal ramdisk would allow the application to store its state in the host filesystem and resume from it at a later point. Turning this configuration around, by hosting the configuration filesystem on the central machine and exporting it to each server, would allow someone on that central machine to change the configuration of all the servers at once.

This centralized configuration would be interesting for other types of applications, such as desktop and office applications. Manipulating the internal state of these applications through the filesystems they import would provide a new way of integrating them and getting them to cooperate with each other. With judicious use of symbolic links between these filesystems, the applications could be made to share context. For example, if a mail client shifted its attention to a new person (e.g., because the user read a message from that person), that could be reflected in the filesystem containing its state, which would let other applications, such as organizers, contact managers, and chat programs, shift their attention to that same person. This would come a lot closer to having a suite of applications follow the user's thoughts than is possible now.

The kernel's memory allocation facilities, kmalloc and get_free_pages, could be used as an efficient, scalable alternative to malloc. They provide a highly optimized set of memory allocation routines.
The slab allocator provides allocation of same-size objects. The page allocator, via a buddy system algorithm, limits memory fragmentation by merging adjacent free blocks. The kernel also provides a complete threads library, with a scheduler and a set of spinlock and semaphore primitives. These, too, are highly scalable and very well tested, making them an excellent base for application development.

The other memory management facilities, such as swapping, could also be put to good use in a process. Rather than actually writing out pages, the swapper could provide hints to the host OS as to which application pages aren't needed and could be swapped out. This could lead to a machine running a set of applications that cooperate with the host so that the machine's memory is used as efficiently as possible.

The virtual memory system can be used to implement separate, protected address spaces within the host, and this, in conjunction with the existing jailing capabilities of UML, could be used by an application to safely run untrusted code inside it. This would be particularly useful for a web development platform, such as a browser, that runs untrusted executable content downloaded from the net.

It's possible to go one step beyond even this and imagine an application putting the kernel's resource management facilities to use managing abstract application-defined resources rather than things like raw memory and threads. The slab allocator could be used to allocate pools of abstract objects that may or may not actually occupy memory, and the scheduler could be used to schedule the application's thread-like objects, which may not have actual separate execution contexts.

At this point, user-mode Linux implements a fairly complete Linux virtual machine. Adding SMP support will more or less complete this aspect of UML. Once this is done, work on the new avenues of development described above will start.
When UML allows the Linux kernel to be used as an application development library, many new application areas will open up. In addition, combining the virtual machine and library aspects of UML in various ways will provide further uses. So, the use of UML as a virtual machine environment, while extremely interesting, is just the beginning, and the end is far from being in view.