The COW block driver


The major enhancement to the UML block driver has been the copy-on-write (COW) layering capability contributed by Greg Lonnon. This allows UML to layer a private writable file over a shared read-only file to form a single read-write block device. The writable file contains only the blocks that have been modified. As a result, multiple UMLs can share a single filesystem image, and each can write to its own view of it without affecting the others. This is important for root filesystems, since they tend to be large, and Linux doesn't deal well with read-only roots.

When the block driver is using a COW device, it writes modified blocks to the COW layer, and reads from either the COW layer or the backing layer, depending on whether or not the requested block has been modified.
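The read and write rule above can be illustrated with a toy in-memory model. This is only a sketch: the class and method names are made up for illustration, and a dictionary stands in for the COW file.

```python
class CowDevice:
    """Toy in-memory model of a COW block device (illustrative only)."""

    def __init__(self, backing_blocks):
        # The backing layer is shared and is never written through this object.
        self.backing = backing_blocks
        # Private layer: block number -> modified data.
        self.cow = {}

    def write_block(self, n, data):
        # Writes always land in the private COW layer.
        self.cow[n] = data

    def read_block(self, n):
        # Modified blocks come from the COW layer, all others from the backing layer.
        if n in self.cow:
            return self.cow[n]
        return self.backing[n]


backing = (b"aaaa", b"bbbb", b"cccc")   # shared, read-only image
uml1 = CowDevice(backing)
uml2 = CowDevice(backing)
uml1.write_block(1, b"XXXX")            # visible only to uml1
```

Here uml1 sees its own version of block 1, while uml2 and the backing image still hold the original data.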

The COW file begins with a header containing the following:

a magic number to distinguish a COW file from a normal filesystem image
a version number
the path of the read-only backing file
the last modification time and last size of the backing file, which are used to check that the backing file hasn't been modified
the sector size used by this file
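To make the header fields concrete, here is a minimal sketch of packing, parsing, and checking such a header. The field widths, byte order, magic value, and function names are all assumptions chosen for illustration; this is not the actual on-disk format used by UML.

```python
import os
import struct

# Hypothetical layout, NOT the real UML on-disk format:
# magic, version, backing path (NUL-padded), mtime, size, sector size.
HEADER_FORMAT = ">4sI256sqQI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
MAGIC = b"COWX"  # made-up magic number


def pack_header(backing_path, sector_size=512, version=1):
    """Build a header that records the backing file's current mtime and size."""
    st = os.stat(backing_path)
    # Note: paths longer than 256 bytes would be silently truncated here.
    return struct.pack(HEADER_FORMAT, MAGIC, version,
                       os.fsencode(backing_path),
                       int(st.st_mtime), st.st_size, sector_size)


def unpack_header(raw):
    """Parse a header, rejecting data that does not start with the magic number."""
    magic, version, path, mtime, size, sector_size = struct.unpack(
        HEADER_FORMAT, raw[:HEADER_SIZE])
    if magic != MAGIC:
        raise ValueError("not a COW file")
    return {"version": version,
            "backing_path": os.fsdecode(path.rstrip(b"\0")),
            "mtime": mtime, "size": size, "sector_size": sector_size}


def backing_unchanged(header):
    """Compare the recorded mtime and size with the backing file as it is now."""
    st = os.stat(header["backing_path"])
    return int(st.st_mtime) == header["mtime"] and st.st_size == header["size"]
```

A driver following this scheme would presumably refuse to use the COW file when `backing_unchanged` returns false, since the modified blocks would no longer correspond to the backing data.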

Following the header is a bitmap describing which blocks have been modified and are valid in the COW layer. This is loaded into memory and used to decide where blocks should be read from.
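A bitmap of this kind needs one bit per block. The following sketch shows one way to test and set those bits in memory; the bit ordering within each byte and the function names are assumptions, not taken from the driver.

```python
def new_bitmap(num_blocks):
    """One bit per block, all initially clear (nothing modified yet)."""
    return bytearray((num_blocks + 7) // 8)


def mark_modified(bitmap, n):
    """Record that block n now lives in the COW layer."""
    bitmap[n // 8] |= 1 << (n % 8)


def is_modified(bitmap, n):
    """True if block n should be read from the COW layer."""
    return bool(bitmap[n // 8] & (1 << (n % 8)))


bitmap = new_bitmap(20)      # 20 blocks need 3 bytes
mark_modified(bitmap, 0)
mark_modified(bitmap, 9)
```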

Following the modified block bitmap is the actual block data. This is sparse, meaning that each valid block sits at the same offset from the start of the filesystem as its equivalent in the backing file, and that only those blocks which have been modified have been allocated disk space.

The sparseness of the COW file greatly simplifies the driver by allowing it to read and write the same locations relative to the filesystem start, regardless of whether the I/O is happening on the COW file or the backing file.
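The simplification can be sketched as follows: a sector's offset relative to the filesystem start is the same in both files, and the COW file merely adds a fixed offset for its header and bitmap. The constant `DATA_START`, the helper names, and the use of a set in place of the bitmap are assumptions made for this illustration.

```python
import os

SECTOR = 512
DATA_START = 4096  # hypothetical size of header plus bitmap in the COW file


def write_sector(cow_fd, modified, n, data):
    """Store a modified sector at its natural position past the COW metadata."""
    os.pwrite(cow_fd, data, DATA_START + n * SECTOR)
    modified.add(n)


def read_sector(cow_fd, backing_fd, modified, n):
    """Read sector n from whichever layer holds its current version."""
    if n in modified:
        return os.pread(cow_fd, SECTOR, DATA_START + n * SECTOR)
    return os.pread(backing_fd, SECTOR, n * SECTOR)
```

Because writes go directly to the sector's natural offset, sectors that were never written are simply holes in the COW file, and on host filesystems that support sparse files they consume no disk space.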

With many UMLs booting from the same filesystem through COW devices, the disk space required is greatly reduced. When a number of virtual machines are booted from the same filesystem and have made few changes to its data, the disk space consumed by the entire group is not much more than that of a single UML. This reduction can greatly increase the number of virtual machines that the host can run. It can also increase the efficiency with which the host runs them: since the vast majority of the data used by the virtual machines is shared, only one copy of it exists in the host's caches. This effectively increases the size of the host's memory. Without COW, each virtual machine would have an entire private filesystem, which the host would cache separately even though the filesystems are largely identical.

COW is also an administrative convenience. UML creates a new COW file automatically when one is requested on the command line. This is far more convenient than copying a filesystem image, which can take several minutes when the filesystem is large.

Another advantage is that it provides a simple checkpointing facility. If the virtual machine crashes or something important is deleted, and the COW file holds no important data, then the old filesystem state can be restored by simply deleting the COW file and starting over with a new one.

It is sometimes desirable to merge the changes in the COW file into the backing file. This is done with the uml_moo utility. uml_moo simply traverses both the COW file and the backing file, writing the current version of each block out to a third file.
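The merge described above amounts to a single pass over the block numbers. The sketch below illustrates that idea only; it is not uml_moo's actual code, and the function signature, the `modified` set, and the `data_start` parameter are assumptions.

```python
import io


def merge(backing, cow, modified, out, num_sectors, sector=512, data_start=0):
    """Write the current version of every sector to `out`, in order."""
    for n in range(num_sectors):
        if n in modified:
            # Modified sector: take it from the COW layer.
            cow.seek(data_start + n * sector)
            block = cow.read(sector)
        else:
            # Untouched sector: take it from the backing file.
            backing.seek(n * sector)
            block = backing.read(sector)
        out.write(block)


backing = io.BytesIO(b"A" * 512 + b"B" * 512 + b"C" * 512)
cow = io.BytesIO(b"\0" * 512 + b"X" * 512)   # only sector 1 was modified
merged = io.BytesIO()
merge(backing, cow, {1}, merged, num_sectors=3)
```

Writing to a third file leaves both inputs untouched, so any other COW files layered over the same backing file remain valid.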


Jeff Dike 2001-09-15