Host File Access

There are two UML filesystems which provide access to filesystems on the host - hostfs and humfs. In both cases, you mount a filesystem within UML and under it will be the contents of a host directory.

hostfs is fine for mounting host directories as long as you

  • don't want to boot on it
  • don't want to create Unix sockets or device nodes
  • don't want to write host files as a non-root user within UML

The exact reasons for these limitations are described below.

If you need to do any of these things, look at humfs, which doesn't have these problems but is less convenient to use, since you need to prepare the host directory as a humfs directory before mounting it within UML.

There's one restriction common to both filesystems that some people care about. Because they use the UML page cache,

  • if a file is cached in UML, and a process reads it, the file on the host will not be read - the data will come straight from the page cache
  • when a file is written in UML, the changes will be flushed out to the host at some unpredictable later time
This means that the UML will not see changes made on the host (and may overwrite them if something inside UML has changed the same files) and that changes made inside UML won't be visible on the host immediately. Making UML updates appear on the host can be done by mounting the filesystem synchronously, with "-o sync". Making host updates appear inside the UML is harder. Right now, it can't be done. The long-term solution will probably be to use inotify to detect changed files and invalidate the appropriate pages in the UML page cache.
Mounting a host directory with hostfs is easy.
UML# mount none /host -t hostfs
will give you the host's root directory mounted on the UML /host.

If you don't want the host's root, but a subdirectory, such as your home directory, this can be specified with an appropriate "-o" switch on the mount command.

UML# mount none /host -t hostfs -o /home/user
will mount the host's /home/user on the UML's /host.
Unlike hostfs, which allows you to mount any host directory within the UML, humfs requires that the host directory be set up ahead of time. This is done with the humfsify tool, which is part of the uml_utilities package. The steps required are
  • Make the directory and cd to it
    host% mkdir humfs-mount
    host% cd humfs-mount
  • Within that directory, create a subdirectory called "data" which contains the hierarchy that you want to make available to the UML. A common case is to copy the contents of an existing UML filesystem image, which is done like this:
    host% mkdir mnt
    host# mount uml-rootfs mnt -o loop
    host# cp -a mnt data
    You must preserve the ownerships and permissions, which is why the copy was done as root and the "-a" switch is needed.
  • As root, run the humfsify utility to convert this directory to the format needed by the UML humfs filesystem:
    host# humfsify user group 4G
    You should specify the username and group name of the user that will be running the UML that will mount this directory. The last argument is the size of the filesystem as seen within the UML. I gave it 4 gigabytes here.
With the humfs mount prepared, you can mount it within the UML:
UML# mount none /host -o /path/to/humfs/mount -t humfs
As with hostfs, the "-o" switch specifies the path on the host to the directory that you wish to mount.

In order to boot a humfs filesystem, you must humfsify a UML root filesystem as described above. Then, add the following to the UML command line:

rootfstype=humfs rootflags=/path/to/humfs/root
hostfs limitations and humfs
To see the basic problem with hostfs, mount a hostfs directory inside UML, and, as a normal user, create a file within that mount, and then look at its permissions:
UML% cd /mnt/tmp
UML% touch uml-file
touch: setting times of `uml-file': Permission denied
UML% ls -l uml-file
-rw-r--r-- 1 500 500 0 May 25 19:09 uml-file
UML% id
uid=1000(user) gid=1000(user) groups=1000(user)
In spite of the UML user having uid and gid 1000, the file is owned by uid 500. So, you created a file and it is immediately owned by someone else. This creates the permission problems we see. The reason for this is that file operations go through two filesystem layers with different ideas of ownerships. The file creation request first goes through UML, which believes the file will be owned by uid 1000. Then it goes to the host, which sees a process with uid 500 (my uid on the host) creating a file. Thus, the file ends up being owned by a different uid than the one inside the UML which initiated the operation. In addition, the host may restrict the permissions further if the user running UML on the host has a more restrictive umask than the user within the UML.
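
The umask effect can be reproduced on an ordinary Linux host, without UML. The following is a minimal sketch: show_mode_under_umask is a hypothetical helper (not part of UML or uml_utilities) that creates an empty file under a given umask and prints the resulting permission string. It assumes mktemp, touch, ls and cut are available.

```shell
# Hypothetical helper for illustration; not part of UML.
# Creates an empty file under the given umask and prints its
# permission string (the first ten characters of "ls -l").
show_mode_under_umask() {
    (
        # The subshell keeps the umask change from leaking to the caller.
        umask "$1"
        dir=$(mktemp -d) || exit 1
        touch "$dir/f"
        ls -l "$dir/f" | cut -c1-10
        rm -rf "$dir"
    )
}

show_mode_under_umask 022   # prints -rw-r--r--
show_mode_under_umask 077   # prints -rw-------
```

With hostfs, it is the umask of the host process running UML that is applied when the host file is created, so a file that a UML user expects to be group- or world-readable can end up more restricted on the host.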

There are related problems with any file operation which requires root privileges, such as creating device nodes:

UML# mknod hda b 3 1
mknod: `hda': Operation not permitted
Here, I'm trying to create a device as root, an operation which would succeed anywhere else in the UML filesystem. However, it is the host that actually creates the node, and UML is running as non-root there, so the operation fails.

humfs avoids these hostfs problems by bypassing permission checks on the host. It does so by separating file permissions and ownerships from the host files. On the host, if you look inside the data subdirectory in the humfs mount, you will see that permissions are much more open than usual and that the ownerships are different:

host% ls -l data/bin/ls data/etc/passwd
-rwxr-xr-x 1 jdike jdike 93876 Feb 11 01:43 data/bin/ls
-rwxr-xr-x 1 jdike jdike  1649 Mar 14 14:25 data/etc/passwd
host% ls -l /bin/ls /etc/passwd
-rwxr-xr-x 1 root  root  93876 Feb 11 01:43 /bin/ls
-rw-r--r-- 1 root  root   2000 May  5 14:34 /etc/passwd
Now, if you look at the corresponding files within the file_metadata directory, you will see the original owners and permissions:
host% cat file_metadata/bin/ls
493 0 0
host% cat file_metadata/etc/passwd
420 0 0
493 is 0755 in octal and 420 is 0644. The zeros are the user and group ownerships of the files. This separation of file data and permissions, and the storage of permissions within files rather than on files, takes the host's permission checks out of the picture. When a non-root user inside UML creates a humfs file, the file on the host will be owned by the host user running UML. But that's OK because the file will be writable by UML, and the permissions and ownership seen within UML will be stored within a metadata file. The data and the metadata are merged by the UML humfs filesystem, so that users within UML see the correct permissions.
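
The decimal mode values stored in the metadata files can be checked against the familiar octal notation with printf. This is a small sketch; mode_to_octal is a hypothetical helper name, not something provided by humfs.

```shell
# Hypothetical helper for illustration; not part of humfs.
# Prints a decimal mode value, as stored in a metadata file,
# as a four-digit octal permission value.
mode_to_octal() {
    printf '%04o\n' "$1"
}

mode_to_octal 493   # prints 0755
mode_to_octal 420   # prints 0644
```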

If you try creating a device node on humfs, it behaves exactly as you'd expect:

UML# mknod hda b 3 1
UML# ls -l hda
brw-r--r-- 2 root root 3, 1 May 25 20:05 hda
If you look at the corresponding files on the host, you will see why:
host% ls -l data/tmp/hda 
-rwxrwxr-x 1 jdike jdike 0 May 25 16:05 data/tmp/hda
host% ls -l file_metadata/tmp/hda 
-rw-r--r-- 1 jdike jdike 14 May 25 16:05 file_metadata/tmp/hda
host% cat file_metadata/tmp/hda 
420 0 0 b 3 1
The mknod inside UML didn't translate into a mknod on the host, which would have required root privileges. Instead, it created a normal, empty file in the data subdirectory, and put all of the device information in the metadata file. This information is used by the humfs filesystem to create an actual device node within UML, even though there is no device node on the host. Thus, no root operations are required on the host.
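
To make the metadata contents easier to read, a line can be decoded field by field. The sketch below assumes the layout suggested by the examples on this page (decimal mode, uid, gid, then optionally device type, major and minor); the actual humfs format may carry more than this. describe_metadata is a hypothetical helper, not part of uml_utilities.

```shell
# Hypothetical helper for illustration; not part of uml_utilities.
# Assumed layout, inferred from the examples on this page:
#   mode(decimal) uid gid [type major minor]
describe_metadata() {
    # Split the line on whitespace into positional parameters.
    set -- $1
    printf 'mode=%04o uid=%s gid=%s' "$1" "$2" "$3"
    # Device nodes carry three extra fields.
    if [ "$#" -ge 6 ]; then
        printf ' type=%s major=%s minor=%s' "$4" "$5" "$6"
    fi
    printf '\n'
}

describe_metadata "493 0 0"
# prints mode=0755 uid=0 gid=0
describe_metadata "420 0 0 b 3 1"
# prints mode=0644 uid=0 gid=0 type=b major=3 minor=1
```

On the host this could be run as describe_metadata "$(cat file_metadata/tmp/hda)" from inside the humfs directory.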