As an option, UML has the ability to log all data going through UML
terminals out to the host. This is primarily useful for
honeypots, although
other security-related applications might find it useful as well.
The first step is to build UML with tty logging enabled. It's enabled with
CONFIG_TTY_LOG ('Enable tty logging' under 'Character Devices' in the
UML configuration). With this done, UML will automatically log all
sessions to the host.
The interval between opening a UML terminal device and closing it is
considered to be a session. By default, each session is logged to a
separate file in the current directory on the host. The file name is
constructed from the current time. There will be a lot of them
created during the boot process because each step of bringing the
system up opens and closes /dev/console, which makes each line of boot
output look like a separate session. Every login will be recognized as a
session, so each will appear in its own file. If the user
allocates another tty (with su, screen, or some similar tool), that
will open a new session, which will get a separate log file on the
host.
There are a couple of ways of changing this behavior. To have the log
files put in a different directory, use the tty_log_dir switch on the
UML command line:
tty_log_dir=dir
The one file per session scheme is the simplest possible way to do
logging, but it's limited and may not be suitable for everyone. The
main problem is that this definition of session isn't exactly the same
as the common notion, which is everything that appears on the user's
screen. Utilities which allocate new pseudo-terminals, such as su and
screen, will cause new log files to appear, and it won't necessarily
be obvious how to splice them into the log of the parent session in
order to reconstruct the text seen by the user.
The solution to this problem is to write the logging information out
as a single stream of data which contains information about the device
that's being used. This is done by specifying the tty_log_fd option
on the UML command line:
tty_log_fd=3 3>tty_log_file
This causes UML's file descriptor 3 to be opened to tty_log_file, and
the logging data will be written to it.
The logging data is a stream of fixed length records with optional
variable length data following. The records have the following form:
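struct tty_log_buf {
        int what;               /* record type, one of TTY_LOG_* */
        unsigned long tty;      /* opaque identifier of the tty */
        int len;                /* number of data bytes following this record */
        int direction;          /* TTY_READ or TTY_WRITE */
        unsigned long sec;      /* timestamp, seconds */
        unsigned long usec;     /* timestamp, microseconds */
};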
The 'what' field can have the following values:
#define TTY_LOG_OPEN 1
#define TTY_LOG_CLOSE 2
#define TTY_LOG_WRITE 3
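As a rough illustration of how the fixed-length record can be decoded on the host, here is a minimal Python sketch. It assumes the reader runs with the same C ABI (word size, alignment, byte order) as the UML binary that wrote the log, which is why it uses the struct module's native mode. The names HEADER, TtyLogBuf, decode_header and what_name are made up for this example and are not part of the UML utilities.

```python
import struct
from collections import namedtuple

# Mirror of struct tty_log_buf: int, unsigned long, int, int,
# unsigned long, unsigned long. "@" selects the host's native sizes and
# alignment; this is an assumption about where the log was produced.
HEADER = struct.Struct("@iLiiLL")

TtyLogBuf = namedtuple("TtyLogBuf", "what tty len direction sec usec")

# Values of the 'what' field, as listed in the text.
WHAT_NAMES = {1: "TTY_LOG_OPEN", 2: "TTY_LOG_CLOSE", 3: "TTY_LOG_WRITE"}


def decode_header(raw):
    """Unpack one record header from exactly HEADER.size bytes."""
    return TtyLogBuf._make(HEADER.unpack(raw))


def what_name(rec):
    """Return the symbolic name of a record's type, if it is a known one."""
    return WHAT_NAMES.get(rec.what, "unknown(%d)" % rec.what)
```

If the log was written on a machine with a different word size or byte order, the format string would need an explicit layout instead of "@".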
The 'tty' field is an integer to be used as a unique identifier of the
tty. It is actually the address within UML of the tty_struct, but
outside UML, it is used as an opaque identifier.
The 'len' field says how much data follows the record. It will be
non-zero for TTY_LOG_WRITE and TTY_LOG_OPEN (in UML 2.4.19-40 and
later) records, and zero for all others.
If the record type is TTY_LOG_WRITE, then the data that was written to
the tty immediately follows the tty_log_buf, and its 'len' field says
how much data there will be.
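The header-then-data framing described above can be sketched as a small reader loop. This is only an illustration: it assumes the log is opened as a buffered binary file, that the record layout matches the host's native C layout, and the name read_records is hypothetical.

```python
import struct

# Assumed native layout of struct tty_log_buf (see the field list above).
HEADER = struct.Struct("@iLiiLL")


def read_records(stream):
    """Yield (what, tty, direction, sec, usec, data) for each record.

    `stream` is a binary file object whose read(n) returns n bytes
    unless the end of the file is reached.
    """
    while True:
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            return  # end of log, or a truncated trailing header
        what, tty, length, direction, sec, usec = HEADER.unpack(raw)
        # 'len' says how many data bytes follow this header.
        data = stream.read(length) if length > 0 else b""
        if len(data) < length:
            return  # data cut short, e.g. the log is still being written
        yield what, tty, direction, sec, usec, data
```

A caller would typically open the file named on the UML command line in "rb" mode and loop over read_records(log).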
For TTY_LOG_OPEN records, the data length is sizeof(long) and the data
is the identifier of the tty that is active in the context of this
open. This is what allows character streams from different terminals
to be spliced back together to reproduce the stream of text that the
user actually saw. This identifier is the 'parent' tty, so the data
from the newly opened 'child' tty needs to be inserted at this point
into the parent's stream.
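One possible way to do this splicing is sketched below in Python. It is not the method used by the scripts in the utilities tarball. It assumes the records have already been parsed into (what, tty, data) tuples, that the caller has kept only the writes it wants to display, and that the parent identifier is encoded as a native unsigned long. The names reconstruct and flatten are hypothetical.

```python
import struct

TTY_LOG_OPEN, TTY_LOG_CLOSE, TTY_LOG_WRITE = 1, 2, 3
LONG = struct.Struct("@L")  # assumed encoding of the parent tty identifier


def flatten(session):
    """Join a session's pieces, expanding nested child sessions in place."""
    return b"".join(flatten(p) if isinstance(p, list) else p for p in session)


def reconstruct(records):
    """records: iterable of (what, tty, data) tuples in log order.

    Returns [(tty, bytes), ...] for every session with no known parent,
    with each child's data inserted where the child was opened.
    """
    current = {}  # tty id -> piece list of its most recent open
    roots = []    # (tty id, piece list) for top-level sessions

    def start(tty, parent=None):
        session = []
        if parent is not None and parent != tty and parent in current:
            current[parent].append(session)  # splice point in the parent
        else:
            roots.append((tty, session))
        current[tty] = session
        return session

    for what, tty, data in records:
        if what == TTY_LOG_OPEN:
            parent = LONG.unpack(data)[0] if len(data) == LONG.size else None
            start(tty, parent)
        elif what == TTY_LOG_WRITE:
            session = current.get(tty)
            if session is None:
                session = start(tty)  # write seen without a preceding open
            session.append(data)
        elif what == TTY_LOG_CLOSE:
            current.pop(tty, None)
    return [(tty, flatten(session)) for tty, session in roots]
```

Starting a fresh piece list on every open is a deliberate choice here: the identifier is an address inside UML, so the same value may show up again for a later, unrelated session.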
The 'direction' field says whether the data was being written to or
read from the terminal. It can have one of these values:
#define TTY_READ 1
#define TTY_WRITE 2
The 'sec' and 'usec' fields are a timestamp, which is useful when
playing the log back with the original timings.
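To illustrate how the timestamps can drive a playback, the following Python sketch turns a sequence of ('sec', 'usec') pairs into the pauses to insert between records. The function name delays is made up for this example, and clamping backward-running timestamps to zero is an assumption rather than documented behavior of any UML tool.

```python
def delays(timestamps):
    """Yield the seconds to wait before each record.

    `timestamps` is an iterable of (sec, usec) pairs in log order.
    """
    previous = None
    for sec, usec in timestamps:
        now = sec * 1000000 + usec  # whole microseconds, no float drift
        if previous is None:
            yield 0.0  # nothing to wait for before the first record
        else:
            # Never wait a negative time if the clock stepped backwards.
            yield max(0, now - previous) / 1000000
        previous = now
```

A player would call time.sleep() with each yielded value before writing the corresponding record's data to the screen.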
To fully use this functionality, you should use UML version 2.4.19-49
or later. In earlier versions, tty logging in skas mode (skas mode is
highly recommended for all security-related applications) was broken
because of a copy_user bug. Also, TTY_LOG_OPEN didn't include the
parent tty in its data, making session reconstruction impossible.
jail/tty_log.pl in the utilities tarball contains a simple log
parser. It reads the records written to tty_log_fd, parses them, and
prints them out. It should be fairly easy to customize it to do
whatever session reconstruction you need.
Also in the utilities tarball is jail/playlog.pl, which is a more
user-friendly interface to the log. By default, it will play back the
session at its original speed if there is only one session in the
log. If there are multiple sessions, it will print out their ids and
exit. You must then rerun playlog, specifying which session you want
to see. In this case, the command line is:
perl playlog.pl log-file [tty-id]
There are some switches which alter its behavior:
-f - follows a live log, similar to 'tail -f'. This will show the
     session live, in real time.
-n - dumps out the session without recreating the original timing.
-a - prints out all data, rather than only tty output. This will
     allow you to see things which didn't echo on the terminal, such as
     passwords. The downside is that all other user input will be
     doubled, since those characters are both tty input and tty output.