Due date: Friday, September 12th by 11:59pm
Context switches occur when the operating system switches from running one process to running another. You are to modify the Linux kernel to record context-switch information for each process. You are then to design experiments to measure the time slice quantum, the context-switch time, and the time to create and destroy processes in the Linux kernel.
The getrusage() system call returns system resource information about a process. The rusage structure it uses has a number of fields, such as time used, messages sent, page faults, and context switches, not all of which are filled in by a given operating system. In particular, the Linux kernel does not record per-process context switches. In this project, you will extend the Linux kernel implementation so that the getrusage() system call returns information about context switches.
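As a starting point, here is a minimal user-level sketch of reading the two context-switch fields of the rusage structure, ru_nvcsw (voluntary) and ru_nivcsw (involuntary). The helper names read_cs_counts() and print_own_cs_counts() and the struct cs_counts are made up for illustration. On a kernel that does not fill in these fields, both values are expected to stay at zero until your changes are in place.

```c
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper type: the two context-switch counters from rusage. */
struct cs_counts {
    long voluntary;    /* copied from ru_nvcsw  */
    long involuntary;  /* copied from ru_nivcsw */
};

/* Read the counters for `who` (RUSAGE_SELF or RUSAGE_CHILDREN).
 * Returns 0 on success, -1 if getrusage() fails. */
int read_cs_counts(int who, struct cs_counts *out)
{
    struct rusage ru;

    if (getrusage(who, &ru) != 0)
        return -1;
    out->voluntary = ru.ru_nvcsw;
    out->involuntary = ru.ru_nivcsw;
    return 0;
}

/* Print the counters of the calling process. Returns 0 on success. */
int print_own_cs_counts(void)
{
    struct cs_counts c;

    if (read_cs_counts(RUSAGE_SELF, &c) != 0) {
        perror("getrusage");
        return -1;
    }
    printf("voluntary: %ld  involuntary: %ld\n", c.voluntary, c.involuntary);
    return 0;
}
```

Calling print_own_cs_counts() from main() before and after a burst of I/O or computation gives a quick check of whether the counters change at all.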
The getrusage() system call is located in linux/kernel/sys.c. The system call itself is sys_getrusage(), which calls the internal function getrusage(). The filled-in fields of the rusage structure come from the structure task_struct. The struct task_struct contains information about each process (task) in the system and is located in linux/include/linux/sched.h. The task_struct fields are modified when a process is created and exits, in linux/kernel/fork.c and linux/kernel/exit.c respectively. They are also modified in linux/kernel/timer.c and in the directory arch/i386/mm.
You need to extend the functionality of getrusage() to return meaningful values for voluntary and involuntary context switches. You will need to add fields to the struct task_struct to keep track of context switches for each process, both for the process itself and for its children. You can model your changes on how minor and major page faults are handled, although the method of counting them is different (see below).
The scheduler is a kernel function called schedule() (located in linux/kernel/sched.c) that gets called from other system call functions (for example, when a process goes to sleep waiting for I/O), after every system call, and after some interrupts. When invoked, the scheduler:
You do not need to be concerned about specific scheduling policies for this project.
Linux maintains a counter kstat.context_swtch, which is a global counter that is incremented whenever a context switch occurs. This increment occurs in the schedule() function, when the process identified by the task_struct pointer variable prev switches to the task_struct pointer variable next (where prev and next are different). At this point you can insert a statement to increment the total number of context switches (both voluntary and involuntary) for the process pointed to by prev. Use prev rather than next because you are keeping track of the number of context switches from a process rather than to a process. You will then have the total number of context switches, both voluntary and involuntary.
In Linux, processes that are running or ready to run have the state in their task_struct set to TASK_RUNNING. So, at this point, a voluntary context switch means the process is in a state other than TASK_RUNNING (most likely waiting for I/O). Therefore, if prev->state is not TASK_RUNNING, the voluntary context switch count can be incremented, too. The involuntary count is then the total count minus the voluntary count.
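The counting rule can be tried out in user space before touching the kernel. The sketch below is only a model, not kernel code: struct model_task is a stand-in for struct task_struct, and the field names total_cs, vol_cs, ctotal_cs, and cvol_cs are hypothetical choices for the fields you will add. The function account_exit() assumes that a child's counts are folded into the parent's child counters when the child exits, following the pattern the page-fault fields are said to use.

```c
/* Stand-ins for the kernel's task states; the real definitions
 * live in linux/include/linux/sched.h. */
#define TASK_RUNNING       0   /* running or ready to run */
#define TASK_INTERRUPTIBLE 1   /* sleeping, e.g. waiting for I/O */

/* User-space model of the parts of task_struct that matter here.
 * All four counter names are hypothetical. */
struct model_task {
    long state;               /* mirrors task_struct's state field */
    unsigned long total_cs;   /* all switches away from this task */
    unsigned long vol_cs;     /* voluntary switches only */
    unsigned long ctotal_cs;  /* total switches of exited children */
    unsigned long cvol_cs;    /* voluntary switches of exited children */
};

/* Model of the bookkeeping at the point in schedule() where
 * prev hands the CPU to next. */
void account_switch(struct model_task *prev, const struct model_task *next)
{
    if (prev == next)
        return;                  /* same task keeps running: no switch */
    prev->total_cs++;            /* count switches *from* prev */
    if (prev->state != TASK_RUNNING)
        prev->vol_cs++;          /* prev gave up the CPU by blocking */
}

/* Involuntary switches are whatever is left of the total. */
unsigned long involuntary_cs(const struct model_task *t)
{
    return t->total_cs - t->vol_cs;
}

/* Model of folding a dead child's counts into its parent. */
void account_exit(struct model_task *parent, const struct model_task *child)
{
    parent->ctotal_cs += child->total_cs + child->ctotal_cs;
    parent->cvol_cs   += child->vol_cs + child->cvol_cs;
}
```

Keeping a total and a voluntary count, and deriving the involuntary count by subtraction, mirrors the two increments described above; storing the involuntary count directly would work equally well.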
Once you have your kernel changes implemented, you should be able to verify that they work by writing some user-level programs and using getrusage(), or fork() and exec() if you wish to measure the context switches of other system programs.
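One way to measure another program is sketched below, assuming your kernel changes also fill in the counters for RUSAGE_CHILDREN. The function name run_and_count() is made up for illustration. It forks, runs the given command with execvp(), waits for it, and reports how much the child counters grew.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Run the NULL-terminated command argv, wait for it to finish, and store
 * the growth of the RUSAGE_CHILDREN context-switch counters in *vol and
 * *invol. Returns the command's exit status, or -1 on any failure. */
int run_and_count(char *const argv[], long *vol, long *invol)
{
    struct rusage before, after;
    int status;
    pid_t pid;

    if (getrusage(RUSAGE_CHILDREN, &before) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);   /* only returns if the exec failed */
        _exit(127);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (getrusage(RUSAGE_CHILDREN, &after) != 0)
        return -1;

    *vol = after.ru_nvcsw - before.ru_nvcsw;
    *invol = after.ru_nivcsw - before.ru_nivcsw;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Taking the difference between two RUSAGE_CHILDREN readings isolates the one command even if the calling process has already waited for other children.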
After you have implemented your getrusage() changes and debugged them carefully, you will then design experiments to measure: 1) the amount of time required to perform a context switch, 2) the amount of time required to create and destroy a process, and 3) how long a time slice is (for a CPU-intensive process). Since the time scales for these operations are very small (typically smaller than the clock granularity of 10 ms), you will measure the time for many operations and then divide by the number of operations performed. Also, you will need to do multiple runs in order to account for any variance in the data between runs.
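The "time many operations, then divide" approach, repeated over several runs, might look like the sketch below. All names here (now_usec(), time_per_op(), summarize_runs(), struct run_stats, sample_op()) are made up for illustration, and sample_op() is only a placeholder for whatever operation you decide to measure.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/time.h>

/* Wall-clock time in microseconds, from gettimeofday(). */
double now_usec(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1e6 + tv.tv_usec;
}

/* Placeholder operation: one cheap system call. */
void sample_op(void)
{
    (void)getppid();
}

/* Time `count` calls of op() and return the average microseconds per call. */
double time_per_op(void (*op)(void), long count)
{
    double start = now_usec();

    for (long i = 0; i < count; i++)
        op();
    return (now_usec() - start) / (double)count;
}

/* Summary of several runs of the same measurement. */
struct run_stats {
    double mean;
    double min;
    double max;
};

/* Repeat the measurement `runs` times and report mean, min, and max
 * of the per-operation time, in microseconds. */
struct run_stats summarize_runs(void (*op)(void), long count, int runs)
{
    struct run_stats s = { 0.0, 0.0, 0.0 };

    for (int r = 0; r < runs; r++) {
        double t = time_per_op(op, count);

        if (r == 0 || t < s.min)
            s.min = t;
        if (r == 0 || t > s.max)
            s.max = t;
        s.mean += t;
    }
    if (runs > 0)
        s.mean /= runs;
    return s;
}
```

A large gap between min and max across runs is a hint that something else was active on the machine during the experiment.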
To measure the time for context switching, consider using a basic CPU-intensive process that counts up to a large number (about 2000000000 on a typical Fossil client). If you run two or more such processes, they will take longer to perform the count. This extra time is the overhead contributed by the context switches.
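A rough sketch of such an experiment follows. The function name time_counters() is made up for illustration, and the sketch assumes that all the counting processes share a single CPU; on a multiprocessor they would run in parallel and the comparison would not show the overhead. The measured time also includes creating and reaping the children, which should be small next to a long count.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>

/* CPU-bound work: count up to `limit`. The volatile qualifier keeps the
 * compiler from optimizing the loop away. */
static void count_to(unsigned long limit)
{
    volatile unsigned long i;

    for (i = 0; i < limit; i++)
        ;
}

/* Start `nprocs` children that each count to `limit`, wait for all of
 * them, and return the elapsed wall-clock time in seconds.
 * Returns -1.0 if any fork() failed. */
double time_counters(int nprocs, unsigned long limit)
{
    struct timeval start, end;
    int started = 0;

    gettimeofday(&start, NULL);
    for (int p = 0; p < nprocs; p++) {
        pid_t pid = fork();

        if (pid == 0) {
            count_to(limit);
            _exit(0);
        }
        if (pid > 0)
            started++;
    }
    for (int p = 0; p < started; p++)
        wait(NULL);              /* reap the children in any order */
    gettimeofday(&end, NULL);

    if (started != nprocs)
        return -1.0;
    return (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
}
```

Under the single-CPU assumption, time_counters(2, N) - 2 * time_counters(1, N) estimates the extra time, and dividing it by the number of context switches reported by getrusage() gives an estimate of the cost per switch.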
Process creation in Linux is done via the fork() system call (check out the sample fork.c). You can use the wait() system call to have the parent process block until a child process has exited. Do a man fork or man 2 wait for more information.
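A minimal sketch of timing process creation and destruction with these calls is shown below. The function name fork_wait_usec() is made up for illustration. Each child exits immediately, so the measured time is dominated by fork(), exit, and the parent's wait.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/wait.h>

/* Create and reap `count` children one after another and return the
 * average wall-clock time per child in microseconds.
 * Returns -1.0 on a bad count or if fork() or waitpid() fails. */
double fork_wait_usec(int count)
{
    struct timeval start, end;
    double total;

    if (count <= 0)
        return -1.0;

    gettimeofday(&start, NULL);
    for (int i = 0; i < count; i++) {
        pid_t pid = fork();

        if (pid < 0)
            return -1.0;
        if (pid == 0)
            _exit(0);            /* child: terminate right away */
        if (waitpid(pid, NULL, 0) < 0)
            return -1.0;         /* parent: block until the child is gone */
    }
    gettimeofday(&end, NULL);

    total = (end.tv_sec - start.tv_sec) * 1e6 + (end.tv_usec - start.tv_usec);
    return total / count;
}
```

For example, fork_wait_usec(1000) averages over a thousand create-and-destroy cycles, which keeps the result well above the clock granularity.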
When your experiments are complete, you must turn in a brief (1-2 page) write-up with the following sections:
You will probably need to modify include/linux/sched.h to add information to struct task_struct. When you add to struct task_struct, you also need to change the INIT_TASK macro (also in sched.h) to be sure the initial values are in place. Also, note that sched.h has a lot of files depending upon it, meaning there will be a lot that need recompilation every time you modify it. So, change sched.h as few times as possible (design twice, compile once).
When writing kernel code, you will want to print messages to stdout, as you do with printf(). Since many parts of the kernel may not have access to the stdio library, kernel developers wrote their own version of printf() called printk(). printk() basically behaves the same as printf() in terms of formatting. Furthermore, printk() also writes messages to the log file /var/log/messages, so you can view output there in case your modified OS crashes. You might add prefixes to your printk() messages, such as "MLC: " or "Fossil: ", so you can more easily pick out your messages from the log file (using grep, perhaps). But be careful! If you have printk() messages in a part of the kernel that is accessed frequently (like the scheduler), they can fill up your log file quickly. When this happens, your system can become unstable. Check the size of your log file (using ls -l) and the free disk space (using df) frequently.
Remember to save your work frequently in case you crash your machine or need to "roll-back" to a previous working source code version! Refer to http://fossil.wpi.edu/ for more information on how to do this and general use of the Fossil lab and other useful Linux links.
The following system calls might be useful:
fork() -- to create a new process.
getrusage() -- to get information about resource utilization.
gettimeofday() -- to get the wall-clock time.
wait() -- to wait for a process to terminate.
execve() -- to execute a file. The call execvp() may be particularly useful.
When running your experiments, you need to be careful of processes in the background (say, a Web browser downloading a page or a compilation of a kernel) that may influence your results. While multiple data runs will help spot periods of extra system activity, try to keep your system "quiet" so the results are consistent (and reproducible). You may consider running your experiments on a system with a minimal number of other system processes. An easy way to do this on Linux (and other versions of Unix) is via the runlevels. The command telinit (run as root by using sudo) will put the computer into different runlevels, with level "1" (single-user mode) being a good candidate. Do a man telinit or a man inittab for more information.
If you find yourself struggling, you might proceed carefully through the following steps:
getrusage() system call. You might make several versions of the test program that do different amounts of computation vs. I/O to observe how the getrusage() values vary.
getrusage() system code and related routines that modify rusage values. Use printk() statements as needed to build up confidence about where to add your modifications.
struct task_struct to record context switches. You need fields for both the process itself and its children. Once you have the structure changes in place, just initialize the values to a fixed, non-zero value, such as one, so you can verify your code is working. When you call getrusage() at this point, it will just return this fixed value. Your code should accumulate values for child processes when these processes exit (as done for other fields in linux/kernel/exit.c). Test your code with a process that creates many child processes, and you should see the number of context switches increase for each forked child process.
linux/kernel/sched.c to properly record context switches. You may use printk() statements here to build up confidence, but this code is a core part of the operating system and will result in numerous log messages, so pay attention to the size of the log file.
You must hand in the following:
sched.c and sys.c files),
The turnin (/cs/bin/turnin) for proj1 is "proj1". When you turn in, also include a file "group.txt" which contains the following:
group_name
login_name1 last_name1, first_name1
login_name2 last_name2, first_name2
...
Also, before you turn in, tar up (with gzip) your files. For example:
mkdir proj1
cp * proj1    # copy all the files you are submitting into the proj1 directory
tar czf proj1.tgz proj1
then:
scp proj1.tgz login_name@ccc:~/
ssh login_name@ccc    # will ask for your ccc password
/cs/bin/turnin submit cs3013 proj1 proj1.tgz
Send all project questions to the TA mailing list.
Send all Fossil administrative questions to the Fossil mailing list.