CS3013 Project 1

Switching Contexts

Due date: Friday, September 6th by 11:59pm

Context switches occur when the operating system switches from running one process to running another. You are to modify the Linux kernel to record context-switch information for each process. You are then to design programs to verify that your implementation works and to evaluate the number of context switches for a variety of processes.

Description

Getting Resource Usage

The getrusage() system call returns system resource information about a process. The rusage structure it uses has a number of fields, such as time used, messages sent, page faults, and context switches, not all of which are filled in by a given operating system. In particular, the Linux kernel does not record per-process context switches. In this project, you will extend the Linux kernel implementation so that the getrusage() system call returns information about context switches.

The getrusage() system call is located in linux/kernel/sys.c. The system call itself is sys_getrusage() which calls the internal function getrusage(). The filled in fields of the rusage structure come from the structure task_struct. The struct task_struct contains information about each process (task) in the system and is located in linux/include/linux/sched.h. The task_struct fields are modified when a process is created and exits, in linux/kernel/fork.c and linux/kernel/exit.c respectively. They are also modified in linux/kernel/timer.c and in the directory arch/i386/mm.

You need to extend the functionality of getrusage() to return meaningful values for voluntary and involuntary context switches. You will need to add fields to the struct task_struct to keep track of context switches for each process, both for the process itself and for its children. You can model your changes on how minor and major page faults are handled, although the method of counting them is different (see below).

Counting Context Switches

The scheduler is a kernel function called schedule() that is called from other system call functions (usually when a process goes to sleep waiting for I/O), after every system call, and after some interrupts. When invoked, the scheduler:

Performs some basic periodic tasks, like handling interrupt service routines (not a concern of this project)
Chooses one process to execute according to the scheduling policy
Dispatches the chosen process to run

You do not need to be concerned about specific policies for this project.

Linux maintains a global counter, kstat.context_swtch, that is incremented whenever a context switch occurs. This increment occurs in the schedule() function, when the process identified by the task_struct pointer variable prev switches to the process identified by the task_struct pointer variable next (where prev and next are different). At this point you can insert a statement to increment the total number of context switches (both voluntary and involuntary) for the process pointed to by prev. Use prev rather than next because you are keeping track of the number of context switches from a process rather than to a process. A context switch is voluntary if, at this point, the process is in a state other than TASK_RUNNING (most likely because it is waiting for I/O). Therefore, if prev->state is not TASK_RUNNING, the voluntary context switch count can be incremented, too. The involuntary count is then the total minus the voluntary count.

Hints

You will probably need to modify include/linux/sched.h to add information to struct task_struct. When you add to struct task_struct, you also need to change the INIT_TASK macro (also in sched.h) to be sure the initial values are in place. Also, note that many files depend on sched.h, so a lot of them need recompilation every time you modify it. Change sched.h as few times as possible.

When writing kernel code, you will want to print messages, as you do with printf(). Since kernel code cannot use the stdio library, kernel developers wrote their own version of printf() called printk(). printk() behaves essentially the same as printf() in terms of formatting. Furthermore, printk() also writes messages to the log file /var/log/messages, so you can view output there in case your modified OS crashes. You might add prefixes to your printk() messages, such as "MLC: " or "Fossil: ", so you can more easily pick out your messages from the log file (using grep, perhaps). But be careful! If you have printk() messages in a part of the kernel that is accessed frequently (like the scheduler), they can fill up your log file quickly. When this happens, your system can become unstable. Check the size of your log file (using ls -l) and the disk space that is free (using df) frequently.

Remember to save your work frequently in case you crash your machine or need to "roll-back" to a previous working source code version! Refer to http://fossil.wpi.edu/ for more information on how to do this and general use of the Fossil lab and other useful Linux links.

If you find yourself struggling, you might proceed carefully through the following steps:

Write a test program that correctly executes the default, unmodified getrusage() system call. You might make several versions of the test program that do different amounts of computation vs. I/O to observe how the getrusage() values vary.
Familiarize yourself with the getrusage() system code and related routines that modify rusage values. Use printk() statements as needed to build up confidence about where to add your modifications.
Add fields into the struct task_struct to record context switches. You need fields for both the process itself and its children. Once you have the structure changes in place, just initialize the values to a fixed, non-zero value, such as one, so you can verify your code is working. When you call getrusage() at this point it will just return this fixed value. Your code should accumulate values for child processes when these processes exit (as done for other fields in linux/kernel/exit.c). Test your code with a process that creates many child processes and you should see the number of context switches increase for each forked child process.
Modify linux/kernel/sched.c to properly record context switches. You may use printk() statements here to build up confidence, but this code is a core part of the operating system and will result in numerous log messages so pay attention to the size of the log file.
Proceed with the project evaluation.
Turn in the project. Relax!

Evaluation

After you have implemented your getrusage() changes and debugged them carefully, you will evaluate the nature of context switches by:

Creating a series of workloads:
1. A basic CPU-intensive process that counts up to a large number. In this case, it should count high enough for this process to require 15 seconds or so to complete its counting (a count to about 2000000000 on a typical Fossil client).
2. A disk I/O-bound process that reads and/or writes to a large file.
3. A network intensive application such as a Web browser (actually browsing) or a file transfer.
4. A varied application such as a compilation of a large program.
Running each workload and recording:
1. context switch information
2. time spent running (both user and system)
3. wall-clock time
Depicting the results clearly. You can do this in a series of tables or even with graphs.
Interpreting the results. Briefly describe what the results mean and what you think is happening.

Feel free to run other processes and interpret their results.

Hand In

You must hand in the following:

All modified source code files for your solution (for example, the entire sched.c and sys.c files).
A compiled version of your kernel.
Instructions on how to incorporate your code into the kernel tree and compile it.
Your evaluation. Include:
1. source code of the evaluation programs you used and brief instructions on how to run them
2. a table or graph of your measured results
3. your analysis and interpretation

The turnin (/cs/bin/turnin) name for this project is "proj1". When you turn in, also include a file "group.txt" which contains the following:

        group_name
        login_name1  last_name1, first_name1
        login_name2  last_name2, first_name2
        ...

Also, before you turn in, tar up (with gzip) your files. For example:

        mkdir proj1
        cp * proj1  # copy all the files you are submitting into the proj1 directory
        tar czf proj1.tgz proj1

then:

        scp proj1.tgz login_name@ccc:~/
        ssh login_name@ccc    # will ask for your ccc password
        /cs/bin/turnin submit cs3013 proj1 proj1.tgz

Send all project questions to the TA mailing list.

Send all Fossil administrative questions to the Fossil mailing list.