CS-502 Operating Systems                                                                              WPI, Fall 2007
Hugh C. Lauer                                                                                        Project 1 (15 points)
Assigned: Monday, September 17, 2007                Part 1 Due: Monday, September 24, 2007
                                                                                    Part 2 Due: Monday, October 1, 2007

Introduction

This project is intended to introduce you to working inside the Linux kernel. In your virtual machine, you will

o       modify the kernel to add a new system call and demonstrate that you can use that system call;

o       add another kernel call to get some useful information; and

o       create kernel patches and test programs, and submit your code.

Creating a Patch

Before we get started, let’s learn how to make a patch file. Patch files are the normal way to distribute small changes to large source trees in the Unix/Linux development world. A patch file describes the differences between an original source tree and a modified tree. If you change only a few lines in a handful of files, the patch file will contain only a few lines. Moreover, it will be easy for someone to find, identify, and read all of your changes.

A patch is created using the diff program. Another person can then apply the patch file using the patch program to the same original kernel tree and create the same result. You can read about the diff and patch programs in the online documentation for Linux.

Suppose that you started with a source tree in the directory linux-2.6.18.8-0.5 and that you made your changes in a cloned directory named kernelSrc.

From a directory immediately outside kernelSrc, execute the following command:–

diff -urN /usr/src/linux-2.6.18.8-0.5 kernelSrc > patch1

This will create the patch in the file patch1.

Note:  You must make the patch file immediately outside your kernelSrc directory. This way, someone else can apply the patch, even if the directory names are different.

            Do not prefix your kernelSrc directory with anything. For example, do not give a fully qualified pathname — even something like ~/kernelSrc or ./kernelSrc. This contaminates the patch file with pathnames that someone else does not have, and therefore the patch will not work for that person. For purposes of this course, it makes grading much more difficult.

Note:  It is also important that you build to a separate object directory such as kernelDst. This prevents your source directory (and your patch file) from being contaminated with about a megabyte of differences from the configuration step of the kernel build.

Suppose now the instructor is using ~/instructorSrc as the grading directory, and suppose it is an exact copy of linux-2.6.18.8-0.5. The instructor will then execute the following commands to create an exact duplicate of your kernelSrc directory: –

cd instructorSrc
patch -p1 < {yourSubmissionDirectory}/patch1

Note that the patch is applied inside the directory being patched, even though it was created outside that directory; the -p1 argument removes one directory level from each file name in the patch, so that patches remain independent of actual directory names.
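To make the effect of the p1 option concrete, here is a minimal user-space sketch in C of the path stripping it performs. The function name strip_components is hypothetical and is not part of the patch program; it only models the removal of leading directory names from a file name recorded in a patch file.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of patch): return a pointer into `path`
 * just past its first `n` slashes, or NULL if the path contains fewer
 * than `n` slashes. This models what "patch -pN" does to each file name.
 */
static const char *strip_components(const char *path, int n)
{
    while (n-- > 0) {
        const char *slash = strchr(path, '/');
        if (slash == NULL)
            return NULL;
        path = slash + 1;   /* continue after the slash just removed */
    }
    return path;
}
```

With one level stripped, kernelSrc/kernel/sys.c becomes kernel/sys.c, which is the name the instructor's tree uses no matter what the top-level directory is called.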

Part 1: Adding a System Call (6 points)

We are now ready to modify the Linux kernel by adding a new system call. This part of the project is essentially as described on pages 76-78 of the Silberschatz textbook, but the details are different because the Linux kernel version is different. Comprehensive information on implementing system calls is provided in Chapter 5 of Linux Kernel Development, 2nd edition, by Robert Love.

You should start with a new clone of the kernel tree made using the linked copy command as we did in Project 0:–

cp -al /usr/src/linux-2.6.18.8-0.5 kernelSrc

The system call that you will add simply puts a message in the log saying that it was called, and then it returns. It uses the printk() function, the kernel equivalent of printf().

·        Edit the file called kernelSrc/kernel/sys.c. Add the following two lines to the top of this file:–

#include <linux/linkage.h>
#include <linux/kernel.h>

and the following lines to the end:–

asmlinkage long sys_helloworld(void) {
   printk(KERN_EMERG "Hello, world!\n");
   return 0;
}

Note that there is no comma between KERN_EMERG and "Hello, world!\n". These are simply two strings that are concatenated together to form one argument.
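The concatenation of the severity prefix and the message can be tried out in user space. The sketch below is an illustration only: DEMO_KERN_EMERG and demo_message are names invented for this example. The stand-in macro mirrors the 2.6-series kernel headers, where KERN_EMERG expands to the string literal "<0>".

```c
#include <string.h>

/*
 * Stand-in for the kernel's KERN_EMERG (assumption: "<0>", as in the
 * 2.6-series headers). The DEMO_ prefix marks it as ours, not the kernel's.
 */
#define DEMO_KERN_EMERG "<0>"

/*
 * Adjacent string literals are merged by the C compiler into a single
 * string, so the prefix and the message form one argument.
 */
static const char *demo_message(void)
{
    return DEMO_KERN_EMERG "Hello, world!\n";
}
```

Putting a comma between the two literals would instead pass the prefix alone as the format string and the message as an unused extra argument.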

This is the entire program to implement this system call! It illustrates the Linux kernel call conventions. In Linux, all kernel call functions are named sys_nameOfFunction. The compiler directive asmlinkage tells the C compiler how to compile the call so that it can receive its arguments.

·        You next need to register your system call with the kernel. How you do this depends upon what kind of machine you are running. If you are using a 32-bit i386 architecture (e.g., a 32-bit Pentium), then do the following:–

o       Edit the file unistd.h in the directory kernelSrc/include/asm-i386 to define a new system call number. You will see a list of other system calls; your entry should follow the same pattern and be added to the end. It must be named

__NR_helloworld[1]

and it must have a number one greater than the last existing call in the list. You also need to increment the value in NR_syscalls. Since system calls start at zero, the total number of system calls will be one greater than your system call number.

o       Edit the file syscall_table.S in the directory arch/i386/kernel to create an entry point for your system call.[2] You will see a long list of entry points of system calls. Add the line

.long sys_helloworld

to the end of this list. It is essential that this list be maintained in the same numerical order as the list in unistd.h.

·        If you are using an x86_64 architecture, then do the following instead:–

o       Edit the file unistd.h in the directory include/asm-x86_64 to define a new system call number for __NR_helloworld. You will see a list of other system calls; your entry should follow the same pattern. That is, you should add two lines of the following form to the end of the list

#define __NR_helloworld 251 /* whatever is the next number in the list */
__SYSCALL(__NR_helloworld, sys_helloworld)[3]

Make sure that the number of system calls is set appropriately.

If your architecture is something other than either of these, find the appropriate asm directory under include and edit the file unistd.h, following the pattern used for that architecture.

Note: The order of system calls in the list is critical. Once a system call has been defined, it can never be redefined or undefined in a future version of Linux without breaking libraries that are installed in user space.
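The relationship between the call numbers in unistd.h, the count in NR_syscalls, and the order of entries in the table can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: every DEMO_ and demo_ name and every number below is made up for illustration, and the real kernel dispatches through assembly code rather than a C function like demo_dispatch.

```c
#include <errno.h>

/* Hypothetical call numbers; the real ones come from unistd.h. */
#define DEMO_NR_first      0
#define DEMO_NR_second     1
#define DEMO_NR_helloworld 2
#define DEMO_NR_syscalls   3   /* one greater than the last call number */

typedef long (*demo_syscall_fn)(void);

static long demo_first(void)      { return 10; }
static long demo_second(void)     { return 20; }
static long demo_helloworld(void) { return 0; }

/* Entry i must be the handler for call number i, as in syscall_table.S. */
static const demo_syscall_fn demo_table[DEMO_NR_syscalls] = {
    demo_first,
    demo_second,
    demo_helloworld,
};

/* Look up the handler by number; out-of-range numbers yield -ENOSYS. */
static long demo_dispatch(unsigned int nr)
{
    if (nr >= DEMO_NR_syscalls)
        return -ENOSYS;
    return demo_table[nr]();
}
```

If the table order and the numbers disagree, a call to one number silently runs a different handler, which is why the two lists must be kept in step.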

Rebuild and install the kernel with these changes. Be sure to change the kernel version string in the configuration utility.

To test your new system call, write the following simple program in user space:–

#include <errno.h>
#include <sys/syscall.h>
#include <linux/unistd.h>
#include <stdio.h>

#define __NR_helloworld 288  /* or whatever you set it in unistd.h */

_syscall0(long, helloworld); /* zero indicates the number of arguments */

int main(void) {
   printf("The return code from the helloworld system call is %ld\n", helloworld());
   return 0;
}

[Note: the _syscalln macros have apparently become deprecated as of gcc version 4.1. Students found that the recommended alternative is the syscall(2) function, which has been available for years. However, it provides no checking of argument types.]

The _syscall0 macro creates the function helloworld, which is just a trap to the kernel to invoke your newly created kernel function. (If your system call takes n arguments, you would have used the _syscalln macro.)
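If the _syscalln macros are not available on your system, the syscall(2) alternative mentioned above might look like the following minimal sketch. The number 288 is the same placeholder used in the test program and must match whatever you put in unistd.h; invoke_syscall0 and DEMO_NR_helloworld are names invented for this example.

```c
#include <unistd.h>
#include <sys/syscall.h>

/* Assumption: must match the number you added to unistd.h. */
#define DEMO_NR_helloworld 288

/*
 * Invoke a zero-argument system call by number through syscall(2).
 * Returns the call's result, or -1 with errno set on failure.
 */
static long invoke_syscall0(long nr)
{
    return syscall(nr);
}

/* Plays the role of the function that _syscall0 would have generated. */
long helloworld(void)
{
    return invoke_syscall0(DEMO_NR_helloworld);
}
```

Because syscall(2) takes a variable argument list, the compiler cannot check that you pass the right number or types of arguments, so a thin typed wrapper like helloworld is worth keeping.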

Running this program should generate a kernel message with severity KERN_EMERG. The syslogd daemon puts these messages into the circular log file /var/log/messages. You need root privileges to read this file.

Last year, students had a hard time finding their message in the log. You may have to brainstorm to figure out how to read the log message. Two other ways to read the log are the program /bin/dmesg or the command cat /proc/kmsg.

Submitting Part 1

To submit the results from this part of the assignment, create and submit a file patch1 that reflects the differences between this kernel and the original kernel source tree. Also submit a copy of your test program, a Makefile for it, and a very short report confirming that you have seen the message generated by your system call. Be sure to put your name on your files and documents!

As a matter of pacing yourself, you should complete this part in a few days. You may need more than a week to complete Part 2.

Part 2: Getting Process Information (9 points)

In this part of the assignment, you will create a system call that gets some useful information about the current process, and you will add it to your kernel of Part 1.[4]

The function prototype for your system call will be

long getprinfo(struct prinfo *info);

where info is a pointer to a data structure in user space where your system call will put information about the process. The system call will return zero if successful or an error indication if not successful. The structure prinfo is defined as follows:–

struct prinfo {
   long state;                /* current state of process */
   long nice;                 /* process nice value */
   pid_t pid;                 /* process id */
   pid_t parent_pid;          /* process id of parent */
   pid_t youngest_child_pid;  /* pid of youngest child */
   pid_t younger_sibling_pid; /* pid of younger sibling */
   pid_t older_sibling_pid;   /* pid of older sibling */
   unsigned long start_time;  /* process start time */
   long user_time;            /* CPU time spent in user mode */
   long sys_time;             /* CPU time spent in system mode */
   long cutime;               /* total user time of children */
   long cstime;               /* total system time of children */
   long uid;                  /* user id of process owner */
   char comm[16];             /* name of program executed */
};

If the calling process does not have, say, a youngest child or any siblings, then return -1 in the corresponding fields.

Create a new kernel include file called include/linux/prinfo.h for this definition. This new include file will have to include the file include/linux/types.h for the definition of pid_t.

You will need a separate user-space include file prinfo.h that repeats the definition of struct prinfo and that also includes the function prototype of getprinfo. This user-space include file should reside in the same directory as your user-space test program.
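One possible shape for the user-space header is sketched below. The include guard name PRINFO_H is an assumption, and the user-space header sys/types.h is used here in place of the kernel's linux/types.h to obtain pid_t; the structure itself is copied from the definition above.

```c
/* prinfo.h: user-space copy of struct prinfo plus the getprinfo prototype */
#ifndef PRINFO_H            /* guard name is an assumption */
#define PRINFO_H

#include <sys/types.h>      /* pid_t in user space */

struct prinfo {
   long state;                /* current state of process */
   long nice;                 /* process nice value */
   pid_t pid;                 /* process id */
   pid_t parent_pid;          /* process id of parent */
   pid_t youngest_child_pid;  /* pid of youngest child */
   pid_t younger_sibling_pid; /* pid of younger sibling */
   pid_t older_sibling_pid;   /* pid of older sibling */
   unsigned long start_time;  /* process start time */
   long user_time;            /* CPU time spent in user mode */
   long sys_time;             /* CPU time spent in system mode */
   long cutime;               /* total user time of children */
   long cstime;               /* total system time of children */
   long uid;                  /* user id of process owner */
   char comm[16];             /* name of program executed */
};

/* Implemented in your test program as a wrapper around the system call. */
long getprinfo(struct prinfo *info);

#endif /* PRINFO_H */
```

The field order and types must match the kernel copy exactly, since the kernel copies the structure out byte for byte.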

Implementation

Before you attempt to implement your system call, you should look at the implementations of the getuid and getpid system calls to provide guidance. These can be found in the file kernel/timer.c. Here are some things you should know:–

·        Almost all of the information you need to fill in the fields of a prinfo structure can be found in the structure task_struct, defined in include/linux/sched.h. Study this structure carefully! Some of the information is obtained by following links from task_struct, for example, to child or sibling processes. In the 2.6 kernel, children and siblings are linked through struct list_head fields rather than plain pointers, so you detect a missing child or sibling by examining the list (for example, with list_empty) rather than by testing for null.

·        The kernel file include/asm/current.h defines an inline function that returns the address of the task_struct of the current process.

·        Every system call must check the validity of the arguments passed by the caller. In particular, kernel code must never blindly follow a pointer provided by a caller in user space. Fortunately, the Linux kernel provides two functions that not only check the validity but also transfer information between kernel space and user space. These functions are copy_from_user and copy_to_user. You will need to use the latter. For example, if you have accumulated the information in a data structure in the kernel called

struct prinfo kernel_info;

then you can use copy_to_user as follows:–

/* copy data from kernel_info to area supplied by caller */
if (copy_to_user(info, &kernel_info, sizeof(struct prinfo)))
           return -EFAULT;

where EFAULT is an error code defined in include/asm/errno.h.

The copy_to_user function returns zero if the info argument provided by the caller is valid and the copy is successful, but it returns the number of bytes that failed to copy in case of an error. 

·        You don’t need to worry about page faults in the user space or about blocking and/or pre-emption by another process. Your system call operates in something called process context. It has access to both kernel and user data, and it is capable of taking page faults, being pre-empted, or going to sleep without affecting the kernel or other processes.

·        You need to install your system call in the appropriate places in unistd.h and syscall_table.S, as you did in Part 1 of this assignment. For simplicity of this assignment (and of grading), add your getprinfo system call after the helloworld system call from Part 1 of this assignment. Don’t delete the helloworld implementation!
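The copy-out step described in the list above can be modeled in user space to show the overall shape of the system call body. This is a sketch under loud assumptions, not kernel code: model_copy_to_user only imitates the return convention of copy_to_user (zero on success, otherwise the number of bytes not copied) and rejects nothing but a null pointer, whereas the real function validates the whole user address range. The reduced structure demo_info and the value 1234 are placeholders.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for struct prinfo, kept small for the example. */
struct demo_info {
    long pid;
    char comm[16];
};

/*
 * User-space model of copy_to_user's return convention: 0 on success,
 * otherwise the number of bytes NOT copied. Only NULL is rejected here.
 */
static unsigned long model_copy_to_user(void *to, const void *from,
                                        unsigned long n)
{
    if (to == NULL)
        return n;
    memcpy(to, from, n);
    return 0;
}

/* Build the result in a local structure, then copy it out in one step. */
static long model_getinfo(struct demo_info *info)
{
    struct demo_info kernel_info;

    /* Zero first so that padding and unused bytes are never copied out. */
    memset(&kernel_info, 0, sizeof(kernel_info));
    kernel_info.pid = 1234;   /* placeholder value */
    strncpy(kernel_info.comm, "demo", sizeof(kernel_info.comm) - 1);

    if (model_copy_to_user(info, &kernel_info, sizeof(kernel_info)))
        return -EFAULT;
    return 0;
}
```

Filling a local structure and copying once keeps the caller's pointer out of the rest of the code, so there is exactly one place where a bad address can be detected.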

Implement your new system call in its own file called prinfo.c, and make it part of the kernelSrc/kernel directory. Also, edit kernelSrc/kernel/Makefile to add prinfo.o to the list of object files near the top of the Makefile. This will cause it to be compiled and linked with the rest of the kernel.

Testing your System Call

Write a simple user-space program that calls getprinfo patterned after the one you wrote for testing your helloworld call in Part 1. As part of this test program, you will need to create and include a user-space version of prinfo.h. Your test program should print all of the information returned in the prinfo structure for the calling process. Run it several times; also run it from several different shells. Also test what happens if the pointer argument is null or invalid; show that the system call does the right thing.
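For the null or invalid pointer test, a failed system call invoked through syscall(2) shows up in user space as a return value of -1 with errno set to the error code. The sketch below is an assumption-laden illustration: the helper name fails_with_efault is invented, and it is shown being used with the existing uname system call as a stand-in, because getprinfo exists only on your modified kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/utsname.h>

/*
 * Call system call `nr` with a single pointer argument and report whether
 * it failed with EFAULT. syscall(2) returns -1 and sets errno on error.
 */
static int fails_with_efault(long nr, void *user_ptr)
{
    errno = 0;
    return syscall(nr, user_ptr) == -1 && errno == EFAULT;
}
```

For example, fails_with_efault(SYS_uname, NULL) exercises the same copy-out failure path in an existing call. On your own kernel you would pass the number you assigned to getprinfo instead of SYS_uname.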

Note which fields change from run to run and which fields do not. Also, inspect the results to see if they make sense. In your write-up, discuss why some things change and how frequently they change.

For debugging your system call, use the printk() function that you used in Part 1. You may see this information in the log file /var/log/messages.

Submitting Part 2

To submit this part of the assignment, create a patch file patch2 that describes the differences between this kernel and the original Linux source tree.

Also submit your test program, your user-space include file, a Makefile to build and clean your user-space test program, and some sample test results of invoking your system call more than once.

Finally, include a write-up (in MS Word or PDF format) describing your tests and how the information returned by your system call changes from run to run.

General Notes on Submission of this Assignment

Be sure to put your name at the top of every file you submit and/or at the top of every file that you edit!

Submit your assignment for grading as directed in class using the web-based Turnin tool developed by Professor Fisler’s group. A brief introduction can be found at

            http://web.cs.wpi.edu/~kfisler/turnin.html

and access to the turnin system itself can be found at

            http://turnin.cs.wpi.edu:8088/servlets/turnin.ss

For purposes of Turnin, this assignment is really two projects —

o       Project1, Part1 and

o       Project1, Part2.

Do not put separate parts of the assignment in separate folders or specify separate makefiles for them. Do not zip everything together into a zip file.

Individual Assignment

This is an individual project, not a team project. Each student should submit his/her own work, not copies of jointly developed code.

Nevertheless, if you are puzzled or unsure of some aspect of this assignment, you should consult your friends, colleagues, or the instructor to help clarify your understanding or derive an approach to the problem.



[1]       Note that __NR_helloworld starts with a double underscore.

[2]       This table used to be in entry.S as described by Robert Love in Linux Kernel Development, but it was recently broken out to be more manageable.

[3]       Note that __SYSCALL starts with a double underscore and is all caps.

[4]       This part of the project, including the function prototype and the structure definition, is borrowed from a project assignment by Prof. Jason Nieh of Columbia University.