CS-502 & CS-3013 Operating Systems                                                           WPI, Fall 2007
Hugh C. Lauer                                                                                        Project 2 (30 points)
Assigned: Monday, October 1, 2007                                   Due: Monday, October 22, 2007

Introduction & Problem statement

In this project, you will implement a mailbox system within the Linux kernel that allows processes in separate address spaces to send and receive messages among themselves.

The purpose of this project is twofold:–

1. To gain practical experience with programming and testing an inter-process communication system, and

2. To gain experience with synchronization and memory management in the Linux kernel.

Traditionally, Unix and Linux processes occupied separate address spaces and had no way to communicate with each other except via pipes, files, or segments of shared virtual memory. Although the invention of threads has mitigated this problem substantially, there are still circumstances in which processes need to retain the benefits of separate address spaces while enjoying a more friendly form of inter-process communication.

You mailbox system will allow processes to send messages to each other. These messages will be copied by the kernel into mailboxes of the receiving processes. To keep things simple, mailboxes will be of fixed size, messages will be processed in FIFO (first-in, first-out) order, and each process (i.e., thread group in Linux kernel terminology) will have exactly one mailbox. Any thread or process will be able to send messages to any mailbox (including its own), but only the threads of the mailbox “owner” will be able to receive them.

To test your mailbox system, you will need to write a test system in user space that sends messages randomly, receives replies, and keeps track of what was sent and received.

It is suggested but not required that you base your design on the FIFOMessageQueue monitor that was presented in class. A copy is in the Appendix at the end of this document.

Mailboxes and Messages

The user-level interface (i.e., the API) to the mailbox facility is defined in the C header file mailbox.h and specified in this section. A mailbox is an abstract object capable of containing a bounded number of messages of undefined structure. A mailbox is created for each process at the time the process in created, and it is destroyed when the process exits. Threads within the same process share the same mailbox.

In order to keep things simple, message bodies in this project are of fixed maximum size but of no defined type. The mailbox header file defines this maximum size as

#define MAX_MSG_SIZE 128

Your kernel should specify a maximum number of messages that any mailbox may hold. This should be at least 32, but any larger number would be acceptable. The maximum number of messages is not part of the message system API.

Any process or thread (i.e., task) may send a message to any process (including itself) if it knows the process ID of that process — i.e., its pid_t. It may learn this process ID when it is forked or spawned or from some other source of information such as another message. To send a message, the task invokes

int SendMsg(pid_t dest, void *body, int len, bool block)

The argument dest is the process ID of the recipient. The argument *body points to a string of uninterpreted bytes of length len, not exceeding MAX_MSG_SIZE. The application program must cast this pointer to its own data structure. The argument block specifies whether the sending task blocks if it cannot send the message. If block is TRUE and the mailbox is full, then the calling task will block and only return when the mailbox is no longer full and the message is sent. If block is FALSE and the mailbox is full, then the sending task will return immediately with an error indication, and the message will not be sent.

In all cases, SendMsg returns zero if it successfully queues the message onto the destination mailbox, and it will return an error code if not. See below for error codes. A special condition occurs if the mailbox is “stopped” (see below). In this case, the SendMsg returns an error code.

To receive a message and remove it from its mailbox, a task invokes

int RcvMsg(pid_t *sender, void *msg, int *len, bool block)

The process ID of the sender is returned in the pid_t pointed to by *sender. The area pointed to by *msg must be an array of bytes of MAX_MSG_SIZE in length. The queued message is returned to this area, and its length is returned in the integer pointed to by *len. If the argument block is TRUE and the mailbox is empty, RcvMsg blocks and does not return until a new message has been received. If the argument block is FALSE and the mailbox is empty, then RcvMsg returns immediately with an error. If the mailbox is stopped, RcvMsg will remove and return a queued message from the mailbox, but it will return an error if there are no queued messages. In all cases, RcvMsg returns zero if it successfully dequeues a message from the mailbox and an error code if not.

Note that any thread of a process, or even multiple threads, can attempt to receive messages from the mailbox of that process. Messages are received in FIFO order.

A task may manage its mailbox by calling

int ManageMailbox(bool stop, int *count)

The argument *count is a pointer to an integer; ManageMailbox will return the number of queued messages in the mailbox to this integer. The Boolean argument stop specifies whether to stop this mailbox from receiving any more messages. If stop is true, then any attempt to send a future message to this mailbox results in an error to the sending task. Also, all blocked calls to SendMsg or RcvMsg for this mailbox immediately return with an error. A task may continue to receive accumulated messages from its mailbox after stopping it. A mailbox may be stopped multiple times, with the same effect as stopping it once. To keep this assignment simple, there is no way to start a mailbox that has been stopped.

Each mailbox is automatically started when its process is created, and it is automatically stopped when its process is terminated. Also, when the process terminates, all messages must be flushed, and any blocked tasks must immediately return with an error. Note that this only happens when the last remaining thread of the process terminates.

Error codes

All of the message and mailbox functions return int values indicating success or error. A return value of zero represents success. The following errors have been defined: –

· MAILBOX_FULL: This is returned when SendMsg with block set to FALSE attempts to send a message to a mailbox that already contains the maximum number of messages that it can hold.

· MAILBOX_EMPTY: This is returned when RcvMsg with block set to FALSE attempts to retrieve a message from an empty mailbox.

· MAILBOX_STOPPED: This is returned when any call to SendMsg is made to a mailbox that has been stopped. It is also returned by a call to SendMsg or RcvMsg with block set to TRUE when that call was blocked at the time the mailbox is stopped.

· MAILBOX_INVALID: This is returned when SendMsg specifies a destination that is not a valid process ID. Note that kernel tasks and certain other system processes that are initialized before the mailbox system are also considered invalid.

· MSG_TOO_LONG: This is returned when a call to SendMsg attempts to send a message longer than MAX_MSG_SIZE.

· MSG_ARG_ERROR: This is returned when any pointer argument to any message or mailbox call is invalid, so that copy_to_user() and copy_from_user() fail.

· MAILBOX_ERROR: This is returned in the case of any other mailbox or message error.

You may add additional error returns as needed.

Implementation

There are several parts to the implementation of this message and mailbox facility: –

· Adding system calls to implement the functions of the mailbox facility.

· Setting up the mailbox during process creation and deleting it during termination. You should only delete a mailbox when all tasks of a task group (i.e., process) have terminated.

· Initialization of the mailbox system.

· Allocating and managing space for messages and mailboxes in the kernel.

· Implementing the synchronization needed for the sending and receiving messages.

· Creating a user-space module that provides the interface to the message and mailbox facility.

In addition, you must write a test program to exercise your mailbox facility.

Before you start

You should start with a clean kernel tree. Clone /usr/src/linux-2.6.18.8-0.5 into a working directory such as kernelSrc and build the kernel as in Project 0. After you have built it, apply the following patch file: –

prePatch-Project2

This patch does three things, all of which you can see by reading the patch file: –

· It adds a field to the Linux task_struct for your mailbox. This field is pointer to a new C-language struct called mailbox, which you will have to define in your implementation.

· It adds three entries

.long sys_mailbox_send
.long sys_mailbox_rcv
.long sys_mailbox_manage

to the file arch/i386/kernel/syscall_table.S. You will implement your mailbox facility using three system calls with these names.

· It adds three entries

#define __NR_mailbox_send 318
#define __NR_mailbox_rcv 319
#define __NR_mailbox_manage 320

to the file include/asm/unistd.h and updates the number of system calls.

Next, clone this kernelSrc tree to another kernel tree by the command

cp –al kernelSrc kernelSrc2

Do all of your work in kernelSrc2. Then, when you are ready to submit, create a difference file as follows: –

diff –urN kernelSrc kernelSrc2 > patch-Project2

Submit this patch only, not a patch against the original Linux sources.

The reason for doing all of this is that the three header files modified by the pre-patch file are included by many files in the kernel. If each student were to change these files separately, it would require many, many hours of compilation by the instructor to grade this project. By having all students apply this same patch, the instructor can pre-compile the kernel and take a snapshot. Then, each student’s patch_Project2 can be applied in turn to the snapshot and can be compiled much more quickly.

System Call

In most professional situations, system call numbers are scarce and precious resources. You would normally be restricted to using only one for your entire message and mailbox facility, even though it has several functions. For purposes of keeping this assignment simple, we will allow three system calls as provided above. You may define them however you wish, so long as you stay within the constraints of the pre-patch file.

A useful debugging technique used by students of previous terms is to added additional functionality to one of the system calls. This helps, for example, to print out the message queue of the mailbox and to examine the values of the mailbox data structures from the pleasure of a user space program. It is a lot easier than using printk() in many cases.

Creating, deleting, and keeping track of mailboxes

The primary work of creating a new process is performed by the function do_fork() in kernelSrc/kernel/fork.c, and likewise the primary work of terminating a process is performed by the function do_exit() in kernelSrc/kernel/exit.c. You will need to modify both of these functions to create and delete your mailbox. See Robert Love, Linux Kernel Development, pp 31-33 and 35-36, respectively.

When a new process is forked, you need create a new mailbox and store a pointer to it in the mailbox field of the process’s task_struct. You may use kmalloc() to allocate memory for your data structure that defines the new mailbox. To delete your mailbox data structure, you may use kfree(). Both of these are defined in linux/slab.h and are described on pages 187-192 of Robert Love’s book.[1]

If your process has more than one thread, they all share the same mailbox. Therefore, when a thread is spawned, do_fork() is called with the CLONE_THREAD switch. In this case, you need to use the same mailbox pointer as the spawning task rather than create a new one.

When the thread exits, do_exit() is called. One of the things that this function does is to check to see whether the entire thread group is dead; you can see this code in do_exit(). If the group is dead, you must delete the mailbox and reclaim the memory for the mailbox data structure. However, if the group is not dead, you must not delete the mailbox.

Don’t forget than when you are deleting a mailbox, you need to implement the specified functionality for calls that are blocked, so that they can be unblocked and return errors. The easiest way to do this is to stop the mailbox and let the blocked calls to it return, as described below.

Note that kernel threads, and possibly some other system processes created early during start-up, will not have mailboxes. The mailbox field in the task_struct description of these tasks should be the null pointer. You can recognize a kernel thread because it has no virtual memory — i.e., its task_struct->mm pointer is null. An attempt to send a message to a task with a null mailbox pointer should return the error MAILBOX_INVALID.

Memory allocation for messages

Messages come and go frequently, so using kmalloc() to allocate memory for each new message is not appropriate. Instead, memory for messages should come from the slab allocator in linux/slab.h. These will be in a cache of identically sized chunks of kernel memory implemented by the kernel as a set of slabs. Such a cache can be created by

kmem_cache_t* kmem_cache_create(const char* name, size_t size,
size_t align, unsigned long flags, NULL, NULL)

This function returns a pointer to a cache object. The first argument name points to a string containing the text name of the object, and the second argument is the size of each element in the cache (i.e., MAX_MSG_SIZE plus whatever overhead is needed for links and for storing the sender ID, the actual length of a message, and any other information you need). The final two arguments, align and flags, may be zero. These last two arguments are for constructors and destructors to create or delete slabs within the cache; setting these to NULL causes the kernel to use the default constructor and destructor.

Caution: Creating the cache is something that you need to do at kernel initialization time, before processes start to be forked. However, if you do it too early, the kernel will crash. Look at the kernel file init.c to see where to put your mailbox cache initialization.

After a cache has been created, an object is obtained from the cache by

void* kmem_cache_alloc(kmem_cache_t* cachep, int flags)

This returns a pointer to a cache object of size bytes, where size was specified when the cache was created. For flags, the value GFP_KERNEL is probably what you want. To free the object, use

void kmem_cache_free(kmem_cache_t* cachep, void* objp)

This returns the object pointed to by objp to the cache for subsequent reuse.

It is not clear whether you need to destroy the cache upon shutdown. If you need to, you should use

int kmem_cachd_destroy(kmem_cache_t* cachep)

The caller must ensure that all slabs in the cache are empty and that no one accesses the cache during and after its destruction. It seems like it would be much simpler to let the system complete its shutdown, at which time everything goes away.

See pages 198-201 in Robert Love’s book for a discussion of slab allocation.

Synchronization and Design of the Mailbox

The mailbox facility of this assignment clearly falls under the category of task parallelism as discussed in class.[2] It is suggested that you pattern each individual mailbox as a monitor, along the lines of the FIFOMessageQueue example presented in class. However, the C programming language has no support for monitors, and the Linux kernel does not directly provide monitor-style synchronization. Therefore, you will have to simulate the basic monitor operations using the existing synchronization methods described in Chapters 8 and 9 of Loves’ Linux Kernel Development.

The two most useful synchronization primitives for this project are probably spinlocks and semaphores. Spinlocks are defined in linux/spinlock.h. A spinlock can be used as the monitor lock for each individual mailbox. Note that each simulated kernel monitor should contain precisely one spinlock initialized to SPIN_LOCK_UNLOCKED. All monitor functions must acquire this spinlock before proceeding and release it after completion – i.e.,

spinlock_t mr_lock = SPIN_LOCK_UNLOCKED;

long sys_mailbox_send(...) {
   spin_lock(&mr_lock);
   /*   SendMsg function body      */
   spin_unlock(&mr_lock);
};

long sys_mailbox_rcv(...) {
   spin_lock(&mr_lock);
   /*   RcvMsg function body */
   spin_unlock(&mr_lock);
};
...     /*   other mailbox kernel functions */

The kernel can hold a particular spinlock on behalf of only one task at a time, so that at most one thread can be in a particular monitor at any give instant. A spinlock protects against the issues of multiprocessors, even if two tasks were running on separate CPUs. Recall that spinlocks are most appropriate for code that executes quickly and is guaranteed not to become blocked while holding the spinlock.

Spinlocks must not be used in the Linux kernel for code that might sleep, take page faults, or otherwise wait for a long time. If a task might need to block itself — for example, to await an external event such as the sending of a message by some other task — then some mechanism for simulating a condition variable should be used. Recall that when a task waits on a condition variable within a monitor, it releases the monitor lock, goes into the blocked state, and reacquires the monitor lock when it is awakened again. In normal monitor semantics, another task may signal the condition variable and, if any task is waiting on that condition variable, precisely one of the tasks is unblocked. However, if no task is waiting, the signal has no effect.

This is a little bit harder to simulate using the Linux kernel synchronization primitives, but it can be done with a semaphore and a counter indicating the number of waiting tasks. Semaphores are defined in asm/semaphore.h. This defines the structure struct semaphore, which may be allocated statically or dynamically. Since mailboxes are created dynamically, you probably want to dynamically create semaphores for them by

sema_init(sem, count)

where sem points to your semaphore structure and count is its initial value. Functions for operating on semaphores include

down(struct semaphore* s)
down_interruptible(struct semaphore* s)
down_try_lock(struct semaphore* s)
up(struct semaphore* s)

The down operations are attempts to decrease a semaphore, also known in CS-502 as wait_s and popularized by Dijkstra as P. The up operation increases the value of the semaphore or, if other tasks are waiting on a down operation, it unblocks one task. The up operation was introduced in CS-502 as post_s and was popularized by Dijkstra as V.

For a task that needs to wait on a simulated condition variable in your monitor, the following pseudo-code is suggested, where mr_lock is the monitor lock, sem is the semaphore for the condition variable (initially zero), and wait_count is a count of the number of tasks waiting on the condition variable (also initially zero).

/* assume that the task is holding the monitor lock */
wait_count++;           /* increment # of waiting tasks */
spin_unlock(&mr_lock); /* release monitor lock */
down_interruptible(&sem);    /* wait on the semaphore */
spin_lock(&mr_lock);    /* reacquired the monitor lock */

To signal the condition variable, the following pseudo-code is suggested:–

/* the task is holding the monitor lock */
if (wait_count > 0)
up(&sem); /* unblock exactly one waiting task */
wait_count--; /* decrement # of waiting tasks */

Note that the signaling task decrements the wait count while holding the monitor lock, in order to avoid a race condition between waiting tasks and signaling tasks.

Flushing the message queue

When you stop the mailbox, it is necessary consult the wait count to see if any tasks need to be unblocked. If so, unblock them in a while loop while holding the monitor lock. Note that any tasks blocked on SendMsg or RcvMsg would be immediately unblocked but would receive an error indicating that the message could not be sent because the mailbox is stopped. The message queue can only be flushed if all blocked receiving tasks have reacquired the monitor lock and dequeued their messages.

Note: There appears to be the potential for a race condition between the flushing task and receiving tasks that had been previously blocked and signaled, but that have not yet finished dequeuing their messages. Students need to address this race condition.

User-space support

No application program will call the kernel directly. Instead, you must implement a user-space support module that can be bound with an application program to invoke the message facility. This module must implement the interface mailbox.h as specified earlier in this document. Your implementation should be compiled into a file mailbox.o that can be bound with any test program, either yours, the instructor’s. or another student’s.

Testing your Message System

Testing any inter-process synchronization system is difficult. In previous terms, students were asked to create multiple processes or threads to add up numbers or to do a matrix multiplication in parallel (see Silbershatz, pp. 236-241). This turned out to be too simple and did not flush out the typical mistakes and problems that occur in the design of a multi-task synchronization problem such as these mailboxes.

A year ago, we tried to build a distributed Game of Life to more thoroughly exercise the message passing system. This does expose the kinds of problems that occur in synchronization systems, but the complexity of the Game itself made the project too difficult.

Therefore, we now leave it up to the student to develop a test program that stimulates the message passing system. Such a program should fork a number of processes, passing its own process ID to each one.

The parent should then randomly sleep and read messages. It should forward process IDs of its children randomly to other children so that they can exchange messages with their peers. Each child process should also randomly sleep, read messages, and send messages. When sending a message, a process should make a record of that message so that it can determine whether it has received a reply. When a process receives a message, it should sleep a small, random amount of time, and then reply back to the sender.

The general idea is to get lots of messages going back and forth to stress the message system. However, to debug the test program, it is useful to send and receive messages in a simple, controlled way at first, before expanding to more complex patterns of communication.

General Notes on Submission of this Assignment

Be sure to put your name at the top of every file you submit and/or at the top of every file that you edit!

Submit your assignment for grading as directed in class. We will use the web-based Turnin tool developed by Professor Fisler’s group. A brief introduction can be found at

http://web.cs.wpi.edu/~kfisler/turnin.html

and access to the turnin system itself can be found at

http://turnin.cs.wpi.edu:8088/servlets/turnin.ss

For purposes of Turnin, this assignment is Project 2.

Your submission should include

· A patch file with all of your modifications of the kernel after you applied the pre-patch file. Be sure to remove all “~” files before making the patch file.

· A Makefile to compile the user-space interface module and your test programs.

· The source code for the user-space interface module and your test programs. You should also include a copy of mailbox.h here to simplify the Makefile.

· A write-up describing your implementation and any problems or issues.

Do not put separate parts of the assignment in separate folders or specify separate makefiles for them. Do not zip everything together into a zip file.

Grading

This is the largest programming project of the term, representing 20% of the total grade for the first seven weeks of the course. Grading will be allocated approximately as follows: –

· Patch file that correctly modifies the kernel based on pre-patch and correctly builds — 1 point.

· Creating and deleting mailboxes — 5 points.

· Correctly establishing the monitor and synchronization for a mailbox — 5 points.

· Correctly using the slab-allocator for messages — 5 points.

· Stopping and starting mailboxes and correctly handling blocked calls when a mailbox is stopped — 5 points.

· User space support module that implements the mailbox.h interface — 3 points

· Test program that thoroughly exercises the mailbox facility from multiple processes and threads — 4 points.

· Write-up describing your implementation — 2 points

Partial credit

In past experience, some students found this assignment straightforward, while others found it virtually impossible. You are encouraged to summit a partial project for partial credit. In your write-up, explain what worked and what did not work (and why). Be sure to include the patch file for your kernel.

Individual Assignment

This is an individual project, not a team project. Each student should submit his/her own work, not copies of jointly developed code.

Nevertheless, if you are puzzled or unsure of some aspect of this assignment, you should consult your friends, colleagues, the class e-mail list, or the instructor to clarify your understand or to help develop an approach to the problem.

Appendix – Monitor example from class

[1] In a professional setting, the chief architect of the project might insist on using slab allocation for mailboxes. However, to keep things simple in this project, kmalloc()and kfree()are acceptable.

[2] A case could be made for pipelined parallelism, but experience of students in previous terms suggests that such an approach would make the project more difficult.