CS 2005 Techniques of Programming WPI, B Term 1996
Craig E. Wills Lab 5
Assigned: Wednesday, December 4, 1996

Knowing how to implement a program in a language such as C/C++ is important, but so is recognizing that writing a program is not always the most appropriate way to solve a problem. This lab introduces useful software tools that are available with the Unix operating system and can be used to solve problems without writing a C/C++ program.

At the end of this lab you should be familiar with a number of tools and know how to combine them in a shell script. You are encouraged to learn more about these tools from the ``man'' pages and other Unix references. To illustrate the tools, you will use a sample gradetable and nametable for this course. First copy these files using the command

> cp /cs/cs2005/pub/labs/lab5/* .

You should now have a file gradetable in your current directory that contains a dump from a spreadsheet containing userids, total points, project 1 grades, exam 1 grades, and the grades for the first three labs. Fields in the file are separated by colons. You will also have the file nametable, which contains userids, student names, and lab section numbers. Use the command cat to see the contents of these files.

wc and grep

Two useful commands are wc, which reports the line, word, and character counts of a file, and grep, which filters the contents of a file based on a regular expression. Execute the following commands on your copy of gradetable and look at the results. If you do not understand a command or its output, ask the TA or a PLA.

> wc gradetable
> grep moe gradetable
> grep ^b gradetable
> grep ^b gradetable | wc

The first command shows the number of lines, words and characters in gradetable. The second command shows all lines containing the string ``moe''. Grep is a useful command for quickly searching a file (or a set of files) for a string. For example, ``grep foobar *.c'' will search all C files in the current directory for lines containing the string ``foobar''. The third command above finds all lines beginning (indicated by ``^'') with a ``b''. The last command shows how to combine the two commands using a pipe, where the output of the first command is passed directly to the second command. The compound command has the net effect of counting the lines beginning with a ``b'' (wc also reports their word and character counts).
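As a self-contained illustration of the pipe, the sketch below builds a small made-up gradetable (the userids and scores are hypothetical, not the course data) and counts the lines that begin with ``b''. It assumes a POSIX shell and uses wc -l, which prints only the line count, rather than the plain wc shown above.

```shell
# Hypothetical sample data standing in for the real gradetable.
workdir=$(mktemp -d)
cat > "$workdir/gradetable" <<'EOF'
moe:172:85:77:3:4:3
bart:160:80:70:3:3:4
betty:190:95:85:4:3:3
curly:150:70:70:4:3:3
EOF

# Lines containing the string "moe".
moe_lines=$(grep moe "$workdir/gradetable")
echo "$moe_lines"

# Count the lines beginning with "b": grep filters, wc -l counts what is left.
# The pattern is quoted so the shell passes ^b to grep unchanged.
b_count=$(grep '^b' "$workdir/gradetable" | wc -l)
echo "lines starting with b:" $b_count
```

Quoting the pattern is optional for a simple case like ^b, but it is a safe habit once patterns contain characters the shell treats specially.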

sed

The command sed is a stream editor for taking an input stream (or file) and performing editor manipulations on the text. It is often used in a script of commands. Two example uses are shown below that you should try.

> sed -e 's/:/ /g' gradetable
> sed -e '1,3d' gradetable

The first example changes all (indicated by the ``g'') occurrences of a colon to a space while the second example deletes the first three lines. The sed command does not actually change the file contents, but is used to filter information that is in the file.
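The sketch below checks the point about sed leaving the file alone. It runs both sed commands on a made-up gradetable (hypothetical data), then saves the filtered output with a shell redirection; the file name trimmed is only an illustration.

```shell
# Hypothetical sample data standing in for the real gradetable.
workdir=$(mktemp -d)
cat > "$workdir/gradetable" <<'EOF'
moe:172:85:77:3:4:3
bart:160:80:70:3:3:4
betty:190:95:85:4:3:3
curly:150:70:70:4:3:3
EOF

before=$(cat "$workdir/gradetable")

# Replace every colon with a space; the result goes to standard output.
spaced=$(sed -e 's/:/ /g' "$workdir/gradetable")
echo "$spaced"

# Delete lines 1 through 3 and keep the result by redirecting it to a new file.
sed -e '1,3d' "$workdir/gradetable" > "$workdir/trimmed"
cat "$workdir/trimmed"

# The original file still has its colons and all of its lines.
after=$(cat "$workdir/gradetable")
```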

awk

The awk command is powerful and allows users to write little ``programs'' to operate on individual fields of each line in a text file. The general format of an awk program is a list of condition{action} pairs where the condition is optional. An awk program can either be specified in the command line or in a separate file. Create a file ave.awk with the following contents.

# script of commands to average the 4th field
BEGIN{sum=0.0} {sum = sum + $4} END{print sum/NR}

This program sums up the 4th field of each line and at the end prints the average using the built-in variable NR, which represents the number of records (lines).

The following examples illustrate the use of awk.

> awk -F: -f ave.awk gradetable
> awk -F: '{print $1}' gradetable

The first example shows the use of the program in ave.awk. The -F option is used to specify the field separator (by default it is white space). The second example shows a simple awk program in the command line to print the first field of every line.
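The examples above use actions without a condition. The sketch below, run on made-up data (the userids and scores are hypothetical), repeats the ave.awk average and then adds a condition so that the action runs only for lines whose 4th field is greater than 75. The threshold 75 is chosen purely for illustration.

```shell
# Hypothetical sample data standing in for the real gradetable.
workdir=$(mktemp -d)
cat > "$workdir/gradetable" <<'EOF'
moe:172:85:77:3:4:3
bart:160:80:70:3:3:4
betty:190:95:85:4:3:3
curly:150:70:70:4:3:3
EOF

# The awk program from the lab, stored in its own file.
cat > "$workdir/ave.awk" <<'EOF'
# script of commands to average the 4th field
BEGIN{sum=0.0} {sum = sum + $4} END{print sum/NR}
EOF

# Average of field 4 over all lines.
average=$(awk -F: -f "$workdir/ave.awk" "$workdir/gradetable")
echo "average of field 4: $average"

# condition{action}: print the userid only when field 4 exceeds 75.
above=$(awk -F: '$4 > 75 {print $1}' "$workdir/gradetable")
echo "$above"
```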

Shell Scripts

Each of these commands is useful on its own, but they are particularly powerful when combined in a shell script. A shell script is simply a list of commands that are executed as a unit. You can use any commands that you would enter at the command line. The following shell script combines the contents of the files gradetable and nametable into a single output. You should create the file mesh with these contents:

#! /bin/tcsh -f
# this is a comment
set IDLIST=`awk -F: '{print $1}' gradetable`
foreach i ($IDLIST)
    grep ^$i nametable 
    grep ^$i gradetable | sed -e 's/:/ /g' 
end

After creating this file you will need to execute the command ``chmod u+x mesh'' to make the script executable. Having done so, you can run mesh like any other command. The first line of the script indicates the shell to use. The second line is a comment. The third line executes the awk command we previously used to get a list of userids. Backquotes (``) cause the command inside them to be executed, with the results in this case stored in the shell variable IDLIST. The command foreach is built into the shell and iterates through the list stored in IDLIST by successively setting the variable i to the next value in the list. Inside the loop, the corresponding line is printed from each file, with all colons in the gradetable line replaced by spaces. Note: the sed command is reading its input from the pipe rather than from a file.
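For comparison, here is a sketch of the same loop written for a POSIX (Bourne-style) shell instead of tcsh, where for ... do ... done takes the place of foreach ... end. It is not the script you are asked to create. The data are made up, the function name mesh_sh is hypothetical, and the colon-separated layout of nametable is an assumption.

```shell
# Hypothetical sample data; the nametable layout is assumed.
workdir=$(mktemp -d)
cat > "$workdir/gradetable" <<'EOF'
moe:172:85:77:3:4:3
bart:160:80:70:3:3:4
EOF
cat > "$workdir/nametable" <<'EOF'
moe:Moe Example:1
bart:Bart Example:2
EOF

# mesh_sh GRADEFILE NAMEFILE: print each student's name line and grade line.
mesh_sh() {
    # Backquotes run awk and substitute its output: the list of userids.
    idlist=`awk -F: '{print $1}' "$1"`
    for i in $idlist; do
        grep "^$i" "$2"
        grep "^$i" "$1" | sed -e 's/:/ /g'
    done
}

mesh_out=$(mesh_sh "$workdir/gradetable" "$workdir/nametable")
echo "$mesh_out"
```

Note that a pattern such as ^moe also matches any longer userid that starts with the same letters, so this approach relies on no userid being a prefix of another.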

Turnin

Modify the mesh shell script so that, rather than printing the entire line from gradetable on each pass through the loop, it prints only the second field (the first numeric field). Hint: use the awk command. Once you have the new mesh command working, record a session using the script command in which you cat your new mesh shell script and then execute it. Type exit to close the script file and then use the turnin command to turn it in:

/cs/cs2005/bin/turnin lab5 lab5.script

As an optional exercise, the file /etc/passwd contains password information for each user, with fields separated by colons. Use grep and the other commands to find information about yourself and your friends.
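One possible starting point is sketched below. It works on a small made-up file in the traditional passwd layout (login name, password field, user id, group id, full name, home directory, shell) instead of the real /etc/passwd, and the userid moe is hypothetical. On systems where accounts are kept in a network database, your own entry may not appear in /etc/passwd at all.

```shell
# Hypothetical file in the traditional passwd format.
workdir=$(mktemp -d)
cat > "$workdir/passwd" <<'EOF'
root:x:0:0:System Administrator:/root:/bin/sh
moe:x:1001:100:Moe Example:/home/moe:/bin/tcsh
EOF

me=moe   # replace with your own userid

# Select the line for one user, then print login name, full name and home directory.
info=$(grep "^$me:" "$workdir/passwd" | awk -F: '{print $1, $5, $6}')
echo "$info"
```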