Knowing how to implement a program in a language such as C/C++ is important, but so is knowing that writing a program is not always the most appropriate way to solve a given problem. This lab introduces useful software tools that are available with the Unix operating system and can be used to solve problems without writing a C/C++ program.
At the end of this lab you should be familiar with a number of tools and how to put these tools together in the form of a shell script. You are encouraged to learn more about these tools from the ``man'' pages and other Unix references. To illustrate the tools you will be using a sample of a gradetable and nametable for this course. You will first need to copy these files using the command
> cp /cs/cs2005/pub/labs/lab5/* .
You should now have a file gradetable in your current directory that contains a dump from a spreadsheet containing userids, total points, project 1 grades, exam 1 grades and the grades of the first three labs. Fields in the file are separated by colons. You will also have the file nametable that contains userids, student names and lab section numbers. Use the command cat to see the contents of these files.
Two useful commands are wc, which returns the line, word and character counts of a file, and grep, which filters the contents of a file based on a regular expression. Execute the following commands on your copy of gradetable and examine the results. If you do not understand a command or its output, ask the TA or a PLA.
> wc gradetable
> grep moe gradetable
> grep ^b gradetable
> grep ^b gradetable | wc
The first command shows the number of lines, words and characters in
gradetable. The second command shows all lines containing the string
``moe''. Grep is a useful command for quickly searching a file (or
a set of files) for a string. For example ``grep foobar *.c'' will
search all C files in the current directory for lines containing the string
``foobar''. The third command above finds all lines beginning (indicated
by ``^'') with a ``b''. The last command shows how to combine the two
commands using a pipe, where the output of the first command is passed
directly to the second command. The compound command has the net effect of
counting the lines beginning with a ``b''.
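If you only need the count, two standard options can shorten the pipeline: wc -l prints just the line count, and grep -c counts the matching lines itself. The sketch below is written for sh and uses a made-up file, sample_grades, in the same colon-separated layout as gradetable; the file name and its values are assumptions for illustration only. The pattern is quoted as a precaution so the shell passes it to grep unchanged.

```shell
#!/bin/sh
# Hypothetical sample data in the gradetable layout
# (userid:total:project1:exam1:lab1:lab2:lab3); the values are made up.
cat > sample_grades <<'EOF'
bart:250:90:80:30:25:25
betty:270:95:85:30:30:30
moe:200:70:60:25:20:25
EOF

# wc -l prints only the line count; reading from stdin omits the file name.
total_lines=$(wc -l < sample_grades)

# grep -c counts the matching lines itself, so no pipe to wc is needed.
b_lines=$(grep -c '^b' sample_grades)

# The same count using the pipe form shown above.
b_lines_piped=$(grep '^b' sample_grades | wc -l)

echo "lines: $total_lines, starting with b: $b_lines"
```

Both forms give the same count; the pipe form is the more general habit, since it works with any command that produces lines.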
The command sed is a stream editor for taking an input stream (or file) and performing editor manipulations on the text. It is often used in a script of commands. Two example uses are shown below that you should try.
> sed -e 's/:/ /g' gradetable
> sed -e '1,3d' gradetable
The first example changes all (indicated by the ``g'') occurrences of a colon to a space while the second example deletes the first three lines. The sed command does not actually change the file contents, but is used to filter information that is in the file.
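Because sed writes its result to standard output, you can keep the filtered text by redirecting it to a new file, and you can give several -e options to apply more than one edit in a single pass. The following is a minimal sketch written for sh; the files sample_grades and sample_spaces and their contents are made up for illustration.

```shell
#!/bin/sh
# Hypothetical three-line sample in the gradetable layout; values are made up.
cat > sample_grades <<'EOF'
bart:250:90:80:30:25:25
betty:270:95:85:30:30:30
moe:200:70:60:25:20:25
EOF

# Two edits in one pass: delete line 1, then replace every colon with a space.
# The result goes to a new file; sample_grades itself is not modified.
sed -e '1d' -e 's/:/ /g' sample_grades > sample_spaces

# Show the filtered copy.
cat sample_spaces

# The original still has all of its lines and its colons.
wc -l < sample_grades
grep -c ':' sample_grades
```

Do not redirect the output back onto the input file itself; the shell empties that file before sed gets a chance to read it.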
The awk command is powerful and allows users to write little
``programs'' to operate on individual fields of each line in a text file.
The general format of an awk program is a list of condition{action} pairs
where the condition is optional. An awk program can be specified either
on the command line or in a separate file. Create a file
ave.awk with the following contents.
# script of commands to average the 4th field
BEGIN{sum=0.0}
{sum = sum + $4}
END{print sum/NR}
This program sums up the 4th field of each line and at the end prints the average using the built-in variable NR, which represents the number of records (lines).
The following examples illustrate the use of awk.
> awk -F: -f ave.awk gradetable
> awk -F: '{print $1}' gradetable
The first example shows the use of the program in ave.awk. The -F option is used to specify the field separator (by default it is white space). The second example shows a simple awk program in the command line to print the first field of every line.
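The examples so far use actions without a condition. As a sketch of a full condition{action} pair, the commands below print the userid and 4th field only for lines whose 4th field is at least 80, and then compute the same average as ave.awk directly on the command line. They are written for sh; the file sample_grades, its values and the threshold of 80 are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical sample data in the gradetable layout; values are made up.
cat > sample_grades <<'EOF'
bart:250:90:80:30:25:25
betty:270:95:85:30:30:30
moe:200:70:60:25:20:25
EOF

# Condition plus action: the action runs only on lines where the
# 4th field is at least 80 (an arbitrary threshold for this sketch).
high=$(awk -F: '$4 >= 80 {print $1, $4}' sample_grades)
echo "$high"

# The ave.awk computation given on the command line. The BEGIN block is
# left out because awk treats an unset variable as 0 in arithmetic.
avg=$(awk -F: '{sum = sum + $4} END{print sum/NR}' sample_grades)
echo "average: $avg"
```

The comma in the print statement separates the two values with a single space in the output.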
Each of these commands is useful on its own, but they are particularly powerful when combined in a shell script. A shell script is simply a list of commands that are executed as a unit. You can use any commands that you would enter at the command line. The following shell script combines the contents of the files gradetable and nametable into a single output. You should create the file mesh with these contents:
#! /bin/tcsh -f
# this is a comment
set IDLIST=`awk -F: '{print $1}' gradetable`
foreach i ($IDLIST)
    grep ^$i nametable
    grep ^$i gradetable | sed -e 's/:/ /g'
end
After creating this file you will need to execute the command ``chmod
u+x mesh'' to make the script executable. Having done so you can
simply execute what is now a command, mesh. The first line of the
script indicates the shell to use. The second line is a comment. The
third line executes the awk command we previously used to get a
list of userids. Backquotes (`) cause the command inside to
be executed, with the results in this case stored in the shell variable
IDLIST. The command foreach is available from the shell and
iterates through the list stored in IDLIST by successively setting
the variable i to the next value in the list. Inside the loop, the
corresponding line is printed from each file, with all colons in the
gradetable line replaced by spaces. Note: the sed command is
reading its input from the pipe rather than a file.
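For comparison, here is a sketch of the same loop written for sh instead of tcsh. Backquotes work the same way, but sh spells the loop ``for ... do ... done'' where tcsh uses ``foreach ... end''. The files sample_grades and sample_names, their contents and the output file meshed are made up, and the sketch assumes that nametable also separates its fields with colons. It also ends each pattern with a colon so that one userid cannot match another that merely begins with the same letters.

```shell
#!/bin/sh
# Hypothetical sample files; the names and values are made up, and the
# colon-separated layout of the names file is an assumption.
cat > sample_grades <<'EOF'
bart:250:90:80:30:25:25
moe:200:70:60:25:20:25
EOF
cat > sample_names <<'EOF'
bart:Bart Smith:1
moe:Moe Jones:2
EOF

# Backquotes run the command and substitute its output, as in mesh.
idlist=`awk -F: '{print $1}' sample_grades`

# sh uses "for ... do ... done" where tcsh uses "foreach ... end".
for i in $idlist
do
    # The trailing colon in the pattern ties the match to the whole userid.
    grep "^$i:" sample_names
    grep "^$i:" sample_grades | sed -e 's/:/ /g'
done > meshed

# Show the combined output collected from the loop.
cat meshed
```

Redirecting after ``done'' collects the output of every pass through the loop in one file.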
Modify the mesh shell script so that, rather than printing the entire line from gradetable each time through the loop, it prints only the second field (the first numeric field). Hint: use the awk command. Once you have the new mesh command working, create a script file using the script command, cat your new mesh shell script and then execute it. Type exit to close the script file and then use the turnin command to turn it in:
/cs/cs2005/bin/turnin lab5 lab5.script
As an optional exercise, the file /etc/passwd contains password information for each user, with fields separated by colons. Use grep and the other commands to find information about yourself and your friends.