Multimedia Networking Project 2

Speak - a Simple VoIP Application

Due date: March 8th by 11:59pm

Index

Overview
Details
Hints
Hand In
Grading

Overview

An audioconference, or Voice over IP (VoIP) application, allows people to talk to each other from computers connected across a network. Although networked computers have been able to do audio well for over 20 years, the explosive growth in connectivity and capacity on the Internet has fueled interest in VoIP.

For this project, you are to write a basic two-person VoIP application called Speak and explore how some basic system parameters affect the quality of the audio stream. Speak will incorporate aspects of speech detection from your project 1, to avoid sending unnecessary silent packets onto the network.

Speak can have a minimal user interface, but needs to support some command line parameters (or basic menu interface) to allow varying of system parameters. You are free to add any additional features, as you see fit.

Details

You can develop Speak on pretty much any OS: Windows, Mac or Linux. You will have to get it working on two machines connected over the Internet.

Speak will use standard Internet sockets to make connections between the VoIP processes. From any Internet host, a user running Speak should be able to connect to another user running Speak from any other Internet host (notwithstanding firewall or NAT issues), so you need a way to specify the hosts at run-time. You may wish to make the port numbers to which they connect dynamic, too, but that is optional.

Speak needs to support both TCP and UDP sockets. You can have a default connection type, but there should be a way the user can specify the socket type when Speak starts.
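As a minimal sketch of selecting the socket type at startup, the helper below maps a protocol name given by the user to the corresponding socket type constant. The function name speak_socket_type and the strings "tcp" and "udp" are assumptions made for illustration; the assignment does not prescribe them. The header shown is the POSIX one; a Windows build would get the same constants from winsock2.h.

```c
#include <string.h>
#include <sys/socket.h>   /* SOCK_STREAM, SOCK_DGRAM (POSIX header) */

/* Hypothetical helper: map a user-supplied protocol name to a socket type.
 * Returns SOCK_STREAM for "tcp", SOCK_DGRAM for "udp", or -1 if the
 * name is missing or not recognized. */
int speak_socket_type(const char *name)
{
    if (name == NULL)
        return -1;
    if (strcmp(name, "tcp") == 0)
        return SOCK_STREAM;
    if (strcmp(name, "udp") == 0)
        return SOCK_DGRAM;
    return -1;
}
```

The returned value can be passed as the second argument to socket(), for example socket(AF_INET, type, 0), so the rest of the setup code can stay largely the same for both transports.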

Speak should support a variety of sample intervals. Typical VoIP clients take chunks of audio from the audio device every 20, 40 or 60 ms in order to keep latency low. You may choose one of these for the default, but must then provide a means to specify alternate sample intervals (up to a second) when Speak starts. Running Speak at larger sample intervals will give you some insight into how latency makes interactive communication difficult (you will evaluate this effect in project 2b).
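To make the relationship between the sample interval and the chunk size concrete, here is a small sketch that computes how many bytes of audio one chunk holds. The function name chunk_bytes is hypothetical, and the audio format in the usage note (8000 Hz, 8-bit, mono) is only an assumption; use whatever format your audio device code actually records.

```c
#include <stddef.h>

/* Bytes of raw PCM audio captured in one sample interval.
 * All parameters are supplied by the caller; nothing here is
 * mandated by the assignment. */
size_t chunk_bytes(unsigned sample_rate_hz, unsigned bits_per_sample,
                   unsigned channels, unsigned interval_ms)
{
    /* Number of samples per channel recorded during the interval. */
    size_t samples = (size_t)sample_rate_hz * interval_ms / 1000;
    return samples * channels * (bits_per_sample / 8);
}
```

Under the assumed format, chunk_bytes(8000, 8, 1, 20) gives 160 bytes per packet, while a one-second interval gives 8000 bytes. A chunk cannot be sent until it has been fully recorded, so the interval itself is a lower bound on the delay each chunk adds.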

Speak can enable basic speech detection, if indicated by the user, at startup time. Since the size of your speech chunks will likely be much smaller than the sample interval used in project 1, searching backward (or forward) for a zero-level crossing rate for 250ms is not practical. Thus, you can detect speech based only on the energy level of each sound chunk sampled. You can tune your speech detection thresholds to work well in your environment and may want to make thresholds startup parameters.
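One possible way to do energy-only detection on a single chunk is sketched below. It assumes 16-bit signed PCM samples and uses mean squared amplitude as the energy measure; both are assumptions, as are the function names chunk_energy and chunk_is_speech. Your project 1 code may define energy differently (for example, as a sum of absolute values), in which case the threshold scale changes accordingly.

```c
#include <stddef.h>
#include <stdint.h>

/* Mean squared amplitude of one chunk of 16-bit signed PCM samples.
 * Returns 0.0 for an empty chunk. */
double chunk_energy(const int16_t *samples, size_t count)
{
    double sum = 0.0;
    for (size_t i = 0; i < count; i++) {
        double s = (double)samples[i];
        sum += s * s;
    }
    return count > 0 ? sum / (double)count : 0.0;
}

/* Nonzero if the chunk's energy exceeds the threshold.
 * The threshold is a tunable value, e.g. a startup parameter. */
int chunk_is_speech(const int16_t *samples, size_t count, double threshold)
{
    return chunk_energy(samples, count) > threshold;
}
```

A sender could call chunk_is_speech on each recorded chunk and skip transmission when it returns zero, which is what keeps silent packets off the network.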

In order to evaluate how Internet packet loss affects audio, Speak must be able to randomly drop packets it receives. Loss should be applied at the packet level, at a rate specified when Speak starts (again, you will evaluate this in project 2b).
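A minimal sketch of per-packet random loss follows. The function name should_drop_packet and the choice to express the loss rate as a probability between 0.0 and 1.0 are assumptions; a percentage parameter would work equally well. It uses the standard C rand() function, which is adequate for this kind of experiment.

```c
#include <stdlib.h>

/* Decide whether to discard a packet that was just received.
 * loss_rate is the drop probability in the range [0.0, 1.0]. */
int should_drop_packet(double loss_rate)
{
    /* r is uniformly distributed in [0, 1), so a rate of 0.0 never
     * drops and a rate of 1.0 always drops. */
    double r = rand() / ((double)RAND_MAX + 1.0);
    return r < loss_rate;
}
```

The receiver would call this once per packet, right after the receive call returns, and simply not play the packet when the result is nonzero. Calling srand() once at startup (for example with the current time) gives a different loss pattern on each run.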

Hints

There are many different architectural solutions you can have for your implementation of Speak.

Windows Implementation

This sub-section has some Windows-specific hints.

Here is some sample code showing system calls that you may find helpful. Some must be used while others may be used depending upon your implementation:

A simple header file to generate error messages for Winsock errors: sockerr.h
Basic TCP sockets: tcp-talk.cpp, tcp-listen.cpp
Basic UDP sockets: udp-talk.cpp, udp-listen.cpp
Get the IP address of a host: getIPaddress.cpp

Linux Implementation

This sub-section has some Linux-specific hints.

Here is some sample code showing system calls that you may find helpful. Some must be used while others may be used depending upon your implementation:

Basic TCP sockets: talk-tcp.c, listen-tcp.c
Basic UDP sockets: talk-udp.c, listen-udp.c
Setting a timer: setitimer.c
Using POSIX threads: add2.c - add to and subtract from a variable protected by a mutex (compile with -lpthread)
Allowing multiple interrupts: select.c
Parsing command line parameters: get-opt.c

All of the above samples work in Linux and may also work in other environments, especially Unix-like environments such as Cygwin.

Use the man command to find out additional information on the system calls used.

Hand In

You must turn in:

All source code used in your project, including header files. Please include a Makefile, too, for building your code.
A README file with any special instructions or platform requirements for running your application. Be especially careful to provide details on how to start Speak and change the system parameters. Provide some examples of how your program could be started up.
An executable (compiled speak) that can be run. Be sure to include any libraries it needs.

You will use email to turn in your files. When ready, create a directory for your project named after your last name (e.g., claypool) and archive your files (with tar and gzip, or WinZip), for example:

    mkdir claypool
    cp * claypool  # copy all your files to submit to the claypool dir
    tar czvf claypool.tgz claypool

then attach claypool.tgz to an email with "cs529-proj2" as the subject.

Grading

The grading breakdown is as follows:

60% Basic 2-way VoIP Application (TCP or UDP), with parameters.

15% Uses both UDP and TCP.

5% Supports Speech Detection (add 5 pts).

5% Works under all 'Loss' parameters (add 5 pts).

5% Works under all 'Delay' parameters (add 5 pts).

10% Clear, well-written code, including comments. Should be easy to "read" and understand.

Below is a general grading rubric:

100-90. The project clearly exceeds requirements. Connection and communication work flawlessly. Both TCP and UDP are equally robust. All parameters are fully functional and implemented. Speech detection works effectively. Code is clear, commented and well-written.

89-80. The project meets requirements. Connection and communication are fully functional. Both TCP and UDP connections are implemented. Most parameters and speech detection are implemented and functional. Code is clear, adequately commented and adequately well-written.

79-70. The project barely meets requirements. Connection and communication are mostly functional, with an occasional problem. Either TCP or UDP is not implemented or fully functioning. Some parameters and speech detection are implemented and functional, but others are not. Code may not be entirely clear, adequately commented or adequately well-written.

69-60. The project fails to meet requirements in some places. Connection and communication are only partly functional, with intermittent problems. Either TCP or UDP is not implemented or fully functioning. Parameters and speech detection are not fully implemented and functional. Code is not clear enough, adequately commented or well-written.

59-0. The project does not meet requirements. Connection or communication do not function or have significant problems, both for TCP and UDP. Parameters and speech detection are not implemented and functional, or are not testable. Code is not clear, adequately commented or well-written.

Send all questions to the staff mailing list.