Emerging technologies in real-time operating systems and network protocols, along with the explosive growth of the Internet, provide great opportunity for distributed multimedia applications. Multimedia over a Wide Area Network often suffers from delay, jitter and packet loss. Packet loss in particular can be extremely high on the MBone and the Internet. Unlike traditional applications, multimedia applications can tolerate some data loss. A small gap in a video stream may not noticeably impair the perceptual quality, and may not even be visible to some users. However, too much data loss can result in unacceptable media quality.
User-perceived loss is often greater than the network loss alone. Since raw video usually requires considerable bandwidth, most video streams are compressed, often with MPEG [MP96]. To achieve a high compression rate, MPEG exploits temporal redundancies between successive pictures. I-frames, for Intra-coded, are self-contained. P-frames, for Predictive-coded, are encoded using information from previous I-frames and/or previous P-frames. B-frames, for Bi-directionally predictive-coded, are encoded using information from previous and following I- and/or P-frames. The loss of one P-frame can make some other P- and B-frames useless, while the loss of one I-frame can result in the loss of a whole sequence of P- and B-frames.
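The dependency rules above can be sketched in a few lines of Python to show how one network loss grows into a larger perceived loss. This is an illustrative model only, not the decoder used in the study: the function name `decodable_frames`, the frame pattern, and the simplification that each P-frame depends only on the nearest preceding I- or P-frame are all assumptions.

```python
def decodable_frames(frame_types, lost):
    """Return the indices of frames that can still be decoded.

    frame_types: string of 'I', 'P' and 'B' in display order (hypothetical pattern)
    lost: set of frame indices dropped by the network
    """
    n = len(frame_types)
    ok = set()

    # Pass 1: reference frames. An I-frame needs only itself; a P-frame
    # also needs the nearest preceding I- or P-frame to be decodable.
    prev_ref_ok = False
    for i, t in enumerate(frame_types):
        if t == "I":
            prev_ref_ok = i not in lost
        elif t == "P":
            prev_ref_ok = prev_ref_ok and i not in lost
        else:
            continue
        if prev_ref_ok:
            ok.add(i)

    # Pass 2: a B-frame needs the nearest reference frame on each side.
    for i, t in enumerate(frame_types):
        if t != "B" or i in lost:
            continue
        before = next((j for j in range(i - 1, -1, -1) if frame_types[j] != "B"), None)
        after = next((j for j in range(i + 1, n) if frame_types[j] != "B"), None)
        if before in ok and after in ok:
            ok.add(i)
    return ok


stream = "IBBPBBPBBI"                      # assumed pattern, 10 frames
print(sorted(decodable_frames(stream, {3})))   # [0, 9]: one lost P-frame costs 8 frames
print(len(decodable_frames(stream, {1})))      # 9: a lost B-frame costs only itself
```

In this toy stream, losing the first P-frame (10% network loss) leaves only two of ten frames decodable, while losing a B-frame affects nothing else.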
Building upon the work of [HSH+95], our research approach applies a combination of media-specific forward error correction and repetition error concealment [PHH98] to MPEG video. The sender piggy-backs a small, low-quality redundant copy of each primary video frame with the next video frame. If a primary frame is lost, the receiver replaces it with the lower-quality redundant frame from the next packet. If the redundant frame is also lost, the previous frame is repeated for error concealment. Figure 1 depicts our approach.
|Figure 1 - Video Redundancy. This figure depicts our use of video redundancy and repetition to repair lost packets. The blue boxes represent the high-quality primary frames. The red boxes represent the low-quality secondary frames, which are piggy-backed with the subsequent primary frames. The 2nd and 3rd packets are lost.|
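The receiver-side decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `repair_stream` is hypothetical, each packet `i` is assumed to carry primary frame `i` plus the redundant copy of frame `i - 1`, and MPEG inter-frame dependencies are ignored.

```python
def repair_stream(num_frames, lost_packets):
    """Decide what the receiver displays for each frame slot.

    Assumes packet i carries primary frame i and the low-quality
    redundant copy of frame i - 1 (hypothetical packet layout).
    Returns a list of (source, frame_index) tuples.
    """
    lost = set(lost_packets)
    shown = []
    for i in range(num_frames):
        if i not in lost:
            # Primary frame arrived: show it at full quality.
            shown.append(("primary", i))
        elif i + 1 < num_frames and (i + 1) not in lost:
            # Primary lost, but the next packet carries its redundant copy.
            shown.append(("redundant", i))
        else:
            # Both copies lost: repeat whatever was displayed last.
            shown.append(("repeat", shown[-1][1] if shown else None))
    return shown


# The scenario of Figure 1: the 2nd and 3rd packets (indices 1 and 2) are lost.
for source, frame in repair_stream(5, {1, 2}):
    print(source, frame)
# primary 0
# repeat 0
# redundant 2
# primary 3
# primary 4
```

With two consecutive packets lost, only the second lost frame can be recovered from redundancy; the first falls back to repetition because its redundant copy travelled in the other lost packet.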
We present our evaluation of the effects of our technique on user perceptual quality of video in Section 2. Since the redundancy added to the video stream requires extra processing time and network bandwidth, we briefly present our analysis of the system overhead in Section 3. We briefly summarize the impact of our approach in Section 4, and provide a demonstration of the end-quality video in the presence of loss in Section 5.
We simulated the effects of our technique on MPEG video streams in the presence of packet loss by building movies that repeat frames when there is no redundancy and substitute a low-quality frame when redundancy is used. MPEG includes a quality number, which we set separately for the high-quality primary frames and the low-quality secondary frames. The higher the MPEG quality number, the higher the compression rate and the lower the frame quality, and vice versa. We encoded the primary frames with quality 1, which is the best quality. We encoded the secondary frames with quality 25, out of a maximum of 31. We chose a quality level of 25 for the secondary frames based on tests that indicated users could notice the loss of clarity, but the frames still conveyed the basic content.
Based on observed Internet loss patterns [GBC98], we chose three loss rates and three consecutive loss patterns for examination.
We recorded 22 video clips from sports, sitcom, news and cartoon television shows. Two clips were shown without any loss, ten were redundancy repaired with the above five combinations of loss rate and loss pattern, and ten were unrepaired with the same five combinations of loss rate and loss pattern. The 22 clips were ordered such that the video clips with relatively low quality were not clustered together.
We had 42 users participate in the study. Each user was first shown a video clip without loss in order to "prime" all user expectations equally. Then, each user indicated the quality of each clip with a score from 0 to 100, based on clarity and continuity.
|Figure 2 - Effects of Percent Loss on Perceptual Quality. The x-axis represents percent loss, ranging from 0% to 20%. The y-axis represents the average perceptual quality score gathered from the user study. The error bars represent 95% confidence intervals around the mean.|
Figure 2 depicts the effects of percent loss on perceptual quality. The average score for video with no loss is 72. Video redundancy improves the quality of the video by 20% in the presence of low loss (1% raw loss rate). With a high raw loss rate (20%), redundancy improves the quality of the video by 65%. For a 1% frame loss, the average score for redundancy repaired video is 69, which is very close to that of video with no loss. The average for 1% loss with redundancy repair falls within the confidence interval of the average quality for perfect videos, so the difference between these two kinds of videos is small and in some cases cannot be noticed. At the same percent loss, there is no overlap between the confidence intervals of the perceptual quality scores with redundancy and those without redundancy. Without video redundancy, the average score at 1% loss drops sharply to 58, a large perceived difference between 0% and 1% loss. Users can evidently notice even this seemingly small degradation.
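The comparisons above rest on 95% confidence intervals around mean scores. The sketch below shows one common way to compute such an interval and test two intervals for overlap. The source does not state how its intervals were computed, so the normal approximation (z = 1.96), the function names, and the sample scores are all assumptions made for illustration.

```python
import math
import statistics


def mean_ci(scores, z=1.96):
    """Return (mean, half_width) of an approximate 95% confidence interval.

    Uses the normal approximation; a t-based interval would be slightly
    wider for small samples. The method is an assumption, not the paper's.
    """
    mean = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, half_width


def intervals_overlap(a, b):
    """a and b are (mean, half_width) pairs; True if the intervals intersect."""
    return abs(a[0] - b[0]) <= a[1] + b[1]


# Hypothetical scores from three users, for illustration only.
mean, half = mean_ci([60, 70, 80])
print(f"{mean:.1f} +/- {half:.1f}")
```

When two intervals do not overlap, as reported for repaired versus unrepaired video at the same loss rate, the difference in means is unlikely to be due to chance alone.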
Overall, as the percent loss increases, the quality of both redundancy repaired videos and normal videos decreases exponentially. However, the perceptual quality with redundancy repair decreases much less than without. At 20% loss, the average quality score without redundancy is 30, while the average quality score with redundancy is 49.
|Figure 3 - Effects of Consecutive Loss on Perceptual Quality. The x-axis represents the number of the consecutive frames lost. The y-axis represents the average perceptual quality. The raw loss rate is 20%. The error bars represent 95% confidence intervals around the mean.|
In Figure 3, the average quality increases as the number of consecutive losses increases. With longer consecutive losses at the same loss rate, there are fewer gaps within the stream than with single losses. Thus, fewer dependent frames are lost because of the loss of other frames. Consecutive loss also makes redundancy less useful. The average perceptual quality for redundancy repaired video clips increases when the number of consecutive losses changes from 1 to 2, but it decreases when the number increases further. For single losses, the redundancy can always be received and thus the loss can always be repaired. With consecutive loss, the redundancy can be lost along with the primary frame it protects. With a 2-frame loss, it is likely that no important frames, such as I- or P-frames, are lost. However, with a 4-frame loss, there will always be one I- or P-frame within the lost frames that quite likely cannot be repaired.
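The reason consecutive loss weakens redundancy can be made concrete with a small count. The sketch below assumes, as in the scheme described earlier, that the redundant copy of frame `i` travels only in packet `i + 1`; the function name `count_repairs` and the loss patterns are hypothetical, and the model covers only the redundancy side of the argument, not MPEG frame dependencies.

```python
def count_repairs(lost_packets, num_frames):
    """Return (repaired, unrepaired) counts for a set of lost packets.

    A lost frame i is repairable only if packet i + 1, which is assumed
    to carry its redundant copy, arrives.
    """
    lost = set(lost_packets)
    repaired = sum(1 for i in lost if i + 1 < num_frames and (i + 1) not in lost)
    return repaired, len(lost) - repaired


print(count_repairs({2, 5, 8}, 10))      # (3, 0): single losses are all repaired
print(count_repairs({2, 3, 6, 7}, 10))   # (2, 2): bursts of 2, half repaired
print(count_repairs({2, 3, 4, 5}, 10))   # (1, 3): a burst of 4, one repaired
```

Under this assumption, only the last frame of each burst can be recovered, so a burst of k frames leaves k - 1 frames to be concealed by repetition.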
Although our user study indicated that redundancy can improve the perceptual quality of video in the presence of packet loss, the secondary frames require additional buffer space and more processing time. As the emphasis of this article is on a demonstration of the effects of video redundancy on perceptual quality, in this section we only briefly analyze the overhead that the low-quality redundancy adds to the system. Details on the system analysis can be found in [Liu99].
Figure 4 shows the average ratios of the size of the overhead over the size of the primary frames, along with 95% confidence intervals. Overall, the average overhead from video redundancy is about 13% for each frame and about 9-10% in total number of bytes. Based on the sizes of the confidence intervals, there is little variance in the overhead for I- and P-frames.
|Figure 4 - Frame Size Overhead. The ratios of the sizes of the secondary (redundant) frames to the sizes of the primary frames are shown. "All" on the x-axis represents all the I-, P- and B-frames. "I" represents I-frames, "P" represents P-frames and "B" represents B-frames. The error bars represent 95% confidence intervals around the mean.|
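The two overhead figures quoted above, a per-frame average and a total-bytes percentage, are different statistics and need not agree. The sketch below computes both from frame sizes. The function name `overhead_metrics` and the byte counts are hypothetical and chosen only to show how the two numbers can differ; they are not measurements from the study.

```python
def overhead_metrics(primary_sizes, secondary_sizes):
    """Return (mean per-frame ratio, total-bytes ratio) of redundancy overhead.

    primary_sizes, secondary_sizes: frame sizes in bytes, paired by index.
    """
    ratios = [s / p for p, s in zip(primary_sizes, secondary_sizes)]
    per_frame = sum(ratios) / len(ratios)              # every frame weighted equally
    total = sum(secondary_sizes) / sum(primary_sizes)  # large frames weigh more
    return per_frame, total


# Hypothetical sizes for one large, one medium and one small frame.
per_frame, total = overhead_metrics([10000, 4000, 2000], [800, 500, 300])
print(f"per-frame: {per_frame:.1%}, total bytes: {total:.1%}")
```

One plausible reading of the gap between the two reported figures is that small frames carry a proportionally larger redundant copy, which raises the per-frame average more than it raises the byte total.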
In summary, in this document, we presented a best-effort solution to ameliorate the effects of network data loss for video data transmission. Our approach piggy-backs redundant video frames within the transmitted video stream in order to repair lost frames. Our evaluation included user tests with over 40 users watching over 10 hours of video, and overhead analysis of over 25 video clips from a variety of television shows. This extensive analysis indicates that with the addition of about 10% overhead, video redundancy can greatly improve the perceptual quality of video streams in the presence of packet loss.
We include 6 video clips that demonstrate the effects of video redundancy under different loss conditions. The loss characteristics of each clip are included in the table below. Each video clip is encoded at 30 frames per second, but uses frame repetition to give the user the perception of 5 frames per second, a more realistic video frame rate over a WAN. Observations on each clip follow the table.
|"Perfect" clip||a||0||-||-||72||17 MB|
|Low loss, no repair||b||1%||1||no||58||16 MB|
|Low loss, repair||c||1%||1||yes||69||17 MB|
|High loss, no repair||d||20%||1||no||30||9 MB|
|High loss, repair||e||20%||1||yes||49||17 MB|
Consecutive high loss,
Although this video is of "perfect" quality, users only rated this clip 72 out of a maximum 100.
Even with only 1% loss, the degradation in quality is quite noticeable. Users rated this clip an average of 58, a drop in quality of nearly 20% versus a video clip with no loss.
By using video redundancy, the 1% loss is nearly completely repaired. Users found the quality of this video clip the same as the quality of a clip with no loss.
At 20% loss, the video quality is severely degraded. Users rated the quality of this clip less than half that of a clip with no loss.
At 20% loss, even video redundancy cannot improve the video to a quality level anywhere near perfect. Still, users rated this repaired clip more than 50% higher than the unrepaired clip with similar loss (49 versus 30).
Consecutive-frame losses, most common at high-loss rates, do not noticeably change the video quality versus single-frame losses. This holds true for redundancy repaired video, too.
[GBC98] Jason Gerek, William Buchanan and Mark Claypool, MMlib - A Library for End-to-End Simulation of Multimedia over a WAN. MQP CS-MLC-ML98, Worcester Polytechnic Institute, Worcester MA, May 1998.
[HSH+95] Vicky Hardman, Martina Angela Sasse, Mark Handley and Anna Watson, Reliable Audio for Use over the Internet. In Proceedings of Internet Society's International Networking Conference (INET), Oahu, Hawaii, 1995.
[Liu99] Yanlin Liu, Video Redundancy - A Best-Effort Solution to Network Data Loss. Master's Thesis, Worcester Polytechnic Institute, May 1999.
[MP96] Joan L. Mitchell and William B. Pennebaker, MPEG Video: Compression Standard. Chapman & Hall, ISBN: 0412087715, 1996.
[PHH98] Colin Perkins, Orion Hodson and Vicky Hardman, A Survey of Packet-Loss Recovery Techniques for Streaming Audio. IEEE Network Magazine, September/October 1998.