Analyzing Factors That Influence End-to-End Web Performance

Balachander Krishnamurthy
AT&T Labs--Research
180 Park Ave
Florham Park, NJ 07932 USA
bala@research.att.com

Craig E. Wills
Computer Science Department
Worcester Polytechnic Institute
Worcester, MA 01609 USA
cew@cs.wpi.edu

Abstract:

Web performance affects the popularity of a particular Web site or service as well as the load on the network, yet there have been no publicly available end-to-end measurements focused on a large number of popular Web servers that examine the components of delay or the effectiveness of the recent changes to the HTTP protocol. In this paper we report on an extensive study carried out from many client sites, geographically distributed around the world, to a collection of over 700 servers to which a majority of Web traffic is directed. Our results show that the HTTP/1.1 protocol, particularly with pipelining, is indeed an improvement over existing practice, but that servers serving a small number of objects or closing a persistent connection without explicit notification can reduce or eliminate any performance improvement. Similarly, use of caching and multi-server content distribution can also improve performance if done effectively.

Keywords: Web Performance, Web Protocols, End-to-End Performance, Active Measurement

1 Introduction

Several projects have studied factors that influence the performance of the Web using Web server logs, proxy logs and client traces. Significant effort has gone into improving the protocol on which the Web is based, and the new version of HTTP, namely HTTP/1.1, has recently been upgraded to a Draft Standard by the IETF [4]. However, to the best of our knowledge, no one has performed an end-to-end test of the Web in terms of the factors that influence performance as perceived by users when they visit various Web sites.

This work grows out of independent work by the authors for testing different aspects of Web performance. It is a natural follow-up to the PROCOW study [9], an earlier large-scale study examining whether the Web servers running at popular Web sites around the world that claimed to run HTTP/1.1 were indeed compliant with the HTTP/1.1 protocol specification. The PROCOW study examined the compliance of the servers by sending valid HTTP/1.1 client requests from several places in the world. Its major conclusion was that many sites were not fully compliant with the HTTP/1.1 protocol and some sites were turning off the new features in HTTP/1.1. The work described in this paper also follows up on the methodology of work examining content reuse and server responses relevant for Web caching [19].

In this work we build on the PROCOW infrastructure and use nine client sites around the world to test how various factors affect the performance of the Web. Our study is based on the key set of changes between HTTP/1.0 and HTTP/1.1 [10] to test their impact on performance. We also examine their impact on performance in conjunction with other factors such as caching and distributed Web content.

Our study is significant in that it examines the performance of various protocol options by measuring end-to-end response for actual Web servers from a variety of client sites. This is a significantly broader study than that of Nielsen et al. [17], which measured the performance of HTTP/1.0 and HTTP/1.1 in a controlled setting for a single synthesized Web page. We gathered a set of active measurements by sending requests from various client sites around the world to over 700 popular sites to help quantify the benefits of HTTP/1.1 features in conjunction with other factors. These include persistent connections, persistent connections with pipelining, range requests, caching and responses from multiple servers for a single Web page.

The collection of Web servers tested is representative of the servers that matter in practice, since we attempted to include servers to which a significant portion of the request traffic is addressed. We do not claim our client sites are representative of all users, but they do include a variety of sites. We ensured that there is some degree of control in our experiment, but given that the Web experience changes fairly often, our study mimics real life. In analyzing the results, we look for trends rather than absolute numbers. The fact that the results are relatively consistent over multiple clients, and that we make some measurements from the same client/server pair over multiple days at the same time, gives us some confidence in the repeatability of our experiments.

In the remainder of this paper we describe the factors studied in our work, followed by a discussion of the methodology we used in performing the study. The middle portion of the paper presents the results from our study on the test sets we use followed by a discussion on possible implications of these results for Web servers, caches and the HTTP protocol. The paper concludes with a description of related work followed by a summary and our own directions for future work.

2 Study

There are numerous factors involved in the end-to-end performance of retrieving a Web page over the Internet. In the following we describe the specific factors we study in this work and our reasons for doing so, the factors we do not explicitly study but must account for in the testing and analysis, and the factors we do not examine.

2.1 Factors Studied

The factors we explicitly study are:

  1. The HTTP protocol options: serialized and parallel HTTP/1.0 requests, and HTTP/1.1 persistent connections with and without pipelining.

  2. Range requests.

  3. Caching.

  4. Responses from multiple servers for a single Web page.

2.2 Factors Considered

In studying these factors there are numerous other factors that must be accounted for in our testing and analysis. These factors are:

2.3 Factors Not Examined

There are many other factors influencing the end-to-end performance in retrieving a Web page. The following are some that we considered for our current study, but did not include.

3 Methodology

Our basic methodology is to make active measurements on the effect of different protocol options for a large number of client/server pairs at different times. The previous study on HTTP/1.1 performance improvements used an artificial environment for testing [17]. Other Web performance studies have used logs and packet traces to obtain timing information, but this approach does not allow a controlled set of requests to be issued. Our approach was to identify a set of client and server sites for our tests.

We came to an early conclusion that it would be difficult to have a set of representative client sites since information about the distribution of clients and their network connections is both hard to obtain and verify. Additionally, obtaining a fair sampling of clients around the world would involve significant effort. We chose to sample from a set of client sites where we had professional connections. The client sites used in our study, along with their location and network setup, are:

  1. AT&T Research Labs, NJ USA, with multiple T-1 connections to the Internet.

  2. A commercial site in Santiago, Chile (10 Mbps via fiber to Telefonica Net, which has a slower link to the Internet and links to Cable&Wireless and Alternet via two hops of 45 Mbps ATM links).

  3. Hewlett-Packard Labs, Palo Alto, CA, USA, which connects to the public Internet via one of four major ISPs depending upon the traffic's destination, each ISP being connected with one or more T-3 circuits.

  4. ACIRI: AT&T Center for Internet Research at ICSI, Berkeley, CA, USA (10 Mbps link from ACIRI to UCB, then UCB to Internet over Calren).

  5. University of Kentucky, Lexington, Kentucky, USA connected via UUNET over a DS3 (45 Mbps link).

  6. A site belonging to an academic network, UNINETT AS, in Trondheim, Norway (10 Mbps link to UNINETT-GW and increasing-speed links to the Internet via NORDUnet).

  7. University of Western Australia, Nedlands, Western Australia.

  8. A private site in Cape Town, South Africa with a 64 Kbps digital connection (similar to a US-standard 56K connection) to the Internet.

  9. Worcester Polytechnic Institute, Worcester, MA USA with multiple T-1 links to the Internet.

The choice of servers is a bit easier since there is broad agreement on which sites are ``popular'': advertisers depend on this information, and sites will attempt to show that studies significantly lowering their standing in any ranking are incorrect. Accordingly, we used a subset of the server collection assembled for the PROCOW study [9], which used a combination of collection techniques. Briefly, it merged recognized rating sites (MediaMetrix [13], Netcraft [16], Hot100 [18]) and a set of sites that are likely to be popular given their business prominence (Fortune 500 [5] and Global 500 [7]). The end result was a list of 711 popular server sites for which we retrieved the home page and all embedded objects. In creating the list, we did not try to distinguish whether a server site was supporting HTTP/1.0 or HTTP/1.1.

The basic engine for making all retrievals in our study is httperf [15]. We obtained a publicly available copy of the software and modified it slightly to print out additional information needed for our study. The httperf software is attractive because it allows a set of objects to be retrieved from a server using the variety of 1.0 and 1.1 protocol options of interest to our study. The native software collects and prints out a number of statistics about the status and performance of retrieving the set of objects. Of particular interest to our study is that it records the number of server connections made and for each retrieved object, the time the request was sent, the time receipt of the response began and the time the complete response was received. For small objects the last two times may be the same.

The algorithm in Figure 1 describes the method used for a single test between a client and a server. It exploits features of httperf and overcomes certain limitations of httperf for the purposes of our study. The two primary limitations we had to overcome were that httperf does not parse HTML to retrieve embedded objects and that a single run of httperf can communicate with only one server. However, we exploited httperf's ability to retrieve a fixed set of URLs from a server using serial requests over separate connections, parallel requests over separate connections, serial requests over a persistent connection, and pipelined requests over a persistent connection. The resulting algorithm, starting with a base URL for a given server, is shown in Figure 1.

1. Use httperf to retrieve the base URL from the server and store the results.
2. Parse the base URL code to determine all unique embedded objects.
3. Separate the base and embedded objects according to their server.
4. For each server containing needed objects {
    5. (serial-1.0) Use httperf to retrieve all objects using 
       serialized HTTP/1.0 requests.
    6. (burst-1.0) Use httperf to retrieve all objects using 
       up to four parallel HTTP/1.0 requests.
    7. (serial-1.1) Use httperf to retrieve all objects using 
       serialized requests over an HTTP/1.1 persistent connection.
    8. (burst-1.1) Use httperf to retrieve all objects using 
       pipelined requests over an HTTP/1.1 persistent connection.
}
Figure 1: Basic Algorithm to Test a Server from a Client

There are a number of points to note about the algorithm. The initial retrieval in Step 1 is used to determine the set of objects to fetch. If all embedded objects are from the same server as the base URL then there will be only one server list in Step 3. The server list for the base server includes the base object. The burstiness of parallel connections in Step 6 and pipelined connections in Step 8 does not begin until the first object in the list is retrieved. Step 1 retrieves and stores the object contents. Steps 5-8 retrieve, but do not store, object contents. Steps 5-8 are used for all performance measurements.
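To make the flow in Figure 1 concrete, the sketch below expresses the test driver in Python under some simplifying assumptions: run_httperf is a hypothetical wrapper around our modified copy of httperf (its interface is invented purely for illustration), and embedded objects are found with a simple pattern match rather than full HTML parsing.

from urllib.parse import urljoin, urlparse
import re

def run_httperf(server, uris, mode):
    """Hypothetical wrapper around a (modified) httperf run.

    Its interface -- returning per-object timings and the number of TCP
    connections used -- is an assumption for illustration, not the actual
    tool's command line.
    """
    raise NotImplementedError

def embedded_objects(base_url, html):
    # Step 2: find the unique embedded objects referenced by the base page.
    refs = re.findall(r'(?:src|background)\s*=\s*["\']?([^"\'\s>]+)', html, re.I)
    return sorted({urljoin(base_url, r) for r in refs})

def test_server(base_url, base_html):
    # Step 3: separate the base and embedded objects according to their server.
    objects = [base_url] + embedded_objects(base_url, base_html)
    by_server = {}
    for url in objects:
        by_server.setdefault(urlparse(url).netloc, []).append(url)
    # Steps 4-8: run the four protocol options against each server holding objects.
    results = {}
    for server, uris in by_server.items():
        results[server] = {
            mode: run_httperf(server, uris, mode)
            for mode in ("serial-1.0", "burst-1.0", "serial-1.1", "burst-1.1")
        }
    return results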

We used four parallel connections in the burst-1.0 method since that appears to be the default in the popular browsers (Netscape and Internet Explorer). Each additional parallel connection imposes additional load on the client and the server (more on the server, since it must have free TCP slots to handle several client connections in parallel).

This basic test is used to compare performance of each protocol option for a specific client and server. While the comparison of response times for an individual test may not be meaningful due to variation in network and server load, we use these individual tests as building blocks for measuring the relative performance of the protocol options over a large number of tests.

We used this basic test over two sets of test data. All tests were made in November 1999. The first set of test data consists of the 711 previously identified servers. Tests of each of these servers were run once from each of the client sites. The tests lasted several hours, so each client/server test may be run under different network conditions, but the four protocol options of each test are run under approximately the same conditions.

In addition, we ran the test in a more controlled setting from the AT&T, Chile and WPI client sites. For these tests we selected 72 server sites from a list of 200 sites identified on the current MediaMetrix, Netcraft and Hot100 lists. The selected sites were chosen because they supported pipelining and persistent connections in a preliminary test run on these sites. The controlled tests were run at the same fixed six-hour intervals from each site for a week. This controlled test was designed to study caching and time-of-day factors.

4 Results

This section provides results to address the factors for study that were identified in Section 2.

4.1 Test Sets

The initial part of our results analyzes the base set of statistics for our client/server test sets, shown in Table 1. The clients are our test sites; the server set is either ``procow'', indicating we tested the 711 servers from the earlier PROCOW study, or ``select'', indicating we tested the 72 selected servers repeatedly every six hours.

Client/Server Set | Successful Retrieval of Base URL | Servers with Successful Object Retrieval | Multiple Object Servers | Perfect Conn. Persistence Servers (%) | Imperfect Conn. Persistence Servers (%)
att/procow 670 855 674 167 (25%) 121 (18%)
aciri/procow 673 858 667 223 (33%) 73 (11%)
aust/procow 667 854 664 201 (30%) 56 (8%)
chile/procow 674 862 671 200 (30%) 74 (11%)
hp/procow 665 854 662 201 (30%) 73 (11%)
uky/procow 645 824 635 196 (31%) 84 (13%)
norway/procow 668 856 662 128 (19%) 38 (6%)
safrica/procow 663 848 654 194 (30%) 92 (14%)
wpi/procow 657 834 662 192 (29%) 66 (10%)
att/select 1515 2588 1975 858 (43%) 206 (10%)
chile/select 1873 3161 2423 910 (38%) 288 (12%)
wpi/select 1897 3223 2456 1049 (43%) 274 (11%)
Table 1: Test Sets

Focusing on the PROCOW test set, the second column indicates the number of servers out of the 711 that returned an HTTP 200 response code (success) when the base URL was retrieved. For those not returning this value, the server either returned 302 (redirection) or 404 (not found), or the client timed out. Once the base URL was retrieved, all objects contained on this page were retrieved. As shown in Figure 1, multiple servers may be accessed to retrieve all objects. The third column of Table 1 shows a count of these servers that successfully returned all objects. The fourth column shows the number of these servers that return more than one object. We focus on these servers because persistent connections will not have an effect on client access time if only one object is retrieved. The last two columns in Table 1 show the number and percentage of multiple object servers exhibiting persistent connections.

The last two columns need further explanation: one focus of our study is to compare the performance of the four protocol options. Thus for a client/server test, we only want to consider cases where all objects needed from that server are successfully retrieved for all protocol options. If one or more of the four protocol option tests retrieves fewer than all objects then we discount that test (all four protocol options) for further study. In addition to the number of objects retrieved we also examine the number of TCP connections that are used. If an HTTP/1.1 test used as many TCP connections as objects then all objects have been retrieved, but not with any persistent connections. These test cases are also eliminated from further study. The remaining tests are classified as showing some connection persistence. These tests retrieve all objects for all protocol options and use fewer TCP connections than objects needed for both 1.1 options. Of this category, we classify tests using only one TCP connection for both HTTP/1.1 tests as ``perfect'' indicating all objects are retrieved in a single persistent connection. We characterize tests exhibiting some connection persistence, but not perfect connection persistence, as ``imperfect'' meaning that one or both of the 1.1 options used more than one TCP connection.
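The classification just described can be summarized in a short sketch; the per-option record layout (objects retrieved, TCP connections used) is an illustrative assumption, not the study's actual data format.

def classify_persistence(results, num_objects):
    """Classify one client/server test given per-option results.

    results maps option name -> (objects_retrieved, tcp_connections_used).
    """
    # Discard the whole test if any of the four options missed an object.
    if any(objs < num_objects for objs, _ in results.values()):
        return "discard"
    conns_11 = (results["serial-1.1"][1], results["burst-1.1"][1])
    # No persistence: a 1.1 option used as many TCP connections as objects.
    if any(c >= num_objects for c in conns_11):
        return "no persistence"
    # Perfect: both 1.1 options fetched every object over a single connection.
    if all(c == 1 for c in conns_11):
        return "perfect"
    # Imperfect: some persistence, but more than one connection for a 1.1 option.
    return "imperfect"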

The last three rows of Table 1 show the same statistics for the selected set of servers tested periodically from three client sites. The last two columns indicate that these tests are a bit better in exhibiting a higher percentage of persistence, but the percentages are not as high as we would expect from a ``select'' group. These reduced numbers come from the inclusion of servers not exhibiting persistence in our initial tests, due to an early error (subsequently fixed) in one of our analysis scripts. In addition, even if the base server supports persistence, other servers providing its embedded objects may not. Again, only tests exhibiting persistence are considered for further analysis.

As a final point in summarizing the test sets, we note that the Nielsen et al. paper on HTTP/1.1 also described two other performance improvements--cascading style sheets (CSS) and portable network graphics (PNG). As an interesting sidelight to our study we examined the penetration of these improvements among the set of PROCOW servers. Examination of the AT&T results (typical of the other results) showed 82 (12%) of the 670 base URLs using style sheets. A similar examination for PNG images found zero usage.

4.2 Protocol Options

In examining the various issues influencing end-to-end Web performance we first examined the impact of the four protocol options described in Figure 1. For this analysis we only consider the client/server tests exhibiting persistence. Results for the four protocol options from three of the client sites using the PROCOW test set are shown in Table 2 for servers that exhibit perfect connection persistence in the retrieval of objects. Results from the remaining client sites are shown in Table 8 at the end of the paper.

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 2-5 26% 49% 1.83 (1.1) 1.65 (1.0) 1.68 (1.0) 1.73 (1.0)
att 6-15 22% 26% 2.96 (1.3) 2.28 (1.0) 1.74 (0.8) 1.40 (0.6)
att 16+ 28% 3% 3.71 (1.4) 2.70 (1.0) 1.72 (0.6) 1.21 (0.4)
att Multi 76% 25% 2.25 (1.2) 1.88 (1.0) 1.70 (0.9) 1.61 (0.9)
chile 2-5 26% 48% 9.23 (1.5) 6.27 (1.0) 6.45 (1.0) 6.24 (1.0)
chile 6-15 22% 25% 23.45 (2.0) 11.73 (1.0) 12.08 (1.0) 9.13 (0.8)
chile 16+ 28% 18% 45.11 (2.9) 15.52 (1.0) 23.51 (1.5) 13.06 (0.8)
chile Multi 76% 30% 20.47 (2.1) 9.59 (1.0) 11.53 (1.2) 8.42 (0.9)
wpi 2-5 26% 46% 5.24 (1.1) 4.90 (1.0) 3.69 (0.8) 3.60 (0.7)
wpi 6-15 21% 25% 14.23 (1.7) 8.15 (1.0) 7.49 (0.9) 5.87 (0.7)
wpi 16+ 28% 18% 26.87 (2.1) 12.87 (1.0) 14.28 (1.1) 8.20 (0.6)
wpi Multi 75% 30% 12.24 (1.6) 7.46 (1.0) 6.97 (0.9) 5.18 (0.7)
Table 2: Servers Exhibiting Perfect Connection Persistence from Three Client Sites

Table 2 shows four lines of results for each client site. For each client, the first three lines are classifications based on the number of objects to be retrieved while the fourth line is a summary for all multi-object server tests exhibiting perfect connection persistence. The categorization is introduced to examine variations that occur due to the number of objects retrieved. The ranges of 2-5, 6-15 and 16+ are intended to reflect a small, medium and large number of objects for retrieval. The ``Range Pct.'' column reflects the percentage of servers with the given number of objects relative to the total count of servers (column 3 in Table 1). The ``Pct. Persistent for Range'' column indicates the percentage of servers exhibiting perfect connection persistence among all servers with the given range of objects. For example, 49% of servers with 2-5 objects showed perfect connection persistence when tested from the AT&T client. The fact that only 3% of the servers with 16+ objects exhibited perfect connection persistence from the AT&T client is out of line with the results for all other client sites in Tables 2 and 8. We do not have a clear explanation for this behavior, but do note that the percentage of servers with 16+ objects exhibiting imperfect connection persistence from the AT&T client in Table 4 is actually higher than for other client sites.

The last four columns show the average retrieval time in seconds for each of the four protocol options. The number in parentheses is the ratio of the given time to the time for the burst-1.0 option, showing the performance of each option relative to common HTTP/1.0 usage. Again illustrating with an example, the burst-1.1 option (pipelining with persistence) for 6-15 objects from the AT&T client took on average 1.40 seconds. This is approximately 60% (0.6) of the time taken to retrieve the same objects using burst-1.0.

Table 3 shows an alternate approach for presenting the relative performance of the four protocol options for perfect connection persistence servers. The table shows the relative variation in the results shown in Table 2. The retrieval times for each protocol option from a client site are compared against the time for the burst-1.0 option. If the absolute value of the difference is less than one second then the relative performance of these two options is considered the ``same''. If the protocol option is more than one second faster then it is classified as ``better'' than burst-1.0, and if it is more than one second slower then it is classified as ``worse''. Table 3 shows the percentages of servers that are classified as better, the same and worse for each protocol option from the three client sites in Table 2. The results are consistent with the ratios given in Table 2, except they reduce the significance of differences for relatively well-connected clients such as AT&T, where the retrieval times for all protocol options are relatively small.

Client Site | Object Count Range | Better/Same/Worse Pct. Performance Relative to Burst-1.0: Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 2-5 15/75/9% 0/100/0% 15/76/8% 15/74/11%
att 6-15 10/42/48% 0/100/0% 16/84/0% 22/70/8%
att 16+ 29/29/43% 0/100/0% 43/29/29% 57/43/0%
att Multi 14/63/22% 0/100/0% 17/77/7% 19/71/10%
chile 2-5 12/18/70% 0/100/0% 15/58/28% 23/59/18%
chile 6-15 6/0/94% 0/100/0% 32/21/47% 62/21/17%
chile 16+ 0/0/100% 0/100/0% 7/0/93% 73/2/25%
chile Multi 8/10/82% 0/100/0% 17/37/47% 43/38/20%
wpi 2-5 23/46/31% 0/100/0% 31/50/19% 33/54/13%
wpi 6-15 15/7/78% 0/100/0% 28/41/30% 41/48/11%
wpi 16+ 12/2/86% 0/100/0% 30/21/49% 53/23/23%
wpi Multi 19/27/55% 0/100/0% 30/41/29% 40/46/15%
Table 3: Variation in Performance for Servers Exhibiting Perfect Connection Persistence from Three Client Sites
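The one-second rule used to classify options for Table 3 can be expressed as a small sketch; the dictionary of per-option times is an assumed representation of a single server test.

def compare_to_burst_10(times, threshold=1.0):
    """Classify each protocol option against burst-1.0 for one server test.

    times maps option name -> measured retrieval time in seconds.
    """
    baseline = times["burst-1.0"]
    labels = {}
    for option, t in times.items():
        if abs(t - baseline) < threshold:
            labels[option] = "same"      # within one second of burst-1.0
        elif t < baseline:
            labels[option] = "better"    # more than one second faster
        else:
            labels[option] = "worse"     # more than one second slower
    return labels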

Overall, the burst-1.1 option generally exhibits the best performance with the burst-1.0 and serial-1.1 options in the middle and the serial-1.0 option exhibiting the worst performance. These results are as expected and consistent with those presented in [17]. However, a number of other results come out of examination of the results in Tables 2, 3 and 8:

  1. The percentage of servers that exhibit perfect connection persistence goes down as the number of objects retrieved increases.

  2. The relative performance of burst-1.1 (compared to burst-1.0) improves as the number of objects increases. Thus pipelining and persistence improve relative performance with more objects, but more objects also cause more problems in correctly obtaining all objects with a single connection.

  3. The percentage of servers that support perfect connection persistence is relatively low. Looking at the ``Multi'' row for each client site (or the fifth column in Table 1), we see the range to be 25-31% of servers. Variations occur because tests were run at different times from different clients under different network and server conditions.

    To better understand this result we retested and analyzed results from the WPI client with the PROCOW test set. We found that all objects were successfully retrieved from multiple object servers in 99% and 98% of the cases for the serial-1.0 and burst-1.0 options, respectively. However, only in 29% of the cases were all objects retrieved from these servers with only one TCP connection using the burst-1.1 option; for the serial-1.1 option the corresponding figure was 40%. These results confirm that the failure of the burst-1.1 option to use only one connection is largely responsible for the small percentage of servers classified as perfect connection persistence servers.

    In looking for reasons that persistence was not present, we found that in 36% of the cases for the burst-1.1 option, the server either reported it was using HTTP/1.0 or explicitly included Connection: close in one of its response headers. In 23% of cases for this option, the server did not exhibit any persistence nor was there any reason given based on the server response. These cases indicate that the TCP connection was closed or reset without explicit warning. The two figures were 24% and 11% for the serial-1.1 option.

Table 4 shows results from three client sites for servers that exhibit imperfect connection persistence. Results for the additional client sites are shown in Table 9 at the end of the paper. Servers are classified as imperfect from a client site when the number of TCP connections needed is more than one, but fewer than the number of retrieved objects, for at least one of the 1.1 options. The results show that the relative performance of the serial-1.1 and burst-1.1 options is worse than for the servers exhibiting perfect connection persistence. These results indicate that the reconnection costs for dropped or lost connections impact overall performance to the point that imperfect persistence servers generally exhibit worse performance with the 1.1 options than with the burst-1.0 option.

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 2-5 26% 5% 7.57 (3.7) 2.05 (1.0) 2.75 (1.3) 4.89 (2.4)
att 6-15 22% 15% 4.78 (1.5) 3.15 (1.0) 3.03 (1.0) 2.61 (0.8)
att 16+ 28% 33% 8.87 (2.0) 4.51 (1.0) 4.36 (1.0) 2.48 (0.6)
att Multi 76% 18% 7.73 (2.0) 3.93 (1.0) 3.87 (1.0) 2.75 (0.7)
chile 2-5 26% 8% 17.82 (2.1) 8.69 (1.0) 10.92 (1.3) 12.25 (1.4)
chile 6-15 22% 15% 32.14 (2.0) 16.36 (1.0) 20.27 (1.2) 21.86 (1.3)
chile 16+ 28% 12% 60.51 (2.6) 23.57 (1.0) 33.33 (1.4) 34.02 (1.4)
chile Multi 76% 11% 39.39 (2.3) 17.22 (1.0) 22.94 (1.3) 24.13 (1.4)
wpi 2-5 26% 7% 5.77 (1.3) 4.28 (1.0) 7.32 (1.7) 6.44 (1.5)
wpi 6-15 21% 14% 19.14 (2.1) 9.04 (1.0) 11.69 (1.3) 12.64 (1.4)
wpi 16+ 28% 11% 33.11 (2.6) 12.63 (1.0) 18.01 (1.4) 18.98 (1.5)
wpi Multi 75% 10% 21.60 (2.3) 9.37 (1.0) 13.19 (1.4) 13.73 (1.5)
Table 4: Servers Exhibiting Imperfect Connection Persistence from Three Client Sites

4.3 Time of Day Analysis

The previous analysis used results from retrievals by a client to a large number of servers. The various protocol options were tested at approximately the same time for each client/server pair, but there was no control over when these tests were run. To have more control over when tests were run, we created the smaller, select set of servers and created a script to test each server at precise six-hour intervals for one week. This script was run from the AT&T, Chile and WPI client sites. A test for the first server in the list was started at 0:02, 6:02, 12:02 and 18:02 GMT each day. Tests for subsequent servers in the list were started at three-minute intervals, for an approximate testing period of three and one-half hours for a single round of testing. Results for each round from each of the three client sites are shown in Table 5, which is of similar format to Tables 2 and 4 but includes average object and byte counts. Note that the results shown are only for servers exhibiting perfect connection persistence during at least one of the seven days in each time period.

Client Site | Time Range (GMT) | Pct. Persistent | Ave. Obj. Cnt. | Ave. Obj. Bytes | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 00:00-03:30 42% 7.4 32248 1.85 (1.4) 1.30 (1.0) 2.06 (1.6) 1.74 (1.3)
att 06:00-09:30 42% 7.7 33093 1.60 (1.4) 1.18 (1.0) 2.22 (1.9) 1.68 (1.4)
att 12:00-15:30 43% 7.5 32886 3.47 (1.6) 2.22 (1.0) 2.51 (1.1) 2.14 (1.0)
att 18:00-21:30 42% 7.3 31754 3.81 (1.6) 2.31 (1.0) 2.64 (1.1) 2.32 (1.0)
chile 00:00-03:30 32% 9.0 33004 30.80 (1.9) 16.59 (1.0) 17.13 (1.0) 12.98 (0.8)
chile 06:00-09:30 40% 9.1 35322 19.60 (2.0) 9.67 (1.0) 10.80 (1.1) 7.00 (0.7)
chile 12:00-15:30 38% 9.3 35225 25.17 (1.9) 13.11 (1.0) 14.09 (1.1) 9.46 (0.7)
chile 18:00-21:30 35% 9.1 34658 30.09 (1.9) 16.25 (1.0) 17.69 (1.1) 12.86 (0.8)
wpi 00:00-03:30 41% 8.7 33963 16.11 (1.8) 8.76 (1.0) 8.71 (1.0) 6.25 (0.7)
wpi 06:00-09:30 42% 9.0 34889 12.70 (1.9) 6.54 (1.0) 7.09 (1.1) 5.05 (0.8)
wpi 12:00-15:30 43% 9.4 36526 9.28 (1.8) 5.20 (1.0) 5.73 (1.1) 4.04 (0.8)
wpi 18:00-21:30 40% 9.6 37143 22.04 (1.9) 11.56 (1.0) 11.04 (1.0) 8.33 (0.7)
Table 5: Servers Exhibiting Perfect Connection Persistence Tested at Different Times of Day

The results show that average performance is generally best for the 6:00-9:30 GMT time period (1:00-4:30 on the east coast of the U.S.); the WPI results show that the 12:00-15:30 time period is best. Performance is generally the worst for the 18:00-21:30 time period (13:00-16:30 EST). These variations are expected, with an approximate ratio of two between the worst and best time periods for a protocol option. Of more interest to our study are the variations in relative performance of the four protocol options. The results show little variation in the relative performance of the options as network/server activity varies, other than results from the AT&T client, which shows relatively fast access so that small variations have a larger effect on the ratio.

4.4 Caching

We again used the select set of servers to analyze the end-to-end performance effects of caching, and restricted our analysis to those server tests exhibiting perfect connection persistence. The performance of each of the protocol options is shown in the first row for each client in Table 6. The relative performance of the four protocol options is much the same as found in the PROCOW test set for the given number of objects.

Client Site | Cache Use | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1 | Retrieved Objects | Retrieved Bytes
att no cache 3.05 (1.7) 1.83 (1.0) 2.39 (1.3) 1.91 (1.0) 7.6 32405
att with cache 0.51 (1.9) 0.27 (1.0) 0.27 (1.0) 0.28 (1.0) 0.5 6550
att validate cache 2.44 (1.8) 1.37 (1.0) 1.23 (0.9) 0.74 (0.5)
chile no cache 25.88 (1.9) 13.58 (1.0) 14.51 (1.1) 10.32 (0.8) 8.9 33810
chile with cache 2.91 (1.3) 2.24 (1.0) 2.11 (0.9) 1.63 (0.7) 0.5 6210
chile validate cache 19.14 (2.0) 9.41 (1.0) 9.92 (1.1) 5.13 (0.5)
wpi no cache 13.95 (1.8) 7.54 (1.0) 7.55 (1.0) 5.45 (0.7) 8.9 34873
wpi with cache 1.35 (1.1) 1.24 (1.0) 1.00 (0.8) 0.80 (0.6) 0.5 6704
wpi validate cache 10.57 (1.9) 5.65 (1.0) 4.97 (0.9) 2.57 (0.5)
Table 6: Caching Impact for Servers Exhibiting Perfect Connection Persistence

Of interest is the second row in the table for each client. This row predicts the performance results if a client cache were used. Because the select data set has a number of tests between the same client and server, we can determine when an object from that server has been previously retrieved. For this study, we assume the cached contents can be reused if the size of the object has not changed. While this assumption is not always valid, it is sufficient for the scope of this analysis. The number of objects and bytes retrieved in the presence of a cache is significantly reduced from the results in our test. The results include all of the initial cache misses. These results indicate that the set of Web pages in the select set were relatively static over the week of our study. The performance for each of the protocol options was not measured directly, but derived from the test data with an assumption of zero time for a cache hit. The results show that such a high cache reuse percentage leads to much better performance for all protocol options. As expected, caching has the most relative impact for the slowest serial-1.0 option.

The last row for each client in Table 6 shows derived costs if each cache hit also incurred a validation cost, where the client must send a GET If-Modified-Since or a GET If-None-Match request to the server and receive a 304 response before reusing the cached content. While this assumption is unrealistic, it lets us examine the impact of validation requests on end-to-end performance. Our measured results for an object retrieval differentiate between when the first byte of the response is received and when all bytes are received. For our derivation we used the time when the first byte is received as an approximation of the time for a header-only 304 response. The results show that the derived validation costs significantly increase the overall time, particularly for the serial-1.0 option. However, the performance of serial-1.1 improves relative to the other options.
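The derivations behind the ``with cache'' and ``validate cache'' rows can be sketched as follows. The record layout and helper name are illustrative assumptions; the hit rule (reuse when the object size is unchanged) and the timing rules (zero cost for a plain hit, first-byte time as a stand-in for a 304 exchange) follow the assumptions stated above.

def derived_cache_time(objects, cache, validate=False):
    """Derive retrieval time for one test under the caching assumptions above.

    objects is a list of (url, size, first_byte_time, full_time) records from
    a measured test; cache maps url -> last observed size.
    """
    total = 0.0
    for url, size, first_byte_time, full_time in objects:
        if cache.get(url) == size:
            # Cache hit: free, or charged the first-byte time when a
            # validation (conditional GET / 304) round trip is assumed.
            total += first_byte_time if validate else 0.0
        else:
            total += full_time   # cache miss: pay the full measured retrieval time
            cache[url] = size    # remember the object for subsequent rounds
    return total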

In summary, the impact of caching is to reduce the number of objects retrieved. We can use the results in Tables 2, 3, 4, 8 and 9 to see that as the number of objects is reduced there is less relative difference between the protocol options. As the tables show, this reduction is not uniform so caching will yield the most cost reduction for serial-1.0 and the least for burst-1.1. The results also show that validation costs can be significant, particularly when connection and request times are in the critical path. As one measure of the validity of our assumption for deriving validation costs, we also issued a ``GET If-None-Match: *'' request to servers from the AT&T client. The ratio between average If-None-Match and full retrieval times for an object was 0.6. Hence the derived costs for validation shown in Table 6 do not appear unreasonable, but further testing is warranted.

4.5 Multi-Server Content

In analyzing the end-to-end performance impact of embedded objects served by servers other than the base server, we use the PROCOW set of servers. The first part of our analysis is to determine the extent to which content is being served from multiple servers. These results are shown in Table 7, where the second column shows the number of cases where the base object includes embedded objects served by servers other than the base server. These numbers are relative to the count of base servers in column 2 of Table 1. For example, the HP site shows that 99 out of 665 (15%) base URLs use more than one server to serve content.

Client Site | Multi-Server Cnt. | Local Servers: Cnt., Obj. Pct., Byte Pct. | Ad Servers: Cnt., Obj. Pct., Byte Pct. | Akamai Servers: Cnt., Obj. Pct., Byte Pct. | Other Servers: Cnt., Obj. Pct., Byte Pct.
aciri 97 31 42% 25% 34 11% 3% 10 55% 33% 42 14% 9%
aust 95 32 49% 29% 34 12% 5% 7 54% 40% 38 12% 8%
chile 103 33 39% 20% 34 11% 4% 7 63% 40% 45 13% 7%
hp 99 31 45% 24% 38 11% 4% 11 59% 33% 40 12% 7%
uky 90 27 39% 19% 36 9% 4% 8 69% 43% 39 14% 10%
norway 96 30 44% 27% 36 11% 4% 11 60% 37% 37 16% 9%
safrica 92 28 55% 31% 37 11% 3% 9 53% 38% 39 13% 8%
wpi 92 28 42% 23% 33 12% 7% 8 63% 44% 37 21% 12%
Table 7: Use of Multi-Server Content

We label content servers separate from the base server as auxiliary servers. To further explore these results, we classified each of the auxiliary servers used. We did this classification through a combination of looking at the auxiliary server's name and IP address (results from the AT&T client are not shown because of problems due to masking when the IP address is printed). If the network portion of the auxiliary server's IP address matched the network portion of the base server's address, it was classified as a local server (i.e., local to the base server, not the client). We realize that such a classification may not always be correct, but we were able to verify most of the sample by hand. Non-local servers containing the string ``ad'' in the server name (such as ``adforce'') were classified as ad servers. Non-local servers containing the string ``akamai'' formed another category; Akamai [1] is a commercial company that serves content for contracted sites. All other non-local servers were grouped in the final category as Other. The purpose of these other sites is unknown--some may also serve ads or contracted content.
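A sketch of this classification follows, under the simplifying assumption that the ``network portion'' of an address is its first three octets (the study's actual prefix comparison was partly verified by hand).

def classify_auxiliary(aux_name, aux_ip, base_ip):
    """Classify an auxiliary (non-base) content server, mirroring the rules above."""
    if aux_ip.split(".")[:3] == base_ip.split(".")[:3]:
        return "local"       # same network as the base server
    if "ad" in aux_name:
        return "ad"          # e.g. hosts such as ``adforce''
    if "akamai" in aux_name:
        return "akamai"      # contracted content distribution
    return "other"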

Results in Table 7 show the count of non-local servers in each category as well as the percentage of objects and bytes served by this category relative to the total number of objects and bytes for the base Web page. The percentages only include Web pages where the given category of servers are present. All cases show that auxiliary servers serve relatively more objects than bytes.

We also examined the impact of these categories on end-to-end performance. Because the base server and its auxiliary servers serve different numbers of objects and bytes, it is not possible to compare the response times directly. Rather, we determined the rate at which objects and bytes are served from each category of server, using the best-case time among the supported protocol options for each server. We found that local servers in the same network as the base servers have a higher object rate but a lower byte rate. Ad servers are mixed on object rate and lower on byte rate. Some of the servers that we categorized as other could be ad servers as well; their byte rate is often less than that of the base servers.
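The rate computation can be sketched as follows; the argument names are illustrative.

def serving_rates(objects_served, bytes_served, option_times):
    """Object and byte serving rates for one server in one test.

    option_times holds the measured retrieval time for each protocol option
    the server supported; the best (smallest) time is used, as described above.
    """
    best = min(option_times.values())
    return objects_served / best, bytes_served / best

The relative rates quoted below for auxiliary servers are these per-server rates divided by the corresponding rates of the base server.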

The results also show that content distribution servers (which happen to be only Akamai servers in our test set) almost always show improved data rates relative to the base server. The relative object rate for the Akamai servers ranged from 2.0 for the Chile client to 20.0 for the HP client. The relative byte rate for the Akamai servers ranged from 0.7 for the Chile client to 7.5 for the Australia client. To further investigate these results we exploited the mechanism used to name Akamai-served objects where the object name includes the base server URL for these objects. This naming scheme allowed us to design a test where we retrieved from each client site the same set of objects from an Akamai server and their original base server. In this test the Akamai servers always yielded relatively better data rates. The relative object rate ranged from 1.9 for Chile to 15.2 for AT&T. The relative byte rate ranged from 2.1 for South Africa to 15.6 for AT&T.

Translating the impact of these improved data rates to end-to-end response time is more difficult, and depends on the strategy used by a server in distributing objects and by a client in retrieving these objects from multiple servers. If a client retrieves objects from one server at a time, then improvements in data rates for the remote content servers might be mitigated by increased costs to establish new connections with these servers, particularly if the client already has a pipelined, persistent connection with the base server. If, on the other hand, the client retrieves objects in parallel from all servers, then total response time will be controlled by the server spending the most time serving content. Results in Table 7 show that the base server is still serving most of the bytes, if not the objects, when multiple servers are used. When most of the content comes from the base server, offloaded content does reduce response time, but the distinction between distributing content to a faster server versus simply a different server is then less clear.

In summary, the performance impact of multi-server content depends on how multiple servers are used (for example, performance is not a key consideration for ad servers) and on how multi-server content is retrieved by clients. More study is needed on this issue, but it is important to consider that the use of multi-server content was still relatively small at the time of this study, being used by 15% of base URLs, with less than 1.5% using it for remote content distribution.

4.6 Range Requests

We examined a subset of requests sent to HTTP/1.1 servers that were able to handle Range requests to see if there was an appreciable reduction in latency in getting just the first hundred bytes or the first thousand bytes of an object as compared to a full retrieval. For the first-hundred-bytes test, among the servers that responded correctly with the 206 Partial Content response, the average latency for a Range response was 60% that of a full response for the AT&T Labs data, though the average full response time was only half a second. The figures for South Africa are 53% with an average full response time of 3.84 seconds, and for Norway 65% with an average full response time of 0.72 seconds. For the thousand-byte range test the numbers are similar (AT&T 59% with an average full response time of 0.71 seconds, South Africa 57% with a 3.84 second average, Norway 59% with a 1 second average). So it is clear that there is improvement in user-perceived latency, but whether the improvement is appreciable depends on the speed of the link from a client site and its location in the Internet relative to server sites.
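For illustration, a minimal probe of this kind could be written with Python's standard http.client as below. This is a sketch of the kind of measurement described above, not the httperf-based harness used in the study, and the host and path arguments are placeholders.

import http.client
import time

def timed_range_request(host, path, last_byte=99):
    """Time a Range request for the first last_byte+1 bytes of an object."""
    conn = http.client.HTTPConnection(host)
    start = time.time()
    conn.request("GET", path, headers={"Range": "bytes=0-%d" % last_byte})
    response = conn.getresponse()
    body = response.read()
    elapsed = time.time() - start
    conn.close()
    # A server that handles the request correctly returns 206 Partial Content.
    return response.status == 206, elapsed, len(body)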

5 Discussion

The results from this study lead to a number of interesting observations about the factors that influence end-to-end Web performance. Focusing on the protocol option used, the results show that the best end-to-end performance is obtained when servers support persistent connections with pipelining (burst-1.1 option) and the connection persists over the lifetime of object retrieval from the server. The improved performance for this protocol option grows relative to other protocol options as more objects need to be retrieved from a server. The amount of this improvement is relatively constant over different times-of-day with different network and server conditions.

However there are issues with this expected result that dampen its effect. First, the likelihood that a server is able to support pipelining over a single persistent connection for the lifetime of object retrieval decreases as the number of objects increases. Second, the performance benefit of persistent connections is generally lost relative to the parallel HTTP/1.0 (burst-1.0) option if the connection is reset by the server and a new one (or ones) must be reestablished by the client. These results are highlighted by differences between perfect connection persistence results in Tables 2, 3 and 8 and imperfect connection persistence results in Tables 4 and 9. Finally, the potential performance benefits of the burst-1.1 policy are only available for about 50% of the servers we tested, as approximately 25% of servers served only one object and another 25% served a small number of objects. End-to-end performance for these servers will differ little based on what protocol option is used.

Our results also show that the interactions between various factors are important in end-to-end performance. The relative impact of caching for a client will vary according to the protocol option being employed by the client. Caching without the need for validation of the contents with the server can significantly improve all options, particularly the serial-1.0 option. However, if validation is needed then the cost of each TCP connection is significant and the burst-1.1 option performs even better than the other options.

Interactions are also important in measuring the impact of multi-server content on end-to-end performance. If a client already has a persistent, pipelined connection with a base server, then retrieving a small amount of content from a different server, even if access to that server is faster than the base, may actually increase the total response time to retrieve all objects. However, if a significant amount of content does not need to be retrieved from the base server, but can be retrieved from a server closer to the client then the client should gain in performance if comparable protocol options are available from the servers.

In summary, the study raises a number of issues for further investigation, but the results do point to some recommendations that we can make for clients and servers to improve end-to-end performance based upon the factors we studied.

6 Related Work

There have been many studies examining Web performance from various perspectives. Nielsen et al. [17] published the first work on measured performance of the HTTP/1.1 protocol. A related piece of work studied the performance interactions of persistent HTTP with TCP. Both of these studies found significant interaction problems that needed correction before persistent HTTP performed as expected. These works build on prior work examining the impact of persistent connection HTTP [14].

More recent work has examined how bottlenecks in the network, CPU and disk system affect the relative performance of HTTP/1.0 and HTTP/1.1 [2]. This work describes a controlled experiment to understand the relative impact of these subsystems on the protocol versions. One of the results of this work is a recommendation that HTTP/1.1 clients implement an ``early close'' policy, where the client closes a connection after retrieving all objects associated with a Web page.

A number of other studies and tools directly examine the user-perceived performance of the Web. Keynote makes available a tool to visualize the performance of a Web retrieval [8]. Kruse, et al examine the impact of interleaving requests on user-perceived performance [12]. Gilbert and Brodersen demonstrate the benefits of progressive delivery of Web images in Web page rendering [6].

7 Summary and Future Work

In this work, we have examined factors contributing to end-to-end delay for a client's Web experience by performing a large scale study of popular Web sites. We have built on the PROCOW infrastructure by adding additional clients to the collection of client sites around the world. We have presented results on performance improvements due to the changes in the HTTP/1.1 protocol and for the impact of caching and multi-server content distribution in conjunction with different protocol options.

Our results show that the HTTP/1.1 protocol, particularly with pipelining, is indeed an improvement over existing practice, but that servers serving a small number of objects or closing a persistent connection without explicit notification can reduce or eliminate any performance improvement. Similarly, use of caching and multi-server content distribution can also improve performance if done effectively.

We believe that our work is a step in the right direction of measuring end-to-end performance. Our global testing infrastructure is solidifying and our process is largely automated. Clearly, further work is warranted. We expect more content diversification on the Web, and the dependence on DNS for such methods needs to be examined more closely. We need to investigate the contribution of network-level aspects to the latency and see how they interact with the HTTP layer. This includes passive measurements to examine the variance against the observed measurements at the application layer. We also plan to test other types of target servers, such as those responsible for serving objects that account for a larger portion of bandwidth usage and those serving dynamically generated objects.

Acknowledgements

The authors would like to thank several people who were helpful in getting us access to machines in many parts of the world--the study would not have been possible without their help. They include Martin Arlitt, Alan Barrett, Steven Bellovin, Randy Bush, Jim Griffioen, Eduardo Krell, Anders Lund, Mark Murray, Scott Shenker and Graeme Yates. We thank Mikhail Mikhailov for assistance in setting up the testing framework and David Finkel for consultation on the analysis. We thank Bruce Maggs for answering questions related to Akamai.

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
aciri 2-5 27% 52% 2.18 (1.0) 2.17 (1.0) 1.79 (0.8) 1.90 (0.9)
aciri 6-15 21% 29% 5.40 (2.7) 2.00 (1.0) 2.15 (1.1) 1.26 (0.6)
aciri 16+ 28% 20% 9.64 (2.7) 3.59 (1.0) 5.21 (1.5) 3.05 (0.9)
aciri Multi 76% 34% 4.62 (1.9) 2.45 (1.0) 2.64 (1.1) 2.01 (0.8)
aust 2-5 26% 48% 10.61 (1.4) 7.42 (1.0) 7.10 (1.0) 6.16 (0.8)
aust 6-15 21% 27% 27.28 (2.4) 11.40 (1.0) 13.24 (1.2) 10.93 (1.0)
aust 16+ 28% 18% 59.80 (2.8) 21.06 (1.0) 28.11 (1.3) 19.20 (0.9)
aust Multi 75% 31% 25.11 (2.2) 11.29 (1.0) 13.06 (1.2) 10.09 (0.9)
hp 2-5 26% 50% 1.61 (1.3) 1.20 (1.0) 1.69 (1.4) 1.37 (1.1)
hp 6-15 21% 27% 5.18 (2.2) 2.32 (1.0) 2.55 (1.1) 1.52 (0.7)
hp 16+ 28% 17% 10.22 (2.5) 4.07 (1.0) 3.80 (0.9) 2.45 (0.6)
hp Multi 75% 31% 4.24 (2.1) 2.06 (1.0) 2.33 (1.1) 1.63 (0.8)
uky 2-5 25% 49% 6.64 (1.3) 5.07 (1.0) 4.27 (0.8) 3.59 (0.7)
uky 6-15 20% 28% 14.00 (2.4) 5.80 (1.0) 5.62 (1.0) 3.88 (0.7)
uky 16+ 27% 18% 38.76 (2.8) 13.69 (1.0) 15.69 (1.1) 7.01 (0.5)
uky Multi 72% 32% 15.53 (2.2) 7.15 (1.0) 7.11 (1.0) 4.41 (0.6)
norway 2-5 26% 48% 3.50 (1.6) 2.19 (1.0) 2.65 (1.2) 2.23 (1.0)
norway 6-15 22% 9% 8.00 (2.1) 3.79 (1.0) 3.97 (1.0) 2.63 (0.7)
norway 16+ 27% 2% 19.18 (2.9) 6.57 (1.0) 11.24 (1.7) 8.09 (1.2)
norway Multi 75% 20% 4.55 (1.8) 2.53 (1.0) 3.09 (1.2) 2.46 (1.0)
safrica 2-5 26% 49% 14.22 (1.4) 10.45 (1.0) 10.35 (1.0) 10.71 (1.0)
safrica 6-15 21% 26% 35.43 (2.1) 16.95 (1.0) 22.36 (1.3) 17.20 (1.0)
safrica 16+ 27% 16% 83.90 (2.8) 29.79 (1.0) 47.13 (1.6) 30.28 (1.0)
safrica Multi 73% 31% 32.90 (2.1) 15.78 (1.0) 20.40 (1.3) 16.08 (1.0)
Table 8: Servers Exhibiting Perfect Connection Persistence from Additional Client Sites

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
aciri 2-5 27% 5% 2.77 (0.9) 3.05 (1.0) 3.80 (1.2) 3.98 (1.3)
aciri 6-15 21% 16% 6.48 (2.0) 3.32 (1.0) 3.58 (1.1) 4.59 (1.4)
aciri 16+ 28% 13% 7.16 (2.7) 2.69 (1.0) 3.83 (1.4) 3.46 (1.3)
aciri Multi 76% 11% 6.16 (2.0) 3.01 (1.0) 3.72 (1.2) 4.01 (1.3)
aust 2-5 26% 8% 11.10 (1.5) 7.58 (1.0) 8.55 (1.1) 10.00 (1.3)
aust 6-15 21% 10% 40.56 (2.7) 15.00 (1.0) 23.51 (1.6) 20.52 (1.4)
aust 16+ 28% 8% 61.44 (3.1) 20.03 (1.0) 33.77 (1.7) 21.66 (1.1)
aust Multi 75% 9% 37.27 (2.6) 14.10 (1.0) 21.73 (1.5) 17.32 (1.2)
hp 2-5 26% 6% 1.75 (0.9) 2.05 (1.0) 3.29 (1.6) 3.19 (1.6)
hp 6-15 21% 13% 7.31 (2.1) 3.55 (1.0) 3.79 (1.1) 4.81 (1.4)
hp 16+ 28% 15% 9.94 (2.4) 4.16 (1.0) 4.32 (1.0) 5.32 (1.3)
hp Multi 75% 11% 7.62 (2.1) 3.58 (1.0) 3.96 (1.1) 4.77 (1.3)
uky 2-5 25% 7% 5.80 (1.9) 3.12 (1.0) 3.21 (1.0) 7.09 (2.3)
uky 6-15 20% 17% 19.65 (1.7) 11.86 (1.0) 12.63 (1.1) 10.68 (0.9)
uky 16+ 27% 17% 38.74 (2.5) 15.65 (1.0) 15.04 (1.0) 12.69 (0.8)
uky Multi 72% 14% 26.27 (2.2) 12.10 (1.0) 12.10 (1.0) 11.00 (0.9)
norway 2-5 26% 5% 4.87 (2.3) 2.10 (1.0) 4.54 (2.2) 11.06 (5.3)
norway 6-15 22% 9% 13.86 (2.2) 6.39 (1.0) 6.89 (1.1) 18.21 (2.8)
norway 16+ 27% 4% 14.89 (2.1) 7.26 (1.0) 9.09 (1.3) 16.80 (2.3)
norway Multi 75% 6% 11.53 (2.1) 5.38 (1.0) 6.79 (1.3) 15.77 (2.9)
safrica 2-5 26% 6% 29.21 (1.4) 20.57 (1.0) 28.13 (1.4) 25.19 (1.2)
safrica 6-15 21% 17% 50.55 (1.9) 26.31 (1.0) 38.97 (1.5) 29.21 (1.1)
safrica 16+ 27% 21% 84.44 (2.4) 34.57 (1.0) 58.05 (1.7) 46.57 (1.3)
safrica Multi 73% 14% 65.22 (2.2) 29.81 (1.0) 47.39 (1.6) 37.70 (1.3)
Table 9: Servers Exhibiting Imperfect Connection Persistence from Additional Client Sites

References

1
Akamai.
http://www.akamai.com.

2
Paul Barford and Mark Crovella. A performance evaluation of hyper text transfer protocols. In Proceedings of the ACM SIGMETRICS '99 Conference, Atlanta, Georgia, May 1999. ACM.

3
Pei Cao and Sandy Irani. Cost-aware WWW proxy caching algorithms. In Symposium on Internet Technology and Systems. USENIX Association, December 1997.
http://www.usenix.org/publications/library/proceedings/usits97/cao.html.

4
R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol - HTTP/1.1. RFC 2616, HTTP Working Group, June 1999.
ftp://ftp.ietf.org/rfc2616.txt.

5
1999 Fortune 500 companies, Fortune volume 139 number 8, April 26 1999.

6
Jeffrey Gilbert and Robert Brodersen. Globally progressive interactive web delivery. In Proceedings of the IEEE Infocom '99 Conference, New York, NY, March 1999. IEEE.

7
1998 Global 500 companies, Fortune Magazine 1998.

8
Keynote lifeline.
http://lifeline.keynote.com/Lifeline/buyitonline/snapshot.asp.

9
Balachander Krishnamurthy and Martin Arlitt. PRO-COW: Protocol compliance on the web. Technical Report 990803-05-TM, AT&T Labs, August 1999.
http://www.research.att.com/~bala/papers/procow-1.ps.gz.

10
Balachander Krishnamurthy, Jeffrey C. Mogul, and David M. Kristol. Key differences between HTTP/1.0 and HTTP/1.1. In Eighth International World Wide Web Conference, Toronto, Canada, May 1999.
http://www.research.att.com/~bala/papers/h0vh1.ps.gz.

11
Balachander Krishnamurthy and Craig E. Wills. Piggyback server invalidation for proxy cache coherency. In Seventh International World Wide Web Conference, pages 185-193, Brisbane, Australia, April 1998. Published in Computer Networks and ISDN Systems (30)1-7 (1998) pp. 185-193.
http://www.cs.wpi.edu/~cew/papers/www7/www7.html.

12
Hans Kruse, Mark Allman, and Paul Mallasch. Network and user-perceived performance of web page retrievals. In Proceedings of the First International Conference on Telecommunications and Electronic Commerce, Nashville, TN USA, November 1998.
http://roland.lerc.nasa.gov/~mallman/papers/ecom98.ps.

13
Media metrix. http://www.mediametrix.com.

14
Jeffrey C. Mogul. The case for persistent-connection HTTP. In Proceedings of the ACM SIGCOMM '95 Conference. ACM, August 1995.
http://www.acm.org/sigcomm/sigcomm95/papers/mogul.html.

15
David Mosberger and Tai Jin. httperf - a tool for measuring web server performance. In Workshop on Internet Server Performance, Madison, Wisconsin USA, June 1998.
http://www.cs.wisc.edu/~cao/WISP98/final-versions/davidm.ps.

16
The netcraft web server survey. http://netcraft.co.uk/survey/.

17
Henrik Frystyk Nielsen, Jim Gettys, Anselm Baird-Smith, Eric Prud'hommeaux, Håkon Lie, and Chris Lilley. Network performance effects of HTTP/1.1, CSS1, and PNG. In Proceedings of the ACM SIGCOMM '97 Conference. ACM, September 1997.
http://www.acm.org/sigcomm/sigcomm97/papers/p102.html.

18
100hot.com. http://www.100hot.com.

19
Craig E. Wills and Mikhail Mikhailov. Towards a better understanding of web resources and server responses for improved caching. In Eighth International World Wide Web Conference, Toronto, Canada, May 1999.
http://www.cs.wpi.edu/~cew/papers/www8.ps.gz.

20
Craig E. Wills and Mikhail Mikhailov. Studying the impact of more complete server information on web caching. Technical Report WPI-CS-TR-99-36, Computer Science Department, Worcester Polytechnic Institute, November 1999.
http://www.cs.wpi.edu/~cew/papers/tr99-36.ps.gz.

Vitae

Balachander Krishnamurthy is a Member of Technical Staff at AT&T Labs--Research in Florham Park, New Jersey, USA.

Craig E. Wills is an associate professor in the Computer Science Department at Worcester Polytechnic Institute. His research interests include distributed computing, operating systems, networking and user interfaces.