4BA2 Group 6
Chris Keogh, John McKeown, John Fuller, Noel OSullivan, Stuart Barlow
In recent years, the exponential growth of the Internet has been widely attributed to the usefulness of the World-Wide Web (WWW). This is a distributed hypermedia information resource built around the HyperText Markup Language (HTML) and delivered by a widespread implementation of the HyperText Transfer Protocol (HTTP). HTTP is an extensible protocol for the transfer of generic data types, but is specially optimised for the transfer of HTML. To quote the specification of HTTP/1.0: [Draft1.0]
"The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.
The WWW is regularly used by millions of users worldwide and, because of congestion on the Internet, is beginning to put strain on the infrastructure, making truly interactive use of the Internet an oxymoron. Although the main cause of this is network speed, we will be looking at how the inefficiencies of HTTP, particularly in conjunction with TCP, have contributed to the problem. We also briefly examine other issues with HTTP in its current incarnation and as an evolving standard.
HTTP originated as a very simple protocol, HTTP 0.9, developed to reduce the inefficiencies of the FTP protocol. The goal was fast request-response interaction without requiring state at the server. The protocol was extended to include a MIME style wrapper (to convey the content type and encoding of the returned document), and a basic authentication mechanism. This extended protocol became HTTP 1.0, which is in very widespread use.
The HTTP model is extremely simple. The client (user) establishes a connection to the remote server, then issues a request. The server then processes the request, returns a response, and closes the connection.
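This connect/request/respond/close cycle can be sketched without a network: the request below is what an HTTP/1.0 client actually puts on the wire, and the response is a canned example of what a server of the era might return (the path and header values are illustrative, not from the source):

```python
# One HTTP/1.0 transaction, as bytes on the wire. The client opens a TCP
# connection, sends the request, reads the response, and the server closes
# the connection afterwards.
request = (
    "GET /index.html HTTP/1.0\r\n"
    "User-Agent: sketch/0.1\r\n"
    "\r\n"
).encode("ascii")

# A canned response for the sketch:
response = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 20\r\n"
    b"\r\n"
    b"<html>hello</html>\r\n"
)

def parse_response(raw):
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = dict(line.split(": ", 1) for line in lines[1:])
    return int(status), headers, body

status, headers, body = parse_response(response)
```

The MIME-style headers carry the content type and length, which is all the client has to go on once the connection closes.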
The simplicity of HTTP has been a major factor in its rapid adoption, but this very simplicity has become its main drawback.
Problems with HTTP/1.0
One main problem with HTTP/1.0 is that it opens a connection (usually TCP) for each request and closes that connection immediately after the data object has been received. TCP uses a slow start mechanism to avoid congestion, gradually increasing throughput to match the available bandwidth. As a result, most HTTP transactions operate at a reduced bandwidth (a figure of around 10% was quoted). This was discussed by Spero at the 31st IETF Meeting [31Minutes]. Figures quoted in Spero's paper on HTTP performance over TCP [SperoAnal] are 530ms for an open/get/close of a single data object, but only 120ms if the connection is already open. This is a substantial difference, considering that HTTP/1.0 as it stands cannot ask for more than one object per connection. With small transfers, the protocol spends more time waiting for connections to be set up and torn down than transferring the actual data.
The above problem is particularly relevant to small data objects: a study of 200,000 HTTP retrievals referenced in Padmanabhan and Mogul's paper on improving HTTP latency [PadMog] found that the mean size of data objects transferred was 13,767 bytes and the median only 1,946 bytes (excluding zero-length transfers).
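The interaction of slow start with such small objects can be illustrated with a toy model. The doubling-per-round-trip window growth below is idealised (real TCP behaviour is more involved), and the warm-connection window size is an assumption for the sketch:

```python
def rtts_to_transfer(size, mss=1460, handshake_rtts=1, cwnd=1):
    """Idealised slow start: the congestion window doubles each round trip.
    Returns the round trips needed to move `size` bytes, including the
    TCP handshake when the connection is cold."""
    rtts = handshake_rtts
    sent = 0
    while sent < size:
        sent += cwnd * mss   # bytes that fit in this round trip's window
        cwnd *= 2            # slow start: window doubles
        rtts += 1
    return rtts

# A ~2 KB object, close to the median transfer size quoted in [PadMog]:
cold = rtts_to_transfer(2000)                              # fresh connection
warm = rtts_to_transfer(2000, handshake_rtts=0, cwnd=8)    # window already open
```

Even in this crude model the cold connection pays several round trips where the warm one pays one, which is the shape of the 530ms-versus-120ms gap Spero measured.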
HTTP/1.0 has only Basic access authentication. This means that passwords and private details are sent as plain text and are therefore vulnerable to snooping on the network.
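The weakness is easy to demonstrate: the Basic scheme merely base64-encodes the credentials, which any party observing the connection can reverse with one library call (the username and password here are made up):

```python
import base64

credentials = "alice:secret"   # hypothetical username:password pair

# What the client puts in its Authorization header:
header_value = "Basic " + base64.b64encode(credentials.encode()).decode()

# An eavesdropper who captures the header recovers the password
# directly -- base64 is an encoding, not encryption, and no key exists:
recovered = base64.b64decode(header_value.split(" ", 1)[1]).decode()
```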
Scalability is also a large problem arising from this single-request-per-connection paradigm. One issue is due to the widespread use of TCP, which requires the server to maintain connection state for four minutes after the connection is closed; on a busy server this can amount to thousands of control blocks. This is discussed in Spero's paper [SperoAnal]. Another scalability problem is that clients, to get around the one-data-item-per-connection limit, use multiple threads with a connection per thread, thus increasing the load on the already overloaded server. Netscape Navigator and Microsoft Internet Explorer now do this multi-threading as standard.
The HTTP Working Group was set up to develop extensions to the existing HTTP/1.0. Their findings and proposals were embodied in HTTP/1.1, which was published as an Internet Draft before the Stockholm IETF meeting in July 1995. According to Dave Kristol, the ultimate aim of the extended protocol was to provide a small yet relatively versatile system for security, payment information, packetizing and compression [32Minutes]. A separate group was formed to deal with security and payment information.
The extensions focused on the main problems of HTTP/1.0 outlined above. They introduced the concept of a session method: a persistent connection that is terminated only by agreement between the client and server, signalled using the Connection header field. A key improvement was that multiple transactions could now take place within a single connection. It also allowed session-long negotiation of Accept-* headers, authentication, and privacy extensions.
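The effect of a persistent connection can be sketched with Python's standard library: a throwaway local server speaks HTTP/1.1 with keep-alive, and the client issues two requests over the one TCP connection (the handler, paths, and loopback address are purely illustrative):

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"       # advertise persistent connections

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()               # Content-Length lets the socket stay open
        self.wfile.write(body)

    def log_message(self, *args):        # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/first")
first = conn.getresponse(); first.read()
conn.request("GET", "/second")           # reuses the same TCP connection
second = conn.getresponse(); second.read()
conn.close()
server.shutdown()
```

The second request pays neither the TCP handshake nor a fresh slow-start ramp, which is exactly the saving the session method is after.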
According to the HTTP/1.1 Internet Draft [Draft1.1], persistent HTTP connections, allowing multiple transactions, have a number of advantages:
- By limiting the number of TCP connections, CPU time is saved, and memory used for TCP protocol control blocks is also saved.
- HTTP requests and responses can be pipelined during one connection. Pipelining allows a client to make multiple requests without waiting for each response. The acknowledgement system implemented by the combination of HTTP/1.0 and TCP was far too restrictive and as Spero said, only used ten per cent of the total bandwidth available.
- Network congestion is reduced by reducing the number of packets caused by TCP opens, and by allowing TCP sufficient time to determine the congestion state of the network.
- The HTTP protocol can be developed with ease since errors can be reported without the penalty of closing the TCP connection. Clients using future versions of HTTP could implement a new feature, but if communicating with an older server, retry with old semantics after an error is reported.
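The pipelining advantage above can be illustrated without a network at all: the client writes several requests back-to-back before reading anything, and the byte stream coming back carries the responses in order, delimited by their Content-Length headers (the request and response bytes below are canned for the sketch):

```python
# Two requests a pipelining client would send before reading any response:
pipelined_requests = (
    b"GET /a HTTP/1.1\r\nHost: example.org\r\n\r\n"
    b"GET /b HTTP/1.1\r\nHost: example.org\r\n\r\n"
)

# The responses arrive concatenated on the same connection:
stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye"
)

def split_responses(stream):
    """Carve individual response bodies out of one pipelined byte stream,
    using each response's Content-Length to find the next boundary."""
    bodies = []
    while stream:
        head, _, rest = stream.partition(b"\r\n\r\n")
        headers = dict(line.split(b": ", 1) for line in head.split(b"\r\n")[1:])
        length = int(headers[b"Content-Length"])
        bodies.append(rest[:length])
        stream = rest[length:]
    return bodies

bodies = split_responses(stream)
```

Because responses come back in request order, the client needs no per-response round trip to match them up.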
Another possible extension discussed was better access authentication. The Digest Authentication Scheme [DraftDigest] was devised to protect sensitive details in transmission across the Internet. It was not, however, included in the final draft of HTTP/1.1.
As it stands, HTTP/1.1 has only just become a Proposed Standard and may become a Request For Comments if accepted.
HTTP 1.2 - Extensions to HTTP/1.1
Within the IETF Working Group on HTTP, several groups have been set up to deal with particular aspects of improving on previous protocols, and at the time of writing these groups submit their findings and drafts under the title of HTTP/1.2. The Internet Draft on Transparent Content Negotiation in HTTP [TransCont] and the Protocol Extension Protocol (PEP) [PEP] are examples of such extensions. Each of the working groups has Internet Drafts associated with it, and these drafts are heavily influenced by HTTPng, which has for some time been the goal of the Working Group.
HTTP - Next Generation is just that: the next generation in the HTTP series. It is outlined by Simon Spero in his progress report [HTTPng]. The report describes the current state of the protocol and is written primarily in comparison with HTTP/1.0 as it stood during the development of HTTP/1.1; some of its comments, on such things as persistent connections, are therefore now obsolete, but more recent commentary on HTTPng is hard to find.
HTTPng is a protocol based on not only persistent connections but also on multiple data flows / streams within the same connection. This allows the downloading of different data items concurrently so that if one stalls for a period, that time slot will be used by another. The protocol uses a special Session Layer protocol called SCP (Session Control Protocol) [SCP] which manages these streams of data.
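The idea of several streams sharing one connection can be sketched as round-robin framing, in the spirit of (not the actual wire format of) SCP: each frame carries a stream identifier, and the receiver reassembles the streams independently, so a stalled stream never blocks the others. The stream contents below are invented for the sketch:

```python
def interleave(streams, chunk=4):
    """Emit (stream_id, data) frames round-robin until all streams drain."""
    offsets = {sid: 0 for sid in streams}
    frames = []
    while any(offsets[sid] < len(data) for sid, data in streams.items()):
        for sid, data in streams.items():
            start = offsets[sid]
            if start < len(data):
                frames.append((sid, data[start:start + chunk]))
                offsets[sid] = start + chunk
    return frames

def reassemble(frames):
    """Receiver side: group frames back into per-stream byte strings."""
    out = {}
    for sid, chunk in frames:
        out[sid] = out.get(sid, b"") + chunk
    return out

# Hypothetical payloads: an HTML page and an inline image sharing one connection.
streams = {1: b"<html>page one</html>", 2: b"GIF89a data"}
frames = interleave(streams)
received = reassemble(frames)
```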
In HTTP/1.x, all the data is encoded and decoded using a variant of MIME. This is easy for humans to read but wasteful and complicated for computers. To avoid this, HTTPng describes and encodes request messages differently, using a simplified form of ASN.1 (Abstract Syntax Notation One) and PER (Packed Encoding Rules). This scheme allows efficient, compact parsers to be generated automatically, while remaining simple enough for hand-crafted parsers to be built easily. One example of the result is the bit vectors used in the HTTPng header to specify body content instead of lengthy character strings. Each bit of the vector corresponds to one of the most commonly transferred data types, making the transport of these types efficient, while the scheme remains extensible so that other types can still be transferred. This is in direct contrast to HTTP/1.x, which has to send, with each request, a list of acceptable data types in which each type is a descriptive character string.
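The bit-vector idea can be mimicked in a few lines. Given a registry of common types agreed by both ends (the list here is invented for the sketch, not taken from the HTTPng specification), an Accept set collapses to a small integer instead of a long header string:

```python
# Hypothetical registry of common types, known to both client and server.
# Bit i of the vector stands for COMMON_TYPES[i].
COMMON_TYPES = ["text/html", "text/plain", "image/gif", "image/jpeg",
                "application/octet-stream", "audio/basic", "video/mpeg"]

def encode_accept(accepted):
    """Fold a set of accepted types into one small bit vector."""
    mask = 0
    for name in accepted:
        mask |= 1 << COMMON_TYPES.index(name)
    return mask

def decode_accept(mask):
    """Expand a bit vector back into the list of type names it stands for."""
    return [t for i, t in enumerate(COMMON_TYPES) if mask & (1 << i)]

wanted = ["text/html", "image/gif", "image/jpeg"]
mask = encode_accept(wanted)                  # fits in a single byte
as_string = "Accept: " + ", ".join(wanted)    # the HTTP/1.x equivalent
```

One byte on the wire versus dozens of header characters per request is the flavour of saving PER-style encoding buys, while unregistered types can still fall back to strings.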
At the moment, as HTTPng is so different from HTTP/1.x, it will probably be implemented first on a server-to-server basis or in proxy servers, where its efficiency would be of greatest importance. Such servers could also see which items are needed, pre-fetch them using HTTPng, and then feed them to the client using, say, HTTP/1.1.
With the many improvements outlined above, hopefully the WWW, and thus the Internet, will become not only much more useful but also very much less congested. It is interesting to note that while new versions of the protocol are being designed, very few actual implementations have used even a large part of the HTTP/1.0 specification. This is partly due to the complexity and inadequacy of the standard definitions provided to developers, but also due to the speed of growth, which has led to rushed implementations.
In practice it has been found that the vast majority of transfers involve a very small subset of data types. While HTTPng tackles this by assigning short bit codes to popular types, others have investigated even simpler solutions than HTTP. One of these is Sun's WebNFS, which has been outlined in an RFC. It views the Internet as a huge public file system, with type checking left purely to the client. This may prove a viable alternative for bulk traffic.
Finally, HTTP is not in itself immature; most implementations of it are. New tools, built to settled standards, could make all the difference.
[SperoAnal] Simon E. Spero, "Analysis of HTTP Performance problems".
[PadMog] Venkata N. Padmanabhan and Jeffrey C. Mogul, "Improving HTTP Latency".
[31Minutes] Minutes taken at the 31st HTTP-WG Meeting.
[32Minutes] Minutes taken at the 32nd HTTP-WG Meeting.
[Draft1.0] Internet Draft on HTTP/1.0, September 1995.
[Draft1.1] Internet Draft on HTTP/1.1, August 1996.
[DraftDigest] Internet Draft on Digest Authentication Scheme, September 1996.
[TransCont] Internet Draft on Transparent Content Negotiation in HTTP, Nov 1996.
[PEP] Internet Draft on the HTTP/1.2 Extension Protocol, August 1996.
[HTTPng] HTTPng: Progress So Far. Simon Spero.
[SCP] HTTPng: Progress So Far - SCP. Simon Spero.