Common File Systems - WebNFS


Origins of WebNFS

Central design strategy

Scalability

Security

Proxying and Caching

WebNFS Advantages

WebNFS Disadvantages

References



Origins of WebNFS

WebNFS is a product and proposed standard protocol from Sun Microsystems that extends its Network File System (NFS). NFS was created in the early 1980s by a small team at Sun Microsystems and was originally designed as a file access protocol for local area networks. The basic idea behind NFS is that a client can "mount" a server's filesystem so that it appears to reside on the local disk. Instead of manually transferring entire files across the network, the user can rely on NFS to move the data as needed (originally in 8K blocks, now in a size negotiated between the client and the server) and to take advantage of local caching.

Taking NFS as a basis, Sun extended it to be applicable over the Internet, allowing reading and writing of files over the Web instead of just viewing them through a browser. The basic idea is to provide the Web with a file system.

Currently, using the Internet to edit a document involves downloading the file via HTTP or FTP, editing it, and then uploading the entire file again. An Internet file system would permit in-place editing and seamless remote access: an application could edit the file directly, eliminating the download/upload burden. Furthermore, many collaborative applications that are designed to work through a shared directory could be used over the Internet immediately, without having to be rewritten (for example, as client/server applications).

In 1997 there were an estimated 10-12 million nodes connected to various NFS servers throughout the world. WebNFS allows access to the information on any one of these servers. Instead of having to use a Web browser to retrieve the files, switch tasks, and cut and paste the data into applications, users get direct and transparent access to Web data from within the applications themselves, in the same way that applications now access local disks.


Central design strategy. What it had to do technically.

NFS has been widely adopted as the de facto standard for Unix-based file servers, and it also has a healthy following in the PC market. In fact, NFS client software is available from numerous vendors for practically every operating system, and more than 50 percent of all installed NFS clients run on Microsoft Windows-based PCs. The most common flavor is NFS version 2. Version 3 is an update of the NFS protocol that addresses common limitations with version 2. WebNFS simply builds on NFS version 3, adding features to work better over the Internet.

Currently, NFS runs over the User Datagram Protocol (UDP) more often than over TCP. However, TCP is far better suited to communication over the Internet: it guarantees delivery even if packets are lost, it delivers data in the proper order, and it includes network-congestion and flow-control algorithms. So the main modification in moving from NFS to WebNFS was the use of TCP rather than UDP as the transport.





Sun also designed WebNFS to work across Internet firewalls. NFS versions 2 and 3 use a remote procedure call (RPC) mechanism in several places. To use this RPC, the client sends a service request to the "portmapper" process, which listens on the server at a well-known port. The portmapper responds with the port number where the NFS server can be found. The problem is that few Internet firewalls are sophisticated enough to follow this transaction; most firewalls deal only with TCP communication on static, well-known ports. Since NFS servers already listen on port 2049, WebNFS has clients bypass the portmapper and simply connect to port 2049.

Obtaining a file handle from the NFS MOUNT daemon, which initializes a connection to an NFS server, also requires the portmapper. WebNFS avoids this step by using a public file handle, which is analogous to a well-known port. These changes not only permit WebNFS to work across a firewall, they also significantly reduce the amount of back-and-forth communication, or turnarounds, required to initialize a connection. Given the high network latency of the Internet, minimizing turnarounds greatly improves responsiveness. Compared to NFS, the WebNFS protocol also reduces the number of turnarounds in other areas of the specification.
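The connection setup described above can be sketched as two step lists. This is only an illustration, not Sun's implementation: the step names are labels chosen here, port 111 for the portmapper is the usual Sun RPC value rather than something stated in the text, and the public file handle encodings (32 zero bytes for NFS version 2, a zero-length handle for version 3) are taken from the WebNFS client specification as best recalled and should be checked against it.

```python
# Illustrative model of connection setup; step names are labels, not wire messages.
NFS_PORT = 2049          # fixed NFS port named in the text
PORTMAPPER_PORT = 111    # assumption: the usual Sun RPC portmapper port

# Conventional NFS: two portmapper queries and a MOUNT call precede the first lookup.
CONVENTIONAL_SETUP = [
    ("portmap", PORTMAPPER_PORT),  # ask where the MOUNT daemon listens
    ("mount", None),               # dynamic port, only known after the query
    ("portmap", PORTMAPPER_PORT),  # ask where the NFS server listens
    ("lookup", NFS_PORT),
]

# WebNFS: connect straight to port 2049 and start from the public file handle.
WEBNFS_SETUP = [
    ("lookup", NFS_PORT),
]


def public_file_handle(nfs_version: int) -> bytes:
    """Return the well-known handle a WebNFS client starts from (assumed encodings)."""
    if nfs_version == 2:
        return bytes(32)   # version 2 handles are a fixed 32 bytes: all zeros
    if nfs_version == 3:
        return b""         # version 3 handles are variable length: zero length
    raise ValueError("only NFS versions 2 and 3 are modelled here")


def passes_static_firewall(steps, allowed_ports=frozenset({NFS_PORT})):
    """True if every step uses a port the firewall has been configured to allow."""
    return all(port in allowed_ports for _, port in steps)


if __name__ == "__main__":
    saved = len(CONVENTIONAL_SETUP) - len(WEBNFS_SETUP)
    print("turnarounds saved before the first lookup:", saved)
    print("conventional NFS passes firewall:", passes_static_firewall(CONVENTIONAL_SETUP))
    print("WebNFS passes firewall:", passes_static_firewall(WEBNFS_SETUP))
```

The firewall check fails for the conventional sequence because the MOUNT port is not known in advance, which is the problem the fixed port and the public file handle are meant to remove.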


Scalability

How do NFS servers and networks behave under heavy load? Response time increases and throughput decreases. In practice, complete saturation is never reached, since load shedding comes into play as users get fed up and abandon the server. While this might be good for the server, it is disastrous for a service provider to lose clients through poor service. Clearly, it is important that servers scale up to provide good service to a growing user population.

Protocols like HTTP and FTP, which are provided by user-level daemons, are hindered by the limitations of their interface to the underlying operating system. The use of worker processes at user level, or even worker threads within a single process, cannot match the tight integration and low overhead achievable within the operating system.

NFS servers are implemented within the server's operating system and therefore consume few server resources. This is why an NFS server can handle a much heavier client load than an HTTP or FTP server running on the same hardware, and still provide better response times.


Security

WebNFS uses the authentication techniques supported by the underlying Remote Procedure Call (RPC) layer. Sun RPC supports multiple authentication techniques. This design can accept newer authentication techniques as they are developed. Currently the following are supported:

  1. AUTH_NONE: This is used by applications that have no authentication requirement. NFS does not use this.
  2. AUTH_SYS: This is the default for NFS. The caller's identification is included, but not verified. Since this is insecure, most NFS servers accept these credentials only if the client's network address appears in a list of trusted hosts.
  3. AUTH_DES: The caller's identification includes a DES-encrypted verifier. The DES key is exchanged via Diffie-Hellman public-key encryption. The public keys for the client and server are obtained from a secure name service.
  4. AUTH_KERB: As for AUTH_DES, the caller's identification includes a DES-encrypted verifier. The DES key is exchanged via Kerberos version 4 secret-key encryption.
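To make the weakness of AUTH_SYS concrete, the sketch below builds an AUTH_SYS credential in XDR form. It assumes the standard Sun RPC layout (stamp, machine name, uid, gid, supplementary gids) and the usual flavor numbers; the function names are hypothetical and the host name and ids are made up for illustration.

```python
import struct

# Sun RPC authentication flavor numbers as commonly assigned
# (AUTH_DES is also known as AUTH_DH); treat these as an assumption to verify.
AUTH_NONE, AUTH_SYS, AUTH_DES, AUTH_KERB = 0, 1, 3, 4


def xdr_string(text: str) -> bytes:
    """Encode a string as XDR: 4-byte length, the bytes, zero padding to 4 bytes."""
    data = text.encode("ascii")
    padding = (-len(data)) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def auth_sys_credential(stamp, machine, uid, gid, gids=()):
    """Build an AUTH_SYS credential: flavor, body length, then the body."""
    body = struct.pack(">I", stamp) + xdr_string(machine)
    body += struct.pack(">II", uid, gid)
    body += struct.pack(">I", len(gids))
    body += b"".join(struct.pack(">I", g) for g in gids)
    return struct.pack(">II", AUTH_SYS, len(body)) + body


if __name__ == "__main__":
    # Nothing stops a client from writing uid 0 here; the server sees only the claim.
    credential = auth_sys_credential(stamp=0, machine="client", uid=0, gid=0)
    print(credential.hex())
```

Every field is filled in by the client and nothing in the credential is signed or encrypted, which is why servers fall back on a list of trusted client addresses.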


File and directory access is controlled by UNIX permission bits. These permission bits control read, write, and execute or search access for the owner of the file, the owner's group, and all other users. NFS servers generally support whatever file access control the server's operating system provides, whether that is simple permission bits (read, write, search, execute) or Access Control Lists (ACLs). Most NFS servers also feature access control based on the client's network address.
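The permission-bit check described above can be sketched as follows. This is a simplified model, not a server's actual code: `may_access` is a hypothetical helper, and it ignores superuser overrides, ACLs, and address-based rules.

```python
import stat

# Permission masks for each class of caller, keyed by the requested access.
_OWNER = {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR}
_GROUP = {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP}
_OTHER = {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH}


def may_access(mode, owner_uid, owner_gid, uid, gids, want):
    """Return True if the caller is granted every access letter in `want`.

    `want` is a string made of 'r', 'w', and 'x' ('x' means search for directories).
    Exactly one class applies: owner first, then group, then everyone else.
    """
    if uid == owner_uid:
        masks = _OWNER
    elif owner_gid in gids:
        masks = _GROUP
    else:
        masks = _OTHER
    return all(mode & masks[letter] for letter in want)


if __name__ == "__main__":
    mode = 0o640  # owner: read/write, group: read, others: nothing
    print(may_access(mode, 100, 10, uid=100, gids=[10], want="rw"))  # owner
    print(may_access(mode, 100, 10, uid=200, gids=[10], want="w"))   # group member
    print(may_access(mode, 100, 10, uid=300, gids=[30], want="r"))   # other user
```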


Proxying and Caching

Most NFS clients cache file data in memory. In recent years more aggressive disk-based caching has become available [CacheFS]. Web browsers and proxy servers can cache data obtained through NFS just as they do through other protocols. Cached NFS documents can be validated by comparing their cached file attributes with the attributes on the server. As yet there is no support for a pure NFS proxy or proxy/cache mechanism in the NFS protocol. Research indicates that cache hit rates for proxy caches are not particularly high, so the benefit of a pure proxy cache for NFS is questionable.
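Attribute-based validation can be sketched as below. The choice of size and modification time as the compared attributes is an assumption made for illustration, and `get_attributes` and `fetch` are hypothetical callables standing in for the attribute and read requests a client would send to the server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAttributes:
    size: int      # file length in bytes
    mtime: float   # last modification time, in seconds


@dataclass
class CacheEntry:
    attributes: FileAttributes  # attributes seen when the data was cached
    data: bytes


def is_fresh(entry, server_attributes):
    """A cached copy is valid while the server's attributes are unchanged."""
    return entry.attributes == server_attributes


def read_through(cache, path, get_attributes, fetch):
    """Serve from the cache when it validates, otherwise refetch and store."""
    current = get_attributes(path)          # one cheap request to the server
    entry = cache.get(path)
    if entry is not None and is_fresh(entry, current):
        return entry.data                   # no file data crosses the network
    data = fetch(path)
    cache[path] = CacheEntry(current, data)
    return data
```

The attribute request is small compared with rereading the file, so validation stays cheap even when the cached copy turns out to be stale.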


WebNFS Advantages

Connection Management: A WebNFS client can download multiple files over a single, persistent TCP connection.

Concurrency: WebNFS clients can issue multiple, concurrent requests to an NFS server. The effect is better utilization of server and network resources, and better performance for the end user.

Fault Tolerance: WebNFS inherits the fault tolerance that NFS is well known for in the face of network and server failures. When interrupted, other file transfer protocols require a download to be restarted from the beginning, forcing users to retrace their steps and duplicate effort. A WebNFS client that is interrupted can instead resume the download from where it left off.
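Resuming works because each read names its own offset, so the client only needs to know how many bytes it already holds. The sketch below assumes a hypothetical `read_at(offset, count)` callable standing in for an NFS READ request; the 8192-byte chunk size is illustrative.

```python
import io


def resume_download(read_at, total_size, sink, chunk_size=8192):
    """Continue a transfer from however many bytes `sink` already holds.

    `read_at(offset, count)` returns up to `count` bytes starting at `offset`.
    `sink` is a seekable binary file object containing the partial download.
    Returns the number of bytes held once the transfer is complete.
    """
    offset = sink.seek(0, io.SEEK_END)  # bytes received before the interruption
    while offset < total_size:
        chunk = read_at(offset, min(chunk_size, total_size - offset))
        if not chunk:
            raise IOError("server returned no data before end of file")
        sink.write(chunk)
        offset += len(chunk)
    return offset


if __name__ == "__main__":
    source = bytes(range(256)) * 100            # stands in for the remote file
    partial = io.BytesIO(source[:10000])        # what survived the interruption
    done = resume_download(lambda off, n: source[off:off + n], len(source), partial)
    print(done, partial.getvalue() == source)
```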

Performance and Scalability: NFS servers currently handle over 21,000 operations per second. NFS servers are highly integrated with the operating system, tuned for maximum system performance, and are easy to administer.

Higher Resource Utilization: Instead of keeping a number of local copies of data, WebNFS allows applications to share the data across the Web. It also significantly facilitates distributed computing by allowing machines across the Internet to have full read and write privileges for the same files.

Acceptance and Industry Support: NFS servers are already widely used throughout the world. There are 75 vendors who already support NFS, and companies such as IBM, Sequent, Sun, Auspex, Oracle, Spyglass, and JavaSoft have committed to integrating WebNFS into their browser and Internet server software.


WebNFS Disadvantages

No reliable history of performance: At this point, NFS is virtually unknown on the Internet, and there is no history of how well it performs in the Internet's shaky environment.

Speed: Although WebNFS is claimed to be up to 10 times faster than HTTP, it may still be too slow to rely on exclusively. The round-trip time (RTT) imposed by the speed of light alone between Salt Lake City and Krasnoyarsk, Russia, would be about 100 milliseconds. A conventional NFS client needs up to 5 turnarounds (portmap, mount, portmap, lookup, read) before a READ returns data, which would take half a second in the perfect case and much longer in the usual case. WebNFS removes the portmap and mount turnarounds, but every turnaround that remains still costs a full round trip.
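The latency arithmetic above can be checked with a few lines. The 100 ms round-trip time and the five-step sequence come from the text; the two-step sequence is an assumption about what remains once the portmapper and MOUNT exchanges are skipped, as the design section describes.

```python
RTT_MS = 100  # round-trip time between the two cities, as assumed in the text


def setup_delay_ms(turnarounds, rtt_ms=RTT_MS):
    """Best-case delay: one full round trip per turnaround, no queuing or loss."""
    return len(turnarounds) * rtt_ms


FULL_SEQUENCE = ["portmap", "mount", "portmap", "lookup", "read"]
SHORT_SEQUENCE = ["lookup", "read"]  # assumption: portmap and mount skipped

if __name__ == "__main__":
    print(setup_delay_ms(FULL_SEQUENCE), "ms for the full sequence")
    print(setup_delay_ms(SHORT_SEQUENCE), "ms for the shortened sequence")
```

These figures are lower bounds, since they ignore server processing time, congestion, and retransmission.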

Transparency - Good?: The designers of WebNFS claim complete transparency, which may take the decision of whether to go across the Web for a file away from the user. Although transparency makes things simpler, an application may end up connecting to a remote NFS server that is extremely slow, or that is down altogether, without the user having chosen to do so.

UDP vs. TCP: Historically, NFS has used UDP as its transport, since UDP is faster and better suited to LANs. WebNFS, however, switches to TCP, which is the norm on the Internet. Supporting both transports adds complexity to the protocol layers.

Security: WebNFS designers claim to support all firewalls. However, no firewall offers complete security, either for data traveling across the Internet or for files residing on the server. Adopting WebNFS opens an additional door, making data more vulnerable to both system design flaws and human errors.

Complete Internet dependency: By adopting WebNFS, a system becomes extremely dependent on the behavior of the Internet, which is not the most reliable thing in the world. Traffic is already very heavy and keeps growing exponentially, and some predictions hold that, unless radical measures are taken, the Internet will collapse under the load within 5 years. A decision to adopt WebNFS may therefore lead to a number of temporary, and possibly permanent, file system failures.


References

Sun's site:

Other material:

