YouServ – An Analysis
- History Of Development
- Performance And Security
- Social Aspects
History Of Development
YouServ grew out of an IBM project called userv. The project was created to make publishing material as easy as accessing it already was. Although YouServ (as it is now known) is a proprietary product that has not yet been released, it has been deployed successfully at Carnegie Mellon University (www.userv.web.cmu.edu), albeit in an early form. IBM have also been running it internally for a number of months, all over the world, on their corporate network. Any IBM employee can use YouServ, and it has proven extremely popular amongst them despite the availability of alternatives such as email and networked file storage.
At present licenses may be obtained from IBM for use of the system, but in their own words this is "complicated". So for the moment YouServ is not publicly available (at least on a practical level) and therefore not testable. They hint in their rudimentary FAQ that they may well release a version of YouServ to the public, but they reveal nothing about that release, including whether it will be open-source, proprietary, free or otherwise.
I would imagine that this system (as described below) would prove very popular as it makes creating content just as easy as viewing content. It should bring content publishing to the masses.
The Future Of YouServ
YouServ implements some interesting technologies, e.g. replication and proxying (see below), but it is important to understand that this project is still in its infancy, at least in terms of age; its current state should be pretty stable. YouServ could be improved upon in the future, e.g. by adding load balancing as an extension of proxying, or by any number of other extensions. The real utility of YouServ lies in its simplicity, though: it is easy to install, use and conceptualize. As long as YouServ can maintain its simplicity it will be a success.
Features of YouServ
YouServ establishes a simple interface that makes it easy for web content both to be accessed and published. A user who wishes to publish some web content from their computer is immediately assigned a convenient domain name, which always points to their site’s content. Normally, a home user’s computer has a different IP address each time it connects to the internet, because ISPs assign IP addresses dynamically. YouServ overcomes this difficulty by providing the user with a constant URL, which will always point to their machine.
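The idea of a constant name over a changing address can be sketched in a few lines. This is only an illustration of the principle, not YouServ's actual code: the class `NameRegistry`, its methods and the IP addresses are all made up for the example.

```python
class NameRegistry:
    """Toy stand-in for a dynamic DNS table (hypothetical, not YouServ's code)."""

    def __init__(self):
        self._records = {}  # stable site name -> most recently reported IP

    def update(self, name, ip):
        """Called by the publisher's machine each time it comes online."""
        self._records[name] = ip

    def resolve(self, name):
        """Return the current IP for a name, or None if it never registered."""
        return self._records.get(name)


registry = NameRegistry()
registry.update("username.userv.tcd.ie", "10.0.0.17")  # first session
registry.update("username.userv.tcd.ie", "10.0.3.42")  # new address next session
print(registry.resolve("username.userv.tcd.ie"))       # prints 10.0.3.42
```

Visitors only ever see the name; the address behind it is whatever the publisher's machine reported last.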
YouServ also provides a facility that lets users pool their resources to gain high-availability web hosting and file sharing. Any YouServ user can list other users, known as ‘replicators’, who are willing to host a replica of their content, so that it remains accessible when the user’s machine is off-line. Any YouServ user wishing to replicate another user’s content must in turn list those users as their ‘masters’, which forces a two-way agreement for replication. The YouServ developers have found that, on the network set up in IBM, many groups or teams have at least one member who leaves their desktop computer running continuously, and that this member’s machine is typically used as the replicator for the other members of the team. The developers have also found that some users with multiple machines, e.g. a desktop machine and a laptop, tend to use the desktop as a YouServ replicator and maintain the site’s master copy on the laptop.
This replication is transparent to any user. Once the masters and replicators are specified, the replicas synchronise with the master sites automatically. The centralised YouServ coordinator automatically activates the replicator site when the master site disconnects, so the site is switched over to the replica transparently to anyone who wishes to access it. This feature alone provides a great benefit to anyone wishing to host web pages. This kind of replication is hard for average users to achieve on their own, and it provides high enough availability that YouServ users’ sites can be indexed by a search engine such as Google, meaning that unrelated users can find content on a YouServ site through a standard search engine.
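The two-way agreement and the automatic switch-over can be sketched as follows. The `Coordinator` class, its method names and the rule for choosing among several replicators are assumptions made for illustration; IBM have not published how the real coordinator does this.

```python
class Coordinator:
    """Hypothetical coordinator tracking replication agreements and who is online."""

    def __init__(self):
        self.replicators = {}  # master -> users the master lists as replicators
        self.masters = {}      # replicator -> users it agrees to replicate
        self.online = set()    # users whose machines are currently connected

    def list_replicator(self, master, replicator):
        self.replicators.setdefault(master, set()).add(replicator)

    def list_master(self, replicator, master):
        self.masters.setdefault(replicator, set()).add(master)

    def agreed(self, master, replicator):
        """Replication is active only when both sides have named each other."""
        return (replicator in self.replicators.get(master, set())
                and master in self.masters.get(replicator, set()))

    def serving_host(self, master):
        """Pick the machine that should answer requests for the master's site."""
        if master in self.online:
            return master
        # Assumption: try replicators in alphabetical order.
        for candidate in sorted(self.replicators.get(master, set())):
            if candidate in self.online and self.agreed(master, candidate):
                return candidate
        return None  # nobody can serve the site right now


coordinator = Coordinator()
coordinator.list_replicator("alice", "bob")  # alice names bob as a replicator
coordinator.list_master("bob", "alice")      # bob names alice as a master
coordinator.online.update({"alice", "bob"})
coordinator.online.discard("alice")          # alice's laptop disconnects
print(coordinator.serving_host("alice"))     # prints bob
```

If only one side had listed the other, `agreed` would be false and the site would simply be unavailable while the master is off-line.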
YouServ provides a way for machines behind firewalls that do not accept inbound port 80 (HTTP) connections to still host web content. Such a firewall means that the web content cannot be accessed easily and directly by a web browser, because browsers expect HTTP servers to listen on port 80. This was added to YouServ because most corporations have firewalls that will not accept inbound port 80 connections, and YouServ was initially aimed at corporate intranets, such as the one at IBM where it is being developed. YouServ provides a service called peer-to-peer proxying. This works, very simply, by calling upon members of the YouServ community that can accept inbound port 80 connections to accept connections on behalf of users who cannot. The YouServ software detects whether a proxy is needed when it starts up and notifies the YouServ coordinator. The coordinator then forwards the contact information of a machine that is willing to serve as a proxy. The first machine then connects to the proxy, which accepts connections on its behalf.
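The coordinator's side of this hand-off might look roughly like the sketch below. `ProxyCoordinator` and `register` are hypothetical names, every peer that accepts inbound connections is assumed to be willing to act as a proxy, and the policy of picking the first willing proxy is an assumption, as the source does not say how the coordinator chooses.

```python
class ProxyCoordinator:
    """Hypothetical coordinator that pairs firewalled peers with willing proxies."""

    def __init__(self):
        self.willing_proxies = []  # peers that accept inbound port 80 connections
        self.assignments = {}      # firewalled peer -> proxy serving on its behalf

    def register(self, peer, accepts_inbound):
        """A peer reports at startup whether inbound connections can reach it."""
        if accepts_inbound:
            self.willing_proxies.append(peer)
            return None  # no proxy needed
        if not self.willing_proxies:
            raise LookupError("no proxy available")
        proxy = self.willing_proxies[0]  # assumption: simplest possible choice
        self.assignments[peer] = proxy
        return proxy  # the peer now connects outwards to this proxy


coordinator = ProxyCoordinator()
coordinator.register("open-peer", accepts_inbound=True)
print(coordinator.register("firewalled-peer", accepts_inbound=False))  # prints open-peer
```

The important point is the direction of the connection: the firewalled machine dials out to the proxy, so the firewall never sees an inbound request.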
YouServ also provides access control, as described in the security discussion below.
As YouServ is in essence a web hosting alternative, the primary security issues facing it are controlling access to the information on the site and protecting the site from possible malicious attacks. The recently added support for Java plugins on YouServ sites has also brought with it some new security issues.
YouServ as a web-hosting system attempts to make it easy for users to set the level of security on their files. There are three levels: public, protected and private. Any user can access files inside the public folder. The protected folder requires authenticated access to control the downloading of content. Finally, the private folder is only available to the owner of the site and any other users they name in a file.
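The three levels amount to a small decision rule, sketched here. The function `may_access` and its parameters are invented for the example, and I am assuming that the level is decided by the top-level folder of the requested path and that private access also requires the user to be authenticated.

```python
def may_access(path, user, owner, authenticated, named_users=()):
    """Decide access from the top-level folder of the requested path."""
    top = path.strip("/").split("/")[0]
    if top == "public":
        return True                      # anyone may read public files
    if top == "protected":
        return authenticated             # any authenticated user
    if top == "private":
        # Owner, plus anyone the owner has named in their access file.
        return authenticated and (user == owner or user in named_users)
    return False  # assumption: anything outside the three folders is denied


print(may_access("public/cv.html", None, "alice", authenticated=False))      # prints True
print(may_access("private/diary.txt", "bob", "alice", authenticated=True))   # prints False
```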
To offer secure access control to a website, there must be a facility to pass encrypted data and the ability to verify users who wish to access the site. HTTP, when run over SSL (HTTPS), can transmit encrypted data and verify websites to users. However, there is a problem authenticating users to websites. YouServ addresses this issue by having a centralized site for anyone requiring restricted access, which redirects users to the requested site once access is granted.
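The redirect through a central site can be pictured as a round trip. The sketch below is an assumption about the general shape of such a flow, not a description of YouServ's protocol: the login address `login.userv.example`, the `return_to` parameter and both function names are hypothetical, and the step that would prove the user's identity to the site afterwards is left out.

```python
from urllib.parse import urlencode, urlparse, parse_qs

LOGIN_URL = "https://login.userv.example/auth"  # hypothetical central login site


def login_redirect(requested_url):
    """Where a site sends an unauthenticated visitor, remembering the target."""
    return LOGIN_URL + "?" + urlencode({"return_to": requested_url})


def after_login(login_request_url, credentials_ok):
    """Where the central site sends the visitor once the password is checked."""
    query = parse_qs(urlparse(login_request_url).query)
    return query["return_to"][0] if credentials_ok else None


target = "http://username.userv.tcd.ie/protected/report.html"
step1 = login_redirect(target)
print(after_login(step1, credentials_ok=True) == target)  # prints True
```

The benefit of centralizing this is that individual sites never handle passwords themselves.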
There are two features of YouServ which help make it more robust against malicious attack than normal websites. Firstly, it is written in Java, which makes it less susceptible to buffer overflow attacks. Secondly, due to its replication feature, where sites are also hosted on other users' machines, taking down one of the YouServ nodes will not necessarily bring down the website.
The added functionality of Java plugins, which have full access to the Java virtual machine, also brings up some security issues. In response to this YouServ has two features: digital signatures and sandboxing. Digital signatures mean that any downloaded plugin will be checked by YouServ for a proper authentication certificate; if one does not exist, the user will be warned. Sandboxing involves restricting the set of operations that the plugin's code has available to it. This reduces the plugin's functionality without necessarily making it redundant, whilst protecting the machine if the plugin does turn out to be malicious.
Despite all the features listed above, the YouServ project is still considered to be in its developmental stage, even though it is in widespread use within IBM. For example it is possible for a
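The sandboxing idea, restricting a plugin to a permitted set of operations, can be illustrated with a simple allow-list. This is a Python sketch of the principle only; it is not how Java's own security mechanisms work, and the `Sandbox` class and the operation names are hypothetical.

```python
class Sandbox:
    """Hypothetical allow-list sandbox: a plugin may only call permitted operations."""

    def __init__(self, operations, allowed):
        self._operations = operations  # name -> callable offered by the host
        self._allowed = set(allowed)   # names this plugin is permitted to use

    def call(self, name, *args):
        if name not in self._allowed:
            raise PermissionError(f"plugin may not call {name!r}")
        return self._operations[name](*args)


host_operations = {
    "read_shared": lambda filename: f"contents of {filename}",
    "delete_file": lambda filename: None,
}
sandbox = Sandbox(host_operations, allowed={"read_shared"})
print(sandbox.call("read_shared", "notes.txt"))  # prints contents of notes.txt
# sandbox.call("delete_file", "notes.txt") would raise PermissionError
```

A plugin that only needs to read shared files still works, while a malicious one cannot reach the destructive operation.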
“Hacker to "sniff" and reuse encrypted identify information (but not passwords) to temporarily impersonate any person whose request for secured content can be intercepted.”
These problems, however, also exist on other web hosting services, and YouServ does deliver an improvement on existing HTTP security features.
Performance And Scalability
YouServ is very similar to existing web technologies, and it uses some of them to achieve good scalability. The web itself scales well; it has well over a billion pages. However, YouServ does have its own server system, which takes care of replication and proxying as well as user authentication. There is also a dynamic DNS server to handle DNS queries and implement proxying and replication transparently. The servers are only used during host lookups and user authentication, whereas the content is delivered on a peer-to-peer basis. In some ways this resembles the way Napster worked, although Napster's servers also indexed the files any one node had, whereas YouServ sites can be indexed via regular web search tools, e.g. Google. Because the content transfer is peer-to-peer there is no real scaling problem here: the performance of file transfers should stay roughly constant even with massive increases in the number of nodes. The bottlenecks are in the servers: the DNS server during name resolution, and the YouServ coordinator during user authentication and availability monitoring.
As the dynamic DNS component is essentially identical to existing dynamic DNS servers, we can look to them to see how well they deal with large-scale systems. There are many dynamic DNS servers which map specific names to changing IP addresses. In general these systems are successful; there are many examples such as cjb.net, myip.org, DynDNS.dk etc. All these systems have many thousands of customers, and they give away these services for free. We find that in general they scale very well: Dynamic DNS Network Services has over 150 thousand subscribers. Dynamic DNS has proven to scale well and is a well tested technology. As YouServ's dynamic DNS is essentially identical to regular dynamic DNS, it should behave no differently when scaling to higher numbers of users.
The remaining bottleneck therefore lies in the YouServ component and more specifically in the authentication part of that component. The node availability management for the most part can be left up to the nodes themselves to monitor. The authentication process is a relatively simple process involving a database lookup and some small data exchange over the network. Also the authentication process is only used when:
- A site is brought up onto the network
- Someone wants to access the protected or private file store.
Hence for many common uses of the system authentication isn't used, and when it is used it is a relatively inexpensive process. This process should therefore also scale well. If a single server doesn't stand up to the number of authentication requests it receives, peer servers could be added and data could be replicated or partitioned across these servers. However, for the most part a single server should scale adequately.
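Partitioning the authentication data across peer servers could be as simple as hashing the user ID, as in this sketch. The function name and server names are made up, and hash partitioning is my own choice of technique to illustrate the suggestion; nothing indicates that IBM do this.

```python
import hashlib


def auth_server_for(user_id, servers):
    """Map a user ID to one server via a stable hash, so lookups always agree."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["auth-0", "auth-1", "auth-2"]  # hypothetical peer servers
chosen = auth_server_for("username", servers)
print(chosen in servers)  # prints True
```

Every request for the same user lands on the same server, so each server only needs that user's slice of the database. The catch with plain modulo hashing is that adding a server moves most users to a different server.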
IBM are keen to emphasize the ease of use of YouServ. Those who wish to access YouServ-published content do not need to install special purpose software.
YouServ employs standard web and Internet protocols (e.g. HTTP and DNS) and because of this anyone with a web browser can access its sites.
Those who want to publish their own site will have to install software. The installation program consists of a wizard that prompts the user to enter a user ID and password. IBM claim that in only two clicks of the mouse the first file is online and accessible. The user’s home page is given a tidy, convenient address based on their e-mail address or username, e.g. in the Trinity College domain, a user’s domain name might be “username.userv.tcd.ie”. This is significantly better than some of the convoluted URLs that free web hosts offer.
Putting files on the web is child’s play. To share a file (or an entire folder of files), a user can either copy it to his or her shared folder, or simply right-click over it and select a "Publish to YouServ" desktop menu option that is installed in Windows environments along with the peer software. When publishing through this menu option, YouServ prompts the user to choose a specific shared destination folder. YouServ copies the file (or folder) to the designated destination and displays a URL that addresses the published content on the web. A user may choose to offer his content as a collection of files and folders, in which case YouServ will list the contents of any folder that is accessed. If the user would prefer a home page, they simply call their page ‘index.html’ and upload it to their root directory.
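Deriving such an address is straightforward. The sketch below assumes the pattern suggested by the Trinity College example (username, then "userv", then the organisation's domain); the function `site_domain` is hypothetical and the exact rule YouServ uses is not documented here.

```python
def site_domain(identity, organisation_domain):
    """Build 'username.userv.<organisation domain>' from an e-mail address or username."""
    username = identity.split("@", 1)[0].lower()  # drop any '@domain' part
    return f"{username}.userv.{organisation_domain}"


print(site_domain("username@tcd.ie", "tcd.ie"))  # prints username.userv.tcd.ie
```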
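The choice between a folder listing and a home page comes down to whether ‘index.html’ exists. Here is a minimal sketch of that rule; the function name and the return format are my own, and the real software presumably renders the listing as a web page.

```python
from pathlib import Path


def respond_for_folder(folder):
    """Return ('page', html) if index.html exists, else ('listing', sorted names)."""
    folder = Path(folder)
    index = folder / "index.html"
    if index.is_file():
        return "page", index.read_text(encoding="utf-8")
    return "listing", sorted(entry.name for entry in folder.iterdir())
```

Calling `respond_for_folder` on a shared folder without an ‘index.html’ gives the list of file names; once the user adds one, the same call returns the page instead.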
The use of so-called “peer-to-peer proxying” makes it possible for people behind firewalls to tunnel out. Having lived on campus, with my computer behind a firewall I can say that peer-to-peer proxying is desirable.
There are not very many people currently using YouServ: it has a regular user base of 1500, mostly IBM employees and some students. At present the system is not available to the punter. The YouServ project has a relatively low profile compared to some other sharing programs out there, and it remains to be seen whether YouServ could burst on to the scene and penetrate the market. However, YouServ isn’t exactly cutting edge in terms of the technology it employs, and in a way offers little that you couldn’t do yourself. It does make life easier for you if you are not technically inclined, want to publish material on the web and are a bit strapped for cash.
Eventually legal issues will crop up, whether it is copyright, pornography, etc. If this thing ever takes off it will have to win over the pirates, hackers and pornographers of the digital world and that is unlikely given the lack of anonymity. All the geeks will hate it because it isn’t open source and it’s user-friendly.
Like any other project, the success of YouServ will primarily be judged upon its influence. The technology it uses is fairly simple and hardly revolutionary; instead it attempts to be as user friendly as possible.
YouServ is still at a developmental stage within IBM, so it is hard to predict what kind of influence the project will wield over time. Within IBM the developers claim that YouServ is used by 1500 people on a regular basis. This suggests that it is a useful tool. However, the vision of the project suggests that YouServ, or some other p2p product along similar lines, could come to have a very real influence on the structure of the Internet.
Basically YouServ provides a facility for people who are not technically adept to set up web pages in order to share content. It also provides the possibility of websites being created that do not have to be hosted by large companies like Geocities. In an age when the demand for personal web pages is increasing all the time a product such as YouServ may find itself in demand.
However at the moment the YouServ community is still quite small and outside of IBM is relatively unknown so the social effect of YouServ at the moment is minimal and in the future can only be speculated upon.
Alternatives To YouServ
There are other services that use web servers to allow the sharing of files and services that provide other facilities similar to those provided by YouServ, such as replication and location independent URLs. BadBlue, http://www.badblue.com, is a webserver aimed at personal web serving. It provides easy-to-use webserving features and also implements the Gnutella protocol to support dynamic search over BadBlue and Gnutella-shared content. The Microsoft Personal Web Server and the MacOS personal web sharing function both facilitate the web publishing of local files. YouServ is different from these as it solves the problems that stop them from being a viable alternative to paid web-hosting facilities.
Many other similar tools also support file-sharing, such as Kazaa, Gnutella, Napster and Instant Messaging systems such as AOL Instant Messenger, MSN Messenger, Yahoo Messenger and ICQ. All of these, however, require the use of special client software, which needs to be downloaded, installed and configured in order to use them. YouServ enables people to share content simply using a web browser, which will be available to almost anyone trying to access published content. Other, similar services, such as Lotus QuickPlace, provide web access of content, but these services require centralised servers, which causes problems with scaling to large numbers of users without spending extra money on hardware and administration.
There are also other tools which allow location independent URLs, such as that provided by XDegrees Corp. There isn’t much information available on their solution, but a whitepaper on it mentions that they use a naming system that resolves the physical location of an entire URL. Existing web browsers, however, only use the domain name portion of the URL to resolve the IP address, because the DNS protocol has no provisions for resolving IP addresses from full URLs.
Freenet allows content to be shared in a way that makes it uncensorable and anonymous to use. It uses a web interface, as does YouServ, but it does not support high availability of content directly, as YouServ does. Content is propagated by requests from users. This means that the replication of files is difficult to control and some files can be hard to find, whereas with YouServ replication is explicitly controlled, so users choose what they want to replicate and what they don’t.
Only time will tell…
- Carnegie Mellon University www.cmu.edu
- Dynamic DNS provider www.cjb.net
- IBM YouServ project homepage www.almaden.ibm.com/cs/people/bayardo/userv
- Slashdot – online techie magazine www.slashdot.org
- Google – search engine used in this project www.google.com
- Bad Blue – Alternative web server www.badblue.com
- Freenet – Another P2P network www.freenetproject.org