Diagnosing Home Network Misconfigurations using Shared Knowledge
The Problem
Today’s home network is typically a collection of several devices and applications. While the broadband modem acts as the home’s window to the Internet, a variety of hardware and software components interact within the home. For example, you may have a wireless router attached to your broadband modem, with several laptops and desktops connecting to the router over both wired and wireless links. You may also have special-purpose devices such as gaming consoles or Wi-Fi media players connected. The result is a rich, diverse, and unmanaged network which, more often than not, has to be administered and configured by the home users themselves. Configuring the home network so that all applications and devices work as expected is extremely difficult. Evidence of this can be found on technical support websites, forums, and mailing lists, which hold posts about countless problems such as:
“My Xbox does not connect to the Xbox Live service.”
“My Xbox 360 doesn’t work with the wireless network.”
“My VPN client does not work from home.”
“My browser takes ages to load any website.”
“I have set up an FTP server on my machine but it does not work.”
“My IM client does not work from home.”
When home users run into such problems, they typically troubleshoot by asking a friend for help, looking online for solutions, or calling technical support. Very often, this manual process not only takes a significant amount of time but also causes immense frustration.
Our focus is on fixing application-specific networking problems like the ones listed above. We believe that advanced systems like Windows Vista already have diagnostics to resolve basic connectivity problems such as “Cannot connect to the network”, caused by an unplugged network cable, a powered-off router, or similar issues. On the other hand, consider a problem such as this: your email client and your browser work just fine, but your instant messaging client fails to connect to the network. Such problems are far subtler and harder to resolve, and they can cause much more frustration than basic connectivity issues. Hence our emphasis is on this second, application-specific set of problems.
Our Solution in a Nutshell
The objective of the NetPrints project is to alleviate home user frustration by making the search for the correct configuration parameters automatic. The main idea is to use shared knowledge to resolve misconfiguration issues in home networks. Suppose user A and user B run the same application, and it works well for user A but not for user B. If user A had the same problem in the past and figured out a fix, user B should have access to that fix. Alternatively, if the application has always worked perfectly for user A, a comparison of A’s and B’s home network configurations can yield a solution to user B’s problem.
This approach is akin to how users today scour through online discussion forums looking for a solution to their problem. However, a key distinction is that the accumulation, indexing, and retrieval of shared knowledge in NetPrints happens automatically, with little human involvement.
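The comparison of a working and a failing configuration described above can be illustrated with a minimal sketch. The setting names (`upnp_enabled`, `wan_mtu`, `wireless_auth`) and the flat dictionary representation are assumptions made for illustration; they are not the format NetPrints actually uses.

```python
def config_diff(working, failing):
    """Return the settings whose values differ between two configurations.

    Each result maps a setting name to (failing value, working value).
    Settings missing from one side are reported with None on that side.
    """
    keys = sorted(set(working) | set(failing))
    return {
        key: (failing.get(key), working.get(key))
        for key in keys
        if failing.get(key) != working.get(key)
    }


# Hypothetical configurations for user A (application works) and user B (it does not).
user_a = {"upnp_enabled": True, "wan_mtu": 1500, "wireless_auth": "WPA"}
user_b = {"upnp_enabled": False, "wan_mtu": 1500, "wireless_auth": "WPA"}

candidate_changes = config_diff(user_a, user_b)
print(candidate_changes)
```

A plain difference like this only produces candidate changes. Deciding which differences actually matter is the job of the learned model described in the next section.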
System Architecture
NetPrints comprises client and server components. The client component gathers configuration information from the client host and from network devices such as the home router. In addition, it captures a trace of the network traffic associated with an application run and extracts a set of features that characterize the corresponding network communication. The client component then uploads its local configuration information along with the network traffic features to the server. In addition, in the case of a failed application run, the user clicks a diagnose button to invoke NetPrints diagnostics (this is the only human input needed in NetPrints). This also signals to the server that the configuration information and network traffic features just uploaded correspond to an unsuccessful run of the application. We term the combination of configuration information, network traffic features, and the indication of whether or not an application run was successful an anecdote.
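One possible way to picture an anecdote is as a small record with three parts. The sketch below is an assumption about its shape, not the project’s actual data format; the class, field, and feature names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union

Value = Union[bool, int, float, str]


@dataclass
class Anecdote:
    application: str                   # which application the run belongs to
    configuration: Dict[str, Value]    # host and router settings gathered by the client
    traffic_features: Dict[str, Value] # features extracted from the traffic trace
    successful: bool                   # False when the user asked for a diagnosis


def build_anecdote(application, configuration, traffic_features, diagnose_clicked):
    """Package one application run; clicking diagnose marks the run as failed."""
    return Anecdote(
        application=application,
        configuration=dict(configuration),
        traffic_features=dict(traffic_features),
        successful=not diagnose_clicked,
    )


# Hypothetical failed run: the feature names are invented for illustration.
failed_run = build_anecdote(
    "Xbox Live",
    {"upnp_enabled": False, "wan_mtu": 1500},
    {"tcp_syn_without_reply": 3},
    diagnose_clicked=True,
)
```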
The server gathers such anecdotes from clients and constructs a decision tree (using a machine learning algorithm) for every application to represent the knowledge of good and bad configurations. The figure above shows an example decision tree that we obtained for the Xbox 360 using three home routers. The tree tells us that whenever UPnP is disabled on the router, or the WAN MTU parameter is set lower than 1300, or the wireless authentication protocol is set to WPA2, the configuration of the home network is “bad”. In other words, the Xbox console will not be able to connect to the Xbox Live service.
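The three conditions just listed can be written down as a small hand-built tree to show how a configuration is classified. This is only a sketch: in NetPrints the tree is learned from anecdotes, whereas here it is hard-coded, and the order of the tests, the setting names, and the assumption that every other configuration is “good” are illustrative choices.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    label: str  # "good" or "bad"


@dataclass
class Node:
    description: str
    test: Callable[[dict], bool]
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


# Hand-written stand-in for the learned Xbox 360 tree (test order is assumed).
XBOX_TREE = Node(
    "UPnP enabled on the router",
    lambda c: c["upnp_enabled"],
    if_true=Node(
        "WAN MTU is at least 1300",
        lambda c: c["wan_mtu"] >= 1300,
        if_true=Node(
            "wireless authentication is not WPA2",
            lambda c: c["wireless_auth"] != "WPA2",
            if_true=Leaf("good"),
            if_false=Leaf("bad"),
        ),
        if_false=Leaf("bad"),
    ),
    if_false=Leaf("bad"),
)


def classify(tree, config):
    """Walk from the root to a leaf and return that leaf's label."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if node.test(config) else node.if_false
    return node.label
```

For instance, `classify(XBOX_TREE, {"upnp_enabled": True, "wan_mtu": 1200, "wireless_auth": "WPA"})` follows the first branch, fails the MTU test, and returns `"bad"`.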
The server also uses a decision tree algorithm to identify the network traffic features that are important, thereby generating a network signature for each application. In addition, the server maintains a suggestion table, indexed by network signature, in which it stores the configuration fixes that other clients have previously reported as their solution to a similar problem. The suggestion table provides hints that are particularly useful for less common problems, where the decision tree learning algorithm may not have captured enough information to resolve the issue.
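A suggestion table of this kind could be as simple as a map from signature to a list of reported fixes. The sketch below assumes a signature can be represented as a hashable tuple of (feature, value) pairs and a fix as a dictionary of setting changes; both representations, and all names, are hypothetical.

```python
from collections import defaultdict


class SuggestionTable:
    """Maps a network signature to the fixes clients reported for it."""

    def __init__(self):
        self._fixes = defaultdict(list)

    def report_fix(self, signature, fix):
        """Record a fix for a signature, ignoring exact duplicates."""
        if fix not in self._fixes[signature]:
            self._fixes[signature].append(fix)

    def lookup(self, signature):
        """Return a copy of the fixes known for a signature (possibly empty)."""
        return list(self._fixes.get(signature, []))


table = SuggestionTable()
# Hypothetical signature: the feature name is invented for illustration.
signature = (("tcp_syn_without_reply", True),)
table.report_fix(signature, {"firewall_port_5222_open": True})
table.report_fix(signature, {"firewall_port_5222_open": True})  # duplicate, ignored
```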
When the server is presented with a client request triggered by a user invoking “help”, it walks down the decision tree that codifies its knowledge and identifies configuration changes that might help resolve the problem, using a procedure we call configuration mutation. If the decision tree traversal does not yield a suitable fix, the server consults the suggestion table for any isolated configuration changes that might solve the problem. If both the tree traversal and the suggestion table lookup fail to generate a configuration fix, NetPrints infers that the problem is not related to the client’s home network configuration.
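The three-stage fallback described above can be sketched as follows. The page does not spell out how configuration mutation works, so the version here is an assumption: it tries changing one setting at a time and keeps the changes that a stand-in classifier labels “good”. The classifier, the candidate values, and the plain dictionary used as a suggestion table are all hypothetical.

```python
def is_good(config):
    """Stand-in for the learned decision tree (rules taken from the Xbox 360 example)."""
    return bool(
        config["upnp_enabled"]
        and config["wan_mtu"] >= 1300
        and config["wireless_auth"] != "WPA2"
    )


# Assumed set of values the server may try for each setting.
CANDIDATE_VALUES = {
    "upnp_enabled": [True, False],
    "wan_mtu": [1300, 1500],
    "wireless_auth": ["WPA", "WPA2"],
}


def mutate(config):
    """Return single-setting changes that turn a bad configuration into a good one."""
    if is_good(config):
        return []  # the tree sees nothing wrong, so it has no fix to offer
    fixes = []
    for setting, values in CANDIDATE_VALUES.items():
        for value in values:
            if value == config.get(setting):
                continue
            trial = dict(config)
            trial[setting] = value
            if is_good(trial):
                fixes.append({setting: value})
    return fixes


def diagnose(config, signature, suggestion_table):
    """Try the tree first, then the suggestion table, then give up on configuration."""
    fixes = mutate(config)
    if fixes:
        return "decision_tree", fixes
    fixes = list(suggestion_table.get(signature, []))
    if fixes:
        return "suggestion_table", fixes
    return "not_configuration_related", []
```

Because this sketch only explores single-setting changes, a configuration with two independent problems falls through to the suggestion table. A fuller mutation procedure would need to search combinations of changes.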
People
B. Ashok
Senior Director of Applied Sciences and Engineering
Venkat Padmanabhan
Managing Director, Microsoft Research India