Online Behavioral Genome Sequencing from Usage Logs: Decoding the Search Behaviors
We present a system that infers user interests from the online behaviors recorded in large-scale usage logs. We surmise that user interests can be characterized by a large collection of features, which we call behavioral genes, that can be deduced from both explicit and implicit online behaviors. The goal of this research is to sequence the entire behavioral genome of the online population, namely, to identify the pertinent behavioral genes and uncover their relationships in explaining and predicting user behaviors, so that high-quality user profiles can be created and online services can be better customized using these profiles. Within the scope of this paper, we demonstrate the work using the partial genome derived from web search logs. Our demo system is supported by an open-access web service that we are releasing to the research community. The main functions of the web service are: (1) calculating query similarities based on lexical, temporal, and semantic scores, (2) clustering a group of user queries into tasks that share the same search and browse intent, and (3) inferring user topical interests by providing a probability distribution over a search taxonomy.
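To make the three functions concrete, the following is a minimal illustrative sketch and not the released web service or its API. Every name in it (lexical_sim, temporal_sim, query_sim, cluster_tasks, topic_distribution), the weights, the half-life, the threshold, and the toy taxonomy are assumptions introduced only for illustration. It substitutes token-set Jaccard overlap for the lexical score, an exponential time decay for the temporal score, single-link grouping for task clustering, and smoothed keyword counts for the topical distribution; the semantic score is omitted.

```python
from datetime import datetime


def lexical_sim(q1, q2):
    """Jaccard overlap of lowercased token sets (assumed lexical score)."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def temporal_sim(t1, t2, half_life_s=300.0):
    """Decays from 1 toward 0 as the time gap grows (assumed half-life)."""
    gap = abs((t1 - t2).total_seconds())
    return 0.5 ** (gap / half_life_s)


def query_sim(e1, e2, w_lex=0.7, w_time=0.3):
    """Weighted mix of the two scores; the weights are illustrative only."""
    (q1, t1), (q2, t2) = e1, e2
    return w_lex * lexical_sim(q1, q2) + w_time * temporal_sim(t1, t2)


def cluster_tasks(entries, threshold=0.4):
    """Single-link grouping: queries at or above the threshold share a task."""
    parent = list(range(len(entries)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(entries)):
        for j in range(i + 1, len(entries)):
            if query_sim(entries[i], entries[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, (query, _) in enumerate(entries):
        groups.setdefault(find(i), []).append(query)
    return list(groups.values())


def topic_distribution(queries, taxonomy):
    """Add-one smoothed keyword counts, normalized to sum to 1."""
    counts = {topic: 1.0 for topic in taxonomy}
    for q in queries:
        tokens = set(q.lower().split())
        for topic, keywords in taxonomy.items():
            counts[topic] += len(tokens & keywords)
    total = sum(counts.values())
    return {topic: c / total for topic, c in counts.items()}


# Hypothetical log entries: (query text, timestamp).
log = [
    ("cheap flights seattle", datetime(2024, 1, 1, 9, 0, 0)),
    ("flights seattle to boston", datetime(2024, 1, 1, 9, 2, 0)),
    ("python list sort", datetime(2024, 1, 1, 15, 0, 0)),
]
# Hypothetical two-node taxonomy with hand-picked keywords.
taxonomy = {
    "Travel": {"flights", "hotel", "seattle", "boston"},
    "Programming": {"python", "list", "sort"},
}

tasks = cluster_tasks(log)
profile = topic_distribution(tasks[0], taxonomy)
```

On this toy log, the two flight queries fall into one task and the programming query into another, and the profile for the first task puts most of its probability mass on the Travel node. A real system would replace each component with a learned model and a full search taxonomy.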