Mining Acronym Expansions and Their Meanings Using Query Click Log

Published by WWW Conference 2013

论文与出版物

Acronyms are abbreviations formed from the initial components of words or phrases. Acronym usage is becoming more common in web searches, email, text messages, tweets, blogs and posts. Acronyms are typically ambiguous and often disambiguated by context words. Given either just an acronym as a query or an acronym with a few context words, it is immensely useful for a search engine to know the most likely intended meanings, ranked by their likelihood. To support such online scenarios, we study the offline mining of acronyms and their meanings in this paper. For each acronym, our goal is to discover all distinct meanings and for each meaning, compute the expanded string, its popularity score and a set of context words that indicate this meaning. Existing approaches are inadequate for this purpose. Our main insight is to leverage“co-clicks”in search engine query click log to mine expansions of acronyms. There are several technical challenges such as ensuring 1:1 mapping between expansions and meanings, handling of “tail meanings” and extracting context words. We present a novel, end-to-end solution that addresses the above challenges. We further describe how web search engines can leverage the mined information for prediction of intended meaning for queries containing acronyms. Our experiments show that our approach (i) discovers the meanings of acronyms with high precision and recall, (ii) significantly complements existing meanings in Wikipedia and (iii) accurately predicts intended meaning for online queries with over 90% precision.