GeoTrend: Spatial Trending Queries on Real-time Microblogs
- Amr Magdy
- Ahmed M. Aly
- Mohamed F. Mokbel
- Sameh Elnikety
- Yuxiong He
- Suman Nath
- Walid G. Aref
ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (GIS)
Published by ACM
This paper presents GeoTrend, a system for scalable support of spatial
trend discovery on recent microblogs, e.g., tweets and online
reviews, that come in real time. GeoTrend is distinguished from
existing techniques in three aspects: (1) It discovers trends in arbitrary
spatial regions, e.g., city blocks. (2) It supports trending
measures that effectively capture trending items under a variety of
definitions that suit different applications. (3) It promotes recent
microblogs as first-class citizens and optimizes its system components
to digest a continuous flow of fast data in main memory while
removing old data efficiently. GeoTrend queries are top-k queries
that discover the most trending k keywords that are posted within
an arbitrary spatial region and during the last T time units. To support
its queries efficiently, GeoTrend employs an in-memory spatial
index that is able to efficiently digest incoming data and expire
data that is beyond the last T time units. The index also materializes
top-k keywords in different spatial regions so that incoming
queries can be processed with low latency. In case of peak times,
a main-memory optimization technique is employed to shed less
important data, so that the system still sustains high query accuracy
with limited memory resources. Experimental results based
on a real Twitter feed and Bing Mobile spatial search queries show
that GeoTrend scales to arrival rates of up to 50,000
microblogs/second, with an average query latency of 3 milliseconds and at
least 90% query accuracy even under limited memory resources.
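
To make the query semantics concrete, the following is a minimal sketch of a top-k trending query over a rectangular region and the last T time units. It is not GeoTrend's index: it scans a single in-memory queue instead of using a spatial index with materialized top-k lists, and it uses plain keyword frequency as the trending measure, which is only one of the definitions the abstract alludes to. The names `Microblog`, `SlidingWindowStore`, and `top_k`, and the assumption that posts arrive in timestamp order, are hypothetical and chosen for illustration.

```python
from collections import Counter, deque
from dataclasses import dataclass
from typing import Deque, List, Tuple

# (min_lat, min_lon, max_lat, max_lon); a rectangle stands in for an
# arbitrary spatial region in this sketch.
Region = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Microblog:
    timestamp: float            # arrival time, in the same units as T
    lat: float
    lon: float
    keywords: Tuple[str, ...]


class SlidingWindowStore:
    """Hypothetical in-memory store that keeps only the last T time units."""

    def __init__(self, window: float) -> None:
        self.window = window                    # T
        self.items: Deque[Microblog] = deque()  # oldest post at the left

    def insert(self, post: Microblog) -> None:
        # Assumption: posts arrive in non-decreasing timestamp order.
        self.items.append(post)
        self.expire(post.timestamp)

    def expire(self, now: float) -> None:
        # Drop posts that are older than the last T time units.
        while self.items and self.items[0].timestamp < now - self.window:
            self.items.popleft()

    def top_k(self, region: Region, k: int, now: float) -> List[Tuple[str, int]]:
        # Brute-force scan; frequency is the assumed trending measure.
        self.expire(now)
        min_lat, min_lon, max_lat, max_lon = region
        counts: Counter = Counter()
        for p in self.items:
            if min_lat <= p.lat <= max_lat and min_lon <= p.lon <= max_lon:
                counts.update(set(p.keywords))  # count each keyword once per post
        # Highest count first; ties broken alphabetically for determinism.
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


store = SlidingWindowStore(window=60.0)
store.insert(Microblog(0.0, 44.97, -93.26, ("snow",)))
store.insert(Microblog(30.0, 44.98, -93.25, ("snow", "traffic")))
store.insert(Microblog(50.0, 44.96, -93.27, ("traffic",)))
store.insert(Microblog(55.0, 40.71, -74.00, ("parade",)))   # outside the region
store.insert(Microblog(70.0, 44.97, -93.26, ("traffic",)))  # expires the first post

result = store.top_k(region=(44.9, -93.4, 45.1, -93.1), k=2, now=70.0)
print(result)  # [('traffic', 3), ('snow', 1)]
```

Each query in this sketch costs time linear in the number of retained posts, which is the cost that a spatial index with precomputed per-region top-k lists is meant to avoid.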