Real-time, distributed, social media, search engine

Data streams from real-time sources (e.g. Twitter, Facebook and other social media sites) are constantly updated with large volumes of ephemeral posts. In this context, a framework is needed for crawling, processing, indexing and retrieving ephemeral content. A storage mechanism will be implemented for storing time-related content descriptors and other metadata. The indexing structures will provide real-time access to content as it is pushed into the system. A distributed crawling mechanism will be implemented to handle the huge incoming content load. Time-related queries will be supported, both for fast real-time filtering of content based on similarity metrics and for range queries within a specific time window. A trend prediction module will track positive and negative trends in the available content. Both popular and unpopular content will be used to provide recommendations to end users. All framework modules will expose their functionality through appropriate web service interfaces.
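As an illustration of the time-related queries described above, the following is a minimal in-memory sketch, not the framework's actual implementation. It assumes each post carries a timestamp and a numeric content descriptor, and it uses cosine similarity as a stand-in for the unspecified similarity metric. The names `Post`, `TimeIndex`, `push`, `range_query` and `similar_in_window` are hypothetical and chosen only for this example.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Post:
    timestamp: float          # seconds since epoch (assumed unit)
    text: str
    descriptor: tuple = ()    # hypothetical numeric content descriptor


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TimeIndex:
    """Keeps posts sorted by timestamp so a time window is found by binary search."""

    def __init__(self):
        self._times = []   # sorted timestamps
        self._posts = []   # posts, in the same order as _times

    def push(self, post):
        # Posts become queryable as soon as they are pushed.
        i = bisect.bisect_right(self._times, post.timestamp)
        self._times.insert(i, post.timestamp)
        self._posts.insert(i, post)

    def range_query(self, start, end):
        """Return posts with start <= timestamp <= end, oldest first."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end)
        return self._posts[lo:hi]

    def similar_in_window(self, query, start, end, threshold):
        """Filter a time window by similarity to a query descriptor."""
        return [p for p in self.range_query(start, end)
                if cosine(p.descriptor, query) >= threshold]
```

A caller would push posts as they arrive and then call `range_query(start, end)` for a plain time window, or `similar_in_window(query, start, end, threshold)` to keep only the posts in that window whose descriptor is close enough to the query. A real deployment would need a persistent, distributed index rather than Python lists; the sketch only shows the shape of the two query types.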

Demonstrations

Relevant Project

CUBRIK
Human-enhanced time-aware multimedia search

 

Relevant Publications

T. Semertzidis, D. Rafailidis, E. Tiakas, M. G. Strintzis, P. Daras, “Multimedia Indexing, Search and Retrieval in Large Databases of Social Networks”, Social Media Retrieval, Computer Communications and Networks series, Springer 2012, ISBN 978-1-4471-4554-7, November 30, 2012