Abstract:
Social media is an ever-evolving, web-based platform for sharing thoughts, opinions, ideas and other content. Among social media networks, Twitter has become one of the most popular social networking and micro-blogging sites, allowing users to share their thoughts with a massive audience. In recent years, determining the social provenance of a piece of information published on social media has become a critical challenge. Like data provenance, social provenance describes the ownership and origin of such information. It helps to clarify opinions and thereby avoid rumors, supports investigations, and explains how, when and by whom the information was created. In this paper, we present a Zero-Information Loss Graph Database (ZILGDB) based Provenance Framework for Twitter data and show its applicability to terrorist attack investigation, where it identifies suspicious persons and their linked community. The framework supports provenance analysis through visualization and captures provenance information for historical data queries, standing queries, and queries through time. We evaluate the performance of the framework in terms of provenance query execution time and provenance capturing overhead for a query set.