Spike Detection, Social Media, Twitter, Analytics, Peak Detection, Event Detection
We evaluate the effectiveness of three peak detection algorithms applied to a collection of social media datasets. Each dataset comprises a year's worth of tweets relating to a topic, converted to a time series of hourly tweet volumes. The objective of the analysis is to identify abnormal surges of communication, which are taken to indicate the occurrence of events relevant to the topic under consideration. Ground truth was established by manually tagging each time series to mark the peaks apparent to a human operator. The candidate algorithms were then evaluated by the precision, recall, and F1 scores obtained when their output was compared with the manually identified peaks. A general-purpose algorithm is found to perform reasonably well, but seasonality in social media data limits the effectiveness of simple algorithms applied without filtering.
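The evaluation procedure described above can be illustrated with a minimal sketch. It is not one of the three algorithms studied here: `detect_peaks` is a hypothetical rolling-threshold detector (mean plus `k` standard deviations over the preceding `window` hours), and the `tolerance` used to match a detected peak to a manually tagged one is likewise an assumption, since the matching rule is not specified in this summary. Only the precision, recall, and F1 definitions are standard.

```python
from typing import List, Tuple


def detect_peaks(volumes: List[float], window: int = 24, k: float = 3.0) -> List[int]:
    """Hypothetical baseline: flag hour i when its volume exceeds
    mean + k * std of the preceding `window` hours."""
    peaks = []
    for i in range(window, len(volumes)):
        history = volumes[i - window:i]
        mean = sum(history) / window
        std = (sum((v - mean) ** 2 for v in history) / window) ** 0.5
        if volumes[i] > mean + k * std:
            peaks.append(i)
    return peaks


def score(detected: List[int], truth: List[int], tolerance: int = 1) -> Tuple[float, float, float]:
    """Precision, recall, and F1 of detected peak indices against manual tags.

    A detection counts as a true positive if it lies within `tolerance`
    hours of a not-yet-matched tagged peak (assumed matching rule).
    """
    unmatched = set(truth)
    true_positives = 0
    for d in detected:
        match = next((t for t in sorted(unmatched) if abs(t - d) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Synthetic hourly volumes: a flat baseline with one surge at hour 24.
hourly_volumes = [10.0] * 24 + [100.0] + [10.0] * 10
detected_peaks = detect_peaks(hourly_volumes)
precision, recall, f1 = score(detected_peaks, truth=[24])
```

A detector of this kind has no notion of daily or weekly cycles, so on real tweet volumes it would be expected to react to regular seasonal rises as well as to genuine events, which is the limitation noted above for simple algorithms applied without filtering.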