Paper accepted @ IEEE/IFIP Network Operations and Management Symposium (NOMS) 2026
YTLive: A Dataset of Real-World YouTube Live Streaming Sessions
IEEE/IFIP Network Operations and Management Symposium (NOMS) 2026
Rome, Italy, 18–22 May 2026
Mojtaba Mozhganfar (University of Tehran), Pooya Jamshidi (University of Tehran), Seyyed Ali Aghamiri (University of Tehran), Mohsen Ghasemi (Sharif University of Technology), Mahdi Dolati (Sharif University of Technology), Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt), Ahmad Khonsari (University of Tehran), Christian Timmerer (Alpen-Adria-Universität Klagenfurt)
Abstract
Live streaming plays a major role in today’s digital platforms, supporting entertainment, education, social media, and more. However, research in this field is limited by the lack of large, publicly available datasets that capture real-time viewer behavior at scale. To address this gap, we introduce YTLive, a public dataset focused on YouTube Live. Collected through the YouTube Researcher Program over May and June 2024, YTLive includes more than 507,000 records from 12,156 live streams, tracking concurrent viewer counts at five-minute intervals along with precise broadcast durations. We describe the dataset design and collection process and present an initial analysis of temporal viewing patterns. Results show that viewer counts are higher and more stable on weekends, especially during afternoon hours. Shorter streams attract larger and more consistent audiences, while longer streams tend to grow slowly and exhibit greater variability. These insights have direct implications for adaptive streaming, resource allocation, and Quality of Experience (QoE) modeling. YTLive offers a timely, open resource to support reproducible research and system-level innovation in live streaming. The dataset is publicly available at: https://github.com/ghalandar/YTLive.
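As a rough illustration of the weekday versus weekend comparison mentioned in the abstract, the sketch below groups five-minute viewer-count samples by day type and reports the mean and standard deviation for each group. It is not the authors' analysis code. The record layout (stream ID, ISO timestamp, concurrent viewers), the sample values, the function name `summarize_by_day_type`, and the Saturday/Sunday definition of "weekend" are all assumptions made for illustration; the actual field names and conventions should be checked in the dataset repository.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

# Hypothetical records: (stream_id, ISO timestamp, concurrent viewers).
# The real YTLive schema may differ; these values are made up for the example.
SAMPLE = [
    ("a", "2024-05-18T14:00:00", 120),  # Saturday
    ("a", "2024-05-18T14:05:00", 130),
    ("b", "2024-05-20T14:00:00", 80),   # Monday
    ("b", "2024-05-20T14:05:00", 100),
]


def summarize_by_day_type(records):
    """Return {"weekday"/"weekend": (mean, population std)} of viewer counts."""
    groups = defaultdict(list)
    for _stream_id, timestamp, viewers in records:
        day = datetime.fromisoformat(timestamp).weekday()  # Monday=0 ... Sunday=6
        # Assumption: weekend means Saturday and Sunday.
        key = "weekend" if day >= 5 else "weekday"
        groups[key].append(viewers)
    return {key: (mean(values), pstdev(values)) for key, values in groups.items()}


if __name__ == "__main__":
    for day_type, (avg, std) in summarize_by_day_type(SAMPLE).items():
        print(f"{day_type}: mean={avg:.1f} std={std:.1f}")
```

A lower standard deviation relative to the mean within a group would correspond to the "more stable" viewership described above; with real data, the same grouping could be repeated per hour of day to examine the afternoon effect.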

