Multimedia Communication

Vignesh V Menon

2022 NAB Broadcast Engineering and Information Technology (BEIT) Conference

April 24-26, 2022 | Las Vegas, US

Conference Website

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Feldmann (Bitmovin, Klagenfurt), Adithyan Ilangovan (Bitmovin, Klagenfurt), Martin Smole (Bitmovin, Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt).

Abstract:

Current per-title encoding schemes encode the same video content at various bitrates and spatial resolutions to find optimal bitrate-resolution pairs (known as the bitrate ladder) for each video content in Video on Demand (VoD) applications. In live streaming applications, however, a fixed bitrate ladder is used for simplicity and efficiency, avoiding the additional latency of finding optimized bitrate-resolution pairs for every video content. An optimized bitrate ladder may nevertheless result in (i) reduced storage and network resource consumption and/or (ii) an increased Quality of Experience (QoE). In this paper, we propose Live-PSTR, a fast and efficient per-title encoding scheme tailor-made for live Ultra High Definition (UHD) High Framerate (HFR) streaming. It includes a pre-processing step in which Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features are used to determine the complexity of each video segment, based on which the optimized encoding resolution and framerate for streaming at every target bitrate are determined. Experimental results show that, on average, Live-PSTR yields bitrate savings of 9.46% and 11.99% to maintain the same PSNR and VMAF scores, respectively, compared to the HTTP Live Streaming (HLS) bitrate ladder.

Architecture of Live-PSTR
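To make the pre-processing step concrete, the following is a minimal Python sketch of how a segment's spatial complexity E and temporal complexity h could drive the choice of resolution and framerate per target bitrate. The thresholds, candidate ladder, and normalization below are hypothetical and do not reflect the actual Live-PSTR models described in the paper.

```python
# Illustrative sketch (not the actual Live-PSTR model): choose an encoding
# resolution and framerate for one segment per target bitrate, driven by the
# segment's spatial complexity E and temporal complexity h. E and h are
# assumed normalized to [0, 1]; thresholds and the ladder are hypothetical.

RESOLUTION_LADDER = [(5000, 2160), (2500, 1440), (1200, 1080),
                     (600, 720), (300, 540), (0, 360)]   # (min kbps, height)

def select_encoding(target_bitrate_kbps: float, E: float, h: float):
    """Return a (resolution, framerate) pair for one live segment."""
    # Higher spatial complexity needs more bits per pixel, so it pushes the
    # decision toward a lower resolution at a fixed target bitrate.
    effective_rate = target_bitrate_kbps * (1.0 - 0.5 * min(E, 1.0))
    resolution = next(height for min_kbps, height in RESOLUTION_LADDER
                      if effective_rate >= min_kbps)
    # High temporal complexity at low bitrates causes heavy compression
    # artifacts, so fall back to a lower framerate in that regime.
    if target_bitrate_kbps < 2000 and h > 0.6:
        framerate = 30
    elif target_bitrate_kbps < 6000 and h > 0.3:
        framerate = 60
    else:
        framerate = 120
    return resolution, framerate
```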

As a Valentine’s Day gift to video coding enthusiasts across the globe, we release Video Complexity Analyzer (VCA) version 1.0 as open-source software on February 14, 2022. The primary objective of VCA is to become the best spatial and temporal complexity predictor for every frame, video segment, and video, which aids in predicting encoding parameters for applications such as scene-cut detection and online per-title encoding. VCA leverages x86 SIMD and multi-threading optimizations for effective performance. While VCA is primarily designed as a video complexity analyzer library, a command-line executable is provided to facilitate testing and development. We expect VCA to be utilized in many leading video encoding solutions in the coming years.

VCA is available as an open-source library, published under the GPLv3 license. For more details, please visit the software's online documentation here. The source code can be found here.
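For readers who want a feel for the features VCA extracts, the following is a simplified NumPy/SciPy sketch of DCT-energy-based spatial and temporal complexity. The actual library uses optimized SIMD integer kernels and a specific coefficient weighting, so these values will not match VCA's output.

```python
# Simplified sketch of DCT-energy-based complexity features in the spirit of
# VCA; the real library uses optimized SIMD kernels and a specific weighting
# of DCT coefficients, so these values will not match VCA's output exactly.
import numpy as np
from scipy.fft import dctn

def block_texture_energy(frame: np.ndarray, block: int = 32) -> np.ndarray:
    """Per-block texture energy: sum of absolute AC DCT coefficients (luma)."""
    h, w = frame.shape
    energies = np.zeros((h // block, w // block))
    for by in range(h // block):
        for bx in range(w // block):
            blk = frame[by*block:(by+1)*block,
                        bx*block:(bx+1)*block].astype(np.float64)
            coeffs = dctn(blk, norm='ortho')
            coeffs[0, 0] = 0.0                      # drop the DC term
            energies[by, bx] = np.abs(coeffs).sum()
    return energies

def spatial_complexity(frame: np.ndarray) -> float:
    """E: average block texture energy of a frame."""
    return float(block_texture_energy(frame).mean())

def temporal_complexity(frame: np.ndarray, prev_frame: np.ndarray) -> float:
    """h: average absolute difference of block energies between frames."""
    return float(np.abs(block_texture_energy(frame)
                        - block_texture_energy(prev_frame)).mean())
```

Visualizing these per-block energies over a frame yields heatmaps like the ones below.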

Heatmap of spatial complexity (E)

Heatmap of temporal complexity (h)

ACM Mile-High Video 2022 (MHV)

March 01-03, 2022 | Denver, CO, USA

Conference Website

Authors: Minh Nguyen (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Austria), Stefan Pham (Fraunhofer FOKUS, Germany), Daniel Silhavy (Fraunhofer FOKUS, Germany), Ali C. Begen (Ozyegin University, Turkey)

Abstract: With the introduction of HTTP/3 (H3) and QUIC at its core, there is an expectation of significant improvements in Web-based secure object delivery. As HTTP is a central protocol to the current adaptive streaming methods in all major over-the-top (OTT) services, an important question is what H3 will bring to the table for such services. To answer this question, we present the new features of H3 and QUIC, and compare them to those of HTTP/1.1 and HTTP/2 over TCP. We also share the latest research findings in this domain.

Keywords: HTTP adaptive streaming, QUIC, CDN, ABR, OTT, DASH, HLS.

ACM Mile-High Video 2022 (MHV)

March 01-03, 2022 | Denver, CO, USA

Conference Website

Minh Nguyen (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Ekrem Çetinkaya (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hermann Hellwagner (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

Abstract: The advancement of mobile hardware in recent years has made it possible to apply deep neural network (DNN)-based approaches on mobile devices. This paper introduces a lightweight super-resolution (SR) network, namely SR-ABR Net, deployed on mobile devices to upscale low-resolution/low-quality videos, and a novel adaptive bitrate (ABR) algorithm, namely WISH-SR, that leverages SR networks at the client to improve the video quality depending on the client’s context. WISH-SR takes into account mobile device properties, video characteristics, and user preferences. Experimental results show that the proposed SR-ABR Net can improve the video quality compared to traditional SR approaches while running in real time. Moreover, the proposed WISH-SR can significantly boost the visual quality of the delivered content while reducing both bandwidth consumption and the number of stalling events.

Keywords: Super-resolution, Deep Neural Networks, Mobile Devices, ABR
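As a rough illustration of the interplay between client-side SR and ABR (this is not the actual WISH-SR algorithm), the sketch below treats real-time SR as a quality bonus for low-resolution renditions, which can shift the ABR decision toward lower bitrates. The gain value, SR input limit, and data types are hypothetical.

```python
# Toy sketch of an SR-aware ABR decision (not the actual WISH-SR algorithm):
# when the client can super-resolve a rendition in real time, its effective
# quality rises, so a lower-bitrate rendition may become the best choice
# under the current throughput estimate.
from dataclasses import dataclass

@dataclass
class Rendition:
    bitrate_kbps: float
    resolution: int        # height in pixels
    base_quality: float    # e.g., an offline VMAF estimate

def effective_quality(r: Rendition, sr_gain: float, sr_max_input: int) -> float:
    """Quality after client-side SR; sr_gain and sr_max_input are device-dependent."""
    if r.resolution <= sr_max_input:      # SR network can run in real time
        return min(100.0, r.base_quality + sr_gain)
    return r.base_quality

def choose_rendition(ladder, throughput_kbps, sr_gain=8.0, sr_max_input=720):
    """Pick the feasible rendition with the best post-SR quality."""
    feasible = [r for r in ladder if r.bitrate_kbps <= throughput_kbps]
    if not feasible:
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(feasible, key=lambda r: effective_quality(r, sr_gain, sr_max_input))
```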

Hadi Amirpour

On Tuesday, the 25th of January 2022, Hadi Amirpour successfully defended his Ph.D. thesis under the supervision of Assoc.-Prof. DI Dr. Christian Timmerer and Assoc.-Prof. Dr. Klaus Schöffmann. The defense was chaired by Assoc.-Prof. DI Dr. Mathias Lux, and the examiners were Emeritus Prof. Dr. Mohammad Ghanbari (University of Essex, UK) and Univ.-Prof. DI Dr. Hermann Hellwagner (University of Klagenfurt).

We are pleased to congratulate Dr. Hadi Amirpour on passing his Ph.D. exam!

Vignesh V Menon

2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

May 22-27, 2022 | Singapore

Conference Website

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt),  Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt).

Abstract:

Current per-title encoding schemes encode the same video content at various bitrates and spatial resolutions to find an optimal bitrate ladder for each video content in Video on Demand (VoD) applications. However, in live streaming applications, a fixed resolution-bitrate ladder is used to avoid the additional encoding time complexity of finding optimum resolution-bitrate pairs for every video content. This paper introduces an online per-title encoding scheme (OPTE) for live video streaming applications. In this scheme, each target bitrate’s optimal resolution is predicted from a pre-defined set of resolutions using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment. Experimental results show that, on average, OPTE yields bitrate savings of 20.45% and 28.45% to maintain the same PSNR and VMAF, respectively, compared to a fixed bitrate ladder scheme (as adopted in current live streaming deployments) without any noticeable additional latency in streaming.

Keywords:

Per-title encoding, live streaming, bitrate ladder, convex-hull prediction
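For context, the convex hull that OPTE predicts can be illustrated with a small sketch: given (bitrate, quality) points measured per resolution, the bitrate ladder follows the upper envelope of those curves. OPTE predicts the winning resolution per bitrate from the DCT-energy features instead of running the exhaustive test encodes sketched below; the numbers are hypothetical.

```python
# Sketch of a per-title convex hull built exhaustively; OPTE predicts the
# winning resolution per bitrate from low-complexity features instead of
# running these test encodes. All numbers below are hypothetical.
rd_points = {
    # resolution: [(bitrate_kbps, vmaf), ...] from test encodes
    2160: [(16000, 96), (8000, 90), (4000, 78)],
    1080: [(8000, 88), (4000, 82), (2000, 70)],
    720:  [(4000, 76), (2000, 72), (1000, 60)],
}

def convex_hull_ladder(targets_kbps):
    """For each target bitrate, pick the resolution with the best quality."""
    ladder = []
    for target in targets_kbps:
        best = max(
            ((res, q) for res, pts in rd_points.items()
             for b, q in pts if b <= target),
            key=lambda x: x[1],
            default=(720, 0),
        )
        ladder.append((target, best[0]))
    return ladder

print(convex_hull_ladder([1000, 2000, 4000, 8000, 16000]))
```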

IEEE Transactions on Multimedia

Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt), Abdelhak Bentaleb (National University of Singapore), Alireza Erfanian (Alpen-Adria-Universität Klagenfurt), Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt), and Roger Zimmermann (National University of Singapore).

Abstract: While most of the HTTP adaptive streaming (HAS) traffic continues to be video-on-demand (VoD), more users have started generating and delivering live streams with high quality through popular online streaming platforms. Typically, the video content is generated by streamers and watched by large audiences that are geographically distributed far from the streamers’ locations. This geographic spread creates a significant challenge in delivering HAS-based live streams with low latency and high quality, as any problem in the delivery paths results in a reduced viewer experience. In this paper, we propose HxL3, a novel architecture for low-latency live streaming. HxL3 is agnostic to the protocol and codecs and can work equally well with existing HAS-based approaches. By holding the minimum number of live media segments through efficient caching and prefetching policies at the edge, improved transmissions, as well as transcoding capabilities, HxL3 is able to achieve a high viewer experience across the Internet by alleviating rebuffering and substantially reducing initial startup delay and live stream latency. HxL3 can be easily deployed and used. Its performance has been evaluated using real live stream sources and entities that are distributed worldwide. Experimental results show the superiority of the proposed architecture and give good insights into how low-latency live streaming works.

Index Terms: Live streaming, HAS, DASH, HLS, CMAF, edge computing, low latency, caching, prefetching, transcoding.
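As a rough illustration of the edge-side behavior described above (not HxL3 itself), the sketch below keeps only the last few live segments in an edge cache and prefetches the next expected segment so viewers stay close to the live edge. The class and its interface are hypothetical.

```python
# Minimal sketch (not HxL3 itself) of the edge-side idea the paper builds on:
# hold only the last few live segments in the edge cache and prefetch the
# next expected segment so viewers are served close to the live edge.
from collections import OrderedDict

class LiveEdgeCache:
    def __init__(self, origin_fetch, max_segments: int = 3):
        self.origin_fetch = origin_fetch        # callable: seg_no -> bytes
        self.max_segments = max_segments
        self.cache = OrderedDict()              # seg_no -> bytes

    def _store(self, seg_no: int, data: bytes) -> None:
        self.cache[seg_no] = data
        while len(self.cache) > self.max_segments:
            self.cache.popitem(last=False)      # evict the oldest segment

    def get(self, seg_no: int) -> bytes:
        if seg_no not in self.cache:            # cache miss: fetch from origin
            self._store(seg_no, self.origin_fetch(seg_no))
        data = self.cache[seg_no]
        if seg_no + 1 not in self.cache:        # prefetch the next live segment
            self._store(seg_no + 1, self.origin_fetch(seg_no + 1))
        return data
```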

IEEE International Conference on Communications (ICC)

May 16–20, 2022 | Seoul, South Korea

Conference Website

Reza Farahani (Alpen-Adria-Universität Klagenfurt),  Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt).

Abstract: With the emerging demands of high-definition and low-latency video streams, HTTP Adaptive Streaming (HAS) is considered the principal video delivery technology over the Internet. Network-assisted video streaming schemes, which employ modern networking paradigms, e.g., Software-Defined Networking (SDN), Network Function Virtualization (NFV), and edge computing, have been introduced as promising complementary solutions in the HAS context to improve users’ Quality of Experience (QoE) as well as network utilization. However, the existing network-assisted HAS schemes have not fully used edge collaboration techniques and SDN capabilities for achieving the aforementioned aims. To bridge this gap, this paper introduces a coLlaborative Edge- and SDN-Assisted framework for HTTP aDaptive vidEo stReaming (LEADER). In LEADER, the SDN controller collects various items of information and runs a central optimization model that minimizes the HAS clients’ serving time, subject to the network’s and edge servers’ resource constraints. Due to the NP-completeness and impractical overheads of the central optimization model, we propose an online distributed lightweight heuristic approach consisting of two phases that run over the SDN controller and edge servers, respectively. We implement the proposed framework, conduct our experiments on a large-scale testbed including 250 HAS players, and compare its effectiveness with other strategies. The experimental results demonstrate that LEADER outperforms baseline schemes in terms of both users’ QoE and network utilization, by at least 22% and 13%, respectively.

Keywords:

Dynamic Adaptive Streaming over HTTP (DASH), Network-Assisted Video Streaming, Video Transcoding, Quality of Experience (QoE), Software-Defined Networking (SDN), Network Function Virtualization (NFV), Edge Computing, Edge Collaboration
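To give a flavor of the heuristic phase (this is not the paper's actual formulation), the sketch below greedily assigns each client request to the serving action with the lowest estimated serving time under simple edge egress and transcoding budgets. All field names and constants are hypothetical.

```python
# Hedged sketch in the spirit of LEADER's heuristic phase (not the paper's
# actual formulation): greedily assign each client request to the serving
# action with the lowest estimated serving time, while respecting simple
# edge egress and transcoding-capacity budgets.
def assign_requests(requests, edge):
    """
    requests: list of dicts {'size_mbit': float, 'cached': bool, 'transcodable': bool}
    edge:     dict {'edge_bw_mbps': float, 'edge_budget_mbit': float,
                    'transcode_slots': int, 'origin_bw_mbps': float,
                    'origin_rtt_s': float}
    """
    plan = []
    budget_left = edge['edge_budget_mbit']      # egress budget for this window
    slots_left = edge['transcode_slots']
    for req in requests:
        # Serving from the origin is always possible but slower.
        options = [('origin', req['size_mbit'] / edge['origin_bw_mbps']
                    + edge['origin_rtt_s'])]
        if req['cached'] and budget_left >= req['size_mbit']:
            options.append(('edge_cache',
                            req['size_mbit'] / edge['edge_bw_mbps']))
        if req['transcodable'] and slots_left > 0 and budget_left >= req['size_mbit']:
            # Transcoding a cached higher-quality copy adds a fixed delay here.
            options.append(('edge_transcode',
                            req['size_mbit'] / edge['edge_bw_mbps'] + 0.2))
        action, _ = min(options, key=lambda o: o[1])
        if action != 'origin':
            budget_left -= req['size_mbit']
        if action == 'edge_transcode':
            slots_left -= 1
        plan.append((req, action))
    return plan
```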

Vignesh V Menon

Data Compression Conference (DCC)

March 22-25, 2022 | Snowbird, Utah, US

Conference Website

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt),  Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt).

Abstract:

High Framerate (HFR) video streaming enhances the viewing experience and improves visual clarity. However, it may lead to an increase in both encoding time complexity and compression artifacts at lower bitrates. To address this challenge, this paper proposes a content-aware frame dropping algorithm (CODA) to drop frames uniformly in every video (segment) according to the target bitrate and the video characteristics. The algorithm uses Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features to determine the video properties and then predicts the optimized framerate that yields the highest compression efficiency. The effectiveness of CODA is evaluated with High Efficiency Video Coding (HEVC) bitstreams generated by the open-source x265 HEVC encoder. Experimental results show that, on average, CODA reduces the overall Ultra High Definition (UHD) encoding time by 21.82% with bitrate savings of 15.87% and 18.20% to maintain the same PSNR and VMAF scores, respectively, compared to the original framerate encoding.
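As a rough illustration (not CODA's actual trained model), the sketch below predicts a segment's framerate from its temporal complexity and the target bitrate and then drops frames uniformly; the thresholds and normalization are hypothetical.

```python
# Illustrative sketch (not CODA's actual model): predict an encoding
# framerate for a segment from its temporal complexity h (assumed normalized
# to [0, 1]) and the target bitrate, then drop frames uniformly.
def predict_framerate(h: float, target_bitrate_kbps: float,
                      source_fps: int = 120) -> int:
    """Hypothetical rule: high motion at low bitrates favors fewer frames."""
    if target_bitrate_kbps < 2000 and h > 0.6:
        return source_fps // 4
    if target_bitrate_kbps < 6000 and h > 0.3:
        return source_fps // 2
    return source_fps

def drop_frames_uniformly(frames: list, source_fps: int, target_fps: int) -> list:
    """Keep every (source_fps // target_fps)-th frame."""
    step = max(source_fps // max(target_fps, 1), 1)
    return frames[::step]
```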

Vignesh V Menon

Vignesh V Menon and Hadi Amirpour gave a talk on ‘Video Complexity Analyzer for Streaming Applications’ at the Video Quality Experts Group (VQEG) meeting on December 14, 2021. The talk presented our research activities on video complexity analysis.

The link to the presentation can be found here (pdf).