Multimedia Communication

IEEE Open Journal of Signal Processing

Authors: Ekrem Çetinkaya (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt, University of Essex)

Abstract: Video streaming applications have been receiving increasing attention over the years, and HTTP Adaptive Streaming (HAS) has become the de facto solution for video delivery over the Internet. In HAS, each video is encoded at multiple quality levels and resolutions (i.e., representations) to enable adaptation of the streaming session to the client's viewing and network conditions. This requirement brings encoding challenges with it, e.g., a video source should be encoded efficiently at multiple bitrates and resolutions. Fast multi-rate encoding approaches aim to address this challenge of encoding multiple representations from a single video by re-using information from already encoded representations. In this paper, a convolutional neural network is used to speed up both multi-rate and multi-resolution encoding for HAS. For multi-rate encoding, the lowest bitrate representation is chosen as the reference. For multi-resolution encoding, the highest bitrate representation from the lowest resolution is chosen as the reference. Pixel values from the target resolution and encoding information from the reference representation are used to predict Coding Tree Unit (CTU) split decisions in High Efficiency Video Coding (HEVC) for the dependent representations. Experimental results show that the proposed method for multi-rate encoding can reduce the overall encoding time by 15.08% and the parallel encoding time by 41.26%, with a 0.89% bitrate increase compared to the HEVC reference software. Simultaneously, the proposed method for multi-resolution encoding can reduce the overall encoding time by 46.27% and the parallel encoding time by 27.71% on average, with a 2.05% bitrate increase.
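To illustrate the core idea, a minimal PyTorch sketch is given below; the layer sizes, input channels, and output formulation are illustrative assumptions, not the exact network proposed in the paper. It takes a CTU's luma samples from the target representation together with the co-located CU-depth map from the reference encoding and predicts a split probability for each HEVC depth level.

# Illustrative sketch (not the paper's exact architecture): a small CNN that takes
# the 64x64 luma block of a CTU from the target representation plus the co-located
# CU-depth map from the reference encoding, and predicts the split decision at each
# of the three HEVC depth levels (64->32, 32->16, 16->8).
import torch
import torch.nn as nn

class CTUSplitNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                        # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                # global pooling
        )
        self.classifier = nn.Linear(64, 3)          # one split flag per depth level

    def forward(self, luma, ref_depth):
        # luma, ref_depth: (N, 1, 64, 64) tensors, normalized to [0, 1]
        x = torch.cat([luma, ref_depth], dim=1)
        x = self.features(x).flatten(1)
        return torch.sigmoid(self.classifier(x))    # split probabilities per depth

# Example: batch of 8 CTUs with random inputs
probs = CTUSplitNet()(torch.rand(8, 1, 64, 64), torch.rand(8, 1, 64, 64))
print(probs.shape)  # torch.Size([8, 3])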

Keywords: HTTP Adaptive Streaming, HEVC, Multirate Encoding, Machine Learning


Conference info: Picture Coding Symposium (PCS), 29 June-2 July 2021, Bristol, UK

Conference Website: https://pcs2021.org

Authors: Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: Since video accounts for the majority of today's internet traffic, the popularity of HTTP Adaptive Streaming (HAS) is increasing steadily. In HAS, each video is encoded at multiple bitrates and spatial resolutions (i.e., representations) to adapt to heterogeneous network conditions, device characteristics, and end-user preferences. Most streaming services utilize cloud-based encoding techniques, which enable a fully parallel encoding process to speed up encoding and consequently reduce the overall time complexity. State-of-the-art approaches further improve the encoding process by utilizing encoder analysis information from already encoded representation(s) to reduce the encoding time of the remaining representations. In this paper, we investigate various multi-encoding algorithms (i.e., multi-rate and multi-resolution) and propose novel multi-encoding algorithms for large-scale HTTP Adaptive Streaming deployments. Experimental results demonstrate that the proposed multi-encoding algorithm optimized for the highest compression efficiency reduces the overall encoding time by 39% with a 1.5% bitrate increase compared to stand-alone encodings. Its version optimized for the highest time savings reduces the overall encoding time by 50% with a 2.6% bitrate increase compared to stand-alone encodings.
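The sketch below illustrates the general multi-encoding workflow in a hedged way (it is not the proposed algorithm): one reference representation is encoded first, and the remaining representations of the bitrate ladder are then encoded fully in parallel, each reusing the reference's analysis data. The ladder, the encode() placeholder, and the reuse payload are hypothetical.

# Hedged sketch of a generic multi-encoding workflow (not the paper's exact algorithm):
# encode one reference representation first, then encode the remaining representations
# in parallel, each reusing the reference's analysis data (e.g., CU depths).
from concurrent.futures import ProcessPoolExecutor

LADDER = [(360, 500), (720, 1500), (1080, 3000), (1080, 4500)]  # (height, kbps), illustrative

def encode(height, bitrate_kbps, analysis=None):
    """Placeholder for an encoder call; 'analysis' would carry reference CU depths."""
    # ... run x265/HM here, optionally constrained by 'analysis' ...
    return {"height": height, "kbps": bitrate_kbps, "analysis": "reused" if analysis else "self"}

def multi_encode(reference=(720, 1500)):
    ref_result = encode(*reference)                 # reference encoded first (serial step)
    dependents = [r for r in LADDER if r != reference]
    with ProcessPoolExecutor() as pool:             # dependents encoded fully in parallel
        futures = [pool.submit(encode, h, b, ref_result) for h, b in dependents]
        return [ref_result] + [f.result() for f in futures]

if __name__ == "__main__":
    for result in multi_encode():
        print(result)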

Keywords: HTTP Adaptive Streaming, HEVC, Multi-rate Encoding, Multi-encoding.

Conference info: NOSSDAV’21: The 31st edition of the Workshop on Network and Operating System Support for Digital Audio and Video, Sept. 28-Oct. 1, 2021, Istanbul, Turkey

Conference Website: https://nossdav.org/2021/

Authors: Reza Farahani (Alpen-Adria-Universität Klagenfurt), Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt), Alireza Erfanian (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK) and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, clients have full control over the media streaming and adaptation processes. Lack of coordination among the clients and lack of awareness of the network conditions may lead to sub-optimal user experience and resource utilization in a pure client-based HAS adaptation scheme. Software-Defined Networking (SDN) has recently been considered to enhance the video streaming process. In this paper, we leverage the capabilities of SDN and Network Function Virtualization (NFV) to introduce an edge- and SDN-assisted video streaming framework called ES-HAS. We employ virtualized edge components to collect HAS clients' requests and retrieve networking information in a time-slotted manner. These components then run an optimization model in each time slot to efficiently serve clients' requests by selecting an optimal cache server (i.e., the one with the shortest fetch time). In case of a cache miss, a client's request is served (i) with an optimal replacement quality (only better quality levels with minimum deviation) from a cache server, or (ii) with the originally requested quality level from the origin server. This approach is validated through experiments on a large-scale testbed, and the performance of our framework is compared to pure client-based strategies and the SABR system [11]. Although SABR and ES-HAS show (almost) identical performance in the number of quality switches, ES-HAS outperforms SABR in terms of playback bitrate and the number of stalls by at least 70% and 40%, respectively.
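As a rough illustration of the serving policy sketched in the abstract, the toy Python function below selects the cache server with the shortest fetch time that holds the requested quality, falls back to the closest higher cached quality on a miss, and otherwise serves from the origin. All names, fetch times, and cache contents are hypothetical; the actual ES-HAS optimization model is considerably more elaborate.

# Toy sketch of the per-slot serving policy (all names and data hypothetical).
def serve(requested_quality, caches):
    """caches: list of dicts like {"name": str, "fetch_time": ms, "qualities": set of levels}."""
    holders = [c for c in caches if requested_quality in c["qualities"]]
    if holders:
        best = min(holders, key=lambda c: c["fetch_time"])        # shortest fetch time wins
        return best["name"], requested_quality
    # Cache miss: pick the replacement quality with the minimum upward deviation.
    candidates = [(min(q for q in c["qualities"] if q > requested_quality), c["fetch_time"], c["name"])
                  for c in caches if any(q > requested_quality for q in c["qualities"])]
    if candidates:
        quality, _, name = min(candidates)
        return name, quality
    return "origin", requested_quality                            # last resort: origin server

caches = [{"name": "cache-A", "fetch_time": 12, "qualities": {1, 2, 3}},
          {"name": "cache-B", "fetch_time": 25, "qualities": {1, 2, 4, 5}}]
print(serve(3, caches))   # ('cache-A', 3): exact hit with the shortest fetch time
print(serve(6, caches))   # ('origin', 6): no cache holds quality 6 or better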

Keywords: Dynamic Adaptive Streaming over HTTP (DASH), Edge Computing, Network-Assisted Video Streaming, Quality of Experience (QoE), Software Defined Networking (SDN), Network Function Virtualization (NFV)

The paper “PSTR: Per-title encoding using Spatio-Temporal Resolutions” has been accepted for publication at the IEEE International Conference on Multimedia and Expo (ICME) 2021, July 5-9, 2021, Shenzhen, China.

Authors: Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: Current per-title encoding schemes encode the same video content (or snippets/subsets thereof) at various bitrates and spatial resolutions to find an optimal bitrate ladder for each video content. Compared to traditional approaches, in which a predefined, content-agnostic (“fit-to-all”) encoding ladder is applied to all video contents, per-title encoding can result in (i) a significant decrease of storage and delivery costs and (ii) an increase in the Quality of Experience. In current per-title encoding schemes, the bitrate ladder is optimized using only spatial resolutions, while we argue that with the emergence of high-framerate videos, this principle can be extended to temporal resolutions as well. In this paper, we improve per-title encoding for each content using spatio-temporal resolutions. Experimental results show that our proposed approach doubles the bitrate savings by considering both temporal and spatial resolutions compared to considering only spatial resolutions.
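A minimal sketch of the underlying ladder-construction idea follows, assuming hypothetical trial encodes: for each target bitrate, the (resolution, framerate) pair with the highest measured quality is selected, instead of searching over spatial resolutions only.

# Illustrative sketch with made-up measurements: per-title ladder construction that
# selects, for each target bitrate, the (height, fps) pair whose trial encode gives
# the highest quality, extending the search from spatial to spatio-temporal resolutions.
# trial[(height, fps)][bitrate_kbps] = measured quality (e.g., VMAF)
trial = {
    (1080, 60): {1000: 72.0, 3000: 88.0, 6000: 95.0},
    (1080, 30): {1000: 75.0, 3000: 89.0, 6000: 93.0},
    (720, 60):  {1000: 78.0, 3000: 86.0, 6000: 90.0},
    (720, 30):  {1000: 80.0, 3000: 84.0, 6000: 87.0},
}

def build_ladder(trial, bitrates):
    ladder = []
    for b in bitrates:
        best = max(trial, key=lambda rf: trial[rf][b])   # best (height, fps) at this bitrate
        ladder.append((b, *best, trial[best][b]))
    return ladder

for kbps, height, fps, vmaf in build_ladder(trial, [1000, 3000, 6000]):
    print(f"{kbps} kbps -> {height}p{fps} (VMAF {vmaf})")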

Keywords: Bitrate ladder, per-title encoding, framerate, spatial resolution.

IEEE International Conference on Multimedia and Expo (ICME), 5-9 July 2021, Shenzhen, China

Authors: Bernhard Rinner, Christian Bettstetter, Hermann Hellwagner, and Stephan Weiss

Abstract: Drones have evolved from bulky research platforms to everyday objects that enable a variety of innovative applications. One of the current challenges is to unite individual drones into an integrated autonomous system. They should operate as a networked team to provide novel functionality that multiple individual drones can never achieve. This article addresses the building blocks of such multidrone systems: wireless connectivity, communication, and coordination. We discuss implementation aspects in three experimental case studies, compare our techniques for improving resource efficiency, and present some “lessons learned” from our research experience in this area.

Conference info: NOSSDAV’21: The 31st edition of the Workshop on Network and Operating System Support for Digital Audio and Video, Sept. 28-Oct. 1, 2021, Istanbul, Turkey

Conference Website: https://nossdav.org/2021/

Authors: Babak Taraghi (Alpen-Adria-Universität Klagenfurt), Abdelhak Bentaleb (National University of Singapore), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Roger Zimmermann (National University of Singapore) and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: Adaptive BitRate (ABR) algorithms play a crucial role in delivering the highest possible viewer's Quality of Experience (QoE) in HTTP Adaptive Streaming (HAS). Online video streaming service providers use HAS – the dominant video streaming technique on the Internet – to deliver the best QoE for their users. Viewers' satisfaction relies heavily on how well a media player's ABR algorithm can adapt the stream's quality to the current network conditions. QoE for end-to-end video streaming sessions has been evaluated in many research projects to give better insight into the quality metrics. Objective evaluation models such as ITU Telecommunication Standardization Sector (ITU-T) P.1203 allow for the calculation of a Mean Opinion Score (MOS) by considering various QoE metrics, while subjective evaluation remains the best assessment approach for investigating end-user opinion of a video streaming session's experienced quality. We have conducted subjective evaluations with crowdsourced participants and evaluated the MOS of the sessions using the ITU-T P.1203 quality model. This paper's main contribution is a comparison of the subjective evaluation with the objective evaluation for well-known heuristic-based ABR algorithms.
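For intuition, the short sketch below shows one way such a comparison can be quantified, assuming per-ABR MOS values are already available from the crowdsourced study and from a P.1203 implementation; the numbers are made up and the actual evaluation in the paper is more detailed.

# Minimal sketch (hypothetical numbers): comparing crowdsourced subjective MOS with
# objective MOS predicted by an ITU-T P.1203 implementation, per ABR algorithm,
# using Pearson correlation.
from statistics import mean
from scipy.stats import pearsonr

subjective_mos = {"ABR-1": 3.8, "ABR-2": 3.2, "ABR-3": 4.1, "ABR-4": 2.9}   # hypothetical
p1203_mos      = {"ABR-1": 3.6, "ABR-2": 3.4, "ABR-3": 4.0, "ABR-4": 3.0}   # hypothetical

abrs = sorted(subjective_mos)
r, p_value = pearsonr([subjective_mos[a] for a in abrs], [p1203_mos[a] for a in abrs])
print(f"mean subjective MOS = {mean(subjective_mos.values()):.2f}, "
      f"mean P.1203 MOS = {mean(p1203_mos.values()):.2f}, Pearson r = {r:.2f} (p = {p_value:.3f})")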

Keywords: HTTP Adaptive Streaming, ABR Algorithms, Quality of Experience, Crowdsourcing, Subjective Evaluation, Objective Evaluation, MOS, (ITU-T) P.1203


Authors: Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: Light field imaging enables post-processing capabilities such as refocusing, changing the view perspective, and depth estimation. As light field images are represented by multiple views, they contain a huge amount of data, which makes compression inevitable. Although there are some proposals to efficiently compress light field images, their main focus is on encoding efficiency. However, some important functionalities such as viewpoint and quality scalability, random access, and uniform quality distribution have not been addressed adequately. In this paper, an efficient light field image compression method based on a deep neural network is proposed, which classifies multiple views into various layers. In each layer, the target view is synthesized from the available views of previously encoded/decoded layers using a deep neural network. This synthesized view is then used as a virtual reference for the inter-coding of the target view. In this way, random access to an arbitrary view is provided. Moreover, uniform quality distribution among multiple views is addressed. At higher bitrates, where random access to an arbitrary view is more crucial, the required bitrate to access the requested view is minimized.

Keywords: Light field, Compression, Scalable, Random Access.

Data Compression Conference (DCC)

23-26 March 2021, Snowbird, Utah, USA

https://www.cs.brandeis.edu/~dcc

Authors: Prateek Agrawal (University of Klagenfurt, Austria), Anatoliy Zabrovskiy (University of Klagenfurt, Austria), Adithyan Ilagovan (Bitmovin Inc., CA, USA), Christian Timmerer (University of Klagenfurt, Austria), Radu Prodan (University of Klagenfurt, Austria)

Abstract: HTTP adaptive streaming of video content has become an integral part of the Internet and dominates other streaming protocols and solutions. The time needed to create video content for adaptive streaming ranges from seconds up to several hours or days, due to the plethora of video transcoding parameters and video source types. Although the computing resources of different transcoding platforms and services constantly increase, accurate and fast transcoding time prediction and scheduling are still crucial. We propose in this paper a novel method called Fast video Transcoding Time Prediction and Scheduling (FastTTPS) of x264 encoded videos based on three phases: (i) transcoding data engineering, (ii) transcoding time prediction, and (iii) transcoding scheduling.
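The toy sketch below conveys the general idea behind phases (ii) and (iii), not the FastTTPS pipeline itself: a regression model predicts transcoding times from encoding parameters, and jobs are then scheduled longest-first onto the least-loaded worker. The features, data, and model choice are hypothetical.

# Hedged sketch (not the FastTTPS pipeline): predict x264 transcoding time from
# encoding parameters with a regression model, then schedule jobs longest-first.
from sklearn.ensemble import RandomForestRegressor

# features: [duration_s, height, bitrate_kbps, preset_index]; target: transcode seconds
X = [[60, 360, 500, 2], [60, 720, 1500, 2], [60, 1080, 3000, 2], [120, 1080, 3000, 4]]
y = [14.0, 35.0, 80.0, 240.0]                          # made-up measurements
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

jobs = [[60, 720, 1500, 2], [60, 1080, 3000, 2], [60, 360, 500, 2]]
predicted = sorted(zip(model.predict(jobs), jobs), reverse=True)  # longest job first

workers = [0.0, 0.0]                                   # two workers, greedy LPT scheduling
for t, job in predicted:
    i = workers.index(min(workers))                    # assign to the least-loaded worker
    workers[i] += t
print("predicted makespan:", round(max(workers), 1), "seconds")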

Authors: Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Ekrem Çetinkaya (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: HTTP Adaptive Streaming (HAS) enables high quality streaming of video contents. In HAS, videos are divided into short intervals called segments, and each segment is encoded at various qualities/bitrates to adapt to the available bandwidth. Multiple encodings of the same content impose a high cost on video content providers. To reduce the time-complexity of encoding multiple representations, state-of-the-art methods typically encode the highest quality representation first and reuse the information gathered during its encoding to accelerate the encoding of the remaining representations. As encoding the highest quality representation requires the highest time-complexity compared to the lower quality representations, it becomes a bottleneck in parallel encoding scenarios, and the overall time-complexity is limited to the time-complexity of the highest quality representation. In this paper, to address this problem, we consider all representations, from the highest to the lowest quality, as a potential single reference to accelerate the encoding of the other, dependent representations. We formulate a set of encoding modes and assess their performance in terms of BD-Rate and time-complexity, using both VMAF and PSNR as objective metrics. Experimental results show that encoding a middle quality representation as a reference can significantly reduce the maximum encoding complexity and hence is an efficient way of encoding multiple representations in parallel. Based on this fact, a fast multirate encoding method is proposed which utilizes the depth and prediction mode of a middle quality representation to accelerate the encoding of the dependent representations.
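The following toy function illustrates the kind of depth-reuse rule such methods apply (it is not the paper's exact rule): CU depths that the co-located CTU in the middle-quality reference makes implausible are skipped, shrinking the rate-distortion search for the dependent representation.

# Toy sketch of a depth-reuse heuristic (not the paper's exact rule): restrict the
# CU depths searched in a dependent representation based on the depths used by the
# co-located CTU in the middle-quality reference encoding.
def allowed_depths(ref_depths, higher_quality_than_ref):
    """ref_depths: set of CU depths (0..3) used in the co-located reference CTU."""
    if higher_quality_than_ref:
        # Higher-quality/bitrate representations tend to split at least as deep.
        return set(range(min(ref_depths), 4))
    # Lower-quality representations tend to use the same or coarser partitioning.
    return set(range(0, max(ref_depths) + 1))

print(allowed_depths({1, 2}, higher_quality_than_ref=True))    # {1, 2, 3}
print(allowed_depths({1, 2}, higher_quality_than_ref=False))   # {0, 1, 2}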

The International MultiMedia Modeling Conference (MMM)

25-27 January 2021, Prague, Czech Republic

Link: https://mmm2021.cz

Keywords: HEVC, Video Encoding, Multirate Encoding, DASH

Authors: Jesús Aguilar Armijo (Alpen-Adria-Universität Klagenfurt), Babak Taraghi (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: Adaptive video streaming systems typically support different media delivery formats, e.g., MPEG-DASH and HLS, replicating the same content multiple times into the network. Such a diversified system results in inefficient use of storage, caching, and bandwidth resources. The Common Media Application Format (CMAF) emerges to simplify HTTP Adaptive Streaming (HAS), providing a single encoding and packaging format of segmented media content and offering the opportunities of bandwidth savings, more cache hits, and less storage needed. However, CMAF is not yet supported by most devices. To solve this issue, we present a solution where we maintain the main advantages of CMAF while supporting heterogeneous devices using different media delivery formats. For that purpose, we propose to dynamically convert the content from CMAF to the desired media delivery format at an edge node. We study the bandwidth savings with our proposed approach using an analytical model and simulation, resulting in bandwidth savings of up to 20% with different media delivery format distributions. We analyze the runtime impact of the required operations on the segmented content performed in two scenarios: the classic one, with four different media delivery formats, and the proposed scenario, using CMAF-only delivery through the network. We compare both scenarios with different edge compute power assumptions. Finally, we perform experiments in a real video streaming testbed delivering MPEG-DASH using CMAF content to serve a DASH and an HLS client, performing the media conversion for the latter one.
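The toy simulation below illustrates why CMAF-only delivery can save backhaul bandwidth, under assumptions of our own rather than the paper's analytical model: classic delivery fetches one copy of a segment per distinct delivery format requested at an edge, whereas CMAF-only delivery fetches a single copy and converts it at the edge. The format shares and client counts are hypothetical.

# Toy model (our assumptions, not the paper's analytical model): estimate backhaul
# savings for one popular segment when a single CMAF copy replaces per-format copies.
import random

FORMATS = ["DASH", "HLS", "Smooth", "HDS"]
SHARES  = [0.45, 0.40, 0.10, 0.05]          # hypothetical client shares per format

def backhaul_savings(num_clients, trials=10000):
    classic = cmaf = 0
    for _ in range(trials):
        requested = {random.choices(FORMATS, SHARES)[0] for _ in range(num_clients)}
        classic += len(requested)           # one origin fetch per distinct format requested
        cmaf += 1                           # one CMAF fetch, converted at the edge
    return 1 - cmaf / classic               # fraction of backhaul traffic saved

for n in (2, 5, 20):
    print(f"{n} clients per segment -> ~{backhaul_savings(n):.0%} backhaul savings")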

IEEE International Symposium on Multimedia (ISM)

2-4 December 2020, Naples, Italy

https://www.ieee-ism.org/

Keywords: CMAF, Edge Computing, HTTP Adaptive Streaming (HAS)