Multimedia Communication

Abstract: Video accounts for the vast majority of today’s internet traffic, and video coding is vital for efficient distribution towards the end-user. Software- and/or cloud-based video coding is becoming more and more attractive, specifically with the plethora of video codecs available right now (e.g., AVC, HEVC, VVC, VP9, AV1, etc.), a trend that is also supported by the latest Bitmovin Video Developer Report 2020. Thus, improvements in video coding that enable efficient adaptive video streaming are a requirement for current and future video services. HTTP Adaptive Streaming (HAS) is now mainstream due to its simplicity, reliability, and standard support (e.g., MPEG-DASH). For HAS, the video is usually encoded in multiple versions (i.e., representations) of different resolutions, bitrates, codecs, etc., and each representation is divided into chunks (i.e., segments) of equal length (e.g., 2-10 sec) to enable dynamic, adaptive switching during streaming based on the user’s context conditions (e.g., network conditions, device characteristics, user preferences). In this context, most scientific papers in the literature target various improvements which are evaluated based on open, standard test sequences. We argue that optimizing video encoding for large-scale HAS deployments is the next step in order to improve the Quality of Experience (QoE) while optimizing costs.
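As a minimal sketch of the adaptive switching described in the abstract above, the following Python snippet picks a representation from a bitrate ladder based on measured throughput. The ladder values, the `Representation` class, the `select_representation` function, and the `safety` margin are all illustrative assumptions and are not taken from the session description or from any particular player.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # average encoded bitrate


# Hypothetical bitrate ladder; the values are illustrative only.
LADDER = [
    Representation(360, 800),
    Representation(720, 2400),
    Representation(1080, 4800),
]


def select_representation(ladder, throughput_kbps, safety=0.8):
    """Pick the highest-bitrate representation that fits the throughput budget.

    The budget is the measured throughput scaled by an assumed safety margin.
    If nothing fits, fall back to the lowest-bitrate representation.
    """
    budget = throughput_kbps * safety
    candidates = [r for r in ladder if r.bitrate_kbps <= budget]
    if not candidates:
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)


if __name__ == "__main__":
    # The client would typically re-run this decision for every segment.
    for measured in (500, 3500, 10000):
        chosen = select_representation(LADDER, measured)
        print(measured, "kbps ->", chosen.height, "p")
```

A real client would usually combine throughput with buffer level and device constraints; the purely throughput-based rule here is only the simplest possible instance.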

Session organizers: Christian Timmerer (Bitmovin, Austria), Mohammad Ghanbari (University of Essex, UK), and Alex Giladi (Comcast, USA).

Picture Coding Symposium (PCS), 29 June to 2 July 2021, UK

Link: https://pcs2021.org

Christian Timmerer

Teaser: “Help me, Obi-Wan Kenobi. You’re my only hope,” said the hologram of Princess Leia in Star Wars: Episode IV – A New Hope (1977). This was the first time in cinematic history that the concept of holographic-type communication was illustrated. Almost five decades later, technological advancements are quickly moving this type of communication from science fiction to reality.

Authors: Jeroen van der Hooft (Ghent University), Maria Torres Vega (Ghent University), Tim Wauters (Ghent University), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Ali C. Begen (Ozyegin University, Networked Media), Filip De Turck (Ghent University), and Raimund Schatz (AIT Austrian Institute of Technology)

Abstract: Technological improvements are rapidly advancing holographic-type content distribution. Significant research efforts have been made to meet the low-latency and high-bandwidth requirements set forward by interactive applications such as remote surgery and virtual reality. Recent research made six degrees of freedom (6DoF) for immersive media possible, where users may both move their heads and change their position within a scene. In this article, we present the status and challenges of 6DoF applications based on volumetric media, focusing on the key aspects required to deliver such services. Furthermore, we present results from a subjective study to highlight relevant directions for future research.

Link: IEEE Communications Magazine

Authors: Prateek Agrawal (University of Klagenfurt, Austria), Deepak Chaudhary (Lovely Professional University, India), Vishu Madaan (Lovely Professional University, India), Anatoliy Zabrovskiy (University of Klagenfurt, Austria), Radu Prodan (University of Klagenfurt, Austria), Dragi Kimovski (University of Klagenfurt, Austria), Christian Timmerer (University of Klagenfurt, Austria)

Abstract: Automated bank cheque verification using image processing is an attempt to complement the present cheque truncation system, as well as to provide an alternative methodology for the processing of bank cheques with minimal human intervention. When it comes to the clearance of bank cheques and monetary transactions, the process should not only be reliable and robust but also save time, which is one of the major factors for countries with large populations.

Authors: Ekrem Çetinkaya (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: HTTP Adaptive Streaming (HAS) is the most common approach for delivering video content over the Internet. The requirement to encode the same content at different quality levels (i.e., representations) in HAS is a challenging problem for content providers. Fast multirate encoding approaches try to accelerate this process by reusing information from previously encoded representations. In this paper, we use convolutional neural networks (CNNs) to speed up the encoding of multiple representations with a specific focus on parallel encoding. In parallel encoding, the overall time-complexity is limited to the maximum time-complexity of one of the representations that are encoded in parallel. Therefore, instead of reducing the time-complexity for all representations, the highest time-complexities are reduced. Experimental results show that the proposed method achieves significant time-complexity savings in parallel encoding scenarios (41%) with a slight increase in bitrate and quality degradation compared to the HEVC reference software.
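The timing argument in the abstract above (in parallel encoding, the slowest representation determines the overall time) can be illustrated with a small Python sketch. The per-representation encoding times and the helper names below are hypothetical, and the snippet does not model the authors' CNN-based method; it only shows why speeding up the slowest representation is what matters.

```python
def sequential_encoding_time(times):
    """Total time when representations are encoded one after another."""
    return sum(times.values())


def parallel_encoding_time(times):
    """Wall-clock time when all representations are encoded in parallel."""
    return max(times.values())


def speed_up(times, name, factor):
    """Return a copy in which one representation is encoded `factor` times faster."""
    updated = dict(times)
    updated[name] = times[name] / factor
    return updated


# Hypothetical encoding times in seconds; illustrative values only.
times = {"2160p": 100.0, "1080p": 40.0, "720p": 20.0, "360p": 8.0}

if __name__ == "__main__":
    print("sequential:", sequential_encoding_time(times))
    print("parallel:", parallel_encoding_time(times))
    # Halving the time of a fast representation leaves the parallel time unchanged.
    print("360p twice as fast:", parallel_encoding_time(speed_up(times, "360p", 2)))
    # Halving the time of the slowest representation lowers the parallel time.
    print("2160p twice as fast:", parallel_encoding_time(speed_up(times, "2160p", 2)))
```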

Keywords: Video Coding, Convolutional Neural Networks, HEVC, HTTP Adaptive Streaming (HAS)

Christian Timmerer

With the coming of age of virtual/augmented reality and interactive media, numerous definitions, frameworks, and models of immersion have emerged across different fields ranging from computer graphics to literary works. Immersion is oftentimes used interchangeably with presence as both concepts are closely related. However, there are noticeable interdisciplinary differences regarding definitions, scope, and constituents that are required to be addressed so that a coherent understanding of the concepts can be achieved. Such consensus is vital for paving the directionality of the future of immersive media experiences (IMEx) and all related matters.

Authors: Negin Ghamsarian (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Mario Taschwer (Alpen-Adria-Universität Klagenfurt), and Klaus Schöffmann (Alpen-Adria-Universität Klagenfurt)

Abstract: Recorded cataract surgery videos play a prominent role in training and investigating the surgery, and enhancing the surgical outcomes. Due to storage limitations in hospitals, however, the recorded cataract surgeries are deleted after a short time and this precious source of information cannot be fully utilized. Lowering the quality to reduce the required storage space is not advisable since the degraded visual quality results in the loss of relevant information that limits the usage of these videos. To address this problem, we propose a relevance-based compression technique consisting of two modules: (i) relevance detection, which uses neural networks for semantic segmentation and classification of the videos to detect relevant spatio-temporal information, and (ii) content-adaptive compression, which restricts the amount of distortion applied to the relevant content while allocating less bitrate to irrelevant content. The proposed relevance-based compression framework is implemented considering five scenarios based on the definition of relevant information from the target audience’s perspective. Experimental results demonstrate the capability of the proposed approach in relevance detection. We further show that the proposed approach can achieve high compression efficiency by abstracting substantial redundant information while retaining the high quality of the relevant content.
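One common way to realize content-adaptive compression of the kind described above is to vary the quantization parameter (QP) per block according to a relevance mask. The Python sketch below illustrates that general idea only; the function name, the QP offsets, and the block-level mask are assumptions, and the abstract does not state that the authors use this exact mechanism. The clamp to 0..51 reflects the QP range of 8-bit HEVC.

```python
def build_qp_map(relevance_mask, base_qp=32, relevant_delta=-6, irrelevant_delta=8):
    """Map a 2D boolean relevance mask (one entry per block) to per-block QPs.

    Relevant blocks get a lower QP (less distortion); irrelevant blocks get a
    higher QP (fewer bits). All offsets are illustrative assumptions.
    """
    def clamp(qp):
        # Keep the value inside the valid QP range for 8-bit HEVC.
        return max(0, min(51, qp))

    return [
        [
            clamp(base_qp + (relevant_delta if relevant else irrelevant_delta))
            for relevant in row
        ]
        for row in relevance_mask
    ]


# Hypothetical 2x3 block mask, e.g. produced by a segmentation network.
mask = [
    [False, True, True],
    [False, False, True],
]

if __name__ == "__main__":
    for row in build_qp_map(mask):
        print(row)
```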

ACM International Conference on Multimedia 2020, Seattle, United States.

Link: https://2020.acmmm.org

Keywords: Video Coding, Convolutional Neural Networks, HEVC, ROI Detection, Medical Multimedia.

Authors: Minh Nguyen, Hadi Amirpour, Christian Timmerer, Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: HTTP/2 has been explored widely for video streaming, but still suffers from Head-of-Line blocking and three-way handshake delay due to TCP. Meanwhile, QUIC running on top of UDP can tackle these issues. In addition, although many adaptive bitrate (ABR) algorithms have been proposed for scalable and non-scalable video streaming, the literature lacks an algorithm designed for both types of video streaming approaches. In this paper, we investigate the impact of QUIC and HTTP/2 on the performance of adaptive bitrate (ABR) algorithms in terms of different metrics. Moreover, we propose an efficient approach for utilizing scalable video coding formats for adaptive video streaming that combines a traditional video streaming approach (based on non-scalable video coding formats) and a retransmission technique. The experimental results show that QUIC benefits significantly from our proposed method in the context of packet loss and retransmission.

Compared to HTTP/2, it improves the average video quality and also provides a smoother adaptation behavior. Finally, we demonstrate that our proposed method, originally designed for non-scalable video codecs, also works efficiently for scalable videos such as Scalable High Efficiency Video Coding (SHVC).

Keywords: QUIC, H2BR, HTTP adaptive streaming, Retransmission, SHVC

Conference: ACM SIGCOMM 2020 Workshop on Evolution, Performance, and Interoperability of QUIC (EPIQ 2020), August 10-14, 2020, New York City, USA.

Link: https://conferences.sigcomm.org/sigcomm/2020/workshop-epiq.html

Christian Timmerer

Title: Objective and Subjective QoE Evaluation for Adaptive Point Cloud Streaming

Authors: Jeroen van der Hooft (Ghent University), Maria Torres Vega (Ghent University), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Ali C. Begen (Ozyegin University, Networked Media), Filip De Turck (Ghent University), Raimund Schatz (Alpen-Adria-Universität Klagenfurt & AIT Austrian Institute of Technology, Austria)

Abstract: Volumetric media has the potential to provide the six degrees of freedom (6DoF) required by truly immersive media. However, achieving 6DoF requires ultra-high bandwidth transmissions, which real-world wide area networks cannot provide economically. Therefore, recent efforts have started to target efficient delivery of volumetric media, using a combination of compression and adaptive streaming techniques. It remains, however, unclear how the effects of such techniques on the user perceived quality can be accurately evaluated. In this paper, we present the results of an extensive objective and subjective quality of experience (QoE) evaluation of volumetric 6DoF streaming. We use PCC-DASH, a standards-compliant means for HTTP adaptive streaming of scenes comprising multiple dynamic point cloud objects. By means of a thorough analysis we investigate the perceived quality impact of the available bandwidth, rate adaptation algorithm, viewport prediction strategy and user’s motion within the scene. We determine which of these aspects has more impact on the user’s QoE, and to what extent subjective and objective assessments are aligned.

Keywords: Volumetric Media; HTTP Adaptive Streaming; 6DoF; MPEG V-PCC; QoE Assessment; Objective Metrics

International Conference on Quality of Multimedia Experience (QoMEX)
May 26-28, 2020, Athlone, Ireland
http://qomex2020.ie/

Authors: Minh Nguyen (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt / Bitmovin Inc.), Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: HTTP-based Adaptive Streaming (HAS) plays a key role in over-the-top video streaming. It contributes towards reducing the rebuffering duration of video playout by adapting the video quality to the current network conditions. However, it incurs variations of video quality in a streaming session because of the throughput fluctuation, which impacts the user’s Quality of Experience (QoE). Besides, many adaptive bitrate (ABR) algorithms choose the lowest-quality segments at the beginning of the streaming session to ramp up the playout buffer as soon as possible. Although this strategy decreases the startup time, users can be annoyed as they have to watch a low-quality video initially. In this paper, we propose an efficient retransmission technique, namely H2BR, to replace low-quality segments stored in the playout buffer with higher-quality versions by using features of HTTP/2 including (i) stream priority, (ii) server push, and (iii) stream termination. The experimental results show that H2BR helps users avoid watching low video quality during video playback and improves the user’s QoE. H2BR can reduce the time during which users suffer the lowest-quality video by more than 70% in the best case and improves the QoE by up to 13%.
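The core idea of replacing low-quality segments that are already in the playout buffer can be sketched in a few lines of Python. This is not the H2BR algorithm itself and it does not model the HTTP/2 features named above; the `BufferedSegment` class, the `pick_retransmission_candidate` function, and the simple "can it arrive before playout" rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BufferedSegment:
    index: int                    # position in the media timeline
    quality: int                  # quality level, 0 = lowest
    seconds_until_playout: float  # time left before this segment is played


def pick_retransmission_candidate(buffer, target_quality, segment_size_kbit, throughput_kbps):
    """Return the earliest buffered segment worth re-downloading, or None.

    A segment qualifies if its quality is below `target_quality` and the
    estimated download time of the replacement is shorter than the time
    remaining until the segment is played out.
    """
    download_time = segment_size_kbit / throughput_kbps
    for seg in sorted(buffer, key=lambda s: s.index):
        if seg.quality < target_quality and seg.seconds_until_playout > download_time:
            return seg
    return None


if __name__ == "__main__":
    # Hypothetical buffer state after a low-quality startup phase.
    buffer = [
        BufferedSegment(0, 0, 1.0),
        BufferedSegment(1, 0, 5.0),
        BufferedSegment(2, 2, 9.0),
    ]
    # An 8000 kbit replacement at 4000 kbps needs about 2 s to download.
    print(pick_retransmission_candidate(buffer, 2, 8000, 4000))
```

In this example the first segment is skipped because its replacement could not arrive in time, so the second segment is the one selected for retransmission.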

Keywords: HTTP adaptive streaming, DASH, ABR algorithms, QoE, HTTP/2

Packet Video Workshop 2020 (PV) June 10-11, 2020, Istanbul, Turkey (co-located with ACM MMSys’20)

Link: https://2020.packet.video/

Authors: Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (University of Essex)

Abstract: Holography is able to reconstruct a three-dimensional structure of an object by recording full wave fields of light emitted from the object. This requires a huge amount of data to be encoded, stored, transmitted, and decoded for holographic content, making its practical usage challenging, especially for bandwidth-constrained networks and memory-limited devices. In the delivery of holographic content via the internet, bandwidth wastage should be avoided to tackle the high bandwidth demands of holography streaming. For real-time applications, encoding time-complexity is also a major problem. In this paper, the concept of dynamic adaptive streaming over HTTP (DASH) is extended to holography image streaming and view-aware adaptation techniques are studied. As each area of a hologram contains information of a specific view, instead of encoding and decoding the entire hologram, just the part required to render the selected view is encoded and transmitted via the network based on the users’ interactivity. Four different strategies, namely, monolithic, single view, adaptive view, and non-real-time streaming strategies, are explained and compared in terms of bandwidth requirements, encoding time-complexity, and bitrate overhead. Experimental results show that the view-aware methods reduce the required bandwidth for holography streaming at the cost of a bitrate increase.

Keywords: Holography, compression, bitrate adaptation, dynamic adaptive streaming over HTTP, DASH.