Paper Accepted at ICONIP 2021

Congratulations to Negin Ghamsarian et al., whose paper “ReCal-Net: Joint Region-Channel-Wise Calibrated Network for Semantic Segmentation in Cataract Surgery Videos” has been accepted at the International Conference on Neural Information Processing (ICONIP 2021).

Abstract: Semantic segmentation in surgical videos is a prerequisite for a broad range of applications towards improving surgical outcomes and surgical video analysis. However, semantic segmentation in surgical videos involves many challenges. In particular, in cataract surgery, various features of the relevant objects such as blunt edges, color and context variation, reflection, transparency, and motion blur pose a challenge for semantic segmentation. In this paper, we propose a novel convolutional module termed the ReCal module, which can calibrate the feature maps by employing region intra-and-inter-dependencies and channel-region cross-dependencies. This calibration strategy can effectively enhance semantic representation by correlating different representations of the same semantic label, considering a multi-angle local view centering around each pixel. Thus, the proposed module can deal with distant visual characteristics of unique objects as well as cross-similarities in the visual characteristics of different objects. Moreover, we propose a novel network architecture based on the proposed module, termed ReCal-Net. Experimental results confirm the superiority of ReCal-Net compared to rival state-of-the-art approaches for all relevant objects in cataract surgery. Moreover, ablation studies reveal the effectiveness of the ReCal module in boosting semantic segmentation accuracy.

Paper accepted – On The Impact of Viewing Distance on Perceived Video Quality

Title: On The Impact of Viewing Distance on Perceived Video Quality

Link: IEEE Visual Communications and Image Processing (VCIP 2021) 5-8 December 2021, Munich, Germany

Authors: Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Raimund Schatz (AIT Austrian Institute of Technology, Austria), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract: Due to the growing importance of optimizing quality and efficiency of video streaming delivery, accurate assessment of user-perceived video quality becomes increasingly relevant. However, due to the wide range of viewing distances encountered in real-world viewing settings, actually perceived video quality can vary significantly in everyday viewing situations. In this paper, we investigate and quantify the influence of viewing distance on perceived video quality. A subjective experiment was conducted with full HD sequences at three different stationary viewing distances, with each video sequence being encoded at three different quality levels. Our study results confirm that the viewing distance has a significant influence on the quality assessment. In particular, they show that an increased viewing distance generally leads to an increased perceived video quality, especially at low media encoding quality levels. In this context, we also provide an estimation of potential bitrate savings that knowledge of actual viewing distance would enable in practice. Since current objective video quality metrics do not systematically take into account viewing distance, we also analyze and quantify the influence of viewing distance on the correlation between objective and subjective metrics. Our results confirm the need for distance-aware objective metrics when accurate prediction of perceived video quality in real-world environments is required.

Paper accepted – Improving Per-title Encoding for HTTP Adaptive Streaming by Utilizing Video Super-resolution

Title: Improving Per-title Encoding for HTTP Adaptive Streaming by Utilizing Video Super-resolution

Link: IEEE Visual Communications and Image Processing (VCIP 2021) 5-8 December 2021, Munich, Germany

Authors: Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Hannaneh Barahouei Pasandi (Virginia Commonwealth University), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract: In per-title encoding, to optimize a bitrate ladder over spatial resolution, each video segment is downscaled to a set of spatial resolutions and they are all encoded at a given set of bitrates. To find the highest quality resolution for each bitrate, the low-resolution encoded videos are upscaled to the original resolution, and a convex hull is formed based on the scaled qualities. Deep learning-based video super-resolution (VSR) approaches show a significant gain over traditional approaches and they are becoming more and more efficient over time. This paper improves per-title encoding by replacing traditional upscaling methods with deep neural network-based VSR algorithms. By improving the quality of low-resolution encodings, a VSR algorithm can improve the convex hull, which in turn leads to an improved bitrate ladder. To avoid bandwidth wastage at perceptually lossless bitrates, a maximum threshold for the quality is set and encodings beyond it are eliminated from the bitrate ladder. Similarly, a minimum threshold is set to avoid low-quality video delivery. The encodings between the maximum and minimum thresholds are selected based on one Just Noticeable Difference. Our experimental results show that the proposed per-title encoding results in a 24% bitrate reduction and 53% storage reduction compared to the state-of-the-art method.
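The ladder selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the quality scores, the threshold values, the JND step of 6 points, and the name `build_ladder` are all assumptions made for illustration, and a simple per-bitrate maximum stands in for the convex hull. The scores are assumed to have been measured after upscaling (for example, with a VSR model) to the original resolution.

```python
from typing import Dict, List, Tuple

# Hypothetical quality scores measured after upscaling to the source resolution,
# keyed by bitrate (kbps) and then by encoding resolution. All values are invented.
QUALITY: Dict[int, Dict[str, float]] = {
    250: {"540p": 45.0, "1080p": 30.0},
    500: {"540p": 62.0, "1080p": 48.0},
    1000: {"540p": 74.0, "1080p": 70.0},
    2000: {"540p": 80.0, "1080p": 86.0},
    3000: {"540p": 82.0, "1080p": 90.0},
    4000: {"540p": 83.0, "1080p": 94.0},
    8000: {"540p": 84.0, "1080p": 97.5},
}


def build_ladder(
    quality: Dict[int, Dict[str, float]],
    q_min: float = 60.0,   # assumed minimum acceptable quality
    q_max: float = 95.0,   # assumed "perceptually lossless" ceiling
    jnd: float = 6.0,      # assumed size of one Just Noticeable Difference
) -> List[Tuple[int, str, float]]:
    """Return (bitrate, resolution, quality) rungs of a simplified bitrate ladder."""
    ladder: List[Tuple[int, str, float]] = []
    for bitrate in sorted(quality):
        # Pick the resolution with the highest quality at this bitrate
        # (a pointwise stand-in for the convex hull over resolutions).
        resolution, q = max(quality[bitrate].items(), key=lambda kv: kv[1])
        # Drop encodings outside the [q_min, q_max] quality window.
        if q < q_min or q > q_max:
            continue
        # Keep a rung only if it is at least one JND better than the previous rung.
        if not ladder or q - ladder[-1][2] >= jnd:
            ladder.append((bitrate, resolution, q))
    return ladder


ladder = build_ladder(QUALITY)
for bitrate, resolution, q in ladder:
    print(f"{bitrate} kbps -> {resolution} (quality {q})")
```

With these invented numbers, the 250 kbps and 8000 kbps encodings fall outside the quality window, and the 3000 kbps encoding is skipped because it is less than one assumed JND above the 2000 kbps rung.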

Paper accepted – INTENSE: In-depth Studies on Stall Events and Quality Switches and Their Impact on the Quality of Experience in HTTP Adaptive Streaming

Title: INTENSE: In-depth Studies on Stall Events and Quality Switches and Their Impact on the Quality of Experience in HTTP Adaptive Streaming

Link: IEEE Access, A Multidisciplinary, Open-access Journal of the IEEE

Authors: Babak Taraghi (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Minh Nguyen (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

Abstract: With the recent growth of multimedia traffic over the Internet and emerging multimedia streaming service providers, improving Quality of Experience (QoE) for HTTP Adaptive Streaming (HAS) becomes more important. Alongside other factors, such as the media quality, HAS relies on the performance of the media player’s Adaptive Bitrate (ABR) algorithm to optimize QoE in multimedia streaming sessions. QoE in HAS suffers from weak or unstable internet connections and suboptimal ABR decisions. As a result of imperfect adaptiveness to the characteristics and conditions of the internet connection, stall events and quality level switches of varying durations can occur and negatively affect the QoE. In this paper, we address various identified open issues related to the QoE for HAS, notably (i) the minimum noticeable duration for stall events in HAS; (ii) the correlation between the media quality and the impact of stall events on QoE; (iii) the end-user preference regarding multiple shorter stall events versus a single longer stall event; and (iv) the end-user preference of media quality switches over stall events. Therefore, we have studied these open issues from both objective and subjective evaluation perspectives and presented the correlation between the two types of evaluations. The findings documented in this paper can be used as a baseline for improving ABR algorithms and policies in HAS.

Keywords: Crowdsourcing; HTTP Adaptive Streaming; Quality of Experience; Quality Switches; Stall Events; Subjective Evaluation; Objective Evaluation.

Special issue on Open Media Compression: Overview, Design Criteria, and Outlook on Emerging Standards

Proceedings of the IEEE, vol. 109, no. 9, Sept. 2021

By CHRISTIAN TIMMERER, Senior Member IEEE
Guest Editor
MATHIAS WIEN, Member IEEE
Guest Editor
LU YU, Senior Member IEEE
Guest Editor
AMY REIBMAN, Fellow IEEE
Guest Editor

Abstract: Multimedia content (i.e., video, image, audio) is responsible for the majority of today’s Internet traffic and numbers are expected to grow beyond 80% in the near future. For more than 30 years, international standards have provided tools for interoperability and have been both source and sink for challenging research activities in the domain of multimedia compression and system technologies. The goal of this special issue is to review those standards and focus on (i) the technology developed in the context of these standards and (ii) research questions addressing aspects of these standards which are left open for competition by both academia and industry.

Index Terms—Open Media Standards, MPEG, JPEG, JVET, AOM, Computational Complexity

C. Timmerer, M. Wien, L. Yu and A. Reibman, “Special issue on Open Media Compression: Overview, Design Criteria, and Outlook on Emerging Standards,” in Proceedings of the IEEE, vol. 109, no. 9, pp. 1423-1434, Sept. 2021, doi: 10.1109/JPROC.2021.3098048.

Paper accepted at IEEE eScience 2021: Where to Encode: A Performance Analysis of Intel x86 and Arm-based Amazon EC2 Instances

Title: Where to Encode: A Performance Analysis of Intel x86 and Arm-based Amazon EC2 Instances

20-23 September 2021 // Innsbruck, Austria // Online Conference

Link: IEEE eScience 2021

Authors: Roland Mathá*, Dragi Kimovski*, Anatoliy Zabrovskiy*‡, Christian Timmerer*†, Radu Prodan*
Institute of Information Technology (ITEC), University of Klagenfurt, Austria*
Bitmovin, Klagenfurt, Austria†
Petrozavodsk State University, Petrozavodsk, Russia‡

Abstract: Video streaming has become an integral part of the Internet. To efficiently utilise the limited network bandwidth, it is essential to encode the video content. However, encoding is a computationally intensive task, involving high-performance resources provided by private infrastructures or public clouds. Public clouds, such as Amazon EC2, provide a large portfolio of services and instances optimized for specific purposes and budgets. The majority of Amazon’s instances use x86 processors, such as Intel Xeon or AMD EPYC. However, following the recent trends in computer architecture, Amazon introduced Arm-based instances that promise up to 40% better cost-performance ratio than comparable x86 instances for specific workloads. In this paper, we evaluate the video encoding performance of x86 and Arm instances of four instance families using the latest FFmpeg version and two video codecs. We examine the impact of the encoding parameters, such as different presets and bitrates, on the time and cost for encoding. Our experiments reveal that Arm instances show high time and cost saving potential of up to 33.63% for specific bitrates and presets, especially for the x264 codec. However, the x86 instances are more general and achieve low encoding times, regardless of the codec.
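As a rough illustration of how such time and cost savings can be computed, the following Python sketch compares two instances. The encoding times and hourly prices are invented placeholders, not measurements from the paper and not actual Amazon EC2 prices, and the function names are hypothetical.

```python
def encoding_cost(seconds: float, price_per_hour: float) -> float:
    """Cost of one encoding job, assuming billing proportional to runtime."""
    return seconds / 3600.0 * price_per_hour


def relative_saving(baseline: float, candidate: float) -> float:
    """Saving of `candidate` relative to `baseline`, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Hypothetical numbers for one codec/preset/bitrate combination.
x86_seconds, x86_price = 120.0, 0.170   # assumed encoding time (s) and price (USD/h)
arm_seconds, arm_price = 110.0, 0.136   # assumed encoding time (s) and price (USD/h)

time_saving = relative_saving(x86_seconds, arm_seconds)
cost_saving = relative_saving(
    encoding_cost(x86_seconds, x86_price),
    encoding_cost(arm_seconds, arm_price),
)

print(f"time saving: {time_saving:.2f}%")
print(f"cost saving: {cost_saving:.2f}%")
```

The cost saving combines both effects: a shorter runtime and a lower hourly price each reduce the cost of the job.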

Index Terms—Amazon EC2, Arm instances, AVC, Cloud computing, FFmpeg, Graviton2, HEVC, Performance analysis, Video encoding.

Paper accepted at IEEE GLOBECOM: Quality Optimization of Live Streaming Services over HTTP with Reinforcement Learning

Link: IEEE Global Communications Conference 2021

7-11 December 2021 // Madrid, Spain // Hybrid: In-Person and Virtual Conference Connecting Cultures around the Globe

Authors: F. Tashtarian*, R. Falanji‡, A. Bentaleb+, A. Erfanian*, P. S. Mashhadi§, C. Timmerer*, H. Hellwagner*, R. Zimmermann+
Christian Doppler Laboratory ATHENA, Institute of Information Technology, Alpen-Adria-Universität Klagenfurt, Austria*
Department of Mathematical Science, Sharif University of Technology, Tehran, Iran‡
Department of Computer Science, School of Computing, National University of Singapore (NUS)+
Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Sweden§

Abstract: Recent years have seen tremendous growth in HTTP adaptive live video traffic over the Internet. In the presence of highly dynamic network conditions and diverse request patterns, existing yet simple hand-crafted heuristic approaches for serving client requests at the network edge might incur a large overhead and significant increase in time complexity. Therefore, these approaches might fail in delivering acceptable Quality of Experience (QoE) to end users. To bridge this gap, we propose ROPL, a learning-based client request management solution at the edge that leverages the power of the recent breakthroughs in deep reinforcement learning to serve requests of concurrent users joining various HTTP-based live video channels. ROPL is able to react quickly to any changes in the environment, performing accurate decisions to serve client requests, which results in achieving satisfactory user QoE. We validate the efficiency of ROPL through trace-driven simulations and a real-world setup. Experimental results from real-world scenarios confirm that ROPL outperforms existing heuristic-based approaches in terms of QoE by a factor of up to 3.7×.

Index Terms—Network Edge; Request Serving; HTTP Live Streaming; Low Latency; QoE; Deep Reinforcement Learning.

Paper accepted: End-to-end Quality of Experience Evaluation for HTTP Adaptive Streaming

Title: End-to-end Quality of Experience Evaluation for HTTP Adaptive Streaming

ACM MM’21: The 29th ACM International Conference on Multimedia

October 20-24, 2021, Chengdu, China

Author: Babak Taraghi (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

Abstract: Exponential growth in multimedia streaming traffic over the Internet motivates the research and further investigation of the user’s perceived quality of such services. Enhancement of experienced quality by the users becomes more substantial when service providers compete on establishing superiority by gaining more subscribers or customers. Quality of Experience (QoE) enhancement would not be possible without an authentic and accurate assessment of the streaming sessions. HTTP Adaptive Streaming (HAS) is today’s prevailing technique to deliver the highest possible audio and video content quality to the users. An end-to-end evaluation of QoE in HAS covers the precise measurement of the metrics that affect the perceived quality, e.g., startup delay, stall events, and delivered media quality. Improving these metrics could limit the service’s scalability, which is an important factor in real-world scenarios. In this study, we will investigate the stated metrics, best practices and evaluation methods, and available techniques with an aim to (i) design and develop practical and scalable measurement tools and prototypes, (ii) provide a better understanding of current technologies and techniques (e.g., Adaptive Bitrate algorithms), (iii) conduct in-depth research on the significant metrics in a way that improvements of QoE with scalability in mind would be feasible, and finally, (iv) provide a comprehensive QoE model which outperforms state-of-the-art models.

Keywords: HTTP Adaptive Streaming; Quality of Experience; Subjective Evaluation; Objective Evaluation; Adaptive Bitrate; QoE model.

Paper accepted: CTU Depth Decision Algorithms for HEVC: A Survey

Title: CTU Depth Decision Algorithms for HEVC: A Survey

Link: Signal Processing: Image Communication

Authors: Ekrem Çetinkaya* (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hadi Amirpour* (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (Christian Doppler Laboratory ATHENA, University of Essex), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

*These authors contributed equally to this work.

Abstract: High Efficiency Video Coding (HEVC) surpasses its predecessors in encoding efficiency by introducing new coding tools at the cost of an increased encoding time-complexity. The Coding Tree Unit (CTU) is the main building block used in HEVC. In the HEVC standard, frames are divided into CTUs with the predetermined size of up to 64 × 64 pixels. Each CTU is then divided recursively into a number of equally sized square areas, known as Coding Units (CUs). Although this diversity of frame partitioning increases encoding efficiency, it also causes an increase in the time complexity due to the increased number of ways to find the optimal partitioning. To address this complexity, numerous algorithms have been proposed to eliminate unnecessary searches during partitioning CTUs by exploiting the correlation in the video. In this paper, existing CTU depth decision algorithms for HEVC are surveyed. These algorithms are categorized into two groups, namely statistics and machine learning approaches. Statistics approaches are further subdivided into neighboring and inherent approaches. Neighboring approaches exploit the similarity between adjacent CTUs to limit the depth range of the current CTU, while inherent approaches use only the available information within the current CTU. Machine learning approaches try to extract and exploit similarities implicitly. Traditional methods like support vector machines or random forests use manually selected features, while recently proposed deep learning methods extract features during training. Finally, this paper discusses extending these methods to more recent video coding formats such as Versatile Video Coding (VVC) and AOMedia Video 1 (AV1).
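To give a feel for the search space that depth decision algorithms try to prune, the following Python sketch counts the possible quadtree partitionings of a single CTU. It is only an illustration, not taken from the survey: it assumes a 64 × 64 CTU whose CUs may be split down to depth 3 (8 × 8), and the function name and its `min_depth`/`max_depth` parameters are hypothetical.

```python
def count_partitionings(depth: int = 0, min_depth: int = 0, max_depth: int = 3) -> int:
    """Count quadtree partitionings of a block at `depth`.

    A block may stay unsplit only if depth >= min_depth, and may be split
    into four equally sized sub-blocks only if depth < max_depth.
    """
    if depth == max_depth:
        return 1  # smallest allowed CU: it cannot be split any further
    # Each of the four sub-blocks is partitioned independently.
    split = count_partitionings(depth + 1, min_depth, max_depth) ** 4
    # Leaving the block unsplit counts as one option if the depth range allows it.
    keep = 1 if depth >= min_depth else 0
    return keep + split


full_search = count_partitionings()                        # depths 0..3 allowed
limited = count_partitionings(min_depth=1, max_depth=2)    # depth range limited to 1..2

print(full_search)  # 83522
print(limited)      # 16
```

Under these assumptions, limiting the depth range of a CTU to 1..2 (for example, based on the depths of adjacent CTUs, as neighboring approaches do) shrinks the number of candidate partitionings from 83,522 to 16.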

Keywords: HEVC, Coding Tree Unit, Complexity, CTU Partitioning, Statistics, Machine Learning

Drone researcher Michał Barciś and team win the Drone Bot Contest at Deep Drone Challenge 2021

Agata and Michał Barciś and their fellow researcher from RTB House in Poland, Michał Jagielski, competed in the Drone Bot Contest at the Deep Drone Challenge in Ingolstadt, Germany on Saturday 7 August 2021.

The competition is organised by start-up incubator brigkAIR and Europe’s largest aircraft manufacturer Airbus. The three young scientists were delighted to receive a prize of 25,000 Euros.
