Christian Timmerer

QUALINET announces its recent White Paper on Definitions of Immersive Media Experience (IMEx)


With the coming of age of virtual/augmented reality and interactive media, numerous definitions, frameworks, and models of immersion have emerged across different fields ranging from computer graphics to literary works. Immersion is oftentimes used interchangeably with presence as both concepts are closely related. However, there are noticeable interdisciplinary differences regarding definitions, scope, and constituents that are required to be addressed so that a coherent understanding of the concepts can be achieved. Such consensus is vital for paving the directionality of the future of immersive media experiences (IMEx) and all related matters. Read more

ACM Multimedia’20: Relevance-Based Compression of Cataract Surgery Videos Using Convolutional Neural Networks


Authors: Negin Ghamsarian (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Mario Taschwer (Alpen-Adria-Universität Klagenfurt), and Klaus Schöffmann (Alpen-Adria-Universität Klagenfurt)

Abstract: Recorded cataract surgery videos play a prominent role in training and investigating the surgery, and enhancing the surgical outcomes. Due to storage limitations in hospitals, however, the recorded cataract surgeries are deleted after a short time and this precious source of information cannot be fully utilized. Lowering the quality to reduce the required storage space is not advisable since the degraded visual quality results in the loss of relevant information that limits the usage of these videos. To address this problem, we propose a relevance-based compression technique consisting of two modules: (i) relevance detection, which uses neural networks for semantic segmentation and classification of the videos to detect relevant spatio-temporal information, and (ii) content-adaptive compression, which restricts the amount of distortion applied to the relevant content while allocating less bitrate to irrelevant content. The proposed relevance-based compression framework is implemented considering five scenarios based on the definition of relevant information from the target audience’s perspective. Experimental results demonstrate the capability of the proposed approach in relevance detection. We further show that the proposed approach can achieve high compression efficiency by abstracting substantial redundant information while retaining the high quality of the relevant content.

ACM International Conference on Multimedia 2020, Seattle, United States.


Keywords: Video Coding, Convolutional Neural Networks, HEVC, ROI Detection, Medical Multimedia.

BigMM’20: ComplexCTTP: Complexity Class Based Transcoding Time Prediction for Video Sequences using Artificial Neural Network


The Sixth IEEE International Conference on Multimedia Big Data (BigMM 2020)

Authors: Anatoliy Zabrovskiy (Alpen-Adria-Universitat Klagenfurt), Prateek Agrawal (Alpen-Adria-Universitat Klagenfurt, Lovely Professional University), Roland Matha (Alpen-Adria-Universitat Klagenfurt), Christian Timmerer (Alpen-Adria-Universitat Klagenfurt, Bitmovin) and Radu Prodan (Alpen-Adria-Universitat Klagenfurt).

Abstract: HTTP Adaptive Streaming of video content is becoming an integral part of the Internet and accounts for the majority of today’s traffic. Although Internet bandwidth is constantly increasing, video compression technology plays an important role and the major challenge is to select and set up multiple video codecs, each with hundreds of transcoding parameters. Additionally, the transcoding speed depends directly on the selected transcoding parameters and the infrastructure used. Predicting transcoding time for multiple transcoding parameters with different codecs and processing units is a challenging task, as it depends on many factors. This paper provides a novel and considerably fast method for transcoding time prediction using video content classification and neural network prediction. Our artificial neural network (ANN) model predicts the transcoding times of video segments for state-of-the-art video codecs based on transcoding parameters and content complexity. We evaluated our method for two video codecs/implementations (AVC/x264 and HEVC/x265) as part of large-scale HTTP Adaptive Streaming services. The ANN model of our method is able to predict the transcoding time by minimizing the mean absolute error (MAE) to 1.37 and 2.67 for x264 and x265 codecs, respectively. For x264, this is an improvement of 22% compared to the state of the art.

Keywords: Transcoding time prediction, adaptive streaming, video transcoding, neural networks, video encoding, video complexity class, MPEG-DASH