Multimedia Communication


Between Two and Six? Towards Correct Estimation of JND Step Sizes for VMAF-based Bitrate Laddering

14th International Conference on Quality of Multimedia Experience (QoMEX)
September 5-7, 2022 | Lippstadt, Germany

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Raimund Schatz (AIT Austrian Institute of Technology, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract: We are currently witnessing the rapidly growing importance of intelligent video streaming quality optimization and reduction of video delivery costs. Per-title encoding, in contrast to a fixed bitrate ladder, shows significant promise to deliver higher quality video streams by addressing the trade-off between compression efficiency and video characteristics such as resolution and frame rate. Selecting encodings with noticeable quality differences between them prevents the construction of an inefficient bitrate ladder that suffers from representations of overly similar quality. In this respect, the VMAF metric represents a promising foundation for bitrate laddering, as it currently yields the highest video quality prediction performance. However, the minimum noticeable quality difference, referred to as the just-noticeable difference (JND), has not yet been properly validated for VMAF, with existing sources proposing highly diverse ΔVMAF step sizes ranging from two to six.
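
To illustrate why the choice of ΔVMAF step size matters, here is a minimal sketch of a greedy ladder selection that keeps a candidate encoding only if its VMAF score exceeds that of the previously kept encoding by at least one JND step. It is not the method of the paper: the function name `prune_ladder`, the greedy rule, and the (bitrate, VMAF) pairs are all hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

Encoding = Tuple[int, float]  # (bitrate in kbps, VMAF score); hypothetical layout


def prune_ladder(encodings: List[Encoding], jnd_step: float) -> List[Encoding]:
    """Greedily keep encodings that are at least `jnd_step` VMAF points
    above the previously kept encoding (illustrative rule, not from the paper)."""
    ordered = sorted(encodings, key=lambda enc: enc[0])  # ascending bitrate
    kept: List[Encoding] = []
    for bitrate, vmaf in ordered:
        if not kept or vmaf - kept[-1][1] >= jnd_step:
            kept.append((bitrate, vmaf))
    return kept


# Made-up candidate encodings; real values would come from VMAF measurements.
candidates = [
    (145, 40.0), (365, 55.0), (730, 63.0), (1100, 66.0),
    (2000, 70.5), (3000, 72.0), (4500, 76.0), (6000, 77.5),
]

ladder_step_2 = prune_ladder(candidates, jnd_step=2.0)
ladder_step_6 = prune_ladder(candidates, jnd_step=6.0)

print([bitrate for bitrate, _ in ladder_step_2])
print([bitrate for bitrate, _ in ladder_step_6])
```

With these made-up numbers, a step of two keeps six representations and a step of six keeps five, and the two ladders differ in which bitrates survive. This is the practical reason a validated JND value is needed before building a VMAF-based ladder.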


FuRA: Fully Random Access Light Field Image Compression

10th European Workshop on Visual Information Processing (EUVIP)
September 11-14, 2022 | Lisbon, Portugal

Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christine Guillemot (INRIA, France), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract: Light fields are typically represented by multi-view images and enable post-capture actions such as refocusing and perspective shift. To compress a light field image, its view images are typically converted into a pseudo video sequence (PVS) and the generated PVS is compressed using a video codec. However, when using the inter-coding tool of a video codec to exploit the redundancy among view images, the possibility to randomly access any view image is lost. On the other hand, when video codecs independently encode view images using the intra-coding tool, random access to view images is enabled, but at the expense of a significant drop in compression efficiency. To address this trade-off, we propose to use neural representations to represent 4D light fields. For each light field, a multi-layer perceptron (MLP) is trained to map the four light field dimensions to the color space, thus enabling random access even to individual pixels. To achieve higher compression efficiency, neural network compression techniques are deployed. The proposed method outperforms the compression efficiency of HEVC inter-coding, while providing random access to view images and even pixel values.
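
The random-access property described in the abstract can be sketched as follows: once a light field is stored as the weights of an MLP, reading one pixel is a single forward pass on one 4D coordinate, with no need to decode neighboring views. The sketch below is not the authors' network. The layer widths, the ReLU and sigmoid activations, the coordinate normalization to [0, 1], and the names `init_mlp` and `query_pixel` are assumptions, and the weights are random rather than trained.

```python
import math
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def init_mlp(sizes: List[int], seed: int = 0) -> List[Layer]:
    """Create an untrained MLP with random weights (stand-in for a trained model)."""
    rng = random.Random(seed)
    layers: List[Layer] = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def query_pixel(layers: List[Layer], u: float, v: float,
                x: float, y: float) -> Tuple[float, ...]:
    """Evaluate the MLP at one normalized 4D coordinate: view position (u, v)
    and pixel position (x, y). Returns three color values in [0, 1]."""
    activations = [u, v, x, y]
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            activations = [max(0.0, p) for p in pre]  # ReLU on hidden layers
        else:
            activations = [1.0 / (1.0 + math.exp(-p)) for p in pre]  # sigmoid output
    return tuple(activations)


# Assumed architecture: 4 inputs, two hidden layers of 32 units, 3 color outputs.
mlp = init_mlp([4, 32, 32, 3])
rgb = query_pixel(mlp, u=0.5, v=0.5, x=0.25, y=0.75)
print(rgb)
```

In a trained model the weights themselves would be the compressed representation, which is why the abstract mentions neural network compression techniques as the route to higher compression efficiency.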


Low Latency Live Streaming Implementation in DASH and HLS

ACM Multimedia Conference – OSS Track

Lisbon, Portugal | 10-14 October 2022

Abdelhak Bentaleb (National University of Singapore), Zhengdao Zhan (National University of Singapore), Farzad Tashtarian (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), May Lim (National University of Singapore), Saad Harous (University of Sharjah), Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hermann Hellwagner (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Roger Zimmermann (National University of Singapore)

Low-latency live streaming over HTTP using Dynamic Adaptive Streaming over HTTP (LL-DASH) and HTTP Live Streaming (LL-HLS) has emerged as a new way to deliver live content with respectable video quality and short end-to-end latency. Satisfying these requirements while maintaining viewer experience in practice is challenging, and directly adopting conventional adaptive bitrate (ABR) schemes to do so will not work. Therefore, recent solutions including LoL+, L2A, Stallion, and Llama rethink conventional ABR schemes to support low-latency scenarios. These solutions have been integrated with dash.js, which supports LL-DASH. However, their performance in LL-HLS remains in question. To bridge this gap, we implement and integrate existing LL-DASH ABR schemes in the hls.js video player, which supports LL-HLS. Moreover, a series of real-world trace-driven experiments have been conducted to check their efficiency under various network conditions, including a comparison with results achieved for LL-DASH in dash.js.

IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP 2022)

June 26-29, 2022 | Nafplio, Greece



Ekrem Çetinkaya (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Minh Nguyen (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)


Abstract: Video is now an essential part of the Internet. The increasing popularity of video streaming on mobile devices and the improvement in mobile displays have together made it challenging to meet user expectations. Advancements in deep neural networks (DNNs) have led to successful applications in several computer vision tasks such as super-resolution (SR). Although DNN-based SR methods significantly improve over traditional methods, their computational complexity makes them challenging to apply on devices with limited power, such as smartphones. However, with the improvement in mobile hardware, especially GPUs, it is now possible to use DNN-based solutions, though existing DNN-based SR solutions are still too complex. This paper proposes LiDeR, a lightweight video SR network specifically tailored toward mobile devices. Experimental results show that LiDeR can achieve SR performance competitive with state-of-the-art networks while improving the execution speed significantly, i.e., 267% for x4 upscaling and 353% for x2 upscaling compared to ESPCN.

Keywords: Super-resolution, Mobile machine learning, Video super-resolution.


Detection and Localization of Video Transcoding From AVC to HEVC Based on Deep Representations of Decoded Frames and PU Maps

IEEE Transactions on Multimedia

Haichao Yao (Beijing Jiaotong University), Rongrong Ni (Beijing Jiaotong University), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt), and Yao Zhao (Beijing Jiaotong University).


Video Encoding Optimizations for Live Video Streaming

FOKUS Media Web Symposium

20th – 24th June 2022 | Berlin, Germany


Abstract: Live video streaming is expected to become mainstream in fifth-generation (5G) mobile networks. Optimizing video encoding for live video streaming is challenging due to the latency introduced by any optimization method. In this talk, we introduce low-latency video optimization methods that improve the quality of video encodings by predicting optimized encoding parameters.

Hadi Amirpour is a postdoctoral research fellow at ATHENA, directed by Prof. Christian Timmerer. He received B.Sc. degrees in Electrical and Biomedical Engineering and an M.Sc. in Electrical Engineering. He received his Ph.D. in computer science from the University of Klagenfurt in 2022. He was appointed co-chair of Task Force 7 (TF7) Immersive Media Experience (IMEx) at the 15th Qualinet meeting. He was involved in the project EmergIMG, a Portuguese consortium on emerging imaging technologies, funded by the Portuguese funding agency and H2020. Currently, he is working on the ATHENA project in cooperation with its industry partner Bitmovin. His research interests are image processing and compression, video processing and compression, quality of experience, emerging 3D imaging technology, and medical image analysis.



The 13th ACM Multimedia Systems Conference MMSys’22

14th – 17th June 2022 | Athlone, Ireland.


The ACM Multimedia Systems Conference (MMSys) provides a forum for researchers to present and share their latest research findings in multimedia systems. While research about specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating systems, real-time systems, databases, mobile computing, distributed systems, computer vision, and middleware communities, MMSys aims to cut across these domains in the context of multimedia data types.

This year, MMSys hosted around 150 on-site participants from academia and industry. Five ATHENA members travelled to Athlone, Ireland, to present four papers by Reza Farahani, Babak Taraghi, and Vignesh V Menon in two tracks, i.e., the Open Dataset & Software Track and the Demo & Industry Track.

Moreover, two ATHENA postdocs gave presentations at the Mentoring & Postdoc Networking event:

  • Hadi Amirpour, “Video Encoding Optimization for Live Video Streaming”
  • Farzad Tashtarian, “QoE Optimization in Live Streaming”


We presented the poster “The Power in Your Pocket: Boosting Video Quality with Super-Resolution on Mobile Devices” at the Austrian Computer Science Day 2022 conference. The poster summarizes research about improving the visual quality of video streaming on mobile devices by utilizing deep neural network-based enhancement techniques.

Here is the list of papers that we cover in the poster:

  1. Super-resolution Based Bitrate Adaptation for HTTP Adaptive Streaming for Mobile Devices

  2. MoViDNN: A Mobile Platform for Evaluating Video Quality Enhancement with Deep Neural Networks

This year's “Lange Nacht der Forschung” (LNDF) took place on May 20, 2022. The LNDF is Austria's most significant national research event for presenting research accomplishments to the general public. ITEC was represented by three stations and was also involved in the Computer Games and Engineering station. It was a fantastic experience for everyone! We tried to make our research easily understandable for all visitors.