Multimedia Communication

Vignesh V Menon

2022 IEEE International Conference on Multimedia and Expo (ICME)

July 18-22, 2022 | Taipei, Taiwan

Conference Website

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt)

Abstract:

In live streaming applications, a fixed set of bitrate-resolution pairs (known as a bitrate ladder) is typically used for simplicity and efficiency, avoiding the additional encoding run-time required to find the optimum bitrate-resolution pairs for every video content. However, an optimized bitrate ladder may result in (i) decreased storage or delivery costs and/or (ii) increased Quality of Experience (QoE). This paper introduces a perceptually-aware per-title encoding (PPTE) scheme for video streaming applications. In this scheme, optimized bitrate-resolution pairs are predicted online based on the Just Noticeable Difference (JND) in quality perception to avoid adding perceptually similar representations to the bitrate ladder. To this end, Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features for each video segment are used. Experimental results show that, on average, PPTE yields bitrate savings of 16.47% and 27.02% to maintain the same PSNR and VMAF, respectively, compared to the reference HTTP Live Streaming (HLS) bitrate ladder, without any noticeable additional streaming latency. This is accompanied by a 30.69% cumulative decrease in storage space for the various representations.

Architecture of PPTE
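
The idea of leaving perceptually similar representations out of the ladder can be sketched as follows. This is a minimal illustration and not the PPTE implementation: the function name `prune_ladder`, the candidate bitrates and VMAF scores, and the JND threshold of 6 VMAF points are all assumptions made for the example. In PPTE the qualities are predicted from the DCT-energy-based features rather than given up front.

```python
from typing import List, Tuple

# (bitrate in kbps, resolution height, predicted VMAF); all values are hypothetical.
Representation = Tuple[int, int, float]


def prune_ladder(candidates: List[Representation], jnd: float = 6.0) -> List[Representation]:
    """Keep a representation only if its quality exceeds the last kept one by at least one JND."""
    kept: List[Representation] = []
    for rep in sorted(candidates, key=lambda r: r[0]):  # ascending bitrate
        if not kept or rep[2] - kept[-1][2] >= jnd:
            kept.append(rep)
    return kept


ladder = [
    (145, 360, 40.0),
    (300, 432, 52.0),
    (600, 540, 55.0),   # within one JND of the 300 kbps rung, so it is dropped
    (900, 540, 63.0),
    (1600, 720, 74.0),
    (2400, 720, 78.0),  # within one JND of the 1600 kbps rung, so it is dropped
    (3400, 1080, 86.0),
]
pruned = prune_ladder(ladder)
```

With these assumed scores, five of the seven candidate rungs survive, which is the kind of reduction that would lower storage for the ladder.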


Title: A Traffic-sign recognition IoT-based Application
Authors: Narges Mehran, Dragi Kimovski, Zahra Najafabadi Samani, Radu Prodan
The work “A Traffic-sign recognition IoT-based Application” was accepted for presentation at the HiPEAC IoT challenge during CSW Spring 2022.
International Data Corporation (IDC) predicts that 21.5 billion connected Internet of Things (IoT) devices will generate 55% of all data by 2025. Nowadays, camera sensors can be embedded in most devices. Therefore, we designed an application that receives a video stream from a camera sensor and performs the video processing. First, the application pre-processes the sensed data using two high-quality video encoding and framing frameworks. Afterward, we apply the machine learning (ML) model based on the low and high training accuracies. Because user devices often cannot perform high-load ML training operations, we consider only the ML inference operation, which runs a lightweight trained ML model. Finally, the processed data is packaged for the consumer, such as the driver of a car.

ICME Workshop on Hyper-Realistic Multimedia for Enhanced Quality of Experience (ICMEW)

July 18-22, 2022 | Taipei, Taiwan

Conference Website

Ekrem Çetinkaya (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Christian Doppler Laboratory ATHENA, Alpen-Adria-Universität Klagenfurt)

Abstract: Light field imaging enables post-capture actions such as refocusing and changing view perspective by capturing both spatial and angular information. However, capturing richer information about the 3D scene results in a huge amount of data. To improve the compression efficiency of existing light field compression methods, we investigate the impact of light field super-resolution approaches (both spatial and angular super-resolution) on compression efficiency. To this end, we first downscale light field images over (i) spatial resolution, (ii) angular resolution, and (iii) spatial-angular resolution and encode them using Versatile Video Coding (VVC). We then apply a set of light field super-resolution deep neural networks to reconstruct the light field images at their full spatial-angular resolution and compare their compression efficiency. Experimental results show that encoding the low-angular-resolution light field image and applying angular super-resolution yields bitrate savings of 51.16% and 53.41% to maintain the same PSNR and SSIM, respectively, compared to encoding the light field image at high resolution.

Keywords: Light field, Compression, Super-resolution, VVC.
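
As a rough illustration of what downscaling over angular resolution can mean, the sketch below treats a light field as a grid of sub-aperture views and keeps every second view along both angular axes. The abstract does not say how the downscaling is done, so plain view subsampling, the function name `downscale_angular`, and the 8x8 grid are assumptions for the example only.

```python
from typing import List, TypeVar

View = TypeVar("View")


def downscale_angular(views: List[List[View]], factor: int) -> List[List[View]]:
    """Keep every `factor`-th sub-aperture view along both angular axes."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [row[::factor] for row in views[::factor]]


# A hypothetical 8x8 grid of views; each view is stood in for by its (u, v) index.
grid = [[(u, v) for v in range(8)] for u in range(8)]
low = downscale_angular(grid, 2)  # 4x4 grid of views
```

Only the retained views would then be encoded; the dropped views are what an angular super-resolution network is asked to reconstruct.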

MPEG, specifically ISO/IEC JTC 1/SC 29/WG 3 (MPEG Systems), has just been awarded a Technology & Engineering Emmy® Award for its ground-breaking MPEG-DASH standard. Dynamic Adaptive Streaming over HTTP (DASH) is the first international de jure standard that enables efficient streaming of video over the Internet, and it has changed the entire video streaming industry, including, but not limited to, on-demand, live, and low-latency streaming, and even 5G and the next generation of hybrid broadcast-broadband. The first edition was published in April 2012, and MPEG is currently working towards publishing the 5th edition, demonstrating an active and lively ecosystem that is still being developed and improved to address the requirements and challenges of modern media transport applications and services.

This award belongs to 90+ researchers and engineers from around 60 companies all around the world who participated in the development of the MPEG-DASH standard for over 12 years.

From left to right: Kyung-mo Park, Cyril Concolato, Thomas Stockhammer, Yuriy Reznik, Alex Giladi, Mike Dolan, Iraj Sodagar, Ali Begen, Christian Timmerer, Gary Sullivan, Per Fröjdh, Young-Kwon Lim, Ye-Kui Wang. (Photo © Yuriy Reznik)

Christian Timmerer, director of the Christian Doppler Laboratory ATHENA, chaired the evaluation of responses to the call for proposals and has since served as MPEG-DASH Ad-hoc Group (AHG) / Break-out Group (BoG) co-chair as well as co-editor for Part 2 of the standard. For a more detailed history of the MPEG-DASH standard, the interested reader is referred to Christian Timmerer’s blog post “HTTP Streaming of MPEG Media” (capturing the development of the first edition) and Nicolas Weill’s blog post “MPEG-DASH: The ABR Esperanto” (DASH timeline).

The 13th ACM Multimedia Systems Conference (ACM MMSys 2022) Open Dataset and Software (ODS) track | June 14–17, 2022 | Athlone, Ireland

Babak Taraghi (Alpen-Adria-Universität Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt).

Abstract: Many applications produce multimedia traffic over the Internet. Video streaming is among them, with a rapidly growing demand for more bandwidth to deliver higher resolutions such as Ultra High Definition (UHD) 8K content. The HTTP Adaptive Streaming (HAS) technique defines baselines for audio-visual content streaming to balance the delivered media quality and minimize streaming session defects. In parallel, the development and standardization of video codecs contribute by introducing efficient algorithms and technologies. Versatile Video Coding (VVC) is one of the latest advancements in this area, but it is still not fully optimized or supported on all platforms. Such optimization and broad platform support require years of research and development. This paper offers a dataset that facilitates the research and development of the aforementioned technologies. Our open-source dataset comprises Dynamic Adaptive Streaming over HTTP (MPEG-DASH) multimedia test assets of encoded Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), AOMedia Video 1 (AV1), and VVC content with resolutions of up to 7680×4320 (8K). Our dataset has a maximum media duration of 322 seconds, and we offer our MPEG-DASH packaged content with two segment lengths, 4 and 8 seconds.

The dataset is available here.
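
To give a feel for what the two segment lengths imply, the short calculation below counts the segments needed to cover the longest asset (322 seconds). It assumes the final segment is simply shorter than the others; how the dataset actually packages the tail of each asset is not stated here, and the helper name `segment_count` is hypothetical.

```python
import math


def segment_count(duration_s: float, segment_length_s: float) -> int:
    """Number of segments needed to cover a stream; the last one may be shorter."""
    return math.ceil(duration_s / segment_length_s)


# 322 s of media at the two segment lengths offered by the dataset.
counts = {length: segment_count(322, length) for length in (4, 8)}
# counts == {4: 81, 8: 41}
```

Under this assumption, the 4-second variant needs roughly twice as many segment requests per representation as the 8-second variant.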


The 13th ACM Multimedia Systems Conference (ACM MMSys 2022) Open Dataset and Software (ODS) track

June 14–17, 2022 | Athlone, Ireland

Conference Website

Vignesh V Menon (Alpen-Adria-Universität Klagenfurt), Christian Feldmann (Bitmovin, Klagenfurt), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt).

Abstract:

VCA in content-adaptive encoding applications

For online analysis of the video content complexity in live streaming applications, selecting low-complexity features is critical to ensure low-latency video streaming without disruptions. To this end, two features are determined for each video (segment): the average texture energy and the average gradient of the texture energy. A DCT-based energy function is introduced to determine the block-wise texture of each frame, and the spatial and temporal features of the video (segment) are derived from it. The Video Complexity Analyzer (VCA) project aims to provide an efficient spatial and temporal complexity analysis of each video (segment), which can be used in various applications to find the optimal encoding decisions. VCA leverages some of the x86 Single Instruction Multiple Data (SIMD) optimizations for Intel CPUs and multi-threading optimizations to achieve increased performance. VCA is an open-source library published under the GNU GPLv3 license.

Github: https://github.com/cd-athena/VCA
Online documentation: https://cd-athena.github.io/VCA/
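
The two features described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not VCA's code or API: the energy here is an unweighted sum of absolute AC coefficients of a naive DCT, whereas VCA defines its own DCT-based energy function and uses SIMD and multi-threading. All function names (`dct2`, `block_energy`, `spatial_feature`, `temporal_feature`) and the 4x4 test blocks are hypothetical.

```python
import math
from typing import List

Block = List[List[float]]


def dct2(block: Block) -> Block:
    """Unnormalized 2D DCT-II of a square block (naive O(n^4), for illustration only)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                              * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            out[u][v] = total
    return out


def block_energy(block: Block) -> float:
    """Sum of absolute AC coefficients; the DC term at (0, 0) is skipped."""
    coeffs = dct2(block)
    n = len(block)
    return sum(abs(coeffs[u][v]) for u in range(n) for v in range(n) if (u, v) != (0, 0))


def spatial_feature(frame_blocks: List[Block]) -> float:
    """Average texture energy over the blocks of one frame."""
    return sum(block_energy(b) for b in frame_blocks) / len(frame_blocks)


def temporal_feature(prev_blocks: List[Block], cur_blocks: List[Block]) -> float:
    """Average absolute change in block energy between two consecutive frames."""
    diffs = [abs(block_energy(c) - block_energy(p)) for p, c in zip(prev_blocks, cur_blocks)]
    return sum(diffs) / len(diffs)


# Two hypothetical 4x4 luma blocks: a flat one and a checkerboard.
flat = [[128.0] * 4 for _ in range(4)]
checker = [[255.0 if (x + y) % 2 else 0.0 for y in range(4)] for x in range(4)]
```

A flat block has essentially no AC energy while the checkerboard has a lot, so `spatial_feature` rises with texture, and `temporal_feature` is zero when consecutive frames have identical block energies.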


The 13th ACM Multimedia Systems Conference (ACM MMSys 2022)

June 14–17, 2022 | Athlone, Ireland

Conference Website

Reza Shokri Kalan (Digiturk Company, Istanbul), Reza Farahani (Alpen-Adria-Universität Klagenfurt), Emre Karsli (Digiturk Company, Istanbul), Christian Timmerer (Alpen-Adria-Universität Klagenfurt), and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Over-the-Top (OTT) service providers need faster, cheaper, and Digital Rights Management (DRM)-capable video streaming solutions. Recently, HTTP Adaptive Streaming (HAS) has become the dominant video delivery technology over the Internet. In HAS, videos are split into short intervals called segments, and each segment is encoded at various qualities/bitrates (i.e., representations) to adapt to the available bandwidth. Utilizing different HAS-based technologies with various segment formats imposes extra cost, complexity, and latency on the video delivery system. Enabling an integrated format for transmitting and storing segments at Content Delivery Network (CDN) servers can alleviate these issues. To this end, the MPEG Common Media Application Format (CMAF) is presented as a standard format for cost-effective and low-latency streaming. However, CMAF has not yet been adopted by video streaming providers, and it is incompatible with most legacy end-user players. This paper presents some useful steps for achieving low-latency live video streaming that can be implemented for non-DRM-sensitive content before moving to CMAF technology. We first design and instantiate our testbed in a real OTT provider environment, including a heterogeneous network and clients, and then investigate the impact of changing the format, segment duration, and Digital Video Recording (DVR) window length on a real live event. The results illustrate that replacing the transport stream (.ts) format with fragmented MP4 (.fMP4) and shortening the segment duration reduce live latency significantly.

Keywords: HAS, DASH, HLS, CMAF, Live Streaming, Low Latency
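
The way a HAS client adapts to the available bandwidth can be illustrated with a simple throughput-based rule: for each segment, request the highest representation that fits within a fraction of the measured throughput. This is a generic sketch and not the player logic used in the paper's testbed; the function name `select_bitrate`, the 0.8 safety factor, and the bitrate values are assumptions.

```python
from typing import List


def select_bitrate(bitrates_kbps: List[int], throughput_kbps: float, safety: float = 0.8) -> int:
    """Pick the highest bitrate not exceeding a safety fraction of the measured throughput."""
    budget = throughput_kbps * safety
    affordable = [b for b in sorted(bitrates_kbps) if b <= budget]
    # Fall back to the lowest representation when nothing fits the budget.
    return affordable[-1] if affordable else min(bitrates_kbps)


ladder = [400, 800, 1600, 3200, 6000]  # hypothetical representations in kbps
```

For example, `select_bitrate(ladder, 2500)` gives a budget of 2000 kbps and therefore selects the 1600 kbps representation. Shorter segments let such a rule react to bandwidth changes more often, which is one reason segment duration matters for live latency.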



colocated with ACM Multimedia 2022

October, 2022, Lisbon, Portugal

Workshop Chairs:

  • Irene Viola, CWI, Netherlands
  • Hadi Amirpour, Klagenfurt University, Austria
  • Asim Hameed, NTNU, Norway
  • Maria Torres Vega, Ghent University, Belgium

Topics of interest include, but are not limited to:

  • Novel low-latency encoding techniques for interactive XR applications
  • Novel networking systems and protocols to enable interactive immersive applications, including optimizations ranging from hardware (e.g., millimeter-wave networks or optical wireless) and the physical and MAC layers up to the network, transport, and application layers (such as over-the-top protocols)
  • Significant advances and optimizations in 3D modeling pipelines for AR/VR visualization, accessible and inclusive GUIs, and interactive 3D models
  • Compression and delivery strategies for immersive media content, such as omnidirectional video, light fields, point clouds, and dynamic and time-varying meshes
  • Quality of Experience management of interactive immersive media applications
  • Novel rendering techniques to enhance the interactivity of XR applications
  • Application of interactive XR to different areas of society, such as health (e.g., virtual reality exposure therapy), industry (Industry 4.0), and XR e-learning (according to new global aims)

Dates:

  • Submission deadline: 20 June 2022, 23:59 AoE
  • Notifications of acceptance: 29 July 2022
  • Camera ready submission: 21 August 2022
  • Workshop: 10th or 14th October

ALIS’22: Artificial Intelligence for Live Video Streaming


colocated with ACM Multimedia 2022


October 2022, Lisbon, Portugal

Download ALIS’22 Poster/CfP


ACM Mile-High Video 2022 (MHV)

March 01-03, 2022 | Denver, CO, USA

Conference Website

After running as an independent event for several years, 2022 was the first year in which the Mile-High Video Conference (MHV) was organized by the ACM Special Interest Group on Multimedia (SIGMM). ACM MHV is a unique forum for participants from both industry and academia to present, share, and discuss innovations and best practices from multimedia content production to consumption.

This year, MHV hosted around 270 on-site participants and more than 2000 online participants from academia and industry. Five ATHENA members travelled to Denver, USA, to present two full papers and four short papers at MHV 2022.


Here is a list of full papers presented at MHV:

Super-resolution Based Bitrate Adaptation for HTTP Adaptive Streaming for Mobile Devices: The advancement of mobile hardware in recent years has made it possible to apply deep neural network (DNN)-based approaches on mobile devices. This paper introduces a lightweight super-resolution (SR) network deployed on mobile devices and a novel adaptive bitrate (ABR) algorithm that leverages SR networks at the client to improve the video quality. More …

Take the Red Pill for H3 and See How Deep the Rabbit Hole Goes: With the introduction of HTTP/3 (H3) and QUIC at its core, there is an expectation of significant improvements in Web-based secure object delivery. An important question is what H3 will bring to the table for such services. To answer this question, we present the new features of H3 and QUIC and compare them to those of HTTP/1.1 and HTTP/2 and TCP. More …

Here is a list of short papers presented at MHV:

  • RICHTER: hybrid P2P-CDN architecture for low latency live video streaming: RICHTER leverages existing works that have combined the characteristics of Peer-to-Peer (P2P) networks and CDN-based systems and introduces a hybrid CDN-P2P live streaming architecture. [PDF]

  • CAdViSE or how to find the sweet spots of ABR systems: CAdViSE provides a Cloud-based Adaptive Video Streaming Evaluation framework for the automated testing of adaptive media players. [PDF]

  • Video streaming using light-weight transcoding and in-network intelligence: LwTE reduces HTTP Adaptive Streaming (HAS) costs by enabling lightweight transcoding at the edge. [PDF]

  • Efficient bitrate ladder construction for live video streaming: This paper introduces an online bitrate ladder construction scheme for live video streaming applications using Discrete Cosine Transform (DCT)-energy-based low-complexity spatial and temporal features. [PDF]