News – Page 6 – ITEC Homepage

DORBINE project @ Der Standard

Farzad recently participated in an interview with the Austrian newspaper Der Standard. The conversation covered a range of topics, and the final article has now been published. You can find the full piece at the following link:

https://www.derstandard.at/story/3000000262214/forscher-aus-klagenfurt-inspizieren-windraeder-mit-drohnenschwaermen

March 24, 2025

Announcement, Publication

Paper accepted: ICME 2025: Neural Representations for Scalable Video Coding

Neural Representations for Scalable Video Coding

IEEE International Conference on Multimedia & Expo (ICME) 2025

Authors: Yiying Wei (AAU, Austria), Hadi Amirpour (AAU, Austria), and Christian Timmerer (AAU, Austria)

Abstract: Scalable video coding encodes a video stream into multiple layers so that it can be decoded at different levels of quality/resolution, depending on the device’s capabilities or the available network bandwidth. Recent advances in implicit neural representation (INR)-based video codecs have shown compression performance competitive with both traditional and other learning-based methods. In INR approaches, a neural network is trained to overfit a video sequence, and its parameters are compressed to create a compact representation of the video content. While they achieve promising results, existing INR-based codecs require training a separate network for each resolution/quality of a video, which makes them ill-suited to scalable compression. In this paper, we propose Neural representations for Scalable Video Coding (NSVC), which encodes multi-resolution/-quality videos into a single neural network comprising multiple layers. The base layer (BL) of the neural network encodes video streams with the lowest resolution/quality. Enhancement layers (ELs) encode additional information that can be used to reconstruct a higher resolution/quality video during decoding using the BL as a starting point. This multi-layered structure allows the scalable bitstream to be truncated to adapt to the client’s bandwidth conditions or computational decoding requirements. Experimental results show that NSVC outperforms AVC’s Scalable Video Coding (SVC) extension and surpasses HEVC’s scalable extension (SHVC) in terms of VMAF. Additionally, NSVC achieves comparable decoding speeds at high resolutions/qualities.


March 21, 2025

Announcement, Publication

Journal article accepted: VQM4HAS: A Real-time Quality Metric for HEVC Videos in HTTP Adaptive Streaming

We are glad that the paper was accepted for publication in IEEE Transactions on Multimedia.

Authors: Hadi Amirpour (AAU, AT), Jingwen Zhu (Nantes University, FR), Wei Zhu (Cardiff University, UK), Patrick Le Callet (Nantes University, FR), and Christian Timmerer (AAU, AT)

Abstract: In HTTP Adaptive Streaming (HAS), a video is encoded at various bitrate-resolution pairs, collectively known as the bitrate ladder, allowing users to select the most suitable representation based on their network conditions. Optimizing this set of pairs to enhance the Quality of Experience (QoE) requires accurately measuring the quality of these representations. VMAF and ITU-T’s P.1204.3 are highly reliable metrics for assessing the quality of representations in HAS. However, using these metrics for optimization is often impractical for live streaming applications due to their high computational costs and the large number of bitrate-resolution pairs in the bitrate ladder that need to be evaluated. To address their high complexity, our paper introduces a new method called VQM4HAS, which extracts low-complexity features including (i) video complexity features, (ii) frame-level encoding statistics logged during the encoding process, and (iii) lightweight video quality metrics. These extracted features are then fed into a regression model to predict VMAF and P.1204.3.

The VQM4HAS model is designed to operate on a per-bitrate-resolution-pair, per-resolution, and cross-representation basis, optimizing quality predictions across different HAS scenarios. Our experimental results demonstrate that VQM4HAS achieves a high correlation with VMAF and P.1204.3, with Pearson correlation coefficients (PCC) ranging from 0.95 to 0.96 for VMAF and 0.97 to 0.99 for P.1204.3, depending on the resolution. Despite this high correlation, VQM4HAS is significantly less complex than both metrics, reducing complexity by 98% and 99% relative to VMAF and P.1204.3, respectively, which makes it suitable for live streaming scenarios.
We also conduct a feature importance analysis to further reduce the complexity of the proposed method. Furthermore, we evaluate the effectiveness of our method by using it to predict subjective quality scores. The results show that VQM4HAS achieves a higher correlation with subjective scores at various resolutions, despite its minimal complexity.
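The Pearson correlation coefficient (PCC) quoted in the abstract can be illustrated with a minimal sketch. The score lists below are made-up placeholders, not results from the paper, and the function name `pearson` is our own; the sketch only shows how agreement between a reference metric and a predicted score would be measured.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: reference metric scores and model predictions
# for five representations (assumed data, for illustration only).
reference_scores = [35.0, 52.0, 68.0, 81.0, 93.0]
predicted_scores = [38.0, 50.0, 70.0, 79.0, 95.0]

pcc = pearson(reference_scores, predicted_scores)
print(f"PCC = {pcc:.3f}")
```

A PCC close to 1 means the predictions follow the reference metric almost linearly, which is the sense in which the abstract reports values between 0.95 and 0.99.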

March 17, 2025

Announcement, Publication

Papers accepted @ Intel4EC Workshop 2025

The following papers have been accepted at the Intel4EC Workshop 2025, which will be held on June 4, 2025, in Milan, Italy, in conjunction with the 39th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2025).

Title: 6G Infrastructures for Edge AI: An Analytical Perspective

Authors: Kurt Horvath, Shpresa Tuda*, Blerta Idrizi*, Stojan Kitanov*, Fisnik Doko*, Dragi Kimovski (*Mother Teresa University Skopje, North Macedonia)

Abstract: The convergence of Artificial Intelligence (AI) and the Internet of Things has accelerated the development of distributed, network-sensitive applications, necessitating ultra-low latency, high throughput, and real-time processing capabilities. While 5G networks represent a significant technological milestone, their ability to support AI-driven edge applications remains constrained by performance gaps observed in real-world deployments. This paper addresses these limitations and highlights critical advancements needed to realize a robust and scalable 6G ecosystem optimized for AI applications. Furthermore, we conduct an empirical evaluation of 5G network infrastructure in Central Europe, with latency measurements ranging from 61 ms to 110 ms across geographically close areas. These values exceed the requirements of latency-critical AI applications by approximately 270%, revealing significant shortcomings in current deployments. Building on these findings, we propose a set of recommendations to bridge the gap between existing 5G performance and the requirements of next-generation AI applications.

Title: Blockchain consensus mechanisms for democratic voting environments

Authors: Thomas Auer, Kurt Horvath, Dragi Kimovski

Abstract: Democracy relies on robust voting systems to ensure transparency, fairness, and trust in electoral processes. Despite its foundational role, voting mechanisms – both manual and electronic – remain vulnerable to threats such as vote manipulation, data loss, and administrative interference. These vulnerabilities highlight the need for secure, scalable, and cost-efficient alternatives to safeguard electoral integrity. The fully decentralized voting system leverages blockchain technology to overcome critical challenges in modern voting systems, including scalability, cost-efficiency, and transaction throughput. By eliminating the need for a centralized authority, the system ensures transparency, security, and real-time monitoring by integrating Distributed Ledger Technologies. This novel architecture reduces operational costs, enhances voter anonymity, and improves scalability, achieving significantly lower costs for 1,000 votes than traditional voting methods.

The system introduces a formalized decentralized voting model that adheres to constitutional requirements and practical standards, making it suitable for implementation in direct and representative democracies. Additionally, the design accommodates high transaction volumes without compromising performance, ensuring reliable operation even in large-scale elections. The results demonstrate that this system outperforms classical approaches regarding efficiency, security, and affordability, paving the way for broader adoption of blockchain-based voting solutions.

March 17, 2025

Announcement, Publication

Accepted tutorial: Serverless Orchestration on the Edge-Cloud Continuum: From Small Functions to Large Language Models

We are happy to announce that our tutorial “Serverless Orchestration on the Edge-Cloud Continuum: From Small Functions to Large Language Models” (by Reza Farahani and Radu Prodan) has been accepted for IEEE ICDCS 2025, which will take place in Glasgow, Scotland, UK, in July 2025.

Venue: 45th IEEE International Conference on Distributed Computing Systems (ICDCS) (https://icdcs2025.icdcs.org/)

Abstract: Serverless computing simplifies application development by abstracting infrastructure management, allowing developers to focus on functionality while cloud providers handle resource provisioning and scaling. However, orchestrating serverless workloads across the edge-cloud continuum presents challenges, from managing heterogeneous resources to ensuring low-latency execution and maintaining fault tolerance and scalability. These challenges intensify when scaling from lightweight functions to compute-intensive tasks such as large language model (LLM) inferences in distributed environments. This tutorial explores serverless computing’s evolution from small functions to large-scale AI workloads. It introduces foundational concepts like Function-as-a-Service (FaaS) and Backend-as-a-Service (BaaS) before covering advanced edge-cloud orchestration strategies. Topics include dynamic workload distribution, multi-objective scheduling, energy-efficient orchestration, and deploying functions with diverse computational requirements. Hands-on demonstrations with Kubernetes, GCP Functions, AWS Lambda, OpenFaaS, OpenWhisk, and monitoring tools provide participants with practical insights into optimizing performance and energy efficiency in serverless orchestration across distributed infrastructures.

March 11, 2025

Announcement, MMC

The 2nd ACM MM Workshop on Multimedia Computing for Health and Medicine

Co-located with ACM Multimedia 2025

URL: https://weizhou-geek.github.io/workshop/MM2025.html

In health and medicine, an immense amount of data is being generated by distributed sensors and cameras, as well as multimodal digital health platforms that support multimedia, such as audio, video, image, 3D geometry, and text. The availability of such multimedia data from medical devices and digital record systems has greatly increased the potential for automated diagnosis. The past several years have witnessed an explosion of interest, and a dizzyingly fast development, in computer-aided medical investigations using MRI, CT, X-rays, images, point clouds, etc. This proposed workshop focuses on various multimedia computing techniques (including mobile solutions and hardware solutions) for health and medicine, which targets real-world data/problems in healthcare, involves a large number of stakeholders, and is closely connected with people’s health.

March 7, 2025

MMC, Publication

Tutorial accepted: Perceptually Inspired Visual Quality Assessment in Multimedia Communication

ACM MM’25 Tutorial: Perceptually Inspired Visual Quality Assessment in Multimedia Communication

ACM MM 2025, October 27, 2025, Dublin, Ireland

https://acmmm2025.org/tutorial/

Tutorial speakers:

Wei Zhou (Cardiff University)
Hadi Amirpour (University of Klagenfurt)

Tutorial description:

As multimedia services like video streaming, video conferencing, virtual reality (VR), and online gaming continue to expand, ensuring high perceptual quality becomes a priority for maintaining user satisfaction and competitiveness. However, during acquisition, compression, transmission, and storage, multimedia content undergoes various distortions, causing degradation in experienced quality. Thus, perceptual quality assessment, which focuses on evaluating the quality of multimedia content based on human perception, is essential for optimizing user experiences in advanced communication systems. Several challenges are involved in the quality assessment process, including diverse characteristics of multimedia content such as image, video, VR, point cloud, mesh, multimodality, etc., and complex distortion scenarios as well as viewing conditions. The tutorial first presents a detailed overview of principles and methods for perceptually inspired visual quality assessment. This includes both subjective methods, where users directly rate their experience, and objective methods, where algorithms predict human perception based on measurable factors such as bitrate, frame rate, and compression levels. Based on the basics of perceptually inspired visual quality assessment, metrics for different multimedia data are then introduced. Apart from the traditional image and video, immersive multimedia and AI-generated content will also be involved.

March 7, 2025

MMC, Publication

Journal paper accepted: ACM TOMM: Convex Hull Prediction Methods for Bitrate Ladder Construction: Design, Evaluation, and Comparison

URL: https://dl.acm.org/journal/tomm

Authors: Ahmed Telili (INSA, Rennes, France), Wassim Hamidouche (INSA, Rennes, France), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Sid Ahmed Fezza (INPTIC, Algeria), Christian Timmerer (Alpen-Adria-Universität Klagenfurt), and Luce Morin (INSA, Rennes, France)

Abstract:
HTTP adaptive streaming (HAS) has emerged as a prevalent approach for over-the-top (OTT) video streaming services due to its ability to deliver a seamless user experience. A fundamental component of HAS is the bitrate ladder, which comprises a set of encoding parameters (e.g., bitrate-resolution pairs) used to encode the source video into multiple representations. This adaptive bitrate ladder enables the client’s video player to dynamically adjust the quality of the video stream in real-time based on fluctuations in network conditions, ensuring uninterrupted playback by selecting the most suitable representation for the available bandwidth. The most straightforward approach involves using a fixed bitrate ladder for all videos, consisting of pre-determined bitrate-resolution pairs known as one-size-fits-all. Conversely, the most reliable technique relies on intensively encoding all resolutions over a wide range of bitrates to build the convex hull, thereby optimizing the bitrate ladder by selecting the representations from the convex hull for each specific video. Several techniques have been proposed to predict content-based ladders without performing a costly, exhaustive search encoding. This paper provides a comprehensive review of various convex hull prediction methods, including both conventional and learning-based approaches. Furthermore, we conduct a benchmark study of several handcrafted- and deep learning (DL)-based approaches for predicting content-optimized convex hulls across multiple codec settings. The considered methods are evaluated on our proposed large-scale dataset, which includes 300 UHD video shots encoded with software and hardware encoders using three state-of-the-art video standards, including AVC/H.264, HEVC/H.265, and VVC/H.266, at various bitrate points. Our analysis provides valuable insights and establishes baseline performance for future research in this field.
Dataset URL: https://nasext-vaader.insa-rennes.fr/ietr-vaader/datasets/br_ladder
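The exhaustive convex-hull construction described in the abstract can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: the function names, the (bitrate, quality, resolution) tuple layout, and all sample values are hypothetical, and the sketch uses one simple reading of "convex hull", namely the upper convex hull of all rate-quality points across resolutions.

```python
def cross(o, a, b):
    """2D cross product of OA and OB on the (bitrate, quality) axes."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_ladder(points):
    """Select ladder rungs from (bitrate_kbps, quality, resolution) encodes.

    Keeps the best-quality encode per bitrate, then builds the upper
    convex hull with a monotone-chain scan over increasing bitrate.
    """
    best = {}
    for bitrate, quality, resolution in points:
        if bitrate not in best or quality > best[bitrate][1]:
            best[bitrate] = (bitrate, quality, resolution)

    hull = []
    for p in sorted(best.values()):
        # Drop the previous rung while it lies on or below the chord
        # joining its neighbours, i.e. it is not on the upper hull.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    # Discard rungs past the quality peak: more bits, lower quality.
    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[: peak + 1]


# Hypothetical trial encodes of one video (made-up quality scores).
encodes = [
    (500, 40, "540p"), (500, 35, "1080p"),
    (1000, 55, "540p"), (1000, 52, "1080p"),
    (2000, 62, "540p"), (2000, 72, "1080p"),
    (3000, 76, "1080p"),
    (4000, 65, "540p"), (4000, 85, "1080p"),
    (8000, 92, "1080p"),
]

ladder = convex_hull_ladder(encodes)
for bitrate, quality, resolution in ladder:
    print(f"{bitrate:>5} kbps  {resolution:>6}  quality {quality}")
```

In this toy data the lower resolution wins at low bitrates and the higher resolution takes over from 2000 kbps, while the 3000 kbps encode is dropped because it falls below the line joining its neighbours. The prediction methods surveyed in the paper aim to estimate such a hull without running every trial encode.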

March 7, 2025

Announcement, Publication

Paper accepted: End-to-End Learning-based Video Streaming Enhancement Pipeline: A Generative AI Approach

Authors: Emanuele Artioli (Alpen-Adria Universität Klagenfurt, Austria), Farzad Tashtarian (Alpen-Adria Universität Klagenfurt, Austria), Christian Timmerer (Alpen-Adria Universität Klagenfurt, Austria)

Venue: ACM 35th Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV’25)

Abstract: The primary challenge of video streaming is to balance high video quality with smooth playback. Traditional codecs are well tuned for this trade-off, yet their inability to use context means they must encode the entire video data and transmit it to the client.
This paper introduces ELVIS (End-to-end Learning-based VIdeo Streaming Enhancement Pipeline), an end-to-end architecture that combines server-side encoding optimizations with client-side generative in-painting to remove and reconstruct redundant video data. Its modular design allows ELVIS to integrate different codecs, in-painting models, and quality metrics, making it adaptable to future innovations.
Our results show that current technologies achieve improvements of up to 11 VMAF points over baseline benchmarks, though challenges remain for real-time applications due to computational demands. ELVIS represents a foundational step toward incorporating generative AI into video streaming pipelines, enabling higher quality experiences without increased bandwidth requirements.
By leveraging generative AI, we aim to develop a client-side tool, to incorporate in a dedicated video streaming player, that combines the accessibility of multilingual dubbing with the authenticity of the original speaker’s performance, effectively allowing a single actor to deliver their voice in any language. To the best of our knowledge, no current streaming system can capture the speaker’s unique voice or emotional tone.

March 4, 2025

Announcement, Publication

Journal article accepted: Energy-Time Modeling of Distributed Multi-Population Genetic Algorithms with Dynamic Workload in HPC Clusters

We are glad that the paper was accepted for publication in Future Generation Computer Systems. This journal publishes cutting-edge research on high-performance computing, distributed systems, and advanced computing technologies for future computing environments.

Authors: Juan José Escobar, Pablo Sánchez-Cuevas, Beatriz Prieto, Rukiye Savran Kızıltepe, Fernando Díaz-del-Río, Dragi Kimovski

Abstract: Time and energy efficiency is a highly relevant objective in high-performance computing systems, where executing tasks is costly. Among these tasks, evolutionary algorithms are of particular interest due to their inherent parallel scalability and usually costly fitness evaluation functions. In this respect, several scheduling strategies for workload balancing in heterogeneous systems have been proposed in the literature, with runtime and energy consumption reduction as their goals. Our hypothesis is that a dynamic workload distribution can be fitted with greater precision using metaheuristics, such as genetic algorithms, instead of linear regression. Therefore, this paper proposes a new mathematical model to predict the energy-time behaviour of applications based on multi-population genetic algorithms, which dynamically distribute the evaluation of individuals among the CPU-GPU devices of heterogeneous clusters. An accurate predictor would save time and energy by selecting the best resource set before running such applications. The estimation of the workload distributed to each device has been carried out by simulation, while the model parameters have been fitted in a two-phase run using another genetic algorithm and the experimental energy-time values of the target application as input. When the new model is analysed and compared with another based on linear regression, the one proposed in this work improves significantly on the baseline approach, showing normalised prediction errors of 0.081 for runtime and 0.091 for energy consumption, compared to 0.213 and 0.256 for the baseline approach.

March 4, 2025