Dr. Reza Farahani presented a 3-hour tutorial titled “Serverless Orchestration on the Edge-Cloud Continuum: From Small Functions to Large Language Models” at the 45th IEEE International Conference on Distributed Computing Systems (ICDCS) on 20 July 2025.

Abstract: Serverless computing simplifies application development by abstracting infrastructure management, allowing developers to focus on functionality while cloud providers handle resource provisioning and scaling. However, orchestrating serverless workloads across the edge-cloud continuum presents challenges, from managing heterogeneous resources to ensuring low-latency execution and maintaining fault tolerance and scalability. These challenges intensify when scaling from lightweight functions to compute-intensive tasks such as large language model (LLM) inferences in distributed environments. This tutorial explores serverless computing’s evolution from small functions to large-scale AI workloads. It introduces foundational concepts like Function-as-a-Service (FaaS) and Backend-as-a-Service (BaaS) before covering advanced edge-cloud orchestration strategies. Topics include dynamic workload distribution, multi-objective scheduling, energy-efficient orchestration, and deploying functions with diverse computational requirements. Hands-on demonstrations with Kubernetes, GCP Functions, AWS Lambda, OpenFaaS, OpenWhisk, and monitoring tools provide participants with practical insights into optimizing performance and energy efficiency in serverless orchestration across distributed infrastructures.
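
To make the FaaS model concrete, here is a minimal sketch of a function handler in the style of the OpenFaaS python3 template (a handle(req) entry point); the payload fields and timing logic are illustrative assumptions and are not taken from the tutorial's demo material.

```python
# Minimal OpenFaaS-style handler (python3 template): the entry point is a
# handle(req) function that receives the request body as a string.
# Illustrative only; not the tutorial's actual demo code.
import json
import time

def handle(req: str) -> str:
    """Echo the payload together with a simple timing measurement,
    mimicking a lightweight function deployed on the edge-cloud continuum."""
    start = time.perf_counter()
    try:
        payload = json.loads(req) if req else {}
    except json.JSONDecodeError:
        payload = {"raw": req}
    elapsed_ms = (time.perf_counter() - start) * 1000
    return json.dumps({"input": payload, "processing_ms": round(elapsed_ms, 3)})
```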

GenStream: Semantic Streaming Framework for Generative Reconstruction of Human-centric Media

ACM Multimedia 2025

October 27 – October 31, 2025

Dublin, Ireland

[PDF]

Emanuele Artioli (AAU, Austria), Daniele Lorenzi (AAU, Austria), Shivi Vats (AAU, Austria), Farzad Tashtarian (AAU, Austria), Christian Timmerer (AAU, Austria)

Abstract: Video streaming dominates global internet traffic, yet conventional pipelines remain inefficient for structured, human-centric content such as sports, performance, or interactive media. Standard codecs re-encode entire frames, foreground and background alike, treating all pixels uniformly and ignoring the semantic structure of the scene. This leads to significant bandwidth waste, particularly in scenarios where backgrounds are static and motion is constrained to a few salient actors. We introduce GenStream, a semantic streaming framework that replaces dense video frames with compact, structured metadata. Instead of transmitting pixels, GenStream encodes each scene as a combination of skeletal keypoints, camera viewpoint parameters, and a static 3D background model. These elements are transmitted to the client, where a generative model reconstructs photorealistic human figures and composites them into the 3D scene from the original viewpoint. This paradigm enables extreme compression, achieving over 99.9% bandwidth reduction compared to HEVC. We partially validate GenStream on Olympic figure skating footage and demonstrate its potential for high perceptual fidelity from minimal data. Looking forward, GenStream opens new directions in volumetric avatar synthesis, canonical 3D actor fusion across views, personalized and immersive viewing experiences at arbitrary viewpoints, and lightweight scene reconstruction, laying the groundwork for scalable, intelligent streaming in the post-codec era.
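
For illustration only, the sketch below shows what a per-frame GenStream-style payload could look like under the abstract's description (skeletal keypoints, camera viewpoint parameters, and a reference to a cached static 3D background); all class and field names are assumptions, not the framework's actual schema.

```python
# Hypothetical sketch of a GenStream-style per-frame payload: instead of pixels,
# the server transmits skeletal keypoints, camera parameters, and a reference
# to a static 3D background already cached on the client.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ActorPose:
    actor_id: int
    keypoints_3d: List[Tuple[float, float, float]]  # e.g., a few dozen joints per actor

@dataclass
class CameraParams:
    position: Tuple[float, float, float]
    rotation: Tuple[float, float, float, float]  # quaternion
    focal_length_px: float

@dataclass
class GenStreamFrame:
    timestamp_ms: int
    background_id: str          # static 3D background model cached client-side
    camera: CameraParams
    actors: List[ActorPose] = field(default_factory=list)

def payload_size_bytes(frame: GenStreamFrame) -> int:
    """Rough size estimate: 12 bytes per 3D keypoint plus a fixed camera/header
    overhead, illustrating why such metadata is far smaller than pixel data."""
    n_keypoints = sum(len(a.keypoints_3d) for a in frame.actors)
    return 64 + n_keypoints * 12
```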

Receiving Kernel-Level Insights via eBPF: Can ABR Algorithms Adapt Smarter?

Würzburg Workshop on Next-Generation Communication Networks (WueWoWAS) 2025

6 – 8 Oct 2025, Würzburg, Germany

[PDF]

Mohsen Ghasemi (Sharif University of Technology, Iran); Daniele Lorenzi (Alpen-Adria-Universität Klagenfurt, Austria); Mahdi Dolati (Sharif University of Technology, Iran); Farzad Tashtarian (Alpen-Adria-Universität Klagenfurt, Austria); Sergey Gorinsky (IMDEA Networks Institute, Spain); Christian Timmerer (Alpen-Adria-Universität Klagenfurt & Bitmovin, Austria)

Abstract: The rapid rise of video streaming services such as Netflix and YouTube has made video delivery the largest driver of global Internet traffic, including traffic over mobile networks such as 5G and the upcoming 6G. To maintain playback quality, client devices employ Adaptive Bitrate (ABR) algorithms that adjust video quality based on metrics like available bandwidth and buffer occupancy. However, these algorithms often react slowly to sudden bandwidth fluctuations due to limited visibility into network conditions, leading to stall events that significantly degrade the user’s Quality of Experience (QoE). In this work, we introduce CaBR, a Congestion-aware adaptive BitRate decision module designed to operate on top of existing ABR algorithms. CaBR enhances video streaming performance by leveraging real-time, in-kernel network telemetry collected via the extended Berkeley Packet Filter (eBPF). By utilizing congestion metrics such as queue lengths observed at network switches, CaBR refines the bitrate selection of the underlying ABR algorithms for upcoming segments, enabling faster adaptation to changing network conditions. Our evaluation shows that CaBR significantly reduces playback stalls and improves QoE by up to 25% compared to state-of-the-art approaches in a congested environment.
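
As a rough illustration of the idea (not the authors' implementation), the following sketch shows how a congestion-aware module could cap an underlying ABR decision when queue-occupancy telemetry exported via eBPF signals congestion; the threshold and the one-rung step-down policy are assumptions.

```python
# Illustrative sketch of a CaBR-like refinement step: cap the ABR algorithm's
# bitrate choice when in-kernel telemetry (e.g., switch queue occupancy) reports
# congestion. Threshold and step-down policy are made up for illustration.
from typing import Sequence

def refine_bitrate(abr_choice_kbps: int,
                   ladder_kbps: Sequence[int],
                   queue_occupancy: float,
                   high_watermark: float = 0.7) -> int:
    """Step the selected bitrate down one rung when queue occupancy (0.0-1.0)
    indicates congestion; otherwise keep the underlying ABR decision."""
    rungs = sorted(ladder_kbps)
    idx = rungs.index(abr_choice_kbps) if abr_choice_kbps in rungs else 0
    if queue_occupancy > high_watermark:
        # Congested: be conservative and drop one rung to avoid an imminent stall.
        idx = max(0, idx - 1)
    return rungs[idx]

# Example: the ABR picked 4500 kbps, but queues are 85% full -> fall back to 3000 kbps.
print(refine_bitrate(4500, [1000, 3000, 4500, 6000], queue_occupancy=0.85))
```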

On Thursday, July 30, 2025, Daniele Lorenzi successfully defended his PhD thesis (QoE- and Energy-aware Content Consumption for HTTP Adaptive Streaming) under the supervision of Prof. Hermann Hellwagner and Prof. Christian Timmerer. The defense was chaired by Assoc.-Prof. DI Dr. Klaus Schöffmann, and the examiners were Assoc.-Prof. Luca De Cicco and Dr.-Ing. habil. Christian Herglotz.

We are pleased to congratulate Dr. Daniele Lorenzi on successfully passing his Ph.D. examination!

Paper Title: STEP-MR: A Subjective Testing and Eye-Tracking Platform for Dynamic Point Clouds in Mixed Reality

Conference Details: EuroXR 2025; Sep 03 – Sep 05, 2025; Winterthur, Switzerland

Authors: Shivi Vats (AAU, Austria), Christian Timmerer (AAU, Austria), Hermann Hellwagner (AAU, Austria)

Abstract: The use of point cloud (PC) streaming in mixed reality (MR) environments is of particular interest due to the immersiveness and the six degrees of freedom (6DoF) provided by the 3D content. However, this immersiveness requires significant bandwidth. Innovative solutions have been developed to address these challenges, such as PC compression and/or spatially tiling the PC to stream different portions at different quality levels. This paper presents a brief overview of a Subjective Testing and Eye-tracking Platform for dynamic point clouds in Mixed Reality (STEP-MR) for the Microsoft HoloLens 2. STEP-MR was used to conduct subjective tests (described in another work) with 41 participants, yielding over 2000 responses and more than 150 visual attention maps, the results of which can be used, among other things, to improve the dynamic (animated) point cloud streaming solutions mentioned above. Building on our previous platform, the new version now enables eye-tracking tests, including calibration and heatmap generation. Additionally, STEP-MR features modifications to the subjective tests’ functionality, such as a new rating scale and adaptability to participant movement during the tests, along with other user experience changes.
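
As a loose illustration of the heatmap-generation step, the sketch below aggregates normalized gaze samples into a visual attention map with NumPy; STEP-MR itself runs on the HoloLens 2 in Unity, so the function and its parameters are assumptions rather than platform code.

```python
# Illustrative sketch: turn eye-tracking fixation samples into a visual attention
# map (heatmap) by binning normalized gaze positions and smoothing with a Gaussian.
import numpy as np

def attention_map(gaze_xy: np.ndarray, width: int, height: int, sigma_px: float = 15.0) -> np.ndarray:
    """Accumulate gaze samples (x, y in [0, 1]) into a 2D histogram and smooth it
    with a separable Gaussian to obtain a normalized heatmap of shape (height, width)."""
    hist, _, _ = np.histogram2d(gaze_xy[:, 1] * height, gaze_xy[:, 0] * width,
                                bins=[height, width], range=[[0, height], [0, width]])
    radius = int(3 * sigma_px)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma_px ** 2))
    kernel /= kernel.sum()
    # Separable blur: convolve each row, then each column, with the 1D kernel.
    blurred = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, hist)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, blurred)
    return blurred / blurred.max() if blurred.max() > 0 else blurred
```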

Content-adaptive encoder preset prediction for adaptive live streaming

US Patent

[PDF]

Vignesh Menon (Alpen-Adria-Universität Klagenfurt, Austria), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt, Austria), and Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Austria)


Abstract: Techniques for content-adaptive encoder preset prediction for adaptive live streaming are described herein. A method for content-adaptive encoder preset prediction for adaptive live streaming includes performing video complexity feature extraction on a video segment to extract complexity features such as an average texture energy, an average temporal energy, and an average luminance. These inputs may be provided to an encoding time prediction model, along with a bitrate ladder, a resolution set, a target video encoding speed, and a number of CPU threads for the video segment, to predict an encoding time, and an optimized encoding preset may be selected for the video segment by a preset selection function using the predicted encoding time. The video segment may be encoded according to the optimized encoding preset.
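
A minimal sketch of the preset-selection idea, assuming a per-preset encoding-time prediction is already available: pick the slowest (highest-quality) preset whose predicted time still fits within the live segment duration. The preset names, numbers, and selection rule below are illustrative and are not the claimed method.

```python
# Hedged sketch of the preset-selection step: an encoding-time prediction model
# estimates how long a segment would take to encode with each preset, and the
# selection function picks the slowest preset that still meets the live deadline.
from typing import Dict, Sequence

def select_preset(predicted_time_s: Dict[str, float],
                  presets_slow_to_fast: Sequence[str],
                  segment_duration_s: float) -> str:
    """Return the slowest preset whose predicted encoding time fits the segment
    duration (the hard deadline for live streaming); fall back to the fastest."""
    for preset in presets_slow_to_fast:
        if predicted_time_s[preset] <= segment_duration_s:
            return preset
    return presets_slow_to_fast[-1]

# Example with made-up predictions for a 2-second live segment:
times = {"slow": 3.1, "medium": 1.9, "fast": 1.1, "ultrafast": 0.6}
print(select_preset(times, ["slow", "medium", "fast", "ultrafast"], 2.0))  # -> "medium"
```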

SDART: Spatial Dart AR Simulation with Hand-Tracked Input

ACM Multimedia 2025

October 27 – October 31, 2025

Dublin, Ireland

[PDF]

Milad Ghanbari (AAU, Austria), Wei Zhou (Cardiff, UK), Cosmin Stejerean (Meta, US), Christian Timmerer (AAU, Austria), Hadi Amirpour (AAU, Austria)

Abstract: We present a physics-driven 3D dart-throwing interaction system for Apple Vision Pro (AVP), developed using the Unity 6 engine and running in augmented reality (AR) mode on the device. The system utilizes the PolySpatial and Apple ARKit software development kits (SDKs) for hand input and tracking, allowing users to intuitively spawn, grab, and throw virtual darts much like real ones. The application benefits from physics simulations alongside the innovative no-controller input system of AVP to manipulate objects realistically in an unbounded spatial volume. By implementing spatial distance measurement, scoring logic, and user performance recording, this project enables user studies on the quality of experience in interactive AR applications. To evaluate the perceived quality and realism of the interaction, we conducted a subjective study with 10 participants using a structured questionnaire. The study measured various aspects of the user experience, including visual and spatial realism, control fidelity, depth perception, immersiveness, and enjoyment. Results indicate high mean opinion scores (MOS) across key dimensions. Link to video: Link
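
For illustration, the snippet below sketches the kind of spatial distance measurement and scoring logic the abstract mentions; the ring radii and point values are made up and do not reflect the application's actual parameters.

```python
# Illustrative sketch (hypothetical radii and scores): map a dart's landing point
# to a score based on its distance from the board centre on the board plane.
import math

# Ring boundaries in metres from the board centre -> awarded points (made-up numbers).
RINGS = [(0.02, 50), (0.05, 25), (0.10, 15), (0.17, 10), (0.226, 5)]

def score_throw(hit_x: float, hit_y: float, centre_x: float, centre_y: float) -> int:
    """Return the score for a dart landing at (hit_x, hit_y)."""
    distance = math.hypot(hit_x - centre_x, hit_y - centre_y)
    for radius, points in RINGS:
        if distance <= radius:
            return points
    return 0  # Missed the board entirely.
```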


VQualA 2025 Challenge on Image Super-Resolution Generated Content Quality Assessment: Methods and Results

ICCV VQualA 2025

October 19 – October 23, 2025

Hawai’i, USA

[PDF]

Hadi Amirpour (AAU, Austria), et al.

Abstract: This paper presents the ISRGC-Q Challenge, built upon the Image Super-Resolution Generated Content Quality Assessment (ISRGen-QA) dataset, and organized as part of the Visual Quality Assessment (VQualA) Competition at the ICCV 2025 Workshops. Unlike existing Super-Resolution Image Quality Assessment (SR-IQA) datasets, ISRGen-QA places greater emphasis on SR images generated by the latest generative approaches, including Generative Adversarial Networks (GANs) and diffusion models. The primary goal of this challenge is to analyze the unique artifacts introduced by modern super-resolution techniques and to evaluate their perceptual quality effectively. A total of 108 participants registered for the challenge, with 4 teams submitting valid solutions and fact sheets for the final testing phase. These submissions demonstrated state-of-the-art (SOTA) performance on the ISRGen-QA dataset. The project is publicly available at: https://github.com/Lighting-YXLI/ISRGen-QA.

VQualA 2025 Challenge on Face Image Quality Assessment: Methods and Results

ICCV VQualA 2025

October 19 – October 23, 2025

Hawai’i, USA

[PDF]

MohammadAli Hamidi (University of Cagliari, Italy), Hadi Amirpour (AAU, Austria), et al.

Abstract: Face images have become integral to various applications, but real-world capture conditions often lead to degradations such as noise, blur, compression artifacts, and poor lighting. These degradations negatively impact image quality and downstream tasks. To promote advancements in face image quality assessment (FIQA), we introduce the VQualA 2025 Challenge on Face Image Quality Assessment, part of the ICCV 2025 Workshops. Participants developed efficient models (≤0.5 GFLOPs, ≤5M parameters) predicting Mean Opinion Scores (MOS) under realistic degradations. Submissions were rigorously evaluated using objective metrics and human perceptual judgments. The challenge attracted 127 participants, resulting in 1519 valid final submissions. Detailed methodologies and results are presented, contributing to practical FIQA solutions.


A Lightweight Ensemble-Based Face Image Quality Assessment Method with Correlation-Aware Loss

ICCV VQualA 2025

October 19 – October 23, 2025

Hawai’i, USA

 

MohammadAli Hamidi (University of Cagliari, Italy), Hadi Amirpour (AAU, Austria), Luigi Atzori (University of Cagliari, Italy), Christian Timmerer (AAU, Austria)

Abstract: Face image quality assessment (FIQA) plays a critical role in face recognition and verification systems, especially in uncontrolled, real-world environments. Although several methods have been proposed, general-purpose no-reference image quality assessment techniques often fail to capture face-specific degradations. Meanwhile, state-of-the-art FIQA models tend to be computationally intensive, limiting their practical applicability. We propose a lightweight and efficient method for FIQA, designed for the perceptual evaluation of face images in the wild. Our approach integrates an ensemble of two compact convolutional neural networks, MobileNetV3-Small and ShuffleNetV2, with prediction-level fusion via simple averaging. To enhance alignment with human perceptual judgments, we employ a correlation-aware loss (MSECorrLoss), combining mean squared error (MSE) with a Pearson correlation regularizer. Our method achieves a strong balance between accuracy and computational cost, making it suitable for real-world deployment. Experiments on the VQualA FIQA benchmark demonstrate that our model achieves a Spearman rank correlation coefficient (SRCC) of 0.9829 and a Pearson linear correlation coefficient (PLCC) of 0.9894, remaining within competition efficiency constraints.
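
The sketch below illustrates the two ingredients named in the abstract, prediction-level fusion by averaging and an MSE-plus-Pearson correlation loss, in plain PyTorch; the weighting factor and module interfaces are assumptions for illustration, not the authors' code.

```python
# Sketch of a correlation-aware loss (MSE + Pearson regularizer) and prediction-level
# fusion by averaging, assuming two backbone regressors that each output one MOS score.
import torch
import torch.nn as nn

class MSECorrLoss(nn.Module):
    """MSE plus a Pearson-correlation regularizer: low when predictions both match
    the MOS values and co-vary with them across the batch."""
    def __init__(self, corr_weight: float = 1.0):
        super().__init__()
        self.mse = nn.MSELoss()
        self.corr_weight = corr_weight

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        pred, target = pred.flatten(), target.flatten()
        vp, vt = pred - pred.mean(), target - target.mean()
        pearson = (vp * vt).sum() / (vp.norm() * vt.norm() + 1e-8)
        return self.mse(pred, target) + self.corr_weight * (1.0 - pearson)

def fused_prediction(mobilenet_head: nn.Module, shufflenet_head: nn.Module,
                     images: torch.Tensor) -> torch.Tensor:
    """Prediction-level fusion via simple averaging of the two backbones' scores."""
    return 0.5 * (mobilenet_head(images) + shufflenet_head(images))
```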