
Depth-Enabled Inspection of Medical Videos

ACM Multimedia 2025

October 27 – October 31, 2025

Dublin, Ireland

Hadi Amirpour (AAU, Austria), Doris Putzgruber-Adamitsch (AAU, Austria), Yosuf El-Shabrawi (Kabeg, Austria), Klaus Schoeffmann (AAU, Austria)

Abstract: Cataract surgery is the most frequently performed surgical procedure worldwide, involving the replacement of a patient’s clouded eye lens with a synthetic intraocular lens to restore visual acuity. Although typically brief, the operation consists of distinct phases that demand precision and extensive training, traditionally constrained by the limitations of real-time observation under a microscope. To enhance learning and procedural accuracy, modern advancements in stereoscopic video capture and head-mounted displays (HMDs) offer a promising solution. This paper demonstrates the application of stereoscopic cataract surgery videos, visualized through Apple Vision Pro (AVP) and Meta Quest 3, to provide immersive 3D perspectives that enhance depth perception and spatial awareness. An expert evaluation study with experienced surgeons indicates that stereoscopic visualization significantly improves comprehension of spatial relationships and procedural maneuvers, suggesting its potential to revolutionize surgical education and real-time guidance in ophthalmic surgery. Demo video: Link
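For context, the core mechanism behind such playback is simple: the two microscope views are packed into a single video frame and split back into per-eye views on the headset. Below is a minimal sketch of side-by-side frame packing; the resolution, layout, and function names are illustrative assumptions, not the demo's actual capture or player pipeline.

```python
# Hypothetical sketch of side-by-side stereo frame packing for HMD playback.
# The demo's actual pipeline is not published here; this only illustrates the
# frame format that stereoscopic players commonly consume.
import numpy as np

def pack_side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Pack two (H, W, 3) eye views into a single (H, 2W, 3) stereo frame."""
    assert left.shape == right.shape, "eye views must match in size"
    return np.concatenate([left, right], axis=1)

def unpack_side_by_side(frame: np.ndarray):
    """Split a packed stereo frame back into left/right eye views."""
    w = frame.shape[1] // 2
    return frame[:, :w], frame[:, w:]

# Example with dummy 1080p eye views (values stand in for camera pixels).
left = np.zeros((1080, 1920, 3), dtype=np.uint8)
right = np.full((1080, 1920, 3), 255, dtype=np.uint8)
stereo = pack_side_by_side(left, right)          # shape: (1080, 3840, 3)
l, r = unpack_side_by_side(stereo)
assert (l == left).all() and (r == right).all()
```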

A Tutorial at ACM SIGCOMM 2025

Optimizing Low-Latency Video Streaming: AI-Assisted Codec-Network Coordination

[Link]

Coimbra, Portugal, September 8 – 11, 2025.


Tutorial speakers:

  • Farzad Tashtarian (Alpen-Adria-Universität – AAU)
  • Zili Meng (Hong Kong University of Science and Technology – HKUST)
  • Abdelhak Bentaleb (Concordia University)
  • Mahdi Dolati (Sharif University of Technology)

This tutorial focuses on the emerging need for ultra-low-latency video streaming and how AI-assisted coordination between codecs and network infrastructure can significantly improve performance. Traditional end-to-end streaming pipelines are often disjointed, leading to inefficiencies under tight latency constraints. We present a cross-layer approach that leverages AI for real-time encoding parameter adaptation, network-aware bitrate selection, and joint optimization across codec behavior and transport protocols. The tutorial examines the integration of AI models with programmable network architectures (e.g., SDN, P4) and modern transport technologies such as QUIC and Media over QUIC (MoQ) to minimize startup delay, stall events, and encoding overhead. Practical use cases and experimental insights illustrate how aligning codec dynamics with real-time network conditions enhances both QoE and system efficiency. Designed for both researchers and engineers, this session provides a foundation for developing next-generation intelligent video delivery systems capable of sustaining low-latency performance in dynamic environments.
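As a concrete illustration of one building block the tutorial covers, the sketch below shows a simple network-aware bitrate selection heuristic: smooth a throughput estimate, pick the highest sustainable ladder rung, and back off when latency exceeds the budget. The ladder, smoothing factor, and thresholds are illustrative assumptions, not part of the tutorial material.

```python
# Hypothetical sketch of network-aware bitrate selection for low-latency
# streaming. All names and constants are illustrative assumptions.

LADDER = [500, 1200, 2500, 4500, 8000]  # candidate encoding bitrates (kbit/s)

def throughput_estimate(samples, alpha=0.8):
    """Exponentially weighted moving average over recent throughput samples."""
    est = samples[0]
    for s in samples[1:]:
        est = alpha * est + (1 - alpha) * s
    return est

def select_bitrate(samples, latency_ms, latency_budget_ms=150, safety=0.8):
    """Pick the highest rung that fits the discounted throughput estimate,
    backing off one rung when measured latency exceeds the budget."""
    budget = safety * throughput_estimate(samples)
    candidates = [b for b in LADDER if b <= budget] or [LADDER[0]]
    choice = candidates[-1]
    if latency_ms > latency_budget_ms:
        choice = LADDER[max(0, LADDER.index(choice) - 1)]
    return choice

print(select_bitrate([3000, 3400, 2800], latency_ms=180))  # -> 500
```

In a cross-layer design, the same decision would additionally be informed by encoder state (e.g., the current rate-distortion operating point) and transport feedback such as QUIC loss and RTT signals.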

Conference Talk:

On 20 June 2025, Dr Felix Schniz gave the talk “Plain Walking? Navigating Space in Trading Card Games” at the Card Game Conference at AAU, Klagenfurt (AT). The Card Game Conference was a mini-conference organised by students of the master’s programme Game Studies and Engineering, intended as a low-barrier science-to-science and science-to-public event.

 

Workshop:

With “Plain Walking? The Workshop”, Dr Felix Schniz held a science-to-science workshop on 21 June 2025 as a follow-up to his presentation at the Card Game Conference.

 

Conference Acceptance:

With his talk “In Cardboard Space, No One Can Hear Your Scream: The Alien Universe Between Digital and Analogue Game Experiences”, Dr Felix Schniz has been accepted for the Video Game Cultures Conference 2025 at Charles University, Prague (CZ).

 

Paper Published:

Together with his colleagues Thomas Faller, Armin Lippitz, and René Reinhold Schallegger, Dr Felix Schniz has published the paper “Teaching (With) Canadian Videogames in the Classroom” in the anthology Teaching Canada II: Identities, Cultures, Regions.

 

Real-Time AI-Driven Avatar Generation for Sign Language in HTTP Adaptive Streaming

The 3rd ACM SIGCOMM Workshop on Emerging Multimedia Systems (ACM EMS 2025)

https://conferences.sigcomm.org/sigcomm/2025/workshop/ems/

8 September 2025 // Coimbra, Portugal

 

Daniele Lorenzi (AAU, Austria), Emanuele Artioli (AAU, Austria), Farzad Tashtarian (AAU, Austria), Christian Timmerer (AAU, Austria)

Abstract: As digital media consumption over the Internet surges globally, ensuring accessibility for all users becomes paramount. For people with hearing impairments, this means providing inclusion beyond classic captioning, which does not convey the full emotional and contextual depth of spoken content. This work addresses this accessibility gap by exploring the use of AI-generated avatars capable of translating speech into sign language in real time. After defining the multifaceted challenges in this domain, we propose a novel AI-driven task partition to animate avatars for accurate and expressive sign language interpretation in live streaming.
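To make the idea of a task partition concrete, here is a minimal sketch, under our own assumptions, of a three-stage live pipeline (speech recognition, text-to-gloss translation, pose generation) wired with queues so the stages run concurrently. The stage functions are stubs standing in for the actual AI models; none of the names come from the paper.

```python
# Hedged sketch of a pipelined speech-to-sign task partition. The stub
# functions stand in for real models (streaming ASR, text-to-gloss
# translation, pose generation); all names are illustrative assumptions.
from dataclasses import dataclass, field
from queue import Queue
from threading import Thread

@dataclass
class Segment:
    audio: bytes
    text: str = ""
    gloss: str = ""
    poses: list = field(default_factory=list)

def transcribe(audio: bytes) -> str:
    return "hello world"                    # stub for a streaming ASR model

def text_to_gloss(text: str) -> str:
    return text.upper()                     # stub for text-to-gloss translation

def gloss_to_poses(gloss: str) -> list:
    return [f"pose({g})" for g in gloss.split()]  # stub for pose generation

def stage(fn, inbox: Queue, outbox: Queue):
    """Generic pipeline stage: pull a segment, apply one model, pass it on."""
    while (seg := inbox.get()) is not None:
        fn(seg)
        outbox.put(seg)
    outbox.put(None)                        # propagate end-of-stream downstream

q1, q2, q3, out = Queue(), Queue(), Queue(), Queue()
threads = [
    Thread(target=stage, args=(lambda s: setattr(s, "text", transcribe(s.audio)), q1, q2)),
    Thread(target=stage, args=(lambda s: setattr(s, "gloss", text_to_gloss(s.text)), q2, q3)),
    Thread(target=stage, args=(lambda s: setattr(s, "poses", gloss_to_poses(s.gloss)), q3, out)),
]
for t in threads:
    t.start()
q1.put(Segment(audio=b"\x00"))              # one live audio chunk enters
q1.put(None)                                # end of stream
for t in threads:
    t.join()
print(out.get().poses)                      # -> ['pose(HELLO)', 'pose(WORLD)']
```

Partitioning this way lets the slowest stage bound throughput rather than the sum of all stages, which is what keeps the avatar close to the live stream.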


Unlocking Implicit Motion for Evaluating Image Complexity

Journal: Displays

 

Yixiao Li (Beihang University, China), Xiaoyuan Yang (Beihang University, China), Yanda Meng (University of Exeter, UK), Hadi Amirpour (AAU, Austria), Jiang Liu (Cardiff University, UK), Yuqing Luo (Cardiff University, UK), Hantao Liu (Cardiff University, UK), and Wei Zhou (Cardiff University, UK)

Abstract: Image complexity (IC) plays a critical role in both cognitive science and multimedia computing, influencing visual aesthetics, emotional responses, and tasks such as image classification and enhancement. However, defining and quantifying IC remains challenging due to its multifaceted nature, which encompasses both objective attributes (e.g., detail, structure) and subjective human perception. While traditional methods rely on entropy-based or multidimensional approaches, and recent advances employ machine learning and shallow neural networks, these techniques often fail to fully capture the subjective aspects of IC. Inspired by the fact that the human visual system inherently perceives implicit motion in static images, we propose a novel approach to address this gap by explicitly incorporating hidden motion into IC assessment. We introduce the motion-inspired image complexity assessment metric (MICM) as a new framework for this purpose. MICM employs a dual-branch architecture: one branch extracts spatial features from static images, while the other generates short video sequences to analyze latent motion dynamics. To ensure meaningful motion representation, we design a hierarchical loss function that aligns video features with text prompts derived from image-to-text models, refining motion semantics at both local (i.e., frame and word) and global levels. Experiments on three public image complexity assessment (ICA) databases demonstrate that our approach, MICM, significantly outperforms state-of-the-art methods, validating its effectiveness. The code will be publicly available upon acceptance of the paper.
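To illustrate the dual-branch idea, below is a toy sketch under our own assumptions: one branch encodes the static image, the other encodes a short video derived from it, and a linear head fuses both embeddings into a complexity score. The paper's actual backbones, video generation model, and hierarchical text-alignment loss are not reproduced here.

```python
# Toy dual-branch complexity model in the spirit of MICM; layer sizes and the
# fusion head are illustrative assumptions, not the published architecture.
import torch
import torch.nn as nn

class SpatialBranch(nn.Module):
    """Encodes spatial features of the static image."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
    def forward(self, img):                  # img: (B, 3, H, W)
        return self.net(img)                 # -> (B, dim)

class MotionBranch(nn.Module):
    """Encodes a short video meant to expose the image's latent motion."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, dim, 3, stride=(1, 2, 2), padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
    def forward(self, video):                # video: (B, 3, T, H, W)
        return self.net(video)               # -> (B, dim)

class ComplexityModel(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.spatial, self.motion = SpatialBranch(dim), MotionBranch(dim)
        self.head = nn.Linear(2 * dim, 1)    # fuse both branches into one score

    def forward(self, img, video):
        z = torch.cat([self.spatial(img), self.motion(video)], dim=1)
        return self.head(z).squeeze(1)       # -> (B,) complexity scores

img = torch.randn(2, 3, 224, 224)
video = img.unsqueeze(2).repeat(1, 1, 8, 1, 1)   # stand-in for generated frames
print(ComplexityModel()(img, video).shape)       # -> torch.Size([2])
```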

 

On Monday, 16 June 2025, Sabrina Größing, BEd, and Dr Felix Schniz presented the Master’s Programme Game Studies and Engineering to an audience of bachelor students from the new Liberal Arts programme.

 

The event aimed to show the students how easily they can continue their academic careers within AAU, to highlight the opportunities and support structures awaiting them, and to provide a gateway into studying at the university’s technical faculty.

On Friday, 13 June 2025, the Pioneers of Game Development Austria (PGDA), the game development association focused on supporting, showcasing, and accelerating Austrian games, developers, and businesses, visited ITEC.

 

In an event organised by Dr Felix Schniz, Martin Filipp (Mi’pu’mi Games and PGDA representative) brought a delegation of industry representatives to campus, including Michael Benda (Zeppelin Studio), Raffael Moser (reignite games), and Manuel Bonell (Immerea). They gave presentations on the status quo of the Austrian games industry and hosted a developer café in the afternoon, where students could sign up for discussion slots to ask questions about how to get into the industry and how to start their own gaming studio.

 

The event drew over 40 visitors, most of them students of the master’s programme Game Studies and Engineering.

 

My dear colleagues,

The PGDA, the Austrian Game Developers Association, is going to visit us with a delegation.

On Friday, 13 June, between 11.45am and 06.00pm in S.2.42, they are going to offer talks and opportunities for individual Developer Café chat sessions.

The schedule looks as follows:

11.45am: Room Opens
12.00pm: Introduction and PGDA session on the Austrian game industry today
12.45pm: PGDA talks on various topics, including genre
01.30pm: Break
02.00pm: Developer Café
06.00pm: Ending

You can register for a time slot to meet with and talk to one of the developers visiting us on the day of the event.

We are looking forward to seeing you there!

All the best,
Felix


Authors: Ahmed Telili (TII, UAE), Wassim Hamidouche (TII, UAE), Brahim Farhat (TII, UAE), Hadi Amirpour (AAU, Austria), Christian Timmerer (AAU, Austria), Ibrahim Khadraoui (TII, UAE), Jiajie Lu (Politecnico di Milano, Italy), The Van Le (IVCL, South Korea), Jeonneung Baek (IVCL, South Korea), Jin Young Lee (IVCL, South Korea), Yiying Wei (AAU, Austria), Xiaopeng Sun (Meituan Inc., China), Yu Gao (Meituan Inc., China), JianCheng Huang (Meituan Inc., China), and Yujie Zhong (Meituan Inc., China)

Journal: Signal Processing: Image Communication

Abstract: Omnidirectional (360-degree) video is rapidly gaining popularity due to advancements in immersive technologies like virtual reality (VR) and extended reality (XR). However, real-time streaming of such videos, particularly in live mobile scenarios such as unmanned aerial vehicles (UAVs), is hindered by limited bandwidth and strict latency constraints. While traditional methods such as compression and adaptive resolution are helpful, they often compromise video quality and introduce artifacts that diminish the viewer’s experience. Additionally, the unique spherical geometry of 360-degree video, with its wide field of view, presents challenges not encountered in traditional 2D video. To address these challenges, we initiated the 360-degree Video Super Resolution and Quality Enhancement challenge. This competition encourages participants to develop efficient machine learning (ML)-powered solutions to enhance the quality of low-bitrate compressed 360-degree videos, under two tracks focusing on 2× and 4× super-resolution (SR). In this paper, we outline the challenge framework, detailing the two competition tracks and highlighting the SR solutions proposed by the top-performing models. We assess these models within a unified framework, (i) considering quality enhancement, (ii) bitrate gain, and (iii) computational efficiency. Our findings show that lightweight single-frame models can effectively balance visual quality and runtime performance under constrained conditions, setting strong baselines for future research. These insights offer practical guidance for advancing real-time 360-degree video streaming, particularly in bandwidth-limited immersive applications.
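As a rough illustration of a unified assessment over the three axes the abstract names, the sketch below combines quality enhancement, bitrate gain, and runtime into a single ranking score. The weights, normalisation, and field names are our assumptions; the challenge's actual protocol may differ.

```python
# Hedged sketch of a combined score over quality, bitrate gain, and runtime.
# Weights and normalisation are illustrative assumptions, not the challenge's.
from dataclasses import dataclass

@dataclass
class Submission:
    name: str
    psnr_gain_db: float   # quality enhancement over the compressed input
    bd_rate_pct: float    # bitrate gain: negative means bitrate saved
    runtime_ms: float     # per-frame inference time

def unified_score(s: Submission, w=(0.5, 0.3, 0.2), rt_budget_ms=50.0):
    quality = s.psnr_gain_db                              # higher is better
    bitrate = -s.bd_rate_pct / 10.0                       # reward BD-rate savings
    speed = max(0.0, 1.0 - s.runtime_ms / rt_budget_ms)   # real-time headroom
    return w[0] * quality + w[1] * bitrate + w[2] * speed

subs = [
    Submission("team_a", psnr_gain_db=1.2, bd_rate_pct=-18.0, runtime_ms=22.0),
    Submission("team_b", psnr_gain_db=1.5, bd_rate_pct=-12.0, runtime_ms=80.0),
]
for s in sorted(subs, key=unified_score, reverse=True):
    print(s.name, round(unified_score(s), 3))   # team_a ranks first here
```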

 

On 10 June 2025, Dr Felix Schniz presented the Virtual Campus Environment, a central achievement of the UNESCO-funded project Global Campus Online (GLOCO). The project is led by UNESCO Chair holder Univ.-Prof. Dr Hans Karl Peterlini and revolves around the organisation of global meeting platforms to foster supportive environments and knowledge exchange.

The Virtual Campus Environment was fully developed and designed by ITEC staff members affiliated with Game Studies and Engineering, including Tom Tuček, Felix Schniz, and several generations of GSE students who supported the project as part of their research internships.

Present at the public presentation were Rector Ada Pellert and State Governor Peter Kaiser.