Hadi

On 25 June 2026, Hadi Amirpourazarian defended his habilitation thesis, “The Predictive Video Encoding Using Visual Complexity Analysis”.

Congratulations!

Committee members:

Prof. Wolfgang Faber (Chairperson), Prof. Eckehard Steinbach (External Member), Prof. Wilfried Elmenreich, Prof. Barbara Kaltenbacher, Katharina Stengg, Yuliia Lomonosova, Christoph Rauter

Dr Felix Schniz participated in the podcast “Rock my Worlds of English” to promote the Master’s Programme in Game Studies and Engineering.

The full podcast: https://open.spotify.com/episode/4f4wSWb8SD48yFIsTDnG1I?si=Ha53h2m8Txe1LZuiTjJ-EA

 


Title: GNS-GAN: A novel GAN model based on gradient noise suppression

Authors: Hongyou Chen, Lingfeng Qu, Baodan Tian, Yutong He, Yong Fan, Hadi Amirpour, Christian Timmerer and Yao Xin

Journal: Applied Soft Computing

Abstract: Generative adversarial networks (GANs) are widely applicable generative models. However, ensuring stability in adversarial learning remains a significant challenge in current GAN training. Gradient noise, among other factors, significantly impacts the stability of adversarial learning in GAN training. To improve the stability of adversarial learning, a gradient noise suppression generative adversarial network model (GNS-GAN) is proposed. This novel GAN addresses gradient noise by establishing stochastic differential equations (SDEs) for gradient noise in both the discriminator and the generator. The factors affecting the stability of adversarial learning are then analyzed using the assumed gradient noise distribution. Subsequently, an adversarial learning method is designed for the discriminator and generator to suppress gradient noise, thereby completing the adversarial training of GNS-GAN. To verify the performance of GNS-GAN, experimental results on the CELEBA, BEDROOM, and CIFAR10 datasets are compared and analyzed. The FID (Fréchet Inception Distance) values are 23.04 for CELEBA, 18.04 for BEDROOM, and 26.59 for CIFAR10. The GNS-GAN model trains stably on the tested datasets. These results demonstrate that the novel GAN model enhances the stability of adversarial learning and the quality of the generated images.
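
For readers unfamiliar with the metric quoted in the abstract, the Fréchet Inception Distance is commonly defined as below. This is the standard textbook form, not a formula taken from the paper itself; the symbols are our notation: \mu_r, \Sigma_r are the mean and covariance of Inception features of real images, and \mu_g, \Sigma_g those of generated images. Lower values indicate that the generated images are statistically closer to the real ones.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```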

On 10 June 2026, Dr Felix Schniz hosted a session on the video game Bloodborne for AAU’s Media Club. Following this semester’s Media Club leitmotif of ‘the fantastic,’ Felix delved into the game’s depiction of arcane architecture, dream spaces, and the sublime in virtual realms. With 15 attendees and even guests from Salzburg on campus who came by just for this specific date, the session was a fantastic conclusion to this semester’s Media Club schedule.

Last week, Francesco Marchetto and Klaus Schoeffmann presented their work on synthetic data generation for surgical image synthesis at the IEEE CBMS 2026 (Computer-Based Medical Systems) conference in Limassol, Cyprus. Their paper, entitled “Hybrid Semantic Augmentation for Cataract Surgery Image Synthesis with GANs and Diffusion-based Models”, investigated how augmenting semantics in conditional generative models can be used to overcome the critical shortage of annotated training data in surgical AI.

The work introduced a student-teacher augmentation framework in which a trained generative model acts as a teacher to produce synthetic surgical images for a student model. Two augmentation strategies were evaluated: a naive mask re-generation approach that varies image appearance while preserving semantic layout, and a novel Hybrid Anatomy Injection strategy that procedurally generates new semantic masks by compositing surgical instruments onto real anatomical backgrounds. Experiments on the Cataract-1K dataset showed that the proposed semantic augmentation improves the Fréchet Inception Distance by up to 24% over the baseline. By exposing the model to novel instrument-anatomy configurations never seen during training, the semantic augmentation breaks the performance plateau that texture-only variation cannot overcome, enabling the model to continue learning beyond the limits of the original data distribution. For diffusion-based models, which carry strong pretraining biases from large-scale natural image datasets, mask re-generation proves more effective: providing more examples of how surgical scenes look helps these models gradually adapt their pretrained priors to the target domain. Together, these strategies demonstrate that meaningful performance gains in surgical image synthesis can be achieved entirely without collecting new patient data, offering a practical and privacy-friendly path toward more capable generative models in clinical settings.
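
To give a rough idea of what compositing instruments onto anatomical backgrounds can look like at the mask level, here is a minimal sketch. It is an illustration under our own assumptions, not the authors' implementation: the function name `inject_instrument`, the label values, and the use of 0 as the "no instrument" label are all hypothetical.

```python
def inject_instrument(anatomy_mask, instrument_mask, top, left, background=0):
    """Return a new semantic mask with instrument labels pasted onto anatomy.

    anatomy_mask and instrument_mask are lists of rows of integer class labels.
    Instrument pixels equal to `background` are treated as transparent, and
    pixels that would fall outside the anatomy mask are skipped.
    (Hypothetical helper for illustration only.)
    """
    out = [row[:] for row in anatomy_mask]  # copy so the real mask is untouched
    height, width = len(out), len(out[0])
    for r, row in enumerate(instrument_mask):
        for c, label in enumerate(row):
            rr, cc = top + r, left + c
            if label != background and 0 <= rr < height and 0 <= cc < width:
                out[rr][cc] = label
    return out


# Toy example: 1 = cornea, 2 = pupil, 9 = instrument (labels are made up).
anatomy = [
    [1, 1, 1],
    [1, 2, 1],
    [1, 1, 1],
]
instrument = [
    [0, 9],
    [9, 9],
]
new_mask = inject_instrument(anatomy, instrument, top=1, left=1)
```

Varying `top` and `left` (or the instrument shape) yields instrument-anatomy layouts that never occur in the original annotations, which is the kind of novelty the paragraph above attributes to semantic augmentation.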

Title: Can Swarms Be Trusted? Showcasing Swarm Intelligence and Privacy Preservation Through AR 

Conference: SIMULTECH 2026, Porto, Portugal, 18–20 July 2026

Authors: Melanie Schranz, M. Gojkovic, Horia Vulcu, Kseniia Harshina

Abstract: Swarm intelligence provides a robust approach for decentralized coordination in today’s systems, yet its algorithmic principles, such as local decision-making, role differentiation, and emergent global behavior, are often difficult to convey to individuals without prior experience in swarm-based control. This creates practical barriers when deploying swarm-enabled solutions in domains such as shared electric vehicle charging, energy management, or mobility systems, where engineers, operators, and stakeholders must reliably understand how decentralized processes produce system-level outcomes. To address this challenge, we developed Swarm AR, an Augmented Reality (AR) game that operationalizes a swarm model inspired by the Artificial Bee Colony algorithm and exposes key algorithmic elements, including information propagation, neighborhood interactions, and collective resource allocation. The system also illustrates how decentralization can reduce data concentration, which may support privacy advantages under certain assumptions about information flow and system design, without requiring explicit protection mechanisms. A shared electric vehicle charging scenario serves as a use case to demonstrate load balancing and the necessity of distributed coordination. We evaluate the tool through a mixed-method user study using pre/post quantitative measures and qualitative analysis. Results indicate modest improvements in participants’ understanding of swarm coordination logic, decentralized decision processes, and emergent behavior relevant for infrastructure control. These findings suggest that AR-based interactive visualization can serve as an effective technical aid for communicating, validating, and reasoning about the operational characteristics of self-organizing systems, supporting informed engineering design and deployment of decentralized, privacy-aware coordination strategies.


Title: Advances in Imaging, Perception, and Reasoning for High-Dimensional Visual Data

Conference: VCIP 2026

Abstract: Recent advances in visual sensing, computational imaging, neural representations, and multimodal learning are transforming the way visual data are acquired, processed, communicated, and understood. Modern visual systems increasingly rely on high-dimensional visual data that extend beyond conventional RGB images and videos to include event streams, light fields, hyperspectral and polarization imagery, LiDAR, time-of-flight sensing, neural scene representations, 3D Gaussian splats, and hybrid multimodal sensing modalities. These data capture rich spatial, temporal, geometric, spectral, and cross-modal information, enabling more robust visual processing under challenging conditions such as fast motion, low light, occlusion, missing modalities, and distribution shift. At the same time, the growing complexity and volume of high-dimensional visual data create new challenges in acquisition, restoration, compression, representation, quality assessment, perception, and reasoning. Emerging solutions increasingly integrate imaging, communication, perception, and multimodal intelligence to support reliable visual understanding and decision making. In line with these developments, we invite contributions on computational imaging and novel sensing systems, event-based and multimodal vision, high-dimensional visual restoration and enhancement, learned compression, implicit and neural representations, quality assessment, cross-modal fusion and alignment, robust visual perception, vision-language reasoning, trustworthy AI, and efficient visual communication for next-generation visual systems.

ORGANIZERS

  • Haowen Bai (Nanyang Technological University, SG)
  • Rui Zhao (Nanyang Technological University, SG)
  • Zeyu Xiao (National University of Singapore, SG)
  • Taewoo Kim (INSAIT, BG)
  • Hadi Amirpour (University of Klagenfurt, AT)
  • Tae Hyun Kim (Hanyang University, KR)

Title: An HEVC-based Known-Plaintext Attack for Video Selective Encryption

Authors: Lingfeng Qu, Chen Chen, Jinghan Xu, Yuan Yuan, Ningxiong Mao, Hadi Amirpour

Publication: Springer Nature


Title: Perceptual Reliability in Multimedia: Quality Assessment and Anomaly Analysis

Event: ACM MM 2026, Rio de Janeiro, Brazil, 10–14 November 2026

Presenters: Wei Zhou, Hadi Amirpour, Yang Liu, Patrick Le Callet


Title: Asymmetry-Aware No-Reference Video Quality Assessment via Dual-Region Temporal Modeling

Authors: MohammadAli Hamidi, Hadi Amirpour, Christian Timmerer, Luigi Atzori

Abstract: Saliency- and semantic-driven asymmetric encoding enables significant bitrate savings while maintaining a comparable viewing experience. This paper presents a No-Reference (NR) Video Quality Assessment (VQA) model for evaluating Asymmetrically Encoded Videos (AEV), addressing challenges such as varying compression levels, scaling artifacts, and asymmetric encoding strategies. The proposed approach combines compression-aware features derived from Quantization Parameters (QPs) with spatio-temporal perceptual descriptors capturing blur, motion, and temporal consistency. A hybrid regression framework based on XGBoost and Ridge regression is employed, where a weighted ensemble improves overall performance. Experiments on the dataset provided by the QoMEX VQA-AEV Grand Challenge, evaluated under a Leave-One-Source-Out (LOSO) protocol, show that the proposed method outperforms state-of-the-art NR-VQA models in terms of correlation coefficients (Pearson and Spearman) and root mean square error (RMSE).
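
The evaluation setup named in the abstract (a weighted ensemble of two regressors, Leave-One-Source-Out splits, and Pearson, Spearman, and RMSE scores) can be sketched roughly as follows. This is a dependency-free illustration under our own assumptions, not the paper's code: the two prediction lists stand in for the outputs of any two regressors (XGBoost and Ridge are not actually called), and the function names and the weight parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(pred, target):
    """Root mean square error between predictions and ground truth."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def weighted_ensemble(pred_a, pred_b, weight_a):
    """Blend two models' predictions; weight_a in [0, 1] is a free parameter."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def loso_splits(sources):
    """Yield (held_out_source, train_indices, test_indices) for each source.

    Every video derived from the held-out source content goes to the test
    set, so no source appears in both training and testing.
    """
    for held_out in sorted(set(sources)):
        train = [i for i, s in enumerate(sources) if s != held_out]
        test = [i for i, s in enumerate(sources) if s == held_out]
        yield held_out, train, test
```

In a full pipeline, each LOSO fold would train both regressors on the training indices, blend their test predictions with `weighted_ensemble`, and score the blend against the subjective quality labels with the three metrics.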