Title: Can Swarms Be Trusted? Showcasing Swarm Intelligence and Privacy Preservation Through AR 

Conference: SIMULTECH 2026, Porto, Portugal, 18-20 July 2026

Authors: Melanie Schranz, M. Gojkovic, Horia Vulcu, Kseniia Harshina

Abstract: Swarm intelligence provides a robust approach for decentralized coordination in modern systems, yet its algorithmic principles, such as local decision-making, role differentiation, and emergent global behavior, are often difficult to convey to individuals without prior experience in swarm-based control. This creates practical barriers when deploying swarm-enabled solutions in domains such as shared electric vehicle charging, energy management, or mobility systems, where engineers, operators, and stakeholders must reliably understand how decentralized processes produce system-level outcomes. To address this challenge, we developed Swarm AR, an Augmented Reality (AR) game that operationalizes a swarm model inspired by the Artificial Bee Colony algorithm and exposes key algorithmic elements, including information propagation, neighborhood interactions, and collective resource allocation. The system also illustrates how decentralization can reduce data concentration, which may support privacy advantages under certain assumptions about information flow and system design, without requiring explicit protection mechanisms. A shared electric vehicle charging scenario serves as a use case to demonstrate load balancing and the necessity of distributed coordination. We evaluate the tool through a mixed-method user study using pre/post quantitative measures and qualitative analysis. Results indicate modest improvements in participants’ understanding of swarm coordination logic, decentralized decision processes, and emergent behavior relevant for infrastructure control. These findings suggest that AR-based interactive visualization can serve as an effective technical aid for communicating, validating, and reasoning about the operational characteristics of self-organizing systems, supporting informed engineering design and deployment of decentralized, privacy-aware coordination strategies.

Hadi

Title: An HEVC-based Known-Plaintext Attack for Video Selective Encryption

Authors: Lingfeng Qu, Chen Chen, Jinghan Xu, Yuan Yuan, Ningxiong Mao, Hadi Amirpour

Publication: Springer Nature

Hadi

Title: Asymmetry-Aware No-Reference Video Quality Assessment via Dual-Region Temporal Modeling

Authors: MohammadAli Hamidi, Hadi Amirpour, Christian Timmerer, Luigi Atzori

Abstract: Saliency and semantic-driven asymmetric encoding enable significant bitrate savings while maintaining a comparable viewing experience. This paper presents a No-Reference (NR) Video Quality Assessment (VQA) model for evaluating Asymmetrically Encoded Videos (AEV), addressing challenges such as varying compression levels, scaling artifacts, and asymmetric encoding strategies. The proposed approach combines compression-aware features derived from Quantization Parameters (QPs) with spatio-temporal perceptual descriptors capturing blur, motion, and temporal consistency. A hybrid regression framework based on XGBoost and Ridge regression is employed, where a weighted ensemble improves overall performance. Experiments on the dataset provided by the QoMEX VQA-AEV Grand Challenge, evaluated under a Leave-One-Source-Out (LOSO) protocol, show that the proposed method outperforms state-of-the-art NR-VQA models in terms of correlation coefficients (Pearson and Spearman) and root mean square error (RMSE).
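The evaluation protocol and the ensemble step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the source labels, and the blending weight are hypothetical, and the two prediction lists merely stand in for the outputs of the XGBoost and Ridge regressors.

```python
import math
from collections import defaultdict


def leave_one_source_out(source_ids):
    """Yield (held_out_source, train_indices, test_indices) per distinct source.

    All videos derived from the held-out source go to the test fold, so no
    source content is shared between training and testing.
    """
    by_source = defaultdict(list)
    for idx, src in enumerate(source_ids):
        by_source[src].append(idx)
    for held_out in sorted(by_source):
        test_idx = by_source[held_out]
        train_idx = sorted(
            i for src, idxs in by_source.items() if src != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def weighted_ensemble(pred_a, pred_b, weight_a):
    """Blend two prediction lists; weight_a is the (assumed) weight of model A."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def rmse(pred, target):
    """Root mean square error between predictions and subjective scores."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


# Hypothetical source labels: five encoded videos from three source clips.
sources = ["clipA", "clipA", "clipB", "clipB", "clipC"]
folds = list(leave_one_source_out(sources))

# Hypothetical predictions of two regressors, blended with equal weights.
blended = weighted_ensemble([4.0, 2.0], [3.0, 3.0], weight_a=0.5)
```

In practice the blending weight would be selected on the training folds only, so that the held-out source never influences it.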

Hadi

Title: Asymmetry-Aware No-Reference Video Quality Assessment via Dual-Region Temporal Modeling

Authors: Yeganeh Chatri, Hadi Amirpour

Abstract: Modern content-adaptive video encoding increasingly relies on asymmetric compression, where semantically important regions are preserved at higher quality than background areas. This results in spatially and temporally heterogeneous distortion patterns that challenge conventional no-reference video quality assessment (NR-VQA) models, which typically assume spatial homogeneity.

In this work, we propose a lightweight dual-region NR-VQA framework that explicitly models distortion heterogeneity by jointly analyzing global context and a content-focused region using a shared ResNet-18 backbone with temporal mean aggregation. To address limited training data, a two-stage freeze–unfreeze optimization strategy is employed for stable learning.

Experiments on the QoMEX Grand Challenge dataset show that the proposed method achieves an SROCC of 0.881, higher than all evaluated NR-VQA baselines, including NIQE, BRISQUE, DOVER, and Q-Align. Additional evaluations on KoNViD-1k and LIVE-VQC indicate consistent generalization across datasets. These results suggest that explicit modeling of spatial heterogeneity is an effective and practical design principle for NR-VQA under asymmetric compression.
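For readers unfamiliar with the reported metric, SROCC (Spearman rank-order correlation coefficient) is the Pearson correlation of the rank-transformed predictions and subjective scores. The following is a minimal standard-library sketch of that definition, with ties given their average rank; the function names are illustrative and not taken from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def srocc(predicted, subjective):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(subjective))
```

Because only ranks matter, SROCC rewards a model for ordering videos correctly even when its scores are on a different scale than the subjective ratings.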

Hadi

Title: Quality of Multimedia Experience Meets Machine Intelligence

Authors: Wei Zhou, Hadi Amirpour, Tobias Hossfeld

Abstract: Multimedia systems are evolving towards AI-driven, adaptive services, leading to a natural convergence of QoE and machine intelligence. In this context, machine intelligence can empower QoE through learning-based, context-aware, and semantic-driven modelling and optimization. At the same time, QoE can guide machine intelligence by providing a human-centred objective for AI system design and evaluation; see also [11]. Looking beyond human perception, toward agent-centric and hybrid QoE, future multimedia systems increasingly require unified experience objectives that support human-AI co-experience. QoMEX’26 in Cardiff stands as a major milestone highlighting the convergence of Quality of Multimedia Experience with Machine Intelligence. This column reflects on this evolution and outlines the key challenges ahead.

Hadi

Title: DAP-Adapter: Enhancing Few-Shot CLIP with Dynamically Diverse and Context-Aware Prompt Generation

Authors: Zongjian Li, Hongyou Chen, Lingfeng Qu, Yongjie Zhu, Ya Pan, Baodan Tian, Yong Fan, Hadi Amirpour

Abstract: Contrastive language-image pretraining (CLIP) has demonstrated powerful zero-shot and few-shot classification capabilities by training on large-scale image-text pairs. However, in the CLIP training paradigm, data augmentation strategies are applied primarily to the image inputs, whereas the text prompts remain fixed throughout the training process. Existing approaches typically rely on static text templates or use a limited number of learnable soft prompts with categories, which restricts the expressiveness of the model in capturing category semantics. In this paper, we propose a novel approach called the dynamic attribute prompt adapter (DAP-Adapter), which leverages large language models to generate diverse textual descriptions. Our approach introduces attributes as intermediate bridges that link categories to their specific descriptions. During training, a batch-level dynamic language mode sampling mechanism is adopted in combination with learnable soft prompts to dynamically construct rich text prompts. To further enhance its ability to capture semantics, DAP-Adapter also integrates a nontrainable CLIP adapter. To evaluate the model performance, experiments were conducted on ten datasets. The experimental results demonstrate that the proposed DAP-Adapter outperforms the state-of-the-art Tip-Adapter-F method.

Hadi

Title: QoMEX 2026 Grand Challenge on Video Quality Assessment for Asymmetric Encoded Videos: Methods and Results

Authors: Jingwen Zhu, Hadi Amirpour, Christian Timmerer, et al.

Abstract: This paper presents the results of the Grand Challenge on Video Quality Assessment for Asymmetric Encoded Videos, held at QoMEX 2026 in Cardiff, UK. The challenge addresses the growing need for video quality metrics (VQM) capable of accurately predicting the perceptual quality of asymmetrically encoded videos, where saliency-driven or semantic-based encoding allocates different quality levels to different spatial regions. Participants were provided with the Sport-ROI dataset containing subjective quality scores and were invited to develop both full-reference (FR) and no-reference (NR) VQM models. We describe the challenge design, the dataset, the evaluation methodology, and summarize the submitted approaches and their performance.

Title: Cross-Layer Dynamics in Live Low-Latency: A Dataset of ABR, CC, and AQM Interactions

Conference: 18th International Conference on Quality of Multimedia Experience, Cardiff, UK, June 29th – July 3rd, 2026

Authors: Md Tariqul Islam (UNICAMP, Brazil), Farzad Tashtarian (AAU, Austria), Christian Esteve Rothenberg (UNICAMP, Brazil), Christian Timmerer (AAU, Austria)

Abstract: Low-latency video streaming, such as Low-Latency DASH (LL-DASH), requires maintaining high Quality of Experience (QoE) under varying network conditions. In LL-DASH, QoE is influenced not only by Adaptive Bitrate (ABR) decisions, but also by transport-layer Congestion Control (CC) and network-layer Active Queue Management (AQM), whose interactions remain insufficiently characterized due to limited cross-layer experimentation. Therefore, we present a large-scale LL-DASH dataset comprising approximately 2,000 controlled sessions covering three dash.js ABR algorithms (L2A, Dynamic, LoLP), three CC schemes (CUBIC, BBRv1, Prague) over both TCP and QUIC transport protocols, four AQM configurations (FIFO, FQ-CoDel, CAKE, DualPI2), and multiple congestion scenarios. The dataset supports QoE-aware cross-layer analysis and ABR benchmarking under diverse network configurations and is available at: https://github.com/cd-athena/ll-dash-crosslayer-dataset
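To give a sense of the size of the experimental space, the sketch below enumerates the cross-layer combinations named in the abstract. It assumes a full factorial design in which every ABR, CC, transport, and AQM option is combined with every other; the abstract does not confirm this, and the dataset may omit some combinations. The dictionary keys are illustrative and do not reflect the dataset's actual schema.

```python
from itertools import product

# Option names as listed in the abstract.
ABR_ALGORITHMS = ("L2A", "Dynamic", "LoLP")
CC_SCHEMES = ("CUBIC", "BBRv1", "Prague")
TRANSPORTS = ("TCP", "QUIC")
AQM_CONFIGS = ("FIFO", "FQ-CoDel", "CAKE", "DualPI2")


def enumerate_configs():
    """Return every ABR x CC x transport x AQM combination (assumed full factorial)."""
    return [
        {"abr": abr, "cc": cc, "transport": transport, "aqm": aqm}
        for abr, cc, transport, aqm in product(
            ABR_ALGORITHMS, CC_SCHEMES, TRANSPORTS, AQM_CONFIGS
        )
    ]


configs = enumerate_configs()

# Rough number of sessions per combination if the roughly 2,000 sessions
# were spread evenly; this covers congestion scenarios and any repetitions.
sessions_per_config = 2000 / len(configs)
```

Under this assumption there are 72 combinations, so the approximately 2,000 sessions would correspond to a few dozen sessions per combination across the congestion scenarios.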

Title: EVLM: Intent-Driven Edge Vision Language Model for UAV-Based Power Line Inspection

Authors: Reza Farahani (DSG, TU Wien, Austria), Zoha Azimi (Christian Doppler Laboratory ATHENA, ITEC, University of Klagenfurt, Austria), Ilir Murturi (Department of Mechatronics, University of Prishtina, Kosovo), Arda Goknil (SINTEF, Oslo, Norway), Sagar Sen (SINTEF, Oslo, Norway), Christian Timmerer (Christian Doppler Laboratory ATHENA, ITEC, University of Klagenfurt, Austria), Schahram Dustdar (DSG, TU Wien, Austria)

Conference: 2026 IEEE International Conference on Edge Computing and Communications (IEEE EDGE 2026)

Abstract: Inspection of critical infrastructure, such as power lines, is increasingly conducted using unmanned aerial vehicles (UAVs) that capture aerial video for subsequent human review. Although recent edge-based approaches deploy onboard object detectors to identify predefined defect classes, these pipelines remain closed-set, task-specific, and largely decoupled from operator intent and edge resource constraints. This paper introduces EVLM, an intent-driven vision-language framework for onboard UAV-based power line inspection. Given a high-level operator intent, EVLM (i) leverages lightweight histogram-based frame filtering to extract salient key frames under bounded compute budgets, (ii) executes a domain-adapted vision language model (VLM) directly on the UAV for intent-conditioned multimodal reasoning, and (iii) synthesizes structured inspection reports together with a minimal set of evidence frames, replacing continuous raw video transmission with compact semantic outputs. To align the VLM with infrastructure inspection semantics while preserving edge efficiency, we perform parameter-efficient fine-tuning using Low-Rank Adaptation (LoRA), enabling domain specialization without updating the full model parameters. We implement and fully deploy EVLM on an NVIDIA Jetson device representative of UAV-class onboard hardware and evaluate it using 20 publicly released power line inspection video sequences spanning 8 heterogeneous environments and 5 operational intent categories. Experimental results show a data reduction of 94.8%, with transmitted data decreasing from 485 kB to 25 kB per 4 s segment, corresponding to 72.75 MB versus 3.75 MB over a 10 min inspection mission. EVLM operates feasibly on embedded hardware, maintaining moderate CPU/GPU utilization and bounded power consumption (5.6 W), while producing interpretable, intent-aligned inspection outputs with richer semantic insights than detection-centric baselines.
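The data-reduction figures quoted in this abstract follow from simple arithmetic, reproduced below as a short check. The only assumption is the decimal convention 1 MB = 1000 kB, which is the one that matches the stated totals; the constant and function names are illustrative.

```python
RAW_KB_PER_SEGMENT = 485    # raw video per segment, from the abstract
EVLM_KB_PER_SEGMENT = 25    # EVLM output per segment, from the abstract
SEGMENT_SECONDS = 4
MISSION_SECONDS = 10 * 60   # 10 min inspection mission
KB_PER_MB = 1000            # assumption: decimal megabytes


def mission_volume_mb(kb_per_segment):
    """Total transmitted data over the mission, in MB."""
    segments = MISSION_SECONDS // SEGMENT_SECONDS
    return kb_per_segment * segments / KB_PER_MB


def reduction_percent(raw_kb, reduced_kb):
    """Relative data reduction in percent."""
    return (1 - reduced_kb / raw_kb) * 100


segments = MISSION_SECONDS // SEGMENT_SECONDS                   # 150 segments
raw_mb = mission_volume_mb(RAW_KB_PER_SEGMENT)                  # 72.75 MB
evlm_mb = mission_volume_mb(EVLM_KB_PER_SEGMENT)                # 3.75 MB
reduction = reduction_percent(RAW_KB_PER_SEGMENT, EVLM_KB_PER_SEGMENT)
```

The per-segment and per-mission reductions are identical because both volumes scale with the same number of segments.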

Title: Quantifying Inter-City Network Latency in Europe: A Measurement based Study for Time-Critical Cloud Services

Conference: 3rd Workshop on Engineering Techniques for Distributed Computing Continuum Systems (EDCCS), 22-25 June 2026, Seoul, South Korea

Authors: Thomas Schleicher, Kurt Horvath, Dragi Kimovski, Bernd Spiess, Oliver Hohlfeld

Abstract: Time-critical cloud and edge services depend on predictable and low-latency wide-area connectivity, yet inter-city network behavior often deviates from expectations based on geographic distance alone. This paper presents an evaluation framework and results on inter-city network latency across major European metropolitan areas, treating latency as a non-functional property relevant to benchmarking and service placement in cloud computing. We develop a scalable measurement framework based on a distributed probing infrastructure, analyze round-trip latency, and assess spatial efficiency and temporal stability. Initial results reveal unexpectedly high latency on long-distance paths from the Iberian Peninsula toward Turkey. Distance-normalized analysis further exposes pronounced inefficiencies on short-distance paths between Greece and Turkey, suggesting network effects unrelated to geographic distance. Temporal analysis shows elevated latency variance and instability on paths involving Turkey, while most other inter-city connections closely follow distance-based expectations and remain stable over time. These findings highlight the importance of distance-normalized and stability-aware metrics for evaluating wide-area cloud connectivity. The presented methodology and results provide practical insight for benchmarking, placement, and operation of latency-sensitive cloud services across geographically distributed infrastructures.
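The abstract does not define its distance-normalized metric, so the following is only a sketch of one common formulation, not the authors' method: the measured round-trip time is divided by the ideal round-trip time over the great-circle distance, assuming signals travel at roughly two thirds of the speed of light in fiber (about 200 km per millisecond). All names and constants are assumptions introduced for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
FIBER_KM_PER_MS = 200.0    # assumption: roughly 2/3 of the speed of light


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ideal_rtt_ms(distance_km):
    """Lower-bound RTT over a straight fiber path (out and back)."""
    return 2 * distance_km / FIBER_KM_PER_MS


def latency_inflation(measured_rtt_ms, distance_km):
    """Measured RTT divided by the ideal RTT; values near 1 are efficient."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return measured_rtt_ms / ideal_rtt_ms(distance_km)


# Hypothetical example: a 30 ms RTT over a 1000 km path is 3x the ideal.
example_inflation = latency_inflation(30.0, 1000.0)
```

With a metric of this kind, a short path with a high inflation factor stands out even when its absolute latency is lower than that of a long but efficient path.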