At the PCS 2024 (Picture Coding Symposium), held in Taichung, Taiwan from June 12-14, Hadi Amirpour received the Best Paper Award for the paper “Beyond Curves and Thresholds – Introducing Uncertainty Estimation To Satisfied User Ratios for Compressed Video” written together with Jingwen Zhu, Raimund Schatz, Patrick Le Callet and Christian Timmerer. Congratulations!

To celebrate the 40th birthday of a video game classic, Lukas Lorber from Kleine Zeitung interviewed Felix Schniz about Tetris. The interview touches upon the Cold War history of the video game, the psychology behind the ‘Tetris Effect’, and observations from genre expert Felix Schniz on the secret behind the game’s ongoing success.

You can read the full interview here: https://www.kleinezeitung.at/wirtschaft/gaming/18530006/40-jahre-tetris-aus-dem-kalten-krieg-in-die-unsterblichkeit.

 

Authors: Yiying Wei (AAU, Austria), Hadi Amirpour (AAU, Austria), Ahmed Telili (INSA Rennes, France), Wassim Hamidouche (INSA Rennes, France), Guo Lu (Shanghai Jiao Tong University, China) and Christian Timmerer (AAU, Austria)

Venue: European Signal Processing Conference (EUSIPCO)

Abstract: Content-aware deep neural networks (DNNs) are trending in Internet video delivery. They enhance quality within bandwidth limits by transmitting videos as low-resolution (LR) bitstreams with overfitted super-resolution (SR) model streams to reconstruct high-resolution (HR) video on the decoder end. However, these methods underutilize spatial and temporal redundancy, compromising compression efficiency. In response, our proposed video compression framework introduces spatial-temporal video super-resolution (STVSR), which encodes videos into low spatial-temporal resolution (LSTR) content and a model stream, leveraging the combined spatial and temporal reconstruction capabilities of DNNs. Compared to the state-of-the-art approaches that consider only spatial SR, our approach achieves bitrate savings of 18.71% and 17.04% while maintaining the same PSNR and VMAF, respectively.
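To make the idea more concrete, below is a small, hedged PyTorch sketch of the delivery scheme the abstract describes: the encoder overfits a tiny space-time super-resolution model to a single video and ships its weights as a “model stream” alongside the low spatial-temporal resolution (LSTR) bitstream, and the decoder upsamples in both space and time. All names (TinySTVSR, overfit_on_content) and numbers are illustrative assumptions, not the paper’s actual implementation.

```python
# Hypothetical sketch of content-aware STVSR delivery: an LSTR bitstream plus a
# small, per-video "model stream"; the decoder restores resolution and frame rate.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySTVSR(nn.Module):
    """Toy joint space-time upscaler: 2x spatial, 2x temporal (frame interpolation)."""
    def __init__(self, ch=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, 3, 3, padding=1),
        )

    def forward(self, lstr):  # lstr: (B, 3, T, H, W)
        # Trilinear upsampling doubles frame rate and resolution;
        # the conv body learns a content-specific residual correction.
        up = F.interpolate(lstr, scale_factor=(2, 2, 2), mode="trilinear",
                           align_corners=False)
        return up + self.body(up)

def overfit_on_content(model, lstr, hr, steps=100):
    """Encoder-side overfitting: fit the model to this one video only."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.l1_loss(model(lstr), hr)
        loss.backward()
        opt.step()
    return model

# Dummy 8-frame 64x64 HR clip and its 4-frame 32x32 LSTR counterpart.
hr = torch.rand(1, 3, 8, 64, 64)
lstr = F.interpolate(hr[:, :, ::2], scale_factor=(1, 0.5, 0.5), mode="trilinear",
                     align_corners=False)
model = overfit_on_content(TinySTVSR(), lstr, hr)
# "Model stream": only the small overfitted weights travel with the LSTR bitstream.
print(sum(p.numel() for p in model.parameters()), "parameters in the model stream")
```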

Authors: Mohammad Ghasempour (AAU, Austria), Yiying Wei (AAU, Austria), Hadi Amirpour (AAU, Austria), and Christian Timmerer (AAU, Austria)

Venue: European Signal Processing Conference (EUSIPCO)

Abstract: Video coding relies heavily on reducing spatial and temporal redundancy to enable efficient transmission. To tackle the temporal redundancy, each video frame is predicted from the previously encoded frames, known as reference frames. The quality of this prediction is highly dependent on the quality of the reference frames. Recent advancements in machine learning are motivating the exploration of frame synthesis to generate high-quality reference frames. However, the efficacy of such models depends on training with content similar to that encountered during usage, which is challenging due to the diverse nature of video data. This paper introduces a content-aware reference frame synthesis to enhance inter-prediction efficiency. Unlike conventional approaches that rely on pre-trained models, our proposed framework optimizes a deep learning model for each content by fine-tuning only the last layer of the model, requiring the transmission of only a few kilobytes of additional information to the decoder. Experimental results show that the proposed framework yields significant bitrate savings of 12.76%, outperforming its counterpart in the pre-trained framework, which only achieves 5.13% savings in bitrate.
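As a rough illustration of the per-content adaptation step, the hedged PyTorch sketch below freezes a stand-in pre-trained frame-synthesis network, fine-tunes only its final layer, and estimates how much side information would need to be signaled to the decoder. The network, training loop, and sizes are assumptions for illustration, not the paper’s model.

```python
# Minimal sketch: freeze a pre-trained synthesis network and adapt only its last
# layer, so only a few kilobytes of weights are transmitted to the decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in "pre-trained" network: two stacked reference frames in, one synthesized frame out.
model = nn.Sequential(
    nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),  # last layer: the only part we adapt
)

# Freeze everything, then unfreeze only the final conv.
for p in model.parameters():
    p.requires_grad = False
last = model[-1]
for p in last.parameters():
    p.requires_grad = True

opt = torch.optim.Adam(last.parameters(), lr=1e-4)

# Dummy content: two decoded reference frames and the original frame to predict.
refs = torch.rand(1, 6, 64, 64)
target = torch.rand(1, 3, 64, 64)

for _ in range(50):  # content-specific fine-tuning at the encoder
    opt.zero_grad()
    loss = F.l1_loss(model(refs), target)
    loss.backward()
    opt.step()

# Only the fine-tuned last-layer weights become side information.
side_info_bytes = sum(p.numel() for p in last.parameters()) * 4  # fp32
print(f"extra signaling: ~{side_info_bytes / 1024:.1f} KiB")
```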

 

Authors: Zoha Azimi, Amritha Premkumar, Reza Farahani, Vignesh V Menon, Christian Timmerer, Radu Prodan

Venue: 32nd European Signal Processing Conference (EUSIPCO’24)

Abstract: Traditional per-title encoding approaches aim to maximize perceptual video quality by optimizing resolutions for each bitrate ladder representation. However, ensuring acceptable decoding times in video streaming, especially with the increased runtime complexity of modern codecs like Versatile Video Coding (VVC) compared to predecessors such as High Efficiency Video Coding (HEVC), is essential, as it leads to diminished buffering time, decreased energy consumption, and an improved Quality of Experience (QoE). This paper introduces a decoding complexity-sensitive bitrate ladder estimation scheme designed to optimize adaptive VVC streaming experiences. We design a customized bitrate ladder for the device configuration, ensuring that the decoding time remains below the threshold to mitigate adverse QoE issues such as rebuffering and to reduce energy consumption. The proposed scheme utilizes an eXtended PSNR (XPSNR)-optimized resolution prediction for each target bitrate, ensuring the highest possible perceptual quality within the constraints of device resolution and decoding time. Furthermore, it employs XGBoost-based models for predicting XPSNR, QP, and decoding time, utilizing the Inter-4K video dataset for training. The experimental results indicate that our approach achieves an average 28.39% reduction in decoding time using the VVC Test Model (VTM). Additionally, it achieves bitrate savings of 3.7% and 1.84% to maintain almost the same PSNR and XPSNR, respectively, for a display resolution constraint of 2160p and a decoding time constraint of 32 s.
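For readers who want a feel for the ladder-construction step, the sketch below applies the selection rule the abstract describes: for each target bitrate, choose the resolution with the highest predicted XPSNR whose predicted decoding time stays under the device threshold. The two predictor functions are toy stand-ins for the paper’s XGBoost models, and all numbers are made up.

```python
# Hedged sketch of decoding-complexity-aware bitrate ladder construction.
RESOLUTIONS = [540, 720, 1080, 1440, 2160]    # representation heights
BITRATES_KBPS = [1000, 3000, 6000, 12000]     # target bitrates

def predict_xpsnr(height, bitrate_kbps):
    # Placeholder for the XGBoost XPSNR model (higher is better).
    return 30 + 10 * (bitrate_kbps / 12000) - abs(height - 1080) / 400

def predict_decode_time_s(height, bitrate_kbps):
    # Placeholder for the XGBoost decoding-time model (seconds).
    return 5 + height / 100 + bitrate_kbps / 1000

def build_ladder(display_height=2160, max_decode_time_s=32.0):
    ladder = []
    for br in BITRATES_KBPS:
        candidates = [
            (predict_xpsnr(h, br), h)
            for h in RESOLUTIONS
            if h <= display_height and predict_decode_time_s(h, br) <= max_decode_time_s
        ]
        if candidates:                 # keep the constraint-satisfying best resolution
            _, best_h = max(candidates)
            ladder.append((br, best_h))
    return ladder

print(build_ladder())   # e.g. [(1000, 1080), (3000, 1080), ...]
```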


The Second Workshop on Serverless, Extreme-Scale, and Sustainable Graph Processing Systems (GraphSys ’24) took place in South Kensington, London, co-located with the 15th ACM/SPEC International Conference on Performance Engineering.

Reza Farahani gave a talk entitled “Serverless Workflow Management Systems on the Computing Continuum”.

Authors: Reza Farahani (AAU, Klagenfurt, Austria), Frank Loh (University of Würzburg, Germany), Dumitru Roman (Sintef, Oslo, Norway), Radu Prodan (AAU, Klagenfurt, Austria)

Abstract: The growing desire among application providers for a cost model based on pay-per-use, combined with the need for a seamlessly integrated platform to manage the complex workflows of their applications, has spurred the emergence of a promising computing paradigm known as serverless computing. Although serverless computing was initially considered for cloud environments, it has recently been extended to other layers of the computing continuum, i.e., edge and fog. This extension emphasizes that the proximity of computational resources to data sources can further reduce costs and improve performance and energy efficiency. However, orchestrating the computing continuum in complex application workflows, including a set of serverless functions, introduces new challenges. This paper investigates the opportunities and challenges introduced by serverless computing for workflow management systems (WMS) on the computing continuum. In addition, the paper provides a taxonomy of state-of-the-art WMSs and reviews their capabilities.

Furthermore, Reza Farahani and the backend Graph-Massivizer team met to discuss the Graph-Massivizer toolkit integration plan.

 

Dragi Kimovski co-chaired the 7th Workshop on Hot Topics in Cloud Computing Performance (HotCloudPerf 2024) within the International Conference on Performance Engineering (ICPE). During the workshop, he presented a paper titled “Hypergraphs: Facilitating High-Order Modeling of the Computing Continuum.” This event, held at Imperial College London on May 11, 2024, focused on various aspects of cloud computing performance, including elasticity, performance isolation, and dependability.

On May 8th, 2024, Mathias, Tom, and a group of helpers organized the first internal generative AI mini hackathon. More than ten people participated and tried their hand at various forms of generative AI – such as text, image, sound, and 3D model generation. After 8 hours of coding and testing, the common goal of creating an engine that generates new Pokémon-like creatures started to take shape and produced some presentable results! Much was learned about what is achievable in such a short time, and insights into many potential uses of generative AI were gained. The event also fostered contact between ITEC and Athena Lab employees, ISYS, and NES! Building on what was learned, a locally hosted LLM (akin to ChatGPT) for ITEC will be presented soon and possibly extended for university-wide use later. Thank you to everyone who attended; hopefully, similar events can be hosted again in the future!

 

On the weekend of April 27-28th, HaruCon, Carinthia’s youth pop culture convention, took place in Klagenfurt (https://www.harucon.at/). Felix, Tom, Claudia, Sebastian, and many Game Studies and Engineering students were present and represented GSE, TEWI, and the university. Tom and Sebastian held a workshop on how to enter the video game industry, while Felix held one introducing video game analysis. The convention was an enormous success, with more than 2000 visitors over two days. Flyers, buttons, and stickers were handed out to everyone so that awareness of the university as part of Klagenfurt’s youth culture could continue to grow. How to study at the university (especially video games) and many other burning questions were answered by our brave helpers more than a hundred times during the convention.

Authors: Zoha Azimi, Reza Farahani, Vignesh V Menon, Christian Timmerer, Radu Prodan

Venue: 16th International Conference on Quality of Multimedia Experience (QoMEX’24)

Abstract: As video streaming dominates Internet traffic, users constantly seek a better Quality of Experience (QoE), often resulting in increased energy consumption and a higher carbon footprint. The increasing focus on sustainability underscores the critical need to balance energy consumption and QoE in video streaming. This paper proposes a modular architecture that refines video encoding parameters by assessing video complexity and encoding settings for the prediction of energy consumption and video quality (based on Video Multimethod Assessment Fusion (VMAF)) using lightweight XGBoost models trained on the multi-dimensional video compression dataset (MVCD). We apply Explainable AI (XAI) techniques to identify the critical encoding parameters that influence the energy consumption and video quality prediction models and then tune them using a weighting strategy between energy consumption and video quality. The experimental results confirm that applying a suitable weighting factor to energy consumption in the x265 encoder results in a 46% decrease in energy consumption, with a 4-point drop in VMAF, staying below the Just Noticeable Difference (JND) threshold.
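The sketch below illustrates the kind of weighting strategy the abstract refers to: each candidate x265 configuration is scored by a weighted combination of predicted energy and predicted VMAF, and configurations whose quality drop would exceed a JND budget are discarded. The predictor functions, parameter candidates, and the 6-point JND budget are hypothetical stand-ins, not values from the paper.

```python
# Hedged sketch of an energy/quality weighting strategy for encoder configuration.
CONFIGS = [  # hypothetical x265 parameter candidates: (preset, crf)
    ("slower", 22), ("slow", 24), ("medium", 26), ("fast", 28),
]

def predict_energy_j(preset, crf):
    # Placeholder for the XGBoost energy model (joules, lower is better).
    base = {"slower": 400, "slow": 260, "medium": 180, "fast": 120}[preset]
    return base * (30 / crf)

def predict_vmaf(preset, crf):
    # Placeholder for the XGBoost VMAF model (0-100, higher is better).
    base = {"slower": 96, "slow": 94, "medium": 92, "fast": 89}[preset]
    return base - (crf - 22)

def pick_config(weight_energy=0.5, jnd_vmaf=6.0):
    ref_vmaf = max(predict_vmaf(p, c) for p, c in CONFIGS)
    best, best_score = None, float("inf")
    for preset, crf in CONFIGS:
        vmaf, energy = predict_vmaf(preset, crf), predict_energy_j(preset, crf)
        if ref_vmaf - vmaf > jnd_vmaf:       # keep the quality drop within the JND budget
            continue
        # Normalize both terms to comparable ranges before weighting.
        score = weight_energy * (energy / 400) + (1 - weight_energy) * (1 - vmaf / 100)
        if score < best_score:
            best, best_score = (preset, crf), score
    return best

print(pick_config(weight_energy=0.7))   # leans toward a lower-energy preset
```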