Medical Multimedia Information Systems

Paper accepted: A channel allocation algorithm for cognitive radio users based on channel state predictors


Authors: Nakisa Shams (ETS, Montreal, Canada), Hadi Amirpour (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), and Mohammad Ghanbari (School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK)

Abstract: By utilizing spectrum holes in licensed frequency bands, cognitive radio networks can manage the radio spectrum efficiently. A significant improvement in spectrum use can be achieved by giving secondary users access to these spectrum holes. Predicting spectrum holes saves much of the energy otherwise consumed in detecting them, because secondary users can restrict themselves to channels that are predicted to be idle. However, collisions can occur either between a primary user and secondary users or among the secondary users themselves. This paper introduces a centralized channel allocation algorithm for a scenario with multiple secondary users that controls both primary and secondary collisions. The proposed allocation algorithm, which uses a channel state predictor, provides good performance with fairness among the secondary users while keeping their interference with the primary user minimal. The simulation results show that the probability of wrongly predicting an idle channel state in a multi-channel system is less than 0.9%. In addition, channel state prediction saves up to 73% of the sensing energy, and spectrum utilization can be improved by more than 77%.
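The core idea, predicting whether a licensed channel will be idle before spending energy sensing it, can be illustrated with a minimal sketch. The paper's predictor is more sophisticated (the keywords point to neural networks); the two-state model, the `predict_idle`/`choose_channel` helpers, and all transition probabilities below are simplified stand-ins, not the authors' algorithm:

```python
def predict_idle(p_stay_idle, p_become_idle, last_state):
    """Probability the channel is idle in the next slot, given its last
    observed state (1 = idle, 0 = busy) and two transition probabilities."""
    return p_stay_idle if last_state == 1 else p_become_idle

def choose_channel(channels):
    """Pick the channel with the highest predicted idle probability, so a
    secondary user only senses channels that are likely to be free."""
    return max(range(len(channels)),
               key=lambda i: predict_idle(*channels[i]))

# (p_stay_idle, p_become_idle, last_state) per channel -- made-up numbers
channels = [(0.9, 0.3, 0), (0.8, 0.4, 1), (0.6, 0.5, 1)]
print(choose_channel(channels))  # channel 1 (predicted idle with 0.8)
```

A centralized allocator as in the paper would additionally assign different predicted-idle channels to different secondary users to avoid secondary-secondary collisions.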

Keywords: Cognitive radio, Biological neural networks, Prediction, Idle channel.

International Congress on Information and Communication Technology

25-26 February 2021, London, UK


Grand Challenge Keynote on “Deep Video Understanding and the User” at ACMMM2020


Today, Klaus Schöffmann will present his keynote talk on “Deep Video Understanding and the User” at the ACM Multimedia 2020 Grand Challenge (GC) on “Deep Video Understanding”. The talk will highlight user aspects of automatic video content search based on deep neural networks and show several examples where users have serious trouble finding the correct content scene when video search systems rely too heavily on the “automatic search” scenario and ignore the user behind it. Registered ACMMM2020 participants can watch the talk online; the corresponding GC is scheduled for October 14 from 21:00-22:00 (15:00-16:00 New York time).


Natalia Sokolova

Paper “Pixel-Based Iris and Pupil Segmentation in Cataract Surgery Videos Using Mask R-CNN” was accepted at the workshop of the International Symposium on Biomedical Imaging


Authors: Natalia Sokolova, Mario Taschwer, Stephanie Sarny, Doris Putzgruber-Adamitsch and Klaus Schoeffmann

Abstract: Automatically detecting clinically relevant events in surgery video recordings is becoming increasingly important for documentary, educational, and scientific purposes in the medical domain. From a medical image analysis perspective, such events need to be treated individually and associated with specific visible objects or regions. In the field of cataract surgery (lens replacement in the human eye), pupil reaction (dilation or restriction) during surgery may lead to complications and hence represents a clinically relevant event. Its detection requires automatic segmentation and measurement of pupil and iris in recorded video frames. In this work, we contribute to research on pupil and iris segmentation methods by (1) providing a dataset of 82 annotated images for training and evaluating suitable machine learning algorithms, and (2) applying the Mask R-CNN algorithm to this problem, which – in contrast to existing techniques for pupil segmentation – predicts free-form pixel-accurate segmentation masks for iris and pupil.

The proposed approach achieves consistently high segmentation accuracy across several metrics while delivering acceptable prediction efficiency, establishing a promising basis for further segmentation and event detection approaches on eye surgery videos.
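Segmentation accuracy for tasks like this is commonly measured as intersection over union (IoU) between the predicted and annotated masks. A minimal sketch of that metric (illustrative only, not the authors' evaluation code; the toy masks are made up):

```python
def mask_iou(a, b):
    """Intersection over union of two binary masks given as flat 0/1 lists."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0

# Toy 1x8 "masks": predicted pupil region vs. annotated pupil region
pred = [0, 1, 1, 1, 1, 0, 0, 0]
gt   = [0, 0, 1, 1, 1, 1, 0, 0]
print(mask_iou(pred, gt))  # 3 overlapping / 5 in union = 0.6
```

The free-form masks predicted by Mask R-CNN are scored the same way, just over 2D pixel grids instead of toy lists.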


Philipp Moll

How Players Play Games: Observing the Influences of Game Mechanics


Authors: Philipp Moll, Veit Frick, Natascha Rauscher, Mathias Lux (Alpen-Adria-Universität Klagenfurt)
Abstract: The popularity of computer games is remarkably high and still growing. Despite the popularity and economic impact of games, data-driven research in game design, or to be more precise, in game mechanics – the game elements and rules that define how a game works – is still scarce. As data on user interaction in games is hard to come by, we propose a way to analyze players’ movements and actions based on video streams of games. Using this data, we formulate four hypotheses focusing on player experience, enjoyment, and interaction patterns, as well as their interrelations. Based on a user study of the popular game Fortnite, we discuss the interrelation between game mechanics, player enjoyment, and different player skill levels in the observed data.
Keywords: Online Games; Game Mechanics; Game Design; Video Analysis
Links: International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE)

Paper “Pixel-Based Tool Segmentation in Cataract Surgery Videos with Mask R-CNN” has been accepted at CBMS2020.


Our paper “Pixel-Based Tool Segmentation in Cataract Surgery Videos with Mask R-CNN” has been accepted for publication at the IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS 2020).
Authors: Markus Fox, Klaus Schöffmann, Mario Taschwer
Abstract: Automatically detecting surgical tools in recorded surgery videos is an important building block for further content-based video analysis. In ophthalmology, the results of such methods can support training and teaching of operation techniques and enable the investigation of medical research questions on a dataset of recorded surgery videos. While previous methods used frame-based classification techniques to predict the presence of surgical tools without localizing them, we apply a recent deep-learning segmentation method (Mask R-CNN) to localize and segment surgical tools used in ophthalmic cataract surgery. We add ground-truth annotations for multi-class instance segmentation to two existing datasets of cataract surgery videos and make the resulting datasets publicly available for research purposes. In the absence of comparable results from the literature, we tune and evaluate the Mask R-CNN approach on these datasets for instrument segmentation/localization and achieve promising results (61% mean average precision at 50% intersection over union for instance segmentation, working even better for bounding-box detection or binary segmentation), establishing a reasonable baseline for further research. Moreover, we experiment with common data augmentation techniques and analyze the achieved segmentation performance with respect to each class (instrument), providing evidence for future improvements of this approach.
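To unpack the reported metric: a detection counts as a true positive when its bounding box overlaps a ground-truth box with at least 50% intersection over union, and mean average precision is then computed over those decisions. A minimal sketch of that IoU criterion (illustrative only, not the paper's evaluation code; the boxes are made up):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def is_true_positive(pred_box, gt_box, thr=0.5):
    """A detection counts as correct at IoU >= thr (0.5 in the paper)."""
    return box_iou(pred_box, gt_box) >= thr

# Overlap of 80 px^2 over a union of 120 px^2: IoU ~ 0.67, so a match
print(is_true_positive((0, 0, 10, 10), (2, 0, 12, 10)))
```

Averaging precision over recall levels and over the instrument classes then yields the 61% mAP figure quoted above.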
This work was funded by the FWF Austrian Science Fund under grant P 31486-N31.

Natalia Sokolova

A 1-page abstract “Pixel-Based Iris and Pupil Segmentation in Cataract Surgery Videos Using Mask R-CNN” was accepted at the workshop of the International Symposium on Biomedical Imaging


The 1-page abstract “Pixel-Based Iris and Pupil Segmentation in Cataract Surgery Videos Using Mask R-CNN” was accepted at the workshop “Deep Learning for Biomedical Image Reconstruction” of the International Symposium on Biomedical Imaging, which will take place in Iowa City, Iowa, USA, 3-7 April.

Natalia Sokolova, Mario Taschwer, Klaus Schoeffmann

This work was funded by the FWF Austrian Science Fund under grant P 31486-N31.

MMM and VBS 2020 in Daejeon (South Korea)

At the 26th International Conference on MultiMedia Modeling (MMM 2020) in Daejeon, Korea, researchers from ITEC have successfully presented several scientific contributions to the multimedia community. First, Natalia Sokolova presented her first paper on “Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos”. Next, Sabrina Kletz presented her work on “Instrument Recognition in Laparoscopy for Technical Skill Assessment”. Finally, Andreas Leibetseder talked about “GLENDA: Gynecologic Laparoscopy Endometriosis Dataset”.

Natalia Sokolova

ITEC Staff at VBS 2020 in Daejeon (South Korea)

VBS 2020 in Daejeon (South Korea) was an amazing event with a lot of fun! Eleven teams, each consisting of two users (coming from 11 different countries), competed against each other in both a private session of about 5 hours and a public session of almost 3 hours. ITEC also participated with two teams. In total, all teams had to solve 22 challenging video retrieval tasks, issued on a shared dataset of 1000 hours of content (V3C1)! Many thanks go to the VBS teams, the VBS organizers, and the local organizers, who did a great job and made VBS2020 a wonderful and entertaining event!

Sabrina Kletz presented her work at the MIAR Workshop @ MICCAI 2019


Sabrina Kletz presented the paper “Learning the Representation of Instrument Images in Laparoscopy Video” at the MIAR Workshop @ MICCAI 2019 in Shenzhen, China.

Authors: Sabrina Kletz, Klaus Schoeffmann, Heinrich Husslein

Abstract: Automatic recognition of instruments in laparoscopy videos poses many challenges that need to be addressed, such as identifying multiple instruments appearing in various representations and under different lighting conditions, possibly occluded by other instruments, tissue, blood, or smoke. Given these challenges, it may be beneficial for recognition approaches to first detect instrument frames in a sequence of video frames and then investigate only those frames further. This pre-recognition step is also relevant for many other classification tasks in laparoscopy videos, such as action recognition or adverse event analysis. In this work, we address the task of binary classification to recognize video frames as either instrument or non-instrument images. We examine convolutional neural network models to learn the representation of instrument frames in videos and take a closer look at the learned activation patterns. For this task, GoogLeNet together with batch normalization is trained and validated using a publicly available dataset for instrument count classification. We compare transfer learning with learning from scratch and evaluate on datasets from cholecystectomy and gynecology. The evaluation shows that fine-tuning a pre-trained model on instrument and non-instrument images is much faster and more stable in learning than training a model from scratch.

Conference: 2019 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), October 13–17, 2019, Shenzhen, China

Track: Medical Imaging and Augmented Reality (MIAR) Workshop @MICCAI

Natalia Sokolova

MMM’20: Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos


Our paper has been accepted for publication at the MMM 2020 Conference on Multimedia Modeling. The work was conducted in the context of the ongoing OVID project.

Authors: Natalia Sokolova, Klaus Schoeffmann, Mario Taschwer (AAU Klagenfurt); Doris Putzgruber-Adamitsch, Yosuf El-Shabrawi (Klinikum Klagenfurt)

Abstract: In the field of ophthalmic surgery, many clinicians nowadays record their microscopic procedures with a video camera and use the recorded footage for later purposes, such as forensics, teaching, or training. However, to use the video material efficiently after surgery, its content needs to be analyzed automatically. Important semantic content to analyze and index in these short videos are operation instruments, since they indicate the corresponding operation phase and surgical action. Related work has already shown that it is possible to accurately detect instruments in cataract surgery videos. However, the underlying dataset (from the CATARACTS challenge) has very good visual quality, which does not reflect the typical quality of videos acquired in general hospitals. In this paper, we therefore analyze the generalization performance of deep learning models for instrument recognition under dataset change. More precisely, we trained models such as ResNet-50, Inception v3, and NASNet Mobile on a dataset of high visual quality (CATARACTS) and tested them on another dataset with low visual quality (Cataract-101), and vice versa. Our results show that generalizability is rather low overall, but clearly worse for the model trained on the high-quality dataset. Another important observation is that the trained models are able to detect similar instruments in the other dataset even if their appearance is different.