Our paper “Pixel-Based Tool Segmentation in Cataract Surgery Videos with Mask R-CNN” has been accepted for publication at the IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS, http://cbms2020.org).
Authors: Markus Fox, Klaus Schöffmann, Mario Taschwer
Abstract:
Automatically detecting surgical tools in recorded surgery videos is an important building block of further content-based video analysis. In ophthalmology, the results of such methods can support training and teaching of operation techniques and enable investigation of medical research questions on a dataset of recorded surgery videos. While previous methods used frame-based classification techniques to predict the presence of surgical tools but did not localize them, we apply a recent deep-learning segmentation method (Mask R-CNN) to localize and segment surgical tools used in ophthalmic cataract surgery. We add ground-truth annotations for multi-class instance segmentation to two existing datasets of cataract surgery videos and make the resulting datasets publicly available for research purposes. In the absence of comparable results from the literature, we tune and evaluate the Mask R-CNN approach on these datasets for instrument segmentation/localization and achieve promising results (61% mean average precision at 50% intersection over union for instance segmentation, working even better for bounding box detection or binary segmentation), establishing a reasonable baseline for further research. Moreover, we experiment with common data augmentation techniques and analyze the achieved segmentation performance with respect to each class (instrument), providing evidence for future improvements of this approach.
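The headline metric in the abstract counts a predicted instance mask as correct when its intersection over union (IoU) with a ground-truth mask reaches 50%. The following is a minimal sketch of that matching criterion only, not the paper's evaluation code; the function names and the nested-list mask representation are assumptions made for illustration.

```python
def mask_iou(pred, truth):
    """IoU of two equally sized binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks have no overlap to measure; report 0.0 by convention.
    return intersection / union if union else 0.0


def is_match(pred, truth, threshold=0.5):
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return mask_iou(pred, truth) >= threshold


# Toy 2x4 masks: the prediction covers 4 pixels, the ground truth 6,
# and all 4 predicted pixels lie inside the ground truth.
predicted = [[1, 1, 0, 0],
             [1, 1, 0, 0]]
ground_truth = [[1, 1, 1, 0],
                [1, 1, 1, 0]]
print(mask_iou(predicted, ground_truth), is_match(predicted, ground_truth))
```

Mean average precision then averages, over all instrument classes, the area under the precision-recall curve obtained by ranking predictions by confidence and applying this hit criterion.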
Acknowledgments:
This work was funded by the FWF Austrian Science Fund under grant P 31486-N31.

The IEEE Communications Society extends its appreciation to Hermann Hellwagner as a distinguished member of IEEE INFOCOM 2020.
IEEE INFOCOM 2020 – Online Conference July 6-9, 2020

Christian Timmerer

Authors: Venkata Phani Kumar M (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin) and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: Video delivery over the Internet has become more and more established in recent years due to the widespread use of Dynamic Adaptive Streaming over HTTP (DASH). The current DASH specification defines a hierarchical data model for Media Presentation Descriptions (MPDs) in terms of periods, adaptation sets, representations and segments. Although multi-period MPDs are widely used in live streaming scenarios, they are not fully utilized in Video-on-Demand (VoD) HTTP adaptive streaming (HAS) scenarios. In this paper, we introduce MiPSO, a framework for MultiPeriod per-Scene Optimization, to examine multiple periods in VoD HAS scenarios. MiPSO provides different encoded representations of a video at either (i) maximum possible quality or (ii) minimum possible bitrate, beneficial to both service providers and subscribers. In each period, the proposed framework adjusts the video representations (resolution-bitrate pairs) by taking into account the complexities of the video content, with the aim of achieving streams at either higher qualities or lower bitrates. The experimental evaluation with a test video data set shows that MiPSO reduces the average bitrate of streams with the same visual quality by approximately 10%, or increases the visual quality of streams by at least 1 dB in terms of Peak Signal-to-Noise Ratio (PSNR) at the same bitrate, compared to conventional approaches to video content delivery.
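The quality gain above is reported in PSNR, which is derived from the mean squared error between the original and the decoded samples. The sketch below shows the standard PSNR definition for 8-bit samples; it is not taken from MiPSO, and the function name and the flat-list input format are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equally long sequences of sample values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


# Every sample is off by one, so the MSE is exactly 1.
print(round(psnr([10, 20, 30, 40], [11, 19, 31, 39]), 2))
```

Because PSNR is logarithmic, a 1 dB gain at the same bitrate corresponds to a mean squared error that is roughly 21% lower.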

Keywords: Adaptive Streaming, Video-on-Demand, Per-Scene Encoding, Media Presentation Description

IEEE International Conference on Multimedia and Expo (ICME), July 6–10, 2020, London, United Kingdom

Link: https://www.2020.ieeeicme.org/

Authors: Babak Taraghi (Alpen-Adria-Universität Klagenfurt), Anatoliy Zabrovskiy (Alpen-Adria-Universität Klagenfurt), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin) and Hermann Hellwagner (Alpen-Adria-Universität Klagenfurt)

Abstract: Attempting to cope with fluctuations of network conditions in terms of available bandwidth, latency and packet loss, and to deliver the highest quality of video (and audio) content to users, research on adaptive video streaming has attracted intense efforts from the research community and huge investments from technology giants. How successful these efforts and investments are is a question that requires precise measurement of the results of those technological advancements. HTTP-based Adaptive Streaming (HAS) algorithms, which seek to improve video streaming over the Internet, introduce video bitrate adaptivity in a way that is scalable and efficient. However, the way each HAS implementation takes into account the wide spectrum of variables and configuration options makes measuring the results and visualizing the statistics of performance and quality of experience highly complex. In this paper, we introduce CAdViSE, our Cloud-based Adaptive Video Streaming Evaluation framework for the automated testing of adaptive media players. The paper aims to demonstrate a test environment which can be instantiated in a cloud infrastructure, examines multiple media players with different network attributes at defined points of the experiment time, and finally concludes the evaluation with visualized statistics and insights into the results.
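Applying "different network attributes at defined points of the experiment time" can be pictured as a timed schedule of network profiles. The sketch below is purely illustrative and does not reflect CAdViSE's actual configuration format or code; the class, its fields, and the example values are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkProfile:
    """Hypothetical network attributes that apply from start_s onwards."""
    start_s: float
    bandwidth_kbps: int
    delay_ms: int
    loss_pct: float


def active_profile(schedule, t):
    """Return the profile in force at experiment time t (seconds).

    The schedule must be sorted by start_s.
    """
    starts = [profile.start_s for profile in schedule]
    index = bisect_right(starts, t) - 1
    if index < 0:
        raise ValueError("t lies before the first scheduled profile")
    return schedule[index]


# Example schedule: good network, then a degraded phase, then partial recovery.
schedule = [
    NetworkProfile(start_s=0, bandwidth_kbps=5000, delay_ms=20, loss_pct=0.0),
    NetworkProfile(start_s=30, bandwidth_kbps=1000, delay_ms=100, loss_pct=1.0),
    NetworkProfile(start_s=60, bandwidth_kbps=3000, delay_ms=50, loss_pct=0.0),
]
print(active_profile(schedule, 45))
```

Running every media player against the same schedule is what makes their adaptation behaviour comparable across experiments.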

Keywords: HTTP Adaptive Streaming, Media Players, MPEG-DASH, Network Emulation, Automated Testing, Quality of Experience

Link: ACM Multimedia Systems Conference 2020 (MMSys 2020)

Christian Timmerer

Abstract: HTTP adaptive streaming with chunked transfer encoding can offer low-latency streaming without sacrificing the coding efficiency. This allows media segments to be delivered while still being packaged. However, conventional schemes often make widely inaccurate bandwidth measurements due to the presence of idle periods between the chunks, causing sub-optimal adaptation decisions. To address this issue, we earlier proposed ACTE (ABR for Chunked Transfer Encoding), a bandwidth prediction scheme for low-latency chunked streaming. While ACTE was a significant step forward, in this study we focus on two remaining open areas, namely (i) quantifying the impact of encoding parameters, including chunk and segment durations, bitrate levels, minimum interval between IDR-frames and frame rate, on ACTE, and (ii) exploring the impact of video content complexity on ACTE. We thoroughly investigate these questions and report on our findings. We also discuss some additional issues that arise in the context of pursuing very low latency HTTP video streaming.
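The measurement problem described above can be illustrated with a small calculation: dividing the segment size by the total download time includes the idle gaps between chunks and therefore underestimates the available bandwidth. The sketch below only contrasts the two ways of measuring; it is not the ACTE algorithm, and the function names and chunk timings are assumptions made for illustration.

```python
def segment_level_throughput(chunks):
    """Bits per second over the whole segment, idle gaps included.

    chunks: list of (start_s, end_s, size_bytes) tuples in arrival order.
    """
    total_bits = 8 * sum(size for _, _, size in chunks)
    elapsed = chunks[-1][1] - chunks[0][0]
    return total_bits / elapsed


def chunk_level_throughput(chunks):
    """Bits per second counting only the time spent receiving chunk data."""
    total_bits = 8 * sum(size for _, _, size in chunks)
    active = sum(end - start for start, end, _ in chunks)
    return total_bits / active


# Three 125 kB chunks, each received in 0.1 s, with 0.4 s of idle time in
# between because the encoder has not yet produced the next chunk.
chunks = [(0.0, 0.1, 125_000), (0.5, 0.6, 125_000), (1.0, 1.1, 125_000)]
print(segment_level_throughput(chunks), chunk_level_throughput(chunks))
```

With these toy numbers the segment-level estimate is about 2.7 Mbit/s, while the chunk-level estimate is about 10 Mbit/s, which shows how strongly idle periods can distort a conventional measurement.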

Authors: Abdelhak Bentaleb (National University of Singapore), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Ali C. Begen (Ozyegin University, Networked Media), Roger Zimmermann (National University of Singapore)

Keywords: HAS; ABR; DASH; CMAF; low-latency; HTTP chunked transfer encoding; bandwidth measurement and prediction; RLS; encoding parameters; FFmpeg

Christian Timmerer

Abstract: Volumetric media has the potential to provide the six degrees of freedom (6DoF) required by truly immersive media. However, achieving 6DoF requires ultra-high bandwidth transmissions, which real-world wide area networks cannot provide economically. Therefore, recent efforts have started to target efficient delivery of volumetric media, using a combination of compression and adaptive streaming techniques. It remains unclear, however, how the effects of such techniques on the user-perceived quality can be accurately evaluated. In this paper, we present the results of an extensive objective and subjective quality of experience (QoE) evaluation of volumetric 6DoF streaming. We use PCC-DASH, a standards-compliant means for HTTP adaptive streaming of scenes comprising multiple dynamic point cloud objects. By means of a thorough analysis, we investigate the perceived quality impact of the available bandwidth, rate adaptation algorithm, viewport prediction strategy and user’s motion within the scene. We determine which of these aspects has more impact on the user’s QoE, and to what extent subjective and objective assessments are aligned.

Authors: Jeroen van der Hooft (Ghent University), Maria Torres Vega (Ghent University), Christian Timmerer (Alpen-Adria-Universität Klagenfurt, Bitmovin), Ali C. Begen (Ozyegin University, Networked Media), Filip De Turck (Ghent University), Raimund Schatz (Alpen-Adria-Universität Klagenfurt & AIT Austrian Institute of Technology, Austria)

Keywords: Volumetric Media; HTTP Adaptive Streaming; 6DoF; MPEG V-PCC; QoE Assessment; Objective Metrics

International Conference on Quality of Multimedia Experience (QoMEX)
May 26-28, 2020, Athlone, Ireland
http://qomex2020.ie/

The manuscript “The Workflow Trace Archive: Open-Access Data from Public and Private Computing Infrastructures” has been accepted for publication in the A* ranked IEEE Transactions on Parallel and Distributed Systems (TPDS) journal.

Authors: Laurens Versluis, Roland Mathá, Sacheendra Talluri, Tim Hegeman, Radu Prodan, Ewa Deelman, and Alexandru Iosup

Abstract: Realistic, relevant, and reproducible experiments often need input traces collected from real-world environments. We focus in this work on traces of workflows, which are common in datacenters, clouds, and HPC infrastructures. We show that the state of the art in using workflow traces raises important issues: (1) the use of realistic traces is infrequent, and (2) the use of realistic, open-access traces even more so. Alleviating these issues, we introduce the Workflow Trace Archive (WTA), an open-access archive of workflow traces from diverse computing infrastructures and tooling to parse, validate, and analyze traces. The WTA includes >48 million workflows captured from >10 computing infrastructures, representing a broad diversity of trace domains and characteristics. To emphasize the importance of trace diversity, we characterize the WTA contents and analyze in simulation the impact of trace diversity on experiment results. Our results indicate significant differences in characteristics, properties, and workflow structures between workload sources, domains, and fields.

Acknowledgments: This work is supported by the projects Vidi MagnaData, Commit, the European Union’s Horizon 2020 Research and Innovation Programme, grant agreement number 801091 “ASPIDE”, and the National Science Foundation award number 1664162.

Abstract: Real-time video streaming traffic and related applications have witnessed significant growth in recent years. However, this has been accompanied by some challenging issues, predominantly resource utilization. IP multicasting, as a solution to this problem, suffers from many problems. Scalable video coding has not gained wide adoption in the industry, due to reduced compression efficiency and additional computational complexity. The emerging software-defined networking (SDN) and network function virtualization (NFV) paradigms enable researchers to cope with IP multicasting issues in novel ways. In this paper, by leveraging the SDN and NFV concepts, we introduce a cost-aware approach to provide advanced video coding (AVC)-based real-time video streaming services in the network. In this study, we use two types of virtualized network functions (VNFs): virtual reverse proxy (VRP) and virtual transcoder (VTF) functions. At the edge of the network, VRPs are responsible for collecting clients’ requests and sending them to an SDN controller. Then, executing a mixed-integer linear program (MILP) determines an optimal multicast tree from an appropriate set of video source servers to the optimal group of transcoders. The desired video is sent over the multicast tree. The VTFs transcode the received video segments and stream them to the requesting VRPs over unicast paths. To mitigate the time complexity of the proposed MILP model, we propose a heuristic algorithm that determines a near-optimal solution in a reasonable amount of time. Using the MiniNet emulator, we evaluate the proposed approach and show it achieves better performance in terms of cost and resource utilization in comparison with traditional multicast and unicast approaches.

Authors: Alireza Erfanian, Farzad Tashtarian, Reza Farahani, Christian Timmerer, Hermann Hellwagner

IEEE Conference on Network Softwarization (NetSoft), 29 June – 3 July 2020, Ghent, Belgium
http://netsoft2020.netsoft-ieee.org

Keywords: Dynamic Adaptive Streaming over HTTP (DASH), Real-time Video Streaming, Software Defined Networking (SDN), Video Transcoding, Network Function Virtualization (NFV)

Natalia Sokolova

The 1-page abstract “Pixel-Based Iris and Pupil Segmentation in Cataract Surgery Videos Using Mask R-CNN” was accepted at the workshop “Deep Learning for Biomedical Image Reconstruction” of the International Symposium on Biomedical Imaging, which will take place in Iowa City, Iowa, USA, 3-7 April.

Authors:
Natalia Sokolova, Mario Taschwer, Klaus Schoeffmann

Acknowledgment:
This work was funded by the FWF Austrian Science Fund under grant P 31486-N31.

The first review of the ASPIDE project took place on 25 February 2020 at the premises of the European Commission in Luxembourg. During the project review, a live demo of the platform for supporting extreme-scale applications was presented, and future research and development activities were discussed with the reviewers.
