Nishant Saurabh

The manuscript “Expelliarmus: Semantic-Centric Virtual Machine Image Management in IaaS Clouds” has been accepted for publication in the Journal of Parallel and Distributed Computing (JPDC) (https://www.journals.elsevier.com/journal-of-parallel-and-distributed-computing).

Authors: Nishant Saurabh (University of Klagenfurt), Shajulin Benedict (Indian Institute of Information Technology, Kottayam), Jorge G. Barbosa (LIACC, Faculdade de Engenharia da Universidade do Porto), Radu Prodan (University of Klagenfurt).

Abstract: Infrastructure-as-a-service (IaaS) Clouds concurrently accommodate diverse sets of user requests, requiring an efficient strategy for storing and retrieving virtual machine images (VMIs) at a large scale. VMI storage management requires dealing with multiple VMIs, typically in the magnitude of gigabytes, which entails VMI sprawl issues hindering elastic resource management and provisioning. Nevertheless, existing techniques to facilitate VMI management overlook VMI semantics (i.e., at the level of base image and software packages), with either restricted possibility to identify and extract reusable functionalities or with higher VMI publish and retrieval overheads. In this paper, we design, implement and evaluate Expelliarmus, a novel VMI management system that helps to minimize storage, publish and retrieval overheads. To achieve this goal, Expelliarmus incorporates three complementary features. First, it makes use of VMIs modelled as semantic graphs to expedite the similarity computation between multiple VMIs. Second, Expelliarmus provides a semantic-aware VMI decomposition and base image selection to extract and store non-redundant base images and software packages. Third, Expelliarmus can also assemble VMIs based on the required software packages upon user request. We evaluate Expelliarmus through a representative set of synthetic Cloud VMIs on a real test-bed. Experimental results show that our semantic-centric approach is able to optimize repository size by 2.3-22 times compared to state-of-the-art systems (e.g., IBM’s Mirage and Hemera) with significant VMI publish and slight retrieval performance improvement.

Acknowledgements:

This work is supported by:

  • European Union’s Horizon 2020 research and innovation programme, grant agreement 825134, “Smart Social Media Ecosystem in a Blockchain Federated Environment (ARTICONF)”;
  • Austrian Agency for International Cooperation in Education and Research (OeAD-GmbH) and Indian Department of Science and Technology (DST), project number IN 20/2018, “Energy Aware Workflow Compiler for Future Heterogeneous Systems”.

The manuscript “The Workflow Trace Archive: Open-Access Data from Public and Private Computing Infrastructures” has been accepted for publication in the A* ranked IEEE Transactions on Parallel and Distributed Systems (TPDS) journal.

Authors: Laurens Versluis, Roland Mathá, Sacheendra Talluri, Tim Hegeman, Radu Prodan, Ewa Deelman, and Alexandru Iosup

Abstract: Realistic, relevant, and reproducible experiments often need input traces collected from real-world environments. We focus in this work on traces of workflows—common in datacenters, clouds, and HPC infrastructures. We show that the state-of-the-art in using workflow-traces raises important issues: (1) the use of realistic traces is infrequent, and (2) the use of realistic, open-access traces even more so. Alleviating these issues, we introduce the Workflow Trace Archive (WTA), an open-access archive of workflow traces from diverse computing infrastructures and tooling to parse, validate, and analyze traces. The WTA includes >48 million workflows captured from >10 computing infrastructures, representing a broad diversity of trace domains and characteristics. To emphasize the importance of trace diversity, we characterize the WTA contents and analyze in simulation the impact of trace diversity on experiment results. Our results indicate significant differences in characteristics, properties, and workflow structures between workload sources, domains, and fields.

Acknowledgments: This work is supported by the projects Vidi MagnaData, Commit, the European Union’s Horizon 2020 Research and Innovation Programme, grant agreement number 801091 “ASPIDE”, and the National Science Foundation award number 1664162.

The first review of the ASPIDE project took place on 25.02.2020 on the premises of the European Commission in Luxembourg. During the project review, a live demo of the platform for supporting extreme-scale applications was presented, and future research and development activities were discussed with the reviewers.

Aspide Review 2020

ARTICONF: EU first review

The ITEC team participated in the HiPEAC 2020 International Workshop on Exascale programming models for extreme data with a presentation titled “Monitoring data collection and mining for Exascale systems”. The ITEC team also attended the co-located ASPIDE meeting and actively participated in deciding the next research activities in the project.

Dragi Kimovski

Title of the talk: Mobility-Aware Scheduling of Extreme Data Workflows across the Computing Continuum

Abstract: The appearance of the Fog/Edge computing paradigm, as an emanation of the computing continuum closer to the edge of the network, unravels important opportunities for execution of complex business and scientific workflows near the data sources. The main characteristics of these workflows are (i) their distributed nature, (ii) the vast amount of data (in the order of petabytes) they generate and (iii) the strict latency requirements. Current workflow management approaches rely exclusively on Cloud data centers, which, due to their geographical distance from the data sources, could negatively influence the latency and cause violation of workflow requirements. It is therefore essential to research novel concepts for partial offloading of complex workflows closer to where the data is generated, thus reducing the communication latency and the need for frequent data transfers.

In this talk, we will explore the potential of the computing continuum for scheduling and partial offloading of complex workflows with strict response time requirements and expose the resource provisioning challenges related to the heterogeneity and mobility of the Fog/Edge environment. Consequently, we will discuss a novel mobility-aware Pareto-based approach for task offloading across the continuum, which considers three optimization objectives, namely response time, reliability, and financial cost. Besides, the approach introduces a Markov model to perform a single-step predictive analysis on the mobility of the Fog/Edge devices, thus constraining the task offloading optimization problem to devices that do not frequently move (roam) within the computing continuum. As a conclusion to the talk, we will discuss the efficiency of the presented approach, based on both a simulated and a real-world testbed environment tailored for a set of real-world biomedical, meteorological and astronomy workflows.

IWCoCo 2020 in Bologna
Prof. Radu Prodan