Automated Software Engineering Group

We are a group of 36 researchers in the Department of Computer Science at the University of York, developing ground-breaking methods and tools for automated analysis, design, development, deployment, and management of complex software-intensive systems. We collaborate closely with companies such as Rolls-Royce, IBM, Altran, and Volkswagen on projects co-funded by the European Commission, RCUK, InnovateUK and DSTL.

Members

Professor Dimitris Kolovos
Model-based software engineering, software repository mining and big-data persistence and processing architectures.
Professor Richard Paige
Model-based software engineering, agile development, service-oriented architectures, formal methods, object-oriented programming, systems engineering.
Dr Radu Calinescu
Formal methods for adaptive, autonomic, secure and dependable IT systems, automated, model- and metadata-driven software engineering, formal specification, modelling and verification. Leading the Trustworthy Adaptive and Autonomous Systems & Processes team.
Dr Javier Camara Moreno
Software engineering, self-adaptive systems, software architectures, applied formal methods, cyber-physical systems.
Dr Nicholas Matragkas
Model-based software engineering, software repository mining and software testing.
Dr Simos Gerasimou
Self-adaptive and autonomous systems with a focus on methods that enable dependable system adaptation, runtime quantitative verification, search-based software engineering, model-driven engineering, robotics and artificial intelligence.
Dr Thanos Zolotas
Model-based software engineering, big data architectures.
Dr Kostas Barmpis
Model-based software engineering, mining software repositories.
Dr Alfonso de la Vega
Model-based software engineering, model visualisation and comparison.
Dr Colin Paterson
Tool-supported formal approaches for engineering of adaptive and autonomous systems and processes, probabilistic model checking.
Dr Alfa Yohannis
Model-based software engineering, change-based model persistence.
Justin Cooper
Domain-specific languages, embedded at Rolls-Royce.
Jon Co
Model-based spreadsheet analysis, embedded at IBM.
Betty Sanchez
Model-based software engineering, Simulink, reactive modelling workflows.
Sultan Almutairi
Model-based software engineering, model-to-text transformation.
Nikos Fountoulakis
Software repository mining, code repository indexing.
Qurat Ul Ain Ali
Low-code software engineering.
Sorour Jahanbin
Low-code software engineering.
Panagiotis Kourouklidis
Low-code software engineering for machine learning.
Emad Alharbi
Metaheuristics for protein model synthesis from electron-density maps.
Ana Markovic
Multi-language distributed stream processing.
Premathas Somasekaram
Autonomous systems, cloud computing, high availability cluster and grid computing, machine learning, statistical analysis, Bayesian networks.
Ioannis Stefanakos
Formal methods, model-driven software engineering.
Saud Yonbawi
Self-adaptation in distributed systems, runtime quantitative verification.
Patrick Neubauer
Model-based software engineering, mining software repositories.

Recent Publications

Maintaining driver attentiveness in shared-control autonomous driving

Calinescu, R., Alasmari, N. & Gleirscher, M., 12 Mar 2021, (Accepted/In press) Software Engineering for Adaptive and Self-Managing Systems. IEEE, (IEEE Conference Proceedings).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Software Engineering for Adaptive and Self-Managing Systems
Date: Accepted/In press - 12 Mar 2021
Publisher: IEEE
Original language: English

Publication series

Name: IEEE Conference Proceedings
Publisher: IEEE

Abstract

We present a work-in-progress approach to improving driver attentiveness in cars provided with automated driving systems. The approach is based on a control loop that monitors the driver’s biometrics (eye movement, heart rate, etc.) and the state of the car; analyses the driver’s attentiveness using a deep neural network; plans driver alerts and changes in the speed of the car using a formally verified controller; and executes this plan using acoustic, visual and haptic actuators. The paper presents (i) the self-adaptive system formed by this monitor-analyse-plan-execute (MAPE) control loop, the car and the monitored driver, and (ii) the use of probabilistic model checking to synthesise the controller for the planning step of the MAPE loop.

Publication details

Journal: Formal Aspects of Computing
Date: Accepted/In press - 11 Mar 2021
Original language: English

Abstract

Machines, such as mobile robots and delivery drones, incorporate controllers responsible for a task while handling risk (e.g. anticipating and mitigating hazards; and preventing and alleviating accidents). We refer to machines with this capability as risk-aware machines. Risk awareness includes robustness and resilience, and complicates monitoring (i.e., introspection, sensing, prediction), decision making, and control. From an engineering perspective, risk awareness adds a range of dependability requirements to system assurance. Such assurance mandates a correct-by-construction approach to controller design, based on mathematical theory. We introduce RiskStructures, an algebraic framework for risk modelling intended to support the design of safety controllers for risk-aware machines. Using the concept of a risk factor as a modelling primitive, this framework provides facilities to construct, examine, and assure these controllers. We prove desirable algebraic properties of these facilities, and demonstrate their applicability by using them to specify key aspects of safety controllers for risk-aware automated driving and collaborative robots.

Publication details

Journal: ACM Computing Surveys
Date: Accepted/In press - 5 Mar 2021
Number of pages: 37
Original language: English

Abstract

Machine learning has evolved into an enabling technology for a wide range of highly successful applications. The potential for this success to continue and accelerate has placed machine learning (ML) at the top of research, economic and political agendas. Such unprecedented interest is fuelled by a vision of ML applicability extending to healthcare, transportation, defence and other domains of great societal importance. Achieving this vision requires the use of ML in safety-critical applications that demand levels of assurance beyond those needed for current ML applications. Our paper provides a comprehensive survey of the state-of-the-art in the assurance of ML, i.e. in the generation of evidence that ML is sufficiently safe for its intended use. The survey covers the methods capable of providing such evidence at different stages of the machine learning lifecycle, i.e. of the complex, iterative process that starts with the collection of the data used to train an ML component for a system, and ends with the deployment of that component within the system. The paper begins with a systematic presentation of the ML lifecycle and its stages. We then define assurance desiderata for each stage, review existing methods that contribute to achieving these desiderata, and identify open challenges that require further research.

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Publication details

Date: Published - 3 Feb 2021
Original language: English

Reinforcement Learning with Quantitative Verification for Assured Multi-Agent Policies

Riley, J., Calinescu, R., Paterson, C., Kudenko, D. & Banks, A., Feb 2021, 13th International Conference on Agents and Artificial Intelligence.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 13th International Conference on Agents and Artificial Intelligence
Date: Accepted/In press - 12 Nov 2020
Date: Published (current) - Feb 2021
Original language: English

Efficiently Querying Large-Scale Heterogeneous Models

Ali, Q. U. A., Kolovos, D. & Barmpis, K., 27 Oct 2020, Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings. New York, NY, USA: Association for Computing Machinery (ACM), (MODELS '20).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings
Date: Accepted/In press - 22 Aug 2020
Date: Published (current) - 27 Oct 2020
Publisher: Association for Computing Machinery (ACM)
Place of Publication: New York, NY, USA
Original language: English
ISBN (Print): 9781450381352

Publication series

Name: MODELS '20
Publisher: Association for Computing Machinery

Abstract

With the increase in the complexity of software systems, the size and the complexity of underlying models also increase proportionally. In a low-code system, models can be stored in different backend technologies and can be represented in various formats. Tailored high-level query languages are used to query such heterogeneous models, but typically this has a significant impact on performance. Our main aim is to propose optimization strategies that can help to query large models in various formats efficiently. In this paper, we present an approach based on compile-time static analysis and specific query optimizers/translators to improve the performance of complex queries over large-scale heterogeneous models. The proposed approach aims to bring efficiency in terms of query execution time and memory footprint, when compared to the naive query execution for low-code platforms.

Towards Model-Based Development of Decentralised Peer-to-Peer Data Vaults

Yohannis, A., De La Vega, A., Kahrobaei, D. & Kolovos, D., 18 Oct 2020, ACM / IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS). 8 p.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: ACM / IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS)
Date: Accepted/In press - 2020
Date: Published (current) - 18 Oct 2020
Number of pages: 8
Original language: English

Abstract

Using centralised data storage systems has been the standard practice followed by online service providers when managing the personal data of their users. This method requires users to trust these providers and, to some extent, users are not in full control over their data. The development of applications around decentralised data vaults, i.e., encrypted storage systems located in user-managed devices, can give this control back to the users as sole owners of the data. However, the development of such applications is not effort-free, and it requires developers to have specialised knowledge, such as how to deploy secure and peer-to-peer communication systems. We present Vaultage, a model-based framework that can simplify the development of data vault applications. We demonstrate its core features through a social network application case study and include some initial evaluation results, showing Vaultage's code generation capabilities and some profiling analysis of the generated network components.

Bibliographical note

© 2020 Association for Computing Machinery. This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

To build, or not to build: ModelFlow, a build solution for MDE projects

Sanchez, B., Kolovos, D. & Paige, R., 16 Oct 2020, Proceedings - 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2020. Association for Computing Machinery, Inc, p. 1-11 11 p. (Proceedings - 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2020).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Proceedings - 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2020
Date: Published - 16 Oct 2020
Pages: 1-11
Number of pages: 11
Publisher: Association for Computing Machinery, Inc
Original language: English
ISBN (Electronic): 9781450370196

Publication series

Name: Proceedings - 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2020

Abstract

Conservative execution, end-to-end traceability, and context-aware resource handling are desirable features in model management build processes. Yet, none of the existing MDE-dedicated build tools (e.g. MTC-Flow, MWE2) support such features. An initial investigation of general-purpose build tools (e.g. ANT, Gradle) to assess whether we could build a workflow engine with support for these desirable features on top of them revealed limitations that could act as roadblocks for our work. As such, we decided to design and implement a new MDE-focused build tool (ModelFlow) from scratch to avoid being constrained by assumptions and technical constraints of these tools. We evaluated whether this decision was sensible by attempting to replicate its behaviour with Gradle in a typical model-driven engineering scenario. The evaluation highlighted scenarios where Gradle could not be extended to achieve the desirable behaviour, which validates the decision to not base ModelFlow on top of it.

Polyglot and Distributed Software Repository Mining with Crossflow

Matragkas, N., Kolovos, D., Barmpis, K., Neubauer, P. & Paige, R., Oct 2020, MSR '20: Proceedings of the 17th International Conference on Mining Software Repositories. p. 374-384 11 p.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: MSR '20: Proceedings of the 17th International Conference on Mining Software Repositories
Date: Published - Oct 2020
Pages: 374-384
Number of pages: 11
Original language: English

Publication details

Journal: Software and Systems Modeling
Date: Accepted/In press - 10 Jun 2020
Date: Published (current) - 11 Aug 2020
Number of pages: 24
Original language: English

Abstract

UML profiles offer an intuitive way for developers to build domain-specific modelling languages by reusing and extending UML concepts. Eclipse Papyrus is a powerful open-source UML modelling tool which supports UML profiling. However, with power comes complexity: implementing non-trivial UML profiles and their supporting editors in Papyrus typically requires the developers to handcraft and maintain a number of interconnected models through a loosely guided, labour-intensive and error-prone process. We demonstrate how metamodel annotations and model transformation techniques can help manage the complexity of Papyrus in the creation of UML profiles and their supporting editors. We present Jorvik, an open-source tool that implements the proposed approach. We illustrate its functionality with examples, and we evaluate our approach by comparing it against manual UML profile specification and editor implementation using a non-trivial enterprise modelling language (Archimate) as a case study. We also perform a user study in which developers are asked to produce identical editors using both Papyrus and Jorvik, demonstrating the substantial productivity and maintainability benefits that Jorvik delivers.

Bibliographical note

© The Author(s) 2020

Efficient Generation of Graphical Model Views via Lazy Model-to-Text Transformation

Kolovos, D., De La Vega, A. & Cooper, J., 13 Jul 2020, (Accepted/In press) ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS ’20).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS ’20)
Date: Accepted/In press - 13 Jul 2020
Original language: English

Abstract

Producing graphical views from software and system models is often desirable for communication and comprehension purposes, even when graphical model editing capabilities are not required, either because the preferred editable concrete syntax of the models is text-based or because the models were extracted via reverse engineering. To support such scenarios, we present a novel approach for efficient rule-based generation of transient graphical views from models using lazy model-to-text transformation, and an implementation of the proposed approach in the form of an open-source Eclipse plugin named Picto. Picto builds on top of mature visualisation software such as Graphviz and PlantUML and supports, among others, composite views, layers, and multi-model visualisation. We illustrate how Picto can be used to produce various forms of graphical views such as node-edge diagrams, tables and sequence-like diagrams, and we demonstrate the efficiency benefits of the lazy view generation approach against batch model-to-text transformation for generating views from large models.

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Supporting Robotic Software Migration Using Static Analysis and Model-Driven Engineering

Gerasimou, S., Wood, S., Matragkas, N., Kolovos, D. & Paige, R. F., 13 Jul 2020, (Accepted/In press) ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS ’20).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS ’20)
Date: Accepted/In press - 13 Jul 2020
Original language: English

Abstract

The wide use of robotic systems has contributed to the development of robotic software that is highly coupled to the hardware platform running the robotic system. Due to increased maintenance cost or changing business priorities, the robotic hardware is infrequently upgraded, thus increasing the risk of technology stagnation. Reducing this risk entails migrating the system and its software to a new hardware platform. Conventional software engineering practices such as complete re-development and code-based migration, albeit useful in mitigating these obsolescence issues, are time-consuming and overly expensive. Our RoboSMi model-driven approach supports the migration of the software controlling a robotic system between hardware platforms. First, RoboSMi executes static analysis on the robotic software of the source hardware platform to identify platform-dependent and platform-agnostic software constructs. By analysing a model that expresses the architecture of robotic components on the target platform, RoboSMi establishes the hardware configuration of those components and suggests software libraries for each component whose execution will enable the robotic software to control the components. Finally, RoboSMi produces software for the target platform through code generation and indicates areas that require manual intervention by robotic engineers to complete the migration. We evaluate the applicability of RoboSMi and analyse the level of automation and performance provided by its use by migrating two robotic systems, deployed for an environmental monitoring mission and a line-following mission, from a Propeller Activity Board to an Arduino Uno.

Empirical Analysis of 1-edit Degree Patches in Syntax-Based Automatic Program Repair

Dziurzanski, P., Gerasimou, S., Kolovos, D. & Matragkas, N., 20 Mar 2020, (Accepted/In press) IEEE Congress on Evolutionary Computation.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: IEEE Congress on Evolutionary Computation
Date: Accepted/In press - 20 Mar 2020
Original language: English

Abstract

In this paper, software patches modifying a single line (aka 1-edit degree patches) of buggy Java open-source projects have been generated automatically using computational search and experimentally evaluated. We carried out what is presumably the largest experiment to date related to 1-edit degree patches, consisting of almost 27,000 computational jobs upper bounded by 107,000 computational hours. Our experiments show the benefits and drawbacks of this kind of patch. In particular, the search space size has been shown to be reduced by several orders of magnitude. The volume of tests that can be filtered out without any negative impact while generating 1-edit degree patches has been increased by about 97%. Finally, the effectiveness of finding 1-edit plausible patches is compared with multi-line plausible patches found with state-of-the-art syntax-based Automatic Program Repair tools. It is shown that despite patching fewer bugs in total, 1-edit degree patches have the potential to patch some extra bugs.

Intelligent Run-Time Partitioning of Low-Code System Models

Jahanbin, S., Kolovos, D. & Gerasimou, S., 2020, Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings. New York, NY, USA: Association for Computing Machinery (ACM), (MODELS '20).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Proceedings of the 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems: Companion Proceedings
Date: Published - 2020
Publisher: Association for Computing Machinery (ACM)
Place of Publication: New York, NY, USA
Original language: English
ISBN (Print): 9781450381352

Publication series

Name: MODELS '20
Publisher: Association for Computing Machinery

Abstract

Over the last two decades, several dedicated languages have been proposed to support model management activities such as model validation, transformation, and code generation. As software systems become more complex, underlying system models grow proportionally in both size and complexity. To keep up, model management languages and their execution engines need to provide increasingly sophisticated mechanisms for making the most efficient use of the available system resources. Efficiency is particularly important when model-driven technologies are used in the context of low-code platforms where all model processing happens in pay-per-use cloud resources. In this paper, we present our vision for an approach that leverages sophisticated static program analysis of model management programs to identify, load, process and transparently discard relevant model partitions, instead of naively loading the entire models into memory and keeping them loaded for the duration of the execution of the program. In this way, model management programs will be able to process system models faster with a reduced memory footprint, and resources will be freed that will allow them to accommodate even larger models.

Publication details

Journal: Software and Systems Modeling
Date: Accepted/In press - 1 Jan 2020
Date: Published (current) - 18 May 2020
Original language: English

Publication details

Journal: Software and Systems Modeling
Date: Accepted/In press - 4 Dec 2019
Date: Published (current) - 1 Jan 2020
Issue number: 1
Volume: 19
Number of pages: 9
Pages (from-to): 5-13
Original language: English

Abstract

In 2017 and 2018, two events were held—in Marburg, Germany, and San Vigilio di Marebbe, Italy, respectively—focusing on an analysis of the state of research, state of practice, and state of the art in model-driven engineering (MDE). The events brought together experts from industry, academia, and the open-source community to assess what has changed in research in MDE over the last 10 years, what challenges remain, and what new challenges have arisen. This article reports on the results of those meetings, and presents a set of grand challenges that emerged from discussions and synthesis. These challenges could lead to research initiatives for the community going forward.

Bibliographical note

© The Author(s) 2020

Fast Parametric Model Checking through Model Fragmentation

Fang, X., Calinescu, R., Gerasimou, S. & Alhwikem, F., 18 Dec 2020, (Accepted/In press) 43rd International Conference on Software Engineering. ACM.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 43rd International Conference on Software Engineering
Date: Accepted/In press - 18 Dec 2020
Publisher: ACM
Original language: English

Abstract

Parametric model checking (PMC) computes algebraic formulae that express key non-functional properties of a system (reliability, performance, etc.) as rational functions of the system and environment parameters. In software engineering, PMC formulae can be used during design, e.g., to analyse the sensitivity of different system architectures to parametric variability, or to find optimal system configurations. They can also be used at runtime, e.g., to check if non-functional requirements are still satisfied after environmental changes, or to select new configurations after such changes. However, current PMC techniques do not scale well to systems with complex behaviour and more than a few parameters. Our paper introduces a fast PMC (fPMC) approach that overcomes this limitation, extending the applicability of PMC to a broader class of systems than previously possible. To this end, fPMC partitions the Markov models that PMC operates with into fragments whose reachability properties are analysed independently, and obtains PMC reachability formulae by combining the results of these fragment analyses. To demonstrate the effectiveness of fPMC, we show how our fPMC tool can analyse three systems (taken from the research literature, and belonging to different application domains) with which current PMC techniques and tools struggle.

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Safety Controller Synthesis for Collaborative Robots

Gleirscher, M. & Calinescu, R., 28 Oct 2020, Proceedings of the 25th International Conference on Engineering of Complex Computer Systems (ICECCS).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Proceedings of the 25th International Conference on Engineering of Complex Computer Systems (ICECCS)
Date: Published - 28 Oct 2020
Original language: English

Publication details

Journal: Acta Crystallographica Section D: Structural Biology
Date: Accepted/In press - 31 Jul 2020
Date: E-pub ahead of print - 19 Aug 2020
Date: Published (current) - 1 Sep 2020
Issue number: 9
Volume: 76
Number of pages: 10
Pages (from-to): 814-823
Early online date: 19/08/20
Original language: English

Abstract

For the last two decades, researchers have worked independently to automate protein model building, and four widely used software pipelines have been developed for this purpose: ARP/wARP, Buccaneer, Phenix AutoBuild and SHELXE. Here, the usefulness of combining these pipelines to improve the built protein structures by running them in pairwise combinations is examined. The results show that integrating these pipelines can lead to significant improvements in structure completeness and Rfree. In particular, running Phenix AutoBuild after Buccaneer improved structure completeness for 29% and 75% of the data sets that were examined at the original resolution and at a simulated lower resolution, respectively, compared with running Phenix AutoBuild on its own. In contrast, Phenix AutoBuild alone produced better structure completeness than the two pipelines combined for only 7% and 3% of these data sets.

Towards Formal Verification of Control Algorithms for Autonomous Marine Vehicles

Foster, S. D., Gleirscher, M. & Calinescu, R., 2 Aug 2020, (Accepted/In press) Proceeding of the 25th International Conference on Engineering of Complex Computer Systems (ICECCS 2020). IEEE, 6 p.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Proceeding of the 25th International Conference on Engineering of Complex Computer Systems (ICECCS 2020)
Date: Accepted/In press - 2 Aug 2020
Number of pages: 6
Publisher: IEEE
Original language: English

Abstract

The use of autonomous vehicles in real-world applications is often precluded by the difficulty of providing safety guarantees for their complex controllers. The simulation-based testing of these controllers cannot deliver sufficient safety guarantees, and the use of formal verification is very challenging due to the hybrid nature of the autonomous vehicles. Our work-in-progress paper introduces a formal verification approach that addresses this challenge by integrating the numerical computation of such a system (in GNU/Octave) with its hybrid system verification by means of a proof assistant (Isabelle). To show the effectiveness of our approach, we use it to verify differential invariants of an Autonomous Marine Vehicle with a controller switching between multiple modes.

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Understanding Uncertainty in Self-adaptive Systems

Calinescu, R., Mirandola, R., Perez-Palacin, D. & Weyns, D., 17 Jul 2020, 1st IEEE International Conference on Autonomic Computing and Self-Organizing Systems.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 1st IEEE International Conference on Autonomic Computing and Self-Organizing Systems
Date: Accepted/In press - 21 Jun 2020
Date: Published (current) - 17 Jul 2020
Original language: English

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Importance-Driven Deep Learning System Testing

Gerasimou, S., Eniser, H. F. & Sen, A., 2020, 42nd International Conference on Software Engineering.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 42nd International Conference on Software Engineering
Date: Accepted/In press - 9 Dec 2019
Date: Published (current) - 2020
Original language: English

Abstract

Deep Learning (DL) systems are key enablers for engineering intelligent applications due to their ability to solve complex tasks such as image recognition and machine translation. Nevertheless, using DL systems in safety- and security-critical applications requires providing testing evidence for their dependable operation. Recent research in this direction focuses on adapting testing criteria from traditional software engineering as a means of increasing confidence in their correct behaviour. However, these criteria are inadequate in capturing the intrinsic properties exhibited by these systems. We bridge this gap by introducing DeepImportance, a systematic testing methodology accompanied by an Importance-Driven (IDC) test adequacy criterion for DL systems. Applying IDC makes it possible to establish a layer-wise functional understanding of the importance of DL system components and to use this information to guide the generation of semantically-diverse test sets. Our empirical evaluation on several DL systems, across multiple DL datasets and with state-of-the-art adversarial generation techniques demonstrates the usefulness and effectiveness of DeepImportance and its ability to guide the engineering of more robust DL systems.

Interval Change-Point Detection for Runtime Probabilistic Model Checking

Zhao, X., Calinescu, R., Gerasimou, S., Robu, V. & Flynn, D., 2020, 35th IEEE/ACM International Conference on Automated Software Engineering.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 35th IEEE/ACM International Conference on Automated Software Engineering
Date: Accepted/In press - 30 Jul 2020
Date: Published (current) - 2020
Original language: English

Publication details

Journal: CEUR Workshop Proceedings
Date: Published - 6 Dec 2019
Volume: 2513
Number of pages: 14
Pages (from-to): 67-80
Original language: English

Abstract

Domain-specific languages enable concise and precise formalization of domain concepts and promote direct employment by domain experts. Therefore, syntactic constructs are introduced to empower users to associate concepts and relationships with visual textual symbols. Model-based language engineering facilitates the description of concepts and relationships in an abstract manner. However, concrete representations are commonly attached to abstract domain representations, such as annotations in metamodels, or directly encoded into the language grammar, and thus introduce redundancy between metamodel elements and grammar elements. In this work we propose an approach that enables autonomous development and maintenance of domain concepts and textual language notations in a distinctive and metamodel-agnostic manner by employing style models containing grammar rule templates and injection-based property selection. We provide an implementation and showcase the proposed notation specification language in a comparison with state-of-the-art practices during the creation of notations for an executable domain-specific modeling language based on the Eclipse Modeling Framework and Xtext.

Bibliographical note

© 2019 The Authors.

On-the-fly Translation and Execution of OCL-like Queries on Simulink Models

Sanchez Pina, B. A., Zolotas, A., Hoyos Rodriguez, H., Kolovos, D. & Paige, R. F., 19 Jun 2019, (Accepted/In press) Proceedings of the ACM/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Proceedings of the ACM/IEEE 22nd International Conference on Model Driven Engineering Languages and Systems
Date: Accepted/In press - 19 Jun 2019
Original language: English

Publication details

Journal: Software and Systems Modeling
Date: Accepted/In press - 12 Apr 2019
Date: E-pub ahead of print (current) - 11 May 2019
Number of pages: 37
Early online date: 11/05/19
Original language: English

Abstract

While the majority of research on Model-Based Software Engineering revolves around open-source modelling frameworks such as the Eclipse Modelling Framework (EMF), the use of commercial and closed-source modelling tools such as RSA, Rhapsody, MagicDraw and Enterprise Architect appears to be the norm in industry at present. This technical gap can prohibit industrial users from reaping the benefits of state-of-the-art research-based tools in their practice. In this paper, we discuss an attempt to bridge a proprietary UML modelling tool (PTC Integrity Modeller), which is used for model-based development of safety-critical systems at Rolls-Royce, with an open-source family of languages for automated model management (Epsilon). We present the architecture of our solution, the challenges we encountered in developing it, and a performance comparison against the tool's built-in scripting interface. In addition, we use the bridge in a real-world industrial case study that involves the co-ordination with other bridges between proprietary tools and Epsilon.

Bibliographical note

© The Author(s) 2019

Crossflow: A framework for distributed mining of software repositories

Kolovos, D., Neubauer, P., Barmpis, K., Matragkas, N. & Paige, R., 1 May 2019, Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019. IEEE Computer Society Press, p. 155-159 5 p. 8816734. (IEEE International Working Conference on Mining Software Repositories; vol. 2019-May).

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: Proceedings - 2019 IEEE/ACM 16th International Conference on Mining Software Repositories, MSR 2019
Date: Published - 1 May 2019
Pages: 155-159
Number of pages: 5
Publisher: IEEE Computer Society Press
Original language: English
ISBN (Electronic): 9781728134123

Publication series

Name: IEEE International Working Conference on Mining Software Repositories
Volume: 2019-May
ISSN (Print): 2160-1852
ISSN (Electronic): 2160-1860

Abstract

Large-scale software repository mining typically requires substantial storage and computational resources, and often involves a large number of calls to (rate-limited) APIs such as those of GitHub and StackOverflow. This creates a growing need for distributed execution of repository mining programs to which remote collaborators can contribute computational and storage resources, as well as API quotas (ideally without sharing API access tokens or credentials). In this paper we introduce Crossflow, a novel framework for building distributed repository mining programs. We demonstrate how Crossflow can delegate mining jobs to remote workers and cache their results, and how workers can implement advanced behaviour such as load balancing and rejecting jobs they cannot perform (e.g. due to lack of space or of credentials for a specific API).

Publication details

Journal: International Journal on Software & Systems Modelling
Date: Accepted/In press - 11 Jan 2018
Date: E-pub ahead of print - 23 Jan 2018
Date: Published (current) - 8 Feb 2019
Issue number: 1
Volume: 18
Number of pages: 23
Pages (from-to): 345-366
Early online date: 23/01/18
Original language: English

Abstract

Flexible or bottom-up model-driven engineering (MDE) is an emerging approach to domain and systems modelling. Domain experts, who have detailed domain knowledge, typically lack the technical expertise to transfer this knowledge using traditional MDE tools. Flexible MDE approaches tackle this challenge by promoting the use of simple drawing tools to increase the involvement of domain experts in the language definition process. In such approaches, no metamodel is created upfront; instead, the process starts with the definition of example models that will be used to infer the metamodel. Pre-defined metamodels created by MDE experts may miss important concepts of the domain and thus restrict their expressiveness. However, the lack of a metamodel that encodes the semantics of conforming models has some drawbacks, among them that models may contain elements that are unintentionally left untyped. In this paper, we propose the use of classification algorithms to help with the inference of the types of such untyped elements. We evaluate the proposed approach on a number of randomly generated example models from various domains. The rate of correct type prediction varies from 23% to 100% depending on the domain, the proportion of elements that were left untyped and the prediction algorithm used.

Socio-Cyber-Physical Systems: Models, Opportunities, Open Challenges

Calinescu, R. C., Camara Moreno, J. & Paterson, C., 2019, (Accepted/In press) 5th International Workshop on Software Engineering for Smart Cyber-Physical Systems.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 5th International Workshop on Software Engineering for Smart Cyber-Physical Systems
Date: Accepted/In press - 2019
Original language: English

Abstract

Almost without exception, cyber-physical systems operate alongside, for the benefit of, and supported by humans. Unsurprisingly, disregarding their social aspects during development and operation renders these systems ineffective. In this paper, we explore approaches to modelling and reasoning about the human involvement in socio-cyber-physical systems (SCPS). To provide an unbiased perspective, we describe both the opportunities afforded by the presence of human agents, and the challenges associated with ensuring that their modelling is sufficiently accurate to support decision making during SCPS development and, if applicable, at run-time. Using SCPS examples from emergency management and assisted living, we illustrate how recent advances in stochastic modelling, analysis and synthesis can be used to exploit human observations about the impact of natural and man-made disasters, and to support the efficient provision of assistive care.

Towards systematic engineering of collaborative heterogeneous robotic systems

Gerasimou, S., Matragkas, N. & Calinescu, R., 27 May 2019, 2019 IEEE/ACM 2nd International Workshop on Robotics Software Engineering (RoSE). IEEE, p. 25-28 4 p.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 2019 IEEE/ACM 2nd International Workshop on Robotics Software Engineering (RoSE)
Date: Published - 27 May 2019
Pages: 25-28
Number of pages: 4
Publisher: IEEE
Original language: English
ISBN (Electronic): 9781728122496

On Learning in Collective Self-adaptive Systems: State of Practice and a 3D Framework

Gerasimou, S., D’Angelo, M., Ghahremani, S., Grohmann, J., Nunes, I., Pournaras, E. & Tomforde, S., 22 Mar 2019, (Accepted/In press) 14th International Symposium on Software Engineering for Adaptive and Self-Managing Systems.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 14th International Symposium on Software Engineering for Adaptive and Self-Managing Systems
Date: Accepted/In press - 22 Mar 2019
Original language: English

Abstract

Collective self-adaptive systems (CSAS) are distributed and interconnected systems composed of multiple agents that can perform complex tasks such as environmental data collection, search and rescue operations, and discovery of natural resources. By providing individual agents with learning capabilities, CSAS can cope with challenges related to distributed sensing and decision-making and operate in uncertain environments. This unique characteristic of CSAS enables the collective to exhibit robust behaviour while achieving system-wide and agent-specific goals. Although learning has been explored in many CSAS applications, selecting suitable learning models and techniques remains a significant challenge that is heavily influenced by expert knowledge. We address this gap by performing a multifaceted analysis of existing CSAS with learning capabilities reported in the literature. Based on this analysis, we introduce a 3D framework that illustrates the learning aspects of CSAS considering the dimensions of autonomy, knowledge access, and behaviour, and facilitates the selection of learning techniques and models. Finally, using example applications from this analysis, we derive open challenges and highlight the need for research on collaborative, resilient and privacy-aware mechanisms for CSAS.

DeepFault: Fault Localization for Deep Neural Networks

Gerasimou, S., Eniser, H. F. & Sen, A., 15 Feb 2019, 22nd International Conference on Fundamental Approaches to Software Engineering. Springer-Verlag

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Publication details

Title of host publication: 22nd International Conference on Fundamental Approaches to Software Engineering
Date: E-pub ahead of print - 15 Feb 2019
Publisher: Springer-Verlag
Original language: English

Abstract

Deep Neural Networks (DNNs) are increasingly deployed in safety-critical applications including autonomous vehicles and medical diagnostics. To reduce the residual risk for unexpected DNN behaviour and provide evidence for their trustworthy operation, DNNs should be thoroughly tested. The DeepFault white box DNN testing approach presented in our paper addresses this challenge by employing suspiciousness measures inspired by fault localization to establish the hit spectrum of neurons and identify suspicious neurons whose weights have not been calibrated correctly and thus are considered responsible for inadequate DNN performance. DeepFault also uses a suspiciousness-guided algorithm to synthesize new inputs, from correctly classified inputs, that increase the activation values of suspicious neurons. Our empirical evaluation on several DNN instances trained on MNIST and CIFAR-10 datasets shows that DeepFault is effective in identifying suspicious neurons. Also, the inputs synthesized by DeepFault closely resemble the original inputs, exercise the identified suspicious neurons and are highly adversarial.
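
The spectrum-based idea described in the abstract above can be illustrated with a minimal sketch. It is not the DeepFault implementation: the function names, the activation threshold of 0.0 and the choice of the Tarantula measure (one well-known suspiciousness measure from fault localization) are assumptions made for illustration only. For each neuron, the sketch counts how often it was active or inactive on correctly and incorrectly classified inputs (its hit spectrum) and ranks neurons by the resulting score.

```python
from typing import Dict, List, Sequence, Tuple


def hit_spectrum(
    activations: Sequence[Sequence[float]],
    correct: Sequence[bool],
    threshold: float = 0.0,  # assumed activation cut-off, not taken from the paper
) -> List[Dict[str, int]]:
    """Count, per neuron, active/inactive occurrences on passed/failed inputs.

    Keys: "as" = active and input classified correctly, "af" = active and
    misclassified, "ns" = not active and correct, "nf" = not active and
    misclassified.
    """
    n_neurons = len(activations[0])
    spectrum = [{"as": 0, "af": 0, "ns": 0, "nf": 0} for _ in range(n_neurons)]
    for row, ok in zip(activations, correct):
        for j, value in enumerate(row):
            key = ("a" if value > threshold else "n") + ("s" if ok else "f")
            spectrum[j][key] += 1
    return spectrum


def tarantula(counts: Dict[str, int]) -> float:
    """Tarantula score: share of failing inputs that activate the neuron,
    normalised by the sum of the failing and passing shares."""
    failed = counts["af"] + counts["nf"]
    passed = counts["as"] + counts["ns"]
    fail_ratio = counts["af"] / failed if failed else 0.0
    pass_ratio = counts["as"] / passed if passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def rank_suspicious(
    activations: Sequence[Sequence[float]],
    correct: Sequence[bool],
    k: int = 1,
) -> Tuple[List[int], List[float]]:
    """Return the indices of the k most suspicious neurons and all scores."""
    scores = [tarantula(c) for c in hit_spectrum(activations, correct)]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k], scores


if __name__ == "__main__":
    # Toy data: neuron 0 fires only on misclassified inputs, neuron 1 only on
    # correctly classified ones.
    acts = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    labels_ok = [False, False, True, True]
    print(rank_suspicious(acts, labels_ok, k=1))
```

In this toy setting, a neuron that is active only on misclassified inputs receives the highest score. Synthesizing new inputs that raise the activations of such neurons, as the abstract describes, is a separate step that the sketch does not cover.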

Funded Projects

Responsible Data Science by Design, EUR 956,754.00

Kahrobaei, D., Kolovos, D. & Matragkas, N.

1/01/20 - 31/12/22

Project: Research project (funded) > Research

Description

York Maastricht Partnership Investment Fund
Status: Active
Effective start/end date: 1/01/20 - 31/12/22

Description

Marie Skłodowska-Curie training network of 15 Early Stage Researchers across Europe investigating aspects of scalability in low-code software engineering platforms. Network members include British Telecom, Intecs, B2T Concept, CLMS, IncQuery Labs and the Universities of Nantes (IMT), Madrid (Autonoma), L'Aquila and (TU) Wien.
Status: Active
Effective start/end date: 1/01/19 - 31/12/22

KTP with Rolls Royce 2 - Industry Funding

Kolovos, D.

1/10/18 - 30/09/21

Project: Research project (funded) > Research

Description

Knowledge Transfer Partnership with Rolls-Royce on Model-Based Development of Aerospace Systems, co-funded by InnovateUK
Status: Active
Effective start/end date: 1/10/18 - 30/09/21

Status: Active
Effective start/end date: 1/11/20 - 30/04/24

Engineering Assured Autonomous Systems

Calinescu, R. & Gerasimou, S.

EPSRC

19/11/19 - 30/04/21

Project: Research project (funded) > Research

Status: Active
Effective start/end date: 19/11/19 - 30/04/21

KTP With IBM (Innovate)

Kolovos, D., Manandhar, S. & Paige, R. F.

1/04/18 - 31/03/21

Project: Research project (funded) > Research

Description

Knowledge Transfer Partnership with IBM UK on automated knowledge extraction and re-engineering of financial planning spreadsheets, co-funded by InnovateUK
Status: Finished
Effective start/end date: 1/04/18 - 31/03/21

TYPHON - Polyglot Persistence and Processing of Big Data

Kolovos, D.

EUROPEAN COMMISSION

1/01/18 - 31/12/20

Project: Research project (funded) > Research

Description

Horizon 2020 project on polyglot (relational/document/graph) data persistence and processing architectures with Volkswagen, GMV, Alpha Bank, OTE, the Open Group, and the Universities of L'Aquila, Edge Hill, Namur and Amsterdam (CWI)
Status: Finished
Effective start/end date: 1/01/18 - 31/12/20

Description

Horizon 2020 project on knowledge mining from open-source software repositories with the Eclipse Foundation, the Open Group, OW2, Bitergia, FrontEndArt, Softeam, Unparallel Innovation, Castalia and the Universities of L'Aquila, Athens (AUEB), Amsterdam (CWI), and Edge Hill
Status: Finished
Effective start/end date: 1/01/17 - 31/12/19

Acronym: Scalable Modelling and Model Management on the Cloud
Status: Finished
Effective start/end date: 1/11/13 - 30/04/16

OSSMETER (EU ICT Bid)

Paige, R. F. & Kolovos, D.

EUROPEAN COMMISSION

1/10/12 - 30/03/15

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/10/12 - 30/03/15

Bridging the Gap Between Programming and Modelling

Paige, R. F.

THE ROYAL SOCIETY

1/03/18 - 29/02/20

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/03/18 - 29/02/20

CyPhERS

McDermid, J. A. & Paige, R. F.

EUROPEAN COMMISSION

1/07/13 - 28/02/15

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/07/13 - 28/02/15

DSTL PhD Studentship - Radu Calinescu

Calinescu, R. & Paige, R. F.

1/10/12 - 30/09/16

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/10/12 - 30/09/16

COMPASS: Automated Safety Warnings (SESAR)

Paige, R. F.

SESAR JOINT UNDERTAKING

1/04/11 - 30/11/13

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/04/11 - 30/11/13

Status: Finished
Effective start/end date: 1/02/10 - 31/07/12

Development of Collaborations with the Weizmann Institute of Science and IBM Haifa

Paige, R. F.

EPSRC

1/11/07 - 31/10/08

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/11/07 - 31/10/08

DSTL TDS Studentship: Assured Reinforcement Learning

Calinescu, R. & Kudenko, D.

1/10/13 - 30/09/17

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/10/13 - 30/09/17

Cloud Computing for LSCITS

Calinescu, R.

EPSRC

1/05/12 - 31/03/14

Project: Research project (funded) > Research

Status: Finished
Effective start/end date: 1/05/12 - 31/03/14

Automatic Repair Of Natural Source Code (MANATEE)

Matragkas, N.

Project: Research project (funded) > Research

Status: Not started

Secure and Safe Multi-Robot Systems

Matragkas, N. & Gerasimou, S.

Project: Research project (funded) > Research

Short title: SESAME
Status: Not started