Towards a UML Profile for the Simulation Domain

M. Mourad et al.


Introduction
Recent research has recommended the practice of Model Driven Engineering (MDE) in the simulation field [2,3,4]. The works in [5,6,7,11] adopted approaches based on three steps: a conceptual modelling step, where scientists or engineers build models, called Computation Independent Models (CIM), that capture the phenomena under study; a design step, where simulation engineers build models called Platform Independent Simulation Models (PISM); and an implementation step, where software engineers develop models called Platform Specific Simulation Models (PSSM). A series of model transformations derives PISM models from CIM models, and PSSM models from PISM models. Furthermore, the Model Driven Architecture (MDA), a variant of the MDE approach standardized by the Object Management Group (OMG) and targeting the software engineering community, also emphasizes another kind of model called the Platform Description Model (PDM) [1]. This kind of model describes the platforms that host the developed software applications.
The MDA approach has been adopted, for instance, in the field of real-time and embedded systems. UML profiles such as UML-MARTE [8] and Domain Specific Languages such as AADL [27] have been defined to support the MDA practice in this field. Both provide mechanisms to describe PDM models and seem to be good candidates for the modelling of (software) simulation platforms. In a previous work [22], we proposed a software engineering methodology, based on the UML-MARTE profile, for the development of multiscale modelling and simulation frameworks; in that work we attempted to model the multiscale platform MUSCLE using the elements of the SRM (Software Resource Model) sub-profile of UML-MARTE. The SRM sub-profile is dedicated to the modelling of real-time operating systems and middleware. Although it offers a wide range of (software) modelling elements and capabilities, most of them target the specific needs of real-time and embedded system platforms and, as our previous attempt [22] showed, do not meet specific simulation engineering needs. In our opinion, it is more natural and comfortable for the simulation engineering community to treat and manipulate its native entities and concepts as first-class modelling elements.
To the best of our knowledge, none of the current works on MDE practices in the simulation field addresses the issue of the Simulation Platform Description Model (SPDM), i.e., the description of simulation platforms that support the execution of simulation experiments. The sole work targeting the modelling of simulation platforms is reported in [9]. Its authors were mainly motivated by discovering the commonalities and variations among a sample of open source multi-physics simulation platforms. Although the work in [9] may serve as a reference architecture for simulation platform developers, it does not offer, in our opinion, explicit mechanisms to develop models describing simulation platforms in the spirit of the MDA approach.
The objective of this work is to define a UML profile for the simulation field intended to support the MDA practices in this field. The proposed profile particularly provides a set of appropriate modelling mechanisms for the description of simulation platforms.
The contribution of this work is threefold: first, we review and synthesize recent contributions in modelling and simulation approaches, practices and platforms; second, we adopt a resource-oriented approach for the modelling of simulation platform elements; third, we consider both component- and workflow-based simulation platforms. These contributions are illustrated by a set of UML stereotype classes capturing core simulation concepts and platform elements.
The rest of this paper is organized as follows. Section 2 is devoted to recent developments in the modelling and simulation field. Section 3 presents the simulation field from the workflow perspective. Related works are discussed in Section 4. Our contribution is detailed in Section 5. Section 6 outlines a simple example. Finally, conclusions and future works are given in Section 7.

Recent developments in simulation engineering
Simulation engineering, an emerging discipline that applies the principles of both simulation science and engineering, has been widely used to address various complex real-world problems. It mainly involves two complementary activities: 1) a modelling activity, where simulation models of physical phenomena or engineering artefacts are built, and 2) a simulation activity, where experiments are performed on these simulation models to achieve specific objectives such as understanding of phenomena, prediction, and performance study. The simulation engineering community has developed many specific software tools that allow not only building such models but also conducting experiments on them. The literature uses various terms for such tools, such as simulation frameworks or simulation platforms; simulation platform is the term used throughout this paper. A multitude of academic and commercial simulation platforms are available [10]: some of them are domain dependent while others are generic. The MUSCLE [6] and Mapper [12] simulation frameworks proposed generic simulation platforms. Simulation is widely used in numerous domains: physics, biology, medicine, and others. The Integrated Plasma Simulator (IPS) platform [13] and the Virtual Imaging Platform (VIP) [25] are simulation platforms dedicated to the plasma physics and medical imaging domains, respectively. Given the profusion of concepts, methods, frameworks and tools related to the modelling and simulation field, we present in the following a synthesis addressing advanced issues relevant to this field.

Modelling and simulation core concepts
A model is an abstract representation of reality. One of the practical uses of models is to generate the dynamics of the systems they represent. Simulation consists in advancing a model over time, given some inputs. Models can be either in a mathematical form, e.g., a system of equations, or in an algorithmic form. In the first case, the simulation takes the form of a piece of software, named a simulator, that implements a solver for this system of equations; here models, often specified in domain specific modelling languages, and simulators are separated. Solvers may be categorized according to different criteria such as their application domain and their solving methods. They may be either legacy codes or newly developed codes. In the second case, models are specified in terms of algorithmic components and are embedded in the simulation code. In our work we deal with both cases.
Simulation codes accept well-defined scripts as inputs. These scripts specify the set-up and the protocol of the targeted experiments. Simulation engines interpret the input scripts and run the simulation of individual models. Simulation scripts are usually written in scripting languages such as Python and Ruby, or in standardized data representation languages such as XML.
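The separation between a model in mathematical form, a simulator that solves it, and a script that drives the experiment can be illustrated with a minimal Python sketch. The decay model, the explicit Euler solver, the script layout and all names below are illustrative assumptions and are not taken from any of the cited platforms.

```python
from typing import Callable, Dict, List

def decay_model(k: float) -> Callable[[float, float], float]:
    """Hypothetical model in mathematical form: dx/dt = -k * x."""
    return lambda t, x: -k * x

def euler_simulator(rhs: Callable[[float, float], float],
                    x0: float, t_end: float, dt: float) -> List[float]:
    """A toy simulator: an explicit Euler solver that advances the model over time."""
    steps = round(t_end / dt)
    x, t = x0, 0.0
    trajectory = [x0]
    for _ in range(steps):
        x = x + dt * rhs(t, x)   # one solver step
        t += dt
        trajectory.append(x)
    return trajectory

# Hypothetical experiment script: set-up and protocol expressed as plain data.
script: Dict[str, Dict[str, float]] = {
    "model": {"k": 0.5},
    "setup": {"x0": 1.0},
    "protocol": {"t_end": 2.0, "dt": 0.5},
}

def run_experiment(script: Dict[str, Dict[str, float]]) -> List[float]:
    """A toy simulation engine: interprets the script and runs the simulator."""
    rhs = decay_model(**script["model"])
    return euler_simulator(rhs, script["setup"]["x0"], **script["protocol"])

trajectory = run_experiment(script)
```

In this sketch the model and the solver can be replaced independently, which corresponds to the first case described above; embedding the update rule directly in the loop would correspond to the second case.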

Modelling and simulation approaches
Modern modelling and simulation approaches distinguish between the monolithic approach and the partitioned one. In the first approach, a single large-scale model capturing the whole phenomenon under study is built and its associated simulation code is executed. In the second, a complex model is partitioned into a set of single models, and their associated individual simulation codes are coupled and executed together.

Partitioned methods
A categorization of partitioned methods is given in [26]:
(i) Multiphysics Partitioning
This method is used when the model of the phenomenon under study captures multiple physical processes, each belonging to a specific physics, such as temperature and viscosity. In this case the model is decomposed into a set of sub-models; each sub-model concerns a specific physical process, and all sub-models operate on the same time and space scales.
(ii) Multiscale Partitioning
This method is used when the model of the phenomenon under study captures only one physical process; because of its complexity, this model is decomposed into a set of sub-models that operate on different time and space scales.
(iii) Multiphysics Multiscale Partitioning
Here multiscale and multiphysics methods are both used. This method is used when the model of the phenomenon captures multiple physical processes that do not operate on the same scales.
Partitioned simulations encompass not only the execution of a set of single simulation experiments but also the interactions between these experiments. They presuppose the availability of specific mechanisms, called coupling mechanisms, whose role is to drive these interactions. Two issues need to be addressed when coupling single experiments: (i) the format of the data exchanged between coupled simulation experiments, and (ii) the interaction pattern governing the interaction between coupled simulation experiments.
The same approach, based on a common programming technique called wrapping, is used on almost all simulation platforms that deal with experiment coupling. Wrappers are pieces of code that encapsulate the simulation code of single experiments. For instance, the layered architecture of the Integrated Plasma Simulator platform described in [13] distinguishes between data wrappers and coupling wrappers. The data wrapper handles the conversion of data from the internal format used by a single experiment into a common exchange format. The European fusion research community suggested a generic data structure, named Consistent Physical Objects (CPO), as a common format for the data exchanged between single experiments. Data wrappers do not depend on the simulation platform.
The coupling wrapper handles the data motion as well as the pattern of interaction between coupled single experiments during their data exchanges. Coupling wrappers, unlike data wrappers, depend on the simulation platform.
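The division of labour between the two kinds of wrappers can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary used as common exchange format merely stands in for a CPO-like structure, and the class names, the temperature example and the unit conversion are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical common exchange format, standing in for a CPO-like structure.
CommonData = Dict[str, float]

@dataclass
class DataWrapper:
    """Platform-independent: converts internal data to and from the common format."""
    to_common: Callable[[Any], CommonData]
    from_common: Callable[[CommonData], Any]

class CouplingWrapper:
    """Platform-dependent: moves data from a source experiment to a target one."""

    def __init__(self, source: DataWrapper, target: DataWrapper) -> None:
        self.source = source
        self.target = target
        self.log: List[CommonData] = []  # every transfer, in the common format

    def transfer(self, internal_output: Any) -> Any:
        common = self.source.to_common(internal_output)
        self.log.append(common)
        return self.target.from_common(common)

# Assumed experiment A: stores a temperature in Celsius inside a tuple.
a_wrapper = DataWrapper(
    to_common=lambda out: {"temperature_K": out[0] + 273.15},
    from_common=lambda c: (c["temperature_K"] - 273.15,),
)
# Assumed experiment B: expects a temperature in kelvin inside a list.
b_wrapper = DataWrapper(
    to_common=lambda out: {"temperature_K": out[0]},
    from_common=lambda c: [c["temperature_K"]],
)

coupling = CouplingWrapper(a_wrapper, b_wrapper)
received = coupling.transfer((26.85,))  # roughly 300 K, in B's internal format
```

Only `CouplingWrapper` would need to change when moving to another platform; the two `DataWrapper` instances depend only on the experiments and on the common format.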

Coupling issue
In [14] the authors laid the foundations of multi-scale computing. Their formalization of multi-scale coupling reveals two complementary features of this concept:
(i) Coupling template: specifies the information flow that may occur between any pair of coupled (single) experiments. Unidirectional as well as bidirectional data flows are admitted.
(ii) Coupling topology: a graph representing the couplings (edges) between pairs of single sub-models (nodes) belonging to a partitioned model. The graph edges are labelled by coupling templates. Two kinds of coupling topology are identified:
a. Acyclic topology: characterized by the absence of cycles in the coupling topology. In this case the coupled simulation codes can be ordered and executed sequentially; this kind of coupling is also named loose coupling.
b. Cyclic topology: characterized by the presence of cycles in the coupling topology. In this case the order of execution of the individual simulation codes is not predefined; this kind of coupling is also called tight coupling.
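The distinction between acyclic and cyclic coupling topologies can be checked mechanically. The sketch below uses the standard `graphlib` module (Python 3.9 or later) and is not the formalism of [14]: it assumes that each coupling is given as a (source, target, template) triple, that the template is one of the hypothetical labels "unidirectional" or "bidirectional", and that a bidirectional template counts as an edge in both directions.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional, Set, Tuple

Coupling = Tuple[str, str, str]  # (source sub-model, target sub-model, template)

def execution_order(couplings: List[Coupling]) -> Optional[List[str]]:
    """Return a sequential execution order for a loose (acyclic) coupling,
    or None when the topology is cyclic (tight coupling)."""
    predecessors: Dict[str, Set[str]] = {}
    for source, target, template in couplings:
        predecessors.setdefault(source, set())
        predecessors.setdefault(target, set()).add(source)  # target waits for source
        if template == "bidirectional":
            predecessors[source].add(target)                # data also flows back
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None

loose = [("macro", "meso", "unidirectional"), ("meso", "micro", "unidirectional")]
tight = [("fluid", "structure", "bidirectional")]

loose_order = execution_order(loose)   # ['macro', 'meso', 'micro']
tight_order = execution_order(tight)   # None: no predefined order exists
```

For the tight case, a platform would need an iteration or synchronization scheme between the coupled codes instead of a fixed sequence; that scheme is outside the scope of this sketch.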

Orchestration of coupled simulations
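Three ways to coordinate and orchestrate a set of coupled single experiments are commonly used: the component-based approach, in which single experiments exchange data directly; the master/slave approach, in which a master experiment enacts slave experiments; and the centralized approach, in which a coordinator enacts the workflows of the single experiments.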

Simulation from the scientific workflow perspective
Workflow technology, mainly used by the business community, seems to be one of the promising approaches adopted by the scientific community; the concept of scientific workflow emerged as an alternative to the conventional concept of business workflow. There are similarities as well as differences between the two kinds of workflows. For example, business workflows are control-flow oriented, while scientific workflows are mainly data-flow oriented. Readers interested in more details may refer to [15].

Scientific workflows
A workflow is a pre-defined set of work steps with a partial order on these steps [17]. Work steps represent tasks that are carried out when they are enacted by workflow engines. Scientific Workflow Management Systems have been developed during the last two decades. They are intended to manage, enact and monitor scientific workflows, which are compositions of a series of computation and/or data manipulation steps [13]. Scientific workflows are enacted and orchestrated by specific engines, called workflow engines, which form the core components of Scientific Workflow Management Systems. Well-known examples of such systems are Taverna, Kepler, and VisTrails [16].
Generally, workflows describe control flows and/or data flows. Scientific workflows are usually classified into two categories: abstract and concrete workflows [19]. Quoting the authors of [18]: "An abstract scientific workflow is a definition of a scientific process with emphasis on the analytical operations or function to be performed rather than the mechanisms for performing these operations". In contrast, concrete scientific workflows bind the work steps to the resources that execute the corresponding tasks.
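The difference between abstract and concrete workflows can be made tangible with a small Python sketch. It is a simplified illustration, not the model of any cited workflow system: the abstract workflow is assumed to be a mapping from each work step to its predecessors, the binding and the three resource functions are hypothetical, and data is simply passed along a linear chain.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Set

# Abstract workflow: work steps and their partial order only (step -> predecessors).
abstract_workflow: Dict[str, Set[str]] = {
    "preprocess": set(),
    "solve": {"preprocess"},
    "visualize": {"solve"},
}

# Hypothetical resources able to execute the tasks.
def scaling_tool(data):
    return [2 * x for x in data]

def summing_solver(data):
    return sum(data)

def text_viewer(result):
    return f"result={result}"

# Concrete workflow: the same steps, now bound to the resources that execute them.
binding: Dict[str, Callable[[Any], Any]] = {
    "preprocess": scaling_tool,
    "solve": summing_solver,
    "visualize": text_viewer,
}

def enact(workflow: Dict[str, Set[str]],
          binding: Dict[str, Callable[[Any], Any]], data: Any) -> Any:
    """A toy workflow engine: runs the bound steps in an order that respects
    the partial order (this sketch assumes a linear chain of steps)."""
    for step in TopologicalSorter(workflow).static_order():
        data = binding[step](data)
    return data

report = enact(abstract_workflow, binding, [1, 2, 3])
```

The same `abstract_workflow` can be enacted with a different `binding`, which is the sense in which the abstract description is independent of the mechanisms that perform the operations.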

Simulation workflows
Simulations of scientific or engineering models are seen as kinds of scientific workflows and are often described as such. These workflows follow specific patterns/motifs and include various kinds of steps: data processing steps, solving/simulation steps, visualization steps, and data exchange steps. In [24] the authors elaborated catalogues of common motifs for both scientific workflows and the data operations that may be performed when conducting scientific experiments.
The iterative pattern is one of the most used control patterns to describe the workflow of individual experiments. For instance structured loops are a kind of iterative pattern.
In the case of a multi-experiment, the workflows of the participating individual experiments are coupled. Their coupling is performed through a set of data exchanges constrained by specific interaction patterns. The authors of [20] suggest the concept of "choreography", borrowed from the business management community, to couple the workflows of single experiments. Every single experiment is realized as an orchestration of scientific services, and the whole multi-experiment is described by choreographies without centralized control.

Related works
The literature reports two different directions regarding the development of simulation frameworks: (i) component-based approaches, inspired by software component-based design and programming methods, and (ii) workflow-based approaches, inspired by workflow-based business systems. Recent works in each of these two research directions emphasize MDA practices.
In [7] the authors proposed a simulation framework based on a hierarchical component-based approach. Their framework is supported by well-defined metamodels capturing Conceptual Simulation Models (CSM) as well as Platform Independent Simulation Models (PISM). However, they did not define metamodels that capture Platform Specific Simulation Models (PSSM); in fact these are considered as implementations of PISM models. The PISM and PSSM terminology used in the simulation field corresponds respectively to the PIM and PSM terminology used in the software engineering field. It is worth noting that the work in [7] does not consider simulation platform description models as primary models.
The authors in [21] adopted a workflow-based approach for the simulation framework they developed. Their approach, also based on MDA, relies on three distinct levels: a conceptual level, at which the modellers describe the models that capture the phenomena under study; an abstract level, at which PSSM models, independent from the computing infrastructures, are conceived; and a concrete level, at which models are strongly dependent on the computing infrastructure intended to host the simulation experiments. These last models, called Platform Description Models (PDM), refer to the hardware infrastructure rather than to the simulation workflow framework. Conceptual models are first transformed into specific intermediate representations, which are themselves converted to abstract workflows to be enacted by a targeted scientific workflow framework.
Neither of these works considers the modelling of simulation platforms. To the best of our knowledge, the sole research work that investigated the issue of simulation platform modelling is described in [9]. Its authors aimed at discovering commonalities and variations among a sample of open source multi-physics simulation platforms, and at proposing a feature model capturing the discovered commonalities and variations using the feature-oriented modelling approach. According to the authors, one of the possible uses of their feature model is to serve as a reference for simulation platform developers.
Our research work, unlike [9], targets the modelling of simulation platforms in the context of the MDA approach for the simulation domain, i.e., providing a UML profile intended to build Simulation Platform Description Models (SPDM) for simulation experiments; in contrast to [21], PDM models here refer to simulation platform models rather than to computing infrastructure models.
The present work considers scientific workflows for the description of scientific experiment behaviors, and relies on the concept of generic resources as defined in [8] to model elements of simulation platforms.

The proposed UML profile
In this section we develop our UML profile intended for the simulation field. We present a set of UML stereotypes intended to capture core concepts of the simulation domain.

Linking PISM and SPDM models
The proposed profile focuses on SPDM modelling. Figure 2 depicts the well-known relationship between the PISM, SPDM, and PSSM models. Elements of PISM models are mapped to elements of SPDM models, leading to PSSM models.

PISM model elements
In this section we identify and define a set of UML stereotypes that constitute the main PISM model elements of our profile.

Simulation stereotype
The simulation and experiment concepts, as defined above, are modelled as stereotypes. Both extend the UML BehavioredClassifier metaclass, which is a UML classifier that owns behaviors.
A. The class diagram depicted in Figure 3.a describes the Simulation stereotype and the hierarchy of its refined stereotypes covering various kinds of simulation approaches.
Comments:
(i) The Simulation stereotype includes at least two properties.
IdentifierElts reports a set of required elements that identify and characterize conducted simulations, such as their identification number, their date, the target domain, and the version number.
SimulParam reports parameters related to the simulation itself, for instance the duration of the simulation, the space dimension of the simulated model, and others.

The Map property specifies the mapping between workflow nodes and their corresponding workflow call actions.
The Mapping class is a datatype that records (workflow node, action to be called) pairs. The concept of UML call action is detailed in part B of Section 5.2.3.
(iii) The CentralizedMsc stereotype represents multiscale simulations designed according to the centralized version of the workflow-based approach. It refines the WrkFlowMsc stereotype. Its coord property (an instance of the Coordinator class) represents the central coordinator that orchestrates the whole simulation workflow. The Coordinator class is not detailed in this paper.
(iv) The MasterSlaveMsc stereotype represents multiscale simulations designed according to the master/slave version of the workflow-based approach. It refines the WrkFlowMsc stereotype. Its Master property records the single experiment that plays the role of master in the whole multiscale simulation.
+ CplIdElts: specifies information that identifies its instances.
+ CplIType: a set of typed elements that specify the kind of coupling.
+ SourceNode, TargetNode: these attributes play the role of UML association ends. They specify the model elements that are coupled.

Simulation behavior stereotype
Instances of both Experiment and Simulation stereotypes own their specific behaviours. The stereotype SimBehavior is intended to capture various simulation and experiment behaviors.
A. Figure 4.a shows a class diagram depicting the usual behaviors met in the simulation world. The SimBeh stereotype is intended to model the behavior of experiments and simulations. Two categories of behavior are identified: opaque ones, characterized by their unknown structure, and regular ones, characterized by well-defined and known structures. For instance, workflows and automata-like structures are kinds of regular behavior.

Comments:
(i) Opaque behaviors, as defined in the UML infrastructure, are usually characterized by their body (the body source plus the language used to express it). In the context of our work, the Opaque Experiment stereotype represents experiments driven by simulation engines. Here we adopt the UML Opaque Behavior metaclass as a base class.
(ii) Automata-based behaviors are explicitly described by automata-like formalisms such as Cellular Automata. Such behaviors may, for instance, characterize single experiments that participate in multiscale simulations. Here we adopt the UML State Machine metaclass as a base class.
(iii) Workflow-based behaviors are explicitly described by abstract workflows. Such behaviors may, for instance, characterize monolithic simulations as well as multiscale simulations. They are often expressed in terms of Petri nets or UML activity diagrams. The authors of [23] defined a profile for scientific workflows by proposing a refinement of the UML Activity metaclass tailored to their own abstract workflow language. In our work we define the WorkFlowBeh stereotype to represent abstract simulation workflows by extending the UML Activity metaclass.
SimMotif is one of the properties associated with the WorkFlowBeh stereotype. It specifies the abstract motif/pattern of simulation workflows. Abstract simulation workflows are composed of sets of workflow nodes assembled according to a particular structure. We assume the availability of a library of UML model elements providing a catalogue of usual simulation workflow motifs.
B. More on Workflow-based Experiments
Workflow-based experiments are usually composed of work steps structured and organized according to a specific workflow motif/pattern. To remain independent of any specific abstract workflow language, we adopt a solution, used by some workflow engines, that decouples the workflow motif nodes from the tasks to be performed at the node level. To achieve this, we rely on the UML Behavior metaclass infrastructure to define the SimulationWorkflowStep stereotype. This stereotype extends the UML Call Operation and Call Behavior metaclasses, which are themselves two refinements of the UML Call Action metaclass:

Figure 4.a Typology of Simulation Behaviors
- Call Operation is used to trigger atomic operations that correspond to simulation processing steps, such as solving, data processing or data interaction steps.
- Call Behavior is used to trigger behaviors that correspond to potential sub-workflows contained in simulation workflows (hierarchical workflow motifs). It is useful for handling the master/slave approach (a master experiment enacting a slave experiment) and the centralized approach (a coordinator enacting the workflows of single experiments).
Figure 4.b shows two refinements of the SimulationWorkflowStep stereotype:
- The SimAction stereotype represents the various kinds of atomic simulation action calls (solving, data processing, data interaction operations) that may be associated with nodes of abstract workflow motifs. It extends the UML Call Operation metaclass.
- The WrkFAction stereotype represents sub-workflow call actions that may be associated with nodes of workflow motifs. It extends the UML Call Behavior metaclass.
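The two kinds of workflow steps can be mimicked in plain Python to show how hierarchical workflow motifs arise. The sketch is an analogy rather than a UML implementation: the class names reuse the stereotype names for readability, while the `run` method, the `trace` list and the master/slave example are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union

@dataclass
class SimAction:
    """Atomic step: calls one operation (analogue of a call-operation action)."""
    name: str
    operation: Callable[[float], float]

    def run(self, value: float, trace: List[str]) -> float:
        trace.append(self.name)
        return self.operation(value)

@dataclass
class WrkFAction:
    """Composite step: calls a sub-workflow (analogue of a call-behavior action)."""
    name: str
    steps: List[Union["SimAction", "WrkFAction"]] = field(default_factory=list)

    def run(self, value: float, trace: List[str]) -> float:
        trace.append(self.name)
        for step in self.steps:      # a plain sequence motif
            value = step.run(value, trace)
        return value

# Hypothetical master/slave arrangement: the master workflow enacts a slave one.
slave = WrkFAction("slave_experiment", [SimAction("solve_micro", lambda v: v * 2)])
master = WrkFAction("master_experiment", [
    SimAction("initialise", lambda v: v + 1),
    slave,
    SimAction("post_process", lambda v: v - 3),
])

trace: List[str] = []
result = master.run(1.0, trace)
```

Because a `WrkFAction` may contain other `WrkFAction` instances, the node structure of the motif stays separate from the tasks attached to each node, which is the decoupling described above.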

Experiment and simulation model stereotypes
The Experiment stereotype represents PISM elements. Figure 5 shows the features of this stereotype.
(i) The IdentifierElts property records any useful information that identifies the experiment (identifier number, experiment date, version, and possibly others).
(ii) The ExpParam property records experiment parameters (experiment duration, and possibly other parameters).
(iii) The Ph property specifies the domain-specific phenomena targeted by the simulation.
(iv) The SlvMth property specifies the set of mathematical methods that may be used to solve the simulation model. We define the SolvingMethod stereotype as an extension of the UML OpaqueExpression metaclass.

SPDM model elements
Simulations and experiments, as previously mentioned, are hosted and executed by simulation platforms. The UML-MARTE profile provides the concept of Resource to model hardware as well as software elements in a uniform way. Resources are abstract entities that provide services and may themselves be composed of other resources. We refine the concept of abstract resource into concrete (software) elements of simulation platforms.
In the present work we focus on only two core stereotypes that may be used to build SPDMs: Engine and Data Processor resources.

Engine resources
The concept of "engine" is often used in the simulation field as well as in workflow technology. Here engines represent virtual computing resources that interpret and run scripts or workflows written in specific formalisms. Engine refines the abstract Resource stereotype class defined in the UML-MARTE profile.
This abstract resource provides a set of services common to all kinds of resources. Figure 7 shows two kinds of engines: simulation and workflow engines.
A. SimulationEngine: an engine that interprets opaque simulation code written in a specific formalism/language. It may also be a simulation tool, called a simulator, that performs solving methods; simulators accept models and simulation scripts as inputs.
B. WorkflowEngine: an engine that is responsible for the interpretation of executable workflows and their orchestration. It is a kind of scheduling resource. Workflow steps may be either basic/atomic tasks or sub-workflows. Modellers specify their workflows using either a human-readable textual script or a diagram-based workflow language (front-end workflow language), while workflow engines interpret platform-readable and executable workflow languages (back-end language). Figure 9 depicts the main features of the WorkflowEngine stereotype.
(i) WorkFlowPattern is a sub-class of the Control Node metaclass. It includes the usual set of control nodes found in simulation workflows, such as sequence, loop, and parallel.
(ii) ExternResourceWrapper and EngineWrapper are derived from the UML Adapter pattern. ExternResourceWrapper refers to wrappers that encapsulate data processing operators, and EngineWrapper refers to wrappers that encapsulate simulation engines in case of cooperation between workflow engines.
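The relationship between resources, engines and engine wrappers can be sketched as a small Python class hierarchy. This is a loose analogy to the stereotypes, not a rendering of the UML-MARTE Resource model: the `services` and `run` methods, the toy script language with `set`, `add` and `scale` instructions, and the sequence-only workflow engine are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List, Tuple

class Resource(ABC):
    """Stand-in for an abstract resource: a named entity that provides services."""
    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def services(self) -> List[str]: ...

class Engine(Resource):
    """A resource that interprets and runs a program (script or workflow)."""
    def services(self) -> List[str]:
        return ["run"]

    @abstractmethod
    def run(self, program: Any) -> Any: ...

class SimulationEngine(Engine):
    """Interprets an opaque script; here a list of (instruction, value) pairs."""
    def run(self, program: List[Tuple[str, float]]) -> float:
        state = 0.0
        for instruction, value in program:
            if instruction == "set":
                state = value
            elif instruction == "add":
                state += value
            elif instruction == "scale":
                state *= value
            else:
                raise ValueError(f"unknown instruction: {instruction}")
        return state

class EngineWrapper:
    """Adapter exposing a simulation engine as a callable workflow step."""
    def __init__(self, engine: SimulationEngine, script: List[Tuple[str, float]]) -> None:
        self.engine = engine
        self.script = script

    def __call__(self) -> float:
        return self.engine.run(self.script)

class WorkflowEngine(Engine):
    """Orchestrates workflow steps; only the sequence pattern is sketched."""
    def run(self, program: List[Callable[[], Any]]) -> List[Any]:
        return [step() for step in program]

sim = SimulationEngine("toy-simulator")
script = [("set", 2.0), ("add", 1.0), ("scale", 4.0)]
wf = WorkflowEngine("toy-workflow-engine")
outputs = wf.run([EngineWrapper(sim, script), lambda: "plot"])
```

The workflow engine never sees the script language; it only calls steps, and the wrapper adapts the simulation engine to that calling convention.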

Data processor elements
In the following, we present a set of stereotypes that model specific computing resources able to support the execution of specific operations: data operations and data interactions. We model these resources as kinds of virtual processors. Our approach to categorizing data operations is slightly different from the one reported in [20]. We differentiate the data processing operations that operate inside individual experiments (the intra-experiment case) from the operations performed on data as they move from one single experiment to another (the inter-experiment case). A categorization of these data processors is shown in Figures 10a, 10b, and 10c. The following kinds of data processor are identified:

A. Inter-Experiment Data Processor
Data are potentially subject to manipulation during their motion between single experiments. Each kind of manipulation is described by a specific (mathematical) function or algorithm. Two kinds of manipulation are identified:
(i) Data transformation: filtering.
(ii) Data combination: usually carried out by operators called Mappers.
(a) Data aggregation: aggregating multiple data sources into one data source.
(b) Data disaggregation: separating one data source into multiple data sources.

B. Intra-Experiment Data Processor
Usually the input data need to be put into a specific format before being submitted to simulation engines. The output data (produced by simulation engines) also need to be put into specific formats before being visualized by the modellers. Commercial and academic libraries provide such data processors.
C. The Data Processor stereotype inherits from the Resource class. Its main features are:
(i) InputElts: specifies the number and types of inputs, which depend on the kind of data processor.
(ii) OutputElts: specifies the number and types of outputs, which depend on the kind of data processor.
(iii) ProcessingElts: specifies an algorithm (body) that implements the analytic (mathematical) operation to be performed, as well as a set of appropriate parameters qualifying its performance.
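The three features of the Data Processor stereotype can be illustrated with a minimal Python sketch. It is an analogy under stated assumptions: the snake-case field names mirror InputElts, OutputElts and ProcessingElts, the `parameters` field and the arity and type checks are additions made for illustration, and the filtering, aggregation and disaggregation bodies are hypothetical examples.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple

@dataclass
class DataProcessor:
    name: str
    input_elts: Tuple[type, ...]            # number and types of inputs
    output_elts: Tuple[type, ...]           # number and types of outputs
    processing_elts: Callable[..., tuple]   # body of the operation
    parameters: Dict[str, Any] = field(default_factory=dict)

    def process(self, *inputs: Any) -> tuple:
        if len(inputs) != len(self.input_elts):
            raise TypeError(f"{self.name}: expected {len(self.input_elts)} inputs")
        for value, expected in zip(inputs, self.input_elts):
            if not isinstance(value, expected):
                raise TypeError(f"{self.name}: expected {expected.__name__}")
        outputs = self.processing_elts(*inputs, **self.parameters)
        if len(outputs) != len(self.output_elts):
            raise TypeError(f"{self.name}: expected {len(self.output_elts)} outputs")
        return outputs

# Data transformation (filtering): one source in, one source out.
threshold_filter = DataProcessor(
    "filter", (list,), (list,),
    lambda data, threshold: ([x for x in data if x >= threshold],),
    {"threshold": 2},
)
# Data aggregation: multiple sources in, one source out.
aggregator = DataProcessor("aggregate", (list, list), (list,), lambda a, b: (a + b,))
# Data disaggregation: one source in, multiple sources out.
splitter = DataProcessor(
    "disaggregate", (list,), (list, list), lambda data: (data[::2], data[1::2])
)

filtered = threshold_filter.process([1, 5, 3])
merged = aggregator.process([1], [2, 3])
parts = splitter.process([1, 2, 3, 4])
```

The number of declared inputs and outputs is what distinguishes the three kinds of processor here, which matches the remark that InputElts and OutputElts depend on the kind of data processor.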

Data interaction operator
Single experiments participating in multiscale simulations are coupled according to specific coupling mechanisms. They exchange data either directly, in the case of a component-based multiscale simulation approach, or indirectly, in the case of master/slave and centralized multiscale simulation approaches.
Our profile provides a stereotype class named DataInteractionOperator intended to run various kinds of coupling (data motion according to specific templates). It represents an abstraction of the so-called coupling wrappers mentioned in Section 2.2.1. We adopt and refine the UML Adapter pattern to define this stereotype.

Example
In this section we introduce a simple example to illustrate the (partial) use of our proposed profile. The example exposes only the PISM model elements.
Realistic and complete case studies are currently under construction.

Conclusion and future works
In this work we present a synthesis of recent contributions in the modelling and simulation field, encompassing up-to-date simulation topics. Model driven approaches for the simulation field are discussed. Multiscale and multi-physics simulation methods and their related issues are outlined. Modern simulation platforms adopting component- as well as workflow-based approaches are presented. We also propose modelling mechanisms intended for the description of simulation platforms, thus making possible the development of a kind of MDA primary model called SPDM. For this purpose we define a UML profile including a set of UML stereotypes that capture core simulation concepts as well as core simulation platform elements such as simulation engines, workflow engines, and simulation data processors. A resource-based approach, similar to the one used for the UML-MARTE profile, is adopted for the modelling of simulation platform elements.
As future work, we also plan to develop UML metamodels for a set of widely used simulation model specification formalisms, thus enabling PISM-to-PISM transformations.