Type Inference in Flexible Model-Driven Engineering

About

This website provides all the information needed to reproduce the experiments presented in the paper "Type Inference in Flexible Model-Driven Engineering". Step-by-step instructions for running the experiments are in the Instructions section. All the source code can be downloaded from the Downloads section. The Data section offers all the raw data for download, and the Charts section includes all the charts generated from these data.

Abstract: In the emerging community of flexible Model-Driven Engineering (MDE), language engineers compose example models of the envisioned Domain-Specific Language (DSL). This can be an error-prone process, as there is no metamodel that can guarantee that the example models share common syntax and semantics. Nodes that should represent the same concept could instantiate different types due to user error, while other nodes can be left totally untyped, either intentionally or unintentionally. In this paper, we propose an approach that feeds characteristics of each element, like the number of assigned attributes, references and containments, to Classification and Regression Trees (CART) in order to predict the types of nodes that are left untyped, facilitating the work of language engineers. The approach is tested on a number of randomly generated models. Results suggest that on average 80% of types were correctly identified in all the models that were tested. In addition, results reveal a strong dependency between the success score and the ratio of known to unknown types in a model.
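To make the idea of per-element characteristics more concrete, the following is a minimal sketch of how a feature signature could be computed for a node. It is not the code used in the paper: the dictionary layout of a node, the function name feature_signature and the example Zoo-like nodes are all assumptions made for illustration, and the actual feature signatures in the downloadable .txt files may contain different or additional features.

```python
def feature_signature(node):
    """Return (number of attributes, number of references, number of containments).

    `node` is a hypothetical dictionary representation of a model element;
    the real muddles are stored as .graphml files and are not parsed here.
    """
    return (
        len(node.get("attributes", {})),
        len(node.get("references", [])),
        len(node.get("containments", [])),
    )


# Hypothetical example nodes (illustration only, not data from the experiment).
# The third node has no type assigned, so it is a candidate for type inference.
nodes = [
    {"type": "Animal", "attributes": {"name": "Leo", "age": 4},
     "references": ["cage1"], "containments": []},
    {"type": "Cage", "attributes": {"id": "cage1"},
     "references": [], "containments": ["Leo"]},
    {"type": None, "attributes": {"name": "Mia", "age": 2},
     "references": ["cage1"], "containments": []},
]

signatures = [feature_signature(n) for n in nodes]
for node, signature in zip(nodes, signatures):
    print(node["type"], signature)
```

In this sketch the untyped node ends up with the same signature as the node typed Animal, which is the kind of regularity a decision-tree classifier can exploit. The signatures of the typed nodes could then serve as training rows for any CART implementation, with the signatures of the untyped nodes as the rows to predict.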

Instructions

The following image presents an overview of the experimentation approach discussed in the paper. Detailed instructions are provided for each step of the process. Readers can start from step 1 to generate their own models, muddles, feature signature lists and results, or start from any later step by downloading our files for all the preceding steps, which contain the artefacts generated as part of the experiment presented in the paper.

Fig. 1: An overview of the experimentation approach.

Data & Results

The metamodels (.ecore & .emf), the randomly generated models (.model), the corresponding muddles (.graphml), the feature signatures (.txt) and all the raw results (.xlsx) can be downloaded from this section.

Charts

Boxplots based on the raw results are presented here for each of the seven sampling rates used in the experiment. Scatter charts of the CART performance for each of the ten metamodels follow.
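As a rough illustration of how a score at a given sampling rate could be computed, the sketch below splits the nodes of a model into a known and an unknown set and measures the fraction of correctly predicted types. It assumes that the sampling rate is the percentage of nodes whose type is kept as known, which is our reading of the setup rather than a definition taken from the paper. The function names split_by_sampling_rate and prediction_score are hypothetical.

```python
import random

# The seven sampling rates covered by the boxplots, in percent.
SAMPLING_RATES = [30, 40, 50, 60, 70, 80, 90]


def split_by_sampling_rate(nodes, rate_percent, seed=0):
    """Shuffle the nodes and keep `rate_percent` percent of them as known.

    Returns (known, unknown). Assumption: the sampling rate is the share of
    nodes whose type stays visible to the classifier.
    """
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    n_known = len(shuffled) * rate_percent // 100
    return shuffled[:n_known], shuffled[n_known:]


def prediction_score(true_types, predicted_types):
    """Fraction of nodes whose predicted type equals the true type."""
    if len(true_types) != len(predicted_types):
        raise ValueError("both lists must have the same length")
    if not true_types:
        raise ValueError("at least one prediction is required")
    correct = sum(t == p for t, p in zip(true_types, predicted_types))
    return correct / len(true_types)


# Usage with placeholder node identifiers (not data from the experiment).
known, unknown = split_by_sampling_rate(list(range(10)), 30)
print(len(known), len(unknown))
```

Averaging such scores over the ten models of a metamodel at one sampling rate would give one data point of the kind summarised in the charts.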

Boxplots

Boxplot for average prediction score for all 10 models of each metamodel at 30% sampling rate.
Boxplot for average prediction score for all 10 models of each metamodel at 40% sampling rate.
Boxplot for average prediction score for all 10 models of each metamodel at 50% sampling rate.
Boxplot for average prediction score for all 10 models of each metamodel at 60% sampling rate.
Boxplot for average prediction score for all 10 models of each metamodel at 70% sampling rate.
Boxplot for average prediction score for all 10 models of each metamodel at 80% sampling rate.
Boxplot for average prediction score for all 10 models of each metamodel at 90% sampling rate.

Scatter Charts

Scatter chart for the average prediction score of the Ant metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the BibTeX metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Bugzilla metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Chess metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Cobol metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Conference metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Professor metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Use Case metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Wordpress metamodel for all the 7 sampling rates.
Scatter chart for the average prediction score of the Zoo metamodel for all the 7 sampling rates.