
cleverMachine documentation

Dataset submission
Output interpretation

Dataset submission

Using the tools is simple. To start a new submission, fill in the following form (hover over or click an element of the form for more information):

[Figure: the submission form, with labelled areas: form information, input data, submission status, sample data, run mode]

The form has the following components:

Output interpretation


Individual scales

The first chart focuses on the strengths of individual properties. On the left-hand side is a plot of the items sorted by Z-score; on the right-hand side are the corresponding scale titles. The plot is interactive: clicking a bar highlights the matching row in the table, which makes the data easy to explore.
The plot shows the contrast between the two datasets. If a bar points to the left (towards the positive set), the property is stronger in the positive set, and vice versa. For example, if a hydrophobicity scale points to the right, the negative set is more hydrophobic or, equivalently, the positive set is less hydrophobic (more polar).
The individual properties are also colour-coded, and each colour corresponds to a group in the second section (Grouped properties).

Each row of the table shows a scale's description, including its source, together with its coverage and associated Z-score. The table can be sorted by clicking on the column headers. The sorting is not reflected in the plot, but you can still click a bar to find the item in the table.

In most cases, users see a large number of properties with their corresponding strengths. However, if the datasets do not contain enough signal, only a small number of scales may be shown. Lastly, if no property is strong enough to pass the threshold, the analysis stops and the tool shows an appropriate message.
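The filtering behaviour described above can be pictured with a minimal sketch. It assumes that each scale carries a signed Z-score, that a positive value means "stronger in the positive set", and that the cutoff is a hypothetical threshold of 2.0. None of these specifics come from the tool itself, and the scale names are made up for illustration.

```python
def select_scales(z_scores, threshold=2.0):
    """Return (name, z) pairs whose |z| passes the threshold.

    z_scores maps a scale name to a signed Z-score. The sign convention
    and the threshold value are assumptions for illustration only.
    """
    passing = [(name, z) for name, z in z_scores.items() if abs(z) >= threshold]
    # Order the surviving scales by signed Z-score, highest first.
    return sorted(passing, key=lambda item: item[1], reverse=True)


def report(z_scores, threshold=2.0):
    """Return formatted lines, or a message when nothing passes."""
    selected = select_scales(z_scores, threshold)
    if not selected:
        return "No property passed the threshold; analysis stopped."
    return [f"{name}: {z:+.1f}" for name, z in selected]


# Hypothetical example data.
example = {"hydrophobicity": -3.1, "helix propensity": 2.4, "disorder": 0.7}
print(report(example))
```

With this input, the weak "disorder" scale is dropped and the two remaining scales are listed; passing a dictionary in which no scale reaches the threshold returns the message instead.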

Important concepts

Grouped properties

The next section uses information about the higher-level grouping of the scales, painting a more complete picture of the dataset.

You are presented with an interactive view of the scale groupings. Each scale group is represented by a single plot. The meaning of the individual bars is similar to the previous plot: they show coverages and strengths between the datasets. In this case, if a bar points up, the property is stronger in the positive dataset, and vice versa.

Individual scale titles are not shown for all of the items, but the plots are interactive: clicking on any bar shows additional information on the right-hand side of the screen.
Lastly, the Z-score strength of the properties is also reflected in the plots: weak properties are coloured white, and the other properties are shown in the colour corresponding to the property.
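As a rough illustration of how a single bar in a group plot could be read, the sketch below maps a signed Z-score to a direction and a colour. The sign convention (positive means stronger in the positive dataset) and the cutoff weak_below are assumptions made for the example, not values taken from the tool.

```python
def grouped_bar(z, group_colour, weak_below=2.0):
    """Describe how one bar in a group plot would be drawn.

    Assumed convention: z > 0 means the property is stronger in the
    positive dataset (bar points up); otherwise the bar points down.
    Bars with |z| below the hypothetical cutoff are drawn white.
    """
    direction = "up" if z > 0 else "down"
    colour = "white" if abs(z) < weak_below else group_colour
    return direction, colour


print(grouped_bar(3.2, "blue"))   # strong, positive set
print(grouped_bar(-0.5, "blue"))  # weak, negative set
```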

Scale combinations

The next view shows the combined coverage of multiple scales. It differs from the previous groupings in how the coverage is calculated: all of the scales shown are merged together to form a "super" scale.
The coverage bars here, however, show slightly different information than the coverages shown previously. We now see how the combination scores on each of the datasets individually. There is no longer a single property and direction, because multiple properties with possibly differing directions are combined.

So, what is the actual interpretation of the plot? Consider an example: a plot with 5 combined properties that shows two bars, one green and one blue. The blue bar points up and the green bar points down. For combinations of 1 to 5 scales, the blue bar measures the fraction of the positive dataset that can be discriminated from the negative dataset, while the green bar measures the fraction of the negative dataset that can be discriminated from the positive dataset. The two values can differ, and it can be useful to see which dataset is better covered.
Please note that the algorithm searches the space of all scale combinations to identify those with the highest coverage (see combination statistics in the tutorial).
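To make the idea of a combined coverage concrete, here is a small sketch of an exhaustive search. It assumes, purely for illustration, that each scale can be reduced to the set of sequences it discriminates in each dataset, that a combination covers the union of those sets, and that combinations are ranked by the sum of the two coverages. The tool's actual coverage calculation is not described here and may differ; all names and data are hypothetical.

```python
from itertools import combinations


def coverage(covered_ids, dataset_size):
    """Fraction of a dataset that is covered."""
    return len(covered_ids) / dataset_size


def best_combination(scales, size, n_pos, n_neg):
    """Try every combination of `size` scales and keep the best one.

    scales maps a scale name to a pair (pos_ids, neg_ids): the sets of
    positive and negative sequences that the scale discriminates
    (an assumed model). Ties keep the first combination found.
    """
    best = None
    for combo in combinations(sorted(scales), size):
        pos = set().union(*(scales[name][0] for name in combo))
        neg = set().union(*(scales[name][1] for name in combo))
        score = (coverage(pos, n_pos), coverage(neg, n_neg))
        if best is None or sum(score) > sum(best[1]):
            best = (combo, score)
    return best


# Hypothetical scales and sequence identifiers.
scales = {
    "A": ({1, 2}, {1}),
    "B": ({2, 3}, {2, 3}),
    "C": ({4}, {1}),
}
combo, (pos_cov, neg_cov) = best_combination(scales, size=2, n_pos=4, n_neg=4)
print(combo, pos_cov, neg_cov)
```

The two returned fractions play the role of the two bars: one for the positive dataset and one for the negative dataset. An exhaustive search like this grows quickly with the number of scales, so it is only meant to show the principle.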

Combination statistics

To see the individual property strengths, expand the information contained in any of the bars. This can be done either by clicking on a bar or by clicking on its description in the table below. Once expanded, the view shows the scales selected for the combination as well as a wide range of group and individual statistics.

Statistics are shown both for the combination's total and for the individual scales. The individual abbreviations are reported in the table on the right.

An additional piece of information is shown directly in front of each individual property: a sign, either + or -. It is a shorthand for the direction of the property. A + means that the property is stronger in the positive dataset, and a - means the opposite. In other words, + has the same meaning as a bar pointing left for individual properties and a bar pointing up for groupings.
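The sign convention can be summarised as a small lookup table. The sketch below only restates the directions given in this document; the function and variable names are hypothetical.

```python
# Directions as described in this documentation.
DIRECTIONS = {
    "+": {"stronger_in": "positive", "individual_bar": "left", "group_bar": "up"},
    "-": {"stronger_in": "negative", "individual_bar": "right", "group_bar": "down"},
}


def describe(sign, scale):
    """Spell out what a + or - in front of a scale means."""
    d = DIRECTIONS[sign]
    return (
        f"{scale}: stronger in the {d['stronger_in']} dataset "
        f"(individual bar points {d['individual_bar']}, "
        f"group bar points {d['group_bar']})"
    )


print(describe("+", "hydrophobicity"))
```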

Classifier creation

Provided that there was enough signal and the analysis ran, the user can create a classifier using their datasets as a model. To do so, quote the submission reference when creating a new clever classifier.

That is all, folks! Thanks for using the tool.