Meta-feature Description Table

The tables below show, for each meta-feature, its group, a brief description, and the paper reference(s). See sphx_glr_auto_examples for examples of how to compute the meta-features.
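
For instance, a minimal sketch of extracting a subset of these meta-features with pymfe's MFE class (the Iris dataset and the chosen groups are only illustrative):

```python
# Minimal sketch of meta-feature extraction with pymfe.
# Assumes pymfe and scikit-learn are installed; Iris is a
# stand-in for your own data.
from sklearn.datasets import load_iris
from pymfe.mfe import MFE

X, y = load_iris(return_X_y=True)

# Restrict extraction to a few of the groups listed below.
mfe = MFE(groups=["general", "statistical", "info-theory"])
mfe.fit(X, y)
names, values = mfe.extract()

for name, value in zip(names, values):
    print(f"{name}: {value}")
```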

Meta-feature descriptions are given below, one table per group.

Clustering

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| ch | Compute the Calinski and Harabasz index. | [1] T. Calinski, J. Harabasz. A dendrite method for cluster analysis. Commun. Stat. Theory Methods, 3(1):1–27, 1974. |
| int | Compute the INT index. | [1] Souza, Bruno Feres de. Meta-aprendizagem aplicada à classificação de dados de expressão gênica. Tese (Doutorado em Ciências de Computação e Matemática Computacional), Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, 2010. doi:10.11606/T.55.2010.tde-04012011-142551. [2] Bezdek, J. C., Pal, N. R. Some new indexes of cluster validity. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 28(3):301–315, 1998. |
| nre | Compute the normalized relative entropy. | [1] Bruno Almeida Pimentel, André C. P. L. F. de Carvalho. A new data characterization for selecting clustering algorithms using meta-learning. Information Sciences, 477:203–219, 2019. |
| pb | Compute the Pearson correlation between class matching and instance distances. | [1] J. Lev. The Point Biserial Coefficient of Correlation. Ann. Math. Statist., 20(1):125–126, 1949. |
| sc | Compute the number of clusters with size smaller than a given size. | [1] Bruno Almeida Pimentel, André C. P. L. F. de Carvalho. A new data characterization for selecting clustering algorithms using meta-learning. Information Sciences, 477:203–219, 2019. |
| sil | Compute the mean silhouette value. | [1] P. J. Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math., 20:53–65, 1987. |
| vdb | Compute the Davies and Bouldin index. | [1] D. L. Davies, D. W. Bouldin. A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell., 1(2):224–227, 1979. |
| vdu | Compute the Dunn index. | [1] J. C. Dunn. Well-separated clusters and optimal fuzzy partitions. J. Cybern., 4(1):95–104, 1974. |
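
Three of these indices (ch, sil, and vdb) have direct counterparts in scikit-learn, which makes for a quick illustration. A minimal sketch, using the Iris class labels as the cluster assignment for illustration:

```python
# Sketch: ch, sil and vdb via their scikit-learn counterparts,
# treating the class labels as the cluster assignment.
from sklearn.datasets import load_iris
from sklearn.metrics import (
    calinski_harabasz_score,  # ch
    davies_bouldin_score,     # vdb
    silhouette_score,         # sil (mean silhouette value)
)

X, y = load_iris(return_X_y=True)
print("ch :", calinski_harabasz_score(X, y))
print("sil:", silhouette_score(X, y))
print("vdb:", davies_bouldin_score(X, y))
```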

Complexity

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| c1 | Compute the entropy of class proportions. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
| c2 | Compute the imbalance ratio. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 16). |
| cls_coef | Clustering coefficient. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| density | Average density of the network. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| f1 | Maximum Fisher's discriminant ratio. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Ramón A. Mollineda, José S. Sánchez, and José M. Sotoca. Data characterization for effective prototype selection. In 2nd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), pages 27–34, 2005. |
| f1v | Directional-vector maximum Fisher's discriminant ratio. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Witold Malina. Two-parameter Fisher criterion. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 31(4):629–636, 2001. |
| f2 | Volume of the overlapping region. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Marcilio C. P. Souto, Ana C. Lorena, Newton Spolaôr, and Ivan G. Costa. Complexity measures of supervised classification tasks: a case study for cancer gene expression data. In International Joint Conference on Neural Networks (IJCNN), pages 1352–1358, 2010. [3] Lisa Cummins. Combining and Choosing Case Base Maintenance Algorithms. PhD thesis, National University of Ireland, Cork, 2013. |
| f3 | Compute the maximum individual feature efficiency. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 6). |
| f4 | Compute the collective feature efficiency. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 7). |
| hubs | Hub score. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| l1 | Sum of error distance by linear programming. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| l2 | Compute the OVO subsets error rate of a linear classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| l3 | Non-linearity of a linear classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| lsc | Local set average cardinality. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). [2] Enrique Leyva, Antonio González, and Raúl Pérez. A set of complexity measures designed for applying meta-learning to instance selection. IEEE Transactions on Knowledge and Data Engineering, 27(2):354–367, 2014. |
| n1 | Compute the fraction of borderline points. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on pages 9–10). |
| n2 | Ratio of intra- and extra-class nearest neighbor distance. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| n3 | Error rate of the nearest neighbor classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| n4 | Compute the non-linearity of the k-NN classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on pages 9–11). |
| t1 | Fraction of hyperspheres covering data. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Tin K. Ho and Mitra Basu. Complexity measures of supervised classification problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3):289–300, 2002. |
| t2 | Compute the average number of features per point. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
| t3 | Compute the average number of PCA dimensions per point. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
| t4 | Compute the ratio of the PCA dimension to the original dimension. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
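
As an illustration of how such measures are defined, here is a sketch of c1, the entropy of the class proportions normalized to [0, 1] following the survey's definition (1 indicates a perfectly balanced problem):

```python
# Sketch of the c1 measure: entropy of class proportions,
# normalized by log(n_classes). Assumes at least two classes.
import numpy as np

def c1(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)) / np.log(len(p)))

print(c1(np.array([0, 0, 0, 1, 1, 2])))  # < 1: imbalanced classes
```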

Concept

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| cohesiveness | Compute the improved version of the weighted distance, which captures how dense or sparse the example distribution is. | [1] Vilalta, R. and Drissi, Y. (2002). A Characterization of Difficult Problems in Classification. In Proceedings of the 2002 International Conference on Machine Learning and Applications (pp. 133–138). |
| conceptvar | Compute the concept variation, which estimates the variability of class labels among examples. | [1] Vilalta, R. (1999). Understanding accuracy performance through concept characterization and algorithm analysis. In Proceedings of the ICML-99 Workshop on Recent Advances in Meta-Learning and Future Work (pp. 3–9). |
| impconceptvar | Compute the improved concept variation, which estimates the variability of class labels among examples. | [1] Vilalta, R. and Drissi, Y. (2002). A Characterization of Difficult Problems in Classification. In Proceedings of the 2002 International Conference on Machine Learning and Applications (pp. 133–138). |
| wg_dist | Compute the weighted distance, which captures how dense or sparse the example distribution is. | [1] Vilalta, R. (1999). Understanding accuracy performance through concept characterization and algorithm analysis. In Proceedings of the ICML-99 Workshop on Recent Advances in Meta-Learning and Future Work (pp. 3–9). |
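
The idea behind these measures can be illustrated with a simplified, unweighted proxy for conceptvar: the mean label disagreement among each example's nearest neighbors (the published measure additionally weights neighbors by their distance):

```python
# Sketch: unweighted proxy for concept variation, i.e. the mean
# fraction of each example's k nearest neighbors that carry a
# different class label. The published measure weights neighbors
# by distance; this simplification is for illustration only.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NearestNeighbors

X, y = load_iris(return_X_y=True)
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
_, idx = nn.kneighbors(X)  # column 0 is the point itself
disagreement = (y[idx[:, 1:]] != y[:, None]).mean()
print("approximate concept variation:", disagreement)
```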

General

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| attr_to_inst | Compute the ratio between the number of attributes and the number of instances. | [1] Alexandros Kalousis and Theoharis Theoharis. NOEMON: Design, implementation and performance results of an intelligent assistant for classifier selection. Intelligent Data Analysis, 3(5):319–337, 1999. |
| cat_to_num | Compute the ratio between the number of categorical and numeric features. | [1] Matthias Feurer, Jost Tobias Springenberg, and Frank Hutter. Using meta-learning to initialize Bayesian optimization of hyperparameters. In International Conference on Meta-learning and Algorithm Selection (MLAS), pages 3–10, 2014. |
| freq_class | Compute the relative frequency of each distinct class. | [1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418–423, 1999. |
| inst_to_attr | Compute the ratio between the number of instances and attributes. | [1] Petr Kuba, Pavel Brazdil, Carlos Soares, and Adam Woznica. Exploiting sampling and meta-learning for parameter setting for support vector machines. In 8th IBERAMIA Workshop on Learning and Data Mining, pages 209–216, 2002. |
| nr_attr | Compute the total number of attributes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_bin | Compute the number of binary attributes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_cat | Compute the number of categorical attributes. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| nr_class | Compute the number of distinct classes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_inst | Compute the number of instances (rows) in the dataset. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_num | Compute the number of numeric features. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| num_to_cat | Compute the ratio between the number of numeric and categorical features. | [1] Matthias Feurer, Jost Tobias Springenberg, and Frank Hutter. Using meta-learning to initialize Bayesian optimization of hyperparameters. In International Conference on Meta-learning and Algorithm Selection (MLAS), pages 3–10, 2014. |
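
Most of these measures follow directly from the shape of the data matrix, as the sketch below shows:

```python
# Sketch: several general measures derived from the data shape.
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
nr_inst, nr_attr = X.shape

print("nr_inst     :", nr_inst)
print("nr_attr     :", nr_attr)
print("nr_class    :", np.unique(y).size)
print("attr_to_inst:", nr_attr / nr_inst)
print("inst_to_attr:", nr_inst / nr_attr)
```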

Info-theory

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| attr_conc | Compute the concentration coefficient of each pair of distinct attributes. | [1] Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001. |
| attr_ent | Compute the Shannon entropy of each predictive attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| class_conc | Compute the concentration coefficient between each attribute and the class. | [1] Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001. |
| class_ent | Compute the Shannon entropy of the target attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| eq_num_attr | Compute the equivalent number of attributes for a predictive task. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| joint_ent | Compute the joint entropy between each attribute and the class. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| mut_inf | Compute the mutual information between each attribute and the target. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| ns_ratio | Compute the noisiness of attributes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
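
A sketch of class_ent and mut_inf using scipy and scikit-learn building blocks (pymfe discretizes numeric attributes internally; a simple equal-width binning stands in for that step here):

```python
# Sketch: class entropy and per-attribute mutual information.
# Equal-width binning is an assumption standing in for pymfe's
# internal discretization.
import numpy as np
from scipy.stats import entropy
from sklearn.datasets import load_iris
from sklearn.metrics import mutual_info_score

X, y = load_iris(return_X_y=True)

_, counts = np.unique(y, return_counts=True)
print("class_ent:", entropy(counts, base=2))  # bits

for j in range(X.shape[1]):
    edges = np.histogram_bin_edges(X[:, j], bins=10)
    binned = np.digitize(X[:, j], edges[1:-1])
    # mutual_info_score uses natural logarithms (nats).
    print(f"mut_inf attr {j}:", mutual_info_score(binned, y))
```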

Itemset

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| one_itemset | Compute the one-itemset meta-feature. | [1] Song, Q., Wang, G., and Wang, C. Automatic recommendation of classification algorithms based on data set characteristics. Pattern Recognition, 45(7):2672–2689, 2012. |
| two_itemset | Compute the two-itemset meta-feature. | [1] Song, Q., Wang, G., and Wang, C. Automatic recommendation of classification algorithms based on data set characteristics. Pattern Recognition, 45(7):2672–2689, 2012. |
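
A sketch of the one-itemset idea: after discretizing each attribute into bins, report the relative frequency of each (attribute, bin) item. The exact discretization in Song et al. (2012) differs in detail, so treat this as an illustrative proxy:

```python
# Sketch of one_itemset: relative frequency of each binned
# attribute value. The two-bin equal-width split is an
# illustrative assumption.
import numpy as np
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
n_bins = 2
for j in range(X.shape[1]):
    edges = np.histogram_bin_edges(X[:, j], bins=n_bins)
    binned = np.digitize(X[:, j], edges[1:-1])
    freq = np.bincount(binned, minlength=n_bins) / X.shape[0]
    print(f"attr {j} item frequencies:", freq)
```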

Landmarking

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| best_node | Performance of the best single decision tree node. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| elite_nn | Performance of the Elite Nearest Neighbor. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. |
| linear_discr | Performance of the Linear Discriminant classifier. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| naive_bayes | Performance of the Naive Bayes classifier. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| one_nn | Performance of the 1-Nearest Neighbor classifier. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. |
| random_node | Performance of the single decision tree node model induced by a random attribute. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| worst_node | Performance of the single decision tree node model induced by the least informative attribute. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
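
Landmarkers are the scores of fast, simple learners evaluated on the dataset. A sketch in that spirit, using 10-fold cross-validated accuracy (the exact evaluation protocol in pymfe is configurable):

```python
# Sketch: landmarking via cross-validated accuracy of simple
# learners. best_node corresponds to a depth-1 decision tree
# (a "decision stump").
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
landmarkers = {
    "best_node": DecisionTreeClassifier(max_depth=1, random_state=0),
    "naive_bayes": GaussianNB(),
    "one_nn": KNeighborsClassifier(n_neighbors=1),
}
for name, model in landmarkers.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: {scores.mean():.3f}")
```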

Model-based

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| leaves | Compute the number of leaf nodes in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| leaves_branch | Compute the size of branches in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| leaves_corrob | Compute the leaves corroboration of the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| leaves_homo | Compute the homogeneity of every leaf node in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| leaves_per_class | Compute the proportion of leaves per class in the DT model. | [1] Andrey Filchenkov and Arseniy Pendryak. Datasets meta-feature description for recommending feature selection algorithm. In Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT), pages 11–18, 2015. |
| nodes | Compute the number of non-leaf nodes in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| nodes_per_attr | Compute the ratio of nodes per number of attributes in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| nodes_per_inst | Compute the ratio of non-leaf nodes per number of instances in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| nodes_per_level | Compute the ratio of nodes per tree level in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| nodes_repeated | Compute the number of repeated nodes in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| tree_depth | Compute the depth of every node in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| tree_imbalance | Compute the tree imbalance for each leaf node. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| tree_shape | Compute the tree shape for every leaf node. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| var_importance | Compute the feature importance of the DT model for each attribute. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
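
These measures are read off a decision tree induced on the dataset. A sketch of a few of them using a scikit-learn tree (pymfe builds a comparable tree internally):

```python
# Sketch: leaves, nodes, tree_depth and var_importance read
# from a fitted scikit-learn decision tree.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
dt = DecisionTreeClassifier(random_state=0).fit(X, y)

is_leaf = dt.tree_.children_left == -1  # -1 marks a leaf
print("leaves        :", int(is_leaf.sum()))
print("nodes         :", int((~is_leaf).sum()))  # non-leaf nodes
print("tree_depth    :", dt.get_depth())
print("var_importance:", dt.feature_importances_)
```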

Statistical

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| can_cor | Compute the canonical correlations of the data. | [1] Alexandros Kalousis. Algorithm Selection via Meta-Learning. PhD thesis, Faculty of Science of the University of Geneva, 2002. |
| cor | Compute the absolute value of the correlation of distinct dataset column pairs. | [1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. [2] Matthias Reif, Faisal Shafait, Markus Goldstein, Thomas Breuel, and Andreas Dengel. Automatic classifier selection for non-experts. Pattern Analysis and Applications, 17(1):83–96, 2014. [3] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| cov | Compute the absolute value of the covariance of distinct dataset attribute pairs. | [1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. [2] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| eigenvalues | Compute the eigenvalues of the covariance matrix of the dataset. | [1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119–138, 2006. |
| g_mean | Compute the geometric mean of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| gravity | Compute the distance between the centers of mass of the minority and majority classes. | [1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119–138, 2006. |
| h_mean | Compute the harmonic mean of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| iq_range | Compute the interquartile range (IQR) of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| kurtosis | Compute the kurtosis of each attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| lh_trace | Compute the Lawley-Hotelling trace. | [1] Lawley, D. A Generalization of Fisher's z Test. Biometrika, 30(1):180–187, 1938. [2] Hotelling, H. A generalized T test and measure of multivariate dispersion. In J. Neyman (ed.), Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pages 23–41. Berkeley: University of California Press, 1951. |
| mad | Compute the Median Absolute Deviation (MAD) adjusted by a factor. | [1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119–138, 2006. |
| max | Compute the maximum value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| mean | Compute the mean value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| median | Compute the median value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| min | Compute the minimum value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| nr_cor_attr | Compute the number of distinct highly correlated pairs of attributes. | [1] Mostafa A. Salama, Aboul Ella Hassanien, and Kenneth Revett. Employment of neural network and rough set in meta-learning. Memetic Computing, 5(3):165–177, 2013. |
| nr_disc | Compute the number of canonical correlations between each attribute and the class. | [1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418–423, 1999. |
| nr_norm | Compute the number of attributes normally distributed according to a given method. | [1] Christian Köpf, Charles Taylor, and Jörg Keller. Meta-Analysis: From data characterisation for meta-learning to meta-regression. In PKDD Workshop on Data Mining, Decision Support, Meta-Learning and Inductive Logic Programming, pages 15–26, 2000. |
| nr_outliers | Compute the number of attributes with at least one outlier value. | [1] Christian Köpf and Ioannis Iglezakis. Combination of task description strategies and case base properties for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 65–76, 2002. [2] Peter J. Rousseeuw and Mia Hubert. Robust statistics for outlier detection. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(1):73–79, 2011. |
| p_trace | Compute Pillai's trace. | [1] Pillai, K. C. S. Some new test criteria in multivariate analysis. Ann. Math. Stat., 26(1):117–121, 1955. [2] Seber, G. A. F. Multivariate Observations. New York: John Wiley and Sons, 1984. |
| range | Compute the range (max - min) of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| roy_root | Compute Roy's largest root. | [1] Roy, S. N. On a Heuristic Method of Test Construction and its use in Multivariate Analysis. Ann. Math. Stat., 24(2):220–238, 1953. [2] Kuhfeld, W. F. A note on Roy's largest root. Psychometrika, 51:479, 1986. https://doi.org/10.1007/BF02294069 |
| sd | Compute the standard deviation of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| sd_ratio | Compute a statistical test for homogeneity of covariances. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| skewness | Compute the skewness for each attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| sparsity | Compute the (possibly normalized) sparsity metric for each attribute. | [1] Mostafa A. Salama, Aboul Ella Hassanien, and Kenneth Revett. Employment of neural network and rough set in meta-learning. Memetic Computing, 5(3):165–177, 2013. |
| t_mean | Compute the trimmed mean of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| var | Compute the variance of each attribute. | [1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. |
| w_lambda | Compute the Wilks' Lambda value. | [1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418–423, 1999. |
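
Many of these are per-attribute summaries available directly in numpy and scipy, as in the sketch below:

```python
# Sketch: per-attribute statistical summaries with numpy/scipy.
import numpy as np
from scipy import stats
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
print("mean    :", X.mean(axis=0))
print("sd      :", X.std(axis=0, ddof=1))
print("skewness:", stats.skew(X, axis=0))
print("kurtosis:", stats.kurtosis(X, axis=0))
print("iq_range:", stats.iqr(X, axis=0))
print("t_mean  :", stats.trim_mean(X, 0.2, axis=0))  # 20% trimmed
```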

Note

Relative and subsampling landmarking are subcases of landmarking. Thus, the landmarking descriptions above also apply to the relative and subsampling groups.

Note

More information about the implementation can be found in the API Documentation.