Meta-feature Description Table

The tables below show, for each meta-feature, its group, a brief description, and the paper reference(s). See sphx_glr_auto_examples for examples of how to compute the meta-features.
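
For instance, a minimal sketch of extracting a subset of these meta-features with pymfe's MFE class (the Iris dataset and the chosen groups are only illustrative):

```python
# Minimal sketch of meta-feature extraction with pymfe.
# Assumes pymfe and scikit-learn are installed; Iris is a
# stand-in for your own data.
from sklearn.datasets import load_iris
from pymfe.mfe import MFE

X, y = load_iris(return_X_y=True)

# Restrict extraction to a few of the groups listed below.
mfe = MFE(groups=["general", "statistical", "info-theory"])
mfe.fit(X, y)
names, values = mfe.extract()

for name, value in zip(names, values):
    print(f"{name}: {value}")
```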

Meta-feature descriptions are given below, one table per group.

Clustering

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| ch | Compute the Calinski and Harabasz index. | [1] T. Calinski, J. Harabasz. A dendrite method for cluster analysis. Commun. Stat. Theory Methods, 3(1):1–27, 1974. |
| int | Compute the INT index. | [1] Souza, Bruno Feres de. Meta-aprendizagem aplicada à classificação de dados de expressão gênica. Tese (Doutorado em Ciências de Computação e Matemática Computacional), Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, 2010. doi:10.11606/T.55.2010.tde-04012011-142551. [2] Bezdek, J. C., Pal, N. R. Some new indexes of cluster validity. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 28(3):301–315, 1998. |
| nre | Compute the normalized relative entropy. | [1] Bruno Almeida Pimentel, André C. P. L. F. de Carvalho. A new data characterization for selecting clustering algorithms using meta-learning. Information Sciences, 477:203–219, 2019. |
| pb | Compute the Pearson correlation between class matching and instance distances. | [1] J. Lev. The Point Biserial Coefficient of Correlation. Ann. Math. Statist., 20(1):125–126, 1949. |
| sc | Compute the number of clusters with size smaller than a given size. | [1] Bruno Almeida Pimentel, André C. P. L. F. de Carvalho. A new data characterization for selecting clustering algorithms using meta-learning. Information Sciences, 477:203–219, 2019. |
| sil | Compute the mean silhouette value. | [1] P. J. Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math., 20:53–65, 1987. |
| vdb | Compute the Davies and Bouldin index. | [1] D. L. Davies, D. W. Bouldin. A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell., 1(2):224–227, 1979. |
| vdu | Compute the Dunn index. | [1] J. C. Dunn. Well-separated clusters and optimal fuzzy partitions. J. Cybern., 4(1):95–104, 1974. |
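
Three of these indices (ch, sil, and vdb) have direct counterparts in scikit-learn, which makes for a quick illustration. A minimal sketch, using the Iris class labels as the cluster assignment for illustration:

```python
# Sketch: ch, sil and vdb via their scikit-learn counterparts,
# treating the class labels as the cluster assignment.
from sklearn.datasets import load_iris
from sklearn.metrics import (
    calinski_harabasz_score,  # ch
    davies_bouldin_score,     # vdb
    silhouette_score,         # sil (mean silhouette value)
)

X, y = load_iris(return_X_y=True)
print("ch :", calinski_harabasz_score(X, y))
print("sil:", silhouette_score(X, y))
print("vdb:", davies_bouldin_score(X, y))
```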

Complexity

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| c1 | Compute the entropy of class proportions. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
| c2 | Compute the imbalance ratio. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 16). |
| cls_coef | Clustering coefficient. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| density | Average density of the network. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| f1 | Maximum Fisher's discriminant ratio. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Ramón A. Mollineda, José S. Sánchez, and José M. Sotoca. Data characterization for effective prototype selection. In 2nd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), pages 27–34, 2005. |
| f1v | Directional-vector maximum Fisher's discriminant ratio. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Witold Malina. Two-parameter Fisher criterion. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 31(4):629–636, 2001. |
| f2 | Volume of the overlapping region. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Marcilio C. P. Souto, Ana C. Lorena, Newton Spolaôr, and Ivan G. Costa. Complexity measures of supervised classification tasks: a case study for cancer gene expression data. In International Joint Conference on Neural Networks (IJCNN), pages 1352–1358, 2010. [3] Lisa Cummins. Combining and Choosing Case Base Maintenance Algorithms. PhD thesis, National University of Ireland, Cork, 2013. |
| f3 | Compute the maximum individual feature efficiency. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 6). |
| f4 | Compute the collective feature efficiency. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 7). |
| hubs | Hub score. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| l1 | Sum of error distance by linear programming. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| l2 | Compute the OVO subsets error rate of a linear classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| l3 | Non-linearity of a linear classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| lsc | Local set average cardinality. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). [2] Enrique Leyva, Antonio González, and Raúl Pérez. A set of complexity measures designed for applying meta-learning to instance selection. IEEE Transactions on Knowledge and Data Engineering, 27(2):354–367, 2014. |
| n1 | Compute the fraction of borderline points. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on pages 9–10). |
| n2 | Ratio of intra- and extra-class nearest neighbor distance. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| n3 | Error rate of the nearest neighbor classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). |
| n4 | Compute the non-linearity of the k-NN classifier. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on pages 9–11). |
| t1 | Fraction of hyperspheres covering data. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 9). [2] Tin K. Ho and Mitra Basu. Complexity measures of supervised classification problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3):289–300, 2002. |
| t2 | Compute the average number of features per point. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
| t3 | Compute the average number of PCA dimensions per point. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
| t4 | Compute the ratio of the PCA dimension to the original dimension. | [1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity. ACM Computing Surveys (CSUR), 52(5), Article 107, October 2019 (cited on page 15). |
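
As an illustration of how such measures are defined, here is a sketch of c1, the entropy of the class proportions normalized to [0, 1] following the survey's definition (1 indicates a perfectly balanced problem):

```python
# Sketch of the c1 measure: entropy of class proportions,
# normalized by log(n_classes). Assumes at least two classes.
import numpy as np

def c1(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)) / np.log(len(p)))

print(c1(np.array([0, 0, 0, 1, 1, 2])))  # < 1: imbalanced classes
```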

Concept

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| cohesiveness | Compute the improved version of the weighted distance, which captures how dense or sparse the example distribution is. | [1] Vilalta, R. and Drissi, Y. (2002). A Characterization of Difficult Problems in Classification. In Proceedings of the 2002 International Conference on Machine Learning and Applications (pp. 133–138). |
| conceptvar | Compute the concept variation, which estimates the variability of class labels among examples. | [1] Vilalta, R. (1999). Understanding accuracy performance through concept characterization and algorithm analysis. In Proceedings of the ICML-99 Workshop on Recent Advances in Meta-Learning and Future Work (pp. 3–9). |
| impconceptvar | Compute the improved concept variation, which estimates the variability of class labels among examples. | [1] Vilalta, R. and Drissi, Y. (2002). A Characterization of Difficult Problems in Classification. In Proceedings of the 2002 International Conference on Machine Learning and Applications (pp. 133–138). |
| wg_dist | Compute the weighted distance, which captures how dense or sparse the example distribution is. | [1] Vilalta, R. (1999). Understanding accuracy performance through concept characterization and algorithm analysis. In Proceedings of the ICML-99 Workshop on Recent Advances in Meta-Learning and Future Work (pp. 3–9). |
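
The idea behind these measures can be illustrated with a simplified, unweighted proxy for conceptvar: the mean label disagreement among each example's nearest neighbors (the published measure additionally weights neighbors by their distance):

```python
# Sketch: unweighted proxy for concept variation, i.e. the mean
# fraction of each example's k nearest neighbors that carry a
# different class label. The published measure weights neighbors
# by distance; this simplification is for illustration only.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NearestNeighbors

X, y = load_iris(return_X_y=True)
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
_, idx = nn.kneighbors(X)  # column 0 is the point itself
disagreement = (y[idx[:, 1:]] != y[:, None]).mean()
print("approximate concept variation:", disagreement)
```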

General

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| attr_to_inst | Compute the ratio between the number of attributes and the number of instances. | [1] Alexandros Kalousis and Theoharis Theoharis. NOEMON: Design, implementation and performance results of an intelligent assistant for classifier selection. Intelligent Data Analysis, 3(5):319–337, 1999. |
| cat_to_num | Compute the ratio between the number of categorical and numeric features. | [1] Matthias Feurer, Jost Tobias Springenberg, and Frank Hutter. Using meta-learning to initialize Bayesian optimization of hyperparameters. In International Conference on Meta-learning and Algorithm Selection (MLAS), pages 3–10, 2014. |
| freq_class | Compute the relative frequency of each distinct class. | [1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418–423, 1999. |
| inst_to_attr | Compute the ratio between the number of instances and attributes. | [1] Petr Kuba, Pavel Brazdil, Carlos Soares, and Adam Woznica. Exploiting sampling and meta-learning for parameter setting for support vector machines. In 8th IBERAMIA Workshop on Learning and Data Mining, pages 209–216, 2002. |
| nr_attr | Compute the total number of attributes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_bin | Compute the number of binary attributes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_cat | Compute the number of categorical attributes. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| nr_class | Compute the number of distinct classes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_inst | Compute the number of instances (rows) in the dataset. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| nr_num | Compute the number of numeric features. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| num_to_cat | Compute the ratio between the number of numeric and categorical features. | [1] Matthias Feurer, Jost Tobias Springenberg, and Frank Hutter. Using meta-learning to initialize Bayesian optimization of hyperparameters. In International Conference on Meta-learning and Algorithm Selection (MLAS), pages 3–10, 2014. |
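
Most of these measures follow directly from the shape of the data matrix, as the sketch below shows:

```python
# Sketch: several general measures derived from the data shape.
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
nr_inst, nr_attr = X.shape

print("nr_inst     :", nr_inst)
print("nr_attr     :", nr_attr)
print("nr_class    :", np.unique(y).size)
print("attr_to_inst:", nr_attr / nr_inst)
print("inst_to_attr:", nr_inst / nr_attr)
```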

Info-theory

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| attr_conc | Compute the concentration coefficient of each pair of distinct attributes. | [1] Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001. |
| attr_ent | Compute the Shannon entropy of each predictive attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| class_conc | Compute the concentration coefficient between each attribute and the class. | [1] Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001. |
| class_ent | Compute the Shannon entropy of the target attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| eq_num_attr | Compute the equivalent number of attributes for a predictive task. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| joint_ent | Compute the joint entropy between each attribute and the class. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| mut_inf | Compute the mutual information between each attribute and the target. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| ns_ratio | Compute the noisiness of attributes. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
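
A sketch of class_ent and mut_inf using scipy and scikit-learn building blocks (pymfe discretizes numeric attributes internally; a simple equal-width binning stands in for that step here):

```python
# Sketch: class entropy and per-attribute mutual information.
# Equal-width binning is an assumption standing in for pymfe's
# internal discretization.
import numpy as np
from scipy.stats import entropy
from sklearn.datasets import load_iris
from sklearn.metrics import mutual_info_score

X, y = load_iris(return_X_y=True)

_, counts = np.unique(y, return_counts=True)
print("class_ent:", entropy(counts, base=2))  # bits

for j in range(X.shape[1]):
    edges = np.histogram_bin_edges(X[:, j], bins=10)
    binned = np.digitize(X[:, j], edges[1:-1])
    # mutual_info_score uses natural logarithms (nats).
    print(f"mut_inf attr {j}:", mutual_info_score(binned, y))
```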

Itemset

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| one_itemset | Compute the one-itemset meta-feature. | [1] Song, Q., Wang, G., and Wang, C. Automatic recommendation of classification algorithms based on data set characteristics. Pattern Recognition, 45(7):2672–2689, 2012. |
| two_itemset | Compute the two-itemset meta-feature. | [1] Song, Q., Wang, G., and Wang, C. Automatic recommendation of classification algorithms based on data set characteristics. Pattern Recognition, 45(7):2672–2689, 2012. |
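
A sketch of the one-itemset idea: after discretizing each attribute into bins, report the relative frequency of each (attribute, bin) item. The exact discretization in Song et al. (2012) differs in detail, so treat this as an illustrative proxy:

```python
# Sketch of one_itemset: relative frequency of each binned
# attribute value. The two-bin equal-width split is an
# illustrative assumption.
import numpy as np
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
n_bins = 2
for j in range(X.shape[1]):
    edges = np.histogram_bin_edges(X[:, j], bins=n_bins)
    binned = np.digitize(X[:, j], edges[1:-1])
    freq = np.bincount(binned, minlength=n_bins) / X.shape[0]
    print(f"attr {j} item frequencies:", freq)
```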

Landmarking

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| best_node | Performance of the best single decision tree node. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| elite_nn | Performance of the Elite Nearest Neighbor. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. |
| linear_discr | Performance of the Linear Discriminant classifier. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| naive_bayes | Performance of the Naive Bayes classifier. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| one_nn | Performance of the 1-Nearest Neighbor classifier. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. |
| random_node | Performance of the single decision tree node model induced by a random attribute. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
| worst_node | Performance of the single decision tree node model induced by the least informative attribute. | [1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325–330, 2000. [2] Johannes Fürnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57–68, 2001. |
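
Landmarkers are the scores of fast, simple learners evaluated on the dataset. A sketch in that spirit, using 10-fold cross-validated accuracy (the exact evaluation protocol in pymfe is configurable):

```python
# Sketch: landmarking via cross-validated accuracy of simple
# learners. best_node corresponds to a depth-1 decision tree
# (a "decision stump").
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
landmarkers = {
    "best_node": DecisionTreeClassifier(max_depth=1, random_state=0),
    "naive_bayes": GaussianNB(),
    "one_nn": KNeighborsClassifier(n_neighbors=1),
}
for name, model in landmarkers.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: {scores.mean():.3f}")
```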

Model-based

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| leaves | Compute the number of leaf nodes in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| leaves_branch | Compute the size of branches in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| leaves_corrob | Compute the leaves corroboration of the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| leaves_homo | Compute the homogeneity of every leaf node in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| leaves_per_class | Compute the proportion of leaves per class in the DT model. | [1] Andrey Filchenkov and Arseniy Pendryak. Datasets meta-feature description for recommending feature selection algorithm. In Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMW FRUCT), pages 11–18, 2015. |
| nodes | Compute the number of non-leaf nodes in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| nodes_per_attr | Compute the ratio of nodes per number of attributes in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| nodes_per_inst | Compute the ratio of non-leaf nodes per number of instances in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| nodes_per_level | Compute the ratio of nodes per tree level in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| nodes_repeated | Compute the number of repeated nodes in the DT model. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| tree_depth | Compute the depth of every node in the DT model. | [1] Yonghong Peng, Peter A. Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 111–122, 2002. |
| tree_imbalance | Compute the tree imbalance for each leaf node. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| tree_shape | Compute the tree shape for every leaf node. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
| var_importance | Compute the feature importance of the DT model for each attribute. | [1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approach to meta-learning. In 10th International Conference on Inductive Logic Programming (ILP), pages 33–42, 2000. |
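
These measures are read off a decision tree induced on the dataset. A sketch of a few of them using a scikit-learn tree (pymfe builds a comparable tree internally):

```python
# Sketch: leaves, nodes, tree_depth and var_importance read
# from a fitted scikit-learn decision tree.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
dt = DecisionTreeClassifier(random_state=0).fit(X, y)

is_leaf = dt.tree_.children_left == -1  # -1 marks a leaf
print("leaves        :", int(is_leaf.sum()))
print("nodes         :", int((~is_leaf).sum()))  # non-leaf nodes
print("tree_depth    :", dt.get_depth())
print("var_importance:", dt.feature_importances_)
```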

Statistical

| Meta-feature name | Description | Reference |
| --- | --- | --- |
| can_cor | Compute the canonical correlations of the data. | [1] Alexandros Kalousis. Algorithm Selection via Meta-Learning. PhD thesis, Faculty of Science of the University of Geneva, 2002. |
| cor | Compute the absolute value of the correlation of distinct dataset column pairs. | [1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. [2] Matthias Reif, Faisal Shafait, Markus Goldstein, Thomas Breuel, and Andreas Dengel. Automatic classifier selection for non-experts. Pattern Analysis and Applications, 17(1):83–96, 2014. [3] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| cov | Compute the absolute value of the covariance of distinct dataset attribute pairs. | [1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. [2] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| eigenvalues | Compute the eigenvalues of the covariance matrix of the dataset. | [1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119–138, 2006. |
| g_mean | Compute the geometric mean of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| gravity | Compute the distance between the centers of mass of the minority and majority classes. | [1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119–138, 2006. |
| h_mean | Compute the harmonic mean of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| iq_range | Compute the interquartile range (IQR) of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| kurtosis | Compute the kurtosis of each attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| lh_trace | Compute the Lawley-Hotelling trace. | [1] Lawley, D. A Generalization of Fisher's z Test. Biometrika, 30(1):180–187, 1938. [2] Hotelling, H. A generalized T test and measure of multivariate dispersion. In J. Neyman (ed.), Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, pages 23–41. Berkeley: University of California Press, 1951. |
| mad | Compute the Median Absolute Deviation (MAD) adjusted by a factor. | [1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119–138, 2006. |
| max | Compute the maximum value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| mean | Compute the mean value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| median | Compute the median value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| min | Compute the minimum value of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| nr_cor_attr | Compute the number of distinct highly correlated pairs of attributes. | [1] Mostafa A. Salama, Aboul Ella Hassanien, and Kenneth Revett. Employment of neural network and rough set in meta-learning. Memetic Computing, 5(3):165–177, 2013. |
| nr_disc | Compute the number of canonical correlations between each attribute and the class. | [1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418–423, 1999. |
| nr_norm | Compute the number of attributes normally distributed according to a given method. | [1] Christian Köpf, Charles Taylor, and Jörg Keller. Meta-Analysis: From data characterisation for meta-learning to meta-regression. In PKDD Workshop on Data Mining, Decision Support, Meta-Learning and Inductive Logic Programming, pages 15–26, 2000. |
| nr_outliers | Compute the number of attributes with at least one outlier value. | [1] Christian Köpf and Ioannis Iglezakis. Combination of task description strategies and case base properties for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 65–76, 2002. [2] Peter J. Rousseeuw and Mia Hubert. Robust statistics for outlier detection. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(1):73–79, 2011. |
| p_trace | Compute Pillai's trace. | [1] Pillai, K. C. S. Some new test criteria in multivariate analysis. Ann. Math. Stat., 26(1):117–121, 1955. [2] Seber, G. A. F. Multivariate Observations. New York: John Wiley and Sons, 1984. |
| range | Compute the range (max - min) of each attribute. | [1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173–186, 2006. |
| roy_root | Compute Roy's largest root. | [1] Roy, S. N. On a Heuristic Method of Test Construction and its use in Multivariate Analysis. Ann. Math. Stat., 24(2):220–238, 1953. [2] Kuhfeld, W. F. A note on Roy's largest root. Psychometrika, 51:479, 1986. https://doi.org/10.1007/BF02294069 |
| sd | Compute the standard deviation of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| sd_ratio | Compute a statistical test for homogeneity of covariances. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| skewness | Compute the skewness for each attribute. | [1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood, Upper Saddle River, 1994. |
| sparsity | Compute the (possibly normalized) sparsity metric for each attribute. | [1] Mostafa A. Salama, Aboul Ella Hassanien, and Kenneth Revett. Employment of neural network and rough set in meta-learning. Memetic Computing, 5(3):165–177, 2013. |
| t_mean | Compute the trimmed mean of each attribute. | [1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on Artificial Intelligence (ECAI), pages 430–434, 1998. |
| var | Compute the variance of each attribute. | [1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. |
| w_lambda | Compute the Wilks' Lambda value. | [1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418–423, 1999. |
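
Many of these are per-attribute summaries available directly in numpy and scipy, as in the sketch below:

```python
# Sketch: per-attribute statistical summaries with numpy/scipy.
import numpy as np
from scipy import stats
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)
print("mean    :", X.mean(axis=0))
print("sd      :", X.std(axis=0, ddof=1))
print("skewness:", stats.skew(X, axis=0))
print("kurtosis:", stats.kurtosis(X, axis=0))
print("iq_range:", stats.iqr(X, axis=0))
print("t_mean  :", stats.trim_mean(X, 0.2, axis=0))  # 20% trimmed
```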

Note

Relative and subsampling landmarking are subcases of landmarking. Thus, the landmarking descriptions above also apply to the relative and subsampling groups.

Note

More information about the implementation can be found in the API Documentation.