Welcome to PyMFE’s documentation!
Install
Requirements
The PyMFE package requires the following dependencies:
numpy
scipy
scikit-learn
patsy
pandas
statsmodels
texttable
tqdm
gower
igraph
Install
PyMFE is available on PyPI. You can install it via pip as follows:
pip install -U pymfe
It is also possible to use the development version by installing it directly from GitHub:
pip install -U git+https://github.com/ealcobaca/pymfe.git
If you prefer, you can clone the repository and install it from source. Use the following commands to get a copy from GitHub and install it along with all dependencies:
git clone https://github.com/ealcobaca/pymfe.git
cd pymfe
pip install .
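To check that the installation worked, you can run a quick sanity check from a Python shell. This is a minimal sketch; the __version__ attribute is assumed to be exposed by the package:
# Sanity check: import the package and print its version
# (__version__ is assumed to be exposed by the package)
import pymfe
print(pymfe.__version__)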
Test and coverage
If you want to run the tests and check test coverage before installing:
$ make install-dev
$ make test-cov
Using PyMFE
Extracting metafeatures with PyMFE is easy.
The simplest way to extract meta-features is by instantiating the MFE class. By default, it computes five meta-feature groups, using mean and standard deviation as summary functions: General, Statistical, Information-theoretic, Model-based, and Landmarking. The fit method is called with X and y, and the extract method is then used to extract the related measures. A simple example of using pymfe for supervised tasks is given next:
# Load a dataset
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
# Extract default measures
mfe = MFE()
mfe.fit(X, y)
ft = mfe.extract()
print(ft)
# Extract general, statistical and information-theoretic measures
mfe = MFE(groups=["general", "statistical", "info-theory"])
mfe.fit(X, y)
ft = mfe.extract()
print(ft)
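Since extract returns a tuple with the meta-feature names and their corresponding values, you can pair them up for a more readable output; a minimal sketch:
# Pair each meta-feature name with its value for a more readable output
for name, value in zip(ft[0], ft[1]):
    print("{:40} {}".format(name, value))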
For more examples, see the example gallery below.
Meta-feature Description Table
The table below shows, for each meta-feature, its group, a quick description, and paper references. See the example gallery for how to compute these meta-features.
Group | Meta-feature name | Description | Reference
clustering |
ch |
Compute the Calinski and Harabasz index. |
[1] T. Calinski, J. Harabasz, A dendrite method for cluster analysis, Commun. Stat. Theory Methods 3 (1) (1974) 1–27. |
clustering |
int |
Compute the INT index. |
[1] SOUZA, Bruno Feres de. Meta-aprendizagem aplicada à classificação de dados de expressão gênica. 2010. Tese (Doutorado em Ciências de Computação e Matemática Computacional), Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, 2010. doi:10.11606/T.55.2010.tde-04012011-142551. [2] Bezdek, J. C.; Pal, N. R. (1998a). Some new indexes of cluster validity. IEEE Transactions on Systems, Man, and Cybernetics, Part B, v.28, n.3, p.301–315. |
clustering |
nre |
Compute the normalized relative entropy. |
[1] Bruno Almeida Pimentel, André C.P.L.F. de Carvalho. A new data characterization for selecting clustering algorithms using meta-learning. Information Sciences, Volume 477, 2019, Pages 203-219. |
clustering |
pb |
Compute the Pearson correlation between class matching and instance distances. |
[1] J. Lev, “The Point Biserial Coefficient of Correlation”, Ann. Math. Statist., Vol. 20, no.1, pp. 125-126, 1949. |
clustering |
sc |
Compute the number of clusters with size smaller than a given size. |
[1] Bruno Almeida Pimentel, André C.P.L.F. de Carvalho. A new data characterization for selecting clustering algorithms using meta-learning. Information Sciences, Volume 477, 2019, Pages 203-219. |
clustering |
sil |
Compute the mean silhouette value. |
[1] P.J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, J. Comput. Appl. Math. 20 (1987) 53–65. |
clustering |
vdb |
Compute the Davies and Bouldin Index. |
[1] D.L. Davies, D.W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. 1 (2) (1979) 224–227. |
clustering |
vdu |
Compute the Dunn Index. |
[1] J.C. Dunn, Well-separated clusters and optimal fuzzy partitions, J. Cybern. 4 (1) (1974) 95–104. |
complexity |
c1 |
Compute the entropy of class proportions. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 15). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
c2 |
Compute the imbalance ratio. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 16). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
cls_coef |
Clustering coefficient. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
density |
Average density of the network. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
f1 |
Maximum Fisher’s discriminant ratio. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. [2] Ramón A Mollineda, José S Sánchez, and José M Sotoca. Data characterization for effective prototype selection. In 2nd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), pages 27–34, 2005. |
complexity |
f1v |
Directional-vector maximum Fisher’s discriminant ratio. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. [2] Witold Malina. Two-parameter fisher criterion. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 31(4):629–636, 2001. |
complexity |
f2 |
Volume of the overlapping region. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. [2] Marcilio C P Souto, Ana C Lorena, Newton Spolaôr, and Ivan G Costa. Complexity measures of supervised classification tasks: a case study for cancer gene expression data. In International Joint Conference on Neural Networks (IJCNN), pages 1352–1358, 2010. [3] Lisa Cummins. Combining and Choosing Case Base Maintenance Algorithms. PhD thesis, National University of Ireland, Cork, 2013. |
complexity |
f3 |
Compute feature maximum individual efficiency. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 6). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
f4 |
Compute the collective feature efficiency. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 7). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
hubs |
Hub score. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
l1 |
Sum of error distance by linear programming. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
l2 |
Compute the OVO subsets error rate of linear classifier. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
l3 |
Non-Linearity of a linear classifier. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
lsc |
Local set average cardinality. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 15). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. [2] Enrique Leyva, Antonio González, and Raúl Pérez. A set of complexity measures designed for applying meta-learning to instance selection. IEEE Transactions on Knowledge and Data Engineering, 27(2):354–367, 2014. |
complexity |
n1 |
Compute the fraction of borderline points. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9-10). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
n2 |
Ratio of intra and extra class nearest neighbor distance. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
n3 |
Error rate of the nearest neighbor classifier. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
n4 |
Compute the non-linearity of the k-NN Classifier. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9-11). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
t1 |
Fraction of hyperspheres covering data. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 9). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. [2] Tin K Ho and Mitra Basu. Complexity measures of supervised classification problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3):289–300, 2002. |
complexity |
t2 |
Compute the average number of features per dimension. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 15). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
t3 |
Compute the average number of PCA dimensions per points. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 15). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
complexity |
t4 |
Compute the ratio of the PCA dimension to the original dimension. |
[1] Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, and Tin K. Ho. How Complex is your classification problem? A survey on measuring classification complexity (V2). (2019) (Cited on page 15). Published in ACM Computing Surveys (CSUR), Volume 52 Issue 5, October 2019, Article No. 107. |
concept |
cohesiveness |
Compute the improved version of the weighted distance, which captures how dense or sparse the example distribution is. |
[1] Vilalta, R and Drissi, Y (2002). A Characterization of Difficult Problems in Classification. Proceedings of the 2002 International Conference on Machine Learning and Applications (pp. 133-138). |
concept |
conceptvar |
Compute the concept variation that estimates the variability of class labels among examples. |
[1] Vilalta, R. (1999). Understanding accuracy performance through concept characterization and algorithm analysis. In Proceedings of the ICML-99 workshop on recent advances in meta-learning and future work (pp. 3-9). |
concept |
impconceptvar |
Compute the improved concept variation that estimates the variability of class labels among examples. |
[1] Vilalta, R and Drissi, Y (2002). A Characterization of Difficult Problems in Classification. Proceedings of the 2002 International Conference on Machine Learning and Applications (pp. 133-138). |
concept |
wg_dist |
Compute the weighted distance, which captures how dense or sparse the example distribution is. |
[1] Vilalta, R. (1999). Understanding accuracy performance through concept characterization and algorithm analysis. In Proceedings of the ICML-99 workshop on recent advances in meta-learning and future work (pp. 3-9). |
general |
attr_to_inst |
Compute the ratio between the number of attributes and the number of instances. |
[1] Alexandros Kalousis and Theoharis Theoharis. NOEMON: Design, implementation and performance results of an intelligent assistant for classifier selection. Intelligent Data Analysis, 3(5):319–337, 1999. |
general |
cat_to_num |
Compute the ratio between the number of categorical and numeric features. |
[1] Matthias Feurer, Jost Tobias Springenberg, and Frank Hutter. Using meta-learning toinitialize bayesian optimization of hyperparameters. In International Conference on Meta-learning and Algorithm Selection (MLAS), pages 3 – 10, 2014. |
general |
freq_class |
Compute the relative frequency of each distinct class. |
[1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418 – 423, 1999. |
general |
inst_to_attr |
Compute the ratio between the number of instances and attributes. |
[1] Petr Kuba, Pavel Brazdil, Carlos Soares, and Adam Woznica. Exploiting sampling andmeta-learning for parameter setting for support vector machines. In 8th IBERAMIA Workshop on Learning and Data Mining, pages 209 – 216, 2002. |
general |
nr_attr |
Compute the total number of attributes. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
general |
nr_bin |
Compute the number of binary attributes. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
general |
nr_cat |
Compute the number of categorical attributes. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
general |
nr_class |
Compute the number of distinct classes. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
general |
nr_inst |
Compute the number of instances (rows) in the dataset. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
general |
nr_num |
Compute the number of numeric features. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
general |
num_to_cat |
Compute the ratio between the number of numerical and categorical features. |
[1] Matthias Feurer, Jost Tobias Springenberg, and Frank Hutter. Using meta-learning toinitialize bayesian optimization of hyperparameters. In International Conference on Meta-learning and Algorithm Selection (MLAS), pages 3 – 10, 2014. |
info-theory |
attr_conc |
Compute concentration coef. of each pair of distinct attributes. |
[1] Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001. |
info-theory |
attr_ent |
Compute Shannon’s entropy for each predictive attribute. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
info-theory |
class_conc |
Compute concentration coefficient between each attribute and class. |
[1] Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001. |
info-theory |
class_ent |
Compute target attribute Shannon’s entropy. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
info-theory |
eq_num_attr |
Compute the number of attributes equivalent for a predictive task. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
info-theory |
joint_ent |
Compute the joint entropy between each attribute and class. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
info-theory |
mut_inf |
Compute the mutual information between each attribute and target. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
info-theory |
ns_ratio |
Compute the noisiness of attributes. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
itemset |
one_itemset |
Compute the one itemset meta-feature. |
[1] Song, Q., Wang, G., & Wang, C. (2012). Automatic recommendation of classification algorithms based on data set characteristics. Pattern recognition, 45(7), 2672-2689. |
itemset |
two_itemset |
Compute the two itemset meta-feature. |
[1] Song, Q., Wang, G., & Wang, C. (2012). Automatic recommendation of classification algorithms based on data set characteristics. Pattern recognition, 45(7), 2672-2689. |
landmarking |
best_node |
Performance of the best single decision tree node. |
[1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325 – 330, 2000. [2] Johannes Furnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57 – 68, 2001. |
landmarking |
elite_nn |
Performance of Elite Nearest Neighbor. |
[1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325 – 330, 2000. |
landmarking |
linear_discr |
Performance of the Linear Discriminant classifier. |
[1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325 – 330, 2000. [2] Johannes Furnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57 – 68, 2001. |
landmarking |
naive_bayes |
Performance of the Naive Bayes classifier. |
[1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325 – 330, 2000. [2] Johannes Furnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57 – 68, 2001. |
landmarking |
one_nn |
Performance of the 1-Nearest Neighbor classifier. |
[1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325 – 330, 2000. |
landmarking |
random_node |
Performance of the single decision tree node model induced by a random attribute. |
[1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325 – 330, 2000. [2] Johannes Furnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57 – 68, 2001. |
landmarking |
worst_node |
Performance of the single decision tree node model induced by the worst informative attribute. |
[1] Hilan Bensusan and Christophe Giraud-Carrier. Discovering task neighbourhoods through landmark learning performances. In 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 325 – 330, 2000. [2] Johannes Furnkranz and Johann Petrak. An evaluation of landmarking variants. In 1st ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), pages 57 – 68, 2001. |
model-based |
leaves |
Compute the number of leaf nodes in the DT model. |
[1] Yonghong Peng, PA Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning(IDDM), pages 111 – 122, 2002a. |
model-based |
leaves_branch |
Compute the size of branches in the DT model. |
[1] Yonghong Peng, PA Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning(IDDM), pages 111 – 122, 2002a. |
model-based |
leaves_corrob |
Compute the leaves corroboration of the DT model. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
model-based |
leaves_homo |
Compute the DT model Homogeneity for every leaf node. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
model-based |
leaves_per_class |
Compute the proportion of leaves per class in DT model. |
[1] Andray Filchenkov and Arseniy Pendryak. Datasets meta-feature description for recom-mending feature selection algorithm. In Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference (AINL-ISMWFRUCT), pages 11 – 18, 2015. |
model-based |
nodes |
Compute the number of non-leaf nodes in DT model. |
[1] Yonghong Peng, PA Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning(IDDM), pages 111 – 122, 2002a. |
model-based |
nodes_per_attr |
Compute the ratio of nodes per number of attributes in DT model. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
model-based |
nodes_per_inst |
Compute the ratio of non-leaf nodes per number of instances in DT model. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
model-based |
nodes_per_level |
Compute the ratio of number of nodes per tree level in DT model. |
[1] Yonghong Peng, PA Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning(IDDM), pages 111 – 122, 2002a. |
model-based |
nodes_repeated |
Compute the number of repeated nodes in DT model. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
model-based |
tree_depth |
Compute the depth of every node in the DT model. |
[1] Yonghong Peng, PA Flach, Pavel Brazdil, and Carlos Soares. Decision tree-based data characterization for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning(IDDM), pages 111 – 122, 2002a. |
model-based |
tree_imbalance |
Compute the tree imbalance for each leaf node. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
model-based |
tree_shape |
Compute the tree shape for every leaf node. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
model-based |
var_importance |
Compute the features importance of the DT model for each attribute. |
[1] Hilan Bensusan, Christophe Giraud-Carrier, and Claire Kennedy. A higher-order approachto meta-learning. In 10th International Conference Inductive Logic Programming (ILP), pages 33 – 42, 2000. |
statistical |
can_cor |
Compute canonical correlations of data. |
[1] Alexandros Kalousis. Algorithm Selection via Meta-Learning. PhD thesis, Faculty of Science of the University of Geneva, 2002. |
statistical |
cor |
Compute the absolute value of the correlation of distinct dataset column pairs. |
[1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. [2] Matthias Reif, Faisal Shafait, Markus Goldstein, Thomas Breuel, and Andreas Dengel. Automatic classifier selection for non-experts. Pattern Analysis and Applications, 17(1):83–96, 2014. [3] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
statistical |
cov |
Compute the absolute value of the covariance of distinct dataset attribute pairs. |
[1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. [2] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
statistical |
eigenvalues |
Compute the eigenvalues of covariance matrix from dataset. |
[1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119 – 138, 2006. |
statistical |
g_mean |
Compute the geometric mean of each attribute. |
[1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173 – 186, 2006. |
statistical |
gravity |
Compute the distance between minority and majority classes center of mass. |
[1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119 – 138, 2006. |
statistical |
h_mean |
Compute the harmonic mean of each attribute. |
[1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173 – 186, 2006. |
statistical |
iq_range |
Compute the interquartile range (IQR) of each attribute. |
[1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173 – 186, 2006. |
statistical |
kurtosis |
Compute the kurtosis of each attribute. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
statistical |
lh_trace |
Compute the Lawley-Hotelling trace. |
[1] Lawley D. A Generalization of Fisher’s z Test. Biometrika. 1938;30(1):180-187. [2] Hotelling H. A generalized T test and measure of multivariate dispersion. In: Neyman J, ed. Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of California Press; 1951:23-41. |
statistical |
mad |
Compute the Median Absolute Deviation (MAD) adjusted by a factor. |
[1] Shawkat Ali and Kate A. Smith. On learning algorithm selection for classification. Applied Soft Computing, 6(2):119 – 138, 2006. |
statistical |
max |
Compute the maximum value from each attribute. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
statistical |
mean |
Compute the mean value of each attribute. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
statistical |
median |
Compute the median value from each attribute. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
statistical |
min |
Compute the minimum value from each attribute. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
statistical |
nr_cor_attr |
Compute the number of distinct highly correlated pair of attributes. |
[1] Mostafa A. Salama, Aboul Ella Hassanien, and Kenneth Revett. Employment of neural network and rough set in meta-learning. Memetic Computing, 5(3):165 – 177, 2013. |
statistical |
nr_disc |
Compute the number of canonical correlations between each attribute and class. |
[1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418 – 423, 1999. |
statistical |
nr_norm |
Compute the number of attributes normally distributed based on a given method. |
[1] Christian Kopf, Charles Taylor, and Jorg Keller. Meta-Analysis: From data characterisation for meta-learning to meta-regression. In PKDD Workshop on Data Mining, Decision Support, Meta-Learning and Inductive Logic Programming, pages 15 – 26, 2000. |
statistical |
nr_outliers |
Compute the number of attributes with at least one outlier value. |
[1] Christian Kopf and Ioannis Iglezakis. Combination of task description strategies and case base properties for meta-learning. In 2nd ECML/PKDD International Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and Meta-Learning(IDDM), pages 65 – 76, 2002. [2] Peter J. Rousseeuw and Mia Hubert. Robust statistics for outlier detection. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(1):73 – 79, 2011. |
statistical |
p_trace |
Compute the Pillai’s trace. |
[1] Pillai K.C.S (1955). Some New test criteria in multivariate analysis. Ann Math Stat: 26(1):117–21. Seber, G.A.F. (1984). Multivariate Observations. New York: John Wiley and Sons. |
statistical |
range |
Compute the range (max - min) of each attribute. |
[1] Shawkat Ali and Kate A. Smith-Miles. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomputing, 70(1):173 – 186, 2006. |
statistical |
roy_root |
Compute the Roy’s largest root. |
[1] Roy SN. On a Heuristic Method of Test Construction and its use in Multivariate Analysis. Ann Math Stat. 1953;24(2):220-238. [2] A note on Roy’s largest root. Kuhfeld, W.F. Psychometrika (1986) 51: 479. https://doi.org/10.1007/BF02294069 |
statistical |
sd |
Compute the standard deviation of each attribute. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
statistical |
sd_ratio |
Compute a statistical test for homogeneity of covariances. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
statistical |
skewness |
Compute the skewness for each attribute. |
[1] Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994. |
statistical |
sparsity |
Compute (possibly normalized) sparsity metric for each attribute. |
[1] Mostafa A. Salama, Aboul Ella Hassanien, and Kenneth Revett. Employment of neural network and rough set in meta-learning. Memetic Computing, 5(3):165 – 177, 2013. |
statistical |
t_mean |
Compute the trimmed mean of each attribute. |
[1] Robert Engels and Christiane Theusinger. Using a data metric for preprocessing advice for data mining applications. In 13th European Conference on on Artificial Intelligence (ECAI), pages 430 – 434, 1998. |
statistical |
var |
Compute the variance of each attribute. |
[1] Ciro Castiello, Giovanna Castellano, and Anna Maria Fanelli. Meta-data: Characterization of input features for meta-learning. In 2nd International Conference on Modeling Decisions for Artificial Intelligence (MDAI), pages 457–468, 2005. |
statistical |
w_lambda |
Compute the Wilks’ Lambda value. |
[1] Guido Lindner and Rudi Studer. AST: Support for algorithm selection with a CBR approach. In European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pages 418 – 423, 1999. |
Note
Relative and Subsampling Landmarking are subcases of Landmarking. Thus, the Landmarking descriptions also apply to the Relative and Subsampling groups.
Note
More information about the implementation can be found in the API Documentation.
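The meta-feature names from the table above can also be requested individually through the features argument of MFE (the same argument used in the gallery examples); a minimal sketch using the iris dataset:
from sklearn.datasets import load_iris
from pymfe.mfe import MFE

data = load_iris()
# groups="all" makes every group's meta-features selectable; the names
# below are taken from the "Meta-feature name" column of the table.
mfe = MFE(groups="all", features=["nr_inst", "nr_attr", "class_ent", "sil"])
mfe.fit(data.data, data.target)
names, values = mfe.extract()
print(dict(zip(names, values)))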
API Documentation
This is the full API documentation of the PyMFE package.
pymfe.mfe: Meta-feature extractor
Main module for extracting metafeatures from datasets.
MFE: the core class for metafeature extraction.
pymfe.general: General Meta-features
A module dedicated to the extraction of general metafeatures. Keeps methods for meta-features of the General group.
pymfe.statistical: Statistical Meta-features
A module dedicated to the extraction of statistical metafeatures. Keeps methods for meta-features of the Statistical group.
pymfe.info_theory: Information theory Meta-features
A module dedicated to the extraction of information-theoretic metafeatures. Keeps methods for meta-features of the Information-theoretic group.
pymfe.model_based: Model-based Meta-features
A module dedicated to the extraction of model-based metafeatures. Keeps methods for meta-features of the Model-based group.
pymfe.landmarking: Landmarking Meta-features
A module dedicated to the extraction of landmarking metafeatures. Keeps methods for meta-features of the Landmarking group.
pymfe.relative: Relative Landmarking Meta-features
A module dedicated to the extraction of relative landmarking metafeatures. Keeps methods for meta-features of the Relative Landmarking group.
pymfe.clustering: Clustering Meta-features
A module dedicated to the extraction of clustering metafeatures. Keeps methods for meta-features of the Clustering group.
pymfe.concept: Concept Meta-features
A module dedicated to the extraction of concept metafeatures. Keeps methods for meta-features of the Concept group.
pymfe.itemset: Itemset Meta-features
A module dedicated to the extraction of itemset metafeatures. Keeps methods for meta-features of the Itemset group.
pymfe.complexity: Complexity Meta-features
A module dedicated to the extraction of complexity metafeatures. Keeps methods for meta-features of the Complexity group.
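To see which groups, meta-features, and summary functions your installed version provides, a minimal sketch; valid_groups, valid_metafeatures, and valid_summary are assumed to be the MFE class methods used in the "Listing available metafeatures, groups, and summaries" gallery example:
from pymfe.mfe import MFE

# These helpers are assumed from the listing example in the gallery.
print(MFE.valid_groups())        # available meta-feature groups
print(MFE.valid_metafeatures())  # available meta-feature names
print(MFE.valid_summary())       # available summary functions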
The PyMFE example gallery
In this gallery, we show a set of examples to help you use this package and to guide you through the meta-feature extraction process.
In the Meta-learning (MtL) literature, meta-features are measures used to characterize data sets and/or their relations with algorithm bias. According to Brazdil et al. (2008), “Meta-learning is the study of principled methods that exploit meta-knowledge to obtain efficient models and solutions by adapting the machine learning and data mining process”.
Meta-features are used in MtL and AutoML tasks in general: to represent/understand a dataset, to understand a learning bias, to create machine learning (or data mining) recommendation systems, and to create surrogate models, to name a few.
Pinto et al. (2016) and Rivolli et al. (2018) defined a meta-feature as follows. Let \(D \in \mathcal{D}\) be a dataset, \(m\colon \mathcal{D} \to \mathbb{R}^{k'}\) be a characterization measure, and \(\sigma\colon \mathbb{R}^{k'} \to \mathbb{R}^{k}\) be a summarization function. Both \(m\) and \(\sigma\) also have associated hyperparameters, \(h_m\) and \(h_\sigma\) respectively. Thus, a meta-feature \(f\colon \mathcal{D} \to \mathbb{R}^{k}\) for a given dataset \(D\) is
\[ f(D) = \sigma(m(D, h_m), h_\sigma). \]
The measure \(m\) can extract more than one value from each dataset, i.e., \(k'\) can vary according to \(D\); these values can be mapped to a vector of fixed length \(k\) using the summarization function \(\sigma\).
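As a concrete, package-independent illustration of this decomposition (a minimal sketch in plain NumPy, not the PyMFE API), take \(m\) as the absolute correlation of every distinct attribute pair and \(\sigma\) as the pair (mean, sd):
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data
# m(D): absolute correlations of all distinct attribute pairs (k' values)
corr = np.abs(np.corrcoef(X, rowvar=False))
m_values = corr[np.triu_indices_from(corr, k=1)]
# sigma: summarize the k' values into a fixed-length vector (here k = 2)
meta_feature = (m_values.mean(), m_values.std())
print(meta_feature)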
In this package, we provide the following meta-feature groups (a short snippet selecting a few of them by name follows the list):
General: General information related to the dataset, also known as simple measures, such as the number of instances, attributes and classes.
Statistical: Standard statistical measures to describe the numerical properties of data distribution.
Information-theoretic: Particularly appropriate to describe discrete (categorical) attributes and their relationship with the classes.
Model-based: Measures designed to extract characteristics from simple machine learning models.
Landmarking: Performance of simple and efficient learning algorithms.
Relative Landmarking: Relative performance of simple and efficient learning algorithms.
Subsampling Landmarking: Performance of simple and efficient learning algorithms from a subsample of the dataset.
Clustering: Measures that extract information about the dataset based on external validation indexes.
Concept: Estimate the variability of class labels among examples and the density of the examples.
Itemset: Compute the correlation between binary attributes.
Complexity: Estimate the difficulty in separating the data points into their expected classes.
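A minimal sketch selecting a few of the groups above by name, using the iris dataset as a stand-in for any labeled tabular dataset:
from sklearn.datasets import load_iris
from pymfe.mfe import MFE

data = load_iris()
# Select a subset of the groups listed above by name
mfe = MFE(groups=["concept", "itemset", "complexity"])
mfe.fit(data.data, data.target)
names, values = mfe.extract()
print("\n".join("{:30} {}".format(n, v) for n, v in zip(names, values)))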
Below is a gallery of examples:
Introductory Examples
Introductory examples for the PyMFE package.

Extracting meta-features from unsupervised learning
Advanced Examples
These examples show you how to use some advanced configurations and tricks to make your coding more comfortable.
Miscellaneous Examples
Miscellaneous examples for the pymfe package.

Listing available metafeatures, groups, and summaries

Plotting elapsed time in a meta-feature extraction
Examples for Developers
These examples are dedicated to anyone who wishes to contribute to the development of the package or to understand more about it. We expect these examples to show you the basics of the PyMFE architecture and to inspire you to contribute.
Introductory Examples
Introductory examples for the PyMFE package.
Extracting meta-features from unsupervised learning
In this example, we show how to extract meta-features from unsupervised machine learning tasks.
# Load a dataset
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
You can simply omit the target attribute for unsupervised tasks while fitting the data into the MFE model. The pymfe package automatically finds and extracts only the metafeatures suitable for this type of task.
# Extract default unsupervised measures
mfe = MFE()
mfe.fit(X)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
# Extract all available unsupervised measures
mfe = MFE(groups="all")
mfe.fit(X)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.20980476831180148
attr_conc.sd 0.1195879817732128
attr_ent.mean 2.2771912775084115
attr_ent.sd 0.06103943244855649
attr_to_inst 0.02666666666666667
cat_to_num 0.0
cor.mean 0.594116025760156
cor.sd 0.3375443182856702
cov.mean 0.5966542132736764
cov.sd 0.5582672431248462
eigenvalues.mean 1.1432392617449672
eigenvalues.sd 2.0587713015069764
g_mean.mean 3.2230731578977903
g_mean.sd 2.0229431040263726
h_mean.mean 2.9783891110628673
h_mean.sd 2.145948231748242
inst_to_attr 37.5
iq_range.mean 1.7000000000000002
iq_range.sd 1.2754084313139324
kurtosis.mean -0.8105361276250795
kurtosis.sd 0.7326910069728161
mad.mean 1.0934175
mad.sd 0.5785781994035033
max.mean 5.425000000000001
max.sd 2.4431878083083722
mean.mean 3.4645000000000006
mean.sd 1.918485079431164
median.mean 3.6125000000000003
median.sd 1.919364043982624
min.mean 1.8499999999999999
min.sd 1.8083141320025125
nr_attr 4
nr_bin 0
nr_cat 0
nr_cor_attr 0.5
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
num_to_cat nan
range.mean 3.5750000000000006
range.sd 1.6500000000000001
sd.mean 0.9478670787835934
sd.sd 0.5712994109375844
skewness.mean 0.06273198447775732
skewness.sd 0.29439896290757683
sparsity.mean 0.0287147773948895
sparsity.sd 0.011032357470087495
t_mean.mean 3.4705555555555554
t_mean.sd 1.9048021402275979
var.mean 1.1432392617449665
var.sd 1.3325463926454557
attr_conc.mean 0.20980476831180148
attr_conc.sd 0.1195879817732128
attr_ent.mean 2.2771912775084115
attr_ent.sd 0.06103943244855649
attr_to_inst 0.02666666666666667
cat_to_num 0.0
cohesiveness.mean 67.10333333333334
cohesiveness.sd 5.355733510152213
cor.mean 0.594116025760156
cor.sd 0.3375443182856702
cov.mean 0.5966542132736764
cov.sd 0.5582672431248462
eigenvalues.mean 1.1432392617449672
eigenvalues.sd 2.0587713015069764
g_mean.mean 3.2230731578977903
g_mean.sd 2.0229431040263726
h_mean.mean 2.9783891110628673
h_mean.sd 2.145948231748242
inst_to_attr 37.5
iq_range.mean 1.7000000000000002
iq_range.sd 1.2754084313139324
kurtosis.mean -0.8105361276250795
kurtosis.sd 0.7326910069728161
mad.mean 1.0934175
mad.sd 0.5785781994035033
max.mean 5.425000000000001
max.sd 2.4431878083083722
mean.mean 3.4645000000000006
mean.sd 1.918485079431164
median.mean 3.6125000000000003
median.sd 1.919364043982624
min.mean 1.8499999999999999
min.sd 1.8083141320025125
nr_attr 4
nr_bin 0
nr_cat 0
nr_cor_attr 0.5
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
num_to_cat nan
one_itemset.mean 0.2
one_itemset.sd 0.04993563108104261
range.mean 3.5750000000000006
range.sd 1.6500000000000001
sd.mean 0.9478670787835934
sd.sd 0.5712994109375844
skewness.mean 0.06273198447775732
skewness.sd 0.29439896290757683
sparsity.mean 0.0287147773948895
sparsity.sd 0.011032357470087495
t2 0.02666666666666667
t3 0.013333333333333334
t4 0.5
t_mean.mean 3.4705555555555554
t_mean.sd 1.9048021402275979
two_itemset.mean 0.32
two_itemset.sd 0.0851125499534728
var.mean 1.1432392617449665
var.sd 1.3325463926454557
wg_dist.mean 0.4620901765870531
wg_dist.sd 0.05612193762635788
Total running time of the script: ( 0 minutes 0.306 seconds)
Meta-features from a model
In this example, we will show you how to extract meta-features from a pre-fitted model.
# Load a dataset
import sklearn.tree
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
iris = load_iris()
If you want to extract metafeatures from a pre-fitted machine learning model (from the sklearn package), you can use the extract_from_model method without needing the training data:
# Extract from model
model = sklearn.tree.DecisionTreeClassifier().fit(iris.data, iris.target)
extractor = MFE()
ft = extractor.extract_from_model(model)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
# Extract specific metafeatures from model
extractor = MFE(features=["tree_shape", "nodes_repeated"], summary="histogram")
ft = extractor.extract_from_model(
model,
arguments_fit={"verbose": 1},
arguments_extract={"verbose": 1, "histogram": {"bins": 5}})
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
leaves 9
leaves_branch.mean 3.7777777777777777
leaves_branch.sd 1.2018504251546631
leaves_corrob.mean 0.1111111111111111
leaves_corrob.sd 0.15051762539834182
leaves_homo.mean 37.46666666666667
leaves_homo.sd 13.142298124757328
leaves_per_class.mean 0.3333333333333333
leaves_per_class.sd 0.22222222222222224
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.mean 1.6
nodes_per_level.sd 0.8944271909999159
nodes_repeated.mean 2.0
nodes_repeated.sd 1.1547005383792515
tree_depth.mean 3.0588235294117645
tree_depth.sd 1.4348601079588785
tree_imbalance.mean 0.19491705385114738
tree_imbalance.sd 0.13300709991513865
tree_shape.mean 0.2708333333333333
tree_shape.sd 0.10711960313126631
var_importance.mean 0.25
var_importance.sd 0.27823897162264016
0%| | 0/1 [00:00<?, ?it/s]
100%|##########| 1/1 [00:00<00:00, 6017.65it/s]
Process of precomputation finished.
0%| | 0/2 [00:00<?, ?it/s]
100%|##########| 2/2 [00:00<00:00, 3238.84it/s]
Process of metafeature extraction finished.
nodes_repeated.histogram.0 0.5
nodes_repeated.histogram.1 0.0
nodes_repeated.histogram.2 0.0
nodes_repeated.histogram.3 0.0
nodes_repeated.histogram.4 0.5
tree_shape.histogram.0 0.2222222222222222
tree_shape.histogram.1 0.5555555555555556
tree_shape.histogram.2 0.0
tree_shape.histogram.3 0.1111111111111111
tree_shape.histogram.4 0.1111111111111111
Total running time of the script: ( 0 minutes 0.013 seconds)
Using Summaries
In this example we will explain the different ways to select summary functions.
# Load a dataset
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
Summary Methods
Several meta-features generate multiple values, and mean and sd are the standard methods used to summarize these values. In order to increase flexibility, the PyMFE package implements summary (or post-processing) methods to deal with multiple measure values. These methods can compute a descriptive statistic (resulting in a single value) or a distribution (resulting in multiple values).
The post-processing methods are set using the summary parameter. It is possible to compute min, max, mean, median, kurtosis, standard deviation, among others. This is illustrated in the following examples:
Apply several statistical measures as post processing
mfe = MFE(summary=["max", "min", "median", "mean", "var", "sd", "kurtosis",
"skewness"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.kurtosis -0.9474216477983255
attr_conc.max 0.4299566853449739
attr_conc.mean 0.20980476831180148
attr_conc.median 0.18467386404867223
attr_conc.min 0.08478331361536394
attr_conc.sd 0.1195879817732128
attr_conc.skewness 0.7075924186351203
attr_conc.var 0.014301285384590275
attr_ent.kurtosis -1.7072116699243152
attr_ent.max 2.3156530476978263
attr_ent.mean 2.2771912775084115
attr_ent.median 2.3034401979164256
attr_ent.min 2.186231666502969
attr_ent.sd 0.06103943244855649
attr_ent.skewness -0.7209530933492252
attr_ent.var 0.0037258123136418905
attr_to_inst 0.02666666666666667
best_node.kurtosis -3.0
best_node.max 0.6666666666666666
best_node.mean 0.6666666666666667
best_node.median 0.6666666666666666
best_node.min 0.6666666666666666
best_node.sd 1.1702778228589004e-16
best_node.skewness 0.0
best_node.var 1.3695501826753678e-32
can_cor.kurtosis -2.75
can_cor.max 0.9848208927389822
can_cor.mean 0.7280089563896481
can_cor.median 0.7280089563896481
can_cor.min 0.47119702004031394
can_cor.sd 0.3631869233645244
can_cor.skewness -2.5347649085285293e-16
can_cor.var 0.1319047413029889
cat_to_num 0.0
class_conc.kurtosis -2.34680678006496
class_conc.max 0.4011425322248528
class_conc.mean 0.27347384133126745
class_conc.median 0.28650664619878463
class_conc.min 0.11973954070264788
class_conc.sd 0.14091096327223987
class_conc.skewness -0.07091647996659645
class_conc.var 0.019855899570310535
class_ent 1.584962500721156
cor.kurtosis -1.9476130087221712
cor.max 0.9628654314027961
cor.mean 0.594116025760156
cor.median 0.6231906153010576
cor.min 0.11756978413300208
cor.sd 0.3375443182856702
cor.skewness -0.18142911996033195
cor.var 0.11393616680693783
cov.kurtosis -1.9705891027997176
cov.max 1.2956093959731547
cov.mean 0.5966542132736764
cov.median 0.4229635346756151
cov.min 0.042434004474272924
cov.sd 0.5582672431248462
cov.skewness 0.34072276443380106
cov.var 0.3116623147462162
eigenvalues.kurtosis -1.6906307400544616
eigenvalues.max 4.228241706034867
eigenvalues.mean 1.1432392617449672
eigenvalues.median 0.1604401239857764
eigenvalues.min 0.023835092973449445
eigenvalues.sd 2.0587713015069764
eigenvalues.skewness 0.7454458797939764
eigenvalues.var 4.238539271908729
elite_nn.kurtosis -0.4687499999999991
elite_nn.max 1.0
elite_nn.mean 0.9333333333333333
elite_nn.median 0.9333333333333333
elite_nn.min 0.8
elite_nn.sd 0.06285393610547088
elite_nn.skewness -0.7159456159513794
elite_nn.var 0.003950617283950616
eq_num_attr 1.8780672345507194
freq_class.kurtosis -3.0
freq_class.max 0.3333333333333333
freq_class.mean 0.3333333333333333
freq_class.median 0.3333333333333333
freq_class.min 0.3333333333333333
freq_class.sd 0.0
freq_class.skewness 0.0
freq_class.var 0.0
g_mean.kurtosis -1.876087805810185
g_mean.max 5.785720390427728
g_mean.mean 3.2230731578977903
g_mean.median 3.1324323471229167
g_mean.min 0.8417075469176013
g_mean.sd 2.0229431040263726
g_mean.skewness 0.10017663652972701
g_mean.var 4.092298802127854
gravity 3.2082811597489393
h_mean.kurtosis -1.8765954987057685
h_mean.max 5.728905057850834
h_mean.mean 2.9783891110628673
h_mean.median 2.8449903044543063
h_mean.min 0.49467077749202265
h_mean.sd 2.145948231748242
h_mean.skewness 0.1382251313881372
h_mean.var 4.605093813343408
inst_to_attr 37.5
iq_range.kurtosis -1.809694974469229
iq_range.max 3.4999999999999996
iq_range.mean 1.7000000000000002
iq_range.median 1.4000000000000004
iq_range.min 0.5
iq_range.sd 1.2754084313139324
iq_range.skewness 0.485861717653184
iq_range.var 1.626666666666666
joint_ent.kurtosis -2.3945662964722434
joint_ent.max 3.410577680708083
joint_ent.mean 3.0182209990602855
joint_ent.median 2.9901513033202027
joint_ent.min 2.6820037088926547
joint_ent.sd 0.3821875549207214
joint_ent.skewness 0.03611581267545158
joint_ent.var 0.1460673271362794
kurtosis.kurtosis -2.098903711032839
kurtosis.max 0.13870467668072406
kurtosis.mean -0.8105361276250795
kurtosis.median -0.9819958777250918
kurtosis.min -1.4168574317308589
kurtosis.sd 0.7326910069728161
kurtosis.skewness 0.30302223794237043
kurtosis.var 0.5368361116988393
leaves 9
leaves_branch.kurtosis 0.4284461976769669
leaves_branch.max 5
leaves_branch.mean 3.7777777777777777
leaves_branch.median 4.0
leaves_branch.min 1
leaves_branch.sd 1.2018504251546631
leaves_branch.skewness -1.1647123778290422
leaves_branch.var 1.4444444444444444
leaves_corrob.kurtosis -1.7726882865481086
leaves_corrob.max 0.3333333333333333
leaves_corrob.mean 0.1111111111111111
leaves_corrob.median 0.013333333333333334
leaves_corrob.min 0.006666666666666667
leaves_corrob.sd 0.15051762539834182
leaves_corrob.skewness 0.6063813319643286
leaves_corrob.var 0.022655555555555557
leaves_homo.kurtosis -1.1186086765355643
leaves_homo.max 57.6
leaves_homo.mean 37.46666666666667
leaves_homo.median 36.0
leaves_homo.min 18.0
leaves_homo.sd 13.142298124757328
leaves_homo.skewness 0.317544360680112
leaves_homo.var 172.72
leaves_per_class.kurtosis -2.3333333333333335
leaves_per_class.max 0.5555555555555556
leaves_per_class.mean 0.3333333333333333
leaves_per_class.median 0.3333333333333333
leaves_per_class.min 0.1111111111111111
leaves_per_class.sd 0.22222222222222224
leaves_per_class.skewness 2.1076890233118196e-16
leaves_per_class.var 0.04938271604938273
lh_trace 32.477316568194915
linear_discr.kurtosis 1.1714277215943012
linear_discr.max 1.0
linear_discr.mean 0.9800000000000001
linear_discr.median 1.0
linear_discr.min 0.8666666666666667
linear_discr.sd 0.04499657051403685
linear_discr.skewness -1.6391493111228852
linear_discr.var 0.0020246913580246905
mad.kurtosis -1.8614049069823586
mad.max 1.8532499999999998
mad.mean 1.0934175
mad.median 1.03782
mad.min 0.44477999999999973
mad.sd 0.5785781994035033
mad.skewness 0.21354801391337835
mad.var 0.334752732825
max.kurtosis -2.182177604436795
max.max 7.9
max.mean 5.425000000000001
max.median 5.65
max.min 2.5
max.sd 2.4431878083083722
max.skewness -0.13254651618979896
max.var 5.969166666666667
mean.kurtosis -1.9225347042283154
mean.max 5.843333333333334
mean.mean 3.4645000000000006
mean.median 3.407666666666667
mean.min 1.1993333333333336
mean.sd 1.918485079431164
mean.skewness 0.06361261265760602
mean.var 3.680585
median.kurtosis -2.04337146925642
median.max 5.8
median.mean 3.6125000000000003
median.median 3.675
median.min 1.3
median.sd 1.919364043982624
median.skewness -0.061080929963701326
median.var 3.6839583333333326
min.kurtosis -1.9261232920910136
min.max 4.3
min.mean 1.8499999999999999
min.median 1.5
min.min 0.1
min.sd 1.8083141320025125
min.skewness 0.3693439632179755
min.var 3.27
mut_inf.kurtosis -2.303310024453824
mut_inf.max 1.2015788914374017
mut_inf.mean 0.8439327791692818
mut_inf.median 0.9067678693618417
mut_inf.min 0.36061648651604195
mut_inf.sd 0.4222019352579773
mut_inf.skewness -0.11787771034076516
mut_inf.var 0.17825447413558124
naive_bayes.kurtosis -1.1414812611540737
naive_bayes.max 1.0
naive_bayes.mean 0.9533333333333334
naive_bayes.median 0.9333333333333333
naive_bayes.min 0.8666666666666667
naive_bayes.sd 0.04499657051403685
naive_bayes.skewness -0.31221891640435945
naive_bayes.var 0.00202469135802469
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.kurtosis -1.6700000000000004
nodes_per_level.max 3
nodes_per_level.mean 1.6
nodes_per_level.median 1.0
nodes_per_level.min 1
nodes_per_level.sd 0.8944271909999159
nodes_per_level.skewness 0.603738353924943
nodes_per_level.var 0.8
nodes_repeated.kurtosis -2.333333333333333
nodes_repeated.max 4
nodes_repeated.mean 2.6666666666666665
nodes_repeated.median 3.0
nodes_repeated.min 1
nodes_repeated.sd 1.5275252316519465
nodes_repeated.skewness -0.20782656212951636
nodes_repeated.var 2.333333333333333
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_cor_attr 0.5
nr_disc 2
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
ns_ratio 1.698308838945616
num_to_cat nan
one_nn.kurtosis -1.3167187500000028
one_nn.max 1.0
one_nn.mean 0.96
one_nn.median 1.0
one_nn.min 0.8666666666666667
one_nn.sd 0.05621826951410451
one_nn.skewness -0.7204063794571065
one_nn.var 0.0031604938271604923
p_trace 1.191898822470078
random_node.kurtosis -3.0
random_node.max 0.6666666666666666
random_node.mean 0.6666666666666667
random_node.median 0.6666666666666666
random_node.min 0.6666666666666666
random_node.sd 1.1702778228589004e-16
random_node.skewness 0.0
random_node.var 1.3695501826753678e-32
range.kurtosis -1.8858268700023024
range.max 5.9
range.mean 3.5750000000000006
range.median 3.0000000000000004
range.min 2.4
range.sd 1.6500000000000001
range.skewness 0.5188872193004419
range.var 2.7225
roy_root 32.191925524310506
sd.kurtosis -1.7876277008883368
sd.max 1.7652982332594662
sd.mean 0.9478670787835934
sd.median 0.7951518984691048
sd.min 0.4358662849366982
sd.sd 0.5712994109375844
sd.skewness 0.541487250344505
sd.var 0.326383016937631
sd_ratio 1.2708666438750897
skewness.kurtosis -2.3196335687878826
skewness.max 0.3126147039228578
skewness.mean 0.06273198447775732
skewness.median 0.10386208214673759
skewness.min -0.26941093030530366
skewness.sd 0.29439896290757683
skewness.skewness -0.10337620962487609
skewness.var 0.08667074936105679
sparsity.kurtosis -2.340915274336955
sparsity.max 0.039048200122025624
sparsity.mean 0.0287147773948895
sparsity.median 0.029555212805869355
sparsity.min 0.016700483845793663
sparsity.sd 0.011032357470087495
sparsity.skewness -0.06436063304281459
sparsity.var 0.00012171291134779536
t_mean.kurtosis -1.9391694118386529
t_mean.max 5.797777777777777
t_mean.mean 3.4705555555555554
t_mean.median 3.4411111111111112
t_mean.min 1.2022222222222223
t_mean.sd 1.9048021402275979
t_mean.skewness 0.0327130494008835
t_mean.var 3.628271193415637
tree_depth.kurtosis -0.7921333239178021
tree_depth.max 5
tree_depth.mean 3.0588235294117645
tree_depth.median 3.0
tree_depth.min 0
tree_depth.sd 1.4348601079588785
tree_depth.skewness -0.5738062414925108
tree_depth.var 2.0588235294117645
tree_imbalance.kurtosis -2.1867984670621965
tree_imbalance.max 0.35355339059327373
tree_imbalance.mean 0.19491705385114738
tree_imbalance.median 0.18313230988382748
tree_imbalance.min 0.05985020504366078
tree_imbalance.sd 0.13300709991513865
tree_imbalance.skewness 0.12675317882685808
tree_imbalance.var 0.017690888627835678
tree_shape.kurtosis -0.28142447562999795
tree_shape.max 0.5
tree_shape.mean 0.2708333333333333
tree_shape.median 0.25
tree_shape.min 0.15625
tree_shape.sd 0.10711960313126631
tree_shape.skewness 0.9140413021207706
tree_shape.var 0.011474609375
var.kurtosis -1.721549473595456
var.max 3.116277852348993
var.mean 1.1432392617449665
var.median 0.6333498881431767
var.min 0.189979418344519
var.sd 1.3325463926454557
var.skewness 0.6911005971389304
var.var 1.7756798885524168
var_importance.kurtosis -1.6933440581861985
var_importance.max 0.9226107085346216
var_importance.mean 0.25
var_importance.median 0.03869464573268919
var_importance.min 0.0
var_importance.sd 0.44925548152944056
var_importance.skewness 0.7416243043271853
var_importance.var 0.20183048768424952
w_lambda 0.023438633222267347
worst_node.kurtosis -1.5739109350454201
worst_node.max 0.6666666666666666
worst_node.mean 0.58
worst_node.median 0.6
worst_node.min 0.4666666666666667
worst_node.sd 0.0773001205818937
worst_node.skewness -0.24632978798366398
worst_node.var 0.005975308641975306
Apply quantile as post-processing method
mfe = MFE(features=["cor"], summary=["quantiles"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
cor.quantiles.0 0.11756978413300208
cor.quantiles.1 0.38170447548496433
cor.quantiles.2 0.6231906153010576
cor.quantiles.3 0.8583006134828313
cor.quantiles.4 0.9628654314027961
Apply histogram as post-processing method
mfe = MFE(features=["cor"], summary=["histogram"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
cor.histogram.0 0.16666666666666666
cor.histogram.1 0.0
cor.histogram.2 0.16666666666666666
cor.histogram.3 0.16666666666666666
cor.histogram.4 0.0
cor.histogram.5 0.0
cor.histogram.6 0.0
cor.histogram.7 0.0
cor.histogram.8 0.3333333333333333
cor.histogram.9 0.16666666666666666
Get the default values without summarizing them
mfe = MFE(features=["cor"], summary=None)
mfe.fit(X, y)
ft = mfe.extract()
print(ft)
(['cor'], [array([0.11756978, 0.87175378, 0.4284401 , 0.81794113, 0.36612593,
0.96286543])])
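Since summary=None returns the raw, unsummarized values, any summary can be applied manually afterwards. A minimal sketch, assuming numpy is available, computing the median of the raw correlation values returned above:
import numpy as np

# ft[1][0] holds the raw "cor" values when summary=None
raw_cor = ft[1][0]
print(np.median(raw_cor))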
Total running time of the script: ( 0 minutes 0.304 seconds)
Select specific measures and summaries
To customize the measure extraction, it is necessary to use the features and summary arguments. For instance, the info-theory and statistical groups compute the information-theoretic and the statistical measures, respectively. The following examples illustrate how to run specific measures and summaries:
# Load a dataset
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
Select specific measures and summaries for info-theory
Extracting two information-theoretic measures.
mfe = MFE(groups="all", features=["attr_ent", "joint_ent"],
summary=["median", "min", "max"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_ent.max 2.3156530476978263
attr_ent.median 2.3034401979164256
attr_ent.min 2.186231666502969
joint_ent.max 3.410577680708083
joint_ent.median 2.9901513033202027
joint_ent.min 2.6820037088926547
Select specific measures and summaries for statistical
Extracting three statistical measures.
mfe = MFE(groups="all", features=["can_cor", "cor", "iq_range"],
summary=["median", "min", "max"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
can_cor.max 0.9848208927389822
can_cor.median 0.7280089563896481
can_cor.min 0.47119702004031394
cor.max 0.9628654314027961
cor.median 0.6231906153010576
cor.min 0.11756978413300208
iq_range.max 3.4999999999999996
iq_range.median 1.4000000000000004
iq_range.min 0.5
Select specific measures for both info-theory and statistical
Extracting five measures.
mfe = MFE(groups="all", features=["attr_ent", "joint_ent", "can_cor", "cor", "iq_range"],
summary=["median", "min", "max"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_ent.max 2.3156530476978263
attr_ent.median 2.3034401979164256
attr_ent.min 2.186231666502969
can_cor.max 0.9848208927389822
can_cor.median 0.7280089563896481
can_cor.min 0.47119702004031394
cor.max 0.9628654314027961
cor.median 0.6231906153010576
cor.min 0.11756978413300208
iq_range.max 3.4999999999999996
iq_range.median 1.4000000000000004
iq_range.min 0.5
joint_ent.max 3.410577680708083
joint_ent.median 2.9901513033202027
joint_ent.min 2.6820037088926547
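If you are unsure which names can be passed to the features and summary arguments, the MFE class provides listing helpers (see the "Listing available metafeatures, groups, and summaries" example). A short sketch, assuming those class methods are available:
from pymfe.mfe import MFE

# List the accepted group, meta-feature and summary names
print(MFE.valid_groups())
print(MFE.valid_metafeatures())
print(MFE.valid_summary())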
Total running time of the script: ( 0 minutes 0.090 seconds)
Basics of meta-feature extraction
This example shows how to extract meta-features using the standard configuration.
Extracting meta-features
The standard way to extract meta-features is by using the MFE class. Its parameters are the dataset and the groups of measures to be extracted. By default, it extracts the general, info-theory, statistical, model-based and landmarking measures. For instance:
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
# Load a dataset
data = load_iris()
y = data.target
X = data.data
Extracting default measures
mfe = MFE()
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.20980476831180148
attr_conc.sd 0.1195879817732128
attr_ent.mean 2.2771912775084115
attr_ent.sd 0.06103943244855649
attr_to_inst 0.02666666666666667
best_node.mean 0.6666666666666667
best_node.sd 1.1702778228589004e-16
can_cor.mean 0.7280089563896481
can_cor.sd 0.3631869233645244
cat_to_num 0.0
class_conc.mean 0.27347384133126745
class_conc.sd 0.14091096327223987
class_ent 1.584962500721156
cor.mean 0.594116025760156
cor.sd 0.3375443182856702
cov.mean 0.5966542132736764
cov.sd 0.5582672431248462
eigenvalues.mean 1.1432392617449672
eigenvalues.sd 2.0587713015069764
elite_nn.mean 0.9466666666666667
elite_nn.sd 0.05258737584977435
eq_num_attr 1.8780672345507194
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
g_mean.mean 3.2230731578977903
g_mean.sd 2.0229431040263726
gravity 3.2082811597489393
h_mean.mean 2.9783891110628673
h_mean.sd 2.145948231748242
inst_to_attr 37.5
iq_range.mean 1.7000000000000002
iq_range.sd 1.2754084313139324
joint_ent.mean 3.0182209990602855
joint_ent.sd 0.3821875549207214
kurtosis.mean -0.8105361276250795
kurtosis.sd 0.7326910069728161
leaves 9
leaves_branch.mean 3.7777777777777777
leaves_branch.sd 1.2018504251546631
leaves_corrob.mean 0.1111111111111111
leaves_corrob.sd 0.15051762539834182
leaves_homo.mean 37.46666666666667
leaves_homo.sd 13.142298124757328
leaves_per_class.mean 0.3333333333333333
leaves_per_class.sd 0.22222222222222224
lh_trace 32.477316568194915
linear_discr.mean 0.9800000000000001
linear_discr.sd 0.04499657051403685
mad.mean 1.0934175
mad.sd 0.5785781994035033
max.mean 5.425000000000001
max.sd 2.4431878083083722
mean.mean 3.4645000000000006
mean.sd 1.918485079431164
median.mean 3.6125000000000003
median.sd 1.919364043982624
min.mean 1.8499999999999999
min.sd 1.8083141320025125
mut_inf.mean 0.8439327791692818
mut_inf.sd 0.4222019352579773
naive_bayes.mean 0.9533333333333334
naive_bayes.sd 0.04499657051403685
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.mean 1.6
nodes_per_level.sd 0.8944271909999159
nodes_repeated.mean 2.0
nodes_repeated.sd 1.4142135623730951
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_cor_attr 0.5
nr_disc 2
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
ns_ratio 1.698308838945616
num_to_cat nan
one_nn.mean 0.96
one_nn.sd 0.05621826951410451
p_trace 1.191898822470078
random_node.mean 0.6666666666666667
random_node.sd 1.1702778228589004e-16
range.mean 3.5750000000000006
range.sd 1.6500000000000001
roy_root 32.191925524310506
sd.mean 0.9478670787835934
sd.sd 0.5712994109375844
sd_ratio 1.2708666438750897
skewness.mean 0.06273198447775732
skewness.sd 0.29439896290757683
sparsity.mean 0.0287147773948895
sparsity.sd 0.011032357470087495
t_mean.mean 3.4705555555555554
t_mean.sd 1.9048021402275979
tree_depth.mean 3.0588235294117645
tree_depth.sd 1.4348601079588785
tree_imbalance.mean 0.19491705385114738
tree_imbalance.sd 0.13300709991513865
tree_shape.mean 0.2708333333333333
tree_shape.sd 0.10711960313126631
var.mean 1.1432392617449665
var.sd 1.3325463926454557
var_importance.mean 0.25
var_importance.sd 0.4487534065700905
w_lambda 0.023438633222267347
worst_node.mean 0.5866666666666667
worst_node.sd 0.08195150628704786
Extracting general, statistical and information-theoretic measures
mfe = MFE(groups=["general", "statistical", "info-theory"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.20980476831180148
attr_conc.sd 0.1195879817732128
attr_ent.mean 2.2771912775084115
attr_ent.sd 0.06103943244855649
attr_to_inst 0.02666666666666667
can_cor.mean 0.7280089563896481
can_cor.sd 0.3631869233645244
cat_to_num 0.0
class_conc.mean 0.27347384133126745
class_conc.sd 0.14091096327223987
class_ent 1.584962500721156
cor.mean 0.594116025760156
cor.sd 0.3375443182856702
cov.mean 0.5966542132736764
cov.sd 0.5582672431248462
eigenvalues.mean 1.1432392617449672
eigenvalues.sd 2.0587713015069764
eq_num_attr 1.8780672345507194
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
g_mean.mean 3.2230731578977903
g_mean.sd 2.0229431040263726
gravity 3.2082811597489393
h_mean.mean 2.9783891110628673
h_mean.sd 2.145948231748242
inst_to_attr 37.5
iq_range.mean 1.7000000000000002
iq_range.sd 1.2754084313139324
joint_ent.mean 3.0182209990602855
joint_ent.sd 0.3821875549207214
kurtosis.mean -0.8105361276250795
kurtosis.sd 0.7326910069728161
lh_trace 32.477316568194915
mad.mean 1.0934175
mad.sd 0.5785781994035033
max.mean 5.425000000000001
max.sd 2.4431878083083722
mean.mean 3.4645000000000006
mean.sd 1.918485079431164
median.mean 3.6125000000000003
median.sd 1.919364043982624
min.mean 1.8499999999999999
min.sd 1.8083141320025125
mut_inf.mean 0.8439327791692818
mut_inf.sd 0.4222019352579773
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_cor_attr 0.5
nr_disc 2
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
ns_ratio 1.698308838945616
num_to_cat nan
p_trace 1.191898822470078
range.mean 3.5750000000000006
range.sd 1.6500000000000001
roy_root 32.191925524310506
sd.mean 0.9478670787835934
sd.sd 0.5712994109375844
sd_ratio 1.2708666438750897
skewness.mean 0.06273198447775732
skewness.sd 0.29439896290757683
sparsity.mean 0.0287147773948895
sparsity.sd 0.011032357470087495
t_mean.mean 3.4705555555555554
t_mean.sd 1.9048021402275979
var.mean 1.1432392617449665
var.sd 1.3325463926454557
w_lambda 0.023438633222267347
Extracting all measures
mfe = MFE(groups="all")
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.20980476831180148
attr_conc.sd 0.1195879817732128
attr_ent.mean 2.2771912775084115
attr_ent.sd 0.06103943244855649
attr_to_inst 0.02666666666666667
best_node.mean 0.6666666666666667
best_node.mean.relative 3.0
best_node.sd 1.1702778228589004e-16
best_node.sd.relative 1.0
c1 0.9999999999999998
c2 0.0
can_cor.mean 0.7280089563896481
can_cor.sd 0.3631869233645244
cat_to_num 0.0
ch 487.33087637489984
class_conc.mean 0.27347384133126745
class_conc.sd 0.14091096327223987
class_ent 1.584962500721156
cls_coef 0.2674506351402339
cohesiveness.mean 67.10333333333334
cohesiveness.sd 5.355733510152213
conceptvar.mean 0.495358313970321
conceptvar.sd 0.07796805526728046
cor.mean 0.594116025760156
cor.sd 0.3375443182856702
cov.mean 0.5966542132736764
cov.sd 0.5582672431248462
density 0.8329306487695749
eigenvalues.mean 1.1432392617449672
eigenvalues.sd 2.0587713015069764
elite_nn.mean 0.9466666666666667
elite_nn.mean.relative 4.0
elite_nn.sd 0.06885303726590962
elite_nn.sd.relative 6.0
eq_num_attr 1.8780672345507194
f1.mean 0.2775641932566493
f1.sd 0.2612622587707819
f1v.mean 0.026799629786085716
f1v.sd 0.03377041736533042
f2.mean 0.0063817663817663794
f2.sd 0.011053543615254369
f3.mean 0.12333333333333334
f3.sd 0.21361959960016152
f4.mean 0.043333333333333335
f4.sd 0.07505553499465135
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
g_mean.mean 3.2230731578977903
g_mean.sd 2.0229431040263726
gravity 3.2082811597489393
h_mean.mean 2.9783891110628673
h_mean.sd 2.145948231748242
hubs.mean 0.7822257352122133
hubs.sd 0.3198336185970707
impconceptvar.mean 42.61
impconceptvar.sd 5.354503216731368
inst_to_attr 37.5
int 3.322592586185653
iq_range.mean 1.7000000000000002
iq_range.sd 1.2754084313139324
joint_ent.mean 3.0182209990602855
joint_ent.sd 0.3821875549207214
kurtosis.mean -0.8105361276250795
kurtosis.sd 0.7326910069728161
l1.mean 0.004328602623988265
l1.sd 0.007497359670523635
l2.mean 0.013333333333333345
l2.sd 0.023094010767585053
l3.mean 0.0
l3.sd 0.0
leaves 9
leaves_branch.mean 3.7777777777777777
leaves_branch.sd 1.2018504251546631
leaves_corrob.mean 0.1111111111111111
leaves_corrob.sd 0.15051762539834182
leaves_homo.mean 37.46666666666667
leaves_homo.sd 13.142298124757328
leaves_per_class.mean 0.3333333333333333
leaves_per_class.sd 0.22222222222222224
lh_trace 32.477316568194915
linear_discr.mean 0.9800000000000001
linear_discr.mean.relative 7.0
linear_discr.sd 0.04499657051403685
linear_discr.sd.relative 2.5
lsc 0.8166666666666667
mad.mean 1.0934175
mad.sd 0.5785781994035033
max.mean 5.425000000000001
max.sd 2.4431878083083722
mean.mean 3.4645000000000006
mean.sd 1.918485079431164
median.mean 3.6125000000000003
median.sd 1.919364043982624
min.mean 1.8499999999999999
min.sd 1.8083141320025125
mut_inf.mean 0.8439327791692818
mut_inf.sd 0.4222019352579773
n1 0.10666666666666667
n2.mean 0.19814444191641126
n2.sd 0.14669333921747651
n3.mean 0.06
n3.sd 0.2382824447791588
n4.mean 0.013333333333333334
n4.sd 0.11508191810497582
naive_bayes.mean 0.9533333333333334
naive_bayes.mean.relative 5.0
naive_bayes.sd 0.04499657051403685
naive_bayes.sd.relative 2.5
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.mean 1.6
nodes_per_level.sd 0.8944271909999159
nodes_repeated.mean 2.6666666666666665
nodes_repeated.sd 1.5275252316519465
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_cor_attr 0.5
nr_disc 2
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
nre 1.0986122886681096
ns_ratio 1.698308838945616
num_to_cat nan
one_itemset.mean 0.2
one_itemset.sd 0.04993563108104261
one_nn.mean 0.96
one_nn.mean.relative 6.0
one_nn.sd 0.05621826951410451
one_nn.sd.relative 4.0
p_trace 1.191898822470078
pb -0.68004959585269
random_node.mean 0.5333333333333334
random_node.mean.relative 1.0
random_node.sd 0.06285393610547088
random_node.sd.relative 5.0
range.mean 3.5750000000000006
range.sd 1.6500000000000001
roy_root 32.191925524310506
sc 0
sd.mean 0.9478670787835934
sd.sd 0.5712994109375844
sd_ratio 1.2708666438750897
sil 0.503477440693296
skewness.mean 0.06273198447775732
skewness.sd 0.29439896290757683
sparsity.mean 0.0287147773948895
sparsity.sd 0.011032357470087495
t1.mean 0.007092198581560285
t1.sd 0.002283518026238616
t2 0.02666666666666667
t3 0.013333333333333334
t4 0.5
t_mean.mean 3.4705555555555554
t_mean.sd 1.9048021402275979
tree_depth.mean 3.0588235294117645
tree_depth.sd 1.4348601079588785
tree_imbalance.mean 0.19491705385114738
tree_imbalance.sd 0.13300709991513865
tree_shape.mean 0.2708333333333333
tree_shape.sd 0.10711960313126631
two_itemset.mean 0.32
two_itemset.sd 0.0851125499534728
var.mean 1.1432392617449665
var.sd 1.3325463926454557
var_importance.mean 0.25
var_importance.sd 0.44925548152944056
vdb 0.7513707094756737
vdu 2.3392212858877218e-05
w_lambda 0.023438633222267347
wg_dist.mean 0.4620901765870531
wg_dist.sd 0.05612193762635788
worst_node.mean 0.6000000000000001
worst_node.mean.relative 2.0
worst_node.sd 0.0831479419283098
worst_node.sd.relative 7.0
Changing the summarization function
Several measures return more than one value. To aggregate them, summarization (post-processing) functions can be used: it is possible to compute the min, max, mean, median, kurtosis, standard deviation, among others. The default summaries are the mean and the sd. For instance:
Compute default measures using min, median and max
mfe = MFE(summary=["min", "median", "max"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.max 0.4299566853449739
attr_conc.median 0.18467386404867223
attr_conc.min 0.08478331361536394
attr_ent.max 2.3156530476978263
attr_ent.median 2.3034401979164256
attr_ent.min 2.186231666502969
attr_to_inst 0.02666666666666667
best_node.max 0.6666666666666666
best_node.median 0.6666666666666666
best_node.min 0.6666666666666666
can_cor.max 0.9848208927389822
can_cor.median 0.7280089563896481
can_cor.min 0.47119702004031394
cat_to_num 0.0
class_conc.max 0.4011425322248528
class_conc.median 0.28650664619878463
class_conc.min 0.11973954070264788
class_ent 1.584962500721156
cor.max 0.9628654314027961
cor.median 0.6231906153010576
cor.min 0.11756978413300208
cov.max 1.2956093959731547
cov.median 0.4229635346756151
cov.min 0.042434004474272924
eigenvalues.max 4.228241706034867
eigenvalues.median 0.1604401239857764
eigenvalues.min 0.023835092973449445
elite_nn.max 1.0
elite_nn.median 0.9333333333333333
elite_nn.min 0.8666666666666667
eq_num_attr 1.8780672345507194
freq_class.max 0.3333333333333333
freq_class.median 0.3333333333333333
freq_class.min 0.3333333333333333
g_mean.max 5.785720390427728
g_mean.median 3.1324323471229167
g_mean.min 0.8417075469176013
gravity 3.2082811597489393
h_mean.max 5.728905057850834
h_mean.median 2.8449903044543063
h_mean.min 0.49467077749202265
inst_to_attr 37.5
iq_range.max 3.4999999999999996
iq_range.median 1.4000000000000004
iq_range.min 0.5
joint_ent.max 3.410577680708083
joint_ent.median 2.9901513033202027
joint_ent.min 2.6820037088926547
kurtosis.max 0.13870467668072406
kurtosis.median -0.9819958777250918
kurtosis.min -1.4168574317308589
leaves 9
leaves_branch.max 5
leaves_branch.median 4.0
leaves_branch.min 1
leaves_corrob.max 0.3333333333333333
leaves_corrob.median 0.013333333333333334
leaves_corrob.min 0.006666666666666667
leaves_homo.max 57.6
leaves_homo.median 36.0
leaves_homo.min 18.0
leaves_per_class.max 0.5555555555555556
leaves_per_class.median 0.3333333333333333
leaves_per_class.min 0.1111111111111111
lh_trace 32.477316568194915
linear_discr.max 1.0
linear_discr.median 1.0
linear_discr.min 0.8666666666666667
mad.max 1.8532499999999998
mad.median 1.03782
mad.min 0.44477999999999973
max.max 7.9
max.median 5.65
max.min 2.5
mean.max 5.843333333333334
mean.median 3.407666666666667
mean.min 1.1993333333333336
median.max 5.8
median.median 3.675
median.min 1.3
min.max 4.3
min.median 1.5
min.min 0.1
mut_inf.max 1.2015788914374017
mut_inf.median 0.9067678693618417
mut_inf.min 0.36061648651604195
naive_bayes.max 1.0
naive_bayes.median 0.9333333333333333
naive_bayes.min 0.8666666666666667
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.max 3
nodes_per_level.median 1.0
nodes_per_level.min 1
nodes_repeated.max 4
nodes_repeated.median 3.0
nodes_repeated.min 1
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_cor_attr 0.5
nr_disc 2
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
ns_ratio 1.698308838945616
num_to_cat nan
one_nn.max 1.0
one_nn.median 1.0
one_nn.min 0.8666666666666667
p_trace 1.191898822470078
random_node.max 0.6666666666666666
random_node.median 0.6333333333333333
random_node.min 0.5333333333333333
range.max 5.9
range.median 3.0000000000000004
range.min 2.4
roy_root 32.191925524310506
sd.max 1.7652982332594662
sd.median 0.7951518984691048
sd.min 0.4358662849366982
sd_ratio 1.2708666438750897
skewness.max 0.3126147039228578
skewness.median 0.10386208214673759
skewness.min -0.26941093030530366
sparsity.max 0.039048200122025624
sparsity.median 0.029555212805869355
sparsity.min 0.016700483845793663
t_mean.max 5.797777777777777
t_mean.median 3.4411111111111112
t_mean.min 1.2022222222222223
tree_depth.max 5
tree_depth.median 3.0
tree_depth.min 0
tree_imbalance.max 0.35355339059327373
tree_imbalance.median 0.18313230988382748
tree_imbalance.min 0.05985020504366078
tree_shape.max 0.5
tree_shape.median 0.25
tree_shape.min 0.15625
var.max 3.116277852348993
var.median 0.6333498881431767
var.min 0.189979418344519
var_importance.max 0.9226107085346216
var_importance.median 0.03869464573268919
var_importance.min 0.0
w_lambda 0.023438633222267347
worst_node.max 0.6666666666666666
worst_node.median 0.6
worst_node.min 0.4666666666666667
Compute default measures using quantile
mfe = MFE(summary=["quantiles"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.quantiles.0 0.08478331361536394
attr_conc.quantiles.1 0.12024175822191727
attr_conc.quantiles.2 0.18467386404867223
attr_conc.quantiles.3 0.2586088218989373
attr_conc.quantiles.4 0.4299566853449739
attr_ent.quantiles.0 2.186231666502969
attr_ent.quantiles.1 2.2705229913324176
attr_ent.quantiles.2 2.3034401979164256
attr_ent.quantiles.3 2.3101084840924195
attr_ent.quantiles.4 2.3156530476978263
attr_to_inst 0.02666666666666667
best_node.quantiles.0 0.6666666666666666
best_node.quantiles.1 0.6666666666666666
best_node.quantiles.2 0.6666666666666666
best_node.quantiles.3 0.6666666666666666
best_node.quantiles.4 0.6666666666666666
can_cor.quantiles.0 0.47119702004031394
can_cor.quantiles.1 0.599602988214981
can_cor.quantiles.2 0.7280089563896481
can_cor.quantiles.3 0.8564149245643151
can_cor.quantiles.4 0.9848208927389822
cat_to_num 0.0
class_conc.quantiles.0 0.11973954070264788
class_conc.quantiles.1 0.17114963295652744
class_conc.quantiles.2 0.28650664619878463
class_conc.quantiles.3 0.3888308545735246
class_conc.quantiles.4 0.4011425322248528
class_ent 1.584962500721156
cor.quantiles.0 0.11756978413300208
cor.quantiles.1 0.38170447548496433
cor.quantiles.2 0.6231906153010576
cor.quantiles.3 0.8583006134828313
cor.quantiles.4 0.9628654314027961
cov.quantiles.0 0.042434004474272924
cov.quantiles.1 0.17364362416107382
cov.quantiles.2 0.4229635346756151
cov.quantiles.3 1.0848042505592843
cov.quantiles.4 1.2956093959731547
eigenvalues.quantiles.0 0.023835092973449445
eigenvalues.quantiles.1 0.06461589827555177
eigenvalues.quantiles.2 0.1604401239857764
eigenvalues.quantiles.3 1.2390634874551918
eigenvalues.quantiles.4 4.228241706034867
elite_nn.quantiles.0 0.8
elite_nn.quantiles.1 0.8833333333333333
elite_nn.quantiles.2 0.9333333333333333
elite_nn.quantiles.3 1.0
elite_nn.quantiles.4 1.0
eq_num_attr 1.8780672345507194
freq_class.quantiles.0 0.3333333333333333
freq_class.quantiles.1 0.3333333333333333
freq_class.quantiles.2 0.3333333333333333
freq_class.quantiles.3 0.3333333333333333
freq_class.quantiles.4 0.3333333333333333
g_mean.quantiles.0 0.8417075469176013
g_mean.quantiles.1 2.4803752740295706
g_mean.quantiles.2 3.1324323471229167
g_mean.quantiles.3 3.875130230991137
g_mean.quantiles.4 5.785720390427728
gravity 3.2082811597489393
h_mean.quantiles.0 0.49467077749202265
h_mean.quantiles.1 2.1442918448667623
h_mean.quantiles.2 2.8449903044543063
h_mean.quantiles.3 3.679087570650412
h_mean.quantiles.4 5.728905057850834
inst_to_attr 37.5
iq_range.quantiles.0 0.5
iq_range.quantiles.1 1.1000000000000005
iq_range.quantiles.2 1.4000000000000004
iq_range.quantiles.3 2.0
iq_range.quantiles.4 3.4999999999999996
joint_ent.quantiles.0 2.6820037088926547
joint_ent.quantiles.1 2.694685139463199
joint_ent.quantiles.2 2.9901513033202027
joint_ent.quantiles.3 3.3136871629172893
joint_ent.quantiles.4 3.410577680708083
kurtosis.quantiles.0 -1.4168574317308589
kurtosis.quantiles.1 -1.3728487733842385
kurtosis.quantiles.2 -0.9819958777250918
kurtosis.quantiles.3 -0.4196832319659328
kurtosis.quantiles.4 0.13870467668072406
leaves 9
leaves_branch.quantiles.0 1.0
leaves_branch.quantiles.1 4.0
leaves_branch.quantiles.2 4.0
leaves_branch.quantiles.3 4.0
leaves_branch.quantiles.4 5.0
leaves_corrob.quantiles.0 0.006666666666666667
leaves_corrob.quantiles.1 0.006666666666666667
leaves_corrob.quantiles.2 0.013333333333333334
leaves_corrob.quantiles.3 0.2866666666666667
leaves_corrob.quantiles.4 0.3333333333333333
leaves_homo.quantiles.0 18.0
leaves_homo.quantiles.1 36.0
leaves_homo.quantiles.2 36.0
leaves_homo.quantiles.3 36.0
leaves_homo.quantiles.4 57.6
leaves_per_class.quantiles.0 0.1111111111111111
leaves_per_class.quantiles.1 0.2222222222222222
leaves_per_class.quantiles.2 0.3333333333333333
leaves_per_class.quantiles.3 0.4444444444444444
leaves_per_class.quantiles.4 0.5555555555555556
lh_trace 32.477316568194915
linear_discr.quantiles.0 0.8666666666666667
linear_discr.quantiles.1 1.0
linear_discr.quantiles.2 1.0
linear_discr.quantiles.3 1.0
linear_discr.quantiles.4 1.0
mad.quantiles.0 0.44477999999999973
mad.quantiles.1 0.8895599999999999
mad.quantiles.2 1.03782
mad.quantiles.3 1.2416775000000002
mad.quantiles.4 1.8532499999999998
max.quantiles.0 2.5
max.quantiles.1 3.9250000000000003
max.quantiles.2 5.65
max.quantiles.3 7.15
max.quantiles.4 7.9
mean.quantiles.0 1.1993333333333336
mean.quantiles.1 2.5928333333333335
mean.quantiles.2 3.407666666666667
mean.quantiles.3 4.279333333333334
mean.quantiles.4 5.843333333333334
median.quantiles.0 1.3
median.quantiles.1 2.575
median.quantiles.2 3.675
median.quantiles.3 4.7124999999999995
median.quantiles.4 5.8
min.quantiles.0 0.1
min.quantiles.1 0.775
min.quantiles.2 1.5
min.quantiles.3 2.575
min.quantiles.4 4.3
mut_inf.quantiles.0 0.36061648651604195
mut_inf.quantiles.1 0.5545730402029787
mut_inf.quantiles.2 0.9067678693618417
mut_inf.quantiles.3 1.196127608328145
mut_inf.quantiles.4 1.2015788914374017
naive_bayes.quantiles.0 0.8666666666666667
naive_bayes.quantiles.1 0.9333333333333333
naive_bayes.quantiles.2 0.9333333333333333
naive_bayes.quantiles.3 1.0
naive_bayes.quantiles.4 1.0
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.quantiles.0 1.0
nodes_per_level.quantiles.1 1.0
nodes_per_level.quantiles.2 1.0
nodes_per_level.quantiles.3 2.0
nodes_per_level.quantiles.4 3.0
nodes_repeated.quantiles.0 1.0
nodes_repeated.quantiles.1 2.0
nodes_repeated.quantiles.2 3.0
nodes_repeated.quantiles.3 3.5
nodes_repeated.quantiles.4 4.0
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_cor_attr 0.5
nr_disc 2
nr_inst 150
nr_norm 1.0
nr_num 4
nr_outliers 1
ns_ratio 1.698308838945616
num_to_cat nan
one_nn.quantiles.0 0.8666666666666667
one_nn.quantiles.1 0.9333333333333333
one_nn.quantiles.2 1.0
one_nn.quantiles.3 1.0
one_nn.quantiles.4 1.0
p_trace 1.191898822470078
random_node.quantiles.0 0.5333333333333333
random_node.quantiles.1 0.6
random_node.quantiles.2 0.6333333333333333
random_node.quantiles.3 0.6666666666666666
random_node.quantiles.4 0.6666666666666666
range.quantiles.0 2.4
range.quantiles.1 2.4000000000000004
range.quantiles.2 3.0000000000000004
range.quantiles.3 4.175000000000001
range.quantiles.4 5.9
roy_root 32.191925524310506
sd.quantiles.0 0.4358662849366982
sd.quantiles.1 0.6806448229544344
sd.quantiles.2 0.7951518984691048
sd.quantiles.3 1.0623741542982639
sd.quantiles.4 1.7652982332594662
sd_ratio 1.2708666438750897
skewness.quantiles.0 -0.26941093030530366
skewness.quantiles.1 -0.14304015654817034
skewness.quantiles.2 0.10386208214673759
skewness.quantiles.3 0.30963422317266526
skewness.quantiles.4 0.3126147039228578
sparsity.quantiles.0 0.016700483845793663
sparsity.quantiles.1 0.020713951258667974
sparsity.quantiles.2 0.029555212805869355
sparsity.quantiles.3 0.03755603894209088
sparsity.quantiles.4 0.039048200122025624
t_mean.quantiles.0 1.2022222222222223
t_mean.quantiles.1 2.5805555555555553
t_mean.quantiles.2 3.4411111111111112
t_mean.quantiles.3 4.331111111111111
t_mean.quantiles.4 5.797777777777777
tree_depth.quantiles.0 0.0
tree_depth.quantiles.1 2.0
tree_depth.quantiles.2 3.0
tree_depth.quantiles.3 4.0
tree_depth.quantiles.4 5.0
tree_imbalance.quantiles.0 0.05985020504366078
tree_imbalance.quantiles.1 0.10093168031135315
tree_imbalance.quantiles.2 0.18313230988382748
tree_imbalance.quantiles.3 0.2771176834236217
tree_imbalance.quantiles.4 0.35355339059327373
tree_shape.quantiles.0 0.15625
tree_shape.quantiles.1 0.25
tree_shape.quantiles.2 0.25
tree_shape.quantiles.3 0.25
tree_shape.quantiles.4 0.5
var.quantiles.0 0.189979418344519
var.quantiles.1 0.4832495525727069
var.quantiles.2 0.6333498881431767
var.quantiles.3 1.2933395973154362
var.quantiles.4 3.116277852348993
var_importance.quantiles.0 0.0
var_importance.quantiles.1 0.009999999999999997
var_importance.quantiles.2 0.2179720209339774
var_importance.quantiles.3 0.45797202093397743
var_importance.quantiles.4 0.5640559581320451
w_lambda 0.023438633222267347
worst_node.quantiles.0 0.4666666666666667
worst_node.quantiles.1 0.6
worst_node.quantiles.2 0.6333333333333333
worst_node.quantiles.3 0.6666666666666666
worst_node.quantiles.4 0.6666666666666666
Total running time of the script: ( 0 minutes 1.418 seconds)
Extracting meta-features by group
In this example, we will show you how to select different meta-feature groups.
# Load a dataset
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
General
These are the simplest measures, extracting general properties of the dataset. For instance, nr_attr and nr_class are the total number of attributes and the number of output values (classes) in the dataset, respectively. The following examples illustrate these measures:
Extract all general measures
mfe = MFE(groups=["general"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_to_inst 0.02666666666666667
cat_to_num 0.0
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
inst_to_attr 37.5
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_inst 150
nr_num 4
num_to_cat nan
Extract only two general measures
mfe = MFE(features=["nr_attr", "nr_class"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
nr_attr 4
nr_class 3
Statistical
Statistical meta-features are standard statistical measures that describe the numerical properties of a distribution of data. Since they require numerical attributes, categorical data are transformed to numerical values. For instance, cor and skewness are the absolute correlation between each pair of attributes and the skewness of the numeric attributes in the dataset, respectively. The following examples illustrate these measures:
Extract all statistical measures
mfe = MFE(groups=["statistical"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
can_cor.mean 0.7280089563896481
can_cor.sd 0.3631869233645244
cor.mean 0.594116025760156
cor.sd 0.3375443182856702
cov.mean 0.5966542132736764
cov.sd 0.5582672431248462
eigenvalues.mean 1.1432392617449672
eigenvalues.sd 2.0587713015069764
g_mean.mean 3.2230731578977903
g_mean.sd 2.0229431040263726
gravity 3.2082811597489393
h_mean.mean 2.9783891110628673
h_mean.sd 2.145948231748242
iq_range.mean 1.7000000000000002
iq_range.sd 1.2754084313139324
kurtosis.mean -0.8105361276250795
kurtosis.sd 0.7326910069728161
lh_trace 32.477316568194915
mad.mean 1.0934175
mad.sd 0.5785781994035033
max.mean 5.425000000000001
max.sd 2.4431878083083722
mean.mean 3.4645000000000006
mean.sd 1.918485079431164
median.mean 3.6125000000000003
median.sd 1.919364043982624
min.mean 1.8499999999999999
min.sd 1.8083141320025125
nr_cor_attr 0.5
nr_disc 2
nr_norm 1.0
nr_outliers 1
p_trace 1.191898822470078
range.mean 3.5750000000000006
range.sd 1.6500000000000001
roy_root 32.191925524310506
sd.mean 0.9478670787835934
sd.sd 0.5712994109375844
sd_ratio 1.2708666438750897
skewness.mean 0.06273198447775732
skewness.sd 0.29439896290757683
sparsity.mean 0.0287147773948895
sparsity.sd 0.011032357470087495
t_mean.mean 3.4705555555555554
t_mean.sd 1.9048021402275979
var.mean 1.1432392617449665
var.sd 1.3325463926454557
w_lambda 0.023438633222267347
Extract only two statistical measures
mfe = MFE(features=["can_cor", "cor", "iq_range"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
can_cor.mean 0.7280089563896481
can_cor.sd 0.3631869233645244
cor.mean 0.594116025760156
cor.sd 0.3375443182856702
iq_range.mean 1.7000000000000002
iq_range.sd 1.2754084313139324
Information theory
Information-theoretic meta-features are particularly appropriate to describe discrete (categorical) attributes, but they also fit continuous ones through a discretization process. For instance, class_ent and mut_inf are the entropy of the class and the mutual information shared between each attribute and the class, respectively. The following examples illustrate these measures:
Extract all info-theory measures
mfe = MFE(groups=["info-theory"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.20980476831180148
attr_conc.sd 0.1195879817732128
attr_ent.mean 2.2771912775084115
attr_ent.sd 0.06103943244855649
class_conc.mean 0.27347384133126745
class_conc.sd 0.14091096327223987
class_ent 1.584962500721156
eq_num_attr 1.8780672345507194
joint_ent.mean 3.0182209990602855
joint_ent.sd 0.3821875549207214
mut_inf.mean 0.8439327791692818
mut_inf.sd 0.4222019352579773
ns_ratio 1.698308838945616
Extract only two info-theo measures
mfe = MFE(features=["class_ent", "mut_inf"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
class_ent 1.584962500721156
mut_inf.mean 0.8439327791692818
mut_inf.sd 0.4222019352579773
Model-based
These measures describe characteristics of models induced from the data. They can include, for example, the description of the decision tree induced for a dataset, such as its number of leaves (leaves) and its number of nodes (nodes). The following examples illustrate these measures:
Extract all model-based measures
mfe = MFE(groups=["model-based"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
leaves 9
leaves_branch.mean 3.7777777777777777
leaves_branch.sd 1.2018504251546631
leaves_corrob.mean 0.1111111111111111
leaves_corrob.sd 0.15051762539834182
leaves_homo.mean 37.46666666666667
leaves_homo.sd 13.142298124757328
leaves_per_class.mean 0.3333333333333333
leaves_per_class.sd 0.22222222222222224
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.mean 1.6
nodes_per_level.sd 0.8944271909999159
nodes_repeated.mean 2.6666666666666665
nodes_repeated.sd 1.5275252316519465
tree_depth.mean 3.0588235294117645
tree_depth.sd 1.4348601079588785
tree_imbalance.mean 0.19491705385114738
tree_imbalance.sd 0.13300709991513865
tree_shape.mean 0.2708333333333333
tree_shape.sd 0.10711960313126631
var_importance.mean 0.25
var_importance.sd 0.44925548152944056
Extract only two model-based measures
mfe = MFE(features=["leaves", "nodes"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
leaves 9
nodes 8
Landmarking
Landmarking measures run simple and fast algorithms, from which performance characteristics are extracted. They include the performance of simple and efficient learning algorithms such as Naive Bayes (naive_bayes) and 1-Nearest Neighbor (one_nn). The following examples illustrate these measures:
Extract all landmarking measures
mfe = MFE(groups=["landmarking"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
best_node.mean 0.6666666666666667
best_node.sd 1.1702778228589004e-16
elite_nn.mean 0.9400000000000001
elite_nn.sd 0.05837300238472753
linear_discr.mean 0.9800000000000001
linear_discr.sd 0.04499657051403685
naive_bayes.mean 0.9533333333333334
naive_bayes.sd 0.04499657051403685
one_nn.mean 0.96
one_nn.sd 0.05621826951410451
random_node.mean 0.6666666666666667
random_node.sd 1.1702778228589004e-16
worst_node.mean 0.5533333333333333
worst_node.sd 0.0773001205818937
Extract only two landmarking measures
mfe = MFE(features=["one_nn", "naive_bayes"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
naive_bayes.mean 0.9533333333333334
naive_bayes.sd 0.04499657051403685
one_nn.mean 0.96
one_nn.sd 0.05621826951410451
Relative Landmarking
Relative landmarking measures are also based on the performance of simple and fast algorithms, but instead of the raw performance values they return a rank of the landmarkers; in the example below, the best-performing landmarker receives the highest rank and the worst the lowest.
Extract all relative landmarking measures
mfe = MFE(groups=["relative"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
best_node.mean.relative 2.5
best_node.sd.relative 1.5
elite_nn.mean.relative 4.0
elite_nn.sd.relative 6.5
linear_discr.mean.relative 7.0
linear_discr.sd.relative 3.5
naive_bayes.mean.relative 5.0
naive_bayes.sd.relative 3.5
one_nn.mean.relative 6.0
one_nn.sd.relative 5.0
random_node.mean.relative 2.5
random_node.sd.relative 1.5
worst_node.mean.relative 1.0
worst_node.sd.relative 6.5
Subsampling Landmarking
Subsampling landmarking measures are computed like the landmarking measures, but the performance is estimated on a subsample of the dataset.
Extract all subsampling landmarking measures
mfe = MFE(groups=["landmarking"], lm_sample_frac=0.7)
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
best_node.mean 0.6754545454545454
best_node.sd 0.051967709123569066
elite_nn.mean 0.9045454545454547
elite_nn.sd 0.0902246965512691
linear_discr.mean 0.990909090909091
linear_discr.sd 0.02874797872880346
naive_bayes.mean 0.9427272727272727
naive_bayes.sd 0.06542227038166469
one_nn.mean 0.9627272727272727
one_nn.sd 0.04819039374799481
random_node.mean 0.4445454545454545
random_node.sd 0.10518687729554597
worst_node.mean 0.6109090909090907
worst_node.sd 0.08395634879634749
Clustering
Clustering measures are based on clustering algorithms and on clustering correlation and dissimilarity measures.
Extract all clustering based measures
mfe = MFE(groups=["clustering"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
ch 487.33087637489984
int 3.322592586185653
nre 1.0986122886681096
pb -0.68004959585269
sc 0
sil 0.503477440693296
vdb 0.7513707094756737
vdu 2.3392212858877218e-05
Concept
Concept measures estimate the variability of class labels among examples and the density of the examples.
Extract all concept measures
mfe = MFE(groups=["concept"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
cohesiveness.mean 67.10333333333334
cohesiveness.sd 5.355733510152213
conceptvar.mean 0.495358313970321
conceptvar.sd 0.07796805526728046
impconceptvar.mean 42.61
impconceptvar.sd 5.354503216731368
wg_dist.mean 0.4620901765870531
wg_dist.sd 0.05612193762635788
Itemset
The itemset measures compute the correlation between binary attributes.
Extract all itemset measures
mfe = MFE(groups=["itemset"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
one_itemset.mean 0.2
one_itemset.sd 0.04993563108104261
two_itemset.mean 0.32
two_itemset.sd 0.0851125499534728
Complexity
The complexity measures estimate the difficulty in separating the data points into their expected classes.
Extract all complexity measures
mfe = MFE(groups=["complexity"])
mfe.fit(X, y)
ft = mfe.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
c1 0.9999999999999998
c2 0.0
cls_coef 0.2674506351402339
density 0.8329306487695749
f1.mean 0.2775641932566493
f1.sd 0.2612622587707819
f1v.mean 0.026799629786085716
f1v.sd 0.03377041736533042
f2.mean 0.0063817663817663794
f2.sd 0.011053543615254369
f3.mean 0.12333333333333334
f3.sd 0.21361959960016152
f4.mean 0.043333333333333335
f4.sd 0.07505553499465135
hubs.mean 0.7822257352122133
hubs.sd 0.3198336185970707
l1.mean 0.004338258439810357
l1.sd 0.007514084034116028
l2.mean 0.013333333333333345
l2.sd 0.023094010767585053
l3.mean 0.003333333333333336
l3.sd 0.005773502691896263
lsc 0.8166666666666667
n1 0.10666666666666667
n2.mean 0.19814444191641126
n2.sd 0.14669333921747651
n3.mean 0.06
n3.sd 0.2382824447791588
n4.mean 0.0
n4.sd 0.0
t1.mean 0.007092198581560285
t1.sd 0.002283518026238616
t2 0.02666666666666667
t3 0.013333333333333334
t4 0.5
Total running time of the script: ( 0 minutes 0.720 seconds)
Advanced Examples
These examples show how to use some advanced configurations and tricks to make coding more comfortable.
Customizing measure arguments
In this example, we will show you how to customize the measure arguments.
# Load a dataset
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
Custom Arguments
It is possible to pass custom arguments to every meta-feature through the kwargs of the PyMFE extract method. Each keyword must be the target meta-feature name, and its value must be a dictionary in the format {argument: value}, i.e., each key in the dictionary is a target argument with its respective value. In the example below, the extraction of the meta-features min and max happens as usual, but the meta-features sd, nr_norm and nr_cor_attr receive custom argument values, which affect their results.
# Extract measures with custom user arguments
mfe = MFE(features=["sd", "nr_norm", "nr_cor_attr", "min", "max"])
mfe.fit(X, y)
ft = mfe.extract(
sd={"ddof": 0},
nr_norm={"method": "all", "failure": "hard", "threshold": 0.025},
nr_cor_attr={"threshold": 0.6},
)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
max.mean 5.425000000000001
max.sd 2.1158627082114756
min.mean 1.8499999999999999
min.sd 1.5660459763365826
nr_cor_attr 0.5
nr_norm 1.0
sd.mean 0.9447022382995245
sd.sd 0.4931078458294242
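To see the effect of a custom argument in isolation, here is a small sketch, assuming the same X and y loaded above, that compares the default sd extraction with the ddof=0 variant used in this example:
# Extraction with the default arguments
mfe = MFE(features=["sd"])
mfe.fit(X, y)
print(mfe.extract())

# Extraction with a custom ddof value, as above
mfe = MFE(features=["sd"])
mfe.fit(X, y)
print(mfe.extract(sd={"ddof": 0}))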
Total running time of the script: ( 0 minutes 0.011 seconds)
Meta-feature confidence interval
In this example, we will show you how to extract meta-features with confidence intervals.
Began the metafeature extraction with confidence intervals process.
Now extracting metafeatures from original sample.
Done extracting metafeatures from original sample (total of 7 metafeatures).
Started data resampling with bootstrap with the following configurations:
| Total data resamples: 256
| Confidence levels used: [0.99] (total of 1).
. Random seeds:
| For extractor model: 1234
. For bootstrapping: None
Now extracting metafeatures from resampled data.
Done extracting metafeatures from resampled data.
Finished data resampling with bootstrap.
Now calculating confidence intervals... Done.
max.mean 5.425000000000001 5.600000000000001
max.sd 2.3630154293776005 2.5506030820292405
mean.mean 3.3116583333333347 3.601854166666668
mean.sd 1.8734800708370076 1.9621468909848063
nr_cor_attr 0.16666666666666674 0.5
sd.mean 0.8874615748303925 1.012106432730215
sd.sd 0.5227226500340972 0.6299378732590926
# Load a dataset
import sklearn.tree
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
# You can also extract your meta-features with confidence intervals using
# bootstrap. Keep in mind that this method extracts each meta-feature several
# times, and may be very expensive depending mainly on your data and the
# number of meta-feature extract methods called.
# Extract meta-features with confidence interval
mfe = MFE(features=["mean", "nr_cor_attr", "sd", "max"])
mfe.fit(X, y)
ft = mfe.extract_with_confidence(
sample_num=256,
confidence=0.99,
verbose=1,
)
print("\n".join("{:50} {:30} {:30}".format(x, y[0], y[1])
for x, y in zip(ft[0], ft[2])))
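For convenience, the meta-feature names can be paired with their confidence bounds in a dictionary. A minimal sketch, assuming each entry of ft[2] holds the lower and upper bounds as printed above:
# Map each meta-feature name to its (lower, upper) confidence bounds
intervals = dict(zip(ft[0], ft[2]))
lower, upper = intervals["sd.mean"]
print(lower, upper)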
Total running time of the script: ( 0 minutes 1.285 seconds)
Miscellaneous Examples
Miscellaneous examples for the pymfe package.
Listing available metafeatures, groups, and summaries
Plotting elapsed time in a meta-feature extraction
Extracting a large number of metafeatures
In this example, we will extract all possible metafeatures from the Iris dataset.
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
# Load a dataset
data = load_iris()
y = data.target
X = data.data
Using the standard parameters, we get only a subset of the metafeatures: the ones most commonly used in the community.
mfe = MFE()
mfe.fit(X, y)
ft = mfe.extract()
print(len(ft[0]))
111
Using the value all, you can extract all available metafeatures. For this, set both the groups and summary arguments to "all".
mfe = MFE(groups="all", summary="all")
mfe.fit(X, y)
ft = mfe.extract()
print(len(ft[0]))
3988
Note
Be careful when using all the metafeatures: you can bring the curse of dimensionality to the meta-level.
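To make this warning concrete, here is a sketch, assuming a couple of sklearn toy datasets, of how quickly the meta-level dimensionality grows: each dataset contributes a single row to a meta-dataset, but with groups="all" and summary="all" that row has thousands of columns:
from sklearn.datasets import load_iris, load_wine
from pymfe.mfe import MFE

# Each dataset becomes one meta-example with thousands of meta-features
for loader in (load_iris, load_wine):
    data = loader()
    mfe = MFE(groups="all", summary="all")
    mfe.fit(data.data, data.target)
    names, values = mfe.extract()
    print(loader.__name__, len(names))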
Total running time of the script: ( 0 minutes 0.902 seconds)
Metafeature description
In this example, we will show you how to list the available metafeatures and their descriptions.
from pymfe.mfe import MFE
This function shows the description of all metafeatures.
MFE.metafeature_description()
+-------------+-------------------+--------------------------------------------+
| Group | Meta-feature name | Description |
+=============+===================+============================================+
| clustering | ch | Compute the Calinski and Harabasz index. |
+-------------+-------------------+--------------------------------------------+
| clustering | int | Compute the INT index. |
+-------------+-------------------+--------------------------------------------+
| clustering | nre | Compute the normalized relative entropy. |
+-------------+-------------------+--------------------------------------------+
| clustering | pb | Compute the pearson correlation between |
| | | class matching and instance distances. |
+-------------+-------------------+--------------------------------------------+
| clustering | sc | Compute the number of clusters with size |
| | | smaller than a given size. |
+-------------+-------------------+--------------------------------------------+
| clustering | sil | Compute the mean silhouette value. |
+-------------+-------------------+--------------------------------------------+
| clustering | vdb | Compute the Davies and Bouldin Index. |
+-------------+-------------------+--------------------------------------------+
| clustering | vdu | Compute the Dunn Index. |
+-------------+-------------------+--------------------------------------------+
| complexity | c1 | Compute the entropy of class proportions. |
+-------------+-------------------+--------------------------------------------+
| complexity | c2 | Compute the imbalance ratio. |
+-------------+-------------------+--------------------------------------------+
| complexity | cls_coef | Clustering coefficient. |
+-------------+-------------------+--------------------------------------------+
| complexity | density | Average density of the network. |
+-------------+-------------------+--------------------------------------------+
| complexity | f1 | Maximum Fisher's discriminant ratio. |
+-------------+-------------------+--------------------------------------------+
| complexity | f1v | Directional-vector maximum Fisher's |
| | | discriminant ratio. |
+-------------+-------------------+--------------------------------------------+
| complexity | f2 | Volume of the overlapping region. |
+-------------+-------------------+--------------------------------------------+
| complexity | f3 | Compute feature maximum individual |
| | | efficiency. |
+-------------+-------------------+--------------------------------------------+
| complexity | f4 | Compute the collective feature efficiency. |
+-------------+-------------------+--------------------------------------------+
| complexity | hubs | Hub score. |
+-------------+-------------------+--------------------------------------------+
| complexity | l1 | Sum of error distance by linear |
| | | programming. |
+-------------+-------------------+--------------------------------------------+
| complexity | l2 | Compute the OVO subsets error rate of |
| | | linear classifier. |
+-------------+-------------------+--------------------------------------------+
| complexity | l3 | Non-Linearity of a linear classifier. |
+-------------+-------------------+--------------------------------------------+
| complexity | lsc | Local set average cardinality. |
+-------------+-------------------+--------------------------------------------+
| complexity | n1 | Compute the fraction of borderline points. |
+-------------+-------------------+--------------------------------------------+
| complexity | n2 | Ratio of intra and extra class nearest |
| | | neighbor distance. |
+-------------+-------------------+--------------------------------------------+
| complexity | n3 | Error rate of the nearest neighbor |
| | | classifier. |
+-------------+-------------------+--------------------------------------------+
| complexity | n4 | Compute the non-linearity of the k-NN |
| | | Classifier. |
+-------------+-------------------+--------------------------------------------+
| complexity | t1 | Fraction of hyperspheres covering data. |
+-------------+-------------------+--------------------------------------------+
| complexity | t2 | Compute the average number of features per |
| | | dimension. |
+-------------+-------------------+--------------------------------------------+
| complexity | t3 | Compute the average number of PCA |
| | | dimensions per points. |
+-------------+-------------------+--------------------------------------------+
| complexity | t4 | Compute the ratio of the PCA dimension to |
| | | the original dimension. |
+-------------+-------------------+--------------------------------------------+
| concept | cohesiveness | Compute the improved version of the |
| | | weighted distance, that captures how dense |
| | | or sparse is the example distribution. |
+-------------+-------------------+--------------------------------------------+
| concept | conceptvar | Compute the concept variation that |
| | | estimates the variability of class labels |
| | | among examples. |
+-------------+-------------------+--------------------------------------------+
| concept | impconceptvar | Compute the improved concept variation |
| | | that estimates the variability of class |
| | | labels among examples. |
+-------------+-------------------+--------------------------------------------+
| concept | wg_dist | Compute the weighted distance, that |
| | | captures how dense or sparse is the |
| | | example distribution. |
+-------------+-------------------+--------------------------------------------+
| info-theory | attr_conc | Compute concentration coef. of each pair |
| | | of distinct attributes. |
+-------------+-------------------+--------------------------------------------+
| info-theory | attr_ent | Compute Shannon's entropy for each |
| | | predictive attribute. |
+-------------+-------------------+--------------------------------------------+
| info-theory | class_conc | Compute concentration coefficient between |
| | | each attribute and class. |
+-------------+-------------------+--------------------------------------------+
| info-theory | class_ent | Compute target attribute Shannon's |
| | | entropy. |
+-------------+-------------------+--------------------------------------------+
| info-theory | eq_num_attr | Compute the number of attributes |
| | | equivalent for a predictive task. |
+-------------+-------------------+--------------------------------------------+
| info-theory | joint_ent | Compute the joint entropy between each |
| | | attribute and class. |
+-------------+-------------------+--------------------------------------------+
| info-theory | mut_inf | Compute the mutual information between |
| | | each attribute and target. |
+-------------+-------------------+--------------------------------------------+
| info-theory | ns_ratio | Compute the noisiness of attributes. |
+-------------+-------------------+--------------------------------------------+
| landmarking | best_node         | Performance of the best single decision   |
| | | tree node. |
+-------------+-------------------+--------------------------------------------+
| landmarking | elite_nn | Performance of Elite Nearest Neighbor. |
+-------------+-------------------+--------------------------------------------+
| landmarking | linear_discr | Performance of the Linear Discriminant |
| | | classifier. |
+-------------+-------------------+--------------------------------------------+
| landmarking | naive_bayes | Performance of the Naive Bayes classifier. |
+-------------+-------------------+--------------------------------------------+
| landmarking | one_nn | Performance of the 1-Nearest Neighbor |
| | | classifier. |
+-------------+-------------------+--------------------------------------------+
| landmarking | random_node | Performance of the single decision tree |
| | | node model induced by a random attribute. |
+-------------+-------------------+--------------------------------------------+
| landmarking | worst_node | Performance of the single decision tree |
| | | node model induced by the worst |
| | | informative attribute. |
+-------------+-------------------+--------------------------------------------+
| general | attr_to_inst | Compute the ratio between the number of |
|             |                   | attributes and instances.                  |
+-------------+-------------------+--------------------------------------------+
| general | cat_to_num | Compute the ratio between the number of |
|             |                   | categorical and numeric features.          |
+-------------+-------------------+--------------------------------------------+
| general | freq_class | Compute the relative frequency of each |
| | | distinct class. |
+-------------+-------------------+--------------------------------------------+
| general | inst_to_attr | Compute the ratio between the number of |
| | | instances and attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_attr | Compute the total number of attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_bin | Compute the number of binary attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_cat | Compute the number of categorical |
| | | attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_class | Compute the number of distinct classes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_inst | Compute the number of instances (rows) in |
| | | the dataset. |
+-------------+-------------------+--------------------------------------------+
| general | nr_num | Compute the number of numeric features. |
+-------------+-------------------+--------------------------------------------+
| general     | num_to_cat        | Compute the ratio between the number of   |
|             |                   | numerical and categorical features.        |
+-------------+-------------------+--------------------------------------------+
| statistical | can_cor | Compute canonical correlations of data. |
+-------------+-------------------+--------------------------------------------+
| statistical | cor | Compute the absolute value of the |
| | | correlation of distinct dataset column |
| | | pairs. |
+-------------+-------------------+--------------------------------------------+
| statistical | cov | Compute the absolute value of the |
| | | covariance of distinct dataset attribute |
| | | pairs. |
+-------------+-------------------+--------------------------------------------+
| statistical | eigenvalues | Compute the eigenvalues of covariance |
| | | matrix from dataset. |
+-------------+-------------------+--------------------------------------------+
| statistical | g_mean | Compute the geometric mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | gravity           | Compute the distance between the minority |
|             |                   | and majority classes' centers of mass.    |
+-------------+-------------------+--------------------------------------------+
| statistical | h_mean | Compute the harmonic mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | iq_range | Compute the interquartile range (IQR) of |
| | | each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | kurtosis | Compute the kurtosis of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | lh_trace | Compute the Lawley-Hotelling trace. |
+-------------+-------------------+--------------------------------------------+
| statistical | mad | Compute the Median Absolute Deviation |
| | | (MAD) adjusted by a factor. |
+-------------+-------------------+--------------------------------------------+
| statistical | max | Compute the maximum value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | mean | Compute the mean value of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | median | Compute the median value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | min | Compute the minimum value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_cor_attr | Compute the number of distinct highly |
|             |                   | correlated pairs of attributes.            |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_disc | Compute the number of canonical |
|             |                   | correlations between each attribute and   |
| | | class. |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_norm | Compute the number of attributes normally |
|             |                   | distributed based on a given method.      |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_outliers | Compute the number of attributes with at |
| | | least one outlier value. |
+-------------+-------------------+--------------------------------------------+
| statistical | p_trace | Compute the Pillai's trace. |
+-------------+-------------------+--------------------------------------------+
| statistical | range | Compute the range (max - min) of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | roy_root | Compute the Roy's largest root. |
+-------------+-------------------+--------------------------------------------+
| statistical | sd | Compute the standard deviation of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | sd_ratio | Compute a statistical test for homogeneity |
| | | of covariances. |
+-------------+-------------------+--------------------------------------------+
| statistical | skewness | Compute the skewness for each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | sparsity | Compute (possibly normalized) sparsity |
| | | metric for each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | t_mean | Compute the trimmed mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | var | Compute the variance of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | w_lambda | Compute the Wilks' Lambda value. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves | Compute the number of leaf nodes in the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_branch | Compute the size of branches in the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_corrob | Compute the leaves corroboration of the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_homo       | Compute the DT model homogeneity for every |
| | | leaf node. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_per_class | Compute the proportion of leaves per class |
| | | in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes | Compute the number of non-leaf nodes in DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_per_attr | Compute the ratio of nodes per number of |
| | | attributes in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_per_inst | Compute the ratio of non-leaf nodes per |
| | | number of instances in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_per_level | Compute the ratio of number of nodes per |
| | | tree level in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_repeated | Compute the number of repeated nodes in DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | tree_depth | Compute the depth of every node in the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | tree_imbalance | Compute the tree imbalance for each leaf |
| | | node. |
+-------------+-------------------+--------------------------------------------+
| model-based | tree_shape | Compute the tree shape for every leaf |
| | | node. |
+-------------+-------------------+--------------------------------------------+
| model-based | var_importance    | Compute the feature importance of the DT  |
| | | model for each attribute. |
+-------------+-------------------+--------------------------------------------+
| itemset | one_itemset | Compute the one itemset meta-feature. |
+-------------+-------------------+--------------------------------------------+
| itemset | two_itemset | Compute the two itemset meta-feature. |
+-------------+-------------------+--------------------------------------------+
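Once you find a measure of interest in the table above, you can compute it by name instead of extracting a whole group. The snippet below is a minimal sketch; it assumes the features and summary arguments of the MFE constructor (available in recent PyMFE releases) and uses names taken from the table.
# Minimal sketch (assumes MFE's `features` and `summary` arguments):
# extract only a few measures listed in the table above.
from sklearn.datasets import load_wine
from pymfe.mfe import MFE
data = load_wine()
X, y = data.data, data.target
# "mean", "nr_inst" and "class_ent" are meta-feature names from the table
mfe = MFE(features=["mean", "nr_inst", "class_ent"], summary=["mean"])
mfe.fit(X, y)
names, values = mfe.extract()
print(dict(zip(names, values)))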
You can select specific groups.
MFE.metafeature_description(groups=["general", "statistical"])
+-------------+-------------------+--------------------------------------------+
| Group | Meta-feature name | Description |
+=============+===================+============================================+
| general | attr_to_inst | Compute the ratio between the number of |
|             |                   | attributes and instances.                  |
+-------------+-------------------+--------------------------------------------+
| general | cat_to_num | Compute the ratio between the number of |
|             |                   | categorical and numeric features.          |
+-------------+-------------------+--------------------------------------------+
| general | freq_class | Compute the relative frequency of each |
| | | distinct class. |
+-------------+-------------------+--------------------------------------------+
| general | inst_to_attr | Compute the ratio between the number of |
| | | instances and attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_attr | Compute the total number of attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_bin | Compute the number of binary attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_cat | Compute the number of categorical |
| | | attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_class | Compute the number of distinct classes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_inst | Compute the number of instances (rows) in |
| | | the dataset. |
+-------------+-------------------+--------------------------------------------+
| general | nr_num | Compute the number of numeric features. |
+-------------+-------------------+--------------------------------------------+
| general     | num_to_cat        | Compute the ratio between the number of   |
|             |                   | numerical and categorical features.        |
+-------------+-------------------+--------------------------------------------+
| statistical | can_cor | Compute canonical correlations of data. |
+-------------+-------------------+--------------------------------------------+
| statistical | cor | Compute the absolute value of the |
| | | correlation of distinct dataset column |
| | | pairs. |
+-------------+-------------------+--------------------------------------------+
| statistical | cov | Compute the absolute value of the |
| | | covariance of distinct dataset attribute |
| | | pairs. |
+-------------+-------------------+--------------------------------------------+
| statistical | eigenvalues | Compute the eigenvalues of covariance |
| | | matrix from dataset. |
+-------------+-------------------+--------------------------------------------+
| statistical | g_mean | Compute the geometric mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | gravity           | Compute the distance between the minority |
|             |                   | and majority classes' centers of mass.    |
+-------------+-------------------+--------------------------------------------+
| statistical | h_mean | Compute the harmonic mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | iq_range | Compute the interquartile range (IQR) of |
| | | each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | kurtosis | Compute the kurtosis of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | lh_trace | Compute the Lawley-Hotelling trace. |
+-------------+-------------------+--------------------------------------------+
| statistical | mad | Compute the Median Absolute Deviation |
| | | (MAD) adjusted by a factor. |
+-------------+-------------------+--------------------------------------------+
| statistical | max | Compute the maximum value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | mean | Compute the mean value of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | median | Compute the median value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | min | Compute the minimum value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_cor_attr | Compute the number of distinct highly |
|             |                   | correlated pairs of attributes.            |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_disc | Compute the number of canonical |
|             |                   | correlations between each attribute and   |
| | | class. |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_norm | Compute the number of attributes normally |
|             |                   | distributed based on a given method.      |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_outliers | Compute the number of attributes with at |
| | | least one outlier value. |
+-------------+-------------------+--------------------------------------------+
| statistical | p_trace | Compute the Pillai's trace. |
+-------------+-------------------+--------------------------------------------+
| statistical | range | Compute the range (max - min) of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | roy_root | Compute the Roy's largest root. |
+-------------+-------------------+--------------------------------------------+
| statistical | sd | Compute the standard deviation of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | sd_ratio | Compute a statistical test for homogeneity |
| | | of covariances. |
+-------------+-------------------+--------------------------------------------+
| statistical | skewness | Compute the skewness for each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | sparsity | Compute (possibly normalized) sparsity |
| | | metric for each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | t_mean | Compute the trimmed mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | var | Compute the variance of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | w_lambda | Compute the Wilks' Lambda value. |
+-------------+-------------------+--------------------------------------------+
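If you only need the names of the meta-features in a group, rather than the rendered table above, they can also be listed programmatically. This is a minimal sketch assuming the valid_groups and valid_metafeatures helpers of the MFE class.
# Minimal sketch (assumes the MFE.valid_groups/valid_metafeatures helpers):
# list the supported groups and the meta-feature names of a single group.
from pymfe.mfe import MFE
print(MFE.valid_groups())
print(MFE.valid_metafeatures(groups=["general"]))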
You can sort the metafeatures by name and group.
MFE.metafeature_description(sort_by_group=True, sort_by_mtf=True)
+-------------+-------------------+--------------------------------------------+
| Group | Meta-feature name | Description |
+=============+===================+============================================+
| clustering | ch | Compute the Calinski and Harabasz index. |
+-------------+-------------------+--------------------------------------------+
| clustering | int | Compute the INT index. |
+-------------+-------------------+--------------------------------------------+
| clustering | nre | Compute the normalized relative entropy. |
+-------------+-------------------+--------------------------------------------+
| clustering  | pb                | Compute the Pearson correlation between   |
| | | class matching and instance distances. |
+-------------+-------------------+--------------------------------------------+
| clustering | sc | Compute the number of clusters with size |
| | | smaller than a given size. |
+-------------+-------------------+--------------------------------------------+
| clustering | sil | Compute the mean silhouette value. |
+-------------+-------------------+--------------------------------------------+
| clustering | vdb | Compute the Davies and Bouldin Index. |
+-------------+-------------------+--------------------------------------------+
| clustering | vdu | Compute the Dunn Index. |
+-------------+-------------------+--------------------------------------------+
| complexity | c1 | Compute the entropy of class proportions. |
+-------------+-------------------+--------------------------------------------+
| complexity | c2 | Compute the imbalance ratio. |
+-------------+-------------------+--------------------------------------------+
| complexity | cls_coef | Clustering coefficient. |
+-------------+-------------------+--------------------------------------------+
| complexity | density | Average density of the network. |
+-------------+-------------------+--------------------------------------------+
| complexity | f1 | Maximum Fisher's discriminant ratio. |
+-------------+-------------------+--------------------------------------------+
| complexity | f1v | Directional-vector maximum Fisher's |
| | | discriminant ratio. |
+-------------+-------------------+--------------------------------------------+
| complexity | f2 | Volume of the overlapping region. |
+-------------+-------------------+--------------------------------------------+
| complexity | f3 | Compute feature maximum individual |
| | | efficiency. |
+-------------+-------------------+--------------------------------------------+
| complexity | f4 | Compute the collective feature efficiency. |
+-------------+-------------------+--------------------------------------------+
| complexity | hubs | Hub score. |
+-------------+-------------------+--------------------------------------------+
| complexity | l1 | Sum of error distance by linear |
| | | programming. |
+-------------+-------------------+--------------------------------------------+
| complexity | l2 | Compute the OVO subsets error rate of |
| | | linear classifier. |
+-------------+-------------------+--------------------------------------------+
| complexity | l3 | Non-Linearity of a linear classifier. |
+-------------+-------------------+--------------------------------------------+
| complexity | lsc | Local set average cardinality. |
+-------------+-------------------+--------------------------------------------+
| complexity | n1 | Compute the fraction of borderline points. |
+-------------+-------------------+--------------------------------------------+
| complexity | n2 | Ratio of intra and extra class nearest |
| | | neighbor distance. |
+-------------+-------------------+--------------------------------------------+
| complexity | n3 | Error rate of the nearest neighbor |
| | | classifier. |
+-------------+-------------------+--------------------------------------------+
| complexity | n4 | Compute the non-linearity of the k-NN |
|             |                   | classifier.                                |
+-------------+-------------------+--------------------------------------------+
| complexity | t1 | Fraction of hyperspheres covering data. |
+-------------+-------------------+--------------------------------------------+
| complexity | t2 | Compute the average number of features per |
| | | dimension. |
+-------------+-------------------+--------------------------------------------+
| complexity | t3 | Compute the average number of PCA |
|             |                   | dimensions per point.                      |
+-------------+-------------------+--------------------------------------------+
| complexity | t4 | Compute the ratio of the PCA dimension to |
| | | the original dimension. |
+-------------+-------------------+--------------------------------------------+
| concept | cohesiveness | Compute the improved version of the |
| | | weighted distance, that captures how dense |
|             |                   | or sparse the example distribution is.    |
+-------------+-------------------+--------------------------------------------+
| concept | conceptvar | Compute the concept variation that |
| | | estimates the variability of class labels |
| | | among examples. |
+-------------+-------------------+--------------------------------------------+
| concept | impconceptvar | Compute the improved concept variation |
| | | that estimates the variability of class |
| | | labels among examples. |
+-------------+-------------------+--------------------------------------------+
| concept | wg_dist | Compute the weighted distance, that |
|             |                   | captures how dense or sparse the example  |
|             |                   | distribution is.                           |
+-------------+-------------------+--------------------------------------------+
| general | attr_to_inst | Compute the ratio between the number of |
|             |                   | attributes and instances.                  |
+-------------+-------------------+--------------------------------------------+
| general | cat_to_num | Compute the ratio between the number of |
|             |                   | categorical and numeric features.          |
+-------------+-------------------+--------------------------------------------+
| general | freq_class | Compute the relative frequency of each |
| | | distinct class. |
+-------------+-------------------+--------------------------------------------+
| general | inst_to_attr | Compute the ratio between the number of |
| | | instances and attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_attr | Compute the total number of attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_bin | Compute the number of binary attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_cat | Compute the number of categorical |
| | | attributes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_class | Compute the number of distinct classes. |
+-------------+-------------------+--------------------------------------------+
| general | nr_inst | Compute the number of instances (rows) in |
| | | the dataset. |
+-------------+-------------------+--------------------------------------------+
| general | nr_num | Compute the number of numeric features. |
+-------------+-------------------+--------------------------------------------+
| general     | num_to_cat        | Compute the ratio between the number of   |
|             |                   | numerical and categorical features.        |
+-------------+-------------------+--------------------------------------------+
| info-theory | attr_conc | Compute concentration coef. of each pair |
| | | of distinct attributes. |
+-------------+-------------------+--------------------------------------------+
| info-theory | attr_ent | Compute Shannon's entropy for each |
| | | predictive attribute. |
+-------------+-------------------+--------------------------------------------+
| info-theory | class_conc | Compute concentration coefficient between |
| | | each attribute and class. |
+-------------+-------------------+--------------------------------------------+
| info-theory | class_ent | Compute target attribute Shannon's |
| | | entropy. |
+-------------+-------------------+--------------------------------------------+
| info-theory | eq_num_attr | Compute the number of attributes |
| | | equivalent for a predictive task. |
+-------------+-------------------+--------------------------------------------+
| info-theory | joint_ent | Compute the joint entropy between each |
| | | attribute and class. |
+-------------+-------------------+--------------------------------------------+
| info-theory | mut_inf | Compute the mutual information between |
| | | each attribute and target. |
+-------------+-------------------+--------------------------------------------+
| info-theory | ns_ratio | Compute the noisiness of attributes. |
+-------------+-------------------+--------------------------------------------+
| itemset | one_itemset | Compute the one itemset meta-feature. |
+-------------+-------------------+--------------------------------------------+
| itemset | two_itemset | Compute the two itemset meta-feature. |
+-------------+-------------------+--------------------------------------------+
| landmarking | best_node         | Performance of the best single decision   |
| | | tree node. |
+-------------+-------------------+--------------------------------------------+
| landmarking | elite_nn | Performance of Elite Nearest Neighbor. |
+-------------+-------------------+--------------------------------------------+
| landmarking | linear_discr | Performance of the Linear Discriminant |
| | | classifier. |
+-------------+-------------------+--------------------------------------------+
| landmarking | naive_bayes | Performance of the Naive Bayes classifier. |
+-------------+-------------------+--------------------------------------------+
| landmarking | one_nn | Performance of the 1-Nearest Neighbor |
| | | classifier. |
+-------------+-------------------+--------------------------------------------+
| landmarking | random_node | Performance of the single decision tree |
| | | node model induced by a random attribute. |
+-------------+-------------------+--------------------------------------------+
| landmarking | worst_node | Performance of the single decision tree |
| | | node model induced by the worst |
| | | informative attribute. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves | Compute the number of leaf nodes in the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_branch | Compute the size of branches in the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_corrob | Compute the leaves corroboration of the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_homo       | Compute the DT model homogeneity for every |
| | | leaf node. |
+-------------+-------------------+--------------------------------------------+
| model-based | leaves_per_class | Compute the proportion of leaves per class |
| | | in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes | Compute the number of non-leaf nodes in DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_per_attr | Compute the ratio of nodes per number of |
| | | attributes in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_per_inst | Compute the ratio of non-leaf nodes per |
| | | number of instances in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_per_level | Compute the ratio of number of nodes per |
| | | tree level in DT model. |
+-------------+-------------------+--------------------------------------------+
| model-based | nodes_repeated | Compute the number of repeated nodes in DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | tree_depth | Compute the depth of every node in the DT |
| | | model. |
+-------------+-------------------+--------------------------------------------+
| model-based | tree_imbalance | Compute the tree imbalance for each leaf |
| | | node. |
+-------------+-------------------+--------------------------------------------+
| model-based | tree_shape | Compute the tree shape for every leaf |
| | | node. |
+-------------+-------------------+--------------------------------------------+
| model-based | var_importance    | Compute the feature importance of the DT  |
| | | model for each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | can_cor | Compute canonical correlations of data. |
+-------------+-------------------+--------------------------------------------+
| statistical | cor | Compute the absolute value of the |
| | | correlation of distinct dataset column |
| | | pairs. |
+-------------+-------------------+--------------------------------------------+
| statistical | cov | Compute the absolute value of the |
| | | covariance of distinct dataset attribute |
| | | pairs. |
+-------------+-------------------+--------------------------------------------+
| statistical | eigenvalues | Compute the eigenvalues of covariance |
| | | matrix from dataset. |
+-------------+-------------------+--------------------------------------------+
| statistical | g_mean | Compute the geometric mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | gravity           | Compute the distance between the minority |
|             |                   | and majority classes' centers of mass.    |
+-------------+-------------------+--------------------------------------------+
| statistical | h_mean | Compute the harmonic mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | iq_range | Compute the interquartile range (IQR) of |
| | | each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | kurtosis | Compute the kurtosis of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | lh_trace | Compute the Lawley-Hotelling trace. |
+-------------+-------------------+--------------------------------------------+
| statistical | mad | Compute the Median Absolute Deviation |
| | | (MAD) adjusted by a factor. |
+-------------+-------------------+--------------------------------------------+
| statistical | max | Compute the maximum value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | mean | Compute the mean value of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | median | Compute the median value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | min | Compute the minimum value from each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_cor_attr | Compute the number of distinct highly |
|             |                   | correlated pairs of attributes.            |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_disc | Compute the number of canonical |
|             |                   | correlations between each attribute and   |
| | | class. |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_norm | Compute the number of attributes normally |
|             |                   | distributed based on a given method.      |
+-------------+-------------------+--------------------------------------------+
| statistical | nr_outliers | Compute the number of attributes with at |
| | | least one outlier value. |
+-------------+-------------------+--------------------------------------------+
| statistical | p_trace | Compute the Pillai's trace. |
+-------------+-------------------+--------------------------------------------+
| statistical | range | Compute the range (max - min) of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | roy_root | Compute the Roy's largest root. |
+-------------+-------------------+--------------------------------------------+
| statistical | sd | Compute the standard deviation of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | sd_ratio | Compute a statistical test for homogeneity |
| | | of covariances. |
+-------------+-------------------+--------------------------------------------+
| statistical | skewness | Compute the skewness for each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | sparsity | Compute (possibly normalized) sparsity |
| | | metric for each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | t_mean | Compute the trimmed mean of each |
| | | attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | var | Compute the variance of each attribute. |
+-------------+-------------------+--------------------------------------------+
| statistical | w_lambda | Compute the Wilks' Lambda value. |
+-------------+-------------------+--------------------------------------------+
You can include the references.
MFE.metafeature_description(sort_by_group=True, sort_by_mtf=True,
                            include_references=True)
+-------------+-------------------+----------------------+---------------------+
| Group | Meta-feature name | Description | Reference |
+=============+===================+======================+=====================+
| clustering | ch | Compute the Calinski | [1] T. Calinski, J. |
| | | and Harabasz index. | Harabasz, A |
| | | | dendrite method for |
| | | | cluster analysis, |
| | | | Commun. Stat. |
| | | | Theory Methods 3 |
| | | | (1) (1974) 1–27. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| clustering | int | Compute the INT | [1] SOUZA, Bruno |
| | | index. | Feres de. Meta- |
| | | | aprendizagem |
| | | | aplicada à |
| | | | classificação de |
| | | | dados de expressão |
| | | | gênica. 2010. Tese |
| | | | (Doutorado em |
| | | | Ciências de |
| | | | Computação e |
| | | | Matemática |
| | | | Computacional), |
| | | | Instituto de |
| | | | Ciências |
| | | | Matemáticas e de |
| | | | Computação, |
| | | | Universidade de São |
| | | | Paulo, São Carlos, |
| | | | 2010. doi:10.11606/ |
| | | | T.55.2010.tde-04012 |
| | | | 011-142551. |
| | | | [2] Bezdek, J. C.; |
| | | | Pal, N. R. (1998a). |
| | | | Some new indexes of |
| | | | cluster validity. |
| | | | IEEE Transactions |
| | | | on Systems, Man, |
| | | | and Cybernetics, |
| | | | Part B, v.28, n.3, |
| | | | p.301–315. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| clustering | nre | Compute the | [1] Bruno Almeida |
| | | normalized relative | Pimentel, André |
| | | entropy. | C.P.L.F. de |
| | | | Carvalho. A new |
| | | | data |
| | | | characterization |
| | | | for selecting |
| | | | clustering |
| | | | algorithms using |
| | | | meta-learning. |
| | | | Information |
| | | | Sciences, Volume |
| | | | 477, 2019, Pages |
| | | | 203-219. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| clustering  | pb                | Compute the Pearson  | [1] J. Lev, "The    |
| | | correlation between | Point Biserial |
| | | class matching and | Coefficient of |
| | | instance distances. | Correlation", Ann. |
| | | | Math. Statist., |
| | | | Vol. 20, no.1, pp. |
| | | | 125-126, 1949. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| clustering | sc | Compute the number | [1] Bruno Almeida |
| | | of clusters with | Pimentel, André |
| | | size smaller than a | C.P.L.F. de |
| | | given size. | Carvalho. A new |
| | | | data |
| | | | characterization |
| | | | for selecting |
| | | | clustering |
| | | | algorithms using |
| | | | meta-learning. |
| | | | Information |
| | | | Sciences, Volume |
| | | | 477, 2019, Pages |
| | | | 203-219. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| clustering | sil | Compute the mean | [1] P.J. Rousseeuw, |
| | | silhouette value. | Silhouettes: a |
| | | | graphical aid to |
| | | | the interpretation |
| | | | and validation of |
| | | | cluster analysis, |
| | | | J. Comput. Appl. |
| | | | Math. 20 (1987) |
| | | | 53–65. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| clustering | vdb | Compute the Davies | [1] D.L. Davies, |
| | | and Bouldin Index. | D.W. Bouldin, A |
| | | | cluster separation |
| | | | measure, IEEE |
| | | | Trans. Pattern |
| | | | Anal. Mach. Intell. |
| | | | 1 (2) (1979) |
| | | | 224–227. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| clustering | vdu | Compute the Dunn | [1] J.C. Dunn, |
| | | Index. | Well-separated |
| | | | clusters and |
| | | | optimal fuzzy |
| | | | partitions, J. |
| | | | Cybern. 4 (1) |
| | | | (1974) 95–104. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | c1 | Compute the entropy | [1] Ana C. Lorena, |
| | | of class | Luís P. F. Garcia, |
| | | proportions. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 15). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | c2 | Compute the | [1] Ana C. Lorena, |
| | | imbalance ratio. | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 16). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | cls_coef | Clustering | [1] Ana C. Lorena, |
| | | coefficient. | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | density | Average density of | [1] Ana C. Lorena, |
| | | the network. | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | f1 | Maximum Fisher's | [1] Ana C. Lorena, |
| | | discriminant ratio. | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | [2] Ramón A |
| | | | Mollineda, José S |
| | | | Sánchez, and José M |
| | | | Sotoca. Data |
| | | | characterization |
| | | | for effective |
| | | | prototype |
| | | | selection. In 2nd |
| | | | Iberian Conference |
| | | | on Pattern |
| | | | Recognition and |
| | | | Image Analysis |
| | | | (IbPRIA), pages |
| | | | 27–34, 2005. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | f1v | Directional-vector | [1] Ana C. Lorena, |
| | | maximum Fisher's | Luís P. F. Garcia, |
| | | discriminant ratio. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | [2] Witold Malina. |
| | | | Two-parameter |
| | | | fisher criterion. |
| | | | IEEE Transactions |
| | | | on Systems, Man, |
| | | | and Cybernetics, |
| | | | Part B |
| | | | (Cybernetics), |
| | | | 31(4):629–636, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | f2 | Volume of the | [1] Ana C. Lorena, |
| | | overlapping region. | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | [2] Marcilio C P |
| | | | Souto, Ana C |
| | | | Lorena, Newton |
| | | | Spolaôr, and Ivan G |
| | | | Costa. Complexity |
| | | | measures of |
| | | | supervised |
| | | | classification |
| | | | tasks: a case study |
| | | | for cancer gene |
| | | | expression data. In |
| | | | International Joint |
| | | | Conference on |
| | | | Neural Networks |
| | | | (IJCNN), pages |
| | | | 1352–1358, 2010. |
| | | | [3] Lisa Cummins. |
| | | | Combining and |
| | | | Choosing Case Base |
| | | | Maintenance |
| | | | Algorithms. PhD |
| | | | thesis, National |
| | | | University of |
| | | | Ireland, Cork, |
| | | | 2013. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | f3 | Compute feature | [1] Ana C. Lorena, |
| | | maximum individual | Luís P. F. Garcia, |
| | | efficiency. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 6). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | f4 | Compute the | [1] Ana C. Lorena, |
| | | collective feature | Luís P. F. Garcia, |
| | | efficiency. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 7). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | hubs | Hub score. | [1] Ana C. Lorena, |
| | | | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | l1 | Sum of error | [1] Ana C. Lorena, |
| | | distance by linear | Luís P. F. Garcia, |
| | | programming. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | l2 | Compute the OVO | [1] Ana C. Lorena, |
| | | subsets error rate | Luís P. F. Garcia, |
| | | of linear | Jens Lehmann, |
| | | classifier. | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | l3 | Non-Linearity of a | [1] Ana C. Lorena, |
| | | linear classifier. | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | lsc | Local set average | [1] Ana C. Lorena, |
| | | cardinality. | Luís P. F. Garcia, |
| | | | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 15). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | [2] Enrique Leyva, |
| | | | Antonio González, |
| | | | and Raúl Pérez. A |
| | | | set of complexity |
| | | | measures designed |
| | | | for applying meta- |
| | | | learning to |
| | | | instance selection. |
| | | | IEEE Transactions |
| | | | on Knowledge and |
| | | | Data Engineering, |
| | | | 27(2):354–367, |
| | | | 2014. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | n1 | Compute the fraction | [1] Ana C. Lorena, |
| | | of borderline | Luís P. F. Garcia, |
| | | points. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9-10). |
| | | | Published in ACM |
| | | | Computing Surveys |
| | | | (CSUR), Volume 52 |
| | | | Issue 5, October |
| | | | 2019, Article No. |
| | | | 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | n2 | Ratio of intra and | [1] Ana C. Lorena, |
| | | extra class nearest | Luís P. F. Garcia, |
| | | neighbor distance. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | n3 | Error rate of the | [1] Ana C. Lorena, |
| | | nearest neighbor | Luís P. F. Garcia, |
| | | classifier. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | n4 | Compute the non- | [1] Ana C. Lorena, |
| | | linearity of the | Luís P. F. Garcia, |
|             |                   | k-NN classifier.     | Jens Lehmann,       |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9-11). |
| | | | Published in ACM |
| | | | Computing Surveys |
| | | | (CSUR), Volume 52 |
| | | | Issue 5, October |
| | | | 2019, Article No. |
| | | | 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | t1 | Fraction of | [1] Ana C. Lorena, |
| | | hyperspheres | Luís P. F. Garcia, |
| | | covering data. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 9). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | [2] Tin K Ho and |
| | | | Mitra Basu. |
| | | | Complexity measures |
| | | | of supervised |
| | | | classification |
| | | | problems. IEEE |
| | | | Transactions on |
| | | | Pattern Analysis |
| | | | and Machine |
| | | | Intelligence, |
| | | | 24(3):289–300, |
| | | | 2002. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | t2 | Compute the average | [1] Ana C. Lorena, |
| | | number of features | Luís P. F. Garcia, |
| | | per dimension. | Jens Lehmann, |
| | | | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 15). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | t3 | Compute the average | [1] Ana C. Lorena, |
| | | number of PCA | Luís P. F. Garcia, |
| | | dimensions per | Jens Lehmann, |
|             |                   | point.               | Marcilio C. P.      |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 15). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| complexity | t4 | Compute the ratio of | [1] Ana C. Lorena, |
| | | the PCA dimension to | Luís P. F. Garcia, |
| | | the original | Jens Lehmann, |
| | | dimension. | Marcilio C. P. |
| | | | Souto, and Tin K. |
| | | | Ho. How Complex is |
| | | | your classification |
| | | | problem? A survey |
| | | | on measuring |
| | | | classification |
| | | | complexity (V2). |
| | | | (2019) (Cited on |
| | | | page 15). Published |
| | | | in ACM Computing |
| | | | Surveys (CSUR), |
| | | | Volume 52 Issue 5, |
| | | | October 2019, |
| | | | Article No. 107. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| concept | cohesiveness | Compute the improved | [1] Vilalta, R and |
| | | version of the | Drissi, Y (2002). A |
| | | weighted distance, | Characterization of |
| | | that captures how | Difficult Problems |
|             |                   | dense or sparse the  | in Classification.  |
|             |                   | example distribution | Proceedings of the  |
|             |                   | is.                  | 2002 International  |
| | | | Conference on |
| | | | Machine Learning |
| | | | and Applications |
| | | | (pp. 133-138). |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| concept | conceptvar | Compute the concept | [1] Vilalta, R. |
| | | variation that | (1999). |
| | | estimates the | Understanding |
| | | variability of class | accuracy |
| | | labels among | performance through |
| | | examples. | concept |
| | | | characterization |
| | | | and algorithm |
| | | | analysis. In |
| | | | Proceedings of the |
| | | | ICML-99 workshop on |
| | | | recent advances in |
| | | | meta-learning and |
| | | | future work (pp. |
| | | | 3-9). |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| concept | impconceptvar | Compute the improved | [1] Vilalta, R and |
| | | concept variation | Drissi, Y (2002). A |
| | | that estimates the | Characterization of |
| | | variability of class | Difficult Problems |
| | | labels among | in Classification. |
| | | examples. | Proceedings of the |
| | | | 2002 International |
| | | | Conference on |
| | | | Machine Learning |
| | | | and Applications |
| | | | (pp. 133-138). |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| concept | wg_dist | Compute the weighted | [1] Vilalta, R. |
| | | distance, that | (1999). |
| | | captures how dense | Understanding |
| | | or sparse is the | accuracy |
| | | example | performance through |
| | | distribution. | concept |
| | | | characterization |
| | | | and algorithm |
| | | | analysis. In |
| | | | Proceedings of the |
| | | | ICML-99 workshop on |
| | | | recent advances in |
| | | | meta-learning and |
| | | | future work (pp. |
| | | | 3-9). |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | attr_to_inst | Compute the ratio | [1] Alexandros |
| | | between the number | Kalousis and |
| | | of attributes and | Theoharis |
| | | instances. | Theoharis. NOEMON: |
| | | | Design, |
| | | | implementation and |
| | | | performance results |
| | | | of an intelligent |
| | | | assistant for |
| | | | classifier |
| | | | selection. |
| | | | Intelligent Data |
| | | | Analysis, |
| | | | 3(5):319–337, 1999. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | cat_to_num | Compute the ratio | [1] Matthias |
| | | between the number | Feurer, Jost Tobias |
| | | of categoric and | Springenberg, and |
| | | numeric features. | Frank Hutter. Using |
| | | | meta-learning |
| | | | to initialize |
| | | | bayesian |
| | | | optimization of |
| | | | hyperparameters. In |
| | | | International |
| | | | Conference on Meta- |
| | | | learning and |
| | | | Algorithm Selection |
| | | | (MLAS), pages 3 – |
| | | | 10, 2014. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | freq_class | Compute the relative | [1] Guido Lindner |
| | | frequency of each | and Rudi Studer. |
| | | distinct class. | AST: Support for |
| | | | algorithm selection |
| | | | with a CBR |
| | | | approach. In |
| | | | European Conference |
| | | | on Principles of |
| | | | Data Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 418 – |
| | | | 423, 1999. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | inst_to_attr | Compute the ratio | [1] Petr Kuba, |
| | | between the number | Pavel Brazdil, |
| | | of instances and | Carlos Soares, and |
| | | attributes. | Adam Woznica. |
| | | | Exploiting sampling |
| | | | and meta-learning |
| | | | for parameter |
| | | | setting for support |
| | | | vector machines. In |
| | | | 8th IBERAMIA |
| | | | Workshop on |
| | | | Learning and Data |
| | | | Mining, pages 209 – |
| | | | 216, 2002. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | nr_attr | Compute the total | [1] Donald Michie, |
| | | number of | David J. |
| | | attributes. | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | nr_bin | Compute the number | [1] Donald Michie, |
| | | of binary | David J. |
| | | attributes. | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | nr_cat | Compute the number | [1] Robert Engels |
| | | of categorical | and Christiane |
| | | attributes. | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | nr_class | Compute the number | [1] Donald Michie, |
| | | of distinct classes. | David J. |
| | | | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | nr_inst | Compute the number | [1] Donald Michie, |
| | | of instances (rows) | David J. |
| | | in the dataset. | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | nr_num | Compute the number | [1] Robert Engels |
| | | of numeric features. | and Christiane |
| | | | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| general | num_to_cat | Compute the ratio | [1] Matthias |
| | | between the number | Feurer, Jost Tobias |
| | | of numeric and | Springenberg, and |
| | | categorical | Frank Hutter. Using |
| | | features. | meta-learning |
| | | | to initialize |
| | | | bayesian |
| | | | optimization of |
| | | | hyperparameters. In |
| | | | International |
| | | | Conference on Meta- |
| | | | learning and |
| | | | Algorithm Selection |
| | | | (MLAS), pages 3 – |
| | | | 10, 2014. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | attr_conc | Compute | [1] Alexandros |
| | | concentration coef. | Kalousis and |
| | | of each pair of | Melanie Hilario. |
| | | distinct attributes. | Model selection via |
| | | | meta-learning: a |
| | | | comparative study. |
| | | | International |
| | | | Journal on |
| | | | Artificial |
| | | | Intelligence Tools, |
| | | | 10(4):525–554, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | attr_ent | Compute Shannon's | [1] Donald Michie, |
| | | entropy for each | David J. |
| | | predictive | Spiegelhalter, |
| | | attribute. | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | class_conc | Compute | [1] Alexandros |
| | | concentration | Kalousis and |
| | | coefficient between | Melanie Hilario. |
| | | each attribute and | Model selection via |
| | | class. | meta-learning: a |
| | | | comparative study. |
| | | | International |
| | | | Journal on |
| | | | Artificial |
| | | | Intelligence Tools, |
| | | | 10(4):525–554, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | class_ent | Compute target | [1] Donald Michie, |
| | | attribute Shannon's | David J. |
| | | entropy. | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | eq_num_attr | Compute the number | [1] Donald Michie, |
| | | of attributes | David J. |
| | | equivalent for a | Spiegelhalter, |
| | | predictive task. | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | joint_ent | Compute the joint | [1] Donald Michie, |
| | | entropy between each | David J. |
| | | attribute and class. | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | mut_inf | Compute the mutual | [1] Donald Michie, |
| | | information between | David J. |
| | | each attribute and | Spiegelhalter, |
| | | target. | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| info-theory | ns_ratio | Compute the | [1] Donald Michie, |
| | | noisiness of | David J. |
| | | attributes. | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| itemset | one_itemset | Compute the one | [1] Song, Q., Wang, |
| | | itemset meta- | G., & Wang, C. |
| | | feature. | (2012). Automatic |
| | | | recommendation of |
| | | | classification |
| | | | algorithms based on |
| | | | data set |
| | | | characteristics. |
| | | | Pattern |
| | | | recognition, 45(7), |
| | | | 2672-2689. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| itemset | two_itemset | Compute the two | [1] Song, Q., Wang, |
| | | itemset meta- | G., & Wang, C. |
| | | feature. | (2012). Automatic |
| | | | recommendation of |
| | | | classification |
| | | | algorithms based on |
| | | | data set |
| | | | characteristics. |
| | | | Pattern |
| | | | recognition, 45(7), |
| | | | 2672-2689. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| landmarking | best_node | Performance of the | [1] Hilan Bensusan |
| | | best single decision | and Christophe |
| | | tree node. | Giraud-Carrier. |
| | | | Discovering task |
| | | | neighbourhoods |
| | | | through landmark |
| | | | learning |
| | | | performances. In |
| | | | 4th European |
| | | | Conference on |
| | | | Principles of Data |
| | | | Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 325 – |
| | | | 330, 2000. |
| | | | [2] Johannes |
| | | | Furnkranz and |
| | | | Johann Petrak. An |
| | | | evaluation of |
| | | | landmarking |
| | | | variants. In 1st |
| | | | ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 57 – 68, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| landmarking | elite_nn | Performance of Elite | [1] Hilan Bensusan |
| | | Nearest Neighbor. | and Christophe |
| | | | Giraud-Carrier. |
| | | | Discovering task |
| | | | neighbourhoods |
| | | | through landmark |
| | | | learning |
| | | | performances. In |
| | | | 4th European |
| | | | Conference on |
| | | | Principles of Data |
| | | | Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 325 – |
| | | | 330, 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| landmarking | linear_discr | Performance of the | [1] Hilan Bensusan |
| | | Linear Discriminant | and Christophe |
| | | classifier. | Giraud-Carrier. |
| | | | Discovering task |
| | | | neighbourhoods |
| | | | through landmark |
| | | | learning |
| | | | performances. In |
| | | | 4th European |
| | | | Conference on |
| | | | Principles of Data |
| | | | Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 325 – |
| | | | 330, 2000. |
| | | | [2] Johannes |
| | | | Furnkranz and |
| | | | Johann Petrak. An |
| | | | evaluation of |
| | | | landmarking |
| | | | variants. In 1st |
| | | | ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 57 – 68, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| landmarking | naive_bayes | Performance of the | [1] Hilan Bensusan |
| | | Naive Bayes | and Christophe |
| | | classifier. | Giraud-Carrier. |
| | | | Discovering task |
| | | | neighbourhoods |
| | | | through landmark |
| | | | learning |
| | | | performances. In |
| | | | 4th European |
| | | | Conference on |
| | | | Principles of Data |
| | | | Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 325 – |
| | | | 330, 2000. |
| | | | [2] Johannes |
| | | | Furnkranz and |
| | | | Johann Petrak. An |
| | | | evaluation of |
| | | | landmarking |
| | | | variants. In 1st |
| | | | ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 57 – 68, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| landmarking | one_nn | Performance of the | [1] Hilan Bensusan |
| | | 1-Nearest Neighbor | and Christophe |
| | | classifier. | Giraud-Carrier. |
| | | | Discovering task |
| | | | neighbourhoods |
| | | | through landmark |
| | | | learning |
| | | | performances. In |
| | | | 4th European |
| | | | Conference on |
| | | | Principles of Data |
| | | | Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 325 – |
| | | | 330, 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| landmarking | random_node | Performance of the | [1] Hilan Bensusan |
| | | single decision tree | and Christophe |
| | | node model induced | Giraud-Carrier. |
| | | by a random | Discovering task |
| | | attribute. | neighbourhoods |
| | | | through landmark |
| | | | learning |
| | | | performances. In |
| | | | 4th European |
| | | | Conference on |
| | | | Principles of Data |
| | | | Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 325 – |
| | | | 330, 2000. |
| | | | [2] Johannes |
| | | | Furnkranz and |
| | | | Johann Petrak. An |
| | | | evaluation of |
| | | | landmarking |
| | | | variants. In 1st |
| | | | ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 57 – 68, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| landmarking | worst_node | Performance of the | [1] Hilan Bensusan |
| | | single decision tree | and Christophe |
| | | node model induced | Giraud-Carrier. |
| | | by the worst | Discovering task |
| | | informative | neighbourhoods |
| | | attribute. | through landmark |
| | | | learning |
| | | | performances. In |
| | | | 4th European |
| | | | Conference on |
| | | | Principles of Data |
| | | | Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 325 – |
| | | | 330, 2000. |
| | | | [2] Johannes |
| | | | Furnkranz and |
| | | | Johann Petrak. An |
| | | | evaluation of |
| | | | landmarking |
| | | | variants. In 1st |
| | | | ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 57 – 68, |
| | | | 2001. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | leaves | Compute the number | [1] Yonghong Peng, |
| | | of leaf nodes in the | PA Flach, Pavel |
| | | DT model. | Brazdil, and Carlos |
| | | | Soares. Decision |
| | | | tree-based data |
| | | | characterization |
| | | | for meta-learning. |
| | | | In 2nd ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 111 – 122, |
| | | | 2002a. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | leaves_branch | Compute the size of | [1] Yonghong Peng, |
| | | branches in the DT | PA Flach, Pavel |
| | | model. | Brazdil, and Carlos |
| | | | Soares. Decision |
| | | | tree-based data |
| | | | characterization |
| | | | for meta-learning. |
| | | | In 2nd ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 111 – 122, |
| | | | 2002a. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | leaves_corrob | Compute the leaves | [1] Hilan Bensusan, |
| | | corroboration of the | Christophe Giraud- |
| | | DT model. | Carrier, and Claire |
| | | | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | leaves_homo | Compute the DT model | [1] Hilan Bensusan, |
| | | homogeneity for | Christophe Giraud- |
| | | every leaf node. | Carrier, and Claire |
| | | | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | leaves_per_class | Compute the | [1] Andrey |
| | | proportion of leaves | Filchenkov and |
| | | per class in DT | Arseniy Pendryak. |
| | | model. | Datasets meta- |
| | | | feature description |
| | | | for recommending |
| | | | feature selection |
| | | | algorithm. In |
| | | | Artificial |
| | | | Intelligence and |
| | | | Natural Language |
| | | | and Information |
| | | | Extraction, Social |
| | | | Media and Web |
| | | | Search FRUCT |
| | | | Conference (AINL- |
| | | | ISMWFRUCT), pages |
| | | | 11 – 18, 2015. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | nodes | Compute the number | [1] Yonghong Peng, |
| | | of non-leaf nodes in | PA Flach, Pavel |
| | | DT model. | Brazdil, and Carlos |
| | | | Soares. Decision |
| | | | tree-based data |
| | | | characterization |
| | | | for meta-learning. |
| | | | In 2nd ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 111 – 122, |
| | | | 2002a. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | nodes_per_attr | Compute the ratio of | [1] Hilan Bensusan, |
| | | nodes per number of | Christophe Giraud- |
| | | attributes in DT | Carrier, and Claire |
| | | model. | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | nodes_per_inst | Compute the ratio of | [1] Hilan Bensusan, |
| | | non-leaf nodes per | Christophe Giraud- |
| | | number of instances | Carrier, and Claire |
| | | in DT model. | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | nodes_per_level | Compute the ratio of | [1] Yonghong Peng, |
| | | number of nodes per | PA Flach, Pavel |
| | | tree level in DT | Brazdil, and Carlos |
| | | model. | Soares. Decision |
| | | | tree-based data |
| | | | characterization |
| | | | for meta-learning. |
| | | | In 2nd ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 111 – 122, |
| | | | 2002a. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | nodes_repeated | Compute the number | [1] Hilan Bensusan, |
| | | of repeated nodes in | Christophe Giraud- |
| | | DT model. | Carrier, and Claire |
| | | | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | tree_depth | Compute the depth of | [1] Yonghong Peng, |
| | | every node in the DT | PA Flach, Pavel |
| | | model. | Brazdil, and Carlos |
| | | | Soares. Decision |
| | | | tree-based data |
| | | | characterization |
| | | | for meta-learning. |
| | | | In 2nd ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 111 – 122, |
| | | | 2002a. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | tree_imbalance | Compute the tree | [1] Hilan Bensusan, |
| | | imbalance for each | Christophe Giraud- |
| | | leaf node. | Carrier, and Claire |
| | | | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | tree_shape | Compute the tree | [1] Hilan Bensusan, |
| | | shape for every leaf | Christophe Giraud- |
| | | node. | Carrier, and Claire |
| | | | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| model-based | var_importance | Compute the feature | [1] Hilan Bensusan, |
| | | importance of the DT | Christophe Giraud- |
| | | model for each | Carrier, and Claire |
| | | attribute. | Kennedy. A higher- |
| | | | order approach to |
| | | | meta-learning. In |
| | | | 10th International |
| | | | Conference |
| | | | Inductive Logic |
| | | | Programming (ILP), |
| | | | pages 33 – 42, |
| | | | 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | can_cor | Compute canonical | [1] Alexandros |
| | | correlations of | Kalousis. Algorithm |
| | | data. | Selection via Meta- |
| | | | Learning. PhD |
| | | | thesis, Faculty of |
| | | | Science of the |
| | | | University of |
| | | | Geneva, 2002. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | cor | Compute the absolute | [1] Ciro Castiello, |
| | | value of the | Giovanna |
| | | correlation of | Castellano, and |
| | | distinct dataset | Anna Maria Fanelli. |
| | | column pairs. | Meta-data: |
| | | | Characterization of |
| | | | input features for |
| | | | meta-learning. In |
| | | | 2nd International |
| | | | Conference on |
| | | | Modeling Decisions |
| | | | for Artificial |
| | | | Intelligence |
| | | | (MDAI), pages |
| | | | 457–468, 2005. |
| | | | [2] Matthias Reif, |
| | | | Faisal Shafait, |
| | | | Markus Goldstein, |
| | | | Thomas Breuel, and |
| | | | Andreas Dengel. |
| | | | Automatic |
| | | | classifier |
| | | | selection for non- |
| | | | experts. Pattern |
| | | | Analysis and |
| | | | Applications, |
| | | | 17(1):83–96, 2014. |
| | | | [3] Donald Michie, |
| | | | David J. |
| | | | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | cov | Compute the absolute | [1] Ciro Castiello, |
| | | value of the | Giovanna |
| | | covariance of | Castellano, and |
| | | distinct dataset | Anna Maria Fanelli. |
| | | attribute pairs. | Meta-data: |
| | | | Characterization of |
| | | | input features for |
| | | | meta-learning. In |
| | | | 2nd International |
| | | | Conference on |
| | | | Modeling Decisions |
| | | | for Artificial |
| | | | Intelligence |
| | | | (MDAI), pages |
| | | | 457–468, 2005. |
| | | | [2] Donald Michie, |
| | | | David J. |
| | | | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | eigenvalues | Compute the | [1] Shawkat Ali and |
| | | eigenvalues of | Kate A. Smith. On |
| | | covariance matrix | learning algorithm |
| | | from dataset. | selection for |
| | | | classification. |
| | | | Applied Soft |
| | | | Computing, 6(2):119 |
| | | | – 138, 2006. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | g_mean | Compute the | [1] Shawkat Ali and |
| | | geometric mean of | Kate A. Smith- |
| | | each attribute. | Miles. A meta- |
| | | | learning approach |
| | | | to automatic kernel |
| | | | selection for |
| | | | support vector |
| | | | machines. |
| | | | Neurocomputing, |
| | | | 70(1):173 – 186, |
| | | | 2006. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | gravity | Compute the distance | [1] Shawkat Ali and |
| | | between minority and | Kate A. Smith. On |
| | | majority classes | learning algorithm |
| | | center of mass. | selection for |
| | | | classification. |
| | | | Applied Soft |
| | | | Computing, 6(2):119 |
| | | | – 138, 2006. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | h_mean | Compute the harmonic | [1] Shawkat Ali and |
| | | mean of each | Kate A. Smith- |
| | | attribute. | Miles. A meta- |
| | | | learning approach |
| | | | to automatic kernel |
| | | | selection for |
| | | | support vector |
| | | | machines. |
| | | | Neurocomputing, |
| | | | 70(1):173 – 186, |
| | | | 2006. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | iq_range | Compute the | [1] Shawkat Ali and |
| | | interquartile range | Kate A. Smith- |
| | | (IQR) of each | Miles. A meta- |
| | | attribute. | learning approach |
| | | | to automatic kernel |
| | | | selection for |
| | | | support vector |
| | | | machines. |
| | | | Neurocomputing, |
| | | | 70(1):173 – 186, |
| | | | 2006. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | kurtosis | Compute the kurtosis | [1] Donald Michie, |
| | | of each attribute. | David J. |
| | | | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | lh_trace | Compute the Lawley- | [1] Lawley D. A |
| | | Hotelling trace. | Generalization of |
| | | | Fisher’s z Test. |
| | | | Biometrika. |
| | | | 1938;30(1):180-187. |
| | | | [2] Hotelling H. A |
| | | | generalized T test |
| | | | and measure of |
| | | | multivariate |
| | | | dispersion. In: |
| | | | Neyman J, ed. |
| | | | Proceedings of the |
| | | | Second Berkeley |
| | | | Symposium on |
| | | | Mathematical |
| | | | Statistics and |
| | | | Probability. |
| | | | Berkeley: |
| | | | University of |
| | | | California Press; |
| | | | 1951:23-41. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | mad | Compute the Median | [1] Shawkat Ali and |
| | | Absolute Deviation | Kate A. Smith. On |
| | | (MAD) adjusted by a | learning algorithm |
| | | factor. | selection for |
| | | | classification. |
| | | | Applied Soft |
| | | | Computing, 6(2):119 |
| | | | – 138, 2006. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | max | Compute the maximum | [1] Robert Engels |
| | | value from each | and Christiane |
| | | attribute. | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | mean | Compute the mean | [1] Robert Engels |
| | | value of each | and Christiane |
| | | attribute. | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | median | Compute the median | [1] Robert Engels |
| | | value from each | and Christiane |
| | | attribute. | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | min | Compute the minimum | [1] Robert Engels |
| | | value from each | and Christiane |
| | | attribute. | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | nr_cor_attr | Compute the number | [1] Mostafa A. |
| | | of distinct highly | Salama, Aboul Ella |
| | | correlated pairs of | Hassanien, and |
| | | attributes. | Kenneth Revett. |
| | | | Employment of |
| | | | neural network and |
| | | | rough set in meta- |
| | | | learning. Memetic |
| | | | Computing, 5(3):165 |
| | | | – 177, 2013. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | nr_disc | Compute the number | [1] Guido Lindner |
| | | of canonical | and Rudi Studer. |
| | | correlations between | AST: Support for |
| | | each attribute and | algorithm selection |
| | | class. | with a CBR |
| | | | approach. In |
| | | | European Conference |
| | | | on Principles of |
| | | | Data Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 418 – |
| | | | 423, 1999. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | nr_norm | Compute the number | [1] Christian Kopf, |
| | | of attributes | Charles Taylor, and |
| | | normally distributed | Jorg Keller. Meta- |
| | | based on a given | Analysis: From data |
| | | method. | characterisation |
| | | | for meta-learning |
| | | | to meta-regression. |
| | | | In PKDD Workshop on |
| | | | Data Mining, |
| | | | Decision Support, |
| | | | Meta-Learning and |
| | | | Inductive Logic |
| | | | Programming, pages |
| | | | 15 – 26, 2000. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | nr_outliers | Compute the number | [1] Christian Kopf |
| | | of attributes with | and Ioannis |
| | | at least one outlier | Iglezakis. |
| | | value. | Combination of task |
| | | | description |
| | | | strategies and case |
| | | | base properties for |
| | | | meta-learning. In |
| | | | 2nd ECML/PKDD |
| | | | International |
| | | | Workshop on |
| | | | Integration and |
| | | | Collaboration |
| | | | Aspects of Data |
| | | | Mining, Decision |
| | | | Support and Meta- |
| | | | Learning (IDDM), |
| | | | pages 65 – 76, |
| | | | 2002. |
| | | | [2] Peter J. |
| | | | Rousseeuw and Mia |
| | | | Hubert. Robust |
| | | | statistics for |
| | | | outlier detection. |
| | | | Wiley |
| | | | Interdisciplinary |
| | | | Reviews: Data |
| | | | Mining and |
| | | | Knowledge |
| | | | Discovery, 1(1):73 |
| | | | – 79, 2011. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | p_trace | Compute the Pillai's | [1] Pillai K.C.S |
| | | trace. | (1955). Some New |
| | | | test criteria in |
| | | | multivariate |
| | | | analysis. Ann Math |
| | | | Stat: 26(1):117–21. |
| | | | [2] Seber, G.A.F. |
| | | | (1984). |
| | | | Multivariate |
| | | | Observations. New |
| | | | York: John Wiley |
| | | | and Sons. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | range | Compute the range | [1] Shawkat Ali and |
| | | (max - min) of each | Kate A. Smith- |
| | | attribute. | Miles. A meta- |
| | | | learning approach |
| | | | to automatic kernel |
| | | | selection for |
| | | | support vector |
| | | | machines. |
| | | | Neurocomputing, |
| | | | 70(1):173 – 186, |
| | | | 2006. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | roy_root | Compute the Roy's | [1] Roy SN. On a |
| | | largest root. | Heuristic Method of |
| | | | Test Construction |
| | | | and its use in |
| | | | Multivariate |
| | | | Analysis. Ann Math |
| | | | Stat. |
| | | | 1953;24(2):220-238. |
| | | | [2] A note on Roy's |
| | | | largest root. |
| | | | Kuhfeld, W.F. |
| | | | Psychometrika |
| | | | (1986) 51: 479. |
| | | | https://doi.org/ |
| | | | 10.1007/BF02294069 |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | sd | Compute the standard | [1] Robert Engels |
| | | deviation of each | and Christiane |
| | | attribute. | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | sd_ratio | Compute a | [1] Donald Michie, |
| | | statistical test for | David J. |
| | | homogeneity of | Spiegelhalter, |
| | | covariances. | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | skewness | Compute the skewness | [1] Donald Michie, |
| | | for each attribute. | David J. |
| | | | Spiegelhalter, |
| | | | Charles C. Taylor, |
| | | | and John Campbell. |
| | | | Machine Learning, |
| | | | Neural and |
| | | | Statistical |
| | | | Classification, |
| | | | volume 37. Ellis |
| | | | Horwood Upper |
| | | | Saddle River, 1994. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | sparsity | Compute (possibly | [1] Mostafa A. |
| | | normalized) sparsity | Salama, Aboul Ella |
| | | metric for each | Hassanien, and |
| | | attribute. | Kenneth Revett. |
| | | | Employment of |
| | | | neural network and |
| | | | rough set in meta- |
| | | | learning. Memetic |
| | | | Computing, 5(3):165 |
| | | | – 177, 2013. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | t_mean | Compute the trimmed | [1] Robert Engels |
| | | mean of each | and Christiane |
| | | attribute. | Theusinger. Using a |
| | | | data metric for |
| | | | preprocessing |
| | | | advice for data |
| | | | mining |
| | | | applications. In |
| | | | 13th European |
| | | | Conference on |
| | | | Artificial |
| | | | Intelligence |
| | | | (ECAI), pages 430 – |
| | | | 434, 1998. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | var | Compute the variance | [1] Ciro Castiello, |
| | | of each attribute. | Giovanna |
| | | | Castellano, and |
| | | | Anna Maria Fanelli. |
| | | | Meta-data: |
| | | | Characterization of |
| | | | input features for |
| | | | meta-learning. In |
| | | | 2nd International |
| | | | Conference on |
| | | | Modeling Decisions |
| | | | for Artificial |
| | | | Intelligence |
| | | | (MDAI), pages |
| | | | 457–468, 2005. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
| statistical | w_lambda | Compute the Wilks' | [1] Guido Lindner |
| | | Lambda value. | and Rudi Studer. |
| | | | AST: Support for |
| | | | algorithm selection |
| | | | with a CBR |
| | | | approach. In |
| | | | European Conference |
| | | | on Principles of |
| | | | Data Mining and |
| | | | Knowledge Discovery |
| | | | (PKDD), pages 418 – |
| | | | 423, 1999. |
| | | | |
+-------------+-------------------+----------------------+---------------------+
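Any meta-feature listed in the table can also be requested by name. The snippet below is a minimal sketch: it assumes the MFE constructor accepts a features list containing names from the "Meta-feature name" column and that extract returns the meta-feature names and values as two parallel lists.
# Extract only a few meta-features chosen from the table above
# (illustrative selection; pick any names from the "Meta-feature name" column)
from sklearn.datasets import load_wine
from pymfe.mfe import MFE

data = load_wine()
y = data.target
X = data.data

# The `features` argument restricts extraction to the listed meta-features
mfe = MFE(features=["nr_inst", "nr_attr", "kurtosis", "tree_depth"])
mfe.fit(X, y)
names, values = mfe.extract()

# Print each extracted meta-feature name next to its value
for name, value in zip(names, values):
    print(name, value)
Note that meta-features summarized by more than one function may appear with the summary name appended (for example, kurtosis.mean and kurtosis.sd when the default summary functions are used).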
You can also get the table instead of printing it.
MFE.metafeature_description(print_table=False)
([['Group', 'Meta-feature name', 'Description'], ['clustering', 'ch', 'Compute the Calinski and Harabasz index.'], ['clustering', 'int', 'Compute the INT index.'], ['clustering', 'nre', 'Compute the normalized relative entropy.'], ['clustering', 'pb', 'Compute the pearson correlation between class matching and instance distances.'], ['clustering', 'sc', 'Compute the number of clusters with size smaller than a given size.'], ['clustering', 'sil', 'Compute the mean silhouette value.'], ['clustering', 'vdb', 'Compute the Davies and Bouldin Index.'], ['clustering', 'vdu', 'Compute the Dunn Index.'], ['complexity', 'c1', 'Compute the entropy of class proportions.'], ['complexity', 'c2', 'Compute the imbalance ratio.'], ['complexity', 'cls_coef', 'Clustering coefficient.'], ['complexity', 'density', 'Average density of the network.'], ['complexity', 'f1', "Maximum Fisher's discriminant ratio."], ['complexity', 'f1v', "Directional-vector maximum Fisher's discriminant ratio."], ['complexity', 'f2', 'Volume of the overlapping region.'], ['complexity', 'f3', 'Compute feature maximum individual efficiency.'], ['complexity', 'f4', 'Compute the collective feature efficiency.'], ['complexity', 'hubs', 'Hub score.'], ['complexity', 'l1', 'Sum of error distance by linear programming.'], ['complexity', 'l2', 'Compute the OVO subsets error rate of linear classifier.'], ['complexity', 'l3', 'Non-Linearity of a linear classifier.'], ['complexity', 'lsc', 'Local set average cardinality.'], ['complexity', 'n1', 'Compute the fraction of borderline points.'], ['complexity', 'n2', 'Ratio of intra and extra class nearest neighbor distance.'], ['complexity', 'n3', 'Error rate of the nearest neighbor classifier.'], ['complexity', 'n4', 'Compute the non-linearity of the k-NN Classifier.'], ['complexity', 't1', 'Fraction of hyperspheres covering data.'], ['complexity', 't2', 'Compute the average number of features per dimension.'], ['complexity', 't3', 'Compute the average number of PCA dimensions per points.'], ['complexity', 't4', 'Compute the ratio of the PCA dimension to the original dimension.'], ['concept', 'cohesiveness', 'Compute the improved version of the weighted distance, that captures how dense or sparse is the example distribution.'], ['concept', 'conceptvar', 'Compute the concept variation that estimates the variability of class labels among examples.'], ['concept', 'impconceptvar', 'Compute the improved concept variation that estimates the variability of class labels among examples.'], ['concept', 'wg_dist', 'Compute the weighted distance, that captures how dense or sparse is the example distribution.'], ['info-theory', 'attr_conc', 'Compute concentration coef. 
of each pair of distinct attributes.'], ['info-theory', 'attr_ent', "Compute Shannon's entropy for each predictive attribute."], ['info-theory', 'class_conc', 'Compute concentration coefficient between each attribute and class.'], ['info-theory', 'class_ent', "Compute target attribute Shannon's entropy."], ['info-theory', 'eq_num_attr', 'Compute the number of attributes equivalent for a predictive task.'], ['info-theory', 'joint_ent', 'Compute the joint entropy between each attribute and class.'], ['info-theory', 'mut_inf', 'Compute the mutual information between each attribute and target.'], ['info-theory', 'ns_ratio', 'Compute the noisiness of attributes.'], ['landmarking', 'best_node', 'Performance of a the best single decision tree node.'], ['landmarking', 'elite_nn', 'Performance of Elite Nearest Neighbor.'], ['landmarking', 'linear_discr', 'Performance of the Linear Discriminant classifier.'], ['landmarking', 'naive_bayes', 'Performance of the Naive Bayes classifier.'], ['landmarking', 'one_nn', 'Performance of the 1-Nearest Neighbor classifier.'], ['landmarking', 'random_node', 'Performance of the single decision tree node model induced by a random attribute.'], ['landmarking', 'worst_node', 'Performance of the single decision tree node model induced by the worst informative attribute.'], ['general', 'attr_to_inst', 'Compute the ratio between the number of attributes.'], ['general', 'cat_to_num', 'Compute the ratio between the number of categoric and numeric features.'], ['general', 'freq_class', 'Compute the relative frequency of each distinct class.'], ['general', 'inst_to_attr', 'Compute the ratio between the number of instances and attributes.'], ['general', 'nr_attr', 'Compute the total number of attributes.'], ['general', 'nr_bin', 'Compute the number of binary attributes.'], ['general', 'nr_cat', 'Compute the number of categorical attributes.'], ['general', 'nr_class', 'Compute the number of distinct classes.'], ['general', 'nr_inst', 'Compute the number of instances (rows) in the dataset.'], ['general', 'nr_num', 'Compute the number of numeric features.'], ['general', 'num_to_cat', 'Compute the number of numerical and categorical features.'], ['statistical', 'can_cor', 'Compute canonical correlations of data.'], ['statistical', 'cor', 'Compute the absolute value of the correlation of distinct dataset column pairs.'], ['statistical', 'cov', 'Compute the absolute value of the covariance of distinct dataset attribute pairs.'], ['statistical', 'eigenvalues', 'Compute the eigenvalues of covariance matrix from dataset.'], ['statistical', 'g_mean', 'Compute the geometric mean of each attribute.'], ['statistical', 'gravity', 'Compute the distance between minority and majority classes center of mass.'], ['statistical', 'h_mean', 'Compute the harmonic mean of each attribute.'], ['statistical', 'iq_range', 'Compute the interquartile range (IQR) of each attribute.'], ['statistical', 'kurtosis', 'Compute the kurtosis of each attribute.'], ['statistical', 'lh_trace', 'Compute the Lawley-Hotelling trace.'], ['statistical', 'mad', 'Compute the Median Absolute Deviation (MAD) adjusted by a factor.'], ['statistical', 'max', 'Compute the maximum value from each attribute.'], ['statistical', 'mean', 'Compute the mean value of each attribute.'], ['statistical', 'median', 'Compute the median value from each attribute.'], ['statistical', 'min', 'Compute the minimum value from each attribute.'], ['statistical', 'nr_cor_attr', 'Compute the number of distinct highly correlated pair of attributes.'], 
['statistical', 'nr_disc', 'Compute the number of canonical correlation between each attribute and class.'], ['statistical', 'nr_norm', 'Compute the number of attributes normally distributed based in a given method.'], ['statistical', 'nr_outliers', 'Compute the number of attributes with at least one outlier value.'], ['statistical', 'p_trace', "Compute the Pillai's trace."], ['statistical', 'range', 'Compute the range (max - min) of each attribute.'], ['statistical', 'roy_root', "Compute the Roy's largest root."], ['statistical', 'sd', 'Compute the standard deviation of each attribute.'], ['statistical', 'sd_ratio', 'Compute a statistical test for homogeneity of covariances.'], ['statistical', 'skewness', 'Compute the skewness for each attribute.'], ['statistical', 'sparsity', 'Compute (possibly normalized) sparsity metric for each attribute.'], ['statistical', 't_mean', 'Compute the trimmed mean of each attribute.'], ['statistical', 'var', 'Compute the variance of each attribute.'], ['statistical', 'w_lambda', "Compute the Wilks' Lambda value."], ['model-based', 'leaves', 'Compute the number of leaf nodes in the DT model.'], ['model-based', 'leaves_branch', 'Compute the size of branches in the DT model.'], ['model-based', 'leaves_corrob', 'Compute the leaves corroboration of the DT model.'], ['model-based', 'leaves_homo', 'Compute the DT model Homogeneity for every leaf node.'], ['model-based', 'leaves_per_class', 'Compute the proportion of leaves per class in DT model.'], ['model-based', 'nodes', 'Compute the number of non-leaf nodes in DT model.'], ['model-based', 'nodes_per_attr', 'Compute the ratio of nodes per number of attributes in DT model.'], ['model-based', 'nodes_per_inst', 'Compute the ratio of non-leaf nodes per number of instances in DT model.'], ['model-based', 'nodes_per_level', 'Compute the ratio of number of nodes per tree level in DT model.'], ['model-based', 'nodes_repeated', 'Compute the number of repeated nodes in DT model.'], ['model-based', 'tree_depth', 'Compute the depth of every node in the DT model.'], ['model-based', 'tree_imbalance', 'Compute the tree imbalance for each leaf node.'], ['model-based', 'tree_shape', 'Compute the tree shape for every leaf node.'], ['model-based', 'var_importance', 'Compute the features importance of the DT model for each attribute.'], ['itemset', 'one_itemset', 'Compute the one itemset meta-feature.'], ['itemset', 'two_itemset', 'Compute the two itemset meta-feature.']], "+-------------+-------------------+--------------------------------------------+\n| Group | Meta-feature name | Description |\n+=============+===================+============================================+\n| clustering | ch | Compute the Calinski and Harabasz index. |\n+-------------+-------------------+--------------------------------------------+\n| clustering | int | Compute the INT index. |\n+-------------+-------------------+--------------------------------------------+\n| clustering | nre | Compute the normalized relative entropy. |\n+-------------+-------------------+--------------------------------------------+\n| clustering | pb | Compute the pearson correlation between |\n| | | class matching and instance distances. |\n+-------------+-------------------+--------------------------------------------+\n| clustering | sc | Compute the number of clusters with size |\n| | | smaller than a given size. |\n+-------------+-------------------+--------------------------------------------+\n| clustering | sil | Compute the mean silhouette value. 
Group | Meta-feature name | Description
clustering | vdb | Compute the Davies and Bouldin Index.
clustering | vdu | Compute the Dunn Index.
complexity | c1 | Compute the entropy of class proportions.
complexity | c2 | Compute the imbalance ratio.
complexity | cls_coef | Compute the clustering coefficient.
complexity | density | Compute the average density of the network.
complexity | f1 | Compute the maximum Fisher's discriminant ratio.
complexity | f1v | Compute the directional-vector maximum Fisher's discriminant ratio.
complexity | f2 | Compute the volume of the overlapping region.
complexity | f3 | Compute each feature's maximum individual efficiency.
complexity | f4 | Compute the collective feature efficiency.
complexity | hubs | Compute the hub score.
complexity | l1 | Compute the sum of error distance by linear programming.
complexity | l2 | Compute the OVO subsets error rate of a linear classifier.
complexity | l3 | Compute the non-linearity of a linear classifier.
complexity | lsc | Compute the local set average cardinality.
complexity | n1 | Compute the fraction of borderline points.
complexity | n2 | Compute the ratio of intra- and extra-class nearest neighbor distance.
complexity | n3 | Compute the error rate of the nearest neighbor classifier.
complexity | n4 | Compute the non-linearity of the k-NN classifier.
complexity | t1 | Compute the fraction of hyperspheres covering the data.
complexity | t2 | Compute the average number of features per dimension.
complexity | t3 | Compute the average number of PCA dimensions per point.
complexity | t4 | Compute the ratio of the PCA dimension to the original dimension.
concept | cohesiveness | Compute the improved version of the weighted distance, which captures how dense or sparse the example distribution is.
concept | conceptvar | Compute the concept variation, which estimates the variability of class labels among examples.
concept | impconceptvar | Compute the improved concept variation, which estimates the variability of class labels among examples.
concept | wg_dist | Compute the weighted distance, which captures how dense or sparse the example distribution is.
info-theory | attr_conc | Compute the concentration coefficient of each pair of distinct attributes.
info-theory | attr_ent | Compute Shannon's entropy for each predictive attribute.
info-theory | class_conc | Compute the concentration coefficient between each attribute and the class.
info-theory | class_ent | Compute the target attribute's Shannon entropy.
info-theory | eq_num_attr | Compute the number of attributes equivalent for a predictive task.
info-theory | joint_ent | Compute the joint entropy between each attribute and the class.
info-theory | mut_inf | Compute the mutual information between each attribute and the target.
info-theory | ns_ratio | Compute the noisiness of attributes.
landmarking | best_node | Performance of the best single decision tree node.
landmarking | elite_nn | Performance of the Elite Nearest Neighbor.
landmarking | linear_discr | Performance of the Linear Discriminant classifier.
landmarking | naive_bayes | Performance of the Naive Bayes classifier.
landmarking | one_nn | Performance of the 1-Nearest Neighbor classifier.
landmarking | random_node | Performance of the single decision tree node model induced by a random attribute.
landmarking | worst_node | Performance of the single decision tree node model induced by the worst informative attribute.
general | attr_to_inst | Compute the ratio between the number of attributes and instances.
general | cat_to_num | Compute the ratio between the number of categorical and numerical features.
general | freq_class | Compute the relative frequency of each distinct class.
general | inst_to_attr | Compute the ratio between the number of instances and attributes.
general | nr_attr | Compute the total number of attributes.
general | nr_bin | Compute the number of binary attributes.
general | nr_cat | Compute the number of categorical attributes.
general | nr_class | Compute the number of distinct classes.
general | nr_inst | Compute the number of instances (rows) in the dataset.
general | nr_num | Compute the number of numeric features.
general | num_to_cat | Compute the ratio between the number of numerical and categorical features.
statistical | can_cor | Compute the canonical correlations of the data.
statistical | cor | Compute the absolute value of the correlation of distinct dataset column pairs.
statistical | cov | Compute the absolute value of the covariance of distinct dataset attribute pairs.
statistical | eigenvalues | Compute the eigenvalues of the dataset covariance matrix.
statistical | g_mean | Compute the geometric mean of each attribute.
statistical | gravity | Compute the distance between the minority and majority classes' centers of mass.
statistical | h_mean | Compute the harmonic mean of each attribute.
statistical | iq_range | Compute the interquartile range (IQR) of each attribute.
statistical | kurtosis | Compute the kurtosis of each attribute.
statistical | lh_trace | Compute the Lawley-Hotelling trace.
statistical | mad | Compute the Median Absolute Deviation (MAD) adjusted by a factor.
statistical | max | Compute the maximum value of each attribute.
statistical | mean | Compute the mean value of each attribute.
statistical | median | Compute the median value of each attribute.
statistical | min | Compute the minimum value of each attribute.
statistical | nr_cor_attr | Compute the number of distinct highly correlated pairs of attributes.
statistical | nr_disc | Compute the number of canonical correlations between each attribute and the class.
statistical | nr_norm | Compute the number of normally distributed attributes, based on a given method.
statistical | nr_outliers | Compute the number of attributes with at least one outlier value.
statistical | p_trace | Compute Pillai's trace.
statistical | range | Compute the range (max - min) of each attribute.
statistical | roy_root | Compute Roy's largest root.
statistical | sd | Compute the standard deviation of each attribute.
statistical | sd_ratio | Compute a statistical test for homogeneity of covariances.
statistical | skewness | Compute the skewness of each attribute.
statistical | sparsity | Compute the (possibly normalized) sparsity metric for each attribute.
statistical | t_mean | Compute the trimmed mean of each attribute.
statistical | var | Compute the variance of each attribute.
statistical | w_lambda | Compute the Wilks' Lambda value.
model-based | leaves | Compute the number of leaf nodes in the DT model.
model-based | leaves_branch | Compute the size of branches in the DT model.
model-based | leaves_corrob | Compute the leaves corroboration of the DT model.
model-based | leaves_homo | Compute the DT model homogeneity for every leaf node.
model-based | leaves_per_class | Compute the proportion of leaves per class in the DT model.
model-based | nodes | Compute the number of non-leaf nodes in the DT model.
model-based | nodes_per_attr | Compute the ratio of nodes per number of attributes in the DT model.
model-based | nodes_per_inst | Compute the ratio of non-leaf nodes per number of instances in the DT model.
model-based | nodes_per_level | Compute the ratio of the number of nodes per tree level in the DT model.
model-based | nodes_repeated | Compute the number of repeated nodes in the DT model.
model-based | tree_depth | Compute the depth of every node in the DT model.
model-based | tree_imbalance | Compute the tree imbalance for each leaf node.
model-based | tree_shape | Compute the tree shape for every leaf node.
model-based | var_importance | Compute the feature importance of the DT model for each attribute.
itemset | one_itemset | Compute the one-itemset meta-feature.
itemset | two_itemset | Compute the two-itemset meta-feature.
Total running time of the script: ( 0 minutes 0.165 seconds)
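Any meta-feature listed in the table above can also be extracted individually by passing its name to the features parameter of MFE. The following minimal sketch is our own illustration (not part of the original example), using names taken from the default groups:
from sklearn.datasets import load_iris
from pymfe.mfe import MFE

data = load_iris()
mfe = MFE(features=["nr_class", "attr_ent"])
mfe.fit(data.data, data.target)
print(mfe.extract())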
Listing available metafeatures, groups, and summaries
In this example, we will show you how to list the types of metafeatures, groups, and summaries available.
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
Print all available metafeature groups from the PyMFE package.
model = MFE()
model_groups = model.valid_groups()
print(model_groups)
('landmarking', 'general', 'statistical', 'model-based', 'info-theory', 'relative', 'clustering', 'complexity', 'itemset', 'concept')
Actually, there’s no need to instantiate a model for that
model_groups = MFE.valid_groups()
print(model_groups)
('landmarking', 'general', 'statistical', 'model-based', 'info-theory', 'relative', 'clustering', 'complexity', 'itemset', 'concept')
Print all available metafeatures from some groups of the PyMFE package. If no parameter is given (or it is None), then all available metafeatures will be returned.
model = MFE()
mtfs_all = model.valid_metafeatures()
print(mtfs_all)
('ch', 'int', 'nre', 'pb', 'sc', 'sil', 'vdb', 'vdu', 'c1', 'c2', 'cls_coef', 'density', 'f1', 'f1v', 'f2', 'f3', 'f4', 'hubs', 'l1', 'l2', 'l3', 'lsc', 'n1', 'n2', 'n3', 'n4', 't1', 't2', 't3', 't4', 'cohesiveness', 'conceptvar', 'impconceptvar', 'wg_dist', 'attr_conc', 'attr_ent', 'class_conc', 'class_ent', 'eq_num_attr', 'joint_ent', 'mut_inf', 'ns_ratio', 'best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node', 'attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'can_cor', 'cor', 'cov', 'eigenvalues', 'g_mean', 'gravity', 'h_mean', 'iq_range', 'kurtosis', 'lh_trace', 'mad', 'max', 'mean', 'median', 'min', 'nr_cor_attr', 'nr_disc', 'nr_norm', 'nr_outliers', 'p_trace', 'range', 'roy_root', 'sd', 'sd_ratio', 'skewness', 'sparsity', 't_mean', 'var', 'w_lambda', 'leaves', 'leaves_branch', 'leaves_corrob', 'leaves_homo', 'leaves_per_class', 'nodes', 'nodes_per_attr', 'nodes_per_inst', 'nodes_per_level', 'nodes_repeated', 'tree_depth', 'tree_imbalance', 'tree_shape', 'var_importance', 'one_itemset', 'two_itemset')
Again, there’s no need to instantiate a model to invoke this method
mtfs_all = MFE.valid_metafeatures()
print(mtfs_all)
('ch', 'int', 'nre', 'pb', 'sc', 'sil', 'vdb', 'vdu', 'c1', 'c2', 'cls_coef', 'density', 'f1', 'f1v', 'f2', 'f3', 'f4', 'hubs', 'l1', 'l2', 'l3', 'lsc', 'n1', 'n2', 'n3', 'n4', 't1', 't2', 't3', 't4', 'cohesiveness', 'conceptvar', 'impconceptvar', 'wg_dist', 'attr_conc', 'attr_ent', 'class_conc', 'class_ent', 'eq_num_attr', 'joint_ent', 'mut_inf', 'ns_ratio', 'best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node', 'attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'can_cor', 'cor', 'cov', 'eigenvalues', 'g_mean', 'gravity', 'h_mean', 'iq_range', 'kurtosis', 'lh_trace', 'mad', 'max', 'mean', 'median', 'min', 'nr_cor_attr', 'nr_disc', 'nr_norm', 'nr_outliers', 'p_trace', 'range', 'roy_root', 'sd', 'sd_ratio', 'skewness', 'sparsity', 't_mean', 'var', 'w_lambda', 'leaves', 'leaves_branch', 'leaves_corrob', 'leaves_homo', 'leaves_per_class', 'nodes', 'nodes_per_attr', 'nodes_per_inst', 'nodes_per_level', 'nodes_repeated', 'tree_depth', 'tree_imbalance', 'tree_shape', 'var_importance', 'one_itemset', 'two_itemset')
You can specify a group name or a collection of group names to list only their corresponding metafeatures.
mtfs_landmarking = MFE.valid_metafeatures(groups="landmarking")
print(mtfs_landmarking)
mtfs_subset = MFE.valid_metafeatures(groups=["general", "relative"])
print(mtfs_subset)
('best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node')
('attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node')
Print all available summary functions from the PyMFE package
model = MFE()
summaries = model.valid_summary()
print(summaries)
('mean', 'nanmean', 'sd', 'nansd', 'var', 'nanvar', 'count', 'nancount', 'histogram', 'nanhistogram', 'iq_range', 'naniq_range', 'kurtosis', 'nankurtosis', 'max', 'nanmax', 'median', 'nanmedian', 'min', 'nanmin', 'quantiles', 'nanquantiles', 'range', 'nanrange', 'skewness', 'nanskewness', 'sum', 'nansum', 'powersum', 'pnorm', 'nanpowersum', 'nanpnorm')
Once again, there’s no need to instantiate a model to accomplish this
summaries = MFE.valid_summary()
print(summaries)
('mean', 'nanmean', 'sd', 'nansd', 'var', 'nanvar', 'count', 'nancount', 'histogram', 'nanhistogram', 'iq_range', 'naniq_range', 'kurtosis', 'nankurtosis', 'max', 'nanmax', 'median', 'nanmedian', 'min', 'nanmin', 'quantiles', 'nanquantiles', 'range', 'nanrange', 'skewness', 'nanskewness', 'sum', 'nansum', 'powersum', 'pnorm', 'nanpowersum', 'nanpnorm')
Total running time of the script: ( 0 minutes 0.012 seconds)
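The summary names listed above can be passed to the summary parameter of MFE to control how multi-valued meta-features are aggregated. A minimal sketch of our own (not part of the original example):
from sklearn.datasets import load_iris
from pymfe.mfe import MFE

data = load_iris()
mfe = MFE(summary=["median", "max"])
mfe.fit(data.data, data.target)
print(mfe.extract())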
Working with the results
In this example, we will show you how to work with the results of metafeature extraction.
from sklearn.datasets import load_iris
from pymfe.mfe import MFE
data = load_iris()
y = data.target
X = data.data
Parsing a subset of metafeatures
After extracting metafeatures, parse a subset of interest from the results.
model = MFE(groups=["relative", "general", "model-based"], measure_time="avg")
model.fit(X, y)
ft = model.extract()
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_to_inst 0.02666666666666667
best_node.mean.relative 3.0
best_node.sd.relative 1.0
cat_to_num 0.0
elite_nn.mean.relative 4.0
elite_nn.sd.relative 6.0
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
inst_to_attr 37.5
leaves 9
leaves_branch.mean 3.7777777777777777
leaves_branch.sd 1.2018504251546631
leaves_corrob.mean 0.1111111111111111
leaves_corrob.sd 0.15051762539834182
leaves_homo.mean 37.46666666666667
leaves_homo.sd 13.142298124757328
leaves_per_class.mean 0.3333333333333333
leaves_per_class.sd 0.22222222222222224
linear_discr.mean.relative 7.0
linear_discr.sd.relative 2.5
naive_bayes.mean.relative 5.0
naive_bayes.sd.relative 2.5
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.mean 1.6
nodes_per_level.sd 0.8944271909999159
nodes_repeated.mean 2.6666666666666665
nodes_repeated.sd 0.5773502691896258
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_inst 150
nr_num 4
num_to_cat nan
one_nn.mean.relative 6.0
one_nn.sd.relative 5.0
random_node.mean.relative 2.0
random_node.sd.relative 4.0
tree_depth.mean 3.0588235294117645
tree_depth.sd 1.4348601079588785
tree_imbalance.mean 0.19491705385114738
tree_imbalance.sd 0.13300709991513865
tree_shape.mean 0.2708333333333333
tree_shape.sd 0.10711960313126631
var_importance.mean 0.25
var_importance.sd 0.27845186989521703
worst_node.mean.relative 1.0
worst_node.sd.relative 7.0
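Since measure_time="avg" was given, extract() also returns the elapsed time of each measure. Assuming the same return layout used in the elapsed-time example later in this document (where the third element holds the times), you can unpack them as in this short sketch:
names, values, times = ft
print("\n".join("{:50} {:30}".format(n, t) for n, t in zip(names, times)))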
From the extract output, parse only the ‘general’ metafeatures
ft_general = model.parse_by_group("general", ft)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft_general[0],
ft_general[1])))
attr_to_inst 0.02666666666666667
cat_to_num 0.0
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
inst_to_attr 37.5
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_inst 150
nr_num 4
num_to_cat nan
You can also parse by several groups at once. In this case, the selected metafeatures must belong to at least one of the given groups.
ft_subset = model.parse_by_group(["general", "model-based"], ft)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft_subset[0],
ft_subset[1])))
attr_to_inst 0.02666666666666667
cat_to_num 0.0
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
inst_to_attr 37.5
leaves 9
leaves_branch.mean 3.7777777777777777
leaves_branch.sd 1.2018504251546631
leaves_corrob.mean 0.1111111111111111
leaves_corrob.sd 0.15051762539834182
leaves_homo.mean 37.46666666666667
leaves_homo.sd 13.142298124757328
leaves_per_class.mean 0.3333333333333333
leaves_per_class.sd 0.22222222222222224
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.mean 1.6
nodes_per_level.sd 0.8944271909999159
nodes_repeated.mean 2.6666666666666665
nodes_repeated.sd 0.5773502691896258
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_inst 150
nr_num 4
num_to_cat nan
tree_depth.mean 3.0588235294117645
tree_depth.sd 1.4348601079588785
tree_imbalance.mean 0.19491705385114738
tree_imbalance.sd 0.13300709991513865
tree_shape.mean 0.2708333333333333
tree_shape.sd 0.10711960313126631
var_importance.mean 0.25
var_importance.sd 0.27845186989521703
This may be an uncommon scenario, given that the user usually has already instantiated some MFE model to extract the metafeatures, but there is actually no need to instantiate an MFE model just to parse the results.
ft_subset = MFE.parse_by_group(["general", "model-based"], ft)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft_subset[0],
ft_subset[1])))
attr_to_inst 0.02666666666666667
cat_to_num 0.0
freq_class.mean 0.3333333333333333
freq_class.sd 0.0
inst_to_attr 37.5
leaves 9
leaves_branch.mean 3.7777777777777777
leaves_branch.sd 1.2018504251546631
leaves_corrob.mean 0.1111111111111111
leaves_corrob.sd 0.15051762539834182
leaves_homo.mean 37.46666666666667
leaves_homo.sd 13.142298124757328
leaves_per_class.mean 0.3333333333333333
leaves_per_class.sd 0.22222222222222224
nodes 8
nodes_per_attr 2.0
nodes_per_inst 0.05333333333333334
nodes_per_level.mean 1.6
nodes_per_level.sd 0.8944271909999159
nodes_repeated.mean 2.6666666666666665
nodes_repeated.sd 0.5773502691896258
nr_attr 4
nr_bin 0
nr_cat 0
nr_class 3
nr_inst 150
nr_num 4
num_to_cat nan
tree_depth.mean 3.0588235294117645
tree_depth.sd 1.4348601079588785
tree_imbalance.mean 0.19491705385114738
tree_imbalance.sd 0.13300709991513865
tree_shape.mean 0.2708333333333333
tree_shape.sd 0.10711960313126631
var_importance.mean 0.25
var_importance.sd 0.27845186989521703
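If you prefer a tabular view, the parsed names and values can also be placed into a pandas DataFrame. This is a convenience sketch of our own; pandas is not part of the parsing API:
import pandas as pd

names, values = ft_subset[0], ft_subset[1]
df = pd.DataFrame({"value": values}, index=names)
print(df)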
Total running time of the script: ( 0 minutes 0.088 seconds)
Plotting elapsed time in a meta-feature extraction
In this example, we will show you how the default value of max_attr_num for the meta-feature attr_conc was defined, based on the total elapsed time on the Iris dataset.

0. Number of attributes: 4 ...
1. Number of attributes: 8 ...
2. Number of attributes: 12 ...
3. Number of attributes: 16 ...
4. Number of attributes: 20 ...
5. Number of attributes: 24 ...
6. Number of attributes: 28 ...
7. Number of attributes: 32 ...
8. Number of attributes: 36 ...
9. Number of attributes: 40 ...
# Load a dataset
from sklearn.datasets import load_iris
import numpy as np
import pymfe.mfe
import matplotlib.pyplot as plt
iris = load_iris()
# Added a default value for `max_attr_num` parameter of the `attr_conc`
# meta-feature extraction method, which is the most expensive meta-feature
# extraction method by far.
# The default parameter was determined by a simple inspection at the feature
# extraction time growing rate to the number of attributes on the fitted data.
# The threshold accepted for the time extraction is a value less than 2
# seconds.
# The test dataset was the iris dataset. The test code used is reproduced
# below.
np.random.seed(0)
arrsize = np.zeros(10)
time = np.zeros(10)
X = np.empty((iris.target.size, 0))
for i in np.arange(10):
X = np.hstack((X, iris.data))
print(f"{i}. Number of attributes: {X.shape[1]} ...")
model = pymfe.mfe.MFE(features="attr_conc",
summary="mean",
measure_time="total").fit(X)
res = model.extract(suppress_warnings=True)
arrsize[i] = model._custom_args_ft["C"].shape[1]
time[i] = res[2][0]
plt.plot(arrsize, time, label="time elapsed")
plt.hlines(y=np.arange(1, 1 + int(np.ceil(np.max(time)))),
xmin=0,
xmax=arrsize[-1],
linestyle="dotted",
color="red")
plt.legend()
plt.show()
# The time cost of extraction for the attr_conc meta-feature does not grow
# significantly with the number of instances and, hence, it is not necessary
# to sample along the instance axis.
Total running time of the script: ( 0 minutes 9.215 seconds)
Using Pandas, CSV and ARFF files
In this example we will show you how to use Pandas, CSV and ARFF in PyMFE.
# Necessary imports
import pandas as pd
import numpy as np
from numpy import genfromtxt
from pymfe.mfe import MFE
import csv
import arff
Pandas
Generating synthetic dataset
np.random.seed(42)
sample_size = 150
numeric = pd.DataFrame({
'num1': np.random.randint(0, 100, size=sample_size),
'num2': np.random.randint(0, 100, size=sample_size)
})
categoric = pd.DataFrame({
'cat1': np.repeat(('cat1-1', 'cat1-2'), sample_size // 2),
'cat2': np.repeat(('cat2-1', 'cat2-2', 'cat2-3'), sample_size // 3)
})
X = numeric.join(categoric)
y = pd.Series(np.repeat(['C1', 'C2'], sample_size // 2))
Exploring characteristics of the data
print("X shape --> ", X.shape)
print("y shape --> ", y.shape)
print("classes --> ", np.unique(y.values))
print("X dtypes --> \n", X.dtypes)
print("y dtypes --> ", y.dtypes)
X shape --> (150, 4)
y shape --> (150,)
classes --> ['C1' 'C2']
X dtypes -->
num1 int64
num2 int64
cat1 object
cat2 object
dtype: object
y dtypes --> object
To extract meta-features, you should pass X and y as sequences, such as NumPy arrays or Python lists. It is easy to do this using pandas:
mfe = MFE(
groups=["general", "statistical", "info-theory"],
random_state=42
)
mfe.fit(X.values, y.values)
ft = mfe.extract(cat_cols='auto', suppress_warnings=True)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.09803782687811123
attr_conc.sd 0.20092780162502177
attr_ent.mean 1.806723506674862
attr_ent.sd 0.6400186923915645
attr_to_inst 0.02666666666666667
can_cor.mean 0.9999999999999939
can_cor.sd nan
cat_to_num 1.0
class_conc.mean 0.3367786962694272
class_conc.sd 0.46816436703211783
class_ent 1.0
cor.mean 0.21555756110154345
cor.sd 0.3346883517496457
cov.mean 1.1208620432513037
cov.sd 2.935012658229703
eigenvalues.mean 302.5088590604026
eigenvalues.sd 468.34378838676076
eq_num_attr 2.342391810715466
freq_class.mean 0.5
freq_class.sd 0.0
g_mean.mean nan
g_mean.sd nan
gravity 2.526235671244205
h_mean.mean 0.0
h_mean.sd 0.0
inst_to_attr 37.5
iq_range.mean 17.875
iq_range.sd 26.169519483551852
joint_ent.mean 2.3798094458309773
joint_ent.sd 1.1272619013187726
kurtosis.mean -1.590742398569767
kurtosis.sd 0.3510329462751763
lh_trace 81883629588553.47
mad.mean 13.0963
mad.sd 19.739563046227747
max.mean 33.5
max.sd 50.34977656355587
mean.mean 16.575555555555553
mean.sd 25.03349601959269
median.mean 17.083333333333332
median.sd 26.079525813685088
min.mean 0.0
min.sd 0.0
mut_inf.mean 0.42691406084388483
mut_inf.sd 0.4886487667308632
nr_attr 4
nr_bin 1
nr_cat 2
nr_class 2
nr_cor_attr 0.26666666666666666
nr_disc 1
nr_inst 150
nr_norm 0.0
nr_num 2
nr_outliers 0
ns_ratio 3.232054346262326
num_to_cat 1.0
p_trace 0.9999999999999878
range.mean 33.5
range.sd 50.34977656355587
roy_root 81883629588553.47
sd.mean 10.363955372064863
sd.sd 15.300874018418051
sd_ratio nan
skewness.mean 0.21813480959596565
skewness.sd 0.37469793054651285
sparsity.mean 0.20977689098494468
sparsity.sd 0.24417958923140248
t_mean.mean 16.63888888888889
t_mean.sd 25.218526160907878
var.mean 302.50885906040264
var.sd 468.28423639285654
w_lambda 1.2212453270876724e-14
Pandas CSV
Getting data from CSV format
df = pd.read_csv('../data/data.csv')
X, y = df.drop('class', axis=1), df['class']
Exploring characteristics of the data
print("X shape --> ", X.shape)
print("y shape --> ", y.shape)
print("classes --> ", np.unique(y))
print("X dtypes --> \n", X.dtypes)
print("y dtypes --> ", y.dtypes)
X shape --> (150, 4)
y shape --> (150,)
classes --> ['C1' 'C2']
X dtypes -->
num1 int64
num2 int64
cat1 object
cat2 object
dtype: object
y dtypes --> object
To extract meta-features, you should pass X and y as sequences, such as NumPy arrays or Python lists. It is easy to do this using pandas:
mfe = MFE(
groups=["general", "statistical", "info-theory"],
random_state=42
)
mfe.fit(X.values, y.values)
ft = mfe.extract(cat_cols='auto', suppress_warnings=True)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.09803782687811123
attr_conc.sd 0.20092780162502177
attr_ent.mean 1.806723506674862
attr_ent.sd 0.6400186923915645
attr_to_inst 0.02666666666666667
can_cor.mean 0.9999999999999939
can_cor.sd nan
cat_to_num 1.0
class_conc.mean 0.3367786962694272
class_conc.sd 0.46816436703211783
class_ent 1.0
cor.mean 0.21555756110154345
cor.sd 0.3346883517496457
cov.mean 1.1208620432513037
cov.sd 2.935012658229703
eigenvalues.mean 302.5088590604026
eigenvalues.sd 468.34378838676076
eq_num_attr 2.342391810715466
freq_class.mean 0.5
freq_class.sd 0.0
g_mean.mean nan
g_mean.sd nan
gravity 2.526235671244205
h_mean.mean 0.0
h_mean.sd 0.0
inst_to_attr 37.5
iq_range.mean 17.875
iq_range.sd 26.169519483551852
joint_ent.mean 2.3798094458309773
joint_ent.sd 1.1272619013187726
kurtosis.mean -1.590742398569767
kurtosis.sd 0.3510329462751763
lh_trace 81883629588553.47
mad.mean 13.0963
mad.sd 19.739563046227747
max.mean 33.5
max.sd 50.34977656355587
mean.mean 16.575555555555553
mean.sd 25.03349601959269
median.mean 17.083333333333332
median.sd 26.079525813685088
min.mean 0.0
min.sd 0.0
mut_inf.mean 0.42691406084388483
mut_inf.sd 0.4886487667308632
nr_attr 4
nr_bin 1
nr_cat 2
nr_class 2
nr_cor_attr 0.26666666666666666
nr_disc 1
nr_inst 150
nr_norm 0.0
nr_num 2
nr_outliers 0
ns_ratio 3.232054346262326
num_to_cat 1.0
p_trace 0.9999999999999878
range.mean 33.5
range.sd 50.34977656355587
roy_root 81883629588553.47
sd.mean 10.363955372064863
sd.sd 15.300874018418051
sd_ratio nan
skewness.mean 0.21813480959596565
skewness.sd 0.37469793054651285
sparsity.mean 0.20977689098494468
sparsity.sd 0.24417958923140248
t_mean.mean 16.63888888888889
t_mean.sd 25.218526160907878
var.mean 302.50885906040264
var.sd 468.28423639285654
w_lambda 1.2212453270876724e-14
ARFF
Getting data from ARFF format:
data = arff.load(open('../data/data.arff', 'r'))['data']
X = [i[:4] for i in data]
y = [i[-1] for i in data]
Exploring characteristics of the data
print("X shape --> ", len(X))
print("y shape --> ", len(y))
print("classes --> ", np.unique(y))
print("X dtypes --> ", type(X))
print("y dtypes --> ", type(y))
X shape --> 150
y shape --> 150
classes --> ['C1' 'C2']
X dtypes --> <class 'list'>
y dtypes --> <class 'list'>
To extract meta-features, you should pass X and y as sequences, such as NumPy arrays or Python lists. You can do this directly:
mfe = MFE(
groups=["general", "statistical", "info-theory"],
random_state=42
)
mfe.fit(X, y)
ft = mfe.extract(cat_cols='auto', suppress_warnings=True)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.09803782687811123
attr_conc.sd 0.20092780162502177
attr_ent.mean 1.806723506674862
attr_ent.sd 0.6400186923915645
attr_to_inst 0.02666666666666667
can_cor.mean 0.9999999999999939
can_cor.sd nan
cat_to_num 1.0
class_conc.mean 0.3367786962694272
class_conc.sd 0.46816436703211783
class_ent 1.0
cor.mean 0.21555756110154345
cor.sd 0.3346883517496457
cov.mean 1.1208620432513037
cov.sd 2.935012658229703
eigenvalues.mean 302.5088590604026
eigenvalues.sd 468.34378838676076
eq_num_attr 2.342391810715466
freq_class.mean 0.5
freq_class.sd 0.0
g_mean.mean nan
g_mean.sd nan
gravity 2.526235671244205
h_mean.mean 0.0
h_mean.sd 0.0
inst_to_attr 37.5
iq_range.mean 17.875
iq_range.sd 26.169519483551852
joint_ent.mean 2.3798094458309773
joint_ent.sd 1.1272619013187726
kurtosis.mean -1.590742398569767
kurtosis.sd 0.3510329462751763
lh_trace 81883629588553.47
mad.mean 13.0963
mad.sd 19.739563046227747
max.mean 33.5
max.sd 50.34977656355587
mean.mean 16.575555555555553
mean.sd 25.03349601959269
median.mean 17.083333333333332
median.sd 26.079525813685088
min.mean 0.0
min.sd 0.0
mut_inf.mean 0.42691406084388483
mut_inf.sd 0.4886487667308632
nr_attr 4
nr_bin 1
nr_cat 2
nr_class 2
nr_cor_attr 0.26666666666666666
nr_disc 1
nr_inst 150
nr_norm 0.0
nr_num 2
nr_outliers 0
ns_ratio 3.232054346262326
num_to_cat 1.0
p_trace 0.9999999999999878
range.mean 33.5
range.sd 50.34977656355587
roy_root 81883629588553.47
sd.mean 10.363955372064863
sd.sd 15.300874018418051
sd_ratio nan
skewness.mean 0.21813480959596565
skewness.sd 0.37469793054651285
sparsity.mean 0.20977689098494468
sparsity.sd 0.24417958923140248
t_mean.mean 16.63888888888889
t_mean.sd 25.218526160907878
var.mean 302.50885906040264
var.sd 468.28423639285654
w_lambda 1.2212453270876724e-14
As a final example, we do not use the automatic detection of feature types here. Instead, we use the attribute information provided by the liac-arff package to determine the categorical columns.
classid = 4
data = arff.load(open('../data/data.arff', 'r'), encode_nominal=True)
cat_cols = [n for n, i in enumerate(data['attributes'][:classid])
if isinstance(i[1], list)]
data = np.array(data['data'])
X = data[:, :classid]
y = data[:, classid]
Exploring characteristics of the data
print("X shape --> ", len(X))
print("y shape --> ", len(y))
print("classes --> ", np.unique(y))
print("X dtypes --> ", type(X))
print("y dtypes --> ", type(y))
X shape --> 150
y shape --> 150
classes --> [0. 1.]
X dtypes --> <class 'numpy.ndarray'>
y dtypes --> <class 'numpy.ndarray'>
To extract meta-features, you should pass X and y as sequences, such as NumPy arrays or Python lists:
mfe = MFE(
groups=["general", "statistical", "info-theory"],
random_state=42
)
mfe.fit(X, y, cat_cols=cat_cols)
ft = mfe.extract(suppress_warnings=True)
print("\n".join("{:50} {:30}".format(x, y) for x, y in zip(ft[0], ft[1])))
attr_conc.mean 0.09803782687811123
attr_conc.sd 0.20092780162502177
attr_ent.mean 1.806723506674862
attr_ent.sd 0.6400186923915645
attr_to_inst 0.02666666666666667
can_cor.mean 0.9999999999999939
can_cor.sd nan
cat_to_num 1.0
class_conc.mean 0.3367786962694272
class_conc.sd 0.46816436703211783
class_ent 1.0
cor.mean 0.21555756110154345
cor.sd 0.3346883517496457
cov.mean 1.1208620432513037
cov.sd 2.935012658229703
eigenvalues.mean 302.5088590604026
eigenvalues.sd 468.34378838676076
eq_num_attr 2.342391810715466
freq_class.mean 0.5
freq_class.sd 0.0
g_mean.mean nan
g_mean.sd nan
gravity 2.526235671244205
h_mean.mean 0.0
h_mean.sd 0.0
inst_to_attr 37.5
iq_range.mean 17.875
iq_range.sd 26.169519483551852
joint_ent.mean 2.3798094458309773
joint_ent.sd 1.1272619013187726
kurtosis.mean -1.590742398569767
kurtosis.sd 0.3510329462751763
lh_trace 81883629588553.47
mad.mean 13.0963
mad.sd 19.739563046227747
max.mean 33.5
max.sd 50.34977656355587
mean.mean 16.575555555555553
mean.sd 25.03349601959269
median.mean 17.083333333333332
median.sd 26.079525813685088
min.mean 0.0
min.sd 0.0
mut_inf.mean 0.42691406084388483
mut_inf.sd 0.4886487667308632
nr_attr 4
nr_bin 1
nr_cat 2
nr_class 2
nr_cor_attr 0.26666666666666666
nr_disc 1
nr_inst 150
nr_norm 0.0
nr_num 2
nr_outliers 0
ns_ratio 3.232054346262326
num_to_cat 1.0
p_trace 0.9999999999999878
range.mean 33.5
range.sd 50.34977656355587
roy_root 81883629588553.47
sd.mean 10.363955372064863
sd.sd 15.300874018418051
sd_ratio nan
skewness.mean 0.21813480959596565
skewness.sd 0.37469793054651285
sparsity.mean 0.20977689098494468
sparsity.sd 0.24417958923140248
t_mean.mean 16.63888888888889
t_mean.sd 25.218526160907878
var.mean 302.50885906040264
var.sd 468.28423639285654
w_lambda 1.2212453270876724e-14
Total running time of the script: ( 0 minutes 0.881 seconds)
Examples for Developers
These examples are dedicated to anyone who wishes to contribute to the development of the package or to understand more about it. We expect these examples to show you the basics of the PyMFE architecture and to inspire you to contribute.
A developer sample class for Metafeature groups.
This class was built to give a model of how you should write a metafeature group class as a Pymfe developer. Please read this entire guide with attention before programming your own class.
At the end of this reading, you will know:
- How to register your class as a valid MFE metafeature class
- What the special method name prefixes are, and how to properly use each of them
- The rules involving precomputation, metafeature extraction, and post-processing methods
- The coding practices usually adopted in this library, which you should follow in order to get your changes accepted in the master branch
Also, feel free to copy this file to use as a template for your own class.
First, here are some tips and tricks that may help you follow the code standards established in this library.
1. Use type annotations as much as possible
Always run mypy to check whether the variable types were specified correctly. You can install it with pip using the following command:
>>> pip install -U mypy
Use the following command before pushing your modifications to the remote repository:
>>> mypy pymfe --ignore-missing-imports
The expected output for this command is no output.
Note that all warnings must be fixed for your modifications to be accepted in the master branch, so take your time to fix your variable types carefully.
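For instance, here is a minimal, purely hypothetical helper annotated in the style mypy expects (the function itself is not part of PyMFE):
import typing as t
import numpy as np

def count_classes(y: np.ndarray, normalize: bool = False) -> t.Union[int, float]:
    # Toy example, for illustration only: number (or fraction) of distinct classes.
    nr_class = np.unique(y).size
    return nr_class / y.size if normalize else nr_class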
2. Use pylint to check your code style and auto-formatters such as yapf
Pylint can be used to check whether your code follows some coding practices adopted by the Python community. You can install it with pip using the following command:
>>> pip install -U pylint
It can be harsh sometimes, so we have decided to disable some of the verifications. You can use the following command to check whether your code meets the standards established in this library:
>>> pylint pymfe -d 'C0103, R0913, R0902, R0914, C0302, R0904, R0801, E1101'
The expected output is something like
>>> Your code has been rated at 10.00/10 (previous run: x/10, y)
Your code will not be accepted in the master branch unless it gets the maximum pylint score.
Yapf is a code auto-formatter that usually solves a large number of coding-style issues automatically.
>>> pip install -U yapf
If you use the flag -i, Yapf changes your code in-place.
>>> yapf -i yourModulename.py
3. Make all verifications with the provided Makefile
You can use the Makefile provided in the root directory to run mypy, pylint, and also pytest. Obviously, all tests (both for coding style and programming logic) must pass in order for your modifications to be accepted.
You can use the target test-cov to run the tests and get the code coverage:
>>> make test-cov
You can use the target test to run only the tests:
>>> make test
You can use the target code-check to run all verifications with mypy, pylint, and pep8:
>>> make code-check
Remember that your code must pass all verifications included in both code-check and test/test-cov for your changes to be accepted in the master branch.
Note
This example shows how to create a new group of meta-features. If you only want to add a new meta-feature, you should insert it into the corresponding meta-feature group file and create an "ft_" method for it. The new meta-feature will be picked up automatically (like the method "ft_metafeature_name" in this example). Do not forget to use the precompute methods to save time.
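For instance, a hedged sketch of what adding one extra meta-feature to an existing group class could look like; the method name and the computed value are illustrative only, and the surrounding class is shown merely as a stub:
import numpy as np

class MFEGeneral:
    # ... existing precompute_ and ft_ methods of the group class ...

    @classmethod
    def ft_nr_cells(cls, X: np.ndarray) -> int:
        # Hypothetical toy meta-feature: total number of cells in the fitted data.
        return int(X.size)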
Note
You should not forget to create tests for all new functionalities that you implement. All tests can be found in the ./tests/ folder. Please follow the existing code style while creating your tests as much as possible.
Note
This class is kept up to date on GitHub; check this link to see the current version.
import typing as t
import time
import numpy as np
class MFEBoilerplate:
"""The class name must start with ``MFE`` (just to keep code consistency)
concatenated with the corresponding metafeature group name (e.g.,
``MFEStatistical`` or ``MFEGeneral``) in CamelCase format.
Also, the class must be registered in the ``_internal.py`` module to be
an official MFE class, because the pymfe framework is supposed to detect
the metafeature extraction methods automatically, so you must explain
where it is supposed to look for those methods.
Three tuples at module level in ``_internal.py`` module must be updated
for your new class to be detected correctly:
1. VALID_GROUPS: str
Here you should write the name of your metafeature group (e.g.,
``statistical`` or ``general``). This name is the value that will
be given by the user in the ``groups`` MFE parameter to extract
all the metafeatures programmed here. Please select a
sufficiently representative name for your metafeature group.
2. GROUP_PREREQUISITES : str or :obj:`tuple` of str
Use this tuple to register other MFE metafeature group classes
as dependencies of your class. This means that, if the user asks
to extract the metafeatures of your class, then all metafeature
groups in the prerequisites will also be extracted (even if
the user doesn't explicitly ask for these groups). Note that the
possible consequences this may imply must be solved within this
class's post-processing methods (these methods will be explained
later in this guide).
The values of this tuple can be strings (which means one single
dependency), sequences with strings (which means your class has
multiple dependencies), or simply None (which means your class
has no dependencies). Generally your class will not have any
dependency, so just stick to the last option if you are not sure
so far.
3. VALID_MFECLASSES : MFE Classes
In this tuple you should just insert a reference to your class.
Note that this implies that this module must be imported at the top
of the module ``_internal.py``.
These three tuples have one-to-one correspondence using the indexes,
so the order of values does matter. Please insert your class in the
same index for all three tuples.
===================================================================
For example, if we want to make this specific template class an official
MFE class, those three tuples should be updated as follows (remember that
all tuples below are found in the ``_internal.py`` module):
-------------------------------------------------------------------
# 1. First, choose carefully a metafeature group name. This value will be
# used directly by the user when extracting the metafeatures programmed
# in this class, so it must be meaningful and as short as possible.
VALID_GROUPS = (
...,
"boilerplate",
)
-------------------------------------------------------------------
# 2. Generally your class will not have any dependency, so you should
# just register ``None`` as prerequisites. Remember that a class can
# have any number of dependencies (0, 1 or more than 1.)
GROUP_PREREQUISITES = (
...,
None,
)
-------------------------------------------------------------------
# 3. The last step is to insert your class in this tuple below.
# Remember to import your module in the ``_internal.py`` module.
# So, for instance, to register this class, 'MFEBoilerplate', as
# an official MFE metafeature extractor class, we should make the
# following modifications:
import pymfe._dev as _dev
VALID_MFECLASSES = (
...,
_dev.MFEBoilerplate,
)
After these three simple steps, your class is now an official MFE
metafeature extraction class. From now on you no longer need to
worry about the ``_internal.py`` module and any other external
pymfe module.
===================================================================
Now that you know how to handle the issues related to the
``_internal.py`` module, let's start with the actual MFE class
development.
This tutorial is built to introduce all the different elements
following the natural order of how a regular MFE Class is usually
presented.
Therefore, the order that we shall see the different concepts in
this guide is:
1. Precomputation methods (prefixed with ``precompute_``)
Methods related to this subject:
1.1 precompute_basic_precomp_method
1.2 precompute_more_info
1.3 precompute_random_values
2. Feature extraction methods (prefixed with ``ft_``)
Methods related to this subject:
2.1 ft_metafeature_name
2.2 ft_fitted_data_arguments
2.3 ft_using_precomputed_values
2.4 ft_about_return_values
3. Regular/auxiliary methods (non-prefixed)
Methods related to this subject:
3.1 _protected_methods
3.2 non_protected_methods_without_any_prefixes
4. Postprocessing methods (prefixed with ``postprocess_``)
Methods related to this subject:
4.1 postprocess_groupName1_groupName2
So, we shall start by looking at an example of a precomputation
method.
"""
# Important detail: all methods must be classmethods; there is no class
# instantiation in the pymfe framework.
@classmethod
def precompute_basic_precomp_method(cls,
y: t.Optional[np.ndarray] = None,
argument_bar: t.Optional[int] = None,
**kwargs) -> t.Dict[str, t.Any]:
"""A precomputation method example.
The pydoc of each method must clearly explain the purpose of
that method. This method is supposed to introduce a powerful concept
of the pymfe framework: precomputation methods.
1. Why precomputation methods?
-----------------------------------------------------------------
All methods whose name is prefixed with ``precompute_`` are
executed automatically before the metafeature extraction. These
methods are extremely important to improve the performance of
the Pymfe library, as it is quite common that different metafeature
extraction methods use the very same information.
The idea behind this type of methods is to fill up a shared cache
with all values that can be shared by different metafeature extraction
methods, and also between different metafeature group classes. This
means that the values precomputed in ``MFEFoo`` class can also be used
in the ``MFEBar`` class methods.
2. Naming convention of a precomputation method
-----------------------------------------------------------------
The name of the method does not matter, as long as it starts with
the prefix ``precompute_``. This prefix is used to tell the Pymfe
framework that this is a precomputation method. As you will see during
this guide, the Pymfe framework relies heavily on prefixes in the method names, so
it is important that you don't forget them, and use them appropriately.
3. Arguments of a precomputation method
-----------------------------------------------------------------
The structure of these precomputation methods is pretty simple. In the
arguments you can specify custom parameters such as ``X`` and ``y``
that are automatically given by the MFE class. Those attributes can be
registered in a special attribute in the MFE class, or also given by
the user, but you should not rely on this feature; just stick to the
MFE registered arguments, and let all user-customizable attributes
have a default value. How exactly those arguments arrive as method
arguments is not important to develop an MFE metafeature extraction
class. If you're curious, you should examine the ``mfe.py`` and
``_internal.py`` modules by yourself, but it will take some time and
is not encouraged unless you plan an actual framework redesign.
It is obligatory to receive the ``kwargs`` in every precomputation
method. You are free to pick up values from it. We recommend you to
use the ``get`` method for this task. However, it is forbidden to
remove or modify the existing values in it. This parameter must be
considered ``read-only`` except to the insertion of new key-value
pairs. The reason behind this is that there's no guarantee of any
execution order of the precomputation methods within a class, nor
between classes, so all precomputation methods must have
the chance to read the same values.
4. Return values of precomputation methods
-----------------------------------------------------------------
All precomputation methods must return a dictionary with strings as
keys. The value data type can be anything. Note that the name of the
keys will be used later to match the argument names of feature
extraction methods. It means that, if you return a dictionary in the
form: {'foo': 1, 'bar': ('a', 'b')}, then all feature extraction
methods with an argument named ``foo`` will receive value ``1``, and
every method with argument named ``bar`` will receive a tuple with 'a'
and 'b' elements. Always choose meaningful key/argument names.
As this framework relies on a dictionary to distribute the parameters
between feature extraction methods, your precomputed keys should never
replace existing keys with different values, and you should not give
the same name to parameters with different semantics or purposes.
The rule of thumb for the pymfe library is: 'if two things have the
same name, then they are the same thing'. Therefore, avoid extremely
generic argument names such as ``freqs``, ``mean``, ``model`` etc.
5. The user can disable precomputation methods
-----------------------------------------------------------------
Keep in mind that the user can disable the precomputation methods,
mainly due to memory constraints.
Never rely on these methods to produce any mandatory arguments. All
the precomputed values here should go to optional parameters and all
receptor metafeature extraction methods must be responsible to verify
if all values were effectively precomputed (i.e., they are not
``None``). If this is not the case, unfortunately these methods must
compute those arguments for themselves. If it is not clear how it
works for you by now, it will probably be easier to grasp when we
reach our first actual metafeature extraction method. For now, it is
just important to keep in mind that you will need to recompute,
inside the receiving methods, everything precomputed in the
precomputation methods whenever those values are needed, to cover
the case where the user disables the precomputation methods.
Parameters
----------
y : :obj:`np.ndarray`, optional
Always give clear and meaningful description to every argument.
argument_bar : int, optional
Some user-given attribute.
**kwargs:
Additional arguments. May contain values precomputed by other
precomputation methods before this one, which can help
speed up this precomputation by avoiding duplicated work.
Returns
-------
:obj:`dict`
The following precomputed items are returned:
* ``y_unique`` (:obj:`np.ndarray`): unique values from
``y``, if it is not None.
* ``absolute_bar`` (float): absolute value of
``argument_bar``, if it is not None.
"""
precomp_vals = {} # type: t.Dict[str, t.Any]
# Always consider that your precomputation argument could
# be precomputed by another precomputation method (even if
# from a different module), so check if the new key is not
# already in kwargs before calculating anything.
if argument_bar is not None and "absolute_bar" not in kwargs:
precomp_vals["absolute_bar"] = abs(argument_bar)
# The number of precomputed values within a single precomputation
# method varies greatly, from just a single value to several.
# As long as all values are sufficiently related to each other
# semantically, you don't need to create new precomputation methods.
if y is not None and "y_unique" not in kwargs:
y_unique = np.unique(y, return_counts=False)
precomp_vals["y_unique"] = y_unique
# Always return a dictionary, even if it is empty
return precomp_vals
@classmethod
def precompute_more_info(cls,
argument_bar: t.Optional[int] = None,
**kwargs) -> t.Dict[str, t.Any]:
"""Highly relevant information about precomputation methods.
1. How many precomputation methods per class?
-----------------------------------------------------------------
Every MFE metafeature extraction class can have as many
precomputation methods as needed. Don't hesitate to create
new precomputation methods whenever you think it will help
to improve the performance of the package.
2. How many precomputed values per precomputation method?
-----------------------------------------------------------------
There is no limit of how many values can be precomputed within
a single precomputation method.
However, try to keep every precomputation method computing only
related values, to avoid confusion. Prefer to calculate unrelated
values in distinct precomputation methods.
3. Using other precomputed values in a precomputation method
-----------------------------------------------------------------
Don't rely on the execution order of precomputation methods. Always
assume that the precomputation methods (even within the same class)
can be executed in any order. However, it does not mean that you
can't at least try to use previously precomputed methods: that's why
the 'kwargs' is used in all precomputation methods.
If needed, try to get a value from 'kwargs' using the 'get' method
(i.e., kwargs.get('argument_name') - remember 'kwargs' is just a
Python dictionary.) Then, check whether that value was successfully
gotten (i.e., is not None).
Parameters
----------
argument_bar : int, optional
Some user-given attribute. Note that it has the same value as
in the previous precomputation method, because it is the same
argument (it has the same name.)
**kwargs:
Additional arguments. May contain values precomputed by other
precomputation methods before this one, which can help
speed up this precomputation by avoiding duplicated work.
Returns
-------
:obj:`dict`
The following precomputed items are returned:
* ``double_absolute_bar`` (int): two times the
value of ``absolute_bar``, which may or may not
be precomputed in the previous precomputation
method. If it is not the case, we precompute
``absolute_bar`` here and also store its value.
* ``qux`` (float): value is equal to 1.0.
* ``quux`` (:obj:`complex`) Imaginary value related to
``qux``.
* ``quuz`` (:obj:`np.ndarray`): a sequence based
on ``qux``.
"""
precomp_vals = {} # type: t.Dict[str, t.Any]
if argument_bar is not None and "double_absolute_bar" not in kwargs:
# May have been precomputed from another precomputation method
absolute_bar = kwargs.get("absolute_bar")
# Wrong! 'absolute_bar' may be None
# precomp_vals["double_absolute_bar"] = 2 * absolute_bar
if absolute_bar is None:
absolute_bar = abs(argument_bar)
# Because we needed to calculate 'absolute_bar' here, it does
# not hurt to store this value as well, preventing it from
# being recalculated in 'precompute_basic_precomp_method'.
precomp_vals["absolute_bar"] = absolute_bar
# Correct: now 'absolute_bar' is guaranteed to be not None
precomp_vals["double_absolute_bar"] = 2 * absolute_bar
if not {"qux", "quux", "quuz"}.issubset(kwargs):
precomp_vals["qux"] = 1.0
precomp_vals["quux"] = 5 + 1.0j * (precomp_vals["qux"])
precomp_vals["quuz"] = np.array(
[precomp_vals["qux"] + i for i in np.arange(5)])
return precomp_vals
@classmethod
def precompute_random_values(cls,
random_state: t.Optional[int] = None,
**kwargs) -> t.Dict[str, t.Any]:
"""Precomputation method with pseudo-random behavior.
1. An important pymfe default argument for you: 'random_state'
-----------------------------------------------------------------
If you are using anything with pseudo-random properties, you shall
always get the pymfe framework global random seed using the
``random_state`` argument. This seed is user defined. You can get
it in any precomputation, metafeature extraction, or post-processing
method.
2. Important aspects related to pseudo-random behaviour
-----------------------------------------------------------------
Uncontrolled pseudo-random behavior is absolutely forbidden in
this package.
Also, pseudo-random methods must have related automated tests.
Therefore, setting up the random seed (as long as the user defines
it) is never optional.
Parameters
----------
random_state : int, optional
If given, controls the pseudo-random behavior inside this
method, so the results will be reproducible.
**kwargs:
Additional arguments. May contain values precomputed by other
precomputation methods before this one, which can help
speed up this precomputation by avoiding duplicated work.
Returns
-------
:obj:`dict`
The following precomputed items are returned:
* ``random_special_num`` (float): a random value
that must be controlled by the random seed specified
by the user using the ``random_state`` pymfe framework
global argument.
"""
precomp_vals = {} # type: t.Dict[str, t.Any]
if "random_special_num" not in kwargs:
if random_state is not None:
np.random.seed(random_state)
aux = np.random.randint(-5, 5, size=10)
precomp_vals["random_special_num"] = np.random.choice(aux, size=1)
return precomp_vals
@classmethod
def ft_metafeature_name(
cls,
X: np.ndarray,
y: np.ndarray,
random_state: t.Optional[int] = None,
opt_arg_bar: float = 1.0,
opt_arg_baz: t.Optional[np.ndarray] = None,
) -> int:
"""Single-line description of this feature extraction method.
The purpose of this method is to introduce the first actual
metafeature extraction method.
1. Metafeature extraction methods: the most important ones
-----------------------------------------------------------------
Similarly to the precomputation methods, the feature extraction
method names are also prefixed. All your feature extraction method
names must be prefixed with ``ft_``.
2. The pymfe framework provides arguments automatically
-----------------------------------------------------------------
As mentioned in the documentation of the very first precomputation
method, the pymfe framework is responsible for providing the
arguments of every precomputation (prefixed with ``precompute_``),
metafeature extraction (prefixed with ``ft_``) and post-processing
(we will see those later) method. 'How?', you may ask.
The short answer is dictionary unpacking: the MFE class holds some
dictionaries that are unpacked while calling those prefixed methods.
Then, if a method's argument happens to match with a dictionary
key, that argument will assume the matched key value.
All precomputed values are packed into one of those dictionaries
(and it happens automatically; you don't need to worry about it.)
Therefore, the same value provided as the key of some precomputed
dictionary is used to match directly the parameter name. All
parameters must be treated as read-only values; it is forbidden to
modify any value inside any feature extraction method.
We will see more about which default parameters are given by the
pymfe framework soon in the ``ft_fitted_data_arguments`` method
just below. However, if you want to see with your own eyes the
actual values, you can search for the instance attribute
``mfe.MFE._custom_args_ft`` of the MFE class (inside the ``mfe.py``
module). This attribute is set up inside the ``mfe.MFE.fit`` method.
If you have a very good reason, feel free to insert new values
in there if (and only if) they are needed; note that this is
rarely necessary.
3. Mandatory & optional arguments of metafeature extraction methods
-----------------------------------------------------------------
The only arguments allowed to be mandatory (i.e., arguments without
any default value) are the ones registered inside the MFE attribute
``_custom_args_ft`` (check this out in the ``mfe.py`` module.)
All other values must have a default value, without any exception.
Remember that all arguments can be customized directly by the user
while calling the ``extract`` MFE method. You usually don't need
to worry about whether the user passes incorrect data types for the
arguments, as that will most probably raise a TypeError exception.
However, sometimes you should consider handling incorrect values
(such as probability arguments with values outside the range
0 to 1.) Usually, just returning ``np.nan`` (if your metafeature
is non-summarizable) or ``np.array([np.nan])`` (if your metafeature
is summarizable) is one way to go when handling incorrect arguments.
4. Return values of metafeature extraction methods
-----------------------------------------------------------------
We'll see about this soon in the ``ft_about_return_values`` method.
Arguments
---------
X : :obj:`np.ndarray`
All attributes fitted in the model (numerical and categorical
ones). While writing your method documentations, you don't need
to write about very common arguments such as ``X``, ``y``, ``N``
and ``C``. In fact, you are encouraged to just omit these.
y : :obj:`np.ndarray`
Target attribute. Again, no need to write about this type of
argument in the method documentation, as it gets repetitive
without any information gain.
random_state : int, optional
Extremely important argument. This one is a fixed feature of the
MFE framework. If your method has ANY pseudo-random behaviour,
you should use specifically this argument to provide the random
seed. In this case, it is worth describing the random behaviour
of your method, so it is clear to the user why a random seed is
needed in the first place.
opt_arg_bar : float, optional
Argument used to detect carbon footprints of hungry dinosaurs.
opt_arg_baz : :obj:`np.ndarray`, optional
If None, this argument is foo. Otherwise, this argument is bar.
Returns
-------
int
Give a clear description about the returned value.
Notes
-----
You can use the notes section of the documentation to provide
references, and also ``very specific`` details of the method.
"""
# Inside the feature extraction method you can do whatever you
# want, just make sure to:
# 1. Always return a single number, a single np.nan or a numpy
# array with numeric values (or np.nan) - no exceptions!
# 2. Make it run as fast as possible. Metafeatures with high
# computational complexity are discouraged.
# You can raise ValueError, TypeError and LinAlgError exceptions.
if opt_arg_bar <= 0.0:
raise ValueError("'opt_arg_bar' must be positive!")
# When using pseudo-random functions, ALWAYS use random_state
# to enforce experiment replication. Uncontrolled pseudo-random
# behavior is absolutely forbidden.
if opt_arg_baz is None:
np.random.seed(random_state)
opt_arg_baz = np.random.choice(10, size=5, replace=False)
aux_1, aux_2 = np.array(X.shape) * y.size
np.random.seed(random_state)
random_ind = np.random.randint(opt_arg_baz.size)
ret = aux_1 * opt_arg_bar / (aux_2 + opt_arg_baz[random_ind])
return ret
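To make the dictionary-unpacking mechanism above concrete, here is a small standalone sketch (the helper and function names are hypothetical, not the actual pymfe internals): a method receives only the stored values whose keys match its parameter names.
import inspect
import numpy as np

def _call_with_matching_args(method, stored_args):
    # Pass only the entries of ``stored_args`` whose keys match a parameter name.
    params = inspect.signature(method).parameters
    return method(**{k: v for k, v in stored_args.items() if k in params})

def ft_example(y, opt_arg_bar=1.0):
    return y.size * opt_arg_bar

stored_args = {"X": np.ones((4, 2)), "y": np.zeros(4), "opt_arg_bar": 2.0}
print(_call_with_matching_args(ft_example, stored_args))  # -> 8.0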
@classmethod
def ft_fitted_data_arguments(cls, X: np.ndarray, N: np.ndarray,
C: np.ndarray, y: np.ndarray) -> int:
"""Information about some arguments related to fitted data.
1. Handling Numerical, Categorical and Mixed data types
-----------------------------------------------------------------
Not all feature extraction methods handle all types of data. Some
methods only work for numerical values, while others work only for
categorical values. A few work for both data types, but this
is generally not the case.
The Pymfe framework provides easy access to the fitted data
attributes separated by data type (numerical and categorical).
You can use the attribute ``X`` to get all the original fitted
data (without any data transformations), attribute ``N`` to get
only the numerical attributes and, similarly, ``C`` to get only
the categorical attributes.
Arguments
---------
X : :obj:`np.ndarray`
All fitted original data, without any data transformation
such as discretization or one-hot encoding.
N : :obj:`np.ndarray`
Only the numerical attributes of the fitted data, possibly
including one-hot encoded categorical data (if the user enables
this type of transformation.)
C : :obj:`np.ndarray`
Only the categorical attributes of the fitted data, possibly
including discretized numerical data (if the user enables
this type of transformation.)
y : :obj:`np.ndarray`
Target attribute.
Returns
-------
int
Some important return value.
Notes
-----
You can even receive more than one of these attributes in the
same method, but keep in mind that this may cause confusion as
the user may enable or disable data transformations (encoding
for categorical values and discretization for numerical values).
"""
ret = np.array(X.shape) + np.array(N.shape) + np.array(C.shape)
return np.prod(ret) * y.size
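As a toy illustration (not the actual pymfe transformation code) of the ``X``/``N``/``C`` split described above, consider a tiny mixed dataset:
import numpy as np

X = np.array([[1.5, "a"], [2.0, "b"], [3.5, "a"]], dtype=object)  # original data
num_mask = np.array([True, False])  # which columns are numerical
N = X[:, num_mask].astype(float)    # numerical attributes only
C = X[:, ~num_mask]                 # categorical attributes only
print(X.shape, N.shape, C.shape)    # (3, 2) (3, 1) (3, 1)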
@classmethod
def ft_using_precomputed_values(
cls,
y: np.ndarray,
# y_unique: np.ndarray, # Wrong! Needs a default value.
y_unique: t.Optional[np.ndarray] = None) -> np.ndarray:
"""Metafeature extraction method using precomputed values.
1. How to use precomputed arguments
-----------------------------------------------------------------
Within any metafeature extraction method, you can safely assume that
all precomputation methods (even the ones of other MFE classes) were
executed (successfully or not!), and their values are hopefully
ready to be used as arguments. Note that the pymfe framework is
highly resilient against exceptions, so the code will most probably
continue to flow even if a few precomputation methods were not
successful for some reason (e.g., math domain errors.)
Getting precomputed values is no different from getting a pymfe
default automatic argument (such as ``X`` and ``y``): just match
the argument name with the precomputed dictionary key. For
instance, the argument ``y_unique`` was precomputed in
``precompute_basic_precomp_method`` and is probably ready to be used in
this metafeature extraction method, IF the user did not
disable the precomputations. As we can't guarantee whether the
user will or will not disable the precomputations, we need to
always check whether ``y_unique`` is different from ``None`` before
using it. If, unfortunately, that is not the case, then we need
to compute ``y_unique`` inside this method.
2. When to use precomputed arguments
-----------------------------------------------------------------
Always! :)
3. The precomputation cache is shared among all pymfe classes
-----------------------------------------------------------------
Remember that you can also use precomputed values from other
pymfe metafeature extraction classes (and, therefore, your
precomputed values will automatically be available to the
other classes as well.)
Arguments
---------
y : :obj:`np.ndarray`
Target attribute.
y_unique : :obj:`np.ndarray`, optional
Argument precomputed in the ``precompute_basic_precomp_method``
precomputation method. Note that it must be an optional
argument (because it is forbidden to rely on precomputation
methods to fill mandatory arguments, as the user can disable
precomputation methods whenever he or she wants.) Note also
that the argument name must match exactly the corresponding
dictionary key given inside the precomputation method.
Returns
-------
:obj:`np.ndarray`
Describe your return value.
"""
# res = -1.0 * y_unique # Wrong! 'y_unique' may be None!
# You need to verify if precomputed values is None. If this
# is the case, you need to manually compute it inside the method
# that needs that value.
if y_unique is None:
# If ``y_unique`` is None, it probably means that the user
# disabled the precomputations (or something went wrong inside
# the precomputation method), so we need to compute it now,
# as this argument is needed to compute the method's output.
# Obviously, the computation inside the metafeature
# extraction method must be identical to the computation
# in the precomputation method, as both results must
# always match. Once again, remember:
# 'If two things have the same name, then they are the
# same thing'.
y_unique = np.unique(y, return_counts=False)
res = -1.0 * y_unique # Correct: 'y_unique' is surely not None
return res
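A tiny standalone check of the contract stated above ('if two things have the same name, then they are the same thing'): the fallback computed inside the metafeature method must be identical to what the precomputation method stores under the same key.
import numpy as np

y = np.array([0, 1, 1, 2, 2, 2])
precomputed = {"y_unique": np.unique(y, return_counts=False)}  # stored by the precomputation
fallback = np.unique(y, return_counts=False)                   # recomputed inside the ft_ method
assert np.array_equal(precomputed["y_unique"], fallback)       # both must always match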
@classmethod
def ft_about_return_values(
cls,
y: np.ndarray,
) -> np.ndarray:
"""Information about return values of feature extraction methods.
1. You have two return options for metafeature extraction methods
-----------------------------------------------------------------
The return value of any feature extraction method should be either
a single value (int, float, numpy number, or :obj:`np.nan`)
or a numpy array. This array must contain only numbers or
:obj:`np.nan`.
2. What's the difference?
-----------------------------------------------------------------
If the return value is a single number, it will be used directly
as an output value of the MFE class ``extract`` method.
If it is a numpy array, then the output will automatically be
summarized using every user-selected summary function.
3. A more detailed explanation
-----------------------------------------------------------------
If you return a single value, your metafeature is said to be
'non-summarizable'. It means that the value your method returns is
the value the user will get. If you need to return an invalid
value, always return 'np.nan'.
If you return a numpy array, then your metafeature is said to be
'summarizable', and the user will get a few statistics related to
the values your method returns (instead of the actual values):
their mean, standard deviation, quantiles, variance etc. This
happens automatically, and you should not worry about it. You
can put 'np.nan' inside your array. If you need to return an
entirely invalid array, consider returning 'np.array([np.nan])'.
DO NOT return a single 'np.nan', as it is reserved for the
'non-summarizable' metafeature extraction methods.
Arguments
---------
y : :obj:`np.ndarray`
Target attribute.
Returns
-------
:obj:`np.ndarray`
This method returns a numpy array, so its output value will
be summarized automatically by the MFE framework before
outputting to the user.
"""
# Either your method returns a single value, or it returns a
# numpy array. You can't mix both within a single metafeature
# extraction method.
if np.any(y < 0):
# My metafeature can't handle negative 'y' values, so I
# can return an invalid array
# return np.nan # Wrong! It is not an array!
return np.array([np.nan]) # Correct.
if y.size > 20:
return np.power(y, 1 / 4) + np.arange(y.size)
return np.sqrt(y) + np.arange(y.size)
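A rough sketch (not the framework code itself) of what the automatic summarization of a 'summarizable' return value amounts to: each user-selected summary function collapses the returned array into one output value.
import numpy as np

raw = np.array([0.1, 0.4, np.nan, 0.9])                # array returned by a ft_ method
summary_funcs = {"mean": np.nanmean, "sd": np.nanstd}  # user-selected summary functions
summarized = {name: func(raw) for name, func in summary_funcs.items()}
print(summarized)  # e.g. {'mean': 0.4666..., 'sd': 0.3299...}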
@classmethod
def _protected_methods(cls, arg_foo: float) -> float:
"""Tips for using protected methods.
1. How to use Python's protected methods in pymfe code
-----------------------------------------------------------------
Protected methods (methods whose name starts with an underscore)
should be used whenever you need to better modularize your code,
and especially if you need to share the same piece of code between
two or more different metafeature extraction methods.
2. Using private methods
-----------------------------------------------------------------
Private methods (methods prefixed with two underscores) are not
really necessary, and their use must be justified somehow.
So far, there is not even a single private method in any pymfe
code.
3. Protected method documentation
-----------------------------------------------------------------
You don't need to follow the standard documentation format for
protected methods (method description, argument list, return value
description etc.) Instead, you can be more technical, since the
documentation is mostly aimed at other developers and maintainers
of the package. If you feel more comfortable with the standard
format (just like the public methods), there is no harm in
following it in the protected method documentation.
"""
def inner_functions(x: float, lamb: float = 1.0) -> float:
"""Usage of inner functions.
1. When to use inner functions
---------------------------------------------------------
Use them whenever you need to modularize a piece of code that
is too specific to the method that contains it, and therefore
is highly unlikely to ever be used by another method.
2. How many inner functions per method?
---------------------------------------------------------
These functions are quite useful for very complex feature
extraction methods with many steps needed to reach the final
result. In that case, consider creating a separate inner
function for every step.
"""
return np.abs(np.tanh(x * lamb) * 0.7651j)
return np.maximum(inner_functions(arg_foo), 0.0)
@classmethod
def non_protected_methods_without_any_prefixes(cls) -> None:
"""Don't use non-protected regular methods.
The main reason to avoid this type of method is that it
will be shown in the package documentation even though
it is not of interest to the user.
"""
raise NotImplementedError(
"Hide me prefixing my name with a single '_'.")
@classmethod
def postprocess_groupName1_groupName2(
cls, mtf_names: t.List[str], mtf_vals: t.List[float],
mtf_time: t.List[float], class_indexes: t.Sequence[int],
groups: t.Tuple[str, ...], inserted_group_dep: t.FrozenSet[str],
**kwargs
) -> t.Optional[t.Tuple[t.List[str], t.List[float], t.List[float]]]:
"""Introduction to post-processing methods.
1. What is a post-processing method?
-----------------------------------------------------------------
The post-processing methods can be used either to modify in-place
previously generated metafeatures (not necessarily from the same
group) or to generate new metafeatures using previously extracted
metafeatures just before outputting the results to the user. These
methods are far less common than the precomputation ones, but they
may be useful in some specific cases (mainly related to somehow
merging the dependency data with the data generated by the
dependent class.)
For instance, the 'Relative Landmarking' metafeature group is
entirely based on post-processing methods: that specific group needs
every 'Landmarking' metafeature result and, therefore, it can be
computed only after the metafeature extraction process finishes
(because we have no guarantees about the metafeature extraction
order.)
So, if your MFE class does not have any external dependencies, nor
is it supposed to somehow merge two or more metafeature values, you
don't need to read this section, and you are already good to go
and develop your own MFE class. If that is not your case, then stay
with us for a couple of extra minutes.
2. Structure of a post-processing method
-----------------------------------------------------------------
All post-processing methods receive all previously extracted
metafeatures from every MFE class. They do not receive just the
metafeatures related to the metafeature extraction methods of this
class. It is very important to keep this in mind.
There's a very important trick with the naming of these post-processing
methods, beyond just prefixing them with ``postprocess_``.
You can put names of metafeature groups of interest separated by
underscores. All metafeature indexes related to any of the selected
groups will arrive in the ``class_indexes`` argument automatically.
(A tiny sketch of this naming trick appears right after this method.)
For example, suppose a post-processing method named like:
postprocess_infotheory_statistical(...)
This implies that the indices of both `information theory` and
`statistical` metafeature groups will arrive inside the
``class_indexes`` sequence. Using this feature, one can easily
work with these metafeatures without needing to separate them by
hand. Of course, you can give as many metafeature group names as
needed. If you need them all, then simply don't put any metafeature
group name, as every metafeature is a metafeature of interest in
this case.
There are various arguments that are automatically filled in for
this type of method (as you can see just above in this method's
signature). Check the ``arguments`` section for more details
about each one.
3. How many post-processing methods are necessary?
-----------------------------------------------------------------
Just like the precomputation and metafeature extraction methods,
an MFE class may have any number of post-processing methods, including
none. In fact, having no post-processing method is by far the most
common case.
4. Return value of post-processing methods
-----------------------------------------------------------------
The return value of post-processing methods must be either None,
or a tuple with exactly three lists. In the first case (returning
None), the post-processing method is probably supposed to modify
the received metafeature values in-place (which is perfectly
fine). In the second case (returning three lists), these lists
will be considered new metafeatures and will be appended to the
MFE output before being given to the user. These lists must follow
the order given below:
1. New metafeature names
2. New metafeature values
3. Time elapsed to extract every new metafeature
Now, let's take a quick look at the common post-processing method
arguments. Note that all the arguments listed below are actual
arguments from the pymfe framework, and you can use them in your
post-processing methods.
Arguments
---------
mtf_names : :obj:`list` of str
A list containing all previously extracted metafeature names.
mtf_vals : :obj:`list` of float
A list containing all previously extracted metafeature values.
mtf_time : :obj:`list` of float
A list containing all time elapsed for each metafeature
previously extracted.
class_indexes : Sequence of int
Indexes of the metafeatures related to this method's ``groups of
interest``. The ``groups of interest`` are the metafeature groups
whose names appear in this method's name after the ``postprocess_``
prefix, separated by underscores (in this example, they are
``groupName1`` and ``groupName2``.)
If it is not clear so far, the metafeatures received
in this method are all the metafeatures extracted in every MFE
class, not just the ones related to this class. This
argument can therefore be used as a reference to target only the
metafeatures effectively used in this post-processing method.
If you need every single extracted metafeature in your
post-processing method, then this argument does not matter (nor
does your post-processing method name, as long as it is correctly
prefixed with ``postprocess_``), as every metafeature is of your
particular interest and there is no need for an auxiliary
list to split the metafeatures.
groups : :obj:`tuple` of str
Extracted metafeature groups (including metafeature groups
inserted due to group dependencies). Can be used as reference
inside the post-processing method.
inserted_group_dep : :obj:`frozenset` of :obj:`str`
Metafeature groups inserted due to group dependencies. Can be
used as a reference inside the post-processing method.
**kwargs:
Just like in the precomputation methods, ``kwargs`` is also
mandatory in post-processing methods. It can be used to
retrieve additional arguments using its ``get`` method.
Returns
-------
if not None:
:obj:`tuple` with three :obj:`list`
These lists are (necessarily in this order):
1. New metafeature names
2. New metafeature values
3. Time elapsed to extract every new metafeature
"""
# Sometimes you may need to silence pylint when you are not using
# some arguments, such as kwargs. Keep in mind that this should
# not be abused just to avoid pylint warnings. Always take some
# time to fix your code.
# pylint: disable=W0613
new_mtf_names = [] # type: t.List[str]
new_mtf_vals = [] # type: t.List[float]
new_mtf_time = [] # type: t.List[float]
# In this example, this post-processing method returns
# new metafeatures conditionally. Note that the variable
# ``change_in_place`` is fabricated for this example; it
# is not an actual feature of the pymfe framework. The
# decision of whether or not to change metafeatures in
# place depends on your particular context.
change_in_place = kwargs.get("change_in_place", False)
if change_in_place:
# Make changes in-place using the ``class_indexes`` as
# reference. Note that these indexes are collected using
# this post-processing method name as reference (check the
# documentation of this method for a clear explanation.)
for index in class_indexes:
time_start = time.time()
mtf_vals[index] *= 2.0
mtf_names[index] += ".twice"
mtf_time[index] += time.time() - time_start
# Don't return new metafeatures, as the changes made are
# in-place in this particular situation.
return None
# The previous branch was not taken: therefore, the changes
# are not in-place. This means that new metafeatures will be
# created and appended to the previously existing ones. Note
# that whether the new feature values should be identical
# to their in-place variants is context dependent. If you have
# good reasons to make them different, then you are allowed to.
# Create new metafeatures (in this case, the user will receive
# twice as many values, as separate metafeatures.) Note that the
# number of new metafeatures is also context dependent: your
# post-processing method may return as many new metafeatures as
# it is supposed to return.
for index in class_indexes:
time_start = time.time()
# Read from the previously extracted lists and append to the new ones.
new_mtf_vals.append(-1.0 * mtf_vals[index])
new_mtf_names.append("{}.negative".format(mtf_names[index]))
new_mtf_time.append(mtf_time[index] + time.time() - time_start)
# Finally:
# Return new metafeatures produced in this method. Pay attention to the
# order of these lists, as it must be preserved for any post-processing
# method.
return new_mtf_names, new_mtf_vals, new_mtf_time
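To close, a small sketch of the naming trick used by post-processing methods (the helper below is hypothetical, not a pymfe function): the groups of interest are simply whatever follows the ``postprocess_`` prefix, split on underscores.
def _groups_of_interest(method_name):
    # Recover the group names embedded in a post-processing method name.
    suffix = method_name[len("postprocess_"):]
    return tuple(suffix.split("_")) if suffix else tuple()

print(_groups_of_interest("postprocess_infotheory_statistical"))
# -> ('infotheory', 'statistical')
print(_groups_of_interest("postprocess_groupName1_groupName2"))
# -> ('groupName1', 'groupName2')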
What is new in the PyMFE package?
The PyMFE releases are available on PyPI and GitHub.
Version 0.3.0
Metafeature extraction with confidence intervals
Pydoc fixes and package documentation/code consistency improvements
Reformatted the ‘model-based’ group metafeature extraction method arguments into a consistent format (all model-based metafeatures now receive a single mandatory argument ‘dt_model’, and all other arguments are optional arguments from precomputations.) Now it is much easier to use those methods directly, without the main class (mfe), if desired.
Now accepting user custom arguments in precomputation methods.
Added the ‘extract_from_model’ MFE method, making it easy to extract model-based metafeatures from a pre-fitted model without using the training data.
Memory issues
Now handling memory errors in precomputations, postcomputations, and metafeature extraction as regular exceptions.
Categorical attributes one-hot encoding option
Added option to encode categorical attributes using one-hot encoding instead of the current gray encoding.
New nan-resilient summary functions
All summary functions can now be calculated ignoring ‘nan’ values, using their nan-resilient versions.
Online documentation improvement
Version 0.2.0
New meta-feature groups
Complexity
Itemset
Concept
New feature in MFE to list meta-feature description and references
Dev class update
Integration, system tests, tests updates
Old module reviews
Docstring improvement
Online documentation improvement
Clustering group updated
Landmarking group updated
Statistical group updated
Version 0.1.1
Bugs solved
False positive of mypy fixed
The contributing link is now working
We added a note about how to add a new meta-feature
Modified the ‘verbosity’ argument (of the ‘extract’ method) from boolean to integer, so the user can choose the desired level of verbosity. Verbosity = 1 shows a progress bar during the meta-feature extraction process. Verbosity = 2 keeps all the previous verbose messages (i.e., it logs every “extract” step) and adds information about the current percentage of progress done so far.
Version 0.1.0
Meta-feature groups available
Relative landmarking
Clustering-based
Relative subsampling landmarking
Makefile to help developers
New Functionalities
Now you can list available groups
Now you can list available meta-features
Documentation
New examples
New README
Bugs
Problems parsing categorical meta-features solved
Categorization of attributes with constant values solved
Test
Several new tests added
Version 0.0.3
Documentation improvement
Setup improvement
Meta-feature groups available:
Simple
Statistical
Information-theoretic
Model-based
Landmarking
About us
Contributors
You can find the contributors of this package here.
Citing PyMFE
If you use pymfe in a scientific publication, we would appreciate citations to the following paper:
You can also use the bibtex format:
@article{JMLR:v21:19-348,
author = {Edesio Alcobaça and
Felipe Siqueira and
Adriano Rivolli and
Luís P. F. Garcia and
Jefferson T. Oliva and
André C. P. L. F. de Carvalho
},
title = {MFE: Towards reproducible meta-feature extraction},
journal = {Journal of Machine Learning Research},
year = {2020},
volume = {21},
number = {111},
pages = {1-5},
url = {http://jmlr.org/papers/v21/19-348.html}
}
Getting started
Information to install, test, and contribute to the package.
API Documentation
In this section, we document expected types and allowed features for all functions, and all parameters available for the meta-feature extraction.
Examples
A set of examples illustrating the use of the PyMFE package. In this section you will learn how PyMFE works, along with patterns, tips, and more.
What’s new?
Log of the PyMFE history.
About PyMFE
If you would like to know more about this project, how to cite it, and the contributors, see this section.