Listing available metafeatures, groups, and summaries

This example shows how to list the metafeature groups, metafeatures, and summary functions available in PyMFE.

from sklearn.datasets import load_iris
from pymfe.mfe import MFE

Print all available metafeature groups from the PyMFE package.

('landmarking', 'general', 'statistical', 'model-based', 'info-theory', 'relative', 'clustering', 'complexity', 'itemset', 'concept')

Actually, there is no need to instantiate a model for that; the method can be called directly on the MFE class.

('landmarking', 'general', 'statistical', 'model-based', 'info-theory', 'relative', 'clustering', 'complexity', 'itemset', 'concept')

Print all available metafeatures from some groups of the PyMFE package. If no parameter is given (or it is None), the metafeatures of all groups are returned.

('best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node', 'ch', 'int', 'nre', 'pb', 'sc', 'sil', 'vdb', 'vdu', 'attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'attr_conc', 'attr_ent', 'class_conc', 'class_ent', 'eq_num_attr', 'joint_ent', 'mut_inf', 'ns_ratio', 'leaves', 'leaves_branch', 'leaves_corrob', 'leaves_homo', 'leaves_per_class', 'nodes', 'nodes_per_attr', 'nodes_per_inst', 'nodes_per_level', 'nodes_repeated', 'tree_depth', 'tree_imbalance', 'tree_shape', 'var_importance', 'c1', 'c2', 'cls_coef', 'density', 'f1', 'f1v', 'f2', 'f3', 'f4', 'hubs', 'l1', 'l2', 'l3', 'lsc', 'n1', 'n2', 'n3', 'n4', 't1', 't2', 't3', 't4', 'cohesiveness', 'conceptvar', 'impconceptvar', 'wg_dist', 'can_cor', 'cor', 'cov', 'eigenvalues', 'g_mean', 'gravity', 'h_mean', 'iq_range', 'kurtosis', 'lh_trace', 'mad', 'max', 'mean', 'median', 'min', 'nr_cor_attr', 'nr_disc', 'nr_norm', 'nr_outliers', 'p_trace', 'range', 'roy_root', 'sd', 'sd_ratio', 'skewness', 'sparsity', 't_mean', 'var', 'w_lambda', 'one_itemset', 'two_itemset')

Again, there is no need to instantiate a model to invoke this method.

('best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node', 'ch', 'int', 'nre', 'pb', 'sc', 'sil', 'vdb', 'vdu', 'attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'attr_conc', 'attr_ent', 'class_conc', 'class_ent', 'eq_num_attr', 'joint_ent', 'mut_inf', 'ns_ratio', 'leaves', 'leaves_branch', 'leaves_corrob', 'leaves_homo', 'leaves_per_class', 'nodes', 'nodes_per_attr', 'nodes_per_inst', 'nodes_per_level', 'nodes_repeated', 'tree_depth', 'tree_imbalance', 'tree_shape', 'var_importance', 'c1', 'c2', 'cls_coef', 'density', 'f1', 'f1v', 'f2', 'f3', 'f4', 'hubs', 'l1', 'l2', 'l3', 'lsc', 'n1', 'n2', 'n3', 'n4', 't1', 't2', 't3', 't4', 'cohesiveness', 'conceptvar', 'impconceptvar', 'wg_dist', 'can_cor', 'cor', 'cov', 'eigenvalues', 'g_mean', 'gravity', 'h_mean', 'iq_range', 'kurtosis', 'lh_trace', 'mad', 'max', 'mean', 'median', 'min', 'nr_cor_attr', 'nr_disc', 'nr_norm', 'nr_outliers', 'p_trace', 'range', 'roy_root', 'sd', 'sd_ratio', 'skewness', 'sparsity', 't_mean', 'var', 'w_lambda', 'one_itemset', 'two_itemset')

You can specify a single group name or a collection of group names to list only the corresponding metafeatures.

# A single group name can be passed as a string.
mtfs_landmarking = MFE.valid_metafeatures(groups="landmarking")
print(mtfs_landmarking)

# A collection of group names is also accepted.
mtfs_subset = MFE.valid_metafeatures(groups=["general", "relative"])
print(mtfs_subset)

('best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node')
('attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node')

Print all available summary functions from the PyMFE package.

('mean', 'nanmean', 'sd', 'nansd', 'var', 'nanvar', 'count', 'nancount', 'histogram', 'nanhistogram', 'iq_range', 'naniq_range', 'kurtosis', 'nankurtosis', 'max', 'nanmax', 'median', 'nanmedian', 'min', 'nanmin', 'quantiles', 'nanquantiles', 'range', 'nanrange', 'skewness', 'nanskewness', 'sum', 'nansum', 'powersum', 'pnorm', 'nanpowersum', 'nanpnorm')

Once again, there is no need to instantiate a model to accomplish this.

('mean', 'nanmean', 'sd', 'nansd', 'var', 'nanvar', 'count', 'nancount', 'histogram', 'nanhistogram', 'iq_range', 'naniq_range', 'kurtosis', 'nankurtosis', 'max', 'nanmax', 'median', 'nanmedian', 'min', 'nanmin', 'quantiles', 'nanquantiles', 'range', 'nanrange', 'skewness', 'nanskewness', 'sum', 'nansum', 'powersum', 'pnorm', 'nanpowersum', 'nanpnorm')
