Listing available metafeatures, groups, and summaries

In this example, we show how to list the metafeature groups, metafeatures, and summary functions available in PyMFE.

from pymfe.mfe import MFE

Print all available metafeature groups from the PyMFE package.

('landmarking', 'general', 'statistical', 'model-based', 'info-theory', 'relative', 'clustering', 'complexity', 'itemset', 'concept')

Actually, there is no need to instantiate a model for that; the method can be called on the `MFE` class itself.

('landmarking', 'general', 'statistical', 'model-based', 'info-theory', 'relative', 'clustering', 'complexity', 'itemset', 'concept')

Print the available metafeatures from some groups of the PyMFE package. If no parameter is given (or it is ‘None’), all available metafeatures are returned.

('ch', 'int', 'nre', 'pb', 'sc', 'sil', 'vdb', 'vdu', 'c1', 'c2', 'cls_coef', 'density', 'f1', 'f1v', 'f2', 'f3', 'f4', 'hubs', 'l1', 'l2', 'l3', 'lsc', 'n1', 'n2', 'n3', 'n4', 't1', 't2', 't3', 't4', 'cohesiveness', 'conceptvar', 'impconceptvar', 'wg_dist', 'attr_conc', 'attr_ent', 'class_conc', 'class_ent', 'eq_num_attr', 'joint_ent', 'mut_inf', 'ns_ratio', 'best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node', 'attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'can_cor', 'cor', 'cov', 'eigenvalues', 'g_mean', 'gravity', 'h_mean', 'iq_range', 'kurtosis', 'lh_trace', 'mad', 'max', 'mean', 'median', 'min', 'nr_cor_attr', 'nr_disc', 'nr_norm', 'nr_outliers', 'p_trace', 'range', 'roy_root', 'sd', 'sd_ratio', 'skewness', 'sparsity', 't_mean', 'var', 'w_lambda', 'leaves', 'leaves_branch', 'leaves_corrob', 'leaves_homo', 'leaves_per_class', 'nodes', 'nodes_per_attr', 'nodes_per_inst', 'nodes_per_level', 'nodes_repeated', 'tree_depth', 'tree_imbalance', 'tree_shape', 'var_importance', 'one_itemset', 'two_itemset')

Again, there is no need to instantiate a model to invoke this method.

('ch', 'int', 'nre', 'pb', 'sc', 'sil', 'vdb', 'vdu', 'c1', 'c2', 'cls_coef', 'density', 'f1', 'f1v', 'f2', 'f3', 'f4', 'hubs', 'l1', 'l2', 'l3', 'lsc', 'n1', 'n2', 'n3', 'n4', 't1', 't2', 't3', 't4', 'cohesiveness', 'conceptvar', 'impconceptvar', 'wg_dist', 'attr_conc', 'attr_ent', 'class_conc', 'class_ent', 'eq_num_attr', 'joint_ent', 'mut_inf', 'ns_ratio', 'best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node', 'attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'can_cor', 'cor', 'cov', 'eigenvalues', 'g_mean', 'gravity', 'h_mean', 'iq_range', 'kurtosis', 'lh_trace', 'mad', 'max', 'mean', 'median', 'min', 'nr_cor_attr', 'nr_disc', 'nr_norm', 'nr_outliers', 'p_trace', 'range', 'roy_root', 'sd', 'sd_ratio', 'skewness', 'sparsity', 't_mean', 'var', 'w_lambda', 'leaves', 'leaves_branch', 'leaves_corrob', 'leaves_homo', 'leaves_per_class', 'nodes', 'nodes_per_attr', 'nodes_per_inst', 'nodes_per_level', 'nodes_repeated', 'tree_depth', 'tree_imbalance', 'tree_shape', 'var_importance', 'one_itemset', 'two_itemset')

You can specify a group name or a collection of group names to list only the metafeatures available in those groups.

mtfs_landmarking = MFE.valid_metafeatures(groups="landmarking")
print(mtfs_landmarking)

mtfs_subset = MFE.valid_metafeatures(groups=["general", "relative"])
print(mtfs_subset)

('best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node')
('attr_to_inst', 'cat_to_num', 'freq_class', 'inst_to_attr', 'nr_attr', 'nr_bin', 'nr_cat', 'nr_class', 'nr_inst', 'nr_num', 'num_to_cat', 'best_node', 'elite_nn', 'linear_discr', 'naive_bayes', 'one_nn', 'random_node', 'worst_node')

Print all available summary functions from the PyMFE package.

('mean', 'nanmean', 'sd', 'nansd', 'var', 'nanvar', 'count', 'nancount', 'histogram', 'nanhistogram', 'iq_range', 'naniq_range', 'kurtosis', 'nankurtosis', 'max', 'nanmax', 'median', 'nanmedian', 'min', 'nanmin', 'quantiles', 'nanquantiles', 'range', 'nanrange', 'skewness', 'nanskewness', 'sum', 'nansum', 'powersum', 'pnorm', 'nanpowersum', 'nanpnorm')

Once again, there is no need to instantiate a model to accomplish this.

('mean', 'nanmean', 'sd', 'nansd', 'var', 'nanvar', 'count', 'nancount', 'histogram', 'nanhistogram', 'iq_range', 'naniq_range', 'kurtosis', 'nankurtosis', 'max', 'nanmax', 'median', 'nanmedian', 'min', 'nanmin', 'quantiles', 'nanquantiles', 'range', 'nanrange', 'skewness', 'nanskewness', 'sum', 'nansum', 'powersum', 'pnorm', 'nanpowersum', 'nanpnorm')
