pymfe.itemset.MFEItemset
- class pymfe.itemset.MFEItemset
Keep methods for metafeatures of the Itemset group.
The convention adopted for metafeature extraction methods is to always start their names with the ft_ prefix to allow automatic method detection. This prefix is predefined within the _internal module.
All method signatures follow the conventions and restrictions listed below:
- For independent attribute data, X means every type of attribute, N means numeric attributes only, and C stands for categorical attributes only. Note that the categorical attribute sets of X and C, and the numerical attribute sets of X and N, may differ due to data transformations performed while fitting data into the MFE model. These transformations are enabled by, respectively, the transform_num and transform_cat arguments of fit (MFE method).
- Only arguments in the MFE _custom_args_ft attribute (set up inside the fit method) are allowed to be required method arguments. All other arguments must be strictly optional (i.e., have a predefined default value).
- The initial assumption is that the user can change any optional argument, without any previous verification of the argument value or its type, via the kwargs argument of the extract method of the MFE class.
- The return value of all feature extraction methods should be a single value or a generic List (preferably an np.ndarray) with numeric values.
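The ft_ naming convention lends itself to detection by name. The following is a minimal sketch of that idea in plain Python, not pymfe's actual detection code (which lives in its _internal module and may differ). The class ToyGroup, its methods, and the helper detect_ft_methods are hypothetical names introduced only for illustration.

```python
import inspect


class ToyGroup:
    """Hypothetical metafeature group; not part of pymfe."""

    @classmethod
    def ft_mean(cls, N, scale=1.0):
        # N is the required data argument; scale is optional with a default.
        return scale * sum(N) / len(N)

    @classmethod
    def ft_range(cls, N):
        return max(N) - min(N)

    @classmethod
    def helper(cls, N):
        # No "ft_" prefix, so the detection below ignores this method.
        return sorted(N)


def detect_ft_methods(group, prefix="ft_"):
    """Map feature names to the bound methods whose names start with prefix."""
    return {
        name[len(prefix):]: method
        for name, method in inspect.getmembers(group, predicate=inspect.ismethod)
        if name.startswith(prefix)
    }


features = detect_ft_methods(ToyGroup)
values = {name: method([1.0, 2.0, 4.0]) for name, method in features.items()}
print(sorted(values))
```

Because every non-data argument has a default, each detected method can be called with the data alone, which is what makes this kind of uniform dispatch possible.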
There is another type of method adopted for automatic detection: methods whose names start with the precompute_ prefix. These methods run automatically while fitting data into an MFE model, and their objective is to precompute values shared by more than one feature extraction method. This strategy trades higher memory consumption for faster feature extraction. Their return value must always be a dictionary whose keys are possible extra arguments for both feature extraction methods and other precomputation methods. Note that precomputed values are shared between all valid feature extraction modules (e.g., class_freqs computed in module statistical can freely be used by any precomputation or feature extraction method of module landmarking).
- __init__(*args, **kwargs)
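The precomputation strategy described above can be sketched as follows. This is an illustrative toy under stated assumptions, not pymfe's implementation: ToyGroup, its methods, and run_precomputations are hypothetical names, and only the class_freqs key is borrowed from the text.

```python
from collections import Counter


class ToyGroup:
    """Hypothetical group; method names are illustrative, not pymfe's."""

    @classmethod
    def precompute_class_freqs(cls, y=None, **kwargs):
        # Return a dict whose keys match optional arguments of other methods.
        if y is None or "class_freqs" in kwargs:
            return {}
        return {"class_freqs": Counter(y)}

    @classmethod
    def ft_majority_ratio(cls, y, class_freqs=None):
        # Fall back to computing the value when it was not precomputed.
        if class_freqs is None:
            class_freqs = Counter(y)
        return max(class_freqs.values()) / len(y)

    @classmethod
    def ft_n_classes(cls, y, class_freqs=None):
        if class_freqs is None:
            class_freqs = Counter(y)
        return len(class_freqs)


def run_precomputations(group, **data):
    """Call every precompute_* method and merge the returned dictionaries."""
    shared = {}
    for name in sorted(dir(group)):
        if name.startswith("precompute_"):
            # Earlier results are passed along so later methods can reuse them.
            shared.update(getattr(group, name)(**data, **shared))
    return shared


y = ["a", "b", "a", "a"]
shared = run_precomputations(ToyGroup, y=y)
majority = ToyGroup.ft_majority_ratio(y, **shared)
n_classes = ToyGroup.ft_n_classes(y, **shared)
```

Both feature methods receive the same class_freqs object, so the class frequencies are counted once rather than once per method, at the cost of keeping the dictionary in memory.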
Methods
__init__(*args, **kwargs)
ft_one_itemset(C[, itemset_binary_matrix]) Compute the one itemset meta-feature.
ft_two_itemset(C[, itemset_binary_matrix]) Compute the two itemset meta-feature.
precompute_binary_matrix(C, **kwargs) Precompute the binary representation of attributes.
- classmethod ft_one_itemset(C: ndarray, itemset_binary_matrix: Optional[List[ndarray]] = None) -> ndarray
Compute the one itemset meta-feature.
The one itemset is the individual frequency of each attribute in binary format.
- Parameters
- C
np.ndarray Categorical fitted data.
- itemset_binary_matrix
list of np.ndarray, optional Binary representation of the attributes. Each list element holds the binary representation of one attribute in the dataset.
- Returns
np.ndarray An array with the one-item value for each attribute.
References
- 1
Song, Q., Wang, G., & Wang, C. (2012). Automatic recommendation of classification algorithms based on data set characteristics. Pattern Recognition, 45(7), 2672-2689.
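To make the one itemset description concrete, here is a minimal NumPy sketch. It assumes that "binary format" means a one-hot encoding of each categorical attribute and that the frequency is the fraction of instances in which each binary column is set. The functions binarize and one_itemset are hypothetical stand-ins, not the library's code, and the library's actual output may be organized differently.

```python
import numpy as np


def binarize(C):
    """Return one binary (one-hot) matrix per categorical column of C."""
    return [
        (C[:, [j]] == np.unique(C[:, j])).astype(int)
        for j in range(C.shape[1])
    ]


def one_itemset(C, itemset_binary_matrix=None):
    """Relative frequency of each binary column (assumed definition)."""
    if itemset_binary_matrix is None:
        itemset_binary_matrix = binarize(C)
    # Stack all attributes side by side, then average over the instances.
    B = np.hstack(itemset_binary_matrix)
    return B.sum(axis=0) / C.shape[0]


C = np.array([["a", "x"], ["b", "x"], ["a", "y"], ["a", "x"]])
freqs = one_itemset(C)
print(freqs)
```

In this sketch the result has one entry per distinct attribute value (binary column), ordered attribute by attribute.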
- classmethod ft_two_itemset(C: ndarray, itemset_binary_matrix: Optional[List[ndarray]] = None) -> ndarray
Compute the two itemset meta-feature.
The two itemset meta-feature can be seen as the correlation information between each pair of attribute values in binary format.
- Parameters
- C
np.ndarray Categorical fitted data.
- itemset_binary_matrix
list of np.ndarray, optional Binary representation of the attributes. Each list element holds the binary representation of one attribute in the dataset.
- Returns
np.ndarray An array with the two-item value for each attribute.
References
- 1
Song, Q., Wang, G., & Wang, C. (2012). Automatic recommendation of classification algorithms based on data set characteristics. Pattern Recognition, 45(7), 2672-2689.
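The pairwise idea behind the two itemset meta-feature can be sketched in NumPy as well. The exact formula is not stated here, so this example makes a loudly labeled assumption: for every pair of binary columns taken from two different attributes, it reports the fraction of instances in which exactly one of the two values occurs (an exclusive-or). The functions binarize and two_itemset are hypothetical names, and the library's actual computation may differ.

```python
import itertools

import numpy as np


def binarize(C):
    """Return one binary (one-hot) matrix per categorical column of C."""
    return [
        (C[:, [j]] == np.unique(C[:, j])).astype(int)
        for j in range(C.shape[1])
    ]


def two_itemset(C, itemset_binary_matrix=None):
    """XOR frequency for value pairs of different attributes (assumption)."""
    if itemset_binary_matrix is None:
        itemset_binary_matrix = binarize(C)
    values = []
    # Visit each unordered pair of attributes exactly once.
    for Bi, Bj in itertools.combinations(itemset_binary_matrix, 2):
        for col_i in Bi.T:
            for col_j in Bj.T:
                # Fraction of instances where exactly one value is present.
                values.append(np.logical_xor(col_i, col_j).mean())
    return np.array(values)


C = np.array([["a", "x"], ["b", "y"], ["a", "x"], ["a", "y"]])
pairs = two_itemset(C)
print(pairs)
```

A low value for a pair means the two attribute values tend to appear or be absent together, while a high value means they tend to exclude each other.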
- classmethod precompute_binary_matrix(C: Optional[ndarray], **kwargs) -> Dict[str, Any]
Precompute the binary representation of attributes.
- Parameters
- C
np.ndarray, optional Categorical fitted data.
- **kwargs
Additional arguments. May contain values previously computed by other precomputation methods, which can help speed up this precomputation.
- Returns