pymfe.info_theory.MFEInfoTheory
- class pymfe.info_theory.MFEInfoTheory
Keeps methods for metafeatures of the Information Theory group.
The convention adopted for metafeature extraction related methods is to always start with the ft_ prefix to allow automatic method detection. This prefix is predefined within the _internal module.
All method signatures follow the conventions and restrictions listed below:
- For independent attribute data, X means every type of attribute, N means numeric attributes only, and C stands for categorical attributes only. Note that the categorical attribute sets in X and C, and the numerical attribute sets in X and N, may differ because of data transformations performed while fitting data into the MFE model. These transformations are enabled by, respectively, the transform_num and transform_cat arguments of fit (MFE method).
- Only arguments in the MFE _custom_args_ft attribute (set up inside the fit method) are allowed to be required method arguments. All other arguments must be strictly optional (i.e., they have a predefined default value).
- The initial assumption is that the user can change any optional argument, without any previous verification of the argument value or its type, via the kwargs argument of the extract method of the MFE class.
- The return value of all feature extraction methods should be a single value or a generic list (preferably a np.ndarray) with numeric values.
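The prefix convention above can be illustrated with a small sketch. ToyGroup, detect_methods, and the two toy metafeatures are hypothetical names made up for this example; this is not how pymfe itself is implemented, only one way prefix-based detection could work.

```python
import inspect


class ToyGroup:
    """Hypothetical metafeature group; not part of pymfe."""

    @classmethod
    def ft_mean(cls, N, scale=1.0):
        # `scale` is optional, so a caller may override it through kwargs.
        return scale * sum(N) / len(N)

    @classmethod
    def ft_range(cls, N):
        return max(N) - min(N)

    @classmethod
    def helper(cls, N):
        # No "ft_" prefix, so this method is never detected.
        return len(N)


def detect_methods(group, prefix="ft_"):
    """Map each metafeature name to the callable whose name starts with `prefix`."""
    return {
        name[len(prefix):]: member
        for name, member in inspect.getmembers(group, callable)
        if name.startswith(prefix)
    }


methods = detect_methods(ToyGroup)
values = {name: fn([1.0, 2.0, 4.0]) for name, fn in methods.items()}
scaled_mean = methods["mean"]([1.0, 2.0, 4.0], scale=2.0)
```

Only ft_mean and ft_range are picked up; helper is ignored because it lacks the prefix.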
There is another type of method adopted for automatic detection: methods with the precompute_ prefix. These methods run automatically while fitting data into an MFE model, and their objective is to precompute common values shared between more than one feature extraction method. This strategy trades higher memory consumption for faster feature extraction. Their return value must always be a dictionary whose keys are possible extra arguments for both feature extraction methods and other precomputation methods. Note that precomputed values are shared between all valid feature-extraction modules (e.g., class_freqs computed in module statistical can freely be used by any precomputation or feature extraction method of module landmarking).
- __init__(*args, **kwargs)
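As a rough sketch of the precomputation flow described above: every precompute_ method adds keys to one shared dictionary, and each ft_ method then receives only the arguments it declares. ToyGroup, run, and the two toy metafeatures are hypothetical and only mirror the naming convention; the real MFE fitting logic may differ.

```python
import inspect
from collections import Counter


class ToyGroup:
    """Hypothetical group; names mirror the convention only."""

    @classmethod
    def precompute_class_freq(cls, y=None, **kwargs):
        if y is None:
            return {}
        return {"class_freqs": sorted(Counter(y).values())}

    @classmethod
    def ft_num_classes(cls, y, class_freqs=None):
        if class_freqs is None:  # fall back when nothing was precomputed
            class_freqs = list(Counter(y).values())
        return len(class_freqs)

    @classmethod
    def ft_majority_prop(cls, y, class_freqs=None):
        if class_freqs is None:
            class_freqs = list(Counter(y).values())
        return max(class_freqs) / sum(class_freqs)


def run(group, **data):
    shared = dict(data)
    # 1) Every precompute_ method contributes keys to one shared dict.
    for name, fn in inspect.getmembers(group, callable):
        if name.startswith("precompute_"):
            shared.update(fn(**shared))
    # 2) Each ft_ method receives only the arguments it declares.
    results = {}
    for name, fn in inspect.getmembers(group, callable):
        if name.startswith("ft_"):
            wanted = inspect.signature(fn).parameters
            args = {k: v for k, v in shared.items() if k in wanted}
            results[name[len("ft_"):]] = fn(**args)
    return results


results = run(ToyGroup, y=["a", "b", "a", "a"])
```

Here class_freqs is counted once and reused by both toy metafeatures, which is the memory-for-speed trade-off the text describes.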
Methods
__init__(*args, **kwargs)
ft_attr_conc(C[, max_attr_num, random_state]): Compute concentration coefficient of each pair of distinct attributes.
ft_attr_ent(C[, attr_ent]): Compute Shannon's entropy for each predictive attribute.
ft_class_conc(C, y): Compute concentration coefficient between each attribute and class.
ft_class_ent(y[, class_ent, class_freqs]): Compute target attribute Shannon's entropy.
ft_eq_num_attr(C, y[, class_ent, ...]): Compute the number of attributes equivalent for a predictive task.
ft_joint_ent(C, y[, joint_ent]): Compute the joint entropy between each attribute and class.
ft_mut_inf(C, y[, mut_inf, attr_ent, ...]): Compute the mutual information between each attribute and target.
ft_ns_ratio(C, y[, attr_ent, mut_inf]): Compute the noisiness of attributes.
precompute_class_freq([y]): Precompute the (absolute) frequency of each distinct class.
precompute_entropy([y, C, class_freqs]): Precompute various values related to Shannon's Entropy.
- classmethod ft_attr_conc(C: ndarray, max_attr_num: Optional[int] = 12, random_state: Optional[int] = None) -> ndarray
Compute concentration coefficient of each pair of distinct attributes.
- Parameters
- C
np.ndarray Categorical fitted data.
- max_attr_num
int, optional Maximum number of attributes considered. If C has more attributes than this value, this feature is calculated on a sample of max_attr_num random attributes. If None, then all attributes are considered. Note that the cost of this method is combinatorial in the number of attributes considered.
- random_state
int, optional Used only if max_attr_num is given and C has more attributes than it. This random seed is set before sampling C attributes.
- Returns
np.ndarray Concentration coefficient for each pair of distinct predictive attributes.
References
- 1
Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001.
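To make the combinatorial cost of ft_attr_conc concrete, the sketch below enumerates the attribute pairs that would be compared. It assumes ordered pairs (i, j) with i != j, and it uses Python's random module as a stand-in for whatever sampling the library performs; attribute_pairs is a hypothetical helper, not a pymfe function.

```python
import itertools
import random


def attribute_pairs(num_attr, max_attr_num=12, random_state=None):
    """Pick the column indices to compare and enumerate their ordered pairs."""
    indices = list(range(num_attr))
    if max_attr_num is not None and num_attr > max_attr_num:
        rng = random.Random(random_state)  # seeded before sampling
        indices = sorted(rng.sample(indices, max_attr_num))
    # Ordered pairs (i, j) with i != j: k * (k - 1) of them for k columns.
    return list(itertools.permutations(indices, 2))


pairs_small = attribute_pairs(4)                     # 4 * 3 = 12 pairs
pairs_capped = attribute_pairs(50, max_attr_num=12)  # 12 * 11 = 132 pairs
pairs_all = attribute_pairs(50, max_attr_num=None)   # 50 * 49 = 2450 pairs
```

Under these assumptions, capping 50 attributes at 12 reduces the work from 2450 pairs to 132, which is why max_attr_num defaults to a small value.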
- classmethod ft_attr_ent(C: ndarray, attr_ent: Optional[ndarray] = None) -> ndarray
Compute Shannon’s entropy for each predictive attribute.
The Shannon’s Entropy H of a vector x is defined as:
H(x) = - sum_{val in phi_x}(P(x = val) * log2(P(x = val)))
where phi_x is the set of all distinct values in vector x and P(x = val) is the probability that x assumes the value val in phi_x.
- Parameters
- C
np.ndarray Categorical fitted data.
- attr_ent
np.ndarray, optional This argument is this method's own return value, meant to exploit possible attribute entropy precomputations.
- Returns
np.ndarray Entropy of each predictive attribute.
References
- 1
Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994.
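The entropy definition used by ft_attr_ent can be sketched in plain Python. The functions entropy and attr_entropy are hypothetical helpers, and the data is stored as a list of columns rather than as the np.ndarray the real method receives.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def attr_entropy(columns):
    """Entropy of each column; `columns` is a list of attribute vectors."""
    return [entropy(col) for col in columns]


columns = [
    ["a", "a", "b", "b"],  # two equally likely values: 1 bit
    ["u", "u", "u", "u"],  # constant attribute: 0 bits
    ["p", "q", "r", "s"],  # four equally likely values: 2 bits
]
ents = attr_entropy(columns)
```

A constant attribute carries no information, while an attribute with k equally likely values reaches the maximum of log2(k) bits.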
- classmethod ft_class_conc(C: ndarray, y: ndarray) -> ndarray
Compute concentration coefficient between each attribute and class.
- Parameters
- C
np.ndarray Categorical fitted data.
- y
np.ndarray Target attribute.
- Returns
np.ndarray Concentration coefficient between each predictive attribute and the target attribute (class).
References
- 1
Alexandros Kalousis and Melanie Hilario. Model selection via meta-learning: a comparative study. International Journal on Artificial Intelligence Tools, 10(4):525–554, 2001.
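This page does not state the formula for the concentration coefficient. The sketch below assumes it takes the form of the Goodman and Kruskal tau of y given x; both the formula and the direction (which variable conditions on which) are assumptions here, and concentration is a hypothetical helper rather than the library's code.

```python
from collections import Counter


def concentration(x, y):
    """Goodman-Kruskal tau of y given x (assumed form of the coefficient)."""
    n = len(x)
    joint = Counter(zip(x, y))  # counts of each (x value, y value) pair
    x_marg = Counter(x)
    y_marg = Counter(y)
    sum_y_sq = sum((c / n) ** 2 for c in y_marg.values())
    if sum_y_sq == 1.0:  # y is constant, so the coefficient is undefined
        return float("nan")
    explained = sum(
        (c / n) ** 2 / (x_marg[xv] / n) for (xv, _), c in joint.items()
    )
    return (explained - sum_y_sq) / (1.0 - sum_y_sq)


perfect = concentration([0, 0, 1, 1], [0, 0, 1, 1])      # x determines y
unrelated = concentration([0, 0, 1, 1], [0, 1, 0, 1])    # x says nothing about y
```

Under this assumed definition, the value is 1 when x fully determines y and 0 when the two are independent.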
- classmethod ft_class_ent(y: ndarray, class_ent: Optional[float] = None, class_freqs: Optional[ndarray] = None) -> float
Compute target attribute Shannon’s entropy.
The Shannon’s Entropy H of a vector y is defined as:
H(y) = - sum_{val in phi_y}(P(y = val) * log2(P(y = val)))
where phi_y is the set of all distinct values in vector y and P(y = val) is the probability that y assumes the value val in phi_y.
- Parameters
- y
np.ndarray Target attribute.
- class_ent
float, optional Entropy of the target attribute y. Used to exploit precomputations. If None, this argument is calculated using the method ft_class_ent.
- class_freqs
np.ndarray, optional Absolute frequency of each distinct class in y. This argument is meant to exploit precomputations and is used if class_ent is None.
- Returns
- float
Entropy of the target attribute.
References
- 1
Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994.
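The way the optional arguments of ft_class_ent short-circuit the computation can be sketched as follows. class_entropy is a hypothetical stand-in that mimics the documented argument order of preference (class_ent, then class_freqs, then y); it is not the library's implementation.

```python
import math
from collections import Counter


def class_entropy(y=None, class_ent=None, class_freqs=None):
    """Entropy of the target, reusing precomputed values when available."""
    if class_ent is not None:  # already precomputed: nothing to do
        return class_ent
    if class_freqs is None:    # count classes only when no counts were given
        class_freqs = list(Counter(y).values())
    total = sum(class_freqs)
    return -sum(
        (f / total) * math.log2(f / total) for f in class_freqs if f > 0
    )


balanced = class_entropy(class_freqs=[5, 5])      # two equally frequent classes
skewed = class_entropy(class_freqs=[9, 1])        # one dominant class
from_labels = class_entropy(y=["a", "b", "a", "b"])
reused = class_entropy(class_ent=0.25)            # returned unchanged
```

Balanced classes give the maximum entropy of 1 bit for two classes, and the value drops as one class dominates.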
- classmethod ft_eq_num_attr(C: ndarray, y: ndarray, class_ent: Optional[float] = None, class_freqs: Optional[ndarray] = None, mut_inf: Optional[ndarray] = None) -> float
Compute the number of attributes equivalent for a predictive task.
The attribute equivalence E is defined as:
E = attr_num * (H(y) / sum_x(MI(x, y)))
where H(y) is the Shannon’s Entropy of the target attribute and MI(x, y) is the Mutual Information between the predictive attribute x and target attribute y.
- Parameters
- C
np.ndarray Categorical fitted data.
- y
np.ndarray Target attribute.
- class_ent
float, optional Entropy of the target attribute y. Used to exploit precomputations. If None, this argument is calculated using the method ft_class_ent.
- class_freqs
np.ndarray, optional Absolute frequency of each distinct class in y. This argument is meant to exploit precomputations and is used if class_ent is None.
- mut_inf
np.ndarray, optional Values of mutual information between each numeric attribute of N and target y. As with the arguments above, the purpose of this argument is to exploit precomputations of mutual information. If None, it is calculated using the method ft_mut_inf.
- Returns
- float
Estimated number of equivalent predictive attributes.
References
- 1
Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994.
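A small worked instance of the formula E = attr_num * (H(y) / sum_x(MI(x, y))) is sketched below. The helper names are hypothetical and the data is stored as a list of columns; the sketch does not guard against a zero mutual information sum, which would make the ratio undefined.

```python
import math
from collections import Counter


def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    # MI(x, y) = H(x) + H(y) - H(x, y)
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def eq_num_attr(columns, y):
    total_mi = sum(mutual_information(col, y) for col in columns)
    return len(columns) * entropy(y) / total_mi


y = [0, 0, 1, 1]
columns = [
    [0, 0, 1, 1],  # copy of y: MI = 1 bit
    [0, 1, 0, 1],  # independent of y: MI = 0 bits
]
equivalent = eq_num_attr(columns, y)  # 2 * (1 / 1)
```

With H(y) = 1 bit and an average of 0.5 bits of mutual information per attribute, about two such attributes would be needed to cover the class entropy.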
- classmethod ft_joint_ent(C: ndarray, y: ndarray, joint_ent: Optional[ndarray] = None) -> ndarray
Compute the joint entropy between each attribute and class.
The Joint Entropy H between a predictive attribute x and target attribute y is defined as:
H(x, y) = - sum_{phi_x}(sum_{phi_y}(p_i_j * log2(p_i_j)))
where phi_x and phi_y are the sets of possible distinct values for, respectively, x and y, and p_i_j is defined as:
p_i_j = P(x = phi_x_i, y = phi_y_j)
That is, p_i_j is the joint probability of x assuming a specific value i in the set phi_x simultaneously with y assuming a specific value j in the set phi_y.
- Parameters
- C
np.ndarray Categorical fitted data.
- y
np.ndarray Target attribute.
- joint_ent
np.ndarray, optional This argument is this method's own return value, meant to exploit possible joint entropy precomputations.
- Returns
np.ndarray Estimated joint entropy between each predictive attribute and the target attribute (class attribute).
References
- 1
Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994.
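The joint entropy definition can be sketched by counting co-occurring (x, y) value pairs. joint_entropy is a hypothetical helper; pairs that never occur are simply absent from the counter, which matches the usual convention that a zero-probability term contributes nothing to the sum.

```python
import math
from collections import Counter


def joint_entropy(x, y):
    """H(x, y) in bits, estimated from the observed (x, y) value pairs."""
    n = len(x)
    pair_counts = Counter(zip(x, y))
    return -sum((c / n) * math.log2(c / n) for c in pair_counts.values())


y = [0, 0, 1, 1]
same = joint_entropy([0, 0, 1, 1], y)         # only 2 distinct pairs occur
independent = joint_entropy([0, 1, 0, 1], y)  # all 4 pairs equally likely
```

When x duplicates y the joint entropy equals H(y); when they are independent it equals H(x) + H(y).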
- classmethod ft_mut_inf(C: ndarray, y: ndarray, mut_inf: Optional[ndarray] = None, attr_ent: Optional[ndarray] = None, class_ent: Optional[float] = None, joint_ent: Optional[ndarray] = None, class_freqs: Optional[ndarray] = None) -> ndarray
Compute the mutual information between each attribute and target.
The Mutual Information MI between an independent attribute x and target attribute y is defined as:
MI(x, y) = H(x) + H(y) - H(x, y)
where H(x) and H(y) are, respectively, the Shannon’s Entropy (see the documentation of ft_attr_ent or ft_class_ent for more information) of x and y, and H(x, y) is the joint entropy of x and y (see the ft_joint_ent documentation for more details).
- Parameters
- C
np.ndarray Categorical fitted data.
- y
np.ndarray Target attribute.
- mut_inf
np.ndarray, optional This argument is this method's own return value, meant to exploit possible mutual information precomputations.
- attr_ent
np.ndarray, optional Entropy of each attribute in N. The purpose of this argument is to exploit possible precomputations of attribute entropy. If None, this argument is calculated using the ft_attr_ent method.
- class_ent
float, optional Entropy of the target attribute y. Used to exploit precomputations. If None, this argument is calculated using the method ft_class_ent.
- joint_ent
np.ndarray, optional Joint entropy between each independent attribute in N and target attribute y. If None, this argument is calculated using the method ft_joint_ent.
- class_freqs
np.ndarray, optional Absolute frequency of each distinct class in y.
- Returns
np.ndarray Mutual information between each attribute and the target attribute.
References
- 1
Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994.
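The identity MI(x, y) = H(x) + H(y) - H(x, y), together with the reuse of precomputed entropies, can be sketched for a single attribute as below. The function names are hypothetical, and the real method works on every column of C at once rather than on one vector.

```python
import math
from collections import Counter


def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y, attr_ent=None, class_ent=None, joint_ent=None):
    """MI between one attribute x and target y, reusing precomputed terms."""
    if attr_ent is None:
        attr_ent = entropy(x)
    if class_ent is None:
        class_ent = entropy(y)
    if joint_ent is None:
        joint_ent = entropy(list(zip(x, y)))
    return attr_ent + class_ent - joint_ent


x = ["a", "a", "b", "b", "c", "c"]
y = [0, 0, 1, 1, 1, 1]  # fully determined by x
mi = mutual_information(x, y)
mi_reused = mutual_information(x, y, class_ent=entropy(y))
```

Because x determines y in this toy data, H(x, y) equals H(x) and the mutual information collapses to H(y).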
- classmethod ft_ns_ratio(C: ndarray, y: ndarray, attr_ent: Optional[ndarray] = None, mut_inf: Optional[ndarray] = None) -> float
Compute the noisiness of attributes.
Let y be a target attribute and x one predictive attribute in a dataset N. Noisiness N is defined as:
N = (sum_x(attr_entropy(x)) - sum_x(MI(x, y))) / sum_x(MI(x, y))
where MI(x, y) is the mutual information between target attribute y and predictive attribute x, and all sums are performed over each distinct attribute x in N.
- Parameters
- C
np.ndarray Categorical fitted data.
- y
np.ndarray Target attribute.
- attr_ent
np.ndarray, optional Entropy of each attribute in N. The purpose of this argument is to exploit possible precomputations of attribute entropy. If None, this argument is calculated using the ft_attr_ent method.
- mut_inf
np.ndarray, optional Values of mutual information between each numeric attribute of N and target y. As with the argument above, the purpose of this argument is to exploit precomputations of mutual information. If None, it is calculated using the method ft_mut_inf.
- Returns
- float
Estimated noisiness of the predictive attributes.
References
- 1
Donald Michie, David J. Spiegelhalter, Charles C. Taylor, and John Campbell. Machine Learning, Neural and Statistical Classification, volume 37. Ellis Horwood Upper Saddle River, 1994.
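The noisiness ratio is simple arithmetic once the per-attribute entropies and mutual information values are known. The sketch below takes those two vectors directly; ns_ratio here is a hypothetical helper and the second set of input numbers is made up purely for illustration.

```python
def ns_ratio(attr_ent, mut_inf):
    """(sum of attribute entropies - sum of MI) / sum of MI."""
    total_ent = sum(attr_ent)
    total_mi = sum(mut_inf)
    return (total_ent - total_mi) / total_mi


# Two binary attributes, one identical to the class and one unrelated to it:
# entropies are 1 bit each, mutual information is 1 bit and 0 bits.
toy = ns_ratio(attr_ent=[1.0, 1.0], mut_inf=[1.0, 0.0])  # (2 - 1) / 1

# Illustrative values only: (3.5 - 0.7) / 0.7
noisy = ns_ratio(attr_ent=[2.0, 1.5], mut_inf=[0.5, 0.2])
```

A larger ratio means that more of the information in the attributes is unrelated to the target.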
- classmethod precompute_class_freq(y: Optional[ndarray] = None, **kwargs) -> Dict[str, Any]
Precompute the (absolute) frequency of each distinct class.
- Parameters
- y
np.ndarray, optional Target attribute.
- kwargs
Additional arguments. May contain values previously computed by other precomputation methods, which can help speed up this precomputation.
- Returns
dict With the following precomputed items:
- class_freqs (np.ndarray): absolute frequency of each distinct class in y, if y is not None.
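A minimal sketch of a precomputation method with this contract is shown below. It is a standalone function rather than a classmethod, it returns a plain list instead of an np.ndarray, and the check that skips work when class_freqs is already present in kwargs is an assumption about sensible behavior, not a documented guarantee.

```python
from collections import Counter


def precompute_class_freq(y=None, **kwargs):
    """Return {'class_freqs': ...} when y is available, else an empty dict."""
    precomp = {}
    # Assumed behavior: do nothing if another method already supplied the key.
    if y is not None and "class_freqs" not in kwargs:
        counts = Counter(y)
        # One absolute count per distinct class, ordered by class label.
        precomp["class_freqs"] = [counts[label] for label in sorted(counts)]
    return precomp


with_target = precompute_class_freq(y=["b", "a", "b"])
without_target = precompute_class_freq()
already_done = precompute_class_freq(y=["b", "a", "b"], class_freqs=[1, 2])
```

Returning an empty dictionary when y is missing keeps the method safe to call during fitting of unsupervised data.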
- classmethod precompute_entropy(y: Optional[ndarray] = None, C: Optional[ndarray] = None, class_freqs: Optional[ndarray] = None, **kwargs) -> Dict[str, Any]
Precompute various values related to Shannon’s Entropy.
- Parameters
- C
np.ndarray Categorical fitted data.
- y
np.ndarray Target attribute.
- class_freqs
np.ndarray, optional Absolute frequency of each distinct class in y.
- kwargs
Additional arguments. May contain values previously computed by other precomputation methods, which can help speed up this precomputation.
- Returns
dict With the following precomputed items:
- class_ent (float): Shannon’s Entropy of y, if it is not None.
- attr_ent (np.ndarray): Shannon’s Entropy of each attribute in C, if it is not None.
- joint_ent (np.ndarray): Joint Entropy between each attribute in C and target attribute y, if both are not None.
- mut_inf (np.ndarray): mutual information between each attribute in C and y, if both are not None.
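The conditional structure of the returned dictionary can be sketched as follows. This standalone function stores C as a list of columns and returns plain lists; it mirrors only the documented keys and their conditions, and it ignores class_freqs and kwargs for brevity, so it should be read as an approximation of the contract rather than the library's code.

```python
import math
from collections import Counter


def entropy(values):
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def precompute_entropy(y=None, C=None, class_freqs=None, **kwargs):
    """Build the entropy-related values that are possible with the given data."""
    precomp = {}
    if y is not None:
        precomp["class_ent"] = entropy(y)
    if C is not None:
        precomp["attr_ent"] = [entropy(col) for col in C]
    if y is not None and C is not None:
        precomp["joint_ent"] = [entropy(list(zip(col, y))) for col in C]
        # MI(x, y) = H(x) + H(y) - H(x, y), built from the values above.
        precomp["mut_inf"] = [
            h_x + precomp["class_ent"] - h_xy
            for h_x, h_xy in zip(precomp["attr_ent"], precomp["joint_ent"])
        ]
    return precomp


C = [[0, 0, 1, 1], [0, 1, 0, 1]]  # two attribute columns
y = [0, 0, 1, 1]
full = precompute_entropy(y=y, C=C)
target_only = precompute_entropy(y=y)
```

With both inputs present all four keys are produced; with only y, just class_ent is returned.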