The PyMFE example gallery

This gallery collects examples that show how to use the package and walk you through the meta-feature extraction process.

In the Meta-learning (MtL) literature, meta-features are measures used to characterize data sets and/or their relations with algorithm bias. According to Brazdil et al. (2008), “Meta-learning is the study of principled methods that exploit meta-knowledge to obtain efficient models and solutions by adapting the machine learning and data mining process”.

Meta-features are used in MtL and AutoML tasks in general: to represent or understand a dataset, to understand a learning bias, to build machine learning (or data mining) recommendation systems, and to build surrogate models, to name a few.

Pinto et al. (2016) and Rivolli et al. (2018) defined a meta-feature as follows. Let \(D \in \mathcal{D}\) be a dataset, \(m\colon \mathcal{D} \to \mathbb{R}^{k'}\) be a characterization measure, and \(\sigma\colon \mathbb{R}^{k'} \to \mathbb{R}^{k}\) be a summarization function. Both \(m\) and \(\sigma\) also have associated hyperparameters, \(h_m\) and \(h_\sigma\), respectively. Thus, a meta-feature \(f\colon \mathcal{D} \to \mathbb{R}^{k}\) for a given dataset \(D\) is:

\[f\big(D\big) = \sigma\big(m(D,h_m), h_\sigma\big).\]

The measure \(m\) can extract more than one value from each dataset, i.e., \(k'\) can vary according to \(D\). These values are then mapped to a vector of fixed length \(k\) by the summarization function \(\sigma\).
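The definition above can be sketched in a few lines of plain Python. This is only an illustration of the composition \(\sigma(m(D, h_m), h_\sigma)\), not PyMFE's implementation: the function names, the choice of per-attribute standard deviation as the measure, and the `ddof` and `funcs` hyperparameters are all assumptions made for this example.

```python
from statistics import mean, pstdev, stdev


def measure_attr_sd(dataset, ddof=1):
    """Hypothetical measure m: one standard deviation per attribute (k' values).

    `ddof` plays the role of the hyperparameter h_m
    (1 = sample standard deviation, anything else = population).
    """
    columns = list(zip(*dataset))  # transpose rows into attribute columns
    sd = stdev if ddof == 1 else pstdev
    return [sd(col) for col in columns]


def summarize(values, funcs=(mean, max)):
    """Hypothetical summarization sigma: k' values -> k = len(funcs) values.

    `funcs` plays the role of the hyperparameter h_sigma.
    """
    return [func(values) for func in funcs]


def meta_feature(dataset, ddof=1, funcs=(mean, max)):
    """f(D) = sigma(m(D, h_m), h_sigma)."""
    return summarize(measure_attr_sd(dataset, ddof=ddof), funcs=funcs)


# Two toy datasets with different numbers of attributes (k' = 2 and k' = 4).
small = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
wide = [[1.0, 0.0, 2.0, 4.0], [3.0, 0.0, 2.0, 8.0]]

print(measure_attr_sd(small))  # k' = 2 raw values
print(measure_attr_sd(wide))   # k' = 4 raw values
print(meta_feature(small))     # k = 2 summarized values
print(meta_feature(wide))      # k = 2 summarized values
```

Although the two datasets yield a different number of raw values, both end up described by a vector of the same length, which is what makes meta-features comparable across datasets.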

In this package, we provide the following meta-feature groups:

  • General: General information related to the dataset, also known as simple measures, such as the number of instances, attributes and classes.

  • Statistical: Standard statistical measures to describe the numerical properties of data distribution.

  • Information-theoretic: Particularly appropriate to describe discrete (categorical) attributes and their relationship with the classes.

  • Model-based: Measures designed to extract characteristics from simple machine learning models.

  • Landmarking: Performance of simple and efficient learning algorithms.

  • Relative Landmarking: Relative performance of simple and efficient learning algorithms.

  • Subsampling Landmarking: Performance of simple and efficient learning algorithms from a subsample of the dataset.

  • Clustering: Measures that extract information about the dataset based on external validation indexes.

  • Concept: Estimate the variability of class labels among examples and the density of the examples.

  • Itemset: Compute the correlation between binary attributes.

  • Complexity: Estimate the difficulty in separating the data points into their expected classes.

Below is a gallery of examples:

Introductory Examples

Introductory examples for the PyMFE package.

  • Extracting meta-features from unsupervised learning
  • Meta-features from a model
  • Using Summaries
  • Select specific measures and summaries
  • Basic of meta-features extraction
  • Extracting meta-features by group

Advanced Examples

These examples show some advanced configurations and tricks that make the package more comfortable to use.

  • Customizing measures arguments
  • Meta-feature confidence interval

Miscellaneous Examples

Miscellaneous examples for the PyMFE package.

  • Extracting large number of metafeatures
  • Metafeature description
  • Listing available metafeatures, groups, and summaries
  • Working with the results
  • Plotting elapsed time in a meta-feature extraction
  • Using Pandas, CSV and ARFF files

Examples for Developers

These examples are for anyone who wishes to contribute to the development of the package or to understand more about it. We hope they show you the basics of the PyMFE architecture and inspire you to contribute.

  • A developer sample class for Metafeature groups.

Download all examples in Python source code: auto_examples_python.zip

Download all examples in Jupyter notebooks: auto_examples_jupyter.zip

Gallery generated by Sphinx-Gallery


© Copyright 2018-2021, Edesio Alcobaça, Felipe Siqueira. Revision 50131572.
