OpenML

Leo Breiman (2001). Random Forests. Machine Learning. 45(1):5-32.

Ross Quinlan (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA.

J. Friedman, T. Hastie, R. Tibshirani (1998). Additive Logistic Regression: a Statistical View of Boosting. Stanford University.

Niels Landwehr, Mark Hall, Eibe Frank (2005). Logistic Model Trees. Machine Learning. 59(1-2):161-205. Marc Sumner, Eibe Frank, Mark Hall: Speeding up Logistic Model Tree Induction. In: 9th European…

Leo Breiman (1996). Bagging predictors. Machine Learning. 24(2):123-140.

Imputation transformer for completing missing values.

A decision tree classifier.

Pipeline of transforms with a final estimator. Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be 'transforms', that is, they must implement fit…

This flow is generated by the automl benchmark (https://github.com/openml/automlbenchmark.git). Repository commit: 75567510ce887b7b8aa857b9a1f9f29d1775813c; constantpredictor version: stable.

A random forest classifier. A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive…

Learner mlr.classif.randomForest from package(s) randomForest.

Pipeline of transforms with a final estimator. Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be 'transforms', that is, they must implement fit…

Imputation transformer for completing missing values.

A decision tree classifier.

Implementation of the scikit-learn classifier API for Keras. Below is a list of SciKeras-specific parameters. For details on other parameters, please see the `tf.keras.Model documentation…

This flow is generated by the automl benchmark (https://github.com/openml/automlbenchmark.git). Repository commit: f0086d1bd6488395413bfe1f6caf8f9a34b8910d; constantpredictor version: stable.

Feature selector that removes all low-variance features. This feature selection algorithm looks only at the features (X), not the desired outputs (y), and can thus be used for unsupervised learning.
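The idea of dropping low-variance features without looking at y can be illustrated with a minimal pure-Python sketch. The function names and the default threshold of 0.0 are assumptions made for illustration; this is not the flow's actual implementation.

```python
from statistics import pvariance


def low_variance_mask(rows, threshold=0.0):
    """Return one boolean per column: True if its variance exceeds threshold."""
    columns = list(zip(*rows))  # transpose rows into columns
    return [pvariance(col) > threshold for col in columns]


def select_features(rows, threshold=0.0):
    """Keep only the columns whose variance exceeds threshold (y is never used)."""
    mask = low_variance_mask(rows, threshold)
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


# Columns 0 and 3 are constant, so only columns 1 and 2 survive.
X = [[0, 2, 0, 3], [0, 1, 4, 3], [0, 1, 1, 3]]
selected = select_features(X)
```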

C-Support Vector Classification. The implementation is based on libsvm. The fit time complexity is more than quadratic with the number of samples, which makes it hard to scale to datasets with more than…

Pipeline of transforms with a final estimator. Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be 'transforms', that is, they must implement fit…

Imputation transformer for completing missing values.

Encode categorical integer features using a one-hot aka one-of-K scheme. The input to this transformer should be a matrix of integers, denoting the values taken on by categorical (discrete) features.…

Standardize features by removing the mean and scaling to unit variance. Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training…

Automatically created tensorflow flow.

Automatically created tensorflow flow.

Automatically created tensorflow flow.

Automatically created tensorflow flow.

Automatically created tensorflow flow.

Automatically created tensorflow flow.

Imputation transformer for completing missing values.

Applies transformers to columns of an array or pandas DataFrame. This estimator allows different columns or column subsets of the input to be transformed separately and the features generated by each…

Encode categorical features as a one-hot numeric array. The input to this transformer should be an array-like of integers or strings, denoting the values taken on by categorical (discrete) features.…
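As a rough illustration of one-hot encoding, the sketch below maps each distinct category to its own 0/1 column. The helper name `one_hot` and the choice to order categories by sorting are assumptions for this example only.

```python
def one_hot(values):
    """Encode a list of categorical values as rows of 0/1 indicators."""
    categories = sorted(set(values))  # assumed ordering: sorted distinct values
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one column is set per sample
        rows.append(row)
    return categories, rows


categories, encoded = one_hot(["red", "green", "red"])
```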

Standardize features by removing the mean and scaling to unit variance. The standard score of a sample `x` is calculated as: z = (x - u) / s where `u` is the mean of the training samples or zero if…
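The standard score z = (x - u) / s can be worked through for a single feature as follows. This is a sketch under two assumptions: s is taken to be the population standard deviation of the training samples, and a constant feature is divided by 1 instead of 0. The function name is hypothetical.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Apply z = (x - u) / s to every value of one feature."""
    u = fmean(values)   # mean of the training samples
    s = pstdev(values)  # assumed: population standard deviation
    if s == 0:
        s = 1.0         # assumed guard for a constant feature
    return [(x - u) / s for x in values]


# Mean is 5 and standard deviation is 2 for this sample.
z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```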

Feature selector that removes all low-variance features. This feature selection algorithm looks only at the features (X), not the desired outputs (y), and can thus be used for unsupervised learning.

C-Support Vector Classification. The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of…

A decision tree classifier.

Imputation transformer for completing missing values.

Imputation transformer for completing missing values.

A random forest classifier. A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive…

Logistic Regression (aka logit, MaxEnt) classifier. In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the 'multi_class' option is set to 'ovr', and uses the…

An extremely randomized tree regressor. Extra-trees differ from classic decision trees in the way they are built. When looking for the best split to separate the samples of a node into two groups,…

Randomized search on hyper parameters. RandomizedSearchCV implements a "fit" and a "score" method. It also implements "predict", "predict_proba", "decision_function", "transform" and…

Randomized search on hyper parameters. RandomizedSearchCV implements a "fit" and a "score" method. It also implements "predict", "predict_proba", "decision_function", "transform" and…

A random forest classifier. A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive…

Apply a power transform featurewise to make data more Gaussian-like. Power transforms are a family of parametric, monotonic transformations that are applied to make data more Gaussian-like. This is…

Scale features using statistics that are robust to outliers. This Scaler removes the median and scales the data according to the quantile range (defaults to IQR: Interquartile Range). The IQR is the…
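To see why median and quantile-range scaling is robust to outliers, the sketch below centers one feature on its median and divides by the interquartile range. The function name, the linear-interpolation quantile method, and the guard for a zero IQR are assumptions for illustration.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Subtract the median and divide by the IQR (Q3 - Q1) of one feature."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # assumed guard so a constant feature does not divide by zero
    center = median(values)
    return [(x - center) / iqr for x in values]


# The outlier 100.0 does not affect the median (3.0) or the IQR (2.0).
scaled = robust_scale([1.0, 2.0, 3.0, 4.0, 100.0])
```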

Transform features by scaling each feature to a given range. This estimator scales and translates each feature individually such that it is in the given range on the training set, e.g. between zero…
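Scaling a feature to a given range can be sketched in a few lines. The function name and the handling of a constant feature (mapped to the lower bound) are assumptions for this example.

```python
def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Linearly map one feature so its training min/max hit the given range."""
    low, high = feature_range
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        return [low for _ in values]  # assumed behaviour for a constant feature
    return [low + (x - vmin) / span * (high - low) for x in values]


unit = min_max_scale([1.0, 2.0, 3.0])
symmetric = min_max_scale([1.0, 2.0, 3.0], feature_range=(-1.0, 1.0))
```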

A random forest regressor. A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive…

Regression based on k-nearest neighbors. The target is predicted by local interpolation of the targets associated with the nearest neighbors in the training set.
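A minimal sketch of this prediction rule is shown below, assuming Euclidean distance and uniform weights (a plain average of the k nearest targets). The function name and the toy data are hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=2):
    """Average the targets of the k training points closest to query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


train_X = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 4.0, 9.0]
# The two nearest points to 1.4 are 1.0 and 2.0, with targets 1.0 and 4.0.
prediction = knn_predict(train_X, train_y, (1.4,), k=2)
```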

Epsilon-Support Vector Regression. The free parameters in the model are C and epsilon. The implementation is based on libsvm. The fit time complexity is more than quadratic with the number of samples…

A decision tree regressor.

Ordinary least squares Linear Regression. LinearRegression fits a linear model with coefficients w = (w1, ..., wp) to minimize the residual sum of squares between the observed targets in the dataset,…

Imputation transformer for completing missing values.

Applies transformers to columns of an array or pandas DataFrame. This estimator allows different columns or column subsets of the input to be transformed separately and the features generated by each…

Encode categorical features as a one-hot numeric array. The input to this transformer should be an array-like of integers or strings, denoting the values taken on by categorical (discrete) features.…

Feature selector that removes all low-variance features. This feature selection algorithm looks only at the features (X), not the desired outputs (y), and can thus be used for unsupervised learning.

An AdaBoost classifier. An AdaBoost [1] classifier is a meta-estimator that begins by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same dataset…

A decision tree classifier.

A decision tree classifier.

Automatically created keras flow.

Automatically created keras flow.

A decision tree classifier.

A decision tree classifier.

A decision tree classifier.

Automatically created keras flow.

Automatically created keras flow.

Automatically created keras flow.

Automatically created keras flow.

Automatically created keras flow.

Automatically created keras flow.

Imputation transformer for completing missing values.

Encode categorical features as a one-hot numeric array. The input to this transformer should be an array-like of integers or strings, denoting the values taken on by categorical (discrete) features.…

A decision tree classifier.

Classifier implementing the k-nearest neighbors vote.

Automatically created keras flow.

Implementation of the scikit-learn API for XGBoost classification.

Standardize features by removing the mean and scaling to unit variance. The standard score of a sample `x` is calculated as: z = (x - u) / s where `u` is the mean of the training samples or zero if…

C-Support Vector Classification. The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of…

Standardize features by removing the mean and scaling to unit variance. Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training…

C-Support Vector Classification. The implementation is based on libsvm. The fit time complexity is more than quadratic with the number of samples, which makes it hard to scale to datasets with more than…

Imputation transformer for completing missing values.

A decision tree classifier.

This flow is generated by the automl benchmark (https://github.com/openml/automlbenchmark). Precise benchmark version information could not be determined. constantpredictor version: stable.

C-Support Vector Classification. The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of…

Automatically created keras flow.

Automatically created keras flow.

Automatically created keras flow.

Automatically created keras flow.