treelite
treelite::model_loader::sklearn Namespace Reference

Functions

std::unique_ptr< treelite::Model > LoadRandomForestRegressor (int n_estimators, int n_features, int n_targets, std::int64_t const *node_count, std::int64_t const **children_left, std::int64_t const **children_right, std::int64_t const **feature, double const **threshold, double const **value, std::int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity)
 Load a scikit-learn RandomForestRegressor model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesRegressor). More...
 
std::unique_ptr< treelite::Model > LoadIsolationForest (int n_estimators, int n_features, std::int64_t const *node_count, std::int64_t const **children_left, std::int64_t const **children_right, std::int64_t const **feature, double const **threshold, double const **value, std::int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double ratio_c)
 Load a scikit-learn IsolationForest model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. More...
 
std::unique_ptr< treelite::Model > LoadRandomForestClassifier (int n_estimators, int n_features, int n_targets, std::int32_t const *n_classes, std::int64_t const *node_count, std::int64_t const **children_left, std::int64_t const **children_right, std::int64_t const **feature, double const **threshold, double const **value, std::int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity)
 Load a scikit-learn RandomForestClassifier model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesClassifier). More...
 
std::unique_ptr< treelite::Model > LoadGradientBoostingRegressor (int n_iter, int n_features, std::int64_t const *node_count, std::int64_t const **children_left, std::int64_t const **children_right, std::int64_t const **feature, double const **threshold, double const **value, std::int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double const *baseline_prediction)
 Load a scikit-learn GradientBoostingRegressor model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note: GradientBoostingRegressor does not support multiple targets (outputs). More...
 
std::unique_ptr< treelite::Model > LoadGradientBoostingClassifier (int n_iter, int n_features, int n_classes, std::int64_t const *node_count, std::int64_t const **children_left, std::int64_t const **children_right, std::int64_t const **feature, double const **threshold, double const **value, std::int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double const *baseline_prediction)
 Load a scikit-learn GradientBoostingClassifier model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note: GradientBoostingClassifier does not support multiple targets (outputs). More...
 
std::unique_ptr< treelite::Model > LoadHistGradientBoostingRegressor (int n_iter, int n_features, std::int64_t const *node_count, void const **nodes, int expected_sizeof_node_struct, std::uint32_t n_categorical_splits, std::uint32_t const **raw_left_cat_bitsets, std::uint32_t const *known_cat_bitsets, std::uint32_t const *known_cat_bitsets_offset_map, std::int32_t const *features_map, std::int64_t const **categories_map, double const *base_scores)
 Load a scikit-learn HistGradientBoostingRegressor model from a collection of arrays. Note: HistGradientBoostingRegressor does not support multiple targets (outputs). More...
 
std::unique_ptr< treelite::Model > LoadHistGradientBoostingClassifier (int n_iter, int n_features, int n_classes, std::int64_t const *node_count, void const **nodes, int expected_sizeof_node_struct, std::uint32_t n_categorical_splits, std::uint32_t const **raw_left_cat_bitsets, std::uint32_t const *known_cat_bitsets, std::uint32_t const *known_cat_bitsets_offset_map, std::int32_t const *features_map, std::int64_t const **categories_map, double const *base_scores)
 Load a scikit-learn HistGradientBoostingClassifier model from a collection of arrays. Note: HistGradientBoostingClassifier does not support multiple targets (outputs). More...
 

Function Documentation

◆ LoadGradientBoostingClassifier()

std::unique_ptr<treelite::Model> treelite::model_loader::sklearn::LoadGradientBoostingClassifier ( int  n_iter,
int  n_features,
int  n_classes,
std::int64_t const *  node_count,
std::int64_t const **  children_left,
std::int64_t const **  children_right,
std::int64_t const **  feature,
double const **  threshold,
double const **  value,
std::int64_t const **  n_node_samples,
double const **  weighted_n_node_samples,
double const **  impurity,
double const *  baseline_prediction 
)

Load a scikit-learn GradientBoostingClassifier model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note: GradientBoostingClassifier does not support multiple targets (outputs).

Parameters
n_iter - Number of boosting iterations
n_features - Number of features in the training data
n_classes - Number of classes in the target variable
node_count - node_count[i] stores the number of nodes in the i-th tree
children_left - children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right - children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature - feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold - threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value - value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples - n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples - weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity - impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
baseline_prediction - Baseline predictions for outputs. At prediction time, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (n_classes,)
Returns
Loaded model

◆ LoadGradientBoostingRegressor()

std::unique_ptr<treelite::Model> treelite::model_loader::sklearn::LoadGradientBoostingRegressor ( int  n_iter,
int  n_features,
std::int64_t const *  node_count,
std::int64_t const **  children_left,
std::int64_t const **  children_right,
std::int64_t const **  feature,
double const **  threshold,
double const **  value,
std::int64_t const **  n_node_samples,
double const **  weighted_n_node_samples,
double const **  impurity,
double const *  baseline_prediction 
)

Load a scikit-learn GradientBoostingRegressor model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note: GradientBoostingRegressor does not support multiple targets (outputs).

Parameters
n_iter - Number of boosting iterations
n_features - Number of features in the training data
node_count - node_count[i] stores the number of nodes in the i-th tree
children_left - children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right - children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature - feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold - threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value - value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples - n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples - weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity - impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
baseline_prediction - Baseline predictions for outputs. At prediction time, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,)
Returns
Loaded model
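
Example (a hedged sketch, not part of the documented API): the hypothetical wrapper below assumes the header path treelite/model_loader.h and simply forwards prebuilt per-tree arrays, to highlight that baseline_prediction must point to exactly one value for the regressor. Array construction follows the same pattern as in the LoadRandomForestRegressor example further down.

#include <treelite/model_loader.h>  // assumed header path; adjust to your treelite install

#include <cstdint>
#include <memory>

// Hedged sketch: forwards prebuilt per-tree arrays to the loader, isolating
// the one argument that differs from the random forest loaders, namely the
// baseline term added to the margin before the link function is applied.
std::unique_ptr<treelite::Model> LoadGBRegressorWithBaseline(
    int n_iter, int n_features, std::int64_t const* node_count,
    std::int64_t const** children_left, std::int64_t const** children_right,
    std::int64_t const** feature, double const** threshold,
    double const** value, std::int64_t const** n_node_samples,
    double const** weighted_n_node_samples, double const** impurity,
    double baseline) {
  double const baseline_prediction[] = {baseline};  // required shape: (1,)
  return treelite::model_loader::sklearn::LoadGradientBoostingRegressor(
      n_iter, n_features, node_count, children_left, children_right, feature,
      threshold, value, n_node_samples, weighted_n_node_samples, impurity,
      baseline_prediction);
}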

◆ LoadHistGradientBoostingClassifier()

std::unique_ptr<treelite::Model> treelite::model_loader::sklearn::LoadHistGradientBoostingClassifier ( int  n_iter,
int  n_features,
int  n_classes,
std::int64_t const *  node_count,
void const **  nodes,
int  expected_sizeof_node_struct,
std::uint32_t  n_categorical_splits,
std::uint32_t const **  raw_left_cat_bitsets,
std::uint32_t const *  known_cat_bitsets,
std::uint32_t const *  known_cat_bitsets_offset_map,
std::int32_t const *  features_map,
std::int64_t const **  categories_map,
double const *  base_scores 
)

Load a scikit-learn HistGradientBoostingClassifier model from a collection of arrays. Note: HistGradientBoostingClassifier does not support multiple targets (outputs).

Parameters
n_iter - Number of boosting iterations
n_features - Number of features in the training data
n_classes - Number of classes in the target variable
node_count - node_count[i] stores the number of nodes in the i-th tree
nodes - nodes[i][k] stores the k-th node of the i-th tree.
expected_sizeof_node_struct - Expected size of the Node struct, in bytes
n_categorical_splits - n_categorical_splits[i] stores the number of categorical splits in the i-th tree.
raw_left_cat_bitsets - raw_left_cat_bitsets[i][k] stores the bitmaps for node k of tree i. The bitmaps are used to represent categorical tests. Shape of raw_left_cat_bitsets[i]: (n_categorical_splits, 8)
known_cat_bitsets - Bitsets representing the list of known categories per categorical feature. Shape: (n_categorical_features, 8)
known_cat_bitsets_offset_map - Map from an original feature index to the corresponding index in the known_cat_bitsets array. Shape: (n_features,)
features_map - Mapping to re-order features. This is needed because the HistGradientBoosting estimator internally re-orders features using a ColumnTransformer so that the categorical features come before the numerical features.
categories_map - Mapping to transform categorical features. This is needed because the HistGradientBoosting estimator embeds an OrdinalEncoder. categories_map[i] represents the mapping for the i-th categorical feature.
base_scores - Baseline predictions for outputs. At prediction time, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,) for binary classification; (n_classes,) for multi-class classification
Returns
Loaded model

◆ LoadHistGradientBoostingRegressor()

std::unique_ptr<treelite::Model> treelite::model_loader::sklearn::LoadHistGradientBoostingRegressor ( int  n_iter,
int  n_features,
std::int64_t const *  node_count,
void const **  nodes,
int  expected_sizeof_node_struct,
std::uint32_t  n_categorical_splits,
std::uint32_t const **  raw_left_cat_bitsets,
std::uint32_t const *  known_cat_bitsets,
std::uint32_t const *  known_cat_bitsets_offset_map,
std::int32_t const *  features_map,
std::int64_t const **  categories_map,
double const *  base_scores 
)

Load a scikit-learn HistGradientBoostingRegressor model from a collection of arrays. Note: HistGradientBoostingRegressor does not support multiple targets (outputs).

Parameters
n_iter - Number of boosting iterations
n_features - Number of features in the training data
node_count - node_count[i] stores the number of nodes in the i-th tree
nodes - nodes[i][k] stores the k-th node of the i-th tree.
expected_sizeof_node_struct - Expected size of the Node struct, in bytes
n_categorical_splits - n_categorical_splits[i] stores the number of categorical splits in the i-th tree.
raw_left_cat_bitsets - raw_left_cat_bitsets[i][k] stores the bitmaps for node k of tree i. The bitmaps are used to represent categorical tests. Shape of raw_left_cat_bitsets[i]: (n_categorical_splits, 8)
known_cat_bitsets - Bitsets representing the list of known categories per categorical feature. Shape: (n_categorical_features, 8)
known_cat_bitsets_offset_map - Map from an original feature index to the corresponding index in the known_cat_bitsets array. Shape: (n_features,)
features_map - Mapping to re-order features. This is needed because the HistGradientBoosting estimator internally re-orders features using a ColumnTransformer so that the categorical features come before the numerical features.
categories_map - Mapping to transform categorical features. This is needed because the HistGradientBoosting estimator embeds an OrdinalEncoder. categories_map[i] represents the mapping for the i-th categorical feature.
base_scores - Baseline predictions for outputs. At prediction time, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,)
Returns
Loaded model

◆ LoadIsolationForest()

std::unique_ptr<treelite::Model> treelite::model_loader::sklearn::LoadIsolationForest ( int  n_estimators,
int  n_features,
std::int64_t const *  node_count,
std::int64_t const **  children_left,
std::int64_t const **  children_right,
std::int64_t const **  feature,
double const **  threshold,
double const **  value,
std::int64_t const **  n_node_samples,
double const **  weighted_n_node_samples,
double const **  impurity,
double  ratio_c 
)

Load a scikit-learn IsolationForest model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail.

Parameters
n_estimators - Number of trees in the isolation forest
n_features - Number of features in the training data
node_count - node_count[i] stores the number of nodes in the i-th tree
children_left - children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right - children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature - feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold - threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value - value[i][k] stores the expected isolation depth of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples - n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples - weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity - Not used, but must be passed as an array of arrays, with one entry per tree and node.
ratio_c - Standardizing constant to use for calculation of the anomaly score.
Returns
Loaded model
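
This page does not say how to obtain ratio_c. In scikit-learn the standardizing constant is the average path length c(n) of an unsuccessful binary-search-tree lookup over the subsample size used to fit each tree (Liu et al., 2008); the helper below is a hedged sketch of that formula, not something treelite provides, and should be checked against the value your fitted scikit-learn IsolationForest uses.

#include <cmath>

// Hedged sketch of the standardizing constant c(n) from the Isolation Forest
// paper: c(n) = 2 * H(n - 1) - 2 * (n - 1) / n, with H(i) ~= ln(i) + gamma
// (Euler-Mascheroni constant). Pass the result as ratio_c, where n_samples is
// the subsample size used to fit each tree of the IsolationForest.
double AveragePathLength(double n_samples) {
  if (n_samples <= 1.0) {
    return 0.0;
  }
  if (n_samples == 2.0) {
    return 1.0;
  }
  double const euler_gamma = 0.5772156649;
  double const harmonic_approx = std::log(n_samples - 1.0) + euler_gamma;
  return 2.0 * harmonic_approx - 2.0 * (n_samples - 1.0) / n_samples;
}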

◆ LoadRandomForestClassifier()

std::unique_ptr<treelite::Model> treelite::model_loader::sklearn::LoadRandomForestClassifier ( int  n_estimators,
int  n_features,
int  n_targets,
std::int32_t const *  n_classes,
std::int64_t const *  node_count,
std::int64_t const **  children_left,
std::int64_t const **  children_right,
std::int64_t const **  feature,
double const **  threshold,
double const **  value,
std::int64_t const **  n_node_samples,
double const **  weighted_n_node_samples,
double const **  impurity 
)

Load a scikit-learn RandomForestClassifier model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesClassifier).

Parameters
n_estimators - Number of trees in the random forest
n_features - Number of features in the training data
n_targets - Number of targets (outputs)
n_classes - n_classes[i] stores the number of classes in the i-th target
node_count - node_count[i] stores the number of nodes in the i-th tree
children_left - children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right - children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature - feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold - threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value - value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples - n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples - weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity - impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
Returns
Loaded model

◆ LoadRandomForestRegressor()

std::unique_ptr<treelite::Model> treelite::model_loader::sklearn::LoadRandomForestRegressor ( int  n_estimators,
int  n_features,
int  n_targets,
std::int64_t const *  node_count,
std::int64_t const **  children_left,
std::int64_t const **  children_right,
std::int64_t const **  feature,
double const **  threshold,
double const **  value,
std::int64_t const **  n_node_samples,
double const **  weighted_n_node_samples,
double const **  impurity 
)

Load a scikit-learn RandomForestRegressor model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesRegressor).

Parameters
n_estimators - Number of trees in the random forest
n_features - Number of features in the training data
n_targets - Number of targets (outputs)
node_count - node_count[i] stores the number of nodes in the i-th tree
children_left - children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right - children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature - feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold - threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value - value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples - n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples - weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity - impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
Returns
Loaded model
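
Example (a hedged sketch under stated assumptions): the program below builds the arrays for a one-tree, one-feature, one-target forest and loads it. The header path and the use of scikit-learn's leaf sentinels (-1 for children, -2 for feature and threshold) are assumptions; per the parameter descriptions above, those entries are not read for leaf nodes, so any placeholder would do.

#include <treelite/model_loader.h>  // assumed header path; adjust to your treelite install

#include <cstdint>
#include <memory>
#include <vector>

int main() {
  // Toy tree: root (node 0) splits on feature 0 at threshold 0.5;
  // nodes 1 and 2 are leaves. Leaf entries use scikit-learn-style sentinels,
  // which the loader is documented to ignore for leaf nodes.
  std::vector<std::int64_t> left{1, -1, -1};
  std::vector<std::int64_t> right{2, -1, -1};
  std::vector<std::int64_t> split_feature{0, -2, -2};
  std::vector<double> split_threshold{0.5, -2.0, -2.0};
  std::vector<double> leaf_value{0.0, 1.5, 3.0};  // one output per node (n_targets = 1)
  std::vector<std::int64_t> samples{10, 6, 4};
  std::vector<double> weighted{10.0, 6.0, 4.0};
  std::vector<double> node_impurity{0.8, 0.0, 0.0};

  // One entry per tree; this forest has a single tree.
  std::int64_t const node_count[] = {3};
  std::int64_t const* children_left[] = {left.data()};
  std::int64_t const* children_right[] = {right.data()};
  std::int64_t const* feature[] = {split_feature.data()};
  double const* threshold[] = {split_threshold.data()};
  double const* value[] = {leaf_value.data()};
  std::int64_t const* n_node_samples[] = {samples.data()};
  double const* weighted_n_node_samples[] = {weighted.data()};
  double const* impurity[] = {node_impurity.data()};

  std::unique_ptr<treelite::Model> model =
      treelite::model_loader::sklearn::LoadRandomForestRegressor(
          /*n_estimators=*/1, /*n_features=*/1, /*n_targets=*/1, node_count,
          children_left, children_right, feature, threshold, value,
          n_node_samples, weighted_n_node_samples, impurity);
  return model ? 0 : 1;
}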