Treelite C API
Treelite exposes a set of C functions to enable interfacing with a variety of languages. This page will be most useful for:
those writing a new language binding (glue code).
those wanting to incorporate functions of Treelite into their own native libraries.
We recommend the Python API for everyday uses.
Use of C and C++ in Treelite
The core logic of Treelite is written in C++ to take advantage of higher-level abstractions. We provide a C-only interface here, as many more programming languages bind with C than with C++. See this page for more details.
Model loader interface
Use the following functions to load decision tree ensemble models from a file. Treelite supports multiple model file formats.
int TreeliteLoadXGBoostModelLegacyBinary(char const *filename, char const *config_json, TreeliteModelHandle *out)
Load a model file generated by XGBoost (dmlc/xgboost), stored in the legacy binary format.
- Parameters:
filename – Name of model file
config_json – JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadXGBoostModelLegacyBinaryFromMemoryBuffer(void const *buf, size_t len, char const *config_json, TreeliteModelHandle *out)
Load an XGBoost model from a memory buffer using the legacy binary format.
- Parameters:
buf – Memory buffer
len – Size of memory buffer
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
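As an illustration of where the buf and len arguments might come from, the sketch below reads a whole file into a heap-allocated buffer. The helper read_file_to_buffer is hypothetical and not part of Treelite; the resulting pointer and length could then be passed as buf and len.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of Treelite): read a whole file into a
 * heap-allocated buffer. Returns NULL on failure. On success, *out_len holds
 * the number of bytes read and the caller must free() the buffer. */
static void *read_file_to_buffer(char const *filename, size_t *out_len) {
  assert(filename != NULL && out_len != NULL);
  FILE *fp = fopen(filename, "rb");
  if (fp == NULL) {
    return NULL;
  }
  void *buf = NULL;
  long size = -1;
  /* Seek to the end to learn the file size, then rewind. */
  if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0
      && fseek(fp, 0, SEEK_SET) == 0) {
    buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf != NULL && fread(buf, 1, (size_t)size, fp) != (size_t)size) {
      free(buf); /* short read: treat as failure */
      buf = NULL;
    }
  }
  fclose(fp);
  if (buf != NULL) {
    *out_len = (size_t)size;
  }
  return buf;
}
```

The buffer stays owned by the caller, so it should be freed once the loader call has returned.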
int TreeliteLoadXGBoostModel(char const *filename, char const *config_json, TreeliteModelHandle *out)
Deprecated. Please use TreeliteLoadXGBoostModelJSON instead.
int TreeliteLoadXGBoostModelFromString(char const *json_str, size_t length, char const *config_json, TreeliteModelHandle *out)
Deprecated. Please use TreeliteLoadXGBoostModelFromJSONString instead.
int TreeliteLoadXGBoostModelJSON(char const *filename, char const *config_json, TreeliteModelHandle *out)
Load a model file generated by XGBoost (dmlc/xgboost), stored in the JSON format.
- Parameters:
filename – Name of model file
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadXGBoostModelFromJSONString(char const *json_str, size_t length, char const *config_json, TreeliteModelHandle *out)
Load an XGBoost model from a JSON string.
- Parameters:
json_str – JSON string containing the XGBoost model
length – Length of the JSON string
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadXGBoostModelUBJSON(char const *filename, char const *config_json, TreeliteModelHandle *out)
Load a model file generated by XGBoost (dmlc/xgboost), stored in the UBJSON format.
- Parameters:
filename – Name of model file
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadXGBoostModelFromUBJSONString(uint8_t const *ubjson_str, size_t length, char const *config_json, TreeliteModelHandle *out)
Load an XGBoost model from a UBJSON string.
- Parameters:
ubjson_str – UBJSON byte sequence
length – Length of the byte sequence
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteDetectXGBoostFormat(char const *filename, char const **out_str)
Inspect the first few bytes of an XGBoost model and heuristically determine whether it’s using the JSON or UBJSON format.
- Parameters:
filename – Name of model file
out_str – String indicating the model type (“json”, “ubjson”, or “unknown”)
- Returns:
0 for success, -1 for failure
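To give a feel for what a first-bytes heuristic can look like, here is a minimal sketch. It is an assumption for illustration, not Treelite's actual implementation: the function guess_xgboost_format is hypothetical, and it relies only on the fact that text JSON and UBJSON both open a top-level object with '{' but differ in the byte that follows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical heuristic (not Treelite's implementation): classify a model by
 * its first two bytes. Returns "json", "ubjson", or "unknown". */
static char const *guess_xgboost_format(unsigned char const *head, size_t len) {
  assert(head != NULL);
  if (len < 2 || head[0] != '{') {
    return "unknown";
  }
  unsigned char c = head[1];
  /* Text JSON: the brace is followed by whitespace, a quoted key, or '}'. */
  if (c == '"' || c == '}' || c == ' ' || c == '\n' || c == '\r' || c == '\t') {
    return "json";
  }
  /* UBJSON: the brace is followed by an integer type marker that encodes the
   * length of the first key ('i', 'U', 'I', 'l', 'L'), or by an optimized
   * container marker ('$', '#'). */
  if (c == 'i' || c == 'U' || c == 'I' || c == 'l' || c == 'L'
      || c == '$' || c == '#') {
    return "ubjson";
  }
  return "unknown";
}
```

The returned string can be compared with strcmp to choose between the JSON and UBJSON loaders.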
int TreeliteLoadLightGBMModel(char const *filename, char const *config_json, TreeliteModelHandle *out)
Load a model file generated by LightGBM (Microsoft/LightGBM). The model file must contain a decision tree ensemble.
- Parameters:
filename – Name of model file
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadLightGBMModelFromString(char const *model_str, char const *config_json, TreeliteModelHandle *out)
Load a LightGBM model from a string. The string should be created with the model_to_string() method in LightGBM.
- Parameters:
model_str – Model string
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model
- Returns:
0 for success, -1 for failure
Model loader interface for scikit-learn models
Use the following functions to load decision tree ensemble models from a scikit-learn model object.
int TreeliteLoadSKLearnRandomForestRegressor(int n_estimators, int n_features, int n_targets, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, TreeliteModelHandle *out)
Load a scikit-learn RandomForestRegressor model from a collection of arrays. Refer to the scikit-learn documentation to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesRegressor).
- Parameters:
n_estimators – Number of trees in the random forest
n_features – Number of features in the training data
n_targets – Number of targets (outputs)
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (gini, entropy etc) associated with node k of the i-th tree.
out – Loaded model
- Returns:
0 for success, -1 for failure
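The array-of-arrays layout can be easier to read with a concrete tree. The sketch below is illustrative only and is not Treelite code: it hand-writes a one-tree, three-node ensemble and walks it for a single row. It assumes scikit-learn's conventions (leaf nodes carry -1 as their child IDs, and a row goes left when its feature value is <= the threshold) and a single target, so that value[i][k] is a scalar. All names (predict_one_tree, tree0_*, all_*) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative traversal (not Treelite code): follow one data row through
 * tree `tree_id` and return the output of the leaf it lands in. */
static double predict_one_tree(int tree_id, int64_t const **children_left,
    int64_t const **children_right, int64_t const **feature,
    double const **threshold, double const **value, double const *row) {
  assert(tree_id >= 0);
  int64_t k = 0; /* node 0 is the root */
  while (children_left[tree_id][k] != -1) { /* -1 marks a leaf (assumption) */
    if (row[feature[tree_id][k]] <= threshold[tree_id][k]) {
      k = children_left[tree_id][k];
    } else {
      k = children_right[tree_id][k];
    }
  }
  return value[tree_id][k];
}

/* Tree 0: node 0 splits on feature 1 at 0.5; node 1 is the left leaf
 * (output 10.0); node 2 is the right leaf (output 20.0). Entries that are
 * "not defined" for a node hold placeholder values. */
static int64_t const tree0_left[] = {1, -1, -1};
static int64_t const tree0_right[] = {2, -1, -1};
static int64_t const tree0_feature[] = {1, -2, -2};
static double const tree0_threshold[] = {0.5, -2.0, -2.0};
static double const tree0_value[] = {0.0, 10.0, 20.0};

/* Outer arrays: one entry per tree, as the int64_t const ** and
 * double const ** parameters expect. */
static int64_t const *all_left[] = {tree0_left};
static int64_t const *all_right[] = {tree0_right};
static int64_t const *all_feature[] = {tree0_feature};
static double const *all_threshold[] = {tree0_threshold};
static double const *all_value[] = {tree0_value};
```

With these arrays, a row whose feature 1 equals 0.25 ends in node 1, and a row whose feature 1 equals 0.75 ends in node 2.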
int TreeliteLoadSKLearnIsolationForest(int n_estimators, int n_features, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double ratio_c, TreeliteModelHandle *out)
Load a scikit-learn IsolationForest model from a collection of arrays. Refer to the scikit-learn documentation to learn the meaning of the arrays in detail.
- Parameters:
n_estimators – Number of trees in the isolation forest
n_features – Number of features in the training data
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the expected isolation depth of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – Not used, but must be passed as array of arrays for each tree and node.
ratio_c – Standardizing constant to use for calculation of the anomaly score.
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadSKLearnRandomForestClassifier(int n_estimators, int n_features, int n_targets, int32_t const *n_classes, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, TreeliteModelHandle *out)
Load a scikit-learn RandomForestClassifier model from a collection of arrays. Refer to the scikit-learn documentation to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesClassifier).
- Parameters:
n_estimators – Number of trees in the random forest
n_features – Number of features in the training data
n_targets – Number of targets (outputs)
n_classes – n_classes[i] stores the number of classes in the i-th target
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (gini, entropy etc) associated with node k of the i-th tree.
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadSKLearnGradientBoostingRegressor(int n_iter, int n_features, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double const *base_scores, TreeliteModelHandle *out)
Load a scikit-learn GradientBoostingRegressor model from a collection of arrays. Refer to the scikit-learn documentation to learn the meaning of the arrays in detail. Note: GradientBoostingRegressor does not support multiple targets (outputs).
- Parameters:
n_iter – Number of boosting iterations
n_features – Number of features in the training data
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (gini, entropy etc) associated with node k of the i-th tree.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,)
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadSKLearnGradientBoostingClassifier(int n_iter, int n_features, int n_classes, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double const *base_scores, TreeliteModelHandle *out)
Load a scikit-learn GradientBoostingClassifier model from a collection of arrays. Refer to the scikit-learn documentation to learn the meaning of the arrays in detail. Note: GradientBoostingClassifier does not support multiple targets (outputs).
- Parameters:
n_iter – Number of boosting iterations
n_features – Number of features in the training data
n_classes – Number of classes in the target variable
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (gini, entropy etc) associated with node k of the i-th tree.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (n_classes,)
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadSKLearnHistGradientBoostingRegressor(int n_iter, int n_features, int64_t const *node_count, void const **nodes, int expected_sizeof_node_struct, uint32_t n_categorical_splits, uint32_t const **raw_left_cat_bitsets, uint32_t const *known_cat_bitsets, uint32_t const *known_cat_bitsets_offset_map, int32_t const *features_map, int64_t const **categories_map, double const *base_scores, TreeliteModelHandle *out)
Load a scikit-learn HistGradientBoostingRegressor model from a collection of arrays. Note: HistGradientBoostingRegressor does not support multiple targets (outputs).
- Parameters:
n_iter – Number of boosting iterations
n_features – Number of features in the training data
node_count – node_count[i] stores the number of nodes in the i-th tree
nodes – nodes[i][k] stores the k-th node of the i-th tree.
expected_sizeof_node_struct – Expected size of Node struct, in bytes
n_categorical_splits – n_categorical_splits[i] stores the number of categorical splits in the i-th tree.
raw_left_cat_bitsets – raw_left_cat_bitsets[i][k] stores the bitmaps for node k of tree i. The bitmaps are used to represent categorical tests. Shape of raw_left_cat_bitsets[i]: (n_categorical_splits, 8)
known_cat_bitsets – Bitsets representing the list of known categories per categorical feature. Shape: (n_categorical_features, 8)
known_cat_bitsets_offset_map – Map from an original feature index to the corresponding index in the known_cat_bitsets array. Shape: (n_features,)
features_map – Mapping to re-order features. This is needed because HistGradientBoosting estimator internally re-orders features using ColumnTransformer so that the categorical features come before the numerical features.
categories_map – Mapping to transform categorical features. This is needed because HistGradientBoosting estimator embeds an OrdinalEncoder. categories_map[i] represents the mapping for i-th categorical feature.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,)
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteLoadSKLearnHistGradientBoostingClassifier(int n_iter, int n_features, int n_classes, int64_t const *node_count, void const **nodes, int expected_sizeof_node_struct, uint32_t n_categorical_splits, uint32_t const **raw_left_cat_bitsets, uint32_t const *known_cat_bitsets, uint32_t const *known_cat_bitsets_offset_map, int32_t const *features_map, int64_t const **categories_map, double const *base_scores, TreeliteModelHandle *out)
Load a scikit-learn HistGradientBoostingClassifier model from a collection of arrays. Note: HistGradientBoostingClassifier does not support multiple targets (outputs).
- Parameters:
n_iter – Number of boosting iterations
n_features – Number of features in the training data
n_classes – Number of classes in the target variable
node_count – node_count[i] stores the number of nodes in the i-th tree
nodes – nodes[i][k] stores the k-th node of the i-th tree.
expected_sizeof_node_struct – Expected size of Node struct, in bytes
n_categorical_splits – n_categorical_splits[i] stores the number of categorical splits in the i-th tree.
raw_left_cat_bitsets – raw_left_cat_bitsets[i][k] stores the bitmaps for node k of tree i. The bitmaps are used to represent categorical tests. Shape of raw_left_cat_bitsets[i]: (n_categorical_splits, 8)
known_cat_bitsets – Bitsets representing the list of known categories per categorical feature. Shape: (n_categorical_features, 8)
known_cat_bitsets_offset_map – Map from an original feature index to the corresponding index in the known_cat_bitsets array. Shape: (n_features,)
features_map – Mapping to re-order features. This is needed because HistGradientBoosting estimator internally re-orders features using ColumnTransformer so that the categorical features come before the numerical features.
categories_map – Mapping to transform categorical features. This is needed because HistGradientBoosting estimator embeds an OrdinalEncoder. categories_map[i] represents the mapping for i-th categorical feature.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,) for binary classification; (n_classes,) for multi-class classification
out – Loaded model
- Returns:
0 for success, -1 for failure
Model builder interface
Use the following functions to incrementally build decision tree ensemble models.
int TreeliteGetModelBuilder(char const *json_str, TreeliteModelBuilderHandle *out)
Initialize a model builder object from a JSON string.
The JSON string must contain all relevant metadata, including:
threshold_type: Type of thresholds in the tree model
leaf_output_type: Type of leaf outputs in the tree model
metadata: Model metadata, consisting of following subfields:
num_feature: Number of features
task_type: Task type
average_tree_output: Whether to average outputs of trees
num_target: Number of targets
num_class: Number of classes. num_class[i] is the number of classes of target i.
leaf_vector_shape: Shape of the output from each leaf node
tree_annotation: Annotation for individual trees, consisting of following subfields:
num_tree: Number of trees
target_id: Target that each tree is associated with
class_id: Class that each tree is associated with
postprocessor: Postprocessor for prediction outputs, consisting of following subfields:
name: Name of postprocessor
config_json: Optional JSON string to configure the postprocessor
base_scores: Baseline scores for targets and classes, before adding tree outputs. Also known as the intercept.
attributes: Arbitrary JSON object, to be stored in the “attributes” field in the model object.
- Parameters:
json_str – JSON string containing relevant metadata.
out – Model builder object
- Returns:
0 for success, -1 for failure
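For orientation, the sketch below assembles a metadata string with the fields listed above for a one-tree, single-target regressor. The helper format_builder_metadata is hypothetical. The field names come from the list above, but the concrete values ("float32", "kRegressor", "identity") and the exact JSON shapes are assumptions that should be checked against the Treelite version in use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Treelite): write builder metadata for a
 * one-tree, single-target regressor into `buf`. The values used here are
 * assumptions, not confirmed by this page.
 * Returns 0 on success, -1 if the buffer is too small. */
static int format_builder_metadata(char *buf, size_t buf_len, int num_feature) {
  assert(buf != NULL && num_feature > 0);
  int n = snprintf(buf, buf_len,
      "{"
      "\"threshold_type\": \"float32\", "
      "\"leaf_output_type\": \"float32\", "
      "\"metadata\": {"
      "\"num_feature\": %d, "
      "\"task_type\": \"kRegressor\", "
      "\"average_tree_output\": false, "
      "\"num_target\": 1, "
      "\"num_class\": [1], "
      "\"leaf_vector_shape\": [1, 1]}, "
      "\"tree_annotation\": "
      "{\"num_tree\": 1, \"target_id\": [0], \"class_id\": [0]}, "
      "\"postprocessor\": {\"name\": \"identity\"}, "
      "\"base_scores\": [0.0]"
      "}",
      num_feature);
  /* snprintf reports the length it needed; reject truncated output. */
  return (n >= 0 && (size_t)n < buf_len) ? 0 : -1;
}
```

The resulting string would be passed as json_str. The optional config_json and attributes fields are left out of this sketch.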
int TreeliteDeleteModelBuilder(TreeliteModelBuilderHandle model_builder)
Delete model builder object from memory.
- Parameters:
model_builder – Model builder object to be deleted
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderStartTree(TreeliteModelBuilderHandle model_builder)
Start a new tree.
- Parameters:
model_builder – Model builder object
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderEndTree(TreeliteModelBuilderHandle model_builder)
End the current tree.
- Parameters:
model_builder – Model builder object
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderStartNode(TreeliteModelBuilderHandle model_builder, int node_key)
Start a new node.
- Parameters:
model_builder – Model builder object
node_key – Integer key that uniquely identifies the node.
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderEndNode(TreeliteModelBuilderHandle model_builder)
End the current node.
- Parameters:
model_builder – Model builder object
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderNumericalTest(TreeliteModelBuilderHandle model_builder, int32_t split_index, double threshold, int default_left, char const *cmp, int left_child_key, int right_child_key)
Declare the current node as a numerical test node, where the test is of form [feature value] [cmp] [threshold]. Data points for which the test evaluates to True will be mapped to the left child node; all other data points (for which the test evaluates to False) will be mapped to the right child node.
- Parameters:
model_builder – Model builder object
split_index – Feature ID
threshold – Threshold
default_left – Whether the missing value should be mapped to the left child
cmp – Comparison operator
left_child_key – Integer key that uniquely identifies the left child node.
right_child_key – Integer key that uniquely identifies the right child node.
- Returns:
0 for success, -1 for failure
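The routing rule for a numerical test can be sketched as a small predicate. This is illustrative only and not Treelite code: the function goes_left is hypothetical, it assumes cmp is one of "<", "<=", "==", ">", ">=", and it assumes that a missing value is encoded as NaN.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* Illustrative only (not Treelite code): return 1 if a data point is mapped
 * to the left child of a numerical test node, 0 if it goes right. */
static int goes_left(double fvalue, char const *cmp, double threshold,
    int default_left) {
  assert(cmp != NULL);
  if (isnan(fvalue)) {
    return default_left != 0; /* missing value: follow the default direction */
  }
  /* The test is [feature value] [cmp] [threshold]; True maps to the left. */
  if (strcmp(cmp, "<") == 0) return fvalue < threshold;
  if (strcmp(cmp, "<=") == 0) return fvalue <= threshold;
  if (strcmp(cmp, "==") == 0) return fvalue == threshold;
  if (strcmp(cmp, ">") == 0) return fvalue > threshold;
  if (strcmp(cmp, ">=") == 0) return fvalue >= threshold;
  assert(0 && "unsupported comparison operator");
  return 0;
}
```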
int TreeliteModelBuilderCategoricalTest(TreeliteModelBuilderHandle model_builder, int32_t split_index, int default_left, uint32_t const *category_list, size_t category_list_len, int category_list_right_child, int left_child_key, int right_child_key)
Declare the current node as a categorical test node, where the test is of form [feature value] ∈ [category list].
- Parameters:
model_builder – Model builder object
split_index – Feature ID
default_left – Whether the missing value should be mapped to the left child
category_list – List of categories to be tested for match
category_list_len – Length of category_list
category_list_right_child – Whether the data points for which the test evaluates to True should be mapped to the right child or the left child.
left_child_key – Integer key that uniquely identifies the left child node.
right_child_key – Integer key that uniquely identifies the right child node.
- Returns:
0 for success, -1 for failure
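The interplay between the category list and category_list_right_child can be sketched as follows. This is an illustration of the rule described above, not Treelite code; categorical_child is a hypothetical helper, and handling of missing values (default_left) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only (not Treelite code): return the key of the child node
 * that a non-missing category value is routed to. */
static int categorical_child(uint32_t category, uint32_t const *category_list,
    size_t category_list_len, int category_list_right_child,
    int left_child_key, int right_child_key) {
  assert(category_list != NULL || category_list_len == 0);
  int matched = 0;
  for (size_t i = 0; i < category_list_len; ++i) {
    if (category_list[i] == category) {
      matched = 1;
      break;
    }
  }
  /* A match goes right when category_list_right_child is set and left
   * otherwise; a non-match goes to the opposite child. */
  if (matched) {
    return category_list_right_child ? right_child_key : left_child_key;
  }
  return category_list_right_child ? left_child_key : right_child_key;
}
```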
int TreeliteModelBuilderLeafScalar(TreeliteModelBuilderHandle model_builder, double leaf_value)
Declare the current node as a leaf node with a scalar output.
- Parameters:
model_builder – Model builder object
leaf_value – Value of leaf output
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderLeafVectorFloat32(TreeliteModelBuilderHandle model_builder, float const *leaf_vector, size_t leaf_vector_len)
Declare the current node as a leaf node with a vector output (float32)
- Parameters:
model_builder – Model builder object
leaf_vector – Value of leaf output
leaf_vector_len – Length of leaf_vector
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderLeafVectorFloat64(TreeliteModelBuilderHandle model_builder, double const *leaf_vector, size_t leaf_vector_len)
Declare the current node as a leaf node with a vector output (float64)
- Parameters:
model_builder – Model builder object
leaf_vector – Value of leaf output
leaf_vector_len – Length of leaf_vector
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderGain(TreeliteModelBuilderHandle model_builder, double gain)
Specify the gain (loss reduction) resulting from the current split.
- Parameters:
model_builder – Model builder object
gain – Gain (loss reduction)
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderDataCount(TreeliteModelBuilderHandle model_builder, uint64_t data_count)
Specify the number of data points (samples) that are mapped to the current node.
- Parameters:
model_builder – Model builder object
data_count – Number of data points
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderSumHess(TreeliteModelBuilderHandle model_builder, double sum_hess)
Specify the weighted sample count or the sum of Hessians for the data points that are mapped to the current node.
- Parameters:
model_builder – Model builder object
sum_hess – Weighted sample count or the sum of Hessians
- Returns:
0 for success, -1 for failure
int TreeliteModelBuilderCommitModel(TreeliteModelBuilderHandle model_builder, TreeliteModelHandle *out)
Conclude model building and obtain the final model object.
- Parameters:
model_builder – Model builder object
out – Final model object
Model manager interface
int TreeliteDumpAsJSON(TreeliteModelHandle handle, int pretty_print, char const **out_json_str)
Dump a model object as a JSON string.
- Parameters:
handle – The handle to the model object
pretty_print – Whether to pretty-print JSON string (0 for false, != 0 for true)
out_json_str – The JSON string
- Returns:
0 for success, -1 for failure
int TreeliteGetInputType(TreeliteModelHandle model, char const **out_str)
Query the input type of a Treelite model object.
- Parameters:
model – Treelite Model object
out_str – String representation of input type
- Returns:
0 for success; -1 for failure
int TreeliteGetOutputType(TreeliteModelHandle model, char const **out_str)
Query the output type of a Treelite model object.
- Parameters:
model – Treelite Model object
out_str – String representation of output type
- Returns:
0 for success; -1 for failure
int TreeliteQueryNumTree(TreeliteModelHandle model, size_t *out)
Query the number of trees in the model.
- Parameters:
model – Model to query
out – Number of trees
- Returns:
0 for success, -1 for failure
int TreeliteQueryNumFeature(TreeliteModelHandle model, int *out)
Query the number of features used in the model.
- Parameters:
model – Model to query
out – Number of features
- Returns:
0 for success, -1 for failure
int TreeliteConcatenateModelObjects(TreeliteModelHandle const *objs, size_t len, TreeliteModelHandle *out)
Concatenate multiple model objects into a single model object by copying all member trees into the destination model object.
- Parameters:
objs – Pointer to the beginning of the list of model objects
len – Number of model objects
out – Used to save the concatenated model
int TreeliteFreeModel(TreeliteModelHandle handle)
Delete model from memory.
- Parameters:
handle – Model to remove
- Returns:
0 for success, -1 for failure
int TreeliteSerializeModelToFile(TreeliteModelHandle handle, char const *filename)
Serialize (persist) a model object to disk.
- Parameters:
handle – Handle to the model object
filename – Name of the file to which to serialize the model. The file will be using a binary format that’s optimized to store the Treelite model object efficiently.
- Returns:
0 for success, -1 for failure
int TreeliteDeserializeModelFromFile(char const *filename, TreeliteModelHandle *out)
Deserialize (load) a model object from disk.
- Parameters:
filename – Name of the file from which to deserialize the model. The file should be created by a call to TreeliteSerializeModelToFile.
out – Handle to the model object
- Returns:
0 for success, -1 for failure
int TreeliteSerializeModelToBytes(TreeliteModelHandle handle, char const **out_bytes, size_t *out_bytes_len)
Serialize (persist) a model object to a byte sequence.
- Parameters:
handle – Handle to the model object
out_bytes – Byte sequence containing serialized model
out_bytes_len – Length of out_bytes
- Returns:
0 for success, -1 for failure
int TreeliteDeserializeModelFromBytes(char const *bytes, size_t bytes_len, TreeliteModelHandle *out)
Deserialize (load) a model object from a byte sequence.
- Parameters:
bytes – Byte sequence containing serialized model. The byte sequence should be created by a call to TreeliteSerializeModelToBytes.
bytes_len – Length of bytes
out – Loaded model
- Returns:
0 for success, -1 for failure
int TreeliteSerializeModelToPyBuffer(TreeliteModelHandle handle, TreelitePyBufferFrame **out_frames, size_t *out_num_frames)
Serialize a model object using the Python buffer protocol (PEP 3118).
- Parameters:
handle – Handle to the model object
out_frames – Pointer to buffer frames
out_num_frames – Number of buffer frames
- Returns:
0 for success, -1 for failure
int TreeliteDeserializeModelFromPyBuffer(TreelitePyBufferFrame *frames, size_t num_frames, TreeliteModelHandle *out)
Deserialize a model object using the Python buffer protocol (PEP 3118).
- Parameters:
frames – Buffer frames
num_frames – Number of buffer frames
out – Loaded model
- Returns:
0 for success, -1 for failure
Getters and setters for the model object
int TreeliteGetHeaderField(TreeliteModelHandle model, char const *name, TreelitePyBufferFrame *out_frame)
Get a field in the header.
This function returns the requested field using the Python buffer protocol (PEP 3118).
- Parameters:
model – Treelite Model object
name – Name of the field
out_frame – Buffer frame representing the requested field
- Returns:
0 for success; -1 for failure
int TreeliteGetTreeField(TreeliteModelHandle model, uint64_t tree_id, char const *name, TreelitePyBufferFrame *out_frame)
Get a field in a tree.
This function returns the requested field using the Python buffer protocol (PEP 3118).
- Parameters:
model – Treelite Model object
tree_id – ID of the tree
name – Name of the field
out_frame – Buffer frame representing the requested field
- Returns:
0 for success; -1 for failure
int TreeliteSetHeaderField(TreeliteModelHandle model, char const *name, TreelitePyBufferFrame frame)
Set a field in the header.
This function accepts the field’s new value using the Python buffer protocol (PEP 3118).
- Parameters:
model – Treelite Model object
name – Name of the field
frame – Buffer frame representing the new value for the field
- Returns:
0 for success; -1 for failure
int TreeliteSetTreeField(TreeliteModelHandle model, uint64_t tree_id, char const *name, TreelitePyBufferFrame frame)
Set a field in a tree.
This function accepts the field’s new value using the Python buffer protocol (PEP 3118).
- Parameters:
model – Treelite Model object
tree_id – ID of the tree
name – Name of the field
frame – Buffer frame representing the new value for the field
- Returns:
0 for success; -1 for failure
General Tree Inference Library (GTIL)
int TreeliteGTILParseConfig(char const *config_json, TreeliteGTILConfigHandle *out)
Load a configuration for GTIL predictor from a JSON string.
- Parameters:
config_json – a JSON string with the following fields:
“nthread” (optional): Number of threads used for initializing DMatrix. Set <= 0 to use all CPU cores.
“predict_type” (required): Must be one of the following.
“default”: Sum over trees and apply post-processing
“raw”: Sum over trees, but don’t apply post-processing; get raw margin scores instead.
“leaf_id”: Output one (integer) leaf ID per tree.
“score_per_tree”: Output one or more margin scores per tree.
out – Parsed configuration
- Returns:
0 for success; -1 for failure
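A configuration string with the two fields described above could be assembled as in the sketch below. The helper format_gtil_config is hypothetical and not part of Treelite. It only formats the JSON text and rejects values of predict_type other than the four listed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Treelite): write a GTIL configuration
 * string into `buf`. Returns 0 on success, -1 if predict_type is not one of
 * the four documented values or if the buffer is too small. */
static int format_gtil_config(char *buf, size_t buf_len, int nthread,
    char const *predict_type) {
  assert(buf != NULL && predict_type != NULL);
  if (strcmp(predict_type, "default") != 0 && strcmp(predict_type, "raw") != 0
      && strcmp(predict_type, "leaf_id") != 0
      && strcmp(predict_type, "score_per_tree") != 0) {
    return -1;
  }
  int n = snprintf(buf, buf_len, "{\"nthread\": %d, \"predict_type\": \"%s\"}",
      nthread, predict_type);
  /* snprintf reports the length it needed; reject truncated output. */
  return (n >= 0 && (size_t)n < buf_len) ? 0 : -1;
}
```

The formatted string would then be passed as config_json.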
int TreeliteGTILDeleteConfig(TreeliteGTILConfigHandle handle)
Delete a GTIL configuration from memory.
- Parameters:
handle – Handle to the GTIL configuration to be deleted
- Returns:
0 for success; -1 for failure
int TreeliteGTILGetOutputShape(TreeliteModelHandle model, uint64_t num_row, TreeliteGTILConfigHandle config, uint64_t const **out, uint64_t *out_ndim)
Given a data matrix, query the necessary shape of array to hold predictions for all data points.
- Parameters:
model – Treelite Model object
num_row – Number of rows in the input
config – Configuration of GTIL predictor. Set this by calling TreeliteGTILParseConfig.
out – Array of dimensions (the output shape)
out_ndim – Number of dimensions in out
- Returns:
0 for success; -1 for failure
int TreeliteGTILPredict(TreeliteModelHandle model, void const *input, char const *input_type, uint64_t num_row, void *output, TreeliteGTILConfigHandle config)
Predict with a 2D dense array.
- Parameters:
model – Treelite Model object
input – The 2D data array, laid out in row-major layout
input_type – Data type of the data matrix
num_row – Number of rows in the data matrix.
output – Pointer to buffer to store the output. Call TreeliteGTILGetOutputShape to get the amount of buffer you should allocate for this parameter.
config – Configuration of GTIL predictor. Set this by calling TreeliteGTILParseConfig.
- Returns:
0 for success; -1 for failure
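Row-major layout means that the entries of one row sit next to each other in memory. The short sketch below shows the index arithmetic; the helper names are hypothetical, and a float buffer is used only as an example element type.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major layout: element (row, col) of a matrix with num_col columns
 * lives at flat index row * num_col + col. */
static size_t row_major_index(size_t row, size_t col, size_t num_col) {
  assert(col < num_col);
  return row * num_col + col;
}

/* Hypothetical helper: store one feature value into a flat input buffer. */
static void set_entry(float *input, size_t num_col, size_t row, size_t col,
    float v) {
  assert(input != NULL);
  input[row_major_index(row, col, num_col)] = v;
}
```

For a matrix with 2 rows and 3 columns, the entry at row 1, column 2 is the last of the 6 elements in the buffer.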
int TreeliteGTILPredictSparse(TreeliteModelHandle model, void const *data, char const *input_type, uint64_t const *col_ind, uint64_t const *row_ptr, uint64_t num_row, void *output, TreeliteGTILConfigHandle config)
Predict with sparse data with CSR (compressed sparse row) layout.
In the CSR layout, data[row_ptr[i]:row_ptr[i+1]] stores the nonzero entries of row i, and col_ind[row_ptr[i]:row_ptr[i+1]] stores the corresponding column indices.
- Parameters:
model – Treelite Model object
data – Nonzero elements in the data matrix
input_type – Data type of the data matrix
col_ind – Feature indices. col_ind[i] indicates the feature index associated with data[i].
row_ptr – Pointer to row headers. Length is num_row + 1.
num_row – Number of rows in the data matrix.
output – Pointer to buffer to store the output. Call TreeliteGTILGetOutputShape to get the amount of buffer you should allocate for this parameter.
config – Configuration of GTIL predictor. Set this by calling TreeliteGTILParseConfig.
- Returns:
0 for success; -1 for failure
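To make the three CSR arrays concrete, the sketch below converts a small dense row-major matrix into data, col_ind, and row_ptr. The helper dense_to_csr is hypothetical and not part of Treelite, and float is used only as an example element type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Treelite): convert a dense row-major
 * matrix to CSR. The caller preallocates `data` and `col_ind` with room for
 * every nonzero entry and `row_ptr` with num_row + 1 entries.
 * Returns the number of nonzero entries written. */
static uint64_t dense_to_csr(float const *dense, uint64_t num_row,
    uint64_t num_col, float *data, uint64_t *col_ind, uint64_t *row_ptr) {
  assert(dense != NULL && data != NULL && col_ind != NULL && row_ptr != NULL);
  uint64_t nnz = 0;
  for (uint64_t i = 0; i < num_row; ++i) {
    row_ptr[i] = nnz; /* row i starts at this offset in data / col_ind */
    for (uint64_t j = 0; j < num_col; ++j) {
      float v = dense[i * num_col + j];
      if (v != 0.0f) {
        data[nnz] = v;
        col_ind[nnz] = j;
        ++nnz;
      }
    }
  }
  row_ptr[num_row] = nnz; /* final entry closes the last row */
  return nnz;
}
```

For the 2-by-3 matrix with rows (1, 0, 2) and (0, 0, 3), this yields data = {1, 2, 3}, col_ind = {0, 2, 2}, and row_ptr = {0, 2, 3}.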
Handle types
Treelite uses C++ classes to define its internal data structures. In order to pass C++ objects to C functions, opaque handles are used. Opaque handles are void* pointers that store raw memory addresses.
typedef void *TreeliteModelHandle
Handle to a decision tree ensemble model.
typedef void *TreeliteModelBuilderHandle
Handle to a model builder object.
typedef void *TreeliteGTILConfigHandle
Handle to a configuration of GTIL predictor.
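The opaque-handle pattern can be sketched in plain C as follows. Everything in this example (ExampleModel, ExampleModelHandle, and the three functions) is hypothetical and only mirrors the shape of the API on this page: the caller sees a void* and an int status code, while the library casts the handle back to its internal type.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical internal type, standing in for a C++ class. */
typedef struct {
  int num_tree;
} ExampleModel;

/* Opaque handle: callers see only a raw address. */
typedef void *ExampleModelHandle;

/* Create an object and hand its address back through `out`. */
static int ExampleCreateModel(int num_tree, ExampleModelHandle *out) {
  assert(out != NULL);
  ExampleModel *model = malloc(sizeof *model);
  if (model == NULL) {
    return -1;
  }
  model->num_tree = num_tree;
  *out = model;
  return 0;
}

/* Cast the handle back to the internal type to read a field. */
static int ExampleQueryNumTree(ExampleModelHandle handle, int *out) {
  if (handle == NULL || out == NULL) {
    return -1;
  }
  *out = ((ExampleModel *)handle)->num_tree;
  return 0;
}

/* Release the object behind the handle. */
static int ExampleFreeModel(ExampleModelHandle handle) {
  free(handle);
  return 0;
}
```

Because the handle carries no type information, it is up to the caller to pass each handle only to the functions that created that kind of object.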