Treelite C API

Treelite exposes a set of C functions to enable interfacing with a variety of languages. This page will be most useful for:

those writing a new language binding (glue code).
those wanting to incorporate functions of Treelite into their own native libraries.

We recommend the Python API for everyday use.

Note

Use of C and C++ in Treelite

The core logic of Treelite is written in C++ to take advantage of higher-level abstractions. We provide a C-only interface here, as many more programming languages bind with C than with C++. See this page for more details.

Model loader interface 

Use the following functions to load decision tree ensemble models from a file. Treelite supports multiple model file formats.

int TreeliteLoadXGBoostModelLegacyBinary(char const *filename, char const *config_json, TreeliteModelHandle *out)

Load a model file generated by XGBoost (dmlc/xgboost), stored in the legacy binary format.

Parameters:

filename – Name of model file
config_json – JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadXGBoostModelLegacyBinaryFromMemoryBuffer(void const *buf, size_t len, char const *config_json, TreeliteModelHandle *out)

Load an XGBoost model from a memory buffer using the legacy binary format.

Parameters:

buf – Memory buffer
len – Size of memory buffer
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadXGBoostModel(char const *filename, char const *config_json, TreeliteModelHandle *out): Deprecated. Please use TreeliteLoadXGBoostModelJSON instead.

int TreeliteLoadXGBoostModelFromString(char const *json_str, size_t length, char const *config_json, TreeliteModelHandle *out): Deprecated. Please use TreeliteLoadXGBoostModelFromJSONString instead.

int TreeliteLoadXGBoostModelJSON(char const *filename, char const *config_json, TreeliteModelHandle *out)

Load a model file generated by XGBoost (dmlc/xgboost), stored in the JSON format.

Parameters:

filename – Name of model file
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadXGBoostModelFromJSONString(char const *json_str, size_t length, char const *config_json, TreeliteModelHandle *out)

Load an XGBoost model from a JSON string.

Parameters:

json_str – JSON string containing the XGBoost model
length – Length of the JSON string
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadXGBoostModelUBJSON(char const *filename, char const *config_json, TreeliteModelHandle *out)

Load a model file generated by XGBoost (dmlc/xgboost), stored in the UBJSON format.

Parameters:

filename – Name of model file
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadXGBoostModelFromUBJSONString(char const *ubjson_str, size_t length, char const *config_json, TreeliteModelHandle *out)

Load an XGBoost model from a UBJSON string.

Parameters:

ubjson_str – UBJSON byte sequence
length – Length of the byte sequence
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteDetectXGBoostFormat(char const *filename, char const **out_str)

Inspect the first few bytes of an XGBoost model and heuristically determine whether it’s using the JSON or UBJSON format.

Parameters:

filename – Name of model file
out_str – String indicating the model type (“json”, “ubjson”, or “unknown”)

Returns:

0 for success, -1 for failure
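A caller would typically branch on the string reported through out_str to pick a loader. The sketch below shows only that branching step in plain C and does not call Treelite itself; the enum XGBFormat and the helper classify_xgboost_format are hypothetical names introduced for illustration, and the loader to call next is indicated in comments only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type; not part of the Treelite API. */
typedef enum {
  XGB_FORMAT_JSON,    /* next step would be TreeliteLoadXGBoostModelJSON */
  XGB_FORMAT_UBJSON,  /* next step would be TreeliteLoadXGBoostModelUBJSON */
  XGB_FORMAT_UNKNOWN  /* neither format was recognized */
} XGBFormat;

/* Map the string reported through out_str to a loader choice. */
XGBFormat classify_xgboost_format(const char *format_str) {
  if (format_str == NULL) return XGB_FORMAT_UNKNOWN;
  if (strcmp(format_str, "json") == 0) return XGB_FORMAT_JSON;
  if (strcmp(format_str, "ubjson") == 0) return XGB_FORMAT_UBJSON;
  return XGB_FORMAT_UNKNOWN;
}
```

Treating every unrecognized string as XGB_FORMAT_UNKNOWN keeps the caller from guessing a loader when the heuristic is inconclusive.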

int TreeliteLoadLightGBMModel(char const *filename, char const *config_json, TreeliteModelHandle *out)

Load a model file generated by LightGBM (Microsoft/LightGBM). The model file must contain a decision tree ensemble.

Parameters:

filename – Name of model file
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadLightGBMModelFromString(char const *model_str, char const *config_json, TreeliteModelHandle *out)

Load a LightGBM model from a string. The string should be created with the model_to_string() method in LightGBM.

Parameters:

model_str – Model string
config_json – Null-terminated JSON string consisting of key-value pairs; used for configuring the model parser
out – Loaded model

Returns:

0 for success, -1 for failure

Model loader interface for scikit-learn models 

Use the following functions to load decision tree ensemble models from a scikit-learn model object.

int TreeliteLoadSKLearnRandomForestRegressor(int n_estimators, int n_features, int n_targets, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, TreeliteModelHandle *out)

Load a scikit-learn RandomForestRegressor model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesRegressor).

Parameters:

n_estimators – Number of trees in the random forest
n_features – Number of features in the training data
n_targets – Number of targets (outputs)
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
out – Loaded model

Returns:

0 for success, -1 for failure
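To make the array-of-arrays layout concrete, here is a minimal sketch of a two-tree forest with one target, laid out with the same pointer types as the split-related parameters above, plus a small traversal that reads the arrays the way the descriptions suggest. It does not call Treelite. The sentinel values (-1 for the children of a leaf, -2 for the undefined feature and threshold of a leaf) follow scikit-learn's convention and are an assumption here, as are the helper names eval_tree and predict_mean.

```c
#include <stdint.h>

enum { N_ESTIMATORS = 2 };

/* Tree 0: node 0 tests x[0] <= 0.5; nodes 1 and 2 are leaves. */
static const int64_t left0[] = {1, -1, -1};
static const int64_t right0[] = {2, -1, -1};
static const int64_t feature0[] = {0, -2, -2};
static const double threshold0[] = {0.5, -2.0, -2.0};
static const double value0[] = {0.0, 1.0, 2.0};

/* Tree 1: node 0 tests x[1] <= 1.5; nodes 1 and 2 are leaves. */
static const int64_t left1[] = {1, -1, -1};
static const int64_t right1[] = {2, -1, -1};
static const int64_t feature1[] = {1, -2, -2};
static const double threshold1[] = {1.5, -2.0, -2.0};
static const double value1[] = {0.0, 10.0, 20.0};

/* Outer arrays are indexed by tree; inner arrays are indexed by node. */
static const int64_t node_count[N_ESTIMATORS] = {3, 3};
static const int64_t *children_left[N_ESTIMATORS] = {left0, left1};
static const int64_t *children_right[N_ESTIMATORS] = {right0, right1};
static const int64_t *feature[N_ESTIMATORS] = {feature0, feature1};
static const double *threshold[N_ESTIMATORS] = {threshold0, threshold1};
static const double *value[N_ESTIMATORS] = {value0, value1};

/* Walk the i-th tree from the root down to a leaf (hypothetical helper). */
double eval_tree(int i, const double *x) {
  int64_t k = 0;
  int64_t step;
  /* node_count[i] bounds the walk in case the arrays are malformed. */
  for (step = 0; step < node_count[i] && children_left[i][k] != -1; ++step) {
    k = (x[feature[i][k]] <= threshold[i][k]) ? children_left[i][k]
                                              : children_right[i][k];
  }
  return value[i][k];
}

/* A random forest regressor averages the outputs of its trees. */
double predict_mean(const double *x) {
  double sum = 0.0;
  int i;
  for (i = 0; i < N_ESTIMATORS; ++i) sum += eval_tree(i, x);
  return sum / N_ESTIMATORS;
}
```

The outer arrays are declared as arrays of pointers to const data so that they decay to the int64_t const ** and double const ** types the loader signature uses.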

int TreeliteLoadSKLearnIsolationForest(int n_estimators, int n_features, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double ratio_c, TreeliteModelHandle *out)

Load a scikit-learn IsolationForest model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail.

Parameters:

n_estimators – Number of trees in the isolation forest
n_features – Number of features in the training data
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the expected isolation depth of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – Not used, but must be passed as an array of arrays covering each tree and node.
ratio_c – Standardizing constant to use for calculation of the anomaly score.
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadSKLearnRandomForestClassifier(int n_estimators, int n_features, int n_targets, int32_t const *n_classes, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, TreeliteModelHandle *out)

Load a scikit-learn RandomForestClassifier model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note that this function can also be used to load an ensemble of extremely randomized trees (sklearn.ensemble.ExtraTreesClassifier).

Parameters:

n_estimators – Number of trees in the random forest
n_features – Number of features in the training data
n_targets – Number of targets (outputs)
n_classes – n_classes[i] stores the number of classes in the i-th target
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadSKLearnGradientBoostingRegressor(int n_iter, int n_features, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double const *base_scores, TreeliteModelHandle *out)

Load a scikit-learn GradientBoostingRegressor model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note: GradientBoostingRegressor does not support multiple targets (outputs).

Parameters:

n_iter – Number of boosting iterations
n_features – Number of features in the training data
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,)
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadSKLearnGradientBoostingClassifier(int n_iter, int n_features, int n_classes, int64_t const *node_count, int64_t const **children_left, int64_t const **children_right, int64_t const **feature, double const **threshold, double const **value, int64_t const **n_node_samples, double const **weighted_n_node_samples, double const **impurity, double const *base_scores, TreeliteModelHandle *out)

Load a scikit-learn GradientBoostingClassifier model from a collection of arrays. Refer to https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html to learn the meaning of the arrays in detail. Note: GradientBoostingClassifier does not support multiple targets (outputs).

Parameters:

n_iter – Number of boosting iterations
n_features – Number of features in the training data
n_classes – Number of classes in the target variable
node_count – node_count[i] stores the number of nodes in the i-th tree
children_left – children_left[i][k] stores the ID of the left child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
children_right – children_right[i][k] stores the ID of the right child node of node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
feature – feature[i][k] stores the ID of the feature used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
threshold – threshold[i][k] stores the threshold used in the binary tree split at node k of the i-th tree. This is only defined if node k is an internal (non-leaf) node.
value – value[i][k] stores the leaf output of node k of the i-th tree. This is only defined if node k is a leaf node.
n_node_samples – n_node_samples[i][k] stores the number of data samples associated with node k of the i-th tree.
weighted_n_node_samples – weighted_n_node_samples[i][k] stores the sum of weighted data samples associated with node k of the i-th tree.
impurity – impurity[i][k] stores the impurity measure (Gini, entropy, etc.) associated with node k of the i-th tree.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (n_classes,)
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteLoadSKLearnHistGradientBoostingRegressor(int n_iter, int n_features, int64_t const *node_count, void const **nodes, int expected_sizeof_node_struct, uint32_t n_categorical_splits, uint32_t const **raw_left_cat_bitsets, uint32_t const *known_cat_bitsets, uint32_t const *known_cat_bitsets_offset_map, int32_t const *features_map, int64_t const **categories_map, double const *base_scores, TreeliteModelHandle *out)

Load a scikit-learn HistGradientBoostingRegressor model from a collection of arrays. Note: HistGradientBoostingRegressor does not support multiple targets (outputs).

Parameters:

n_iter – Number of boosting iterations
n_features – Number of features in the training data
node_count – node_count[i] stores the number of nodes in the i-th tree
nodes – nodes[i][k] stores the k-th node of the i-th tree.
expected_sizeof_node_struct – Expected size of Node struct, in bytes
n_categorical_splits – n_categorical_splits[i] stores the number of categorical splits in the i-th tree.
raw_left_cat_bitsets – raw_left_cat_bitsets[i][k] stores the bitmaps for node k of tree i. The bitmaps are used to represent categorical tests. Shape of raw_left_cat_bitsets[i]: (n_categorical_splits, 8)
known_cat_bitsets – Bitsets representing the list of known categories per categorical feature. Shape: (n_categorical_features, 8)
known_cat_bitsets_offset_map – Map from an original feature index to the corresponding index in the known_cat_bitsets array. Shape: (n_features,)
features_map – Mapping to re-order features. This is needed because the HistGradientBoosting estimator internally re-orders features using ColumnTransformer so that the categorical features come before the numerical features.
categories_map – Mapping to transform categorical features. This is needed because the HistGradientBoosting estimator embeds an OrdinalEncoder. categories_map[i] represents the mapping for the i-th categorical feature.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,)
out – Loaded model

Returns:

0 for success, -1 for failure
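The trailing dimension of 8 in the bitset shapes above corresponds to eight 32-bit words, that is, 256 category slots per bitset. The sketch below shows how such a bitset might be written and queried, assuming the bit order used by scikit-learn's histogram-based estimators (category c lives in bit c % 32 of word c / 32). That bit order is an assumption, and bitset_set and bitset_contains are hypothetical helpers, not Treelite functions.

```c
#include <stdint.h>

enum { BITSET_WORDS = 8 }; /* 8 words x 32 bits = 256 category slots */

/* Mark a category as present; categories outside the bitset are ignored. */
void bitset_set(uint32_t bitset[BITSET_WORDS], uint32_t category) {
  if (category >= 32u * BITSET_WORDS) return;
  bitset[category / 32] |= (UINT32_C(1) << (category % 32));
}

/* Return 1 if the category is present in the bitset, 0 otherwise. */
int bitset_contains(const uint32_t bitset[BITSET_WORDS], uint32_t category) {
  if (category >= 32u * BITSET_WORDS) return 0;
  return (int)((bitset[category / 32] >> (category % 32)) & 1u);
}
```

Under this layout, one row of raw_left_cat_bitsets[i] or known_cat_bitsets can be passed directly as the bitset argument.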

int TreeliteLoadSKLearnHistGradientBoostingClassifier(int n_iter, int n_features, int n_classes, int64_t const *node_count, void const **nodes, int expected_sizeof_node_struct, uint32_t n_categorical_splits, uint32_t const **raw_left_cat_bitsets, uint32_t const *known_cat_bitsets, uint32_t const *known_cat_bitsets_offset_map, int32_t const *features_map, int64_t const **categories_map, double const *base_scores, TreeliteModelHandle *out)

Load a scikit-learn HistGradientBoostingClassifier model from a collection of arrays. Note: HistGradientBoostingClassifier does not support multiple targets (outputs).

Parameters:

n_iter – Number of boosting iterations
n_features – Number of features in the training data
n_classes – Number of classes in the target variable
node_count – node_count[i] stores the number of nodes in the i-th tree
nodes – nodes[i][k] stores the k-th node of the i-th tree.
expected_sizeof_node_struct – Expected size of Node struct, in bytes
n_categorical_splits – n_categorical_splits[i] stores the number of categorical splits in the i-th tree.
raw_left_cat_bitsets – raw_left_cat_bitsets[i][k] stores the bitmaps for node k of tree i. The bitmaps are used to represent categorical tests. Shape of raw_left_cat_bitsets[i]: (n_categorical_splits, 8)
known_cat_bitsets – Bitsets representing the list of known categories per categorical feature. Shape: (n_categorical_features, 8)
known_cat_bitsets_offset_map – Map from an original feature index to the corresponding index in the known_cat_bitsets array. Shape: (n_features,)
features_map – Mapping to re-order features. This is needed because the HistGradientBoosting estimator internally re-orders features using ColumnTransformer so that the categorical features come before the numerical features.
categories_map – Mapping to transform categorical features. This is needed because the HistGradientBoosting estimator embeds an OrdinalEncoder. categories_map[i] represents the mapping for the i-th categorical feature.
base_scores – Baseline predictions for outputs. At prediction, margin scores will be adjusted by this amount before applying the post-processing (link) function. Required shape: (1,) for binary classification; (n_classes,) for multi-class classification
out – Loaded model

Returns:

0 for success, -1 for failure

Model builder interface 

Use the following functions to incrementally build decision tree ensemble models.

int TreeliteGetModelBuilder(char const *json_str, TreeliteModelBuilderHandle *out)

Initialize a model builder object from a JSON string.

The JSON string must contain all relevant metadata, including:

threshold_type: Type of thresholds in the tree model
leaf_output_type: Type of leaf outputs in the tree model
metadata: Model metadata, consisting of the following subfields:
- num_feature: Number of features
- task_type: Task type
- average_tree_output: Whether to average outputs of trees
- num_target: Number of targets
- num_class: Number of classes. num_class[i] is the number of classes of target i.
- leaf_vector_shape: Shape of the output from each leaf node
tree_annotation: Annotation for individual trees, consisting of the following subfields:
- num_tree: Number of trees
- target_id: Target that each tree is associated with
- class_id: Class that each tree is associated with
postprocessor: Postprocessor for prediction outputs, consisting of the following subfields:
- name: Name of postprocessor
- config_json: Optional JSON string to configure the postprocessor
base_scores: Baseline scores for targets and classes, before adding tree outputs. Also known as the intercept.
attributes: Arbitrary JSON object, to be stored in the “attributes” field in the model object.

Parameters:

json_str – JSON string containing relevant metadata.
out – Model builder object

Returns:

0 for success, -1 for failure
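As a rough illustration of the field list above, the following C string literal spells out metadata for a single-tree, single-target regressor. Every value is an assumption chosen for the example: in particular the type name "float32", the task type "kRegressor", and the postprocessor name "identity" are taken to be valid on the basis of Treelite's Python builder, and the empty attributes object is a placeholder. Check the accepted values against the Treelite version in use.

```c
/* Illustrative metadata only; all values are assumptions for this sketch. */
const char kBuilderMetadataJson[] =
    "{"
    "\"threshold_type\": \"float32\", "
    "\"leaf_output_type\": \"float32\", "
    "\"metadata\": {"
    "\"num_feature\": 2, "
    "\"task_type\": \"kRegressor\", "
    "\"average_tree_output\": false, "
    "\"num_target\": 1, "
    "\"num_class\": [1], "
    "\"leaf_vector_shape\": [1, 1]}, "
    "\"tree_annotation\": {\"num_tree\": 1, \"target_id\": [0], \"class_id\": [0]}, "
    "\"postprocessor\": {\"name\": \"identity\"}, "
    "\"base_scores\": [0.0], "
    "\"attributes\": {}"
    "}";
```

A string like this would be passed as json_str; the optional config_json subfield of postprocessor is omitted here.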

int TreeliteDeleteModelBuilder(TreeliteModelBuilderHandle model_builder)

Delete model builder object from memory.

Parameters: model_builder – Model builder object to be deleted
Returns: 0 for success, -1 for failure

int TreeliteModelBuilderStartTree(TreeliteModelBuilderHandle model_builder)

Start a new tree.

Parameters: model_builder – Model builder object
Returns: 0 for success, -1 for failure

int TreeliteModelBuilderEndTree(TreeliteModelBuilderHandle model_builder)

End the current tree.

Parameters: model_builder – Model builder object
Returns: 0 for success, -1 for failure

int TreeliteModelBuilderStartNode(TreeliteModelBuilderHandle model_builder, int node_key)

Start a new node.

Parameters:

model_builder – Model builder object
node_key – Integer key that uniquely identifies the node.

Returns:

0 for success, -1 for failure

int TreeliteModelBuilderEndNode(TreeliteModelBuilderHandle model_builder)

End the current node.

Parameters: model_builder – Model builder object
Returns: 0 for success, -1 for failure

int TreeliteModelBuilderNumericalTest(TreeliteModelBuilderHandle model_builder, int32_t split_index, double threshold, int default_left, char const *cmp, int left_child_key, int right_child_key)

Declare the current node as a numerical test node, where the test is of form [feature value] [cmp] [threshold]. Data points for which the test evaluates to True will be mapped to the left child node; all other data points (for which the test evaluates to False) will be mapped to the right child node.

Parameters:

model_builder – Model builder object
split_index – Feature ID
threshold – Threshold
default_left – Whether the missing value should be mapped to the left child
cmp – Comparison operator
left_child_key – Integer key that uniquely identifies the left child node.
right_child_key – Integer key that uniquely identifies the right child node.

Returns:

0 for success, -1 for failure
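The routing rule for a numerical test can be written out as a small function. This is a sketch of the semantics described above, not Treelite's implementation: the helper name is hypothetical, the set of operator strings ("<", "<=", "==", ">", ">=") is an assumption since the text does not enumerate them, and a missing value is assumed to be represented as NaN.

```c
#include <string.h>

/* Returns 1 for the left child, 0 for the right child,
 * or -1 if the comparison operator is not recognized. */
int numerical_test_goes_left(double fvalue, const char *cmp, double threshold,
                             int default_left) {
  int result;
  if (strcmp(cmp, "<") == 0) {
    result = fvalue < threshold;
  } else if (strcmp(cmp, "<=") == 0) {
    result = fvalue <= threshold;
  } else if (strcmp(cmp, "==") == 0) {
    result = fvalue == threshold;
  } else if (strcmp(cmp, ">") == 0) {
    result = fvalue > threshold;
  } else if (strcmp(cmp, ">=") == 0) {
    result = fvalue >= threshold;
  } else {
    return -1;
  }
  /* NaN compares unequal to itself; treat it as a missing value. */
  if (fvalue != fvalue) return default_left ? 1 : 0;
  return result;
}
```

For example, with cmp set to "<" and a threshold of 0.5, a feature value of 0.3 is routed to the left child and 0.5 to the right child.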

int TreeliteModelBuilderCategoricalTest(TreeliteModelBuilderHandle model_builder, int32_t split_index, int default_left, uint32_t const *category_list, size_t category_list_len, int category_list_right_child, int left_child_key, int right_child_key)

Declare the current node as a categorical test node, where the test is of form [feature value] \in [category list].

Parameters:

model_builder – Model builder object
split_index – Feature ID
default_left – Whether the missing value should be mapped to the left child
category_list – List of categories to be tested for match
category_list_len – Length of category_list
category_list_right_child – Whether the data points for which the test evaluates to True should be mapped to the right child or the left child.
left_child_key – Integer key that uniquely identifies the left child node.
right_child_key – Integer key that uniquely identifies the right child node.

Returns:

0 for success, -1 for failure
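The interaction between category_list, category_list_right_child, and default_left can be sketched as follows. The helper is hypothetical and only mirrors the parameter descriptions above; the explicit is_missing flag is an assumption made to keep the example simple, and routing a non-matching category to the opposite child is inferred from the description of category_list_right_child.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 for the left child, 0 for the right child. */
int categorical_test_goes_left(int is_missing, uint32_t category,
                               const uint32_t *category_list,
                               size_t category_list_len,
                               int category_list_right_child,
                               int default_left) {
  size_t i;
  int matched = 0;
  if (is_missing) return default_left ? 1 : 0;
  for (i = 0; i < category_list_len; ++i) {
    if (category_list[i] == category) {
      matched = 1;
      break;
    }
  }
  /* A match goes right when category_list_right_child is set and left
   * otherwise; a non-match goes to the opposite child. */
  if (category_list_right_child) return matched ? 0 : 1;
  return matched ? 1 : 0;
}
```

With category_list = {1, 3} and category_list_right_child = 0, category 3 goes left and category 2 goes right; setting category_list_right_child to 1 swaps the two.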

int TreeliteModelBuilderLeafScalar(TreeliteModelBuilderHandle model_builder, double leaf_value)

Declare the current node as a leaf node with a scalar output.

Parameters:

model_builder – Model builder object
leaf_value – Value of leaf output

Returns:

0 for success, -1 for failure

int TreeliteModelBuilderLeafVectorFloat32(TreeliteModelBuilderHandle model_builder, float const *leaf_vector, size_t leaf_vector_len)

Declare the current node as a leaf node with a vector output (float32).

Parameters:

model_builder – Model builder object
leaf_vector – Value of leaf output
leaf_vector_len – Length of leaf_vector

Returns:

0 for success, -1 for failure

int TreeliteModelBuilderLeafVectorFloat64(TreeliteModelBuilderHandle model_builder, double const *leaf_vector, size_t leaf_vector_len)

Declare the current node as a leaf node with a vector output (float64).

Parameters:

model_builder – Model builder object
leaf_vector – Value of leaf output
leaf_vector_len – Length of leaf_vector

Returns:

0 for success, -1 for failure

int TreeliteModelBuilderGain(TreeliteModelBuilderHandle model_builder, double gain)

Specify the gain (loss reduction) resulting from the current split.

Parameters:

model_builder – Model builder object
gain – Gain (loss reduction)

Returns:

0 for success, -1 for failure

int TreeliteModelBuilderDataCount(TreeliteModelBuilderHandle model_builder, uint64_t data_count)

Specify the number of data points (samples) that are mapped to the current node.

Parameters:

model_builder – Model builder object
data_count – Number of data points

Returns:

0 for success, -1 for failure

int TreeliteModelBuilderSumHess(TreeliteModelBuilderHandle model_builder, double sum_hess)

Specify the weighted sample count or the sum of Hessians for the data points that are mapped to the current node.

Parameters:

model_builder – Model builder object
sum_hess – Weighted sample count or the sum of Hessians

Returns:

0 for success, -1 for failure

int TreeliteModelBuilderCommitModel(TreeliteModelBuilderHandle model_builder, TreeliteModelHandle *out)

Conclude model building and obtain the final model object.

Parameters:

model_builder – Model builder object
out – Final model object

Model manager interface 

int TreeliteDumpAsJSON(TreeliteModelHandle handle, int pretty_print, char const **out_json_str)

Dump a model object as a JSON string.

Parameters:

handle – The handle to the model object
pretty_print – Whether to pretty-print JSON string (0 for false, != 0 for true)
out_json_str – The JSON string

Returns:

0 for success, -1 for failure

int TreeliteGetInputType(TreeliteModelHandle model, char const **out_str)

Query the input type of a Treelite model object.

Parameters:

model – Treelite Model object
out_str – String representation of input type

Returns:

0 for success; -1 for failure

int TreeliteGetOutputType(TreeliteModelHandle model, char const **out_str)

Query the output type of a Treelite model object.

Parameters:

model – Treelite Model object
out_str – String representation of output type

Returns:

0 for success; -1 for failure

int TreeliteQueryNumTree(TreeliteModelHandle model, size_t *out)

Query the number of trees in the model.

Parameters:

model – Model to query
out – Number of trees

Returns:

0 for success, -1 for failure

int TreeliteQueryNumFeature(TreeliteModelHandle model, int *out)

Query the number of features used in the model.

Parameters:

model – Model to query
out – Number of features

Returns:

0 for success, -1 for failure

int TreeliteConcatenateModelObjects(TreeliteModelHandle const *objs, size_t len, TreeliteModelHandle *out)

Concatenate multiple model objects into a single model object by copying all member trees into the destination model object.

Parameters:

objs – Pointer to the beginning of the list of model objects
len – Number of model objects
out – Used to save the concatenated model

int TreeliteFreeModel(TreeliteModelHandle handle)

Delete model from memory.

Parameters: handle – Model to remove
Returns: 0 for success, -1 for failure

Serializer 

int TreeliteSerializeModelToFile(TreeliteModelHandle handle, char const *filename)

Serialize (persist) a model object to disk.

Parameters:

handle – Handle to the model object
filename – Name of the file to which to serialize the model. The file will be using a binary format that’s optimized to store the Treelite model object efficiently.

Returns:

0 for success, -1 for failure

int TreeliteDeserializeModelFromFile(char const *filename, TreeliteModelHandle *out)

Deserialize (load) a model object from disk.

Parameters:

filename – Name of the file from which to deserialize the model. The file should be created by a call to TreeliteSerializeModelToFile.
out – Handle to the model object

Returns:

0 for success, -1 for failure

int TreeliteSerializeModelToBytes(TreeliteModelHandle handle, char const **out_bytes, size_t *out_bytes_len)

Serialize (persist) a model object to a byte sequence.

Parameters:

handle – Handle to the model object
out_bytes – Byte sequence containing serialized model
out_bytes_len – Length of out_bytes

Returns:

0 for success, -1 for failure

int TreeliteDeserializeModelFromBytes(char const *bytes, size_t bytes_len, TreeliteModelHandle *out)

Deserialize (load) a model object from a byte sequence.

Parameters:

bytes – Byte sequence containing serialized model. The byte sequence should be created by a call to TreeliteSerializeModelToBytes.
bytes_len – Length of bytes
out – Loaded model

Returns:

0 for success, -1 for failure

int TreeliteSerializeModelToPyBuffer(TreeliteModelHandle handle, TreelitePyBufferFrame **out_frames, size_t *out_num_frames)

Serialize a model object using the Python buffer protocol (PEP 3118).

Parameters:

handle – Handle to the model object
out_frames – Pointer to buffer frames
out_num_frames – Number of buffer frames

Returns:

0 for success, -1 for failure

int TreeliteDeserializeModelFromPyBuffer(TreelitePyBufferFrame *frames, size_t num_frames, TreeliteModelHandle *out)

Deserialize a model object using the Python buffer protocol (PEP 3118).

Parameters:

frames – Buffer frames
num_frames – Number of buffer frames
out – Loaded model

Returns:

0 for success, -1 for failure

Getters and setters for the model object 

int TreeliteGetHeaderField(TreeliteModelHandle model, char const *name, TreelitePyBufferFrame *out_frame)

Get a field in the header.

This function returns the requested field using the Python buffer protocol (PEP 3118).

Parameters:

model – Treelite Model object
name – Name of the field
out_frame – Buffer frame representing the requested field

Returns:

0 for success; -1 for failure

int TreeliteGetTreeField(TreeliteModelHandle model, uint64_t tree_id, char const *name, TreelitePyBufferFrame *out_frame)

Get a field in a tree.

This function returns the requested field using the Python buffer protocol (PEP 3118).

Parameters:

model – Treelite Model object
tree_id – ID of the tree
name – Name of the field
out_frame – Buffer frame representing the requested field

Returns:

0 for success; -1 for failure

int TreeliteSetHeaderField(TreeliteModelHandle model, char const *name, TreelitePyBufferFrame frame)

Set a field in the header.

This function accepts the field’s new value using the Python buffer protocol (PEP 3118).

Parameters:

model – Treelite Model object
name – Name of the field
frame – Buffer frame representing the new value for the field

Returns:

0 for success; -1 for failure

int TreeliteSetTreeField(TreeliteModelHandle model, uint64_t tree_id, char const *name, TreelitePyBufferFrame frame)

Set a field in a tree.

This function accepts the field’s new value using the Python buffer protocol (PEP 3118).

Parameters:

model – Treelite Model object
tree_id – ID of the tree
name – Name of the field
frame – Buffer frame representing the new value for the field

Returns:

0 for success; -1 for failure

General Tree Inference Library (GTIL)

int TreeliteGTILParseConfig(char const *config_json, TreeliteGTILConfigHandle *out)

Load a configuration for GTIL predictor from a JSON string.

Parameters:

config_json – a JSON string with the following fields:
- "nthread" (optional): Number of threads to use for prediction. Set <= 0 to use all CPU cores.
- "predict_type" (required): Must be one of the following.
  - "default": Sum over trees and apply post-processing
  - "raw": Sum over trees, but don’t apply post-processing; get raw margin scores instead.
  - "leaf_id": Output one (integer) leaf ID per tree.
  - "score_per_tree": Output one or more margin scores per tree.
out – Parsed configuration

Returns:

0 for success; -1 for failure
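As a rough illustration, the configuration string could be assembled with snprintf before it is passed as config_json. The helper names build_gtil_config and is_valid_predict_type are assumptions made for this sketch and are not part of Treelite; only the field names and the four predict_type values come from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Treelite): check that predict_type is one
 * of the four values listed in the documentation above. */
static int is_valid_predict_type(char const *s) {
  static char const *const kAllowed[] = {"default", "raw", "leaf_id",
                                         "score_per_tree"};
  for (size_t i = 0; i < sizeof(kAllowed) / sizeof(kAllowed[0]); ++i) {
    if (strcmp(s, kAllowed[i]) == 0) {
      return 1;
    }
  }
  return 0;
}

/* Hypothetical helper (not part of Treelite): write a GTIL configuration JSON
 * string into buf. Returns 0 on success, -1 if predict_type is not recognized
 * or the buffer is too small. */
static int build_gtil_config(char *buf, size_t buf_len,
                             char const *predict_type, int nthread) {
  assert(buf != NULL && predict_type != NULL);
  if (!is_valid_predict_type(predict_type)) {
    return -1;
  }
  int n = snprintf(buf, buf_len,
                   "{\"predict_type\": \"%s\", \"nthread\": %d}",
                   predict_type, nthread);
  return (n >= 0 && (size_t)n < buf_len) ? 0 : -1;
}
```

The resulting string would then be handed to TreeliteGTILParseConfig, and the returned handle released with TreeliteGTILDeleteConfig once prediction is done.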

int TreeliteGTILDeleteConfig(TreeliteGTILConfigHandle handle)

Delete a GTIL configuration from memory.

Parameters:

handle – Handle to the GTIL configuration to be deleted

Returns:

0 for success; -1 for failure

int TreeliteGTILGetOutputShape(TreeliteModelHandle model, uint64_t num_row, TreeliteGTILConfigHandle config, uint64_t const **out, uint64_t *out_ndim)

Given the number of rows in a data matrix, query the shape of the array needed to hold predictions for all data points.

Parameters:

model – Treelite Model object
num_row – Number of rows in the input
config – Configuration of GTIL predictor. Set this by calling TreeliteGTILParseConfig.
out – Array of dimensions (the output shape)
out_ndim – Number of dimensions in out

Returns:

0 for success; -1 for failure
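To size the output buffer, the reported dimensions can be multiplied together to get the number of elements. The sketch below assumes a caller-side helper named num_output_elements, which is hypothetical and not part of Treelite; the shape values used to exercise it would come from TreeliteGTILGetOutputShape in real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Treelite): multiply the dimensions of the
 * output shape to get the number of elements the output array must hold. */
static uint64_t num_output_elements(uint64_t const *shape, uint64_t ndim) {
  assert(shape != NULL || ndim == 0);
  uint64_t count = 1;
  for (uint64_t i = 0; i < ndim; ++i) {
    count *= shape[i];
  }
  return count;
}
```

The byte size of the buffer is this count multiplied by the size of one output element.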

int TreeliteGTILPredict(TreeliteModelHandle model, void const *input, char const *input_type, uint64_t num_row, void *output, TreeliteGTILConfigHandle config)

Predict with a 2D dense array.

Parameters:

model – Treelite Model object
input – The 2D data array, laid out in row-major layout
input_type – Data type of the data matrix
num_row – Number of rows in the data matrix.
output – Pointer to the buffer that will store the output. Call TreeliteGTILGetOutputShape to determine how large a buffer to allocate for this parameter.
config – Configuration of GTIL predictor. Set this by calling TreeliteGTILParseConfig.

Returns:

0 for success; -1 for failure
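In a row-major layout, the entries of each row are stored contiguously, so the entry at (row, col) of a matrix with num_col columns sits at index row * num_col + col. The accessor below is a hypothetical illustration, not part of Treelite, and it assumes float elements; num_col is a name introduced for this sketch, since it is not a parameter of TreeliteGTILPredict.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor (not part of Treelite): read entry (row, col) of a
 * dense matrix stored in row-major order with num_col columns. */
static float row_major_at(float const *input, uint64_t num_col,
                          uint64_t row, uint64_t col) {
  assert(input != NULL && col < num_col);
  return input[row * num_col + col];
}
```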

int TreeliteGTILPredictSparse(TreeliteModelHandle model, void const *data, char const *input_type, uint64_t const *col_ind, uint64_t const *row_ptr, uint64_t num_row, void *output, TreeliteGTILConfigHandle config)

Predict with sparse data with CSR (compressed sparse row) layout.

In the CSR layout, data[row_ptr[i]:row_ptr[i+1]] stores the nonzero entries of row i, and col_ind[row_ptr[i]:row_ptr[i+1]] stores the corresponding column indices.

Parameters:

model – Treelite Model object
data – Nonzero elements in the data matrix
input_type – Data type of the data matrix
col_ind – Feature indices. col_ind[i] indicates the feature index associated with data[i].
row_ptr – Pointer to row headers. row_ptr[i] is the offset in data and col_ind at which row i begins. Length is num_row + 1.
num_row – Number of rows in the data matrix.
output – Pointer to the buffer that will store the output. Call TreeliteGTILGetOutputShape to determine how large a buffer to allocate for this parameter.
config – Configuration of GTIL predictor. Set this by calling TreeliteGTILParseConfig.

Returns:

0 for success; -1 for failure
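The CSR arrays described above can be sketched by converting a small dense matrix. The helper dense_to_csr is hypothetical and not part of Treelite; it assumes float elements and skips entries equal to zero, following the "nonzero entries" wording above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Treelite): convert a row-major dense matrix
 * into CSR arrays. The caller provides data and col_ind with room for at least
 * num_row * num_col entries, and row_ptr with room for num_row + 1 entries.
 * Returns the number of stored (nonzero) entries. */
static uint64_t dense_to_csr(float const *dense, uint64_t num_row,
                             uint64_t num_col, float *data,
                             uint64_t *col_ind, uint64_t *row_ptr) {
  assert(dense != NULL && data != NULL && col_ind != NULL && row_ptr != NULL);
  uint64_t nnz = 0;
  row_ptr[0] = 0;
  for (uint64_t i = 0; i < num_row; ++i) {
    for (uint64_t j = 0; j < num_col; ++j) {
      float v = dense[i * num_col + j];
      if (v != 0.0f) {
        data[nnz] = v;      /* value of the stored entry */
        col_ind[nnz] = j;   /* its column (feature) index */
        ++nnz;
      }
    }
    row_ptr[i + 1] = nnz;   /* row i occupies [row_ptr[i], row_ptr[i + 1]) */
  }
  return nnz;
}
```

For the 3-by-4 matrix with rows (0, 1, 0, 2), (0, 0, 0, 0) and (3, 0, 0, 0), this yields data = {1, 2, 3}, col_ind = {1, 3, 0} and row_ptr = {0, 2, 2, 3}. The empty second row shows up as two equal consecutive entries in row_ptr.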

Handle types 

Treelite uses C++ classes to define its internal data structures. In order to pass C++ objects to C functions, opaque handles are used. Opaque handles are void* pointers that store raw memory addresses.

typedef void *TreeliteModelHandle: Handle to a decision tree ensemble model.

typedef void *TreeliteModelBuilderHandle: Handle to a model builder object.

typedef void *TreeliteGTILConfigHandle: Handle to a configuration of GTIL predictor.
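The opaque-handle pattern can be sketched in plain C. Everything in the block below (struct Counter, CounterHandle and the three functions) is a hypothetical stand-in invented for illustration; Treelite's actual handles point to C++ objects whose definitions are not exposed through the C API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical internal type; users of the API never see its definition. */
struct Counter {
  int value;
};

/* Hypothetical handle type, analogous in spirit to TreeliteModelHandle. */
typedef void *CounterHandle;

/* Create an object and return its address as an opaque handle.
 * Uses the same 0 / -1 return convention as the functions on this page. */
static int CounterCreate(CounterHandle *out) {
  assert(out != NULL);
  struct Counter *c = malloc(sizeof(*c));
  if (c == NULL) {
    return -1;
  }
  c->value = 0;
  *out = c;
  return 0;
}

/* Convert the handle back to the internal type before using it. */
static int CounterIncrement(CounterHandle handle, int *out_value) {
  if (handle == NULL || out_value == NULL) {
    return -1;
  }
  struct Counter *c = handle;
  c->value += 1;
  *out_value = c->value;
  return 0;
}

/* Release the object behind the handle. */
static int CounterFree(CounterHandle handle) {
  free(handle);
  return 0;
}
```

Because the handle is a plain void* pointer, the compiler cannot catch a handle of the wrong kind being passed to a function, so callers need to keep track of which handle type each function expects.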