Treelite C API

Treelite exposes a set of C functions to enable interfacing with a variety of languages. This page will be most useful for:

  • those writing a new language binding (glue code).
  • those wanting to incorporate functions of treelite into their own native libraries.

We recommend the Python API for everyday use.

Note

Use of C and C++ in treelite

The core logic of treelite is written in C++ to take advantage of higher-level abstractions. We provide a C-only interface here, because many more programming languages can bind to C than to C++. See this page for more details.

Data matrix interface

Use the following functions to load and manipulate data from a variety of sources.

int TreeliteDMatrixCreateFromFile(const char *path, const char *format, int nthread, int verbose, DMatrixHandle *out)

create DMatrix from a file

Return
0 for success, -1 for failure
Parameters
  • path: file path
  • format: file format
  • nthread: number of threads to use
  • verbose: whether to produce extra messages
  • out: the created DMatrix

int TreeliteDMatrixCreateFromCSR(const float *data, const unsigned *col_ind, const size_t *row_ptr, size_t num_row, size_t num_col, DMatrixHandle *out)

create DMatrix from a (in-memory) CSR matrix

Return
0 for success, -1 for failure
Parameters
  • data: feature values
  • col_ind: feature indices
  • row_ptr: pointer to row headers
  • num_row: number of rows
  • num_col: number of columns
  • out: the created DMatrix
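
To make the roles of data, col_ind and row_ptr concrete, here is a minimal sketch of how one entry could be looked up in a CSR matrix. The helpers csr_lookup and csr_num_entries are hypothetical and not part of Treelite. The sketch assumes the usual CSR convention in which row_ptr holds num_row + 1 offsets, so that the entries of row i occupy positions row_ptr[i] through row_ptr[i + 1] - 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Treelite): look up entry (row, col).
 * Returns 1 and writes the value to *out if the entry is stored,
 * returns 0 if the entry is absent or the row is out of range.
 * Assumes row_ptr has num_row + 1 elements (usual CSR convention). */
int csr_lookup(const float *data, const unsigned *col_ind,
               const size_t *row_ptr, size_t num_row,
               size_t row, unsigned col, float *out) {
  assert(data != NULL && col_ind != NULL && row_ptr != NULL && out != NULL);
  if (row >= num_row) {
    return 0;
  }
  /* Scan only the slice of data/col_ind that belongs to this row */
  for (size_t k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
    if (col_ind[k] == col) {
      *out = data[k];
      return 1;
    }
  }
  return 0;
}

/* Under the same convention, the last offset is the number of stored entries */
size_t csr_num_entries(const size_t *row_ptr, size_t num_row) {
  assert(row_ptr != NULL);
  return row_ptr[num_row];
}
```

For a 3-by-4 matrix with entries (0,0)=1.0, (0,2)=2.0, (2,1)=3.0 and (2,3)=4.5, the arrays would be data = {1.0, 2.0, 3.0, 4.5}, col_ind = {0, 2, 1, 3} and row_ptr = {0, 2, 2, 4}. The repeated offset 2 marks the empty middle row.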

int TreeliteDMatrixCreateFromMat(const float *data, size_t num_row, size_t num_col, float missing_value, DMatrixHandle *out)

create DMatrix from a (in-memory) dense matrix

Return
0 for success, -1 for failure
Parameters
  • data: feature values
  • num_row: number of rows
  • num_col: number of columns
  • missing_value: value to represent missing value
  • out: the created DMatrix
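
The missing_value parameter acts as a sentinel inside the dense array. The sketch below counts the entries that are actually present. It assumes a row-major layout (entry (i, j) at data[i * num_col + j]), which this page does not state explicitly, and it treats NaN specially because NaN never compares equal to itself. The helper names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: 1 if value matches the missing-value sentinel.
 * A NaN sentinel needs isnan(), since NaN == NaN is always false. */
static int is_missing(float value, float missing_value) {
  if (isnan(missing_value)) {
    return isnan(value);
  }
  return value == missing_value;
}

/* Count non-missing entries of a dense matrix.
 * Assumption: row-major layout, entry (i, j) at data[i * num_col + j]. */
size_t dense_count_present(const float *data, size_t num_row,
                           size_t num_col, float missing_value) {
  assert(data != NULL);
  size_t count = 0;
  for (size_t i = 0; i < num_row; ++i) {
    for (size_t j = 0; j < num_col; ++j) {
      if (!is_missing(data[i * num_col + j], missing_value)) {
        ++count;
      }
    }
  }
  return count;
}
```

The same check works whether the caller marks missing entries with NaN or with an ordinary number such as -1.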

int TreeliteDMatrixGetDimension(DMatrixHandle handle, size_t *out_num_row, size_t *out_num_col, size_t *out_nelem)

get dimensions of a DMatrix

Return
0 for success, -1 for failure
Parameters
  • handle: handle to DMatrix
  • out_num_row: used to set number of rows
  • out_num_col: used to set number of columns
  • out_nelem: used to set number of nonzero entries

int TreeliteDMatrixGetPreview(DMatrixHandle handle, const char **out_preview)

produce a human-readable preview of a DMatrix. Will print the first and last 25 non-zero entries, along with their locations.

Return
0 for success, -1 for failure
Parameters
  • handle: handle to DMatrix
  • out_preview: used to save the address of the string literal

int TreeliteDMatrixGetArrays(DMatrixHandle handle, const float **out_data, const uint32_t **out_col_ind, const size_t **out_row_ptr)

extract three arrays (data, col_ind, row_ptr) that define a DMatrix.

Return
0 for success, -1 for failure
Parameters
  • handle: handle to DMatrix
  • out_data: used to save pointer to array containing feature values
  • out_col_ind: used to save pointer to array containing feature indices
  • out_row_ptr: used to save pointer to array containing pointers to row headers

int TreeliteDMatrixFree(DMatrixHandle handle)

delete DMatrix from memory

Return
0 for success, -1 for failure
Parameters
  • handle: handle to DMatrix

Branch annotator interface

Use the following functions to annotate branches in decision trees.

int TreeliteAnnotateBranch(ModelHandle model, DMatrixHandle dmat, int nthread, int verbose, AnnotationHandle *out)

annotate branches in a given model using frequency patterns in the training data.

Return
0 for success, -1 for failure
Parameters
  • model: model to annotate
  • dmat: training data matrix
  • nthread: number of threads to use
  • verbose: whether to produce extra messages
  • out: used to save handle for the created annotation

int TreeliteAnnotationSave(AnnotationHandle handle, const char *path)

save branch annotation to a JSON file

Return
0 for success, -1 for failure
Parameters
  • handle: annotation to save
  • path: path to JSON file

int TreeliteAnnotationFree(AnnotationHandle handle)

delete branch annotation from memory

Return
0 for success, -1 for failure
Parameters
  • handle: annotation to remove

Compiler interface

Use the following functions to produce optimized prediction subroutines (in C) from a given decision tree ensemble.

int TreeliteCompilerCreate(const char *name, CompilerHandle *out)

create a compiler with a given name

Return
0 for success, -1 for failure
Parameters
  • name: name of compiler
  • out: created compiler

int TreeliteCompilerSetParam(CompilerHandle handle, const char *name, const char *value)

set a parameter for a compiler

Return
0 for success, -1 for failure
Parameters
  • handle: compiler
  • name: name of parameter
  • value: value of parameter

int TreeliteCompilerGenerateCode(CompilerHandle compiler, ModelHandle model, int verbose, const char *dirpath)

generate prediction code from a tree ensemble model. The code will be C99 compliant. One header file (.h) will be generated, along with one or more source files (.c).

Usage example:

TreeliteCompilerGenerateCode(compiler, model, 1, "./my/model");
// files to generate: ./my/model/header.h, ./my/model/main.c
// if parallel compilation is enabled:
// ./my/model/header.h, ./my/model/main.c, ./my/model/tu0.c,
// ./my/model/tu1.c, and so forth
Return
0 for success, -1 for failure
Parameters
  • compiler: handle for compiler
  • model: handle for tree ensemble model
  • verbose: whether to produce extra messages
  • dirpath: directory to store header and source files

int TreeliteCompilerFree(CompilerHandle handle)

delete compiler from memory

Return
0 for success, -1 for failure
Parameters
  • handle: compiler to remove

Model loader interface

Use the following functions to load decision tree ensemble models from a file. Treelite supports multiple model file formats.

int TreeliteLoadLightGBMModel(const char *filename, ModelHandle *out)

load a model file generated by LightGBM (Microsoft/LightGBM). The model file must contain a decision tree ensemble.

Return
0 for success, -1 for failure
Parameters
  • filename: name of model file
  • out: loaded model

int TreeliteLoadXGBoostModel(const char *filename, ModelHandle *out)

load a model file generated by XGBoost (dmlc/xgboost). The model file must contain a decision tree ensemble.

Return
0 for success, -1 for failure
Parameters
  • filename: name of model file
  • out: loaded model

int TreeliteLoadXGBoostModelFromMemoryBuffer(const void *buf, size_t len, ModelHandle *out)

load an XGBoost model from a memory buffer.

Return
0 for success, -1 for failure
Parameters
  • buf: memory buffer
  • len: size of memory buffer
  • out: loaded model

int TreeliteLoadProtobufModel(const char *filename, ModelHandle *out)

load a model in Protocol Buffers format. Protocol Buffers (google/protobuf) is a language- and platform-neutral mechanism for serializing structured data. See tree.proto for format spec.

Return
0 for success, -1 for failure
Parameters
  • filename: name of model file
  • out: loaded model

int TreeliteExportProtobufModel(const char *filename, ModelHandle model)

export a model in Protocol Buffers format. Protocol Buffers (google/protobuf) is a language- and platform-neutral mechanism for serializing structured data. See src/tree.proto for format spec.

Return
0 for success, -1 for failure
Parameters
  • filename: name of model file
  • model: model to export

int TreeliteQueryNumTree(ModelHandle handle, size_t *out)

Query the number of trees in the model.

Return
0 for success, -1 for failure
Parameters
  • handle: model to query
  • out: number of trees

int TreeliteQueryNumFeature(ModelHandle handle, size_t *out)

Query the number of features used in the model.

Return
0 for success, -1 for failure
Parameters
  • handle: model to query
  • out: number of features

int TreeliteQueryNumOutputGroups(ModelHandle handle, size_t *out)

Query the number of output groups of the model.

Return
0 for success, -1 for failure
Parameters
  • handle: model to query
  • out: number of output groups

int TreeliteFreeModel(ModelHandle handle)

delete model from memory

Return
0 for success, -1 for failure
Parameters
  • handle: model to remove

Model builder interface

Use the following functions to incrementally build decision tree ensemble models.

int TreeliteCreateTreeBuilder(TreeBuilderHandle *out)

Create a new tree builder.

Return
0 for success; -1 for failure
Parameters
  • out: newly created tree builder

int TreeliteDeleteTreeBuilder(TreeBuilderHandle handle)

Delete a tree builder from memory.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder to remove

int TreeliteTreeBuilderCreateNode(TreeBuilderHandle handle, int node_key)

Create an empty node within a tree.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder
  • node_key: unique integer key to identify the new node

int TreeliteTreeBuilderDeleteNode(TreeBuilderHandle handle, int node_key)

Remove a node from a tree.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder
  • node_key: unique integer key to identify the node to be removed

int TreeliteTreeBuilderSetRootNode(TreeBuilderHandle handle, int node_key)

Set a node as the root of a tree.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder
  • node_key: unique integer key to identify the root node

int TreeliteTreeBuilderSetNumericalTestNode(TreeBuilderHandle handle, int node_key, unsigned feature_id, const char *opname, double threshold, int default_left, int left_child_key, int right_child_key)

Turn an empty node into a test node with numerical split. The test is in the form [feature value] OP [threshold]. Depending on the result of the test, either left or right child would be taken.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder
  • node_key: unique integer key to identify the node being modified; this node needs to be empty
  • feature_id: id of feature
  • opname: binary operator to use in the test
  • threshold: threshold value
  • default_left: default direction for missing values
  • left_child_key: unique integer key to identify the left child node
  • right_child_key: unique integer key to identify the right child node
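
The test [feature value] OP [threshold] together with default_left can be sketched as a small decision function. This is only an illustration under three assumptions that this page does not confirm: the operator is spelled as one of "==", "<", "<=", ">", ">="; a test that evaluates to true selects the left child; and a missing feature value is encoded as NaN. The function name is hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* Hypothetical helper: decide which child a numerical test selects.
 * Returns 1 for the left child, 0 for the right child,
 * and -1 if opname is not one of the assumed operator spellings.
 * Assumptions: true test -> left child; missing value encoded as NaN. */
int numerical_test_goes_left(double fvalue, const char *opname,
                             double threshold, int default_left) {
  assert(opname != NULL);
  if (isnan(fvalue)) {
    /* Missing value: follow the default direction */
    return default_left != 0;
  }
  if (strcmp(opname, "==") == 0) {
    return fvalue == threshold;
  } else if (strcmp(opname, "<") == 0) {
    return fvalue < threshold;
  } else if (strcmp(opname, "<=") == 0) {
    return fvalue <= threshold;
  } else if (strcmp(opname, ">") == 0) {
    return fvalue > threshold;
  } else if (strcmp(opname, ">=") == 0) {
    return fvalue >= threshold;
  }
  return -1;  /* unrecognized operator */
}
```

With opname "<" and threshold 2.0, a feature value of 1.0 selects the left child and 2.0 selects the right child, while a missing value follows default_left regardless of the operator.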

int TreeliteTreeBuilderSetCategoricalTestNode(TreeBuilderHandle handle, int node_key, unsigned feature_id, const unsigned int *left_categories, size_t left_categories_len, int default_left, int left_child_key, int right_child_key)

Turn an empty node into a test node with categorical split. A list defines all categories that would be classified as the left side. Categories are integers ranging from 0 to (n-1), where n is the number of categories in that particular feature. It is assumed that n <= 64.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder
  • node_key: unique integer key to identify the node being modified; this node needs to be empty
  • feature_id: id of feature
  • left_categories: list of categories belonging to the left child
  • left_categories_len: length of left_categories
  • default_left: default direction for missing values
  • left_child_key: unique integer key to identify the left child node
  • right_child_key: unique integer key to identify the right child node
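
One way to see why the limit n <= 64 is convenient: a list of left categories in the range 0 to 63 fits into a single 64-bit word, with one bit per category. The sketch below shows that encoding. It is an illustration only, and not necessarily how Treelite stores the list internally. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack a list of left categories into a 64-bit mask.
 * Returns 0 for success, -1 if any category is outside 0..63.
 * *out_mask is written only on success. */
int build_left_mask(const unsigned int *left_categories,
                    size_t left_categories_len, uint64_t *out_mask) {
  assert(out_mask != NULL);
  assert(left_categories != NULL || left_categories_len == 0);
  uint64_t mask = 0;
  for (size_t i = 0; i < left_categories_len; ++i) {
    if (left_categories[i] >= 64) {
      return -1;
    }
    mask |= (uint64_t)1 << left_categories[i];
  }
  *out_mask = mask;
  return 0;
}

/* 1 if the category's bit is set (left side), 0 otherwise */
int category_goes_left(uint64_t mask, unsigned int category) {
  if (category >= 64) {
    return 0;
  }
  return (int)((mask >> category) & 1u);
}
```

A membership test then costs one shift and one bitwise AND, independent of the length of the category list.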

int TreeliteTreeBuilderSetLeafNode(TreeBuilderHandle handle, int node_key, double leaf_value)

Turn an empty node into a leaf node.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder
  • node_key: unique integer key to identify the node being modified; this node needs to be empty
  • leaf_value: leaf value (weight) of the leaf node

int TreeliteTreeBuilderSetLeafVectorNode(TreeBuilderHandle handle, int node_key, const double *leaf_vector, size_t leaf_vector_len)

Turn an empty node into a leaf vector node. The leaf vector (a collection of multiple leaf weights per leaf node) is useful for multi-class random forest classifiers.

Return
0 for success; -1 for failure
Parameters
  • handle: tree builder
  • node_key: unique integer key to identify the node being modified; this node needs to be empty
  • leaf_vector: leaf vector of the leaf node
  • leaf_vector_len: length of leaf_vector

int TreeliteCreateModelBuilder(int num_feature, int num_output_group, int random_forest_flag, ModelBuilderHandle *out)

Create a new model builder.

Return
0 for success; -1 for failure
Parameters
  • num_feature: number of features used in model being built. We assume that all feature indices are between 0 and (num_feature - 1).
  • num_output_group: number of output groups. Set to 1 for binary classification and regression; >1 for multiclass classification
  • random_forest_flag: whether the model is a random forest. Set to 0 if the model is gradient boosted trees. Any nonzero value shall indicate that the model is a random forest.
  • out: newly created model builder

int TreeliteModelBuilderSetModelParam(ModelBuilderHandle handle, const char *name, const char *value)

Set a model parameter.

Return
0 for success; -1 for failure
Parameters
  • handle: model builder
  • name: name of parameter
  • value: value of parameter

int TreeliteDeleteModelBuilder(ModelBuilderHandle handle)

Delete a model builder from memory.

Return
0 for success; -1 for failure
Parameters
  • handle: model builder to remove

int TreeliteModelBuilderInsertTree(ModelBuilderHandle handle, TreeBuilderHandle tree_builder, int index)

Insert a tree at specified location.

Return
index of the new tree within the ensemble; -1 for failure
Parameters
  • handle: model builder
  • tree_builder: builder for the tree to be inserted. The tree must not be part of any other existing tree ensemble. Note: The tree_builder argument will become unusable after the tree insertion. Should you want to modify the tree afterwards, use the GetTree(*) method to get a fresh handle to the tree.
  • index: index of the element before which to insert the tree; use -1 to insert at the end
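
The index convention (insert before the element at index, or at the end when index is -1, and return the position of the new element) can be illustrated with a plain integer array standing in for the ensemble. The function insert_before is hypothetical. Treating an index equal to the current length as "append" is an assumption of this sketch, not something this page specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: insert value before position index, or at the end
 * when index == -1. Returns the position of the new element, or -1 on
 * failure (array full or index out of range).
 * Assumption: index == *len is accepted and means "append". */
int insert_before(int *items, size_t *len, size_t capacity,
                  int index, int value) {
  assert(items != NULL && len != NULL);
  size_t n = *len;
  if (n >= capacity) {
    return -1;
  }
  if (index < -1 || (index >= 0 && (size_t)index > n)) {
    return -1;
  }
  size_t pos = (index == -1) ? n : (size_t)index;
  /* Shift the tail one slot to the right to make room */
  memmove(items + pos + 1, items + pos, (n - pos) * sizeof(int));
  items[pos] = value;
  *len = n + 1;
  return (int)pos;
}
```

Starting from {10, 20}, inserting 5 at index 0 yields {5, 10, 20}, and a following insert of 30 at index -1 yields {5, 10, 20, 30}.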

int TreeliteModelBuilderGetTree(ModelBuilderHandle handle, int index, TreeBuilderHandle *out)

Get a reference to a tree in the ensemble.

Return
0 for success; -1 for failure
Parameters
  • handle: model builder
  • index: index of the tree in the ensemble
  • out: used to save reference to the tree

int TreeliteModelBuilderDeleteTree(ModelBuilderHandle handle, int index)

Remove a tree from the ensemble.

Return
0 for success; -1 for failure
Parameters
  • handle: model builder
  • index: index of the tree that would be removed

int TreeliteModelBuilderCommitModel(ModelBuilderHandle handle, ModelHandle *out)

finalize the model and produce the in-memory representation

Return
0 for success; -1 for failure
Parameters
  • handle: model builder
  • out: used to save handle to in-memory representation of the finished model

Predictor interface

Use the following functions to load compiled prediction subroutines from shared libraries and to make predictions.

int TreeliteAssembleSparseBatch(const float *data, const uint32_t *col_ind, const size_t *row_ptr, size_t num_row, size_t num_col, CSRBatchHandle *out)

assemble a sparse batch

Return
0 for success, -1 for failure
Parameters
  • data: feature values
  • col_ind: feature indices
  • row_ptr: pointer to row headers
  • num_row: number of data rows in the batch
  • num_col: number of columns (features) in the batch
  • out: handle to sparse batch
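
As an illustration of what the three arrays of a sparse batch encode, the following sketch expands one row into a dense buffer of num_col entries. The helper csr_row_to_dense is hypothetical. Features not listed for the row are filled with a caller-chosen placeholder, and row_ptr is assumed to hold num_row + 1 offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand row `row` of a CSR batch into out_row,
 * which must have room for num_col floats. Unlisted features are set
 * to `fill`. Returns 0 for success, -1 if the row index or any column
 * index is out of range.
 * Assumption: row_ptr has num_row + 1 elements. */
int csr_row_to_dense(const float *data, const uint32_t *col_ind,
                     const size_t *row_ptr, size_t num_row, size_t num_col,
                     size_t row, float fill, float *out_row) {
  assert(data != NULL && col_ind != NULL && row_ptr != NULL);
  assert(out_row != NULL);
  if (row >= num_row) {
    return -1;
  }
  for (size_t j = 0; j < num_col; ++j) {
    out_row[j] = fill;
  }
  for (size_t k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
    if (col_ind[k] >= num_col) {
      return -1;
    }
    out_row[col_ind[k]] = data[k];
  }
  return 0;
}
```

For data = {1.0, 2.0, 3.0}, col_ind = {0, 2, 1} and row_ptr = {0, 2, 3} with three columns, row 0 expands to {1.0, fill, 2.0} and row 1 to {fill, 3.0, fill}.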

int TreeliteDeleteSparseBatch(CSRBatchHandle handle)

delete a sparse batch from memory

Return
0 for success, -1 for failure
Parameters
  • handle: sparse batch

int TreeliteAssembleDenseBatch(const float *data, float missing_value, size_t num_row, size_t num_col, DenseBatchHandle *out)

assemble a dense batch

Return
0 for success, -1 for failure
Parameters
  • data: feature values
  • missing_value: value to represent the missing value
  • num_row: number of data rows in the batch
  • num_col: number of columns (features) in the batch
  • out: handle to dense batch

int TreeliteDeleteDenseBatch(DenseBatchHandle handle)

delete a dense batch from memory

Return
0 for success, -1 for failure
Parameters
  • handle: dense batch

int TreeliteBatchGetDimension(void *handle, int batch_sparse, size_t *out_num_row, size_t *out_num_col)

get dimensions of a batch

Return
0 for success, -1 for failure
Parameters
  • handle: a batch of rows (must be of type SparseBatch or DenseBatch)
  • batch_sparse: whether the batch is sparse (1) or dense (0)
  • out_num_row: used to set number of rows
  • out_num_col: used to set number of columns

int TreelitePredictorLoad(const char *library_path, int num_worker_thread, int include_master_thread, PredictorHandle *out)

load prediction code into memory. This function assumes that the prediction code has been already compiled into a dynamic shared library object (.so/.dll/.dylib).

Return
0 for success, -1 for failure
Parameters
  • library_path: path to library object file containing prediction code
  • num_worker_thread: number of worker threads (-1 to use max number)
  • include_master_thread: whether to assign workload to the master thread. If not, only worker threads will be assigned work.
  • out: handle to predictor

int TreelitePredictorPredictBatch(PredictorHandle handle, void *batch, int batch_sparse, int verbose, int pred_margin, float *out_result, size_t *out_result_size)

Make predictions on a batch of data rows (synchronously). This function internally divides the workload among all worker threads.

Return
0 for success, -1 for failure
Parameters
  • handle: predictor
  • batch: a batch of rows (must be of type SparseBatch or DenseBatch)
  • batch_sparse: whether batch is sparse (1) or dense (0)
  • verbose: whether to produce extra messages
  • pred_margin: whether to produce raw margin scores instead of transformed probabilities
  • out_result: resulting output vector; use TreelitePredictorQueryResultSize() to allocate sufficient space
  • out_result_size: used to save length of the output vector, which is guaranteed to be less than or equal to TreelitePredictorQueryResultSize()

int TreelitePredictorPredictInst(PredictorHandle handle, union TreelitePredictorEntry *inst, int pred_margin, float *out_result, size_t *out_result_size)

Make predictions on a single data row (synchronously). The work will be scheduled to the calling thread.

Return
0 for success, -1 for failure
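Parameters
  • handle: predictor
  • inst: single data row
  • pred_margin: whether to produce raw margin scores instead of transformed probabilities
  • out_result: resulting output vector; use TreelitePredictorQueryResultSizeSingleInst() to allocate sufficient space
  • out_result_size: used to save length of the output vector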

int TreelitePredictorQueryResultSize(PredictorHandle handle, void *batch, int batch_sparse, size_t *out)

Given a batch of data rows, query the necessary size of array to hold predictions for all data points.

Return
0 for success, -1 for failure
Parameters
  • handle: predictor
  • batch: a batch of rows (must be of type SparseBatch or DenseBatch)
  • batch_sparse: whether batch is sparse (1) or dense (0)
  • out: used to store the length of prediction array

int TreelitePredictorQueryResultSizeSingleInst(PredictorHandle handle, size_t *out)

Query the necessary size of array to hold the prediction for a single data row.

Return
0 for success, -1 for failure
Parameters
  • handle: predictor
  • out: used to store the length of prediction array

int TreelitePredictorQueryNumOutputGroup(PredictorHandle handle, size_t *out)

Get the number of output groups in the loaded model. The number is 1 for most tasks; it is greater than 1 for multiclass classification.

Return
0 for success, -1 for failure
Parameters
  • handle: predictor
  • out: number of output groups

int TreelitePredictorQueryNumFeature(PredictorHandle handle, size_t *out)

Get the width (number of features) of each instance used to train the loaded model.

Return
0 for success, -1 for failure
Parameters
  • handle: predictor
  • out: number of features

int TreelitePredictorFree(PredictorHandle handle)

delete predictor from memory

Return
0 for success, -1 for failure
Parameters
  • handle: predictor to remove

Handle types

Treelite uses C++ classes to define its internal data structures. In order to pass C++ objects to C functions, opaque handles are used. Opaque handles are void* pointers that store raw memory addresses.

typedef void *DMatrixHandle

handle to a data matrix

typedef void *ModelHandle

handle to a decision tree ensemble model

typedef void *TreeBuilderHandle

handle to tree builder class

typedef void *ModelBuilderHandle

handle to ensemble builder class

typedef void *AnnotationHandle

handle to branch annotation data

typedef void *CompilerHandle

handle to compiler class

typedef void *PredictorHandle

handle to predictor class

typedef void *CSRBatchHandle

handle to batch of sparse data rows

typedef void *DenseBatchHandle

handle to batch of dense data rows
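
The opaque-handle pattern described in this section can be sketched in plain C. The Counter type and its three functions are hypothetical and exist only for illustration: a C struct stands in for the C++ object that a real Treelite handle would point to, and the functions follow the same convention of returning 0 for success and -1 for failure while passing results through an output pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical handle type, in the same style as the typedefs above */
typedef void *CounterHandle;

/* Implementation detail that callers never see directly */
struct Counter {
  long value;
};

/* Create an object and hand back an opaque handle through *out */
int CounterCreate(long initial, CounterHandle *out) {
  if (out == NULL) {
    return -1;
  }
  struct Counter *c = malloc(sizeof *c);
  if (c == NULL) {
    return -1;
  }
  c->value = initial;
  *out = c;
  return 0;
}

/* Recover the concrete type from the handle, then operate on it */
int CounterIncrement(CounterHandle handle, long *out_value) {
  if (handle == NULL || out_value == NULL) {
    return -1;
  }
  struct Counter *c = (struct Counter *)handle;
  c->value += 1;
  *out_value = c->value;
  return 0;
}

/* Release the object; the handle must not be used afterwards */
int CounterFree(CounterHandle handle) {
  if (handle == NULL) {
    return -1;
  }
  free(handle);
  return 0;
}
```

Because every handle is a void* pointer, the compiler cannot catch a handle of one kind being passed where another kind is expected, so callers need to keep track of which handle came from which creation function.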