General Tree Inference Library (GTIL)

GTIL is a reference implementation of a prediction runtime for all Treelite models. It has the following goals:

Universal coverage: GTIL shall support all tree ensemble models that can be represented as Treelite objects.
Accessible code: GTIL should be written in an easy-to-read style that a first-time contributor can understand. We prefer code legibility to performance optimization.
Correct output: As a reference implementation, GTIL should produce correct prediction outputs.

Functions:

`predict`(model, data, *[, nthread, pred_margin])	Predict with a Treelite model using the General Tree Inference Library (GTIL).
`predict_leaf`(model, data, *[, nthread])	Predict with a Treelite model, outputting the leaf node's ID for each row.
`predict_per_tree`(model, data, *[, nthread])	Predict with a Treelite model and output prediction of each tree.

treelite.gtil.predict(model, data, *, nthread=-1, pred_margin=False)

Predict with a Treelite model using the General Tree Inference Library (GTIL).

Parameters:

model (Model object) – Treelite model object
data (numpy.ndarray / scipy.sparse.csr_matrix) – Data matrix, with which to run prediction
nthread (int, optional) – Number of CPU cores to use in prediction. If <= 0, use all CPU cores.
pred_margin (bool) – Whether to produce raw margin scores. If pred_margin=True, post-processing is skipped and the raw margin scores are returned.

Returns:

prediction – Prediction output. Expected output dimensions: (num_row, num_target, max(num_class))

Return type:

numpy.ndarray
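As a minimal sketch of how pred_margin changes the call, the helper below runs predict twice on the same inputs, once with post-processing and once without. The helper name gtil_outputs is hypothetical and not part of Treelite; it assumes that model is an already loaded Treelite model object and that data is a matrix accepted by predict, as described above.

```python
def gtil_outputs(model, data, nthread=-1):
    """Return (post-processed predictions, raw margin scores) for one model.

    Hypothetical helper for illustration; only treelite.gtil.predict and its
    documented keyword arguments are taken from the reference above.
    """
    # Imported lazily so that defining the helper does not require Treelite.
    import treelite.gtil

    # Default call: post-processing is applied to the output.
    processed = treelite.gtil.predict(model, data, nthread=nthread)
    # pred_margin=True: post-processing is skipped, raw margins are returned.
    margin = treelite.gtil.predict(
        model, data, nthread=nthread, pred_margin=True
    )
    return processed, margin
```

Both returned arrays are expected to have the shape documented above, (num_row, num_target, max(num_class)).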

treelite.gtil.predict_leaf(model, data, *, nthread=-1)

Predict with a Treelite model, outputting the leaf node’s ID for each row.

Parameters:

model (Model object) – Treelite model object
data (numpy.ndarray / scipy.sparse.csr_matrix) – Data matrix, with which to run prediction
nthread (int, optional) – Number of CPU cores to use in prediction. If <= 0, use all CPU cores.

Returns:

prediction – Prediction output. Expected output dimensions: (num_row, num_tree)

Return type:

numpy.ndarray

Notes

Treelite assigns a unique integer ID to every node in a tree, covering both leaf nodes and internal nodes. It does so by traversing the tree breadth-first. For example, the root node is assigned ID 0, and the two nodes at depth=1 are assigned IDs 1 and 2, respectively. Call treelite.Model.dump_as_json() to obtain the ID of every tree node.
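The breadth-first numbering described in this note can be illustrated with a small standalone sketch. It is not Treelite's own code: the nested-dict tree layout and the function name assign_bfs_ids are assumptions made for this example only.

```python
from collections import deque


def assign_bfs_ids(root):
    """Return (node_id, label) pairs, numbering nodes in breadth-first order."""
    ids = []
    queue = deque([root])
    next_id = 0
    while queue:
        node = queue.popleft()
        ids.append((next_id, node["label"]))
        next_id += 1
        # Enqueue children left to right so siblings get consecutive IDs.
        for child_key in ("left", "right"):
            child = node.get(child_key)
            if child is not None:
                queue.append(child)
    return ids


# Hypothetical tree: the root has an internal left child and a leaf right child.
tree = {
    "label": "root",
    "left": {
        "label": "internal",
        "left": {"label": "leaf_a"},
        "right": {"label": "leaf_b"},
    },
    "right": {"label": "leaf_c"},
}

print(assign_bfs_ids(tree))
# [(0, 'root'), (1, 'internal'), (2, 'leaf_c'), (3, 'leaf_a'), (4, 'leaf_b')]
```

In this toy tree the leaves receive IDs 2, 3, and 4, so leaf IDs are not necessarily contiguous or ordered left to right across the tree.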

treelite.gtil.predict_per_tree(model, data, *, nthread=-1)

Predict with a Treelite model and output prediction of each tree. This function computes one or more margin scores per tree.

Parameters:

model (Model object) – Treelite model object
data (numpy.ndarray / scipy.sparse.csr_matrix) – Data matrix, with which to run prediction
nthread (int, optional) – Number of CPU cores to use in prediction. If <= 0, use all CPU cores.

Returns:

prediction – Prediction output. Expected output dimensions: (num_row, num_tree, leaf_vector_shape[0] * leaf_vector_shape[1])

Return type:

numpy.ndarray
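Because predict_leaf and predict_per_tree both report one entry per (row, tree) pair, their outputs can be lined up to see which leaf produced which per-tree score. The sketch below is a hedged illustration: the helper name leaf_ids_and_tree_outputs is hypothetical, and it relies only on the two function signatures and output dimensions documented above.

```python
def leaf_ids_and_tree_outputs(model, data, nthread=-1):
    """Return (leaf_ids, per_tree) after checking that their shapes agree.

    Hypothetical helper for illustration; model is assumed to be a loaded
    Treelite model object and data a matrix accepted by GTIL.
    """
    # Imported lazily so that defining the helper does not require Treelite.
    import treelite.gtil

    # Documented shape: (num_row, num_tree)
    leaf_ids = treelite.gtil.predict_leaf(model, data, nthread=nthread)
    # Documented shape: (num_row, num_tree, size of the leaf output)
    per_tree = treelite.gtil.predict_per_tree(model, data, nthread=nthread)

    # The first two dimensions of both outputs index (row, tree).
    if tuple(leaf_ids.shape) != tuple(per_tree.shape[:2]):
        raise ValueError(
            f"shape mismatch: {leaf_ids.shape} vs {per_tree.shape}"
        )
    return leaf_ids, per_tree
```

With both arrays in hand, leaf_ids[i, t] names the leaf that row i reached in tree t, and per_tree[i, t] holds the margin score or scores produced by that tree.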