Treelite is a model compiler for decision tree ensembles, aimed at efficient deployment.
You are currently browsing the documentation of a stable version of Treelite: 0.93.
Treelite compiles your tree model into an optimized shared library. A benchmark demonstrates a 2-6x improvement in prediction throughput, due to more efficient use of compute resources.
Treelite accommodates a wide range of decision tree ensemble models. In particular, it handles both random forests and gradient boosted trees.
Treelite can read models produced by XGBoost, LightGBM, and scikit-learn. If you train your model with another package, you can use the flexible builder class.
It is a great hassle to install machine learning packages (e.g. XGBoost, LightGBM, scikit-learn) on every machine where your tree model will run. This is no longer the case: Treelite exports your model as a stand-alone prediction library, so predictions can be made without any machine learning package installed.
Install Treelite from PyPI:
python3 -m pip install --user treelite treelite_runtime
Import your tree ensemble model into Treelite:
import treelite
model = treelite.Model.load('my_model.model', model_format='xgboost')
Deploy a source archive:
# Produce a zipped source directory, containing all model information
# Run `make` on the target machine
model.export_srcpkg(platform='unix', toolchain='gcc',
                    pkgpath='./mymodel.zip', libname='mymodel.so',
                    verbose=True)
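On the target machine, the archive still has to be unpacked and built. A minimal sketch of that step in Python follows. It assumes the archive unpacks to a directory named `mymodel` containing a Makefile; the helper `unpack_and_build` and its parameters are hypothetical and not part of Treelite.

```python
import pathlib
import subprocess
import zipfile


def unpack_and_build(pkgpath, dest=".", srcdir="mymodel", build_cmd=("make",)):
    """Extract a source archive and run the build command inside it.

    Hypothetical helper, not part of Treelite. `srcdir` is the directory
    the archive is assumed to unpack to; adjust it to match your archive.
    """
    dest = pathlib.Path(dest)
    with zipfile.ZipFile(pkgpath) as archive:
        archive.extractall(dest)
    build_dir = dest / srcdir
    if not build_dir.is_dir():
        raise FileNotFoundError(f"expected {build_dir} after extraction")
    # check=True raises CalledProcessError if the build command fails
    subprocess.run(list(build_cmd), cwd=build_dir, check=True)
    return build_dir
```

Calling `unpack_and_build('./mymodel.zip')` would then correspond to unzipping the archive and running `make` by hand.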
Deploy a shared library:
# Like export_srcpkg, but generates a shared library immediately
# Use this only when the host and target machines are compatible
model.export_lib(toolchain='gcc', libpath='./mymodel.so', verbose=True)
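When the host and target are not compatible, the source archive route applies instead, and its arguments depend on the target's operating system. The sketch below picks them from a small lookup table. The helper `export_args` and the table are hypothetical; the platform and toolchain strings are assumptions that should be checked against the API reference for your Treelite version.

```python
import platform

# Assumed export_srcpkg argument values, keyed by what platform.system()
# would report on the target machine. Verify against the API reference.
_TARGETS = {
    "Linux": ("unix", "gcc", ".so"),
    "Darwin": ("osx", "clang", ".dylib"),
    "Windows": ("windows", "msvc", ".dll"),
}


def export_args(libstem, system=None):
    """Return platform, toolchain and libname for a given target system.

    Hypothetical helper, not part of Treelite. Defaults to the system
    this code is running on.
    """
    system = system or platform.system()
    try:
        plat, toolchain, ext = _TARGETS[system]
    except KeyError:
        raise ValueError(f"unsupported system: {system!r}") from None
    return {"platform": plat, "toolchain": toolchain, "libname": libstem + ext}
```

The result can be unpacked into the call, for example `model.export_srcpkg(pkgpath='./mymodel.zip', verbose=True, **export_args('mymodel', 'Linux'))`.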
Make predictions on the target machine:
import treelite_runtime
predictor = treelite_runtime.Predictor('./mymodel.so', verbose=True)
batch = treelite_runtime.Batch.from_npy2d(X)
out_pred = predictor.predict(batch)
Read the First tutorial for a more detailed example. See Deploying models for additional instructions on deployment.
Note
A note on API compatibility
Since Treelite is in early development, its API may change substantially in the future.
The workflow involves two distinct machines: the host machine, which generates a prediction subroutine from a given tree model, and the target machine, which runs the subroutine. The two machines exchange a single C file that contains all relevant information about the tree model. Only the host machine needs to have Treelite installed; the target machine requires only a working C compiler.
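To make the idea of a generated prediction subroutine concrete, here is a toy sketch that turns one decision tree into C source text. It is not Treelite's code generator: the nested-dict tree format and the functions `tree_to_c`, `emit_source` and `evaluate` are invented for illustration, and missing-value handling is ignored.

```python
def tree_to_c(node, indent=1):
    """Emit C statements for one tree given as nested dicts (toy format)."""
    pad = "  " * indent
    if "leaf" in node:
        return f"{pad}return {node['leaf']!r};\n"
    return (
        f"{pad}if (x[{node['feature']}] < {node['threshold']!r}) {{\n"
        + tree_to_c(node["left"], indent + 1)
        + f"{pad}}} else {{\n"
        + tree_to_c(node["right"], indent + 1)
        + f"{pad}}}\n"
    )


def emit_source(tree, name="predict"):
    """Wrap the tree logic in a C function taking a feature vector."""
    return f"double {name}(const double* x) {{\n" + tree_to_c(tree) + "}\n"


def evaluate(node, x):
    """Reference evaluation in Python, for checking the generated logic."""
    while "leaf" not in node:
        go_left = x[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


# A hypothetical two-level tree over two features
toy_tree = {
    "feature": 0, "threshold": 5.0,
    "left": {"leaf": 0.6},
    "right": {
        "feature": 1, "threshold": 2.5,
        "left": {"leaf": -0.4},
        "right": {"leaf": 1.2},
    },
}

source = emit_source(toy_tree)
print(source)
```

The printed text is plain C with the tree's thresholds and leaf values baked in as constants, which is why a target machine in this scheme would need only a C compiler to turn it into a library.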