First tutorial
This tutorial will demonstrate the basic workflow.
import treelite
Regression Example
In this tutorial, we will use a small regression example to describe the full workflow.
Load the Boston house prices dataset
Let us use the Boston house prices dataset from scikit-learn (sklearn.datasets.load_boston()), which is available in scikit-learn versions before 1.2. It consists of 506 houses with 13 distinct features:
from sklearn.datasets import load_boston
X, y = load_boston(return_X_y=True)
print(f'dimensions of X = {X.shape}')
print(f'dimensions of y = {y.shape}')
Train a tree ensemble model using XGBoost
The first step is to train a tree ensemble model using XGBoost (dmlc/xgboost).
Disclaimer: Treelite does NOT depend on the XGBoost package in any way. XGBoost is used here only to provide a working example.
import xgboost
dtrain = xgboost.DMatrix(X, label=y)
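# Shallow trees (depth 3), no shrinkage (eta=1), squared-error regression
params = {'max_depth': 3, 'eta': 1, 'objective': 'reg:squarederror', 'eval_metric': 'rmse'}
# Train for 20 boosting rounds, reporting RMSE on the training set after each round
bst = xgboost.train(params, dtrain, num_boost_round=20, evals=[(dtrain, 'train')])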
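To make the idea of a tree ensemble concrete, here is a minimal pure-Python sketch of boosting for squared error. It is not XGBoost's algorithm: XGBoost uses second-order gradients, regularization, and depth-3 trees over many features, whereas this sketch fits depth-1 trees (stumps) to plain residuals on a single feature. The helper names (fit_stump, predict_stump, train, predict), the toy data, and the choice of the label mean as the base score are all assumptions made for illustration. What it does share with the example above is the overall shape: 20 rounds, eta=1, a prediction that is the sum of per-tree outputs, and an RMSE value reported after each round.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between labels and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_stump(xs, residuals):
    """Fit a depth-1 tree on one feature by least squares.

    Returns (threshold, left_value, right_value); a sample goes left if x < threshold.
    Assumes xs contains at least two distinct values.
    """
    best = None
    # Skipping the smallest value guarantees both sides of the split are non-empty.
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x < threshold else right_value


def train(xs, ys, num_rounds, eta):
    """Boost stumps on the residuals; return the base score, the stumps, and per-round RMSE."""
    base_score = sum(ys) / len(ys)  # assumption: start from the label mean
    preds = [base_score] * len(ys)
    stumps, history = [], []
    for _ in range(num_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + eta * predict_stump(stump, x) for p, x in zip(preds, xs)]
        history.append(rmse(ys, preds))
    return base_score, stumps, history


def predict(base_score, stumps, eta, x):
    """Ensemble prediction: base score plus the scaled sum of every tree's output."""
    return base_score + eta * sum(predict_stump(s, x) for s in stumps)


# Hypothetical toy data: one feature, roughly linear target.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.1]

base_score, stumps, history = train(xs, ys, num_rounds=20, eta=1.0)
for round_index, value in enumerate(history):
    print(f'[{round_index}] train-rmse: {value:.5f}')
```

The list of stumps plus the base score is all that is needed to make a prediction, which is the property that lets a trained ensemble be handed off to a separate prediction tool.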
Pass XGBoost model into Treelite
Next, we feed the trained model into Treelite. If you used XGBoost to train the model, it takes only one line of code:
model = treelite.Model.from_xgboost(bst)
Note
Using other packages to train decision trees
With additional work, you can use models trained with other machine learning packages. See this page for instructions.