Specifying models using Protocol Buffers

Because treelite handles prediction only, you must train decision tree ensemble models with another machine learning package. This document shows how to import an ensemble model that was trained elsewhere.

Using XGBoost or LightGBM for training? Read this document instead.

What is Protocol Buffers?

Protocol Buffers (google/protobuf) is a widely used mechanism for serializing structured data. You can describe your ensemble model according to the specification in src/tree.proto. Depending on the package you used to train the model, expressing it in terms of this spec may take some effort. See this helpful guide on reading and writing serialized messages.
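
To give a feel for what "expressing the model in terms of the spec" involves, here is a minimal sketch in plain Python of the information a decision tree ensemble typically carries. The class Node, its field names, and the "go left when the feature is below the threshold" rule are assumptions made for illustration only; they are not the messages or fields defined in src/tree.proto, which remains the authoritative reference.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical field names; consult src/tree.proto for the real ones.
    leaf_value: Optional[float] = None   # set only on leaf nodes
    split_index: Optional[int] = None    # feature tested at an internal node
    threshold: Optional[float] = None    # assumed rule: go left when feature < threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def predict_tree(node: Node, features: List[float]) -> float:
    """Walk one tree from the root down to a leaf and return the leaf value."""
    while node.leaf_value is None:
        if features[node.split_index] < node.threshold:
            node = node.left
        else:
            node = node.right
    return node.leaf_value


def predict_ensemble(trees: List[Node], features: List[float]) -> float:
    """Sum the outputs of all trees (one common way to combine them)."""
    return sum(predict_tree(tree, features) for tree in trees)


# A two-tree ensemble over two features.
tree_a = Node(split_index=0, threshold=0.5,
              left=Node(leaf_value=1.0), right=Node(leaf_value=2.0))
tree_b = Node(split_index=1, threshold=3.0,
              left=Node(leaf_value=-0.5), right=Node(leaf_value=0.5))
ensemble = [tree_a, tree_b]
```

Whatever package trained your model, you will need to extract roughly this information (split feature, threshold, children, and leaf outputs for every node) before you can write it out in the format the spec requires.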

To import a model that has been serialized with Protocol Buffers, call the load() method with the argument format='protobuf':

from treelite import Model

# The model was saved to a file named my_model.bin.
# Note the second argument, format='protobuf'.
model = Model.load('my_model.bin', format='protobuf')
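
In a script, it can be convenient to wrap the call so that a missing file is reported clearly before treelite is invoked. The following is a minimal sketch under that assumption; load_protobuf_model is a hypothetical helper name, not part of treelite, and the only treelite call it makes is the Model.load() call shown above.

```python
import os


def load_protobuf_model(path):
    """Hypothetical helper: check the file, then load it as a Protocol Buffers model."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No serialized model found at {path!r}")
    # Imported here so the path check runs even where treelite is not installed.
    from treelite import Model
    return Model.load(path, format='protobuf')
```

Calling load_protobuf_model('my_model.bin') would then behave like the direct call, except that a wrong path raises FileNotFoundError with the offending path in the message.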