In this tutorial you will learn how to use the Giza stack through an XGBoost model.
Installation
To follow this tutorial, first complete the installation steps below.
Handling Python versions with Pyenv
You should install Giza tools in a virtual environment. If you’re unfamiliar with Python virtual environments, take a look at this guide. A virtual environment makes it easier to manage different projects and avoid compatibility issues between dependencies.
Install Python 3.11 using pyenv:
pyenv install 3.11.0
Set Python 3.11 as local Python version:
pyenv local 3.11.0
Create a virtual environment using Python 3.11:
pyenv virtualenv 3.11.0 my-env
Activate the virtual environment:
pyenv activate my-env
Now, your terminal session will use Python 3.11 for this project.
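As a quick sanity check, the active interpreter can be inspected from Python itself. The sketch below is a minimal illustration using only the standard library; the helper name `is_expected_python` is hypothetical and not part of any Giza tool.

```python
import sys


def is_expected_python(version_info=sys.version_info, expected=(3, 11)):
    """Return True when the major.minor version matches `expected`."""
    return tuple(version_info[:2]) == expected


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_expected_python():
        print(f"Python {current}: OK")
    else:
        print(f"Python {current}: expected 3.11, check that my-env is active")
```

Running this inside the activated environment should report a 3.11 interpreter; any other version suggests the virtual environment is not active in the current shell.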
Install Giza
Install Giza SDK
Install the CLI, agents and zkcook using giza-sdk from PyPI:
pip install giza-sdk
From your terminal, create a Giza user through our CLI in order to access the Giza Platform:
giza users create
After creating your user, log into Giza:
giza users login
Optional: you can create an API key for your user so that you do not have to regenerate your access token every few hours.
giza users create-api-key
Create and Train an XGBoost Model
We'll start by creating a simple XGBoost model using Scikit-Learn and training it on the diabetes dataset.
import xgboost as xgb
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

data = load_diabetes()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Increase the number of trees and maximum depth
n_estimators = 2  # Increase the number of trees
max_depth = 6  # Increase the maximum depth of each tree

xgb_reg = xgb.XGBRegressor(n_estimators=n_estimators, max_depth=max_depth)
xgb_reg.fit(X_train, y_train)
Save the model
Save the model in JSON format:
from giza.zkcook import serialize_model

serialize_model(xgb_reg, "xgb_diabetes.json")
Transpile your model to Orion Cairo
For more detailed information on transpilation, please consult the Transpiler resource.
We will use the Giza CLI to transpile our saved model to Orion Cairo.
!giza transpile xgb_diabetes.json --output-path xgb_diabetes
>>>>
[giza][2024-05-10 17:14:48.565] No model id provided, checking if model exists ✅
[giza][2024-05-10 17:14:48.567] Model name is: xgb_diabetes
[giza][2024-05-10 17:14:49.081] Model already exists, using existing model ✅
[giza][2024-05-10 17:14:49.083] Model found with id -> 588! ✅
[giza][2024-05-10 17:14:49.777] Version Created with id -> 2! ✅
[giza][2024-05-10 17:14:49.780] Sending model for transpilation ✅
[giza][2024-05-10 17:15:00.670] Transpilation is fully compatible. Version compiled and Sierra is saved at Giza ✅
⠙ Transpiling Model...
[giza][2024-05-10 17:15:01.337] Downloading model ✅
[giza][2024-05-10 17:15:01.339] model saved at: xgb_diabetes
Deploy an inference endpoint
For more detailed information on inference endpoints, please consult the Endpoint resource.
Now that our model is transpiled to Cairo, we can deploy an endpoint to run verifiable inferences. We will use the Giza CLI again to deploy the endpoint. Be sure to replace model-id and version-id with the IDs you were given during transpilation.
!giza endpoints deploy --model-id 588 --version-id 2
>>>>
▰▰▰▰▰▰▰ Creating endpoint!
[giza][2024-05-10 17:15:21.628] Endpoint is successful ✅
[giza][2024-05-10 17:15:21.635] Endpoint created with id -> 190 ✅
[giza][2024-05-10 17:15:21.636] Endpoint created with endpoint URL: https://endpoint-raphael-doukhan-588-2-72c9b3b8-7i3yxzspbq-ew.a.run.app 🎉
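The model, version and endpoint IDs printed by the CLI are needed again in later steps. If you capture the CLI output as text, a small helper can pull them out. This is only a sketch: it assumes the `id -> N` log wording shown in the example outputs in this tutorial, which may differ between CLI versions, and `extract_ids` is a hypothetical helper, not part of the Giza SDK.

```python
import re

# Matches lines such as "Model found with id -> 588!" or
# "Endpoint created with id -> 190" (wording taken from the example logs).
ID_PATTERN = re.compile(r"(Model|Version|Endpoint)\b.*?\bid -> (\d+)")


def extract_ids(log_text):
    """Return a dict such as {"model": 588, "version": 2} from CLI log text."""
    ids = {}
    for line in log_text.splitlines():
        match = ID_PATTERN.search(line)
        if match:
            ids[match.group(1).lower()] = int(match.group(2))
    return ids


sample_log = "\n".join(
    [
        "[giza][2024-05-10 17:14:48.567] Model name is: xgb_diabetes",
        "[giza][2024-05-10 17:14:49.083] Model found with id -> 588!",
        "[giza][2024-05-10 17:14:49.777] Version Created with id -> 2!",
        "[giza][2024-05-10 17:15:21.635] Endpoint created with id -> 190",
    ]
)
ids = extract_ids(sample_log)
print(ids)
```

Lines without an `id -> N` fragment, such as the model name line, are ignored, so the helper can be fed the full log.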
Run a verifiable inference
To run a verifiable inference, you could call the endpoint URL obtained after deployment directly. However, this approach requires you to serialize the input for the Cairo program manually and to handle deserialization of the output yourself. To make this process more user-friendly and keep you within a Python environment, we've introduced a Python SDK designed to facilitate the creation of ML workflows and the execution of verifiable predictions. When you initiate a prediction, our system automatically retrieves the endpoint URL you deployed earlier, converts your input into a Cairo-compatible format, executes the prediction, and then converts the output back into a numpy object.
import xgboost as xgb
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

from giza.agents.model import GizaModel

MODEL_ID = 588  # Update with your model ID
VERSION_ID = 2  # Update with your version ID


def prediction(input, model_id, version_id):
    model = GizaModel(id=model_id, version=version_id)

    (result, proof_id) = model.predict(
        input_feed={"input": input}, verifiable=True, model_category="XGB"
    )

    return result, proof_id


def execution():
    # The input data type should match the model's expected input
    input = X_test[1, :]

    (result, proof_id) = prediction(input, MODEL_ID, VERSION_ID)

    print(f"Predicted value for input {input.flatten()[0]} is {result}")

    return result, proof_id


if __name__ == "__main__":
    data = load_diabetes()
    X, y = data.data, data.target

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    _, proof_id = execution()
    print(f"Proof ID: {proof_id}")
If your problem is a binary classification problem, you will need to post-process the result returned by the predict method. The following snippet computes the probability of class 1 (the same probability returned by XGBClassifier.predict_proba()):
import json
import math


def logit(x):
    return math.log(x / (1 - x))


def post_process_binary_pred(model_json_path, result):
    """
    Returns the probability of the positive class given a result from GizaModel.predict()

    Parameters:
    model_json_path (str): Path to the trained model in JSON format.
    result (float): Result from GizaModel.predict().

    Returns:
    float: Probability of the positive class.
    """
    with open(model_json_path, 'r') as f:
        xg_json = json.load(f)

    base_score = float(xg_json['learner']['learner_model_param']['base_score'])
    if base_score != 0:
        result = result + logit(base_score)
    final_score = 1 / (1 + math.exp(-result))

    return final_score


# Usage example
model_path = 'PATH_TO_YOUR_MODEL.json'  # Path to your model JSON file
predict_result = 3.45  # Example result from GizaModel.predict()
probability = post_process_binary_pred(model_path, predict_result)
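To see what the post-processing does numerically, the sketch below repeats the same arithmetic (add logit(base_score) to the raw result, then apply the sigmoid) on a stand-in model file. The stub JSON is hypothetical and contains only the one key the snippet reads, and the helper names `probability_from_margin` and `read_base_score` are introduced here purely for illustration.

```python
import json
import math
import os
import tempfile


def logit(p):
    return math.log(p / (1 - p))


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def probability_from_margin(margin, base_score):
    """Same arithmetic as post_process_binary_pred, without file I/O."""
    if base_score != 0:
        margin = margin + logit(base_score)
    return sigmoid(margin)


def read_base_score(model_json_path):
    """Read base_score from the location the tutorial snippet uses."""
    with open(model_json_path, "r") as f:
        model = json.load(f)
    return float(model["learner"]["learner_model_param"]["base_score"])


# Hypothetical stand-in for a saved model: only the key that is read.
stub = {"learner": {"learner_model_param": {"base_score": "5E-1"}}}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(stub, f)
    stub_path = f.name

base_score = read_base_score(stub_path)
os.remove(stub_path)

# With base_score = 0.5, logit(0.5) is 0, so the result is just sigmoid(3.45).
p_default = probability_from_margin(3.45, base_score)
# A lower base_score shifts the margin down and lowers the probability.
p_shifted = probability_from_margin(3.45, 0.25)
print(round(p_default, 4), round(p_shifted, 4))
```

One property worth noting: a raw result of 0 maps back to base_score itself, because the sigmoid is the inverse of the logit.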
Download the proof
For more detailed information on proving, please consult the Prove resource.
Initiating a verifiable inference sets off a proving job on our server, sparing you the complexities of installing and configuring the prover yourself. Upon completion, you can download your proof.
First, let's check the status of the proving job to ensure that it has been completed.
Remember to substitute endpoint-id and proof-id with the specific IDs assigned to you throughout this tutorial.
$ giza endpoints get-proof --endpoint-id 190 --proof-id "546f8817fa454db78982463868440e8c"
>>>
[giza][2024-03-19 11:51:45.470] Getting proof from endpoint 190 ✅
{
  "id": 664,
  "job_id": 831,
  "metrics": {
    "proving_time": 15.083126
  },
  "created_date": "2024-03-19T10:41:11.120310"
}
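If you want to use the proof metadata in a script, the JSON body printed by get-proof can be parsed with the standard library. The sketch below works on the example output copied from this tutorial; `summarize_proof` is a hypothetical helper, and the tutorial does not state which field signals completion, so the code only extracts the fields that appear in the example.

```python
import json
from datetime import datetime

# JSON body copied from the example get-proof output in this tutorial.
raw = (
    '{"id": 664, "job_id": 831, '
    '"metrics": {"proving_time": 15.083126}, '
    '"created_date": "2024-03-19T10:41:11.120310"}'
)


def summarize_proof(raw_json):
    """Extract the fields shown in the example output into a flat dict."""
    info = json.loads(raw_json)
    metrics = info.get("metrics") or {}
    return {
        "proof": info.get("id"),
        "job": info.get("job_id"),
        "proving_time": metrics.get("proving_time"),
        "created": datetime.fromisoformat(info["created_date"]),
    }


summary = summarize_proof(raw)
print(summary)
```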
Once the proof is ready, you can download it.
$ giza endpoints download-proof --endpoint-id 190 --proof-id "546f8817fa454db78982463868440e8c" --output-path zk_xgboost.proof
>>>>
[giza][2024-03-19 11:55:49.713] Getting proof from endpoint 190 ✅
[giza][2024-03-19 11:55:50.493] Proof downloaded to zk_xgboost.proof ✅
It is best to wrap the proof-id in double quotes (") when using the alphanumeric ID.