Verifiable MNIST Neural Network

Giza provides developers with the tools to easily create and expand Verifiable Machine Learning solutions, transforming their Python scripts and ML models into robust, repeatable workflows.

In this tutorial, we will explore the process of building your first neural network using the MNIST dataset, PyTorch, and the Giza SDK, and demonstrate its verifiability.

What is the MNIST dataset?

The MNIST dataset is an extensive collection of handwritten digits, widely used in the field of image processing and often serving as a benchmark for machine learning algorithms. It conveniently comes already partitioned into training and testing sets, a split we'll take advantage of later in this tutorial.

The MNIST database comprises 70,000 images of handwritten digits, ranging from 0 to 9. Each image measures 28 x 28 pixels. For the purposes of this tutorial, we will resize the images to 14 x 14 pixels.
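
As a quick illustration of that downsampling (a minimal sketch using a random array in place of a real MNIST image), scipy's zoom with a factor of 0.5 on each axis turns a 28 x 28 image into a 14 x 14 one:

import numpy as np
from scipy.ndimage import zoom

image = np.random.rand(28, 28)   # stand-in for one MNIST digit
small = zoom(image, (0.5, 0.5))  # scale both axes by 0.5
print(small.shape)               # (14, 14) -> 196 input features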

Installation

To follow this tutorial, first complete the installation steps below.

Handling Python versions with Pyenv

You should install Giza tools in a virtual environment. If you’re unfamiliar with Python virtual environments, take a look at this guide. A virtual environment makes it easier to manage different projects and avoid compatibility issues between dependencies.

Install Python 3.11 using pyenv:

pyenv install 3.11.0

Set Python 3.11 as local Python version:

pyenv local 3.11.0

Create a virtual environment using Python 3.11:

pyenv virtualenv 3.11.0 my-env

Activate the virtual environment:

pyenv activate my-env

Now, your terminal session will use Python 3.11 for this project.
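
You can confirm the active interpreter:

python --version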

Install Giza

Install Giza CLI

Install the CLI from PyPI:

pip install giza-cli

Install Agent SDK

Install the Agents package from PyPI:

pip install giza-agents

You'll find more options for installing Giza in the installation guide.

Install Dependencies

You must also install the following dependencies (torchvision is needed by the training script to download the MNIST dataset):

pip install torch torchvision scipy numpy

Setup

From your terminal, create a Giza user through our CLI in order to access the Giza Platform:

giza users create

After creating your user, log into Giza:

giza users login

Optional: you can create an API key for your user so that you don't have to regenerate your access token every few hours:

giza users create-api-key

Define and Train a Model

Step 1: Set Up the Environment

First, import the necessary libraries and configure the environment settings.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import numpy as np
from scipy.ndimage import zoom
from torch.utils.data import DataLoader, TensorDataset

# Use a GPU when available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Step 2: Define Model Parameters

Specify the parameters that will define the neural network's structure.

input_size = 196  # 14x14
hidden_size = 10 
num_classes = 10
num_epochs = 10
batch_size = 256
learning_rate = 0.001

Step 3: Create the Neural Network Model

Define the neural network architecture.

class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.input_size = input_size
        self.l1 = nn.Linear(input_size, hidden_size) 
        self.relu = nn.ReLU()
        self.l2 = nn.Linear(hidden_size, num_classes) 

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        out = self.l2(out)
        return out
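
Before preparing any data, you can sanity-check the architecture with a dummy batch; this is a minimal sketch whose only purpose is to verify the input and output shapes:

model = NeuralNet(input_size, hidden_size, num_classes)
dummy_batch = torch.randn(4, input_size)  # four flattened 14 x 14 images
print(model(dummy_batch).shape)           # torch.Size([4, 10]), one logit per digit class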

Step 4: Prepare Datasets

Write functions to resize the images and prepare training and testing datasets.

def resize_images(images):
    # Each dataset element is an (image, label) pair; downsample the 28 x 28
    # image to 14 x 14 by zooming both axes by a factor of 0.5
    return np.array([zoom(image[0], (0.5, 0.5)) for image in images])

def prepare_datasets():
    print("Prepare dataset...")
    train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True)
    test_dataset = torchvision.datasets.MNIST(root='./data', train=False)

    x_train = resize_images(train_dataset)
    x_test = resize_images(test_dataset)

    x_train = torch.tensor(x_train.reshape(-1, 14*14).astype('float32') / 255)
    y_train = torch.tensor([label for _, label in train_dataset], dtype=torch.long)

    x_test = torch.tensor(x_test.reshape(-1, 14*14).astype('float32') / 255)
    y_test = torch.tensor([label for _, label in test_dataset], dtype=torch.long)

    print("✅ Datasets prepared successfully")

    return x_train, y_train, x_test, y_test

Step 5: Create Data Loaders

Create data loaders to manage batches of the datasets.

def create_data_loaders(x_train, y_train, x_test, y_test):
    print("Create loaders...")

    train_loader = DataLoader(TensorDataset(x_train, y_train), batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(TensorDataset(x_test, y_test), batch_size=batch_size, shuffle=False)

    print("✅ Loaders created!")

    return train_loader, test_loader
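
If you want to check the loaders before training, a quick peek at one batch (a minimal sketch, assuming prepare_datasets and create_data_loaders have already been called) confirms the expected shapes:

images, labels = next(iter(train_loader))
print(images.shape, labels.shape)  # torch.Size([256, 196]) torch.Size([256])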

Step 6: Train the Model

Develop a function to train the model using the training loader.

def train_model(train_loader):
    print("Train model...")

    model = NeuralNet(input_size, hidden_size, num_classes).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    for epoch in range(num_epochs):
        for i, (images, labels) in enumerate(train_loader):
            images = images.to(device).reshape(-1, 14*14)
            labels = labels.to(device)

            outputs = model(images)
            loss = criterion(outputs, labels)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            if (i + 1) % 100 == 0:
                print(f'Epoch [{epoch + 1}/{num_epochs}], Step [{i + 1}/{len(train_loader)}], Loss: {loss.item():.4f}')

    print("✅ Model trained successfully")
    return model

Step 7: Test the Model

Define a function to evaluate the model's performance on the test set.

def test_model(model, test_loader):
    print("Test model...")
    with torch.no_grad():
        n_correct = 0
        n_samples = 0
        for images, labels in test_loader:
            images = images.to(device).reshape(-1, 14*14)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            n_samples += labels.size(0)
            n_correct += (predicted == labels).sum().item()

        acc = 100.0 * n_correct / n_samples
        print(f'Accuracy of the network on the 10000 test images: {acc} %')

Step 8: Execute the Tasks

Create a function to execute all the previous steps in sequence.

def execution():
    # Prepare training and testing datasets
    x_train, y_train, x_test, y_test = prepare_datasets()

    train_loader, test_loader = create_data_loaders(x_train, y_train, x_test, y_test)

    model = train_model(train_loader)
 
    test_model(model, test_loader)
    
    return model

model = execution()

Convert the model to ONNX

Before transpiling the model we've just trained to ZK circuits, we need to convert it to the ONNX format. You can consult the list of frameworks supported by the Transpiler here.

ONNX, short for Open Neural Network Exchange, is an open format for representing and exchanging machine learning models between different frameworks and libraries. It serves as an intermediary format that allows you to move models seamlessly between various platforms and tools, facilitating interoperability and flexibility in the machine learning ecosystem.

import torch.onnx

def convert_to_onnx(model, onnx_file_path):
    dummy_input = torch.randn(1, input_size).to(device)
    torch.onnx.export(model, dummy_input, onnx_file_path,
                      export_params=True, opset_version=10, do_constant_folding=True)

    print(f"Model has been converted to ONNX and saved as {onnx_file_path}")

onnx_file_path = "mnist_model.onnx"
convert_to_onnx(model, onnx_file_path)
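
Optionally, you can sanity-check the exported file before transpiling it. This sketch assumes the onnx and onnxruntime packages are installed (pip install onnx onnxruntime); they are not needed anywhere else in the tutorial:

import onnx
import onnxruntime as ort

onnx_model = onnx.load(onnx_file_path)
onnx.checker.check_model(onnx_model)  # raises if the graph is malformed

session = ort.InferenceSession(onnx_file_path)
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, input_size).astype('float32')
print(session.run(None, {input_name: dummy})[0].shape)  # (1, 10) logits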

Transpile your Model to Orion Cairo

For more detailed information on transpilation, please consult the Transpiler resource.

Now that your model is converted to ONNX format, use the Giza-CLI to transpile it to Orion Cairo code.

> giza transpile mnist_model.onnx --output-path verifiable_mnist
>>>
[giza][2024-02-07 16:31:20.844] No model id provided, checking if model exists ✅
[giza][2024-02-07 16:31:20.845] Model name is: mnist_model
[giza][2024-02-07 16:31:21.599] Model Created with id -> 1! ✅
[giza][2024-02-07 16:31:22.436] Version Created with id -> 1! ✅
[giza][2024-02-07 16:31:22.437] Sending model for transpilation ✅
[giza][2024-02-07 16:32:13.511] Transpilation is fully compatible. Version compiled and Sierra is saved at Giza ✅
[giza][2024-02-07 16:32:13.516] Transpilation recieved! ✅
[giza][2024-02-07 16:32:14.349] Transpilation saved at: verifiable_mnist

Thanks to the transpiler's full support for all the operators used by the MNIST model, your transpilation process is fully compatible. This ensures that your project compiles smoothly; in fact, it has already been compiled behind the scenes on our platform.

If your model incorporates operators that aren't supported by the transpiler, you may need to refine your Cairo project to ensure successful compilation. For more details, refer to the Transpiler resource.

Deploy an Inference Endpoint

For more detailed information on inference endpoints, please consult the Endpoint resource.

With your model successfully transpiled, it's now ready to be deployed as an inference endpoint. Our deployment process sets up services that handle prediction requests via a designated endpoint, using Cairo to ensure inference provability.

Deploy your service, which will be ready to accept prediction requests at the /cairo_run endpoint, by using the following command:

giza endpoints deploy --model-id 1 --version-id 1
▰▰▰▰▰▱▱ Creating endpoint!
[giza][2024-02-07 12:31:02.498] Endpoint is successful 
[giza][2024-02-07 12:31:02.501] Endpoint created with id -> 1 ✅
[giza][2024-02-07 12:31:02.502] Endpoint created with endpoint URL: https://deployment-gizabrain-38-1-53427f44-dagsgas-ew.a.run.app 🎉

Run a Verifiable Inference

Now that your Cairo model is deployed on the Giza platform, you can execute it.

When you initiate a prediction using Giza, it executes the Cairo program with the CairoVM, generating the trace and memory files needed for proving. It also returns the output value and starts a proving job to generate a STARK proof of the inference.

Update the IDs in the following code with your own.

from giza.agents.model import GizaModel

def preprocess_image(image_path):
    from PIL import Image
    import numpy as np

    # Load image, convert to grayscale, resize and normalize
    image = Image.open(image_path).convert('L')
    # Resize to match the input size of the model
    image = image.resize((14, 14))
    image = np.array(image).astype('float32') / 255
    image = image.reshape(1, 196)  # Reshape to (1, 196) for model input
    return image

MODEL_ID = 1  # Update with your model ID
VERSION_ID = 1  # Update with your version ID

def prediction(image, model_id, version_id):
    model = GizaModel(id=model_id, version=version_id)

    (result, request_id) = model.predict(
        input_feed={"image": image}, verifiable=True
    )

    # Convert result to a PyTorch tensor
    result_tensor = torch.tensor(result)
    # Apply softmax to convert to probabilities
    probabilities = F.softmax(result_tensor, dim=1)
    # Use argmax to get the predicted class
    predicted_class = torch.argmax(probabilities, dim=1)

    return predicted_class.item(), request_id

def execution():
    image = preprocess_image("./imgs/zero.png")
    (result, request_id) = prediction(image, MODEL_ID, VERSION_ID)
    print("Result: ", result)
    print("Request id: ", request_id)

    return result, request_id


execution()
🚀 Starting deserialization process...
✅ Deserialization completed! 🎉
(0, '"3a15bca06d1f4788b36c1c54fa71ba07"')
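
Note that the returned request id contains literal double quotes, as shown above. If you plan to reuse it programmatically as a proof id, a small sketch (assuming the output format shown above) is to capture the return value of execution() and strip them:

result, request_id = execution()
proof_id = request_id.strip('"')
print(proof_id)  # 3a15bca06d1f4788b36c1c54fa71ba07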

Download the proof

For more detailed information on proving, please consult the Prove resource.

Executing a verifiable inference sets off a proving job on our server, sparing you the complexities of installing and configuring the prover yourself. Upon completion, you can download your proof.

First, let's check the status of the proving job to ensure that it has been completed.

Remember to substitute endpoint-id and proof-id with the specific IDs assigned to you throughout this tutorial. It's best to surround the proof-id in double quotes (") when using the alphanumeric id.

giza endpoints get-proof --endpoint-id 1 --proof-id "3a15bca06d1f4788b36c1c54fa71ba07"

>>>
[giza][2024-03-19 11:51:45.470] Getting proof from endpoint 1 ✅ 
{
  "id": 664,
  "job_id": 831,
  "metrics": {
    "proving_time": 15.083126
  },
  "created_date": "2024-03-19T10:41:11.120310"
}

Once the proof is ready, you can download it.

$ giza endpoints download-proof --endpoint-id 1 --proof-id "3a15bca06d1f4788b36c1c54fa71ba07" --output-path zk_mnist.proof

>>>
[giza][2024-03-19 11:55:49.713] Getting proof from endpoint 1 ✅ 
[giza][2024-03-19 11:55:50.493] Proof downloaded to zk_mnist.proof ✅ 

Verify the proof

Finally, you can verify the proof:

$ giza verify --proof-id 664

>>>
[giza][2024-05-21 10:08:59.315] Verifying proof...
[giza][2024-05-21 10:09:00.268] Verification result: True
[giza][2024-05-21 10:09:00.270] Verification time: 0.437505093

Voilà 🎉🎉 You've learned how to use the entire Giza stack, from training your model and transpiling it to Cairo, to proving and verifying an inference. We hope you've enjoyed this journey!
