Known Limitations
Transpilation is failing due to memory
This can happen for two reasons:
The provided model is so big that the transpilation runs out of memory. Due to a Cairo limitation, the model's data must be embedded in a Cairo file during transpilation. These files are much bigger than a binary format (like ONNX), which means the model cannot be run: it will consume all available memory when the Cairo code executes.
When a model is fully compatible, we compile it on the user's behalf to generate the Sierra file needed for later deployment. This compilation uses a lot of memory and can lead to an Out of Memory (OOM) error, which also means the model cannot be run.
We suggest reviewing your model architecture and simplifying it (number of layers, neurons, etc.). We encourage using ZKBoost alongside our Model Complexity Reducer (MCR).
Transpilation is failing
When a transpilation fails, its logs are returned to give more information about what is happening. If there is an unhandled error, please reach out to a developer and provide the logs.
Pendulum installation is failing
Make sure that you are using Python 3.11: it is the minimum requirement, and giza-cli has not been tested with versions above 3.11. Take a look at Installation.
Proving Job failed
The most common cause is running out of memory. Creating a proof takes a lot of memory due to its complexity. By default, proving jobs are created using the `M` size, which uses 8 vCPUs and 32 GB of RAM. If you used the default size, try the `L` or `XL` sizes. Here is a table with the computing resources for each:
When executing a job it says: "Daily job rate exceeded for size: X"
There is a daily quota on the number of jobs that can be run, to limit usage and prevent overuse. When the quota limit is reached you can try another size if you need to; we encourage using the smallest size possible, as its quota is higher than for the `L` or `XL` sizes. Quotas are reset at 00:00 UTC.
When using an endpoint, remember that a `dry_run` argument exists in the `predict` function so that proof creation is not triggered on every single call, which is useful during development.
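As an illustration of that pattern, here is a minimal, hypothetical stand-in for `GizaModel` (the real class lives in the Giza SDK; the `input_feed` parameter and the return shape here are assumptions made for the sketch) showing how a `dry_run` flag lets callers skip proof creation during development:

```python
# Hypothetical stand-in for GizaModel, only to illustrate the dry_run
# call pattern; it is NOT the Giza SDK implementation.
class FakeGizaModel:
    def predict(self, input_feed, dry_run=False):
        # Placeholder "inference": sum the inputs.
        result = sum(input_feed)
        # In the real service a proof job is only triggered when
        # dry_run is False; here we just report whether it would be.
        proof_requested = not dry_run
        return result, proof_requested

model = FakeGizaModel()
# During development: no proof job is created.
print(model.predict([1.0, 2.0, 3.0], dry_run=True))   # (6.0, False)
# For a final, verifiable run: omit dry_run (defaults to False).
print(model.predict([1.0, 2.0, 3.0]))                 # (6.0, True)
```

The point of the flag is simply to keep the expensive proving step out of the tight develop-and-test loop.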
Proving time is X but Job took much more time
When a proving job is executed, we need to gather computing resources to launch it. For bigger sizes, gathering these resources takes more time: just spinning up the job can take up to 5 minutes for the `L` and `XL` sizes, whereas the `S` and `M` sizes should only take a couple of seconds to start.
`predict` raises a 503 Service Unavailable
When using the `predict` method of a `GizaModel`, this error can be raised for two reasons:
Out of Memory: when running the Cairo code, the service runs out of memory and is killed, thus becoming "Unavailable". Try deleting the endpoint and creating one with a bigger size; check Endpoints. If the error persists, the model is too big to be run.
The input does not have the expected shape. This usually happens when we pass a `numpy` array with a shape of `(x,)`; make sure the array has two dimensions, e.g. `(x, 1)`, and not just one, so it can be serialized by the service.
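A quick way to check and fix the shape before calling `predict`, using standard NumPy (the example values are arbitrary):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
print(x.shape)        # (3,) -- only one dimension, may fail to serialize

# Add the second dimension the service expects: (x,) -> (x, 1).
x2 = x.reshape(-1, 1)
print(x2.shape)       # (3, 1)
```

`reshape(-1, 1)` tells NumPy to keep all elements in the first axis and add a second axis of length 1.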
There is now a command, `giza endpoints logs --endpoint-id X`, to retrieve the logs and help identify which of the two happened. For the first, a log message of `killed` should be shown; for the second, a deserialization `ERROR`.
Creating an endpoint returns Maximum number of endpoints reached
I'm getting a 500 Internal Server Error when executing a command
This is an unexpected error. Please open an issue at https://github.com/gizatechxyz/giza-cli.
Make sure to provide as much information as possible like:
Operating System
Python version
Giza CLI/Agents-sdk version
Giza command used
Python traceback
request_id, if returned
Context
Actions SDK is returning an unhandled error
This is an unexpected error. Please open an issue at https://github.com/gizatechxyz/actions-sdk
Make sure to provide as much information as possible like:
Operating System
Python version
Giza CLI/Agents-sdk version
Giza command used
Python traceback
request_id, if returned
Context