The Lattica Platform

Lattica is a platform for AI providers to host models, manage access, and allocate compute resources. End users can run encrypted queries on these models without exposing their data.

Built on the Lattica FHE stack, our platform performs inference under fully homomorphic encryption (FHE): AI providers never see user inputs, and users never expose their data.

[Diagram: the Lattica platform architecture. The End User's Query Client sends an encrypted query to the FHE Inference Engine in the cloud and receives an encrypted response; the AI Provider uses the LatticaAI Console to deploy models and manage resources.]
How It Works

1. Upload your model through the Lattica platform.

2. We host and serve your model in the cloud.

3. Your users encrypt their queries with the Lattica Query Client before sending them to our inference API.

4. When an encrypted query arrives, we perform FHE inference on your model and return the encrypted result directly to your user.
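To make the provider side of this flow concrete, here is a minimal sketch in Python. The ProviderClient class and its upload_model and start_worker methods are hypothetical names for illustration, not the documented Lattica API; only the QueryClient usage shown later on this page is taken from the platform's own example.

# Hypothetical provider-side sketch of the flow above.
# ProviderClient, upload_model, and start_worker are illustrative
# names, not the documented Lattica API.
from lattica import ProviderClient  # hypothetical import

provider = ProviderClient("my_provider_token")             # authenticate
model_id = provider.upload_model("diagnosis_model.onnx")   # step 1: upload
provider.start_worker(model_id, worker_type="gpu")         # step 2: hosted and served
# Steps 3 and 4 happen per query: the user's Query Client sends an
# encrypted input, FHE inference runs on the hosted model, and the
# encrypted result is returned directly to the user.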

Console

https://www.lattica.ai/wp-content/uploads/2025/03/Group-119.png

Access Control

Control who can use your AI models with token-based access.

  • Tokens act as access keys that allow a specific user to query your model.
  • Set permissions to control usage limits.
  • Revoke or update tokens at any time.

Example: A company providing an AI model for medical diagnosis can issue tokens to hospitals, allowing them to run secure queries without accessing the model directly.
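As a sketch of how token-based access control works conceptually, the toy registry below issues tokens with a usage limit, authorizes queries against them, and revokes them. It is a self-contained illustration, not the Lattica console API.

# Toy model of token-based access control (illustration only).
import secrets

tokens = {}  # token -> {"user": ..., "daily_limit": ..., "used": ...}

def issue_token(user, daily_limit):
    token = secrets.token_hex(16)
    tokens[token] = {"user": user, "daily_limit": daily_limit, "used": 0}
    return token

def authorize(token):
    entry = tokens.get(token)
    if entry is None or entry["used"] >= entry["daily_limit"]:
        return False  # unknown, revoked, or over its usage limit
    entry["used"] += 1
    return True

def revoke_token(token):
    tokens.pop(token, None)

hospital_token = issue_token("hospital_a", daily_limit=1000)
assert authorize(hospital_token)      # hospital may query the model
revoke_token(hospital_token)
assert not authorize(hospital_token)  # access is gone immediately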

https://www.lattica.ai/wp-content/uploads/2025/03/Group-113.png

Compute Resource Management

Control how much computing power is used for each model.

  • Workers are the processing units that run AI models on the FHE engine.
  • Start and stop workers as needed.
  • Usage is tracked by uptime, not number of queries.

Example: If a model is processing 100 queries per minute, you may need multiple workers. If demand drops, you can turn off unused workers to save costs.
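A back-of-envelope sizing calculation makes this concrete; the per-query latency below is an assumed figure for illustration, since FHE inference time varies by model and hardware.

import math

queries_per_minute = 100   # demand from the example above
seconds_per_query = 3.0    # assumed per-worker FHE inference latency

worker_throughput = 60 / seconds_per_query                  # 20 queries/min
workers_needed = math.ceil(queries_per_minute / worker_throughput)
print(workers_needed)  # 5 workers for this assumed workload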

https://www.lattica.ai/wp-content/uploads/2025/03/WhatsApp-Image-2025-03-12-at-10.30.03.png

Credit-Based Payment System

A pay-as-you-go system for compute resources.

  • Credits are deducted based on worker uptime.
  • Different worker types have different costs depending on performance and hardware.
  • Credits can be purchased at any time to keep workers running without interruption.

Example: A high-performance GPU worker will consume more credits per hour than a CPU worker, allowing providers to balance cost and speed based on workload needs.
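With assumed, illustrative credit rates, uptime-based billing works out as follows; real rates depend on the worker types Lattica offers.

# Assumed credit rates per hour of uptime (illustration only).
credits_per_hour = {"cpu": 1.0, "gpu": 8.0}

# One GPU worker up for 4 hours plus two CPU workers up for 10 hours
# each; credits are deducted by uptime, not by query count.
usage_hours = [("gpu", 4.0), ("cpu", 10.0), ("cpu", 10.0)]
total_credits = sum(credits_per_hour[w] * h for w, h in usage_hours)
print(total_credits)  # 8*4 + 1*10 + 1*10 = 52.0 credits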

https://www.lattica.ai/wp-content/uploads/2025/03/Group-115.png

Monitoring Dashboard

Manage AI models, track usage, and control access from a single interface.

  • Upload and manage models from a web interface or a CLI.
  • Monitor usage, including queries, compute time, and costs.
  • Control access by assigning tokens to users.
  • Allocate compute resources and track credit consumption.
Example: querying a hosted model with the Lattica Query Client (Python):

from lattica import QueryClient
import torch

# === One-time setup per user/model ===
# Authenticate with a query token and fetch encryption parameters.
client = QueryClient("my_query_token")
# Generate a key pair locally and upload the evaluation key to the server.
context = client.generate_key()

# === Make encrypted queries ===
query_data1 = torch.tensor([0.1, 0.2, 0.3])
result1 = client.run_query(query_data1, context)
query_data2 = torch.tensor([0.4, 0.5, 0.6])
result2 = client.run_query(query_data2, context)

Query Client

1. Obtain a query token for the model you intend to query

2. Encrypt input data and send a secure query

3. Receive an encrypted response

4. Decrypt results locally
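These four steps map onto the code example above. Assuming run_query wraps encryption, transport, and local decryption in one call (an inference from the steps listed here, not something the example spells out), the flow is:

from lattica import QueryClient
import torch

client = QueryClient("my_query_token")   # step 1: obtain and use a query token
context = client.generate_key()          # local keys enable encrypt/decrypt

query = torch.tensor([0.1, 0.2, 0.3])
# Steps 2-4: encrypt the input, send the secure query, receive the
# encrypted response, and decrypt the result locally.
result = client.run_query(query, context)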

Explore Available Models

How to Get Started

View Documentation

Or

Request Early Access to the Platform
