Quantir Documentation
Quantir is a crypto intelligence platform powered by decentralized AI inference. We provide analytics tools for Solana traders and an OpenAI-compatible API for developers, all running on c0mpute's distributed GPU network.
Analytics Platform
AI-powered token analysis, wallet tracking, and sentiment scoring for Solana.
Inference API
OpenAI-compatible endpoint. Private, uncensored, flat-rate per request.
GPU Network
Powered by c0mpute. Distributed workers earn USDC for every job completed.
Architecture
Quantir's architecture is split into two layers: the application layer (analytics, dashboard, chat) and the inference layer (powered by c0mpute's decentralized network).
┌──────────────────────────────────────────────┐
│             Quantir Application              │
│   Dashboard   │   Chat AI   │   API Gateway  │
└──────────────────────┬───────────────────────┘
                       │ HTTPS (OpenAI protocol)
              ┌────────▼────────┐
              │   c0mpute.ai    │
              │  Orchestrator   │
              └────────┬────────┘
                       │ Socket.io
        ┌──────────────┼──────────────┐
        │              │              │
   ┌────▼────┐    ┌────▼────┐    ┌────▼────┐
   │ Browser │    │ Native  │    │  Image  │
   │ Workers │    │ Workers │    │ Workers │
   │ (WebGPU)│    │ (CUDA)  │    │(ComfyUI)│
   └─────────┘    └─────────┘    └─────────┘
Request Flow
- User sends a query via Quantir dashboard or API
- Quantir backend adds system prompts, on-chain context, and tool definitions
- Request is forwarded to c0mpute's orchestrator via OpenAI-compatible protocol
- Orchestrator selects a GPU worker (weighted-random by speed)
- Worker runs inference and streams tokens back
- Quantir post-processes and delivers the response
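The second step of the flow can be pictured as a small payload builder. This is a minimal sketch, not Quantir's actual backend: the function name `build_upstream_request`, its parameters, and the way on-chain context is folded into the system message are all assumptions made for illustration.

```python
from typing import Any, Optional


def build_upstream_request(
    user_messages: list[dict[str, str]],
    system_prompt: str,
    onchain_context: str,
    tools: Optional[list[dict[str, Any]]] = None,
    model: str = "quantir-max",
) -> dict[str, Any]:
    """Wrap the user's messages in an OpenAI-style chat request (hypothetical helper)."""
    # Assumption: context is appended to the system message as plain text.
    system_message = {
        "role": "system",
        "content": f"{system_prompt}\n\nOn-chain context:\n{onchain_context}",
    }
    payload: dict[str, Any] = {
        "model": model,
        "messages": [system_message, *user_messages],
        "stream": True,
    }
    if tools:
        # Tool definitions are only attached when there are any to offer.
        payload["tools"] = tools
    return payload
```

The result has the same shape as the request bodies shown in the API Reference, so it could be forwarded over any OpenAI-compatible protocol.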
Key Properties
- Stateless inference — No prompts are stored on any server
- Worker anonymity — Workers process data without knowing the sender
- Fault-tolerant — Jobs auto-retry on different workers if one fails
- Horizontally scalable — More workers join, more throughput available
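The fault-tolerance property can be illustrated with a simple failover loop. This is a sketch under stated assumptions: `WorkerFailed`, `run_with_failover`, the `max_attempts` limit, and the `flaky_run` stand-in are hypothetical names, and the real orchestrator's retry policy is not documented here.

```python
from typing import Callable, Sequence


class WorkerFailed(Exception):
    """Raised by a (hypothetical) worker call that did not complete."""


def run_with_failover(
    job: str,
    workers: Sequence[str],
    run_on: Callable[[str, str], str],
    max_attempts: int = 3,
) -> tuple[str, list[str]]:
    """Try the job on successive workers until one succeeds.

    Returns the result and the list of workers that were tried, in order.
    """
    tried: list[str] = []
    for worker in workers[:max_attempts]:
        tried.append(worker)
        try:
            return run_on(worker, job), tried
        except WorkerFailed:
            continue  # retry on a different worker
    raise RuntimeError(f"job failed on all attempted workers: {tried}")


def flaky_run(worker: str, job: str) -> str:
    """Stand-in for real inference: worker 'w1' always fails."""
    if worker == "w1":
        raise WorkerFailed(worker)
    return f"{worker}:{job}"
```

Calling `run_with_failover("job-1", ["w1", "w2", "w3"], flaky_run)` skips the failing worker and returns the result from the next one.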
API Reference
Base Configuration
Base URL: https://quantir.tech/api/v1
API Key: sk-quantir-... (generate at quantir.tech/settings)
Headers:
Authorization: Bearer sk-quantir-...
Content-Type: application/json
POST /v1/chat/completions
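Create a chat completion. Supports streaming, tool calling, and vision. Endpoint paths in this reference are relative to https://quantir.tech/api, so the full URL is https://quantir.tech/api/v1/chat/completions.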
// Request
{
  "model": "quantir-max",
  "messages": [
    { "role": "system", "content": "You are a crypto analyst." },
    { "role": "user", "content": "Analyze SOL/USDC momentum" }
  ],
  "stream": true,
  "temperature": 0.7,
  "max_tokens": 2048
}
// Response (non-streaming)
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "quantir-max",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Based on current on-chain data..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 156,
    "total_tokens": 198
  }
}
POST /v1/chat/completions (with tools)
Function/tool calling for structured data retrieval.
{
  "model": "quantir-max",
  "messages": [
    { "role": "user", "content": "Get top 5 wallets buying SOL today" }
  ],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_top_wallets",
      "description": "Fetch top wallets by volume",
      "parameters": {
        "type": "object",
        "properties": {
          "token": { "type": "string" },
          "limit": { "type": "integer" },
          "timeframe": { "type": "string", "enum": ["1h", "24h", "7d"] }
        },
        "required": ["token"]
      }
    }
  }]
}
POST /v1/images/generations
Generate images. 20 credits per image.
// Request
{
  "prompt": "cyberpunk trading dashboard, dark theme",
  "n": 1,
  "size": "1024x1024"
}
// Response
{
  "data": [{
    "url": "https://quantir.tech/images/gen-abc123.png"
  }]
}
GET /v1/balance
Check your current credit balance.
// Response
{
  "credits": 450,
  "usd_equivalent": 4.50
}
Error Codes
| Code | Meaning |
|---|---|
| 401 | Invalid or missing API key |
| 402 | Insufficient credits |
| 429 | Rate limit exceeded (60 req/min) |
| 503 | No workers available, retry in a moment |
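A client can use the table above to decide whether a failed request is worth retrying. The sketch below treats 429 and 503 as retryable and 401 and 402 as final. The exponential backoff with a 1-second base and 30-second cap is an assumption, not a documented recommendation, and `retry_delay` is a hypothetical helper name.

```python
from typing import Optional

# Status codes from the error table that may succeed on a later attempt.
RETRYABLE_STATUSES = {429, 503}


def retry_delay(
    status: int, attempt: int, base: float = 1.0, cap: float = 30.0
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if retrying will not help.

    `attempt` is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    if status not in RETRYABLE_STATUSES:
        # 401 (bad key) and 402 (no credits) need user action, not a retry.
        return None
    return min(cap, base * (2 ** attempt))
```

With these defaults the waits grow as 1, 2, 4, 8 seconds and stop increasing at 30 seconds.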
Models
| Model | Credits | Workers | Best For |
|---|---|---|---|
| quantir-pro | 10 | Browser (WebGPU) | Fast queries, simple analysis |
| quantir-max | 15 | Native (CUDA) | Deep analysis, complex reasoning |
| quantir-max-think | 20 | Native (CUDA) | Multi-step research, extended reasoning |
| code | 15 | Native (CUDA) | Code generation, smart contract analysis |
Capabilities
- Chat completions — All models
- Streaming — All models (SSE format)
- Tool/function calling — All models
- Vision (base64 images) — quantir-max, quantir-max-think
- Web search — quantir-max, quantir-max-think (model-driven tool call)
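When a model answers with a tool call instead of text, the client is expected to run the function and send the result back. The sketch below assumes the response follows the OpenAI chat-completions `tool_calls` shape, which the API's OpenAI compatibility suggests but this page does not show. The body of `get_top_wallets` is a placeholder, and `TOOL_REGISTRY` and `run_tool_calls` are hypothetical names.

```python
import json
from typing import Any, Callable


def get_top_wallets(token: str, limit: int = 5, timeframe: str = "24h") -> dict[str, Any]:
    """Placeholder implementation; a real one would query a data source."""
    return {"token": token, "limit": limit, "timeframe": timeframe, "wallets": []}


# Maps tool names declared in the request to local Python callables.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_top_wallets": get_top_wallets}


def run_tool_calls(message: dict[str, Any]) -> list[dict[str, str]]:
    """Execute each tool call in an assistant message and build the reply messages."""
    replies = []
    for call in message.get("tool_calls") or []:
        func = TOOL_REGISTRY[call["function"]["name"]]
        # In the OpenAI format, arguments arrive as a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(func(**args)),
        })
    return replies
```

The returned messages would be appended to the conversation, after the assistant message that requested them, and sent in a follow-up request so the model can produce its final answer.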
Model Details
quantir-pro runs on browser-based WebGPU workers using optimized 8B parameter models. Response times are typically 1–3 seconds for short queries. Best for quick lookups and simple analysis.
quantir-max runs on native GPU workers with 27B+ parameter models via ollama. Supports extended context windows and complex multi-turn conversations. Recommended for deep research.
quantir-max-think enables extended chain-of-thought reasoning. The model internally deliberates before responding, producing more thorough analysis at higher latency.
Pricing
1 credit = $0.01 USDC. Flat per-request billing — you know the cost before sending.
Credit Costs
| Model | Credits/request | USD/request |
|---|---|---|
| quantir-pro | 10 | $0.10 |
| quantir-max | 15 | $0.15 |
| quantir-max-think | 20 | $0.20 |
| code | 15 | $0.15 |
| Image generation | 20 | $0.20 |
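Because billing is flat per request, cost estimates reduce to multiplication. The sketch below encodes the table above; the `"image"` key and the function names are labels chosen for this example, not API identifiers.

```python
from decimal import Decimal

CREDIT_USD = Decimal("0.01")  # 1 credit = $0.01 USDC

# Credits per request, copied from the pricing table.
CREDITS_PER_REQUEST = {
    "quantir-pro": 10,
    "quantir-max": 15,
    "quantir-max-think": 20,
    "code": 15,
    "image": 20,  # hypothetical key for image generation
}


def cost_usd(kind: str, requests: int = 1) -> Decimal:
    """Total USD cost of a number of requests of one kind."""
    return CREDITS_PER_REQUEST[kind] * requests * CREDIT_USD


def requests_affordable(balance_credits: int, kind: str) -> int:
    """How many whole requests a credit balance covers."""
    return balance_credits // CREDITS_PER_REQUEST[kind]
```

For example, the 450-credit balance shown in the GET /v1/balance response covers 30 quantir-max requests.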
Why Flat Pricing
Traditional AI APIs charge per-token, making costs unpredictable — especially for crypto analysis which often requires large context windows. With Quantir, every request has a fixed cost regardless of input/output length. No token math, no bill shock.
Free Tier
- Basic analytics dashboard access
- Limited AI queries per day
- Public sentiment feed
- No credit card required
Buying Credits
Credits are purchased with USDC on Solana. Minimum purchase: 100 credits ($1.00). Credits never expire. Balance is visible at GET /v1/balance.
$QNTR Token
$QNTR is a value-accrual token on Solana. It is not required to use the platform — all services are paid in USDC credits. $QNTR exists to capture and distribute protocol revenue.
Contract Address: Coming soon
Treasury Sources
| Source | % to Treasury |
|---|---|
| Compute margin (workers keep 70%) | 30% |
| $QNTR trading fees | 35% |
Daily Treasury Split
- 50% — Buyback and burn $QNTR (deflationary pressure)
- 50% — Distributed to stakers in USDC (real yield)
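The treasury arithmetic can be worked through in a few lines. This sketch assumes the percentages mean that 30% of compute revenue and 35% of $QNTR trading fees flow into the treasury each day, and that the total is then split evenly. The function name and the input figures in the usage note are illustrative only.

```python
from decimal import Decimal

COMPUTE_MARGIN = Decimal("0.30")     # workers keep the other 70%
TRADING_FEE_SHARE = Decimal("0.35")  # assumed share of trading fees
SPLIT = Decimal("0.50")              # half to buyback, half to stakers


def daily_treasury(compute_revenue_usdc: Decimal, trading_fees_usdc: Decimal) -> dict[str, Decimal]:
    """Compute one day's treasury inflow and its 50/50 split."""
    inflow = compute_revenue_usdc * COMPUTE_MARGIN + trading_fees_usdc * TRADING_FEE_SHARE
    return {
        "inflow": inflow,
        "buyback_and_burn": inflow * SPLIT,
        "staker_rewards": inflow * SPLIT,
    }
```

With a hypothetical $1,000 of compute revenue and $200 of trading fees, the inflow is $370, of which $185 funds buybacks and $185 goes to stakers.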
Staking Benefits
| Benefit | Details |
|---|---|
| USDC Rewards | Daily at 15:00 UTC, proportional to stake size |
| Free Credits | Daily allowance based on stake (use-it-or-lose-it, active users only) |
| Worker Boost | Stake 500K+ $QNTR → earn 80% per job instead of 70% |
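Rewards that are "proportional to stake size" amount to a pro-rata split of the daily USDC pool. The following is a minimal sketch of that calculation. Rounding down to six decimal places (USDC's precision) is an assumption about how dust is handled, and `distribute_rewards` is a hypothetical name.

```python
from decimal import Decimal, ROUND_DOWN

USDC_PRECISION = Decimal("0.000001")  # USDC uses 6 decimal places


def distribute_rewards(pool_usdc: Decimal, stakes: dict[str, Decimal]) -> dict[str, Decimal]:
    """Split a USDC pool across stakers in proportion to their stake."""
    total = sum(stakes.values(), Decimal(0))
    if total == 0:
        return {wallet: Decimal(0) for wallet in stakes}
    return {
        wallet: (pool_usdc * stake / total).quantize(USDC_PRECISION, rounding=ROUND_DOWN)
        for wallet, stake in stakes.items()
    }
```

A wallet holding 75% of the total stake receives 75% of the pool, less any rounding dust.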
Staking Mechanics
- 24-hour maturity period on new deposits
- Unstake anytime — newest deposits withdrawn first (LIFO)
- Auto-compound option: USDC reward → buy $QNTR → restake
- Self-custody vault — your keys, your tokens, always
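The LIFO withdrawal rule and the maturity period can be sketched with a list of deposits. This is illustrative only: the `Deposit` type and function names are hypothetical, and the reading that a deposit counts as matured once it is at least 24 hours old is an assumption about what the maturity period means.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MATURITY = timedelta(hours=24)


@dataclass(frozen=True)
class Deposit:
    amount: int
    deposited_at: datetime


def unstake(deposits: list[Deposit], amount: int) -> list[Deposit]:
    """Withdraw `amount`, newest deposits first; return what remains, oldest first."""
    remaining = sorted(deposits, key=lambda d: d.deposited_at)
    if amount < 0 or amount > sum(d.amount for d in remaining):
        raise ValueError("invalid unstake amount")
    while amount > 0:
        newest = remaining.pop()  # last element is the newest deposit
        take = min(newest.amount, amount)
        amount -= take
        if take < newest.amount:
            # Partially consumed: put back the unspent part with its original timestamp.
            remaining.append(Deposit(newest.amount - take, newest.deposited_at))
    return remaining


def matured_stake(deposits: list[Deposit], now: datetime) -> int:
    """Sum of deposits that are at least 24 hours old (assumed maturity rule)."""
    return sum(d.amount for d in deposits if now - d.deposited_at >= MATURITY)
```

One consequence of LIFO under this reading is that unstaking consumes the deposits that have matured least, so older matured stake is left in place.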
Why Not Inflationary Emissions
Most crypto yield comes from token inflation (printing new tokens to pay stakers). $QNTR rewards are paid in USDC from actual protocol revenue. The token supply only decreases over time via buyback-and-burn. This is real yield, not dilution disguised as APY.
Running a Worker
Workers contribute GPU compute to the network and earn USDC for every inference job. The network runs on c0mpute's infrastructure, so workers register with c0mpute's orchestrator.
Browser Worker (Pro Tier)
Requirements: Chrome/Edge with WebGPU support, ~6GB VRAM (dedicated or shared GPU)
- Navigate to the worker page
- Connect your Solana wallet
- Click “Start Worker” — model downloads (~4.3GB, cached after first run)
- Keep the tab open to receive and process jobs
Browser workers run an optimized 8B parameter model via WebGPU. No installation required.
Native Worker (Max Tier)
Requirements: 20GB+ VRAM on an NVIDIA (CUDA), Apple Silicon (Metal), or AMD (Vulkan) GPU
# 1. Install ollama
curl -fsSL https://ollama.ai/install.sh | sh
# 2. Pull the model (choose one)
ollama pull qwen3.5:27b      # General purpose (recommended)
ollama pull supergemma4:26b  # Alternative
# 3. Install and start the worker
npm i -g @quantir/worker
quantir-worker start --wallet <your-solana-address>
# Worker connects to orchestrator and starts receiving jobs
Worker Selection
The orchestrator uses weighted-random selection based on measured tokens/second. Faster workers receive proportionally more jobs, but work is spread across all active workers.
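Weighted-random selection by throughput can be sketched with the standard library. This is not the orchestrator's code (which is described as a Node.js service); it only illustrates the technique. The `pick_worker` name and the worker speeds in the usage note are made up for the example.

```python
import random
from typing import Optional


def pick_worker(tokens_per_second: dict[str, float], rng: Optional[random.Random] = None) -> str:
    """Pick one worker with probability proportional to its measured tokens/second."""
    chooser = rng if rng is not None else random
    workers = list(tokens_per_second)
    weights = [tokens_per_second[w] for w in workers]
    # random.choices draws with replacement according to the given weights.
    return chooser.choices(workers, weights=weights, k=1)[0]
```

With speeds of 90 and 10 tok/s, the faster worker is expected to receive about 90% of jobs over many draws, while the slower one still gets a share.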
Earnings
| Tier | Earnings/job | With 500K $QNTR staked |
|---|---|---|
| Pro (browser) | 70% of credit value | 80% |
| Max (native) | 70% of credit value | 80% |
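Max jobs pay more per job because the credit cost is higher: at the listed prices a Max-tier request costs 15 to 20 credits versus 10 for Pro, so the same 70% share is worth 1.5x to 2x as much. Native workers therefore earn more per job than browser workers. Earnings are paid in USDC and are withdrawable to any Solana wallet.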
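Per-job earnings follow from the credit cost and the worker's share. The sketch below assumes the payout is simply credits times $0.01 times the share, with the 80% rate applying at 500,000 staked $QNTR or more. `worker_payout_usd` is a hypothetical helper, not part of the worker CLI.

```python
from decimal import Decimal

CREDIT_USD = Decimal("0.01")
BASE_SHARE = Decimal("0.70")
BOOSTED_SHARE = Decimal("0.80")
BOOST_THRESHOLD_QNTR = 500_000


def worker_payout_usd(job_credits: int, staked_qntr: int = 0) -> Decimal:
    """USDC a worker earns for one job, given the job's credit cost and the worker's stake."""
    share = BOOSTED_SHARE if staked_qntr >= BOOST_THRESHOLD_QNTR else BASE_SHARE
    return job_credits * CREDIT_USD * share
```

For a 10-credit quantir-pro job this gives $0.07, or $0.08 with the staking boost; a 15-credit quantir-max job gives $0.105 or $0.12.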
Analytics Features
Quantir's analytics layer sits on top of the inference network, combining live on-chain data with uncensored AI reasoning to provide crypto intelligence.
AI Pattern Recognition
Detects trading patterns and anomalies across Solana tokens. Identifies accumulation phases, distribution patterns, wash trading, and momentum shifts using real-time transaction analysis.
Smart Wallet Tracking
Monitor wallets with consistently profitable trades. The system identifies high-performing wallets, tracks their positions in real-time, and alerts you to new entries and exits.
Whale and Insider Alerts
Early detection of large holder movements. Get notified when whales accumulate, distribute, or when insider wallets (team, early investors) begin moving tokens.
Sentiment Analysis
Real-time scoring combining social signals (Twitter, Telegram, Discord) with on-chain metrics (volume, holder count, transaction frequency) to produce actionable sentiment indicators.
Uncensored Analysis
Unlike corporate AI tools that refuse to discuss certain tokens or strategies, Quantir has no content filters on financial analysis. Ask about any token, any strategy, any risk level. The AI provides analysis without moral judgment — the decision is always yours.
Infrastructure
Quantir's inference is powered by c0mpute.ai — a decentralized GPU network where regular people contribute compute power and earn USDC.
Why Decentralized Inference
- Privacy — No central server stores your prompts or logs your activity
- Censorship resistance — No corporate policy layer filtering responses
- Cost efficiency — Distributed GPU supply keeps inference costs low
- Resilience — No single point of failure; network auto-heals
How c0mpute Works
c0mpute runs a Node.js orchestrator that maintains a registry of GPU workers. When Quantir sends an inference request, the orchestrator selects an available worker, routes the job, and streams tokens back. Workers are anonymous — they process prompts without knowing who sent them.
Worker Types
| Type | Hardware | Model | VRAM |
|---|---|---|---|
| Browser (Pro) | WebGPU | 8B params | ~6GB |
| Native (Max) | CUDA/Metal/Vulkan | 27B params | 20GB+ |
| Image | Dedicated GPU | Chroma1-HD (ComfyUI) | 12GB+ |
Network Stats
The orchestrator broadcasts live network statistics every 5 seconds including active worker count, average tok/s, jobs processed, and network utilization. These stats are visible on the dashboard.
SDKs and Frameworks
Quantir's API is fully OpenAI-compatible. Any library or tool that supports OpenAI works out of the box — just change the base URL and API key.
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    base_url="https://quantir.tech/api/v1",
    api_key="sk-quantir-..."
)

# Streaming
stream = client.chat.completions.create(
    model="quantir-max",
    messages=[{"role": "user", "content": "Analyze SOL momentum"}],
    stream=True
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
Node.js (OpenAI SDK)
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://quantir.tech/api/v1',
  apiKey: 'sk-quantir-...'
});

const stream = await client.chat.completions.create({
  model: 'quantir-max',
  messages: [{ role: 'user', content: 'Top accumulating wallets' }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
Vercel AI SDK
import { createOpenAI } from '@ai-sdk/openai';
import { streamText } from 'ai';

const quantir = createOpenAI({
  baseURL: 'https://quantir.tech/api/v1',
  apiKey: 'sk-quantir-...'
});

const result = await streamText({
  model: quantir('quantir-max'),
  prompt: 'What tokens are trending on Solana?'
});
LangChain (Python)
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://quantir.tech/api/v1",
    api_key="sk-quantir-...",
    model="quantir-max"
)
response = llm.invoke("Summarize whale activity on SOL/USDC")
cURL
curl https://quantir.tech/api/v1/chat/completions \
  -H "Authorization: Bearer sk-quantir-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "quantir-max",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'
Privacy and Security
Privacy is not a feature — it is the architecture. Quantir and c0mpute are designed so that user data cannot be retained even if someone wanted to.
Guarantees
- No prompt storage — Prompts are processed in-memory and discarded after response
- No logging — Neither Quantir nor c0mpute log request content
- Worker anonymity — Workers see the prompt but not who sent it
- User anonymity — Wallet-based auth, no email/identity required
- No training — Your data is never used to train or fine-tune models
Authentication
Quantir uses wallet-based authentication via Privy. Connect any Solana wallet to create an account. API keys are generated from your settings page and can be revoked at any time.
Data Flow
User
→ [HTTPS/TLS] → Quantir API Gateway
→ [adds system prompt + context]
→ [HTTPS/TLS] → c0mpute Orchestrator
→ [Socket.io] → GPU Worker (processes in RAM)
→ Tokens streamed back ← same path
→ All in-memory data discarded
What We Do Store
- Wallet address (for auth and billing)
- Credit balance and transaction history (amounts only, not content)
- API key hashes (not the keys themselves)
We explicitly do not store: prompts, responses, conversation history, IP addresses, or analytics cookies.
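Storing key hashes rather than keys can be sketched as follows. The hashing scheme Quantir actually uses is not stated here. SHA-256, the 32-byte random suffix, and the function names are assumptions chosen to illustrate the idea.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> tuple[str, str]:
    """Generate a key to show the user once, plus the hash to persist."""
    key = "sk-quantir-" + secrets.token_urlsafe(32)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return key, digest  # only `digest` would be written to storage


def verify_api_key(presented_key: str, stored_digest: str) -> bool:
    """Check a presented key against the stored hash in constant time."""
    candidate = hashlib.sha256(presented_key.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Under this scheme a leaked database exposes only digests, and revoking a key is a matter of deleting its stored hash.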