Quantir Documentation

Quantir is a crypto intelligence platform powered by decentralized AI inference. We provide analytics tools for Solana traders and an OpenAI-compatible API for developers, all running on c0mpute's distributed GPU network.

Analytics Platform

AI-powered token analysis, wallet tracking, and sentiment scoring for Solana.

Inference API

OpenAI-compatible endpoint. Private, uncensored, flat-rate per request.

GPU Network

Architecture

Quantir's architecture is split into two layers: the application layer (analytics, dashboard, chat) and the inference layer (powered by c0mpute's decentralized network).

┌──────────────────────────────────────────────┐
│              Quantir Application              │
│   Dashboard  │  Chat AI  │  API Gateway      │
└──────────────────────┬───────────────────────┘
                       │ HTTPS (OpenAI protocol)
              ┌────────▼────────┐
              │  c0mpute.ai     │
              │  Orchestrator   │
              └────────┬────────┘
                       │ Socket.io
        ┌──────────────┼──────────────┐
        │              │              │
   ┌────▼────┐   ┌────▼────┐   ┌────▼────┐
   │ Browser  │   │ Native  │   │  Image  │
   │ Workers  │   │ Workers │   │ Workers │
   │ (WebGPU) │   │ (CUDA)  │   │(ComfyUI)│
   └──────────┘   └─────────┘   └─────────┘

Request Flow

1. User sends a query via Quantir dashboard or API
2. Quantir backend adds system prompts, on-chain context, and tool definitions
3. Request is forwarded to c0mpute's orchestrator via OpenAI-compatible protocol
4. Orchestrator selects a GPU worker (weighted-random by speed)
5. Worker runs inference and streams tokens back
6. Quantir post-processes and delivers the response
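
The step where the backend enriches the request before forwarding it might look roughly like the sketch below. Everything in it is an illustrative assumption: the function name `augment_request`, the choice to inject on-chain context as a second system message, and the merge order of tool definitions are not taken from Quantir's implementation.

```python
from typing import Any, Dict, List, Optional


def augment_request(
    user_request: Dict[str, Any],
    system_prompt: str,
    onchain_context: str,
    tools: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Return a new OpenAI-style payload with platform context prepended (hypothetical)."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "system", "content": "On-chain context:\n" + onchain_context},
        *user_request["messages"],  # the caller's own messages stay in order
    ]
    # Copy the request so the caller's dict is never mutated.
    upstream = {**user_request, "messages": messages}
    if tools:
        # Platform tools are appended after any tools the caller supplied.
        upstream["tools"] = list(user_request.get("tools", [])) + list(tools)
    return upstream
```

The function builds a new payload instead of editing the incoming one, which keeps the original request available for billing or retries.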

Key Properties

Stateless inference — No prompts are stored on any server
Worker anonymity — Workers process data without knowing the sender
Fault-tolerant — Jobs auto-retry on different workers if one fails
Horizontally scalable — More workers join, more throughput available
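
The fault-tolerance property can be pictured as a retry loop that never reuses a worker that has already failed. This is a minimal sketch, not c0mpute's orchestrator code: `run_with_failover`, `WorkerError`, the worker IDs, and the `max_attempts` limit are all hypothetical names chosen for illustration.

```python
import random
from typing import Callable, Optional, Sequence


class WorkerError(Exception):
    """Raised by a worker callable when a job fails (hypothetical)."""


def run_with_failover(
    workers: Sequence[str],
    run_job: Callable[[str], str],
    max_attempts: int = 3,
    rng: Optional[random.Random] = None,
) -> str:
    """Try the job on up to max_attempts distinct workers."""
    rng = rng or random.Random()
    remaining = list(workers)
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        if not remaining:
            break
        worker = rng.choice(remaining)
        remaining.remove(worker)  # a failed worker is not tried again
        try:
            return run_job(worker)
        except WorkerError as exc:
            last_error = exc
    raise RuntimeError("all attempted workers failed") from last_error
```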

API Reference

Base Configuration

Base URL:  https://quantir.tech/api/v1
API Key:   sk-quantir-... (generate at quantir.tech/settings)

Headers:
  Authorization: Bearer sk-quantir-...
  Content-Type: application/json
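
For callers who do not want an SDK, the configuration above can be applied with the Python standard library alone. The sketch below only builds the request object; `build_request` is a hypothetical helper name, and the path handling assumes endpoints are appended to the base URL as in the cURL example later in this document.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://quantir.tech/api/v1"


def build_request(
    path: str, api_key: str, payload: Optional[Dict[str, Any]] = None
) -> urllib.request.Request:
    """Build a GET (no payload) or POST (JSON payload) request with auth headers."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=BASE_URL + path,
        data=data,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST" if payload is not None else "GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that call is left out here so the sketch runs without network access.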

POST /v1/chat/completions

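Create a chat completion. Supports streaming and tool calling on all models, and vision on quantir-max and quantir-max-think. Endpoint paths in this reference are relative to https://quantir.tech/api, so the full URL is https://quantir.tech/api/v1/chat/completions.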

// Request
{
  "model": "quantir-max",
  "messages": [
    { "role": "system", "content": "You are a crypto analyst." },
    { "role": "user", "content": "Analyze SOL/USDC momentum" }
  ],
  "stream": true,
  "temperature": 0.7,
  "max_tokens": 2048
}

// Response (non-streaming, i.e. when "stream" is false)
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "quantir-max",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Based on current on-chain data..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 156,
    "total_tokens": 198
  }
}
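
When `"stream": true` is set, the response arrives as server-sent events instead of a single JSON body. The sketch below assumes Quantir follows OpenAI's streaming conventions (lines prefixed with `data:`, a `[DONE]` sentinel, and text under `choices[].delta.content`); the document states only that streaming uses SSE, so treat those details as assumptions. `iter_stream_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comment lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel used by OpenAI-style APIs
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the same text that a non-streaming call would return in `choices[0].message.content`.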

POST /v1/chat/completions (with tools)

Function/tool calling for structured data retrieval.

{
  "model": "quantir-max",
  "messages": [
    { "role": "user", "content": "Get top 5 wallets buying SOL today" }
  ],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_top_wallets",
      "description": "Fetch top wallets by volume",
      "parameters": {
        "type": "object",
        "properties": {
          "token": { "type": "string" },
          "limit": { "type": "integer" },
          "timeframe": { "type": "string", "enum": ["1h", "24h", "7d"] }
        },
        "required": ["token"]
      }
    }
  }]
}
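
After a request like the one above, the model may answer with a tool call instead of text, and the client is expected to run the function and send the result back. The sketch below assumes the response uses OpenAI's `tool_calls` shape, with arguments delivered as a JSON string. The local `get_top_wallets` body is a placeholder, and `run_tool_calls` is a hypothetical helper name.

```python
import json
from typing import Any, Callable, Dict, List


def get_top_wallets(token: str, limit: int = 5, timeframe: str = "24h") -> List[Dict[str, Any]]:
    """Placeholder implementation; a real one would query on-chain data."""
    return [{"token": token, "rank": i + 1, "timeframe": timeframe} for i in range(limit)]


# Registry mapping tool names declared in the request to local callables.
TOOLS: Dict[str, Callable[..., Any]] = {"get_top_wallets": get_top_wallets}


def run_tool_calls(message: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Execute each requested tool and build the `tool` role messages to send back."""
    results = []
    for call in message.get("tool_calls") or []:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results
```

The returned messages would be appended to the conversation, after the assistant message that requested them, in a follow-up call to the same endpoint.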

POST /v1/images/generations

Generate images. 20 credits per image.

// Request
{
  "prompt": "cyberpunk trading dashboard, dark theme",
  "n": 1,
  "size": "1024x1024"
}

// Response
{
  "data": [{
    "url": "https://quantir.tech/images/gen-abc123.png"
  }]
}

GET /v1/balance

Check your current credit balance.

// Response
{
  "credits": 450,
  "usd_equivalent": 4.50
}

Error Codes

Code	Meaning
401	Invalid or missing API key
402	Insufficient credits
429	Rate limit exceeded (60 req/min)
503	No workers available, retry in a moment
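
One reasonable way to handle these codes is to retry 429 and 503 with exponential backoff and to fail immediately on 401 and 402, since retrying cannot fix a bad key or an empty balance. The sketch below is a suggestion, not official client behavior: `call_with_retry`, the retry count, and the delay schedule are assumptions, and `send` stands in for whatever function performs the HTTP call.

```python
import time
from typing import Callable, Tuple

RETRYABLE = {429, 503}
FATAL = {401: "Invalid or missing API key", 402: "Insufficient credits"}


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it succeeds; send() returns (status_code, body)."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if 200 <= status < 300:
            return body
        if status in FATAL:
            raise PermissionError(f"{status}: {FATAL[status]}")
        if status not in RETRYABLE:
            raise RuntimeError(f"unexpected status {status}")
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the defaults
    raise TimeoutError(f"gave up after {max_retries + 1} attempts (last status {status})")
```

Injecting `sleep` as a parameter keeps the function easy to test without real delays.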

Models

Model	Credits	Workers	Best For
`quantir-pro`	10	Browser (WebGPU)	Fast queries, simple analysis
`quantir-max`	15	Native (CUDA)	Deep analysis, complex reasoning
`quantir-max-think`	20	Native (CUDA)	Multi-step research, extended reasoning
`code`	15	Native (CUDA)	Code generation, smart contract analysis

Capabilities

Chat completions — All models
Streaming — All models (SSE format)
Tool/function calling — All models
Vision (base64 images) — quantir-max, quantir-max-think
Web search — quantir-max, quantir-max-think (model-driven tool call)
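
A vision request embeds the image as base64 inside the message content. The sketch below uses OpenAI's content-part format with a `data:` URL, on the assumption that Quantir accepts the same shape; `build_vision_message` is a hypothetical helper, and the image bytes are a stand-in for a real file.

```python
import base64
from typing import Any, Dict


def build_vision_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> Dict[str, Any]:
    """Build a user message carrying text plus one base64-encoded image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


# Stand-in bytes; in practice read them from a file opened in binary mode.
fake_png = b"\x89PNG\r\n\x1a\n"

payload = {
    "model": "quantir-max",  # vision is listed for quantir-max and quantir-max-think
    "messages": [build_vision_message("Describe the trend in this chart", fake_png)],
}
```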

Model Details

quantir-pro runs on browser-based WebGPU workers using optimized 8B parameter models. Response times are typically 1–3 seconds for short queries. Best for quick lookups and simple analysis.

quantir-max runs on native GPU workers with 27B+ parameter models via ollama. Supports extended context windows and complex multi-turn conversations. Recommended for deep research.

quantir-max-think enables extended chain-of-thought reasoning. The model internally deliberates before responding, producing more thorough analysis at higher latency.

Pricing

1 credit = $0.01 USDC. Flat per-request billing — you know the cost before sending.

Credit Costs

Model	Credits/request	USD/request
quantir-pro	10	$0.10
quantir-max	15	$0.15
quantir-max-think	20	$0.20
code	15	$0.15
Image generation	20	$0.20
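
Because billing is flat per request, the cost of a workload can be computed before sending anything. The sketch below encodes the table above; the helper names and the `"image"` key used for image generation are illustrative choices, not API identifiers.

```python
from decimal import Decimal
from typing import Dict

CREDIT_USD = Decimal("0.01")  # 1 credit = $0.01 USDC

CREDITS_PER_REQUEST = {
    "quantir-pro": 10,
    "quantir-max": 15,
    "quantir-max-think": 20,
    "code": 15,
    "image": 20,  # illustrative key for image generation
}


def cost_usd(requests_by_kind: Dict[str, int]) -> Decimal:
    """Total USD cost for a planned mix of requests."""
    credits = sum(CREDITS_PER_REQUEST[kind] * count for kind, count in requests_by_kind.items())
    return credits * CREDIT_USD


def requests_affordable(balance_credits: int, kind: str) -> int:
    """How many requests of one kind a credit balance covers."""
    return balance_credits // CREDITS_PER_REQUEST[kind]
```

For example, ten `quantir-max` requests plus two images come to 190 credits, or $1.90, and a balance of 450 credits covers 30 `quantir-max` requests.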

Why Flat Pricing

Traditional AI APIs charge per token, which makes costs unpredictable, especially for crypto analysis, which often requires large context windows. With Quantir, every request has a fixed cost regardless of input or output length. No token math, no bill shock.

Free Tier

Basic analytics dashboard access
Limited AI queries per day
Public sentiment feed
No credit card required

Buying Credits

Credits are purchased with USDC on Solana. Minimum purchase: 100 credits ($1.00). Credits never expire. Balance is visible at GET /v1/balance.

$QNTR Token

$QNTR is a value-accrual token on Solana. It is not required to use the platform — all services are paid in USDC credits. $QNTR exists to capture and distribute protocol revenue.

Contract Address: Coming soon

Treasury Sources

Source	% to Treasury
Compute margin (workers keep 70%)	30%
$QNTR trading fees	35%

Daily Treasury Split

50% — Buyback and burn $QNTR (deflationary pressure)
50% — Distributed to stakers in USDC (real yield)
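
The two tables above combine into simple arithmetic. The sketch below reads the percentages as "30% of compute revenue and 35% of $QNTR trading fees go to the treasury, which is then split evenly each day". That reading, the function name, and the sample figures are assumptions made for illustration.

```python
from decimal import Decimal
from typing import Dict

COMPUTE_MARGIN = Decimal("0.30")     # share of compute revenue kept after workers' 70%
TRADING_FEE_SHARE = Decimal("0.35")  # share of $QNTR trading fees sent to treasury


def daily_split(compute_revenue_usdc: Decimal, trading_fees_usdc: Decimal) -> Dict[str, Decimal]:
    """Compute the daily treasury inflow and its 50/50 allocation."""
    treasury = compute_revenue_usdc * COMPUTE_MARGIN + trading_fees_usdc * TRADING_FEE_SHARE
    buyback = treasury / 2
    return {
        "treasury": treasury,
        "buyback_and_burn": buyback,
        "staker_rewards": treasury - buyback,
    }
```

With hypothetical inputs of 1,000 USDC in compute revenue and 200 USDC in trading fees, the treasury receives 300 + 70 = 370 USDC, so 185 USDC goes to buyback-and-burn and 185 USDC to stakers.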

Staking Benefits

Benefit	Details
USDC Rewards	Daily at 15:00 UTC, proportional to stake size
Free Credits	Daily allowance based on stake (use-it-or-lose-it, active users only)
Worker Boost	Stake 500K+ $QNTR → earn 80% per job instead of 70%

Staking Mechanics

24-hour maturity period on new deposits
Unstake anytime — newest deposits withdrawn first (LIFO)
Auto-compound option: USDC reward → buy $QNTR → restake
Self-custody vault — your keys, your tokens, always
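
The maturity and LIFO rules can be modeled with a list of timestamped deposits. This is a sketch of the stated rules only, not the vault's actual logic: the `Deposit` type, both function names, and the assumption that a deposit counts toward rewards only after 24 hours are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

MATURITY = timedelta(hours=24)


@dataclass
class Deposit:
    amount: int
    deposited_at: datetime


def matured_stake(deposits: List[Deposit], now: datetime) -> int:
    """Sum of deposits that have passed the 24-hour maturity period."""
    return sum(d.amount for d in deposits if now - d.deposited_at >= MATURITY)


def unstake_lifo(deposits: List[Deposit], amount: int) -> List[Deposit]:
    """Return the deposits left after withdrawing `amount`, newest deposits first."""
    remaining = sorted(deposits, key=lambda d: d.deposited_at)  # oldest ... newest
    to_withdraw = amount
    while to_withdraw > 0:
        if not remaining:
            raise ValueError("amount exceeds total stake")
        newest = remaining.pop()
        if newest.amount > to_withdraw:
            # Partially consume the newest deposit and keep its original timestamp.
            remaining.append(Deposit(newest.amount - to_withdraw, newest.deposited_at))
            to_withdraw = 0
        else:
            to_withdraw -= newest.amount
    return remaining
```

Withdrawing newest-first means a partial unstake leaves the oldest, already matured deposits untouched.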

Why Not Inflationary Emissions

Most crypto yield comes from token inflation (printing new tokens to pay stakers). $QNTR rewards are paid in USDC from actual protocol revenue. The token supply only decreases over time via buyback-and-burn. This is real yield, not dilution disguised as APY.

Running a Worker

Workers contribute GPU compute to the network and earn USDC for every inference job. The network runs on c0mpute's infrastructure, so workers register with c0mpute's orchestrator.

Browser Worker (Pro Tier)

Requirements: Chrome/Edge with WebGPU support, ~6GB VRAM (dedicated or shared GPU)

1. Navigate to the worker page
2. Connect your Solana wallet
3. Click “Start Worker”; the model downloads (~4.3GB, cached after first run)
4. Keep the tab open to receive and process jobs

Browser workers run an optimized 8B parameter model via WebGPU. No installation required.

Native Worker (Max Tier)

Requirements: 20GB+ VRAM on an NVIDIA (CUDA), Apple Silicon (Metal), or AMD (Vulkan) GPU

# 1. Install ollama
curl -fsSL https://ollama.com/install.sh | sh

# 2. Pull the model (choose one)
ollama pull qwen3.5:27b        # General purpose (recommended)
ollama pull supergemma4:26b    # Alternative

# 3. Install and start the worker
npm i -g @quantir/worker
quantir-worker start --wallet <your-solana-address>

# Worker connects to orchestrator and starts receiving jobs

Worker Selection

The orchestrator uses weighted-random selection based on measured tokens/second. Faster workers receive proportionally more jobs, but work is spread across all active workers.
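
Weighted-random selection of this kind can be expressed in a few lines. The sketch below illustrates the technique, not the orchestrator's code (which the document describes as a Node.js service): `pick_worker`, the worker IDs, and the throughput figures are hypothetical.

```python
import random
from typing import Dict, Optional


def pick_worker(tokens_per_second: Dict[str, float], rng: Optional[random.Random] = None) -> str:
    """Pick one worker with probability proportional to its measured tokens/second."""
    rng = rng or random.Random()
    workers = list(tokens_per_second)
    weights = [tokens_per_second[w] for w in workers]
    return rng.choices(workers, weights=weights, k=1)[0]


# Hypothetical measurements: the faster worker should win about 90% of jobs.
measured = {"worker-fast": 90.0, "worker-slow": 10.0}
chosen = pick_worker(measured)
```

Because selection is random and not strictly "fastest first", slow workers still receive a share of jobs, which matches the statement that work is spread across all active workers.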

Earnings

Tier	Earnings/job	With 500K $QNTR staked
Pro (browser)	70% of credit value	80%
Max (native)	70% of credit value	80%

Max jobs pay 1.5–2x as much per job as Pro jobs, since they cost 15–20 credits instead of 10. Native workers therefore earn more per job than browser workers. Earnings are paid in USDC and withdrawable to any Solana wallet.
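
Per-job earnings follow directly from the credit cost of the job and the worker's share. The sketch below applies the figures stated in this document (1 credit = $0.01, a 70% base share, 80% with 500K+ $QNTR staked); the function and constant names are illustrative.

```python
from decimal import Decimal

CREDIT_USD = Decimal("0.01")
BASE_SHARE = Decimal("0.70")
BOOSTED_SHARE = Decimal("0.80")
BOOST_THRESHOLD = 500_000  # $QNTR staked to qualify for the worker boost


def earnings_per_job(job_credits: int, staked_qntr: int = 0) -> Decimal:
    """USDC earned by a worker for one job of the given credit cost."""
    share = BOOSTED_SHARE if staked_qntr >= BOOST_THRESHOLD else BASE_SHARE
    return job_credits * CREDIT_USD * share
```

A 10-credit `quantir-pro` job pays $0.07 at the base share, and a 15-credit `quantir-max` job pays $0.12 with the staking boost.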

Analytics Features

Quantir's analytics layer sits on top of the inference network, combining live on-chain data with uncensored AI reasoning to provide crypto intelligence.

AI Pattern Recognition

Detects trading patterns and anomalies across Solana tokens. Identifies accumulation phases, distribution patterns, wash trading, and momentum shifts using real-time transaction analysis.

Smart Wallet Tracking

Monitor wallets with consistently profitable trades. The system identifies high-performing wallets, tracks their positions in real-time, and alerts you to new entries and exits.

Whale and Insider Alerts

Early detection of large holder movements. Get notified when whales accumulate, distribute, or when insider wallets (team, early investors) begin moving tokens.

Sentiment Analysis

Real-time scoring combining social signals (Twitter, Telegram, Discord) with on-chain metrics (volume, holder count, transaction frequency) to produce actionable sentiment indicators.

Uncensored Analysis

Unlike corporate AI tools that refuse to discuss certain tokens or strategies, Quantir has no content filters on financial analysis. Ask about any token, any strategy, any risk level. The AI provides analysis without moral judgment — the decision is always yours.

Infrastructure

Quantir's inference is powered by c0mpute.ai — a decentralized GPU network where regular people contribute compute power and earn USDC.

Why Decentralized Inference

Privacy — No central server stores your prompts or logs your activity
Censorship resistance — No corporate policy layer filtering responses
Cost efficiency — Distributed GPU supply keeps inference costs low
Resilience — No single point of failure; network auto-heals

How c0mpute Works

c0mpute runs a Node.js orchestrator that maintains a registry of GPU workers. When Quantir sends an inference request, the orchestrator selects an available worker, routes the job, and streams tokens back. Workers are anonymous — they process prompts without knowing who sent them.

Worker Types

Type	Hardware	Model	VRAM
Browser (Pro)	WebGPU	8B params	~6GB
Native (Max)	CUDA/Metal/Vulkan	27B params	20GB+
Image	Dedicated GPU	Chroma1-HD (ComfyUI)	12GB+

Network Stats

The orchestrator broadcasts live network statistics every 5 seconds including active worker count, average tok/s, jobs processed, and network utilization. These stats are visible on the dashboard.

SDKs and Frameworks

Quantir's API is fully OpenAI-compatible. Any library or tool that supports OpenAI works out of the box — just change the base URL and API key.

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://quantir.tech/api/v1",
    api_key="sk-quantir-..."
)

# Streaming
stream = client.chat.completions.create(
    model="quantir-max",
    messages=[{"role": "user", "content": "Analyze SOL momentum"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Node.js (OpenAI SDK)

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://quantir.tech/api/v1',
  apiKey: 'sk-quantir-...'
});

const stream = await client.chat.completions.create({
  model: 'quantir-max',
  messages: [{ role: 'user', content: 'Top accumulating wallets' }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Vercel AI SDK

import { createOpenAI } from '@ai-sdk/openai';
import { streamText } from 'ai';

const quantir = createOpenAI({
  baseURL: 'https://quantir.tech/api/v1',
  apiKey: 'sk-quantir-...'
});

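const result = await streamText({
  model: quantir('quantir-max'),
  prompt: 'What tokens are trending on Solana?'
});

// Print text fragments as they arrive
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}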

LangChain (Python)

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://quantir.tech/api/v1",
    api_key="sk-quantir-...",
    model="quantir-max"
)

response = llm.invoke("Summarize whale activity on SOL/USDC")
print(response.content)

cURL

curl https://quantir.tech/api/v1/chat/completions \
  -H "Authorization: Bearer sk-quantir-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "quantir-max",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'

Privacy and Security

Privacy is not a feature; it is the architecture. Quantir and c0mpute are designed so that prompts and responses cannot be retained even if someone wanted to.

Guarantees

No prompt storage — Prompts are processed in-memory and discarded after response
No logging — Neither Quantir nor c0mpute log request content
Worker anonymity — Workers see the prompt but not who sent it
User anonymity — Wallet-based auth, no email/identity required
No training — Your data is never used to train or fine-tune models

Authentication

Quantir uses wallet-based authentication via Privy. Connect any Solana wallet to create an account. API keys are generated from your settings page and can be revoked at any time.

Data Flow

User → [HTTPS/TLS] → Quantir API Gateway
  → [adds system prompt + context]
  → [HTTPS/TLS] → c0mpute Orchestrator
  → [Socket.io] → GPU Worker (processes in RAM)
  → Tokens streamed back ← same path
  → All in-memory data discarded

What We Do Store

Wallet address (for auth and billing)
Credit balance and transaction history (amounts only, not content)
API key hashes (not the keys themselves)

We explicitly do not store: prompts, responses, conversation history, IP addresses, or analytics cookies.
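
Storing only a hash of each API key means a leaked database does not expose usable keys. The sketch below shows the general pattern with SHA-256 and a constant-time comparison. The hash algorithm, the key length, and the function names are assumptions, since the document does not say how Quantir hashes keys.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> str:
    """Generate a random key with the documented prefix (random part is illustrative)."""
    return "sk-quantir-" + secrets.token_urlsafe(32)


def hash_api_key(key: str) -> str:
    """Hex digest to persist; the plaintext key is shown to the user once and not stored."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Compare in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

A fast hash is a reasonable fit here because the keys are long random strings with no low-entropy secret to brute-force, unlike user-chosen passwords.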