Run frontier MoE models for $5/month. Volunteer your GPU, earn real money. A distributed inference network where every node carries a piece of the load.
The router receives your request, runs the gating network locally, and activates only the 2-6 relevant experts across the network. It then aggregates the results and returns them to you.
Each volunteer GPU hosts 1-3 expert weight files (~1.5GB each for Mixtral). A single RTX 3090 can serve inference requests while you game or work.
Every node holds a cryptographic identity from Bedrock. Consent tokens gate which models a node will serve. Every inference is audit-logged.
$5/month gives you a token budget. Tokens debit per inference. Hosts earn a 70% share of what you spend, proportional to compute contributed.
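A minimal sketch of how per-inference debiting against a token budget could work. The `TokenBudget` class and its method names are hypothetical and not part of the Sawyer API; the 500K figure is the Explorer allowance quoted on this page.

```python
class TokenBudget:
    """Hypothetical per-account token ledger (illustration only)."""

    def __init__(self, tokens: int) -> None:
        self.remaining = tokens

    def debit(self, tokens_used: int) -> int:
        """Subtract the tokens consumed by one inference; return the balance."""
        if tokens_used < 0:
            raise ValueError("tokens_used must be non-negative")
        if tokens_used > self.remaining:
            raise ValueError("insufficient token budget")
        self.remaining -= tokens_used
        return self.remaining


# Explorer tier: 500K tokens for the month.
budget = TokenBudget(500_000)
balance_after_first_call = budget.debit(1_200)
```

Rejecting an over-budget request up front, rather than letting the balance go negative, is one possible policy; the actual behavior at zero balance is not stated here.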
Adaptive routing scores candidate nodes by load (weighted 60%) and latency (weighted 40%). On timeout, it falls back to a redundant copy of the expert. No single point of failure.
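One way to picture the 60/40 weighting and the timeout fallback is the sketch below. It assumes load is reported as a fraction of capacity and latency is normalized against a 200 ms ceiling; the `Replica` type, the `send` callback, and the normalization are illustrative assumptions, not Sawyer's actual scoring.

```python
from dataclasses import dataclass

LOAD_WEIGHT = 0.6     # load counts for 60% of the score
LATENCY_WEIGHT = 0.4  # latency counts for 40% of the score


@dataclass
class Replica:
    node_id: str
    load: float        # assumed: fraction of capacity in use, 0.0 to 1.0
    latency_ms: float  # assumed: recent average forward-pass latency


def score(replica: Replica, max_latency_ms: float = 200.0) -> float:
    """Lower is better. Latency is clamped to an assumed 200 ms ceiling."""
    norm_latency = min(replica.latency_ms / max_latency_ms, 1.0)
    return LOAD_WEIGHT * replica.load + LATENCY_WEIGHT * norm_latency


def rank_replicas(replicas):
    """Best candidate first; the rest act as fallbacks."""
    return sorted(replicas, key=score)


def call_with_fallback(replicas, send):
    """Try replicas in ranked order; `send` is expected to raise TimeoutError."""
    for replica in rank_replicas(replicas):
        try:
            return send(replica)
        except TimeoutError:
            continue  # fall back to the next redundant expert
    raise RuntimeError("all replicas timed out")
```

With replicas at (load 0.9, 50 ms), (0.2, 150 ms), and (0.5, 100 ms), the second one ranks first: a lightly loaded node can win even when it is slower, because load carries the larger weight.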
Q4_K_M quantization fits frontier models on consumer hardware. Mixtral-8x7B expert = ~1.5GB. Full model = ~24GB. Each piece runs independently.
Pick a tier. Get an API key instantly. Explorer starts at $5/month with 500K tokens.
Use the Sawyer API like any inference endpoint. The router receives your prompt and token embedding.
The gating network identifies which 2-6 experts are needed. Only those experts activate. The rest stay dormant.
Expert nodes run forward passes concurrently. Average latency: 50-200ms per expert on consumer hardware.
The router aggregates expert outputs and returns your response. Tokens are debited from your budget.
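The select-then-aggregate flow in the steps above can be sketched in a few lines. This is a generic top-k gating illustration in plain Python, assuming the gate emits one logit per expert and that the selected experts' weights are renormalized with a softmax; the function names are hypothetical and the real router may differ.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_experts(gate_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def aggregate(selected, expert_outputs):
    """Weighted sum of the output vectors returned by the active experts."""
    dim = len(expert_outputs[selected[0][0]])
    out = [0.0] * dim
    for idx, weight in selected:
        for j, value in enumerate(expert_outputs[idx]):
            out[j] += weight * value
    return out


# Eight experts, two active per token (the Mixtral 8x7B configuration).
logits = [0.1, 2.0, -1.0, 2.0, 0.5, 0.0, 0.0, 0.0]
chosen = select_experts(logits, k=2)
combined = aggregate(chosen, {1: [1.0, 2.0], 3: [3.0, 4.0]})
```

Only the experts in `chosen` would receive a request over the network; the other six never see the token, which is what keeps per-request compute small.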
70% of what you spend goes to the nodes that served you. Monthly or quarterly payouts via Stripe Connect.
70% of every token you serve goes to you. The other 30% sustains the network routing and infrastructure.
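As a worked example of the 70/30 split, the sketch below divides the host share among nodes in proportion to compute contributed. It assumes amounts are tracked in integer cents and that "compute" is some unit count per node (for example, expert forward passes served); both are assumptions, as is the `split_payout` name.

```python
HOST_SHARE_PERCENT = 70  # hosts receive 70% of user spend; 30% stays with the network


def split_payout(spend_cents: int, compute_by_node: dict) -> dict:
    """Split the host pool across nodes in proportion to compute contributed.

    Integer division rounds each payout down; any leftover cents are
    assumed to stay with the network.
    """
    pool = spend_cents * HOST_SHARE_PERCENT // 100
    total_units = sum(compute_by_node.values())
    if total_units == 0:
        return {}
    return {
        node: pool * units // total_units
        for node, units in compute_by_node.items()
    }


# A $5.00 subscription served by two nodes at a 3:1 compute ratio.
payouts = split_payout(500, {"node-a": 3, "node-b": 1})
```

Here the host pool is 350 cents, so `node-a` receives 262 cents and `node-b` 87, with one cent lost to rounding.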
Monthly payout threshold. Or $25 for quarterly. Via Stripe Connect. 1099-K tax reporting handled automatically.
Register your GPU, download expert weights, and start serving. Sawyer handles routing, payment, and health monitoring.
Stripe Connect Express handles onboarding, bank verification, and tax reporting. Your personal data stays with Stripe.
| Model | Params | Experts | Active/Token | Q4 Size | Expert Size |
|---|---|---|---|---|---|
| Mixtral 8x7B | 46.7B | 8 | 2 | ~24 GB | ~1.5 GB |
| DeepSeek-V2 Lite | 15.7B | 64 (shared) | 6 | ~9 GB | varies |
| Qwen2.5 7B MoE | 14.3B | 60 | 4 | ~7 GB | varies |
# Create your account
$ sawyer account create --tier explorer
# Register your GPU as a host node
$ sawyer provider register --email you@example.com --name "MyNode"
# Start serving inference requests
$ sawyer serve --gpu
Sawyer Node Started
Node: sawyer-node-abc123
Experts: mixtral-8x7b/e2, mixtral-8x7b/e5
GPU: NVIDIA RTX 3090 (24 GB)
Status: Healthy
Earnings: $0.00
No credit card required for the Explorer tier. Cancel anytime. Your unused tokens roll over.