SHARK PROJECT SUMMARY

Current Status & Milestones

  1. Token SHIVER live on Base · contract address pending publication.
  2. Initial Shiver mesh: 4 cold-running nodes across EU + NA, serving inference for a private alpha group of ~30 builders.
  3. First open model (SHARK Llama 3.3 70B) targeted for HuggingFace publication this quarter.
  4. Public API + node-client release timed with the open model drop.
  5. OpenAI-compatible /v1/chat endpoint already serving the alpha — drop-in replacement for any app using the OpenAI Python or JS SDK.
  6. Cold-tuning pipeline (our internal fine-tune toolchain) validated on Mistral 24B; multi-shard distributed training works across 8+ heterogeneous GPUs.

Summary

SHARK is a distributed AI compute and model-distribution project on Base. It exists because three things are converging at the same time, and the existing landscape doesn't solve all three at once:

The Shiver Network solves all three: cold-running consumer GPUs serve open uncensored models at sub-second latency, paid in $SHIVER. Builders pay roughly 30% less than centralized providers. Node operators get a clean way to monetize hardware that would otherwise burn power for nothing.

The brand is intentional: a shiver is the English collective noun for sharks, and it also literally describes how our nodes are configured to run — cold, thermally capped, never thrashing the silicon.

Why SHARK, why now

Three forces are reshaping the AI compute market right now:

  1. Hyperscaler capacity is capped. AWS, Azure, and GCP are sold out of H100s in most regions for the foreseeable future. Lead times on the next-gen B200 series are 6+ months. Centralized providers are price-takers on hardware they can't even acquire.
  2. Closed-frontier alignment is tightening. Every major lab has narrowed acceptable outputs, refused entire categories of queries, and added hidden system prompts that bias toward their preferred answers. For developers building serious products, this is a steerability crisis.
  3. Consumer GPUs are everywhere. Estimates put deployed RTX 30/40-series cards alone at ~80 million worldwide. Most of them sit idle ~22 hours per day. The aggregate compute dwarfs every hyperscaler combined.

SHARK pulls those threads together into a network where consumer hardware serves uncensored open weights, paid in a native token, with onchain attestation for billing. None of those pieces are new individually. Combined and shipped, they're a different shape of company.

SHARK Models

SHARK is trained on a pipeline of leading open-source frontier models with the goal of customizing alignment to the user — not to a corporate policy. We start from Meta Llama 3.3, Mistral, and Google Gemma checkpoints and apply our internal toolchain called cold-tuning which modifies the base model with a custom dataset emphasizing instruction-following over refusal.

This is done through a process called "fine-tuning" which modifies the base model using a custom dataset. We have developed a pipeline to create Unaligned & Unbiased versions of the best open-source AI models that are available to the public. Every weight is published to HuggingFace under shark-shiver/ — anyone can audit, fork, or self-host. We do not gate model access behind the Shiver Network; we just believe the Shiver is the cheapest place to run them.

Three families are currently in active development:

Shiver Network — Distributed Inference & Training

We have created the Shiver Network for distributed AI compute. Anyone running the SHARK client contributes GPU cycles in exchange for $SHIVER payouts, denominated per token of inference served. Onboarding takes around 10 minutes; the client auto-detects GPU and proposes a thermal target you can adjust before running.

Architecture

The Shiver is a three-layer system:

  1. Edge nodes — the GPU operators. Each runs the SHARK client, which advertises capacity (model slugs supported, throughput, thermal headroom) to the routing layer over a gossip protocol.
  2. Routing layer — a thin peer-to-peer overlay that maps an inbound inference request to the best-matching idle node based on model availability, latency, and reputation score.
  3. Settlement layer — onchain on Base. Every served token generates a signed attestation from both the caller and the serving node; these are batched and submitted to the rewards contract per epoch.

There is no central "shiver.cloud" inference server — api.shrk.cloud is a thin gateway that resolves to the same routing layer any peer can speak. Self-hosting the gateway is a documented and supported path.

Security model

Three threats matter:

Roadmap

  1. Q3 — Alpha Shiver opens to public. SHARK Llama 3.3 70B drops on HuggingFace.
  2. Q4 — Node client + onchain payouts go live on Base. $SHIVER market opens.
  3. Q1 — Synthetic data pipeline online. First public dataset release. Multi-tenant on-node isolation.
  4. Q2 — SHARK Mistral 24B Cold drops alongside the Telegram + Discord bot. Vision API beta.
  5. Q3+1 — Distributed training MVP (multi-node fine-tunes paid in $SHIVER).

Economics & $SHIVER value accrual

$SHIVER is the unit of account inside the Shiver. Apps pay for inference in $SHIVER (or auto-converted USDC at the gateway); node operators receive $SHIVER. The protocol takes a 1% fee on inference revenue and uses it to buy-and-burn $SHIVER from the open market, creating a continuous value loop tied directly to compute demand.

Supply schedule is deliberately simple:

Token: SHIVER · Supply: 500,000,000 · Chain: Base · Initial LP fee: 0.3%

Governance

Onchain parameter changes (protocol fee, epoch length, slashing thresholds, model allowlist) move through a simple veToken vote: lock $SHIVER for 1–4 years, vote weight proportional to lock duration. Smart-contract upgrades require a 7-day timelock visible onchain. No emergency pause; if the contracts break, we ship new ones and migrate.

Open source

The SHARK node client, the routing layer, and the settlement contracts are all open-source under the Apache 2.0 license. Model weights are released under their respective base-model licenses (Llama Community License for the Llama variants, Apache for Mistral/Gemma variants). No proprietary fork of any base model — every artifact is reproducible by any contributor.

FAQ

Is this just another decentralized inference network? The market is real but most projects in it are token-first, product-second. We're shipping the model + the network + the chat interface together — judge it on what's online, not the roadmap.

Why Base instead of Solana / Ethereum mainnet? Cheap settlement, Coinbase-backed sequencer, growing onchain identity tooling, and the most active retail flow of any L2 today. Solana's L1 fees are competitive but Base's tooling for production apps is years ahead.

What about regulation? Models are open weights, the network is decentralized, payouts are software-mediated. We don't host or moderate content. Operators are responsible for their own jurisdiction. The same way Ethereum nodes don't get sued because someone deployed a meme coin.

Can I run a node behind NAT? Yes — the routing layer holepunches via STUN. No port forwarding required.

Can I self-host the gateway? Yes — the gateway is a thin proxy; spinning your own up is one binary + a config file. Documented in Docs.