# Porten

> OpenAI-compatible LLM inference, routed across a fleet of community and self-hosted GPU nodes. Point any OpenAI SDK at one base URL, pick a model, and requests route to a node that can serve it — loading the model on demand if needed. EU-sovereign by design.

This file indexes the developer documentation. Append `.md` to any docs URL for the raw Markdown of that page. The full documentation set is also available as a single file at https://porten.ai/llms-full.txt.

## Get started

- [Overview](https://porten.ai/docs/overview.md): What Porten is — one OpenAI-compatible API in front of a fleet of community and self-hosted GPU nodes, with models that load on demand.
- [Quickstart](https://porten.ai/docs/quickstart.md): Make your first Porten API call in a few minutes — get a key, list models, and stream a chat completion with curl or any OpenAI SDK.
- [Use it from your tools](https://porten.ai/docs/integrations.md): Point OpenCode, Cursor, Continue, the OpenAI SDKs, and LangChain at Porten by overriding the base URL and API key.

## Using the API

- [API reference](https://porten.ai/docs/api-reference.md): The OpenAI-compatible surface — chat completions, embeddings, models — with parameters, streaming, headers, and error codes.
- [Models & on-demand loading](https://porten.ai/docs/models.md): How the curated model catalog works, what "loads on demand" means, how warming and idle eviction behave, and how to pick a model.
- [Regions & data sovereignty](https://porten.ai/docs/regions.md): Porten is EU-sovereign by design: pin an API key to a region so that requests route only to nodes in that region.
- [Sovereign inference — your hardware, your region](https://porten.ai/docs/sovereign.md): Guarantee that inference runs only on machines you own and only in the region you choose — enforced (fail-closed), not just promised. For regulated and public-sector workloads.

## Run a node

- [Run a node & earn](https://porten.ai/docs/running-a-node.md): Run a GPU machine as a Porten node — one-line install, browser login, engines (built-in, Ollama, OpenAI-compatible), and how serving turns into payouts.
- [Hardware guide — what to build](https://porten.ai/docs/hardware.md): How much VRAM each model needs, which GPUs and Macs to buy, how quantization and context affect memory, and what runs a flagship-competitive coding agent locally.
- [Build a combined machine (Thunderbolt 5)](https://porten.ai/docs/cluster-thunderbolt.md): Pool 2–6 Apple-Silicon Macs over Thunderbolt 5 with exo into one logical node, so Porten can serve frontier-size models no single box can hold.