# ORB Cloud: Complete Reference

Base URL: https://api.orbcloud.dev
Auth: Authorization: Bearer orb_YOUR_API_KEY

## Is ORB the right fit?

ORB is for AI agents that call LLMs and need a persistent environment
between calls. The whole reason ORB exists is to checkpoint idle agents
while they wait on LLM responses, so idle time doesn't cost RAM.

Fit:
  - Coding / sales / research / support / browser-use / orchestrator agents
  - Anything that calls Anthropic, OpenAI, Google, etc. and stays alive
    between calls (minutes, hours, days)

Not a fit:
  - Traditional web apps (Next.js, Rails, Django) with no LLM calls ->
    use Vercel / Railway / Fly. ORB's efficiency gain is zero without
    LLM-wait time to checkpoint through.
  - Stateless scripts / one-shot jobs -> Lambda / Modal / E2B
  - CI/CD -> GitHub Actions

If you're not calling an LLM, you pay the overhead of ORB's sandbox
isolation and get no checkpointing benefit in return. Pick a
purpose-built host instead.

## Get an API Key

Two paths; the key (orb_...) looks identical either way.

Path A: human-assisted (already done if a human pasted you a prompt).
  The human signed in at https://orbcloud.dev with GitHub; the landing
  page showed them an orb_... key and handed it to you in this prompt.
  Use it directly as Authorization: Bearer orb_...

Path B: agent self-serve (no human, instant, free, no approval).

  BEFORE calling /register, check if a key already exists:
    1. Read ./.env. If ORB_API_KEY= is set, use it. Do not re-register.
    2. Check your own persistent memory (CLAUDE.md, AGENTS.md, memory
       store) for "ORB_API_KEY for <this project>".
    3. If a human is present, confirm which account to use.

  Only if 1-3 find nothing:

  curl -X POST https://api.orbcloud.dev/api/v1/auth/register \
    -H 'Content-Type: application/json' \
    -d '{"email":"agent+<unique>@yourdomain.com"}'

  Response: {"tenant_id":"...","api_key":"orb_..."}

  After registering:
    1. Append ORB_API_KEY=orb_... to ./.env (create the file if missing).
    2. Ensure .env is in .gitignore.
    3. Record in your persistent memory: "ORB account for <project> lives
       at <absolute path>/.env, do not re-register for this project."

  Emails are unique: a second /register with the same email returns 409.
  If the key is not saved anywhere, you lose access to the account until
  email-OTP recovery ships.

Use the key as Authorization: Bearer orb_... on every call below.
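
The check-before-register and save-after-register steps above can be sketched in Python. This is a minimal sketch, not an official client: `find_orb_key` and `save_orb_key` are hypothetical helper names, and the parser assumes plain `KEY=value` lines in .env.

```python
from pathlib import Path


def find_orb_key(env_path=".env"):
    """Return the ORB_API_KEY value from a .env file, or None if absent/empty."""
    path = Path(env_path)
    if not path.is_file():
        return None
    for line in path.read_text().splitlines():
        line = line.strip()
        if line.startswith("ORB_API_KEY="):
            # Drop optional surrounding quotes; treat an empty value as missing.
            value = line.split("=", 1)[1].strip().strip("'\"")
            return value or None
    return None


def save_orb_key(api_key, env_path=".env", gitignore_path=".gitignore"):
    """Append the key to .env (creating it if needed) and git-ignore .env."""
    env = Path(env_path)
    existing = env.read_text() if env.is_file() else ""
    sep = "" if existing == "" or existing.endswith("\n") else "\n"
    env.write_text(existing + sep + f"ORB_API_KEY={api_key}\n")

    ignore = Path(gitignore_path)
    text = ignore.read_text() if ignore.is_file() else ""
    if ".env" not in [entry.strip() for entry in text.splitlines()]:
        sep = "" if text == "" or text.endswith("\n") else "\n"
        ignore.write_text(text + sep + ".env\n")
```

Call `find_orb_key()` first and POST to /api/v1/auth/register only when it returns None; pass the `api_key` from the response to `save_orb_key()`.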

Rotate a compromised key:
  POST /v1/keys/{key_id}/rotate  (authed with the key being rotated)
  -> {"key_id":"...","api_key":"orb_...","replaced_key_id":"..."}
Old key is revoked atomically in the same transaction. No downtime.

## Quick Deploy (5 steps)

1. POST /v1/computers  {"name":"my-agent","runtime_mb":2048,"disk_mb":4096}
2. POST /v1/computers/{id}/config  Content-Type: application/toml  (body: orb.toml)
3. POST /v1/computers/{id}/build  (timeout 600s, clones repo, installs deps)
4. POST /v1/computers/{id}/agents  {"org_secrets":{"API_KEY":"sk-..."}}
5. Agent live at https://{first-8-chars-of-id}.orbcloud.dev
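
Steps 2-5 can be sketched with the Python standard library. This is an illustrative sketch under assumptions: `orb_request`, `deploy_requests`, and `agent_url` are hypothetical helper names, the requests are built but not sent, and the computer id is assumed to come from the step 1 response (its field name is not documented here).

```python
import json
import urllib.request

BASE_URL = "https://api.orbcloud.dev"


def orb_request(api_key, method, path, body=None, content_type="application/json"):
    """Build (but do not send) an authenticated request to the ORB API."""
    if isinstance(body, (dict, list)):
        data = json.dumps(body).encode()
    elif isinstance(body, str):
        data = body.encode()
    else:
        data = body  # bytes or None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = content_type
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def deploy_requests(api_key, computer_id, orb_toml, secrets):
    """Return the requests for steps 2-4, in order, for an existing computer."""
    base = f"/v1/computers/{computer_id}"
    return [
        orb_request(api_key, "POST", f"{base}/config", orb_toml, "application/toml"),
        orb_request(api_key, "POST", f"{base}/build"),
        orb_request(api_key, "POST", f"{base}/agents", {"org_secrets": secrets}),
    ]


def agent_url(computer_id):
    """Step 5: the public URL uses the first 8 characters of the id."""
    return f"https://{computer_id[:8]}.orbcloud.dev"
```

Send each request in order with `urllib.request.urlopen(req, timeout=600)`; the 600s value matches the build timeout and is more than the other steps should need.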

## orb.toml Format

```toml
[agent]
name = "my-agent"       # required
lang = "python"          # python | node | binary | go | rust
entry = "agent.py"       # script/binary to run
args = ["--flag"]        # optional

[agent.env]
HOME = "/root"
MY_KEY = "literal-value"
SECRET = "${VAR}"        # resolved from org_secrets at deploy time

[source]
git = "https://github.com/you/repo"
branch = "main"          # default: main
token = "${GITHUB_TOKEN}" # optional, for private repos

[build]
steps = ["pip install -r requirements.txt"]  # at least one step required; use ["true"] for no-op
working_dir = "/agent/code"  # default

[llm]
base_url = "https://api.anthropic.com"  # required

[ports]
expose = [8000]          # exposes at https://{id}.orbcloud.dev
                         # Bind your server to this port (the first value).
                         # ORB does not inject a PORT env var. If your
                         # framework reads one, set it in [agent.env]:
                         #   PORT = "8000"  (must match expose[0])

[resources]
runtime = "2GB"
disk = "4GB"

```

The [llm] section tells ORB where your agent's LLM provider lives.
ORB runs a proxy that intercepts LLM calls for checkpoint optimization.

Common base_url values:
  Anthropic:     https://api.anthropic.com
  OpenAI:        https://api.openai.com
  Google:        https://generativelanguage.googleapis.com
  z.ai Plan:     https://api.z.ai/api/anthropic
  OpenRouter:    https://openrouter.ai/api
  Groq:          https://api.groq.com/openai
  Together:      https://api.together.xyz
  DeepSeek:      https://api.deepseek.com
  Self-hosted:   https://your-server.com

## Secrets

Two ways to inject secrets:

1. Literal in orb.toml: `MY_KEY = "sk-actual-value"`
2. Variable: `MY_KEY = "${VAR}"` + pass at deploy:

```
POST /v1/computers/{id}/agents
{"org_secrets":{"VAR":"actual-value"}}
```

For build-time secrets (private git repos):
```
POST /v1/computers/{id}/build
{"org_secrets":{"GITHUB_TOKEN":"ghp_..."}}
```
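
A pre-flight check can catch the "env var empty" case before deploying. The sketch below is a client-side convenience, not ORB's resolver: `required_secrets` and `missing_secrets` are hypothetical names, and the comment stripping is deliberately naive (it would also cut a value that contains `#`).

```python
import re

# Matches ${NAME} placeholders as used in orb.toml values.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def required_secrets(orb_toml_text):
    """Return the sorted names of all ${VAR} placeholders in an orb.toml."""
    names = set()
    for line in orb_toml_text.splitlines():
        code = line.split("#", 1)[0]  # naive: drop trailing comments
        names.update(PLACEHOLDER.findall(code))
    return sorted(names)


def missing_secrets(orb_toml_text, org_secrets):
    """Return placeholder names that have no entry in org_secrets."""
    return [n for n in required_secrets(orb_toml_text) if n not in org_secrets]
```

The helper does not distinguish build-time placeholders (such as `${GITHUB_TOKEN}` under [source]) from deploy-time ones, so check the result against the secrets you pass to both /build and /agents.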

## Auto-Deploy on Git Push (use-and-discard PAT)

Wire up `git push` -> ORB build. ORB never stores the GitHub token.

Bootstrap: commits .github/workflows/orb-deploy.yml, stores the ORB_API_KEY
secret, triggers the first build, then discards the PAT:
```
POST /v1/computers/{id}/github/bootstrap
{
  "pat": "gho_...",          // required. used-and-discarded.
  "api_key": "orb_...",      // optional. default: generate "github-actions:<repo>" key.
  "branch": "main"           // optional. default: source.branch from orb.toml.
}
```

Response: {workflow_commit_sha, secret_name, api_key_id, first_build_id}

Disconnect: deletes the workflow file and the secret:
```
POST /v1/computers/{id}/github/disconnect
{"pat": "gho_..."}
```

PAT scopes needed: repo + workflow (classic), or fine-grained with
Contents:write + Secrets:write + Workflows:write on the target repo.

Suggested ways to obtain a PAT (agent can pick any):
  gh auth token
  cat ~/.config/gh/hosts.yml | grep oauth_token
  create at https://github.com/settings/tokens

Full guide: https://docs.orbcloud.dev/guides/auto-deploy

## LLM Provider Detection

ORB automatically detects and optimizes calls to these LLM providers
by hostname. No configuration needed.

Supported endpoints:

Frontier: api.anthropic.com, api.openai.com,
  generativelanguage.googleapis.com, api.x.ai, api.cohere.com,
  api.mistral.ai, api.ai21.com
Aggregators: openrouter.ai, models.inference.ai.azure.com
Fast inference: api.groq.com, api.deepinfra.com, api.together.xyz,
  api.fireworks.ai, api.perplexity.ai, api.deepseek.com,
  api.sambanova.ai, api.cerebras.ai, api.novita.ai,
  api.hyperbolic.xyz, api.lepton.run
Embeddings: api.voyageai.com, api.jina.ai
Enterprise: integrate.api.nvidia.com
Chinese: api.z.ai, open.bigmodel.cn, dashscope.aliyuncs.com,
  aip.baidubce.com
Hosting: api.replicate.com, api-inference.huggingface.co,
  router.huggingface.co, api.modal.com
Wildcards: *.openai.azure.com, bedrock-runtime.*.amazonaws.com

If your LLM endpoint is not in this list, it will be treated as
regular HTTPS traffic (not optimized for sleep/wake).
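
To check locally whether a base_url falls on the list above, a sketch like the following may help. It copies the hostnames from this section and matches the two wildcard patterns with shell-style globbing; how ORB itself matches hostnames is not documented here, so treat `is_listed_provider` (a hypothetical name) as an approximation.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

EXACT_HOSTS = set("""
api.anthropic.com api.openai.com generativelanguage.googleapis.com api.x.ai
api.cohere.com api.mistral.ai api.ai21.com openrouter.ai
models.inference.ai.azure.com api.groq.com api.deepinfra.com api.together.xyz
api.fireworks.ai api.perplexity.ai api.deepseek.com api.sambanova.ai
api.cerebras.ai api.novita.ai api.hyperbolic.xyz api.lepton.run
api.voyageai.com api.jina.ai integrate.api.nvidia.com api.z.ai
open.bigmodel.cn dashscope.aliyuncs.com aip.baidubce.com api.replicate.com
api-inference.huggingface.co router.huggingface.co api.modal.com
""".split())

WILDCARD_HOSTS = ["*.openai.azure.com", "bedrock-runtime.*.amazonaws.com"]


def is_listed_provider(base_url):
    """Return True if the URL's hostname appears in the documented list."""
    host = (urlsplit(base_url).hostname or "").lower()
    if host in EXACT_HOSTS:
        return True
    # Assumption: '*' in the documented wildcards behaves like a shell glob.
    return any(fnmatchcase(host, pattern) for pattern in WILDCARD_HOSTS)
```

A self-hosted endpoint such as https://your-server.com returns False here, which corresponds to the "regular HTTPS traffic" case.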

## API Endpoints

### Auth
POST /api/v1/auth/register      {"email":"you@example.com"}  -> tenant_id, api_key
POST /api/v1/auth/login         {"api_key":"KEY"}             -> JWT
POST /v1/keys                   {"name":"my-key"}             -> key_id, api_key
GET  /v1/keys                                                  -> list of keys (no secrets)
POST /v1/keys/{key_id}/rotate                                  -> new key, old revoked atomically
DELETE /v1/keys/{key_id}                                       -> revoke key

### Computers
POST   /v1/computers        {"name":"x","runtime_mb":2048,"disk_mb":4096}
GET    /v1/computers
GET    /v1/computers/{id}
DELETE /v1/computers/{id}

### Config
POST /v1/computers/{id}/config   Content-Type: application/toml
GET  /v1/computers/{id}/config

### Build
POST /v1/computers/{id}/build    (timeout 600s, optional: {"org_secrets":{...}})

### GitHub Auto-Deploy (use-and-discard PAT)
POST /v1/computers/{id}/github/bootstrap    {"pat":"gho_...","api_key":"orb_...","branch":"main"}
POST /v1/computers/{id}/github/disconnect   {"pat":"gho_..."}

### Deploy
POST /v1/computers/{id}/agents   {"org_secrets":{...}}
GET  /v1/computers/{id}/agents

### Agent Control
POST /v1/computers/{id}/agents/demote   {"port":10000}  (sleep)
POST /v1/computers/{id}/agents/promote  {"port":10000}  (wake)

### Files
GET /v1/computers/{id}/files                # list rootfs root
GET /v1/computers/{id}/files/{path}         # listing (dir) or raw bytes (file)
PUT /v1/computers/{id}/files/{path}         # upload; body = raw bytes; 200MB cap
                                            # creates parent dirs; overwrites; no agent wake

# Neither reads nor writes wake a sleeping agent. A written file lands on
# disk, and the agent sees it the next time it reads that path.
# Paths are guarded against traversal. Don't use PUT for permanent setup:
# put dependencies in orb.toml [build].steps instead. For interactive
# changes, use the terminal UI.
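
An upload request can be sketched as follows. The request is built but not sent; `upload_request` is a hypothetical helper, the remote path is assumed to be relative to the rootfs root, and the 200MB cap is assumed to be decimal (200,000,000 bytes), the more conservative reading.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.orbcloud.dev"
MAX_UPLOAD_BYTES = 200 * 1000 * 1000  # assumption: decimal megabytes


def upload_request(api_key, computer_id, remote_path, payload,
                   max_bytes=MAX_UPLOAD_BYTES):
    """Build a PUT request that uploads raw bytes to a path on the computer."""
    if len(payload) > max_bytes:
        raise ValueError("payload exceeds the upload cap")
    if ".." in remote_path.split("/"):
        # The server guards against traversal; fail early on the client too.
        raise ValueError("path must not contain '..' segments")
    path = quote(remote_path.lstrip("/"))  # keeps '/', escapes spaces etc.
    url = f"{BASE_URL}/v1/computers/{computer_id}/files/{path}"
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Authorization": f"Bearer {api_key}"},
        method="PUT",
    )
```

Send the result with `urllib.request.urlopen(req)`. Parent directories are created and existing files overwritten on the server side, so no separate mkdir call is needed.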

### Terminal (human-only debug UI, not for agents)
GET  /terminal/{id}                                   # HTML page (xterm.js)
GET  /v1/computers/{id}/terminal?key=orb_...          # WebSocket upgrade, bash PTY

This is a browser shell for humans debugging their agents. Agents already
run inside their own sandbox, so there is no "connect to yourself" operation.
For setup, use orb.toml [build] steps, not the terminal.

### Keys
POST   /v1/keys             {"name":"prod"}
GET    /v1/keys
DELETE /v1/keys/{id}

### Webhooks
POST   /v1/webhooks          {"url":"https://...","events":["agent.spawned"]}
GET    /v1/webhooks
DELETE /v1/webhooks/{id}

### Stats (per-computer)
GET /v1/computers/{id}/stats                # default window=30d
GET /v1/computers/{id}/stats?window=lifetime

Returns sleep_pct, active_pct, llm_calls, checkpoints, runtime/disk GB-hours,
est_cost_usd, last_active_ago_secs, and avg_restore_ms: everything a customer
dashboard needs for one computer in one call. Rates: $0.005/GB-hr runtime +
$0.05/GB-month disk.
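
The two rates combine as shown in the sketch below. The function names are hypothetical, and `awake_gb_hours` assumes runtime is billed only while the agent is awake; that follows from sleep having "0 RAM cost" but is not an exact billing formula, so prefer est_cost_usd from the API when available.

```python
RUNTIME_USD_PER_GB_HOUR = 0.005
DISK_USD_PER_GB_MONTH = 0.05


def awake_gb_hours(runtime_gb, wall_hours, active_pct):
    """GB-hours of RAM used, assuming only awake time counts (an assumption)."""
    return runtime_gb * wall_hours * active_pct / 100.0


def estimate_cost_usd(runtime_gb_hours, disk_gb_months):
    """Apply the published rates to runtime GB-hours and disk GB-months."""
    return (runtime_gb_hours * RUNTIME_USD_PER_GB_HOUR
            + disk_gb_months * DISK_USD_PER_GB_MONTH)
```

Worked example: a 2 GB computer that is awake 10% of a 720-hour month uses 144 GB-hours ($0.72); with 4 GB of disk for the month ($0.20), `estimate_cost_usd(144, 4)` comes to about $0.92.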

### Usage
GET /v1/usage

## Agent Lifecycle

ORB auto-detects LLM API calls via transparent proxy and optimizes:
  - During LLM waits: agent may be put to sleep on NVMe (0 RAM cost)
  - On response: agent restored, response delivered
  - Idle detection: automatic via LLM proxy clock, no config needed

Wake triggers:
  1. LLM response arrives
  2. Inbound HTTP to exposed port (https://{id}.orbcloud.dev/...)
  3. Manual POST /v1/computers/{id}/agents/promote

Sleep/wake is ~1s. Full state preserved (memory, files, TCP, browser sessions).

## Cron (scheduled tasks)

Use ORB cron, NOT system cron, crond, systemd timers, or any in-process timer (setInterval, time.sleep loops). Those run inside the agent process/sandbox and freeze when the agent goes to sleep on NVMe, so they miss every firing during sleep. ORB cron runs on the host and fires at wall-clock time regardless of agent state.

Write /agent/.orb/cron.json:
```json
{
  "version": 1,
  "jobs": [
    {"name":"daily","schedule":"0 9 * * *","command":"python3 review.py"},
    {"name":"ping","schedule":"*/5 * * * *","command":"curl -sf localhost:8000/health","timeout_secs":30}
  ]
}
```
Fields per job: name (unique), schedule (5 or 6-field cron), command (sh -c). Optional: env, working_dir, timeout_secs (≤3600, default 300), enabled, skip_if_running (default true).

The runtime polls the file every 5s and syncs it to the cloud. The scheduler executes each command inside the computer's sandbox namespaces, independent of the agent process, and does not wake the agent. Results mirror to /agent/.orb/cron-history.json (last 20 runs, 512B previews).

API:
  GET  /v1/computers/{id}/cron                          # jobs
  GET  /v1/computers/{id}/cron/runs?job=NAME&limit=50   # history
  POST /v1/computers/{id}/cron/{name}/trigger           # run now; ignores enabled/skip_if_running;
                                                        # blocks until done or timeout

Limits: 50 jobs/computer, 10k runs/day.
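
Validating jobs before writing the file avoids a silent misconfiguration. The sketch below checks only the constraints stated in this section (unique names, 5 or 6 schedule fields, timeout_secs ≤ 3600, at most 50 jobs); `build_cron_config` is a hypothetical helper, and it does not verify that the schedule fields themselves are valid cron expressions.

```python
import json

MAX_JOBS = 50
MAX_TIMEOUT_SECS = 3600
DEFAULT_TIMEOUT_SECS = 300


def build_cron_config(jobs):
    """Validate jobs against the documented limits and return cron.json text."""
    if len(jobs) > MAX_JOBS:
        raise ValueError(f"at most {MAX_JOBS} jobs per computer")
    seen = set()
    for job in jobs:
        for field in ("name", "schedule", "command"):
            if not job.get(field):
                raise ValueError(f"job is missing required field {field!r}")
        if job["name"] in seen:
            raise ValueError(f"duplicate job name {job['name']!r}")
        seen.add(job["name"])
        if len(job["schedule"].split()) not in (5, 6):
            raise ValueError(f"schedule for {job['name']!r} needs 5 or 6 fields")
        timeout = job.get("timeout_secs", DEFAULT_TIMEOUT_SECS)
        if not 0 < timeout <= MAX_TIMEOUT_SECS:
            raise ValueError(f"timeout_secs for {job['name']!r} must be 1..3600")
    return json.dumps({"version": 1, "jobs": jobs}, indent=2)
```

From inside the agent, write the returned text to /agent/.orb/cron.json; the runtime picks it up on its next poll.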

## Troubleshooting

- Agent exits immediately: check the entry path (relative to working_dir)
- Build fails with DNS errors: destroy the computer and create a new one
- Agent never wakes: add [ports] expose to enable wake-on-request
- Env var is empty: pass its value in org_secrets at deploy time