# Stella Managed Media — Images

Capabilities for image generation and editing.

Audience: AI agents. Plain text on purpose. Curl me.

## Capabilities

### `text_to_image` — generate images from text

- Profiles: `best` (default; photoreal with accurate in-image text), `fast` (quicker, lower fidelity).
- Convenience fields: `prompt`, `aspectRatio`.
- Useful `input` overrides: `quality` (`low` | `medium` | `high`), `num_images` (1–4), `output_format` (`png` | `jpeg` | `webp`).

```bash
curl -X POST "$STELLA_API/api/media/v1/generate" \
  -H "Authorization: Bearer $STELLA_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "text_to_image",
    "prompt": "cinematic rainy Tokyo alley at night",
    "aspectRatio": "9:16",
    "input": { "quality": "high", "num_images": 1 }
  }'
```
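Since the override values are enumerated above, a script can reject bad values before it submits anything. This is a minimal sketch: `valid_overrides` is a hypothetical helper, not part of Stella, and this document does not say whether the backend rejects out-of-range values itself.

```bash
# Hypothetical helper (not part of Stella): check text_to_image overrides
# against the values listed in this section before building a request.
valid_overrides() {
  local quality=$1 num_images=$2 output_format=$3
  case "$quality" in low|medium|high) ;; *) return 1 ;; esac
  case "$num_images" in 1|2|3|4) ;; *) return 1 ;; esac
  case "$output_format" in png|jpeg|webp) ;; *) return 1 ;; esac
  return 0
}
```

For example, `valid_overrides high 2 webp || echo "bad override" >&2` prints nothing, while passing `5` as the image count prints the warning.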

### `icon` — icons, logos, thumbnails (square)

- Single profile (`default`); square output is enforced by the backend.
- Convenience field: `prompt`. Don't pass `aspectRatio`.
- Useful `input` hints: transparency or background style, plus any brand constraints described in the prompt.
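An `icon` request might look like the sketch below. The prompt text and the `/tmp/stella-icon-request.json` path are illustrative, and the guard around `curl` assumes `STELLA_API` and `STELLA_TOKEN` are exported as in the other examples. Note that the body carries no `aspectRatio`.

```bash
# Write the request body to a file; square output is enforced by the
# backend, so aspectRatio is deliberately left out.
cat > /tmp/stella-icon-request.json <<'JSON'
{
  "capability": "icon",
  "prompt": "flat minimalist fox logo, transparent background, two-color palette"
}
JSON

# Submit only when the environment is configured.
if [ -n "${STELLA_API:-}" ] && [ -n "${STELLA_TOKEN:-}" ]; then
  curl -X POST "$STELLA_API/api/media/v1/generate" \
    -H "Authorization: Bearer $STELLA_TOKEN" \
    -H "Content-Type: application/json" \
    -d @/tmp/stella-icon-request.json
fi
```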

### `image_edit` — edit an existing image

- Single profile (`default`); supports mask-aware natural-language edits.
- Required: `source` (or `sourceUrl`) of the image to edit.
- Convenience fields: `prompt`, `aspectRatio` (defaults to `auto`).
- Useful `input` overrides: `quality`, `num_images`, `mask_url`.

```bash
curl -X POST "$STELLA_API/api/media/v1/generate" \
  -H "Authorization: Bearer $STELLA_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "capability": "image_edit",
    "prompt": "remove the background, keep the subject sharp",
    "source": "data:image/png;base64,<base64>"
  }'
```

## Notes for agents

- If you need to know whether generation completed, use
  `stella-media generate --request-file ... --wait --timeout 240`. The
  command exits nonzero on failure or timeout and prints saved output paths on
  success.
- If you only need to kick off generation, submit without `--wait` and tell
  the user what you started. The Display sidebar will open with the result
  automatically when it finishes.
- For multi-image jobs, the materializer writes `<jobId>_0.<ext>`, `<jobId>_1.<ext>`, …
  to `state/media/outputs/` (e.g. `<jobId>_0.png`) and shows them as a gallery.
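To collect the files written for one job, a script can glob on the job ID prefix. A minimal sketch, assuming a hypothetical `job_outputs` helper and assuming `state/media/outputs` resolves relative to the current directory (this document does not say where it is rooted, so the directory can be passed explicitly):

```bash
# Hypothetical helper: list the saved outputs for one job, one path per line.
# The second argument overrides the output directory.
job_outputs() {
  local job_id=$1 dir=${2:-state/media/outputs}
  local f
  for f in "$dir/${job_id}"_*; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```

Usage: `job_outputs job_123` prints every `job_123_<i>.<ext>` file it finds, or nothing if the job has not been materialized yet.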

## Endpoint

POST <stella-api>/api/media/v1/generate
Content-Type: application/json
Authorization: Bearer <stella-session-token>

Where `<stella-api>` is the Stella backend base URL the desktop app is signed
in against (e.g. `https://api.stella.sh`). Reuse the user's existing session
token — do not invent your own credentials.

## Request body

```json
{
  "capability": "<id>",          // required; see the capability sections above
  "profile": "<id>",             // optional; defaults to the capability's preferred profile
  "prompt": "...",               // optional convenience field; mapped to the capability's prompt key
  "aspectRatio": "16:9",         // optional convenience field for image/video; mapped to aspect_ratio
  "sourceUrl": "https://...",    // optional; for capabilities that take a public URL
  "source": "data:image/png;base64,...", // optional; for local files (preferred)
  "sources": { "video": "data:...", "audio": "data:..." }, // for multi-input capabilities
  "input": { /* provider-specific overrides, merged on top of the convenience fields */ }
}
```

`source` accepts a `data:` URI string or `{ "base64": "...", "mimeType": "image/png" }`.
The backend wraps the value into the shape the selected endpoint expects
(e.g. `image_urls: ["data:..."]` for image edit).
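Building the `data:` URI for a local file can be sketched as follows. Both helpers are hypothetical, the MIME type is hard-coded to `image/png` in the second one, and the prompt is assumed to contain no double quotes or backslashes, since no JSON escaping is done.

```bash
# Encode a local file as a data: URI. Piping through tr avoids the
# GNU-only -w0 flag and strips the line wrapping some base64 builds add.
to_data_uri() {
  local file=$1 mime=$2
  printf 'data:%s;base64,%s' "$mime" "$(base64 < "$file" | tr -d '\n')"
}

# Hypothetical helper: write an image_edit request body to a file.
# Assumes a PNG source and a prompt without double quotes or backslashes.
write_edit_request() {
  local image=$1 prompt=$2 out=$3
  printf '{"capability":"image_edit","prompt":"%s","source":"%s"}\n' \
    "$prompt" "$(to_data_uri "$image" image/png)" > "$out"
}
```

Usage: `write_edit_request photo.png "remove the background" /tmp/stella-media-request.json`, then pass the file to `curl` with `-d @/tmp/stella-media-request.json`. Writing the body to a file keeps large base64 payloads off the command line.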

## Response (202 Accepted)

```json
{
  "jobId": "job_123",
  "capability": "text_to_image",
  "profile": "best",
  "status": "queued",
  "upstreamStatus": "IN_QUEUE",
  "subscription": {
    "query": "api.media_jobs.getByJobId",
    "args": { "jobId": "job_123" }
  }
}
```
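A script that submits with `curl` usually needs the `jobId` from this response. The sketch below pulls it out with `sed` so it has no extra dependencies. `extract_job_id` is a hypothetical helper, and it assumes the ID contains no escaped quotes. If `jq` is installed, `jq -r .jobId` is the sturdier choice.

```bash
# Hypothetical helper: read a 202 response on stdin and print its jobId.
# The field appears twice (top level and subscription.args) with the same
# value, so the first match is enough.
extract_job_id() {
  sed -n 's/.*"jobId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}
```

Usage: pipe the response body into it, for example `job_id=$(cat /tmp/stella-response.json | extract_job_id)`, where the response file path is illustrative.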

## Watching for completion

Use the local `stella-media` command when you want normal
`exec_command`-style behavior: run one command, let it block, and check the
exit code. It submits the same gateway request, can wait until the job
reaches a terminal state, and saves completed outputs to
`state/media/outputs/`.

```bash
cat > /tmp/stella-media-request.json <<'JSON'
{
  "capability": "text_to_image",
  "prompt": "a clean product render of a translucent blue desk lamp",
  "aspectRatio": "1:1"
}
JSON

stella-media generate --request-file /tmp/stella-media-request.json --wait --timeout 240
```
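Because the command is documented to exit nonzero on failure or timeout, a wrapper only needs the exit code. A minimal sketch, assuming a hypothetical `generate_and_wait` function; the meaning of specific nonzero codes is not documented here, so the wrapper just passes the code through.

```bash
# Hypothetical wrapper: run a blocking generation and propagate its result.
# On success the command itself prints the saved output paths.
generate_and_wait() {
  local request_file=$1 status=0
  stella-media generate --request-file "$request_file" --wait --timeout 240 || status=$?
  if [ "$status" -ne 0 ]; then
    echo "generation failed or timed out (exit $status)" >&2
  fi
  return "$status"
}
```

Usage: `generate_and_wait /tmp/stella-media-request.json || exit 1`.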

Without `--wait`, `stella-media generate` returns after submit with a
`jobId`. To check a job later:

```bash
stella-media status --job-id <jobId> --save
```

The Stella desktop renderer also subscribes to every succeeded media job for
the signed-in user, downloads the output to
`state/media/outputs/<jobId>_<i>.<ext>`, and pops it open in the Display
sidebar automatically. If generation fails, Stella shows a failure
notification.

If you do need the raw status, subscribe to Convex:
`useQuery(api.media_jobs.getByJobId, { jobId })`. Status values:
`queued`, `running`, `succeeded`, `failed`, `canceled`.
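A script consuming these values needs to know when to stop watching. The sketch below treats `succeeded`, `failed`, and `canceled` as terminal. That split is an assumption: this document lists the values but does not say which ones are terminal. `is_terminal_status` is a hypothetical helper.

```bash
# Hypothetical helper: succeed when a status value means the job is done.
# Treating these three as terminal is an assumption, not documented behavior.
is_terminal_status() {
  case "$1" in
    succeeded|failed|canceled) return 0 ;;
    *) return 1 ;;
  esac
}
```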

## Auth failure (401)

If the user is not signed in, the endpoint returns a structured 401:

```json
{
  "error": "Sign in to Stella to use media generation.",
  "code": "auth_required",
  "action": "Ask the user to open the Stella desktop app and finish signing in (Settings → Account, or the welcome screen on first launch). Once they're signed in, retry the same request — no payload changes needed.",
  "docsUrl": "https://stella.sh/docs/media"
}
```

When you see `code: "auth_required"`:
1. Stop the in-flight job — do not retry on a backoff.
2. Surface `action` to the user verbatim so they know what to do.
3. Once they confirm sign-in, re-run the original request with the same payload.

The response also sets `WWW-Authenticate: Bearer realm="stella-media"` for
non-agent HTTP clients.
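The handling steps above can be sketched as a check on the response body. `auth_action` is a hypothetical helper; it assumes the `action` string contains no escaped double quotes, which holds for the example payload in this section.

```bash
# Hypothetical helper: if the body is the structured 401, print its
# `action` text for the user and return 0; otherwise return 1.
auth_action() {
  local body=$1
  if printf '%s\n' "$body" | grep -q '"code"[[:space:]]*:[[:space:]]*"auth_required"'; then
    printf '%s\n' "$body" |
      sed -n 's/.*"action"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
    return 0
  fi
  return 1
}
```

Usage: `if msg=$(auth_action "$body"); then echo "$msg"; fi`, then stop and wait for the user to confirm sign-in before re-running the same request.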

## Errors

All other errors return `{ "error": "human-readable message" }` with an
appropriate HTTP status. Upstream provider errors (content policy, validation,
rate limits) are parsed and their messages forwarded unchanged; show the
message to the user.
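Putting the status codes from this document together, a caller can branch on the HTTP status and fall back to the `error` message. A minimal sketch: `next_step` and its output labels are hypothetical, and the `error` text is assumed to contain no escaped double quotes.

```bash
# Hypothetical helper: map an HTTP status and response body to a next step.
# The status can be captured with curl's write-out option, e.g.
#   status=$(curl -s -o /tmp/stella-response.json -w '%{http_code}' ...)
next_step() {
  local status=$1 body=$2
  case "$status" in
    202) echo "accepted" ;;
    401) echo "sign-in-required" ;;
    *)
      # Every other error carries { "error": "..." }; surface that message.
      printf 'error: %s\n' "$(printf '%s\n' "$body" |
        sed -n 's/.*"error"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')"
      ;;
  esac
}
```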