API conventions

Request correlation

every response includes Request-Id
callers can provide x-request-id to make local benchmarking and smoke output deterministic

Metering

API responses include Omni-Meter-Class
this is the canonical hint for billing/quota classification in logs, smoke output, and reconciliation jobs
hosted MCP tools/call usage rows also persist mcp_tool_name; intelligence.query and signals.dilution.enhanced.get are the current MCP tools that count toward ai_queries

MCP JSON-RPC quota envelope

Successful quota-counted MCP tool calls include top-level _omni.quota metadata with family, mcpToolName, limit, used, remaining, resetAt, and plan context. MCP quota exhaustion returns HTTP 429 with JSON-RPC error code -32005 and error.data.code: "ai_query_quota_exceeded". The MCP-specific error codes are:

-32004 means mcp_tool_timeout and may be retried with bounded backoff
-32005 means quota exhaustion and should not be retried until the reset window or a plan/quota change
-32006 means mcp_tool_circuit_open; treat it as temporary service protection for that tool and retry only after backoff, not as quota exhaustion
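The retry guidance for the three codes can be sketched as a small classifier. This is a minimal illustration, not an official client: the names `classifyMcpError` and `backoffDelayMs`, the action labels, and the backoff base and cap are assumptions made for the example.

```typescript
// Illustrative action labels; they are not part of the API.
type McpRetryAction = "retry_with_backoff" | "wait_for_reset" | "not_mcp_specific";

interface JsonRpcError {
  code: number;
  message?: string;
  data?: { code?: string };
}

// Maps the documented MCP error codes to a retry decision.
function classifyMcpError(error: JsonRpcError): McpRetryAction {
  switch (error.code) {
    case -32004: // mcp_tool_timeout
      return "retry_with_backoff";
    case -32005: // quota exhaustion (ai_query_quota_exceeded)
      return "wait_for_reset";
    case -32006: // mcp_tool_circuit_open
      return "retry_with_backoff";
    default:
      return "not_mcp_specific";
  }
}

// Bounded exponential backoff: baseMs * 2^attempt, capped at maxMs.
// The default base and cap are assumptions, not documented values.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

For -32005, a client would typically wait until the resetAt value reported in the quota metadata rather than using the backoff helper.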

Cost and token headers

every /v1/* response includes Omni-Token-Count and Omni-Estimated-Cost so agents can budget natively without re-reading the pricing page
Omni-Token-Count is an approximate output-token count computed with the o200k_base tokenizer (gpt-4o / o1 / modern-frontier encoding) as a portable proxy; consumer-model token usage may vary 10–30%
Omni-Estimated-Cost is a dollar string at 4-decimal precision (e.g. $0.0042) reflecting the plan-discounted per-call rate pre-grant-offset — emitted on every /v1/* 2xx regardless of token-count opt-in
grant and credit balance are communicated separately via Omni-Free-Grant-Remaining and Omni-Budget-Used — the final billed amount reflects grant offsets on top of the per-call estimate
Omni-Token-Count-Estimated: true is emitted whenever Omni-Token-Count falls back to a ceil(bytes / 4) approximation. This happens either because the response body exceeded 512 KB (and was not fully tokenized, to protect p95 latency) or because the response is a non-JSON billable payload (filing downloads, csv/markdown/pdf exports). When this header is absent, a non-zero count is an exact tokenizer result from the JSON body
counts flagged as estimated are advisory for budget planning only; do not use them for exact tokenizer-equivalence math
error responses (4xx, 5xx), rate-limit 429, billing-required 402, and budget-gate 402 all emit Omni-Token-Count: 0 and Omni-Estimated-Cost: $0.0000 — agents should treat failed requests as free
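A client-side reader for these headers might look like the following sketch. The helper names (`findHeader`, `parseOmniCost`) and the plain-object header map are assumptions for illustration. Only the header names and value formats come from the conventions above.

```typescript
interface OmniCostInfo {
  tokenCount: number;
  tokenCountEstimated: boolean; // true when the ceil(bytes / 4) fallback was used
  estimatedCostUsd: number;
}

// Case-insensitive lookup in a plain header map.
function findHeader(headers: Record<string, string>, name: string): string | undefined {
  const wanted = name.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === wanted) return headers[key];
  }
  return undefined;
}

// Parses the token-count and cost headers; missing headers are read as zero.
function parseOmniCost(headers: Record<string, string>): OmniCostInfo {
  const rawCount = findHeader(headers, "Omni-Token-Count") ?? "0";
  const rawCost = findHeader(headers, "Omni-Estimated-Cost") ?? "$0.0000";
  return {
    tokenCount: parseInt(rawCount, 10),
    tokenCountEstimated: findHeader(headers, "Omni-Token-Count-Estimated") === "true",
    estimatedCostUsd: parseFloat(rawCost.replace("$", "")),
  };
}
```

Because failed requests report a count of 0 and a cost of $0.0000, a budget tracker can sum these values across all responses without special-casing errors.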

Opting in to exact token counts (OMNI-3428)

Response-body tokenization is opt-in (off by default) to keep p50 latency on hot paths low. Clients that want exact counts on every response add the request header:

Omni-Compute-Headers: token-count

Behavior matrix:

Mode	`Omni-Token-Count`	`Omni-Token-Count-Source`	`Omni-Estimated-Cost`
Opt-in absent (default), cache miss	`0`	`opt-in-required`	rate-card
Opt-in absent (default), cache hit	exact (from cached body)	omitted; `Omni-Cache-Hit: true`	rate-card
Opt-in present, cache miss	exact (from body tokenize)	omitted	rate-card
Opt-in present, cache hit	exact (from cached body)	omitted; `Omni-Cache-Hit: true`	rate-card
Error (4xx/5xx)	`0`	(omitted)	`$0.0000`

Cache-served responses (a subset of cacheable /v1/* GETs) carry an exact token count regardless of opt-in, because the count is precomputed once at cache-write time and stored alongside the body. The Omni-Cache-Hit: true header makes that visible to clients. The Omni-Token-Count-Source: opt-in-required advisory header marks the cheap path: if your agent needs an exact count for cost reconciliation, re-issue the request with Omni-Compute-Headers: token-count.

The official SDKs will adopt the header in the next minor release, tracked under OMNI-3127.

Operators can override the policy with the runtime flag OMNI_METERING_TOKENIZE_BODY:

always — revert to pre-OMNI-3428 always-on body tokenization (cache hits + cache misses both emit exact counts)
auto (default) — opt-in via the request header above; cache hits emit exact precomputed counts regardless of opt-in
never — emergency kill-switch: Omni-Token-Count: 0 everywhere including cache hits. Use only if a downstream consumer breaks unexpectedly. Omni-Estimated-Cost is unaffected.
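The re-issue decision described in this section can be sketched as two small helpers. The function names are hypothetical. The header names and the opt-in-required value are taken from the behavior matrix.

```typescript
// True when the response took the cheap path and its Omni-Token-Count of 0
// is a placeholder rather than a measured count.
function needsExactRecount(responseHeaders: Record<string, string>): boolean {
  for (const key of Object.keys(responseHeaders)) {
    if (key.toLowerCase() === "omni-token-count-source") {
      return responseHeaders[key] === "opt-in-required";
    }
  }
  return false;
}

// Returns a copy of the request headers with the opt-in header added.
function withTokenCountOptIn(requestHeaders: Record<string, string>): Record<string, string> {
  return { ...requestHeaders, "Omni-Compute-Headers": "token-count" };
}
```

An agent that only needs exact counts for reconciliation can send requests without the opt-in first and repeat only those where `needsExactRecount` returns true. If the operator flag is set to never, the count stays 0 even after opting in.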

Response formats

every response-shape endpoint accepts ?view=default | compact | agent
- default (implicit): human-readable shape with full provenance, freshness, materialization, and validation metadata
- compact: trimmed subset of top-level fields — suitable for UI consumers that only need identifiers + dates
- agent: agent-optimized shape that keeps essentials and citation pointers (e.g. filingUrl, accessionNumber, offsets) and drops provenance chains, freshness metadata, and pagination context not needed for retrieval
?view=agent and ?view=compact coexist as distinct shapes; agent is not a strict subset of compact — it ships the richer “essentials + citations” shape while compact ships a UI-friendly top-level skim
token-count and cost header values shrink in agent mode because they reflect the final serialized body, so agents pay less for smaller responses. No client changes are required
agent-mode responses preserve camelCase field names — agent mode is a projection of the default shape, not a rename. See the per-endpoint API reference for the exact agent-mode field set
unknown values of ?view= fall back silently to default, so accidental typos never surface as a 4xx
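Because an unknown ?view= value falls back silently to default, a typo is easy to miss. One way to catch it is a client-side check before the request is sent, sketched below. The helper names are hypothetical, and the query parameter in the usage note is a placeholder.

```typescript
type OmniView = "default" | "compact" | "agent";
const KNOWN_VIEWS: OmniView[] = ["default", "compact", "agent"];

// Type guard for the three documented view values.
function isOmniView(value: string): value is OmniView {
  return KNOWN_VIEWS.indexOf(value as OmniView) !== -1;
}

// Appends view= to a path, using & when the path already has a query string.
// Throws locally on unknown values instead of letting the API serve default.
function withView(path: string, view: string): string {
  if (!isOmniView(view)) {
    throw new Error(`unknown view "${view}"; the API would silently serve default`);
  }
  const separator = path.indexOf("?") === -1 ? "?" : "&";
  return `${path}${separator}view=${view}`;
}
```

For example, `withView("/v1/filings/latest", "agent")` yields `/v1/filings/latest?view=agent`, while `withView("/v1/filings/latest", "agnet")` throws before any request is made.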

Endpoints that support `?view=agent`

Agent mode is live on 15 endpoints as of OMNI-3084. Expect 30–95% byte-size reduction depending on endpoint shape (endpoints that carry large prose summaries see the biggest wins).

OMNI-3075 (first tranche):

GET /v1/filings/latest — single filing essentials + filingUrl
GET /v1/entities/resolve — entity with primary identifiers + match metadata
GET /v1/statements — statement with rows/periods, provenance dropped
GET /v1/facts — fact point items + lifted accessionNumber / filingUrl
GET /v1/sections/search — section items + lifted startOffset / endOffset

OMNI-3084 (second tranche):

GET /v1/insiders — insider trade items with trade essentials, ownership flags dropped
GET /v1/forms/144 — Form 144 filings with citation pointers
GET /v1/offerings — offering records (S-1 / 424) with prospectus URL
GET /v1/events/ma — M&A events with deal structure + counterparty
GET /v1/events/enforcement — enforcement actions with release metadata
GET /v1/events/voting-results — voting results events with proposals array (shaped)
GET /v1/compensation — executive compensation breakdown
GET /v1/owners/13f — 13F report with nested holdings[] (shaped)
GET /v1/funds/nport/holdings — N-PORT holdings with nested holdings[] (shaped)
GET /v1/companies/subsidiaries — subsidiary list with citation envelope (API-consistency, byte-neutral)

Citation + char-range spans

Two endpoints emit machine-verifiable citations on every result row:

GET /v1/sections/search
GET /v1/search/semantic

Each result carries the following fields top-level (additive, always-on):

Field	Type	Description
`accession`	string \| null	SEC accession number (e.g. `0000950170-24-012345`)
`section_key`	string \| null	Canonical section identifier (e.g. `item_1a`, `item_7`, `risk_factors`)
`char_start`	int \| null	Inclusive start offset into the section text (see reference frame below)
`char_end`	int \| null	Exclusive end offset — `0 ≤ char_start < char_end ≤ len(section_text)`
`highlighted_snippet`	string \| null	±150 char window around the match, with query terms wrapped in `…`, sentence-boundary truncated, capped at 320 chars
`source_url`	string \| null	Public SEC.gov filing URL — usable directly in the browser
`ticker`	string \| null	Issuer ticker, when known

Reference frame

char_start and char_end index into the section markdown text (the canonical plaintext extraction stored in section_snippets.content_md), not the raw filing HTML. This frame is stable across reparses and meaningful without fetching the multi-MB filing HTML body. To re-fetch the original section text outside the response, request the section by (accession, section_key) from GET /v1/filings/{accession_number}/sections/{section_key}, and slice the returned contentMd from char_start to char_end.
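The re-fetch and slice step can be sketched as follows. The helper names are hypothetical. The sketch also assumes the offsets line up with JavaScript string indices (UTF-16 code units), which the text above does not specify. For text outside the Basic Multilingual Plane that assumption would need checking.

```typescript
interface CitationSpan {
  char_start: number | null;
  char_end: number | null;
}

// Builds the section path from the (accession, section_key) pair.
function sectionPath(accession: string, sectionKey: string): string {
  return `/v1/filings/${encodeURIComponent(accession)}/sections/${encodeURIComponent(sectionKey)}`;
}

// Returns the cited text, or null when the span is missing or violates
// 0 <= char_start < char_end <= contentMd.length.
function sliceCitation(contentMd: string, span: CitationSpan): string | null {
  const { char_start, char_end } = span;
  if (char_start === null || char_end === null) return null;
  const inBounds = 0 <= char_start && char_start < char_end && char_end <= contentMd.length;
  return inBounds ? contentMd.slice(char_start, char_end) : null;
}
```

Since char_end is exclusive, the pair maps directly onto `String.prototype.slice(start, end)` with no off-by-one adjustment.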

Null-fallback semantics

When the validator can’t produce a span (no section text available for the row or offsets out of bounds), the response drops char_start, char_end, and highlighted_snippet to null but keeps accession, section_key, source_url, and ticker populated. A _citation_degraded field records the reason ("no_section_text" | "offsets_invalid"). Agents should switch on char_start === null rather than "char_start" in result.
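A consumer can branch on the null check like this. The sketch assumes _citation_degraded is absent on healthy rows, which the text above does not state explicitly. The function name and output strings are illustrative.

```typescript
interface CitationResult {
  accession: string | null;
  section_key: string | null;
  char_start: number | null;
  char_end: number | null;
  highlighted_snippet: string | null;
  source_url: string | null;
  ticker: string | null;
  // Assumed to be present only on degraded rows.
  _citation_degraded?: "no_section_text" | "offsets_invalid";
}

function describeCitation(result: CitationResult): string {
  // Compare against null rather than testing key presence:
  // the keys are still there on degraded rows, only their values are null.
  if (result.char_start === null || result.char_end === null) {
    const reason = result._citation_degraded ?? "unknown";
    return `section-level citation only (${reason}): ${result.source_url ?? "no url"}`;
  }
  return `span [${result.char_start}, ${result.char_end}) in ${result.section_key ?? "unknown section"}`;
}
```

The degraded branch still has accession, section_key, source_url, and ticker to work with, so it can link to the filing even without a character span.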

`?view=agent` interaction

Citation fields are emitted on every result row regardless of ?view=. In agent mode (?view=agent), score and retrievalMode are dropped from /v1/search/semantic results — the citation surface is preserved.

REST vs MCP shape conventions

REST and MCP serve different consumer profiles, and Datastream optimizes for each:

REST endpoints support ?view={default,compact,agent}. The agent view is the wire-byte-optimized projection — large prose drops out, the dilution verification block compresses to the canonical { confidence, crossValidationsPassed, sourceSpanResolved, modelVersion } summary, and the response shrinks 30–95% depending on the endpoint. Use it from SDK clients that want low-latency / low-token responses to forward to a UI or downstream agent.
MCP tools always return the full payload (verification block, full provenance chain, all citation pointers). MCP context is already an agent context — round-tripping through compact-then-fetch wastes tokens compared to surfacing full evidence on the first call. Agents reasoning over MCP decide what to surface to the user.

This split is intentional: REST optimizes for the wire (network bytes); MCP optimizes for the model (in-context evidence). For OMNI-3089’s dilution endpoints, the rule is concrete:

GET /v1/dilution/events?view=agent → returns summarizeVerification(row.verification) (the contract from packages/contracts/src/verification.ts).
dilution.events.list MCP tool → returns the full row including the unmodified verification block. No view argument is accepted.

Substring-match heuristic

The match span is anchored by the longest query token, falling back to the snippet’s leading words and finally to a section-head citation. Multi-occurrence ambiguity is resolved by earliest occurrence. The highlighted_snippet window is the stable user surface; the precise [char_start, char_end) may move slightly as upstream rerankers improve — agents should rely on the bolded snippet for display and on the offsets only for programmatic slicing.
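The fallback order can be illustrated with a simplified sketch. This is not the service's implementation: case-insensitive matching, the three-word snippet lead, and treating the first line as the section head are all assumptions made for the example.

```typescript
interface MatchSpan {
  start: number;
  end: number;
  anchoredBy: "query_token" | "snippet_lead" | "section_head";
}

function anchorSpan(sectionText: string, query: string, snippet: string): MatchSpan {
  // Lowercasing keeps offsets aligned for ASCII text; some Unicode characters
  // change length when lowercased, which this sketch ignores.
  const haystack = sectionText.toLowerCase();

  // 1. Longest query token; indexOf resolves ambiguity by earliest occurrence.
  let longest = "";
  for (const token of query.toLowerCase().split(/\s+/)) {
    if (token.length > longest.length) longest = token;
  }
  if (longest.length > 0) {
    const at = haystack.indexOf(longest);
    if (at !== -1) return { start: at, end: at + longest.length, anchoredBy: "query_token" };
  }

  // 2. The snippet's leading words (three is an arbitrary choice here).
  const words = snippet.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  const lead = words.slice(0, 3).join(" ");
  if (lead.length > 0) {
    const at = haystack.indexOf(lead);
    if (at !== -1) return { start: at, end: at + lead.length, anchoredBy: "snippet_lead" };
  }

  // 3. Section-head citation: assumed here to mean the first line of the section.
  const newline = sectionText.indexOf("\n");
  const headEnd = newline === -1 ? sectionText.length : newline;
  return { start: 0, end: headEnd, anchoredBy: "section_head" };
}
```

The `anchoredBy` field is an addition for the example only. It makes the fallback path visible when experimenting with different queries.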

Versioning

production clients should send omni-version and pin to a dated release
additive fields can ship within a pinned version without requiring a client rebuild
breaking behavior changes should land behind a new dated version and be called out in /docs/changelog
migration guides should be updated in the same change when a compatibility seam is moved or removed

Error model

Error responses use a stable JSON shape:

```json
{
  "object": "error",
  "id": "err_...",
  "code": "ownership_lookup_failed",
  "type": "api_error",
  "message": "Human-readable explanation",
  "requestId": "req_...",
  "details": {}
}
```
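The shape above can be mirrored in a client-side type with a runtime guard, sketched below. The interface and function names are hypothetical. Field types are inferred from the single example, so the real set of values for code and type may be narrower or wider.

```typescript
interface OmniApiError {
  object: "error";
  id: string;
  code: string;
  type: string;
  message: string;
  requestId: string;
  details: Record<string, unknown>;
}

// Runtime check that a parsed JSON body has the stable error shape.
function isOmniApiError(body: unknown): body is OmniApiError {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    b["object"] === "error" &&
    typeof b["id"] === "string" &&
    typeof b["code"] === "string" &&
    typeof b["type"] === "string" &&
    typeof b["message"] === "string" &&
    typeof b["requestId"] === "string"
  );
}

// One-line summary suitable for logs; includes requestId for correlation.
function formatError(err: OmniApiError): string {
  return `${err.code} (${err.requestId}): ${err.message}`;
}
```

Branching on `code` rather than `message` keeps client logic stable if the human-readable text changes.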

Auth conventions

machine access uses x-api-key
MCP protected-resource metadata is available before WorkOS is fully configured
OAuth authorization-server metadata intentionally returns an explicit 503 workos_authorization_server_unavailable until real WorkOS values are present
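The request-header conventions on this page (x-api-key, omni-version, x-request-id) can be combined in one helper. This is a sketch with hypothetical names. The version string in the usage note is a placeholder, since the exact format of a dated release is not given here.

```typescript
interface OmniRequestOptions {
  apiKey: string;
  version?: string;   // dated release to pin, sent as omni-version
  requestId?: string; // caller-supplied correlation id, sent as x-request-id
}

// Builds request headers; optional headers are omitted when not provided.
function buildOmniHeaders(options: OmniRequestOptions): Record<string, string> {
  const headers: Record<string, string> = { "x-api-key": options.apiKey };
  if (options.version !== undefined) headers["omni-version"] = options.version;
  if (options.requestId !== undefined) headers["x-request-id"] = options.requestId;
  return headers;
}
```

A smoke test might call `buildOmniHeaders({ apiKey, version: "2025-01-01", requestId: "smoke-001" })` so that its log output is deterministic and each request can be correlated with the Request-Id on the response.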
