API Reference

Retrieve

Retrieve the most relevant cited passages for a query. This is the core endpoint — it runs semantic search over the corpus, reranks, and returns source-attributed chunks ready to feed to your model.

POST /v1/retrieve

Send a natural-language query with optional filters and retrieval options. The response is structured JSON — never a generated answer.

Request body

query (string, required)
The natural-language search query, e.g. “gross margin guidance for next quarter”.
filters (object, optional)
Optional scoping by company and period. See the filters fields below.
top_k (number, optional)
Number of chunks to return, from 1 to 50. Defaults to 10 and is clamped to your plan’s maximum (Free 10, Pro 25, Enterprise 50): a higher value is silently capped, never rejected.
rerank (boolean, optional)
Controls relevance reranking. When true (default), results are semantically reranked over a wider candidate pool and recency-boosted so the most relevant and recent passages rank first. Set to false to skip reranking entirely and return chunks in raw vector-similarity order, which is faster and useful when you want to apply your own ranking.
include_segments (boolean, optional)
When true, each chunk’s source includes a segments array (the underlying document segments with character offsets) so you can map a citation back to an exact span for highlighting. Defaults to false.
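
Because top_k is capped silently, a client may want to predict how many chunks it can expect back. The following is a minimal sketch of that clamping using the plan limits listed in the top_k description; `PLAN_MAX_TOP_K`, `DEFAULT_TOP_K`, and `effective_top_k` are hypothetical client-side names, not part of the API, and the server's actual logic may differ.

```python
# Plan maximums as listed in the top_k description; names are illustrative only.
PLAN_MAX_TOP_K = {"free": 10, "pro": 25, "enterprise": 50}
DEFAULT_TOP_K = 10


def effective_top_k(requested, plan):
    """Predict the top_k that would apply: default 10, capped at the plan maximum."""
    cap = PLAN_MAX_TOP_K[plan]
    value = DEFAULT_TOP_K if requested is None else requested
    return min(value, cap)
```

For example, a Pro client asking for 40 chunks should plan for at most 25.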

filters

tickers (string[], optional)
Restrict to one or more US tickers, e.g. ["AAPL"] or ["AMD", "NVDA"].
year (number, optional)
Fiscal year, e.g. 2025.
quarter ("Q1" | "Q2" | "Q3" | "Q4", optional)
Fiscal quarter. Combine with year for a precise period.
source_types (string[], optional)
Restrict to document types, e.g. ["earnings_call"].

Example request

request.json
{
  "query": "What did management say about data center capex?",
  "filters": {
    "tickers": ["NVDA"],
    "year": 2025,
    "quarter": "Q1",
    "source_types": ["earnings_call"]
  },
  "top_k": 8,
  "rerank": true,
  "include_segments": false
}
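
One way to assemble this request from Python, using only the standard library, is sketched below. The host `https://api.example.com` and the `Authorization: Bearer` header are assumptions for illustration, since this section does not state the base URL or the auth scheme; `build_retrieve_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: the real host is not given in this section.
BASE_URL = "https://api.example.com"


def build_retrieve_request(api_key, query, filters=None, top_k=None,
                           rerank=None, include_segments=None):
    """Build (but do not send) a POST /v1/retrieve request, omitting unset options."""
    body = {"query": query}
    optional = {
        "filters": filters,
        "top_k": top_k,
        "rerank": rerank,
        "include_segments": include_segments,
    }
    # Leave out options the caller did not set so the server defaults apply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        BASE_URL + "/v1/retrieve",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed header format
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the body of a successful response can then be parsed with `json.load`.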

Response

Returns a chunks array and a meta object. Each chunk is a passage with provenance you can cite directly.

chunk

id (string, required)
Stable identifier within this response, e.g. chunk_01. Useful for citation markers.
text (string, required)
The full passage text. Feed this to your model.
score (number, optional)
Relevance score (cosine distance; lower is more similar).
evidenceText (string, optional)
The one-to-three sentence span most relevant to the query, for highlighting.
source (object, required)
Document provenance. See the source fields below.

source

documentId (string, required)
Stable identifier for the source document. Use it with the Documents API to fetch the full document or its segments.
documentTitle (string, required)
Human-readable document title.
documentType (string, required)
Currently "earnings_call".
ticker (string | null, optional)
Company ticker.
year (number | null, optional)
Fiscal year of the document.
quarter (string | null, optional)
Fiscal quarter, e.g. "Q1".
filingType (string | null, optional)
Reserved for future document types; null for transcripts.
sourceUrl (string | null, optional)
Link to the primary source document.
pageNumbers (number[], optional)
Page numbers the chunk spans, for paginated source types. Omitted for earnings-call transcripts (which have no page structure).
segments (object[], optional)
Present only when include_segments is true. The document segments behind this chunk, each with id, sequence, content, and charStart/charEnd offsets for highlighting. See source.segments below.

source.segments

Returned per chunk only when you pass include_segments: true. Each entry is a contiguous segment of the source document.

id (string, required)
Segment identifier within the document.
sequence (number, required)
Zero-based position of the segment within the document, in reading order.
content (string, required)
The segment text.
charStart (number, required)
Start character offset of this segment within the reconstructed document text.
charEnd (number, required)
End character offset (exclusive) of this segment.

Example response

response.json
{
  "chunks": [
    {
      "id": "chunk_01",
      "text": "Data center revenue grew 25% sequentially to a record $22.6 billion, driven by Hopper demand...",
      "score": 0.18,
      "evidenceText": "Data center revenue grew 25% sequentially to a record $22.6 billion.",
      "source": {
        "documentId": "doc_8f2a1c",
        "documentTitle": "NVIDIA Q1 2025 Earnings Call",
        "documentType": "earnings_call",
        "ticker": "NVDA",
        "year": 2025,
        "quarter": "Q1",
        "filingType": null,
        "sourceUrl": null
      }
    }
  ],
  "meta": {
    "total": 8,
    "periodMismatch": null,
    "requestId": "req_a1b2c3"
  }
}
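
Since each chunk carries an id intended for citation markers, a response like this one can be turned into model context with a few lines of Python. This is only a sketch of one possible layout; `format_context` is a hypothetical helper, and the `[id] title` format is an arbitrary choice rather than anything the API prescribes.

```python
def format_context(response):
    """Render chunks as prompt context, one block per chunk, tagged with its citation id."""
    blocks = []
    for chunk in response["chunks"]:
        source = chunk["source"]
        # First line: citation marker plus document title; second line: passage text.
        blocks.append("[{}] {}\n{}".format(
            chunk["id"], source["documentTitle"], chunk["text"]))
    return "\n\n".join(blocks)
```

A model instructed to cite with the bracketed ids can then have its citations mapped back to `source.documentId` for each chunk.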

Highlighting with segments

Pass include_segments: true to get the underlying document segments on each chunk’s source, with character offsets you can use to highlight the exact span inside the full document.

source.segments
"segments": [
  {
    "id": "seg_0",
    "sequence": 0,
    "content": "Thanks, and good afternoon, everyone.",
    "charStart": 0,
    "charEnd": 37
  },
  {
    "id": "seg_1",
    "sequence": 1,
    "content": "Data center revenue grew 25% sequentially to a record $22.6 billion.",
    "charStart": 39,
    "charEnd": 107
  }
]
How offsets are computed
charStart and charEnd are computed at read time by concatenating the document’s segments in sequence order, joined with a \n\n separator. They are positions into that reconstructed text — not stored columns — so use the same join when you reassemble the document, and treat charEnd as exclusive.
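
The join described above can be sketched in Python as follows. The sketch assumes you hold all of a document's segments (for example, fetched through the Documents API) and that offsets count Unicode code points, which this section does not specify; `reconstruct` is a hypothetical helper name.

```python
SEPARATOR = "\n\n"  # the separator named in the note above


def reconstruct(segments):
    """Rebuild the document text and verify each segment's offsets against it."""
    ordered = sorted(segments, key=lambda s: s["sequence"])
    text = SEPARATOR.join(s["content"] for s in ordered)
    for s in ordered:
        # charEnd is exclusive, so a plain slice should recover the segment content.
        if text[s["charStart"]:s["charEnd"]] != s["content"]:
            raise ValueError("offset mismatch for segment " + s["id"])
    return text
```

With the two example segments shown earlier, the first occupies positions 0 to 37, the separator takes two characters, and the second therefore starts at 39 and ends at 107.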

Period fallback

If you request a specific year and quarter that isn’t in the corpus, the API does not return an empty result. It serves the nearest prior period and flags it in meta.periodMismatch, because forward-looking guidance for, say, Q4 often lives in the Q3 call.

meta.periodMismatch
"periodMismatch": {
  "requested": "Q4 2025",
  "served": ["AMD Q3 2025"],
  "message": "No document for Q4 2025 exists in the corpus. The returned chunks come from the nearest prior period(s): AMD Q3 2025..."
}
Honor the notice
When periodMismatch is present, the chunks are from a different period than you asked for. Surface this to your users and instruct your model not to present prior-period evidence as the requested period.
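
A client might check for the notice along these lines. This is a sketch, not a prescribed pattern: `period_notice` is a hypothetical helper, and the wording of the warning it produces is illustrative.

```python
def period_notice(response):
    """Return a warning string when chunks come from a different period, else None."""
    mismatch = response.get("meta", {}).get("periodMismatch")
    if not mismatch:
        return None  # null or absent: the requested period was served
    served = ", ".join(mismatch["served"])
    return "Requested {} but evidence is from {}. Do not present it as {} data.".format(
        mismatch["requested"], served, mismatch["requested"])
```

The returned string can be shown to the user and prepended to the model's instructions whenever it is not None.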

Errors

Errors use the standard envelope ({ success: false, error: {...} }). Common statuses: 400 (invalid request body), 401 (bad API key), 429 (rate limited). See Errors.
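
A hedged sketch of recognizing the envelope on the client side is shown below. The fields inside `error` are not described in this section, so the sketch passes that object through unchanged; `STATUS_MEANING` and `describe_failure` are hypothetical names.

```python
# Statuses and meanings as listed above; anything else is treated as unexpected.
STATUS_MEANING = {400: "invalid request body", 401: "bad API key", 429: "rate limited"}


def describe_failure(status, body):
    """Summarize a failed call; returns None when the body is not an error envelope."""
    if not isinstance(body, dict) or body.get("success") is not False:
        return None
    meaning = STATUS_MEANING.get(status, "unexpected error")
    # The shape of "error" is not specified here, so include it as-is.
    return "{} ({}): {!r}".format(status, meaning, body.get("error"))
```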