Integrations

Claude / LLM Pipelines

FocusAlpha returns retrieved evidence, not answers, so it drops into any RAG pipeline. Retrieve cited chunks, build a grounded prompt, and let your own model do the reasoning.

This is the core pattern: call POST /v1/retrieve, format the returned chunks as context, and pass them to your LLM with an instruction to cite. Because every chunk carries a source, you can render citations and link back to the primary document.

1. Retrieve

Scope the query with filters when you know the company or period.

retrieve.ts

async function retrieve(query, filters) {
  const res = await fetch("https://api.focusalpha.ai/v1/retrieve", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FOCUSALPHA_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query, filters, top_k: 8 }),
  });
  if (!res.ok) throw new Error(`retrieve failed: ${res.status}`);
  return (await res.json()).chunks;
}

2. Build grounded context

Number each chunk so the model can cite by index, and keep the source alongside so you can resolve a citation back to its document in your UI.

context.ts

function buildContext(chunks) {
  return chunks
    .map((c, i) => {
      const s = c.source;
      const cite = [s.ticker, s.quarter, s.year].filter(Boolean).join(" ");
      return `[${i + 1}] (${s.documentTitle} — ${cite})\n${c.text}`;
    })
    .join("\n\n");
}

3. Prompt your model

Pass the context and instruct the model to ground its answer and cite by index. This example uses Claude; the same shape works with any model.

answer.ts

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

async function answer(question, filters) {
  const chunks = await retrieve(question, filters);
  const context = buildContext(chunks);

  const message = await anthropic.messages.create({
    model: "claude-opus-4-8",
    max_tokens: 1024,
    system:
      "Answer using ONLY the numbered sources. Cite claims as [n]. " +
      "If the sources do not answer the question, say so.",
    messages: [
      { role: "user", content: `Sources:\n\n${context}\n\nQuestion: ${question}` },
    ],
  });

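  // message.content is an array of content blocks; join the text blocks into one string.
  const text = message.content.map((b) => (b.type === "text" ? b.text : "")).join("");

  return { answer: text, sources: chunks.map((c) => c.source) };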
}
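
Because the context numbers chunks from 1, a citation like [2] in the answer points at the second returned source. The sketch below shows one way to resolve those markers, given the model's answer as a plain string. The Source interface models only the fields this page uses, and resolveCitations is a hypothetical helper, not part of the FocusAlpha API.

```typescript
// Hypothetical shape: only the source fields used on this page are modeled.
interface Source {
  documentTitle: string;
  ticker?: string;
  quarter?: string;
  year?: number;
}

interface Citation {
  index: number; // the n in [n], 1-based to match the numbered context
  source: Source;
}

// Find every [n] marker in the answer and map it to sources[n - 1].
// Duplicate markers are reported once; out-of-range markers are dropped.
function resolveCitations(answerText: string, sources: Source[]): Citation[] {
  const citations: Citation[] = [];
  const pattern = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answerText)) !== null) {
    const index = Number(match[1]);
    const source = sources[index - 1];
    if (!source) continue; // e.g. the model cited [7] but only 2 sources exist
    if (citations.some((c) => c.index === index)) continue;
    citations.push({ index, source });
  }
  return citations;
}

// Sample data for illustration only.
const sources: Source[] = [
  { documentTitle: "Q2 earnings call", ticker: "ACME", quarter: "Q2", year: 2024 },
  { documentTitle: "Q1 earnings call", ticker: "ACME", quarter: "Q1", year: 2024 },
];
const cited = resolveCitations("Revenue grew [2], and margins held [2][1]. See also [7].", sources);
console.log(cited.map((c) => c.index)); // [2, 1]
```

Dropping out-of-range markers rather than throwing keeps the UI usable when the model cites a source that was never supplied; you may prefer to log those cases.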

Honor the period notice

If meta.periodMismatch is set, the chunks are from a different period than requested. Pass that into your system prompt so the model doesn’t present prior-period evidence as the requested period. See Period fallback.
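
One way to pass the notice through is to extend the system prompt only when the flag is present. This is a minimal sketch under two assumptions: that meta arrives in the retrieve response alongside chunks, and that periodMismatch can be treated as a truthy flag (its exact type is not specified here). systemPromptFor is a hypothetical helper name.

```typescript
// Assumption: `periodMismatch` is truthy when the period fallback applied.
// Its exact type is not documented on this page, so it is typed as unknown.
interface RetrieveMeta {
  periodMismatch?: unknown;
}

const BASE_SYSTEM =
  "Answer using ONLY the numbered sources. Cite claims as [n]. " +
  "If the sources do not answer the question, say so.";

// Append a period warning only when the API flagged a mismatch.
function systemPromptFor(meta: RetrieveMeta | undefined): string {
  if (!meta?.periodMismatch) return BASE_SYSTEM;
  return (
    BASE_SYSTEM +
    " The sources come from a different period than the one requested." +
    " State the period of each source you cite and do not present it as the requested period."
  );
}
```

Note that the retrieve helper in step 1 returns only chunks, so to use this you would return the parsed response body (or chunks and meta together) instead.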

Highlighting citations in your UI

The model input only needs the chunk text and source above. But if you want to render the cited passage in context — highlighting the chunk inside the full source document — pass include_segments: true on the retrieve call. Each chunk’s source.segments then carries the document’s ordered segments, each with character offsets:

source.segments

"segments": [
  {
    "id": "3f7a1c9e-6b2d-4e81-9a0f-1c2d3e4f5a6b",
    "sequence": 0,
    "content": "Thank you, operator. Good afternoon, everyone...",
    "charStart": 0,
    "charEnd": 248
  },
  {
    "id": "a8e2b40d-5c19-4f73-8d6a-9b0c1d2e3f4a",
    "sequence": 1,
    "content": "Data center revenue grew 25% sequentially...",
    "charStart": 250,
    "charEnd": 612
  }
]

Each id is an opaque per-segment identifier: don’t parse it or assume a format. Order segments by sequence, not by id, and reconstruct the document by joining content in that order with a \n\n separator. charStart/charEnd are absolute offsets into that reconstructed text, so you can map a citation to an exact span and highlight it. The offsets are computed at read time from the segment content, never stored, so they always match what you render. For the full document body on its own, see the Documents API.
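
The reconstruction and highlighting described above can be sketched as follows. The Segment interface mirrors the JSON example; reconstruct and highlight are hypothetical helper names. The sketch treats charEnd as exclusive, which is an inference from the sample offsets (248 followed by 250 with a two-character separator), not something stated on this page.

```typescript
// Segment shape taken from the JSON example above.
interface Segment {
  id: string;
  sequence: number;
  content: string;
  charStart: number;
  charEnd: number;
}

// Rebuild the document text: sort by sequence (never by id), join with "\n\n".
function reconstruct(segments: Segment[]): string {
  return segments
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.sequence - b.sequence)
    .map((s) => s.content)
    .join("\n\n");
}

// Wrap the span [charStart, charEnd) in <mark> tags.
// Assumption: charEnd is exclusive, as the sample offsets suggest.
function highlight(text: string, charStart: number, charEnd: number): string {
  return (
    text.slice(0, charStart) +
    "<mark>" +
    text.slice(charStart, charEnd) +
    "</mark>" +
    text.slice(charEnd)
  );
}

// Sample data for illustration, deliberately out of order.
const segments: Segment[] = [
  { id: "b", sequence: 1, content: "Data center revenue grew.", charStart: 12, charEnd: 37 },
  { id: "a", sequence: 0, content: "Thank you.", charStart: 0, charEnd: 10 },
];
const doc = reconstruct(segments);
console.log(highlight(doc, 12, 37));
// Thank you.
//
// <mark>Data center revenue grew.</mark>
```

If you render the result as HTML, escape the document text before inserting the markers.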

Off by default

include_segments adds work and payload, so it’s opt-in. The default retrieve call returns no segments — request them only when you’re rendering source context. It’s independent of enrich.
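
A small sketch of keeping the flag opt-in on the client side. It assumes include_segments goes in the JSON body next to query and top_k, which this page implies but does not show; retrieveBody and RetrieveOptions are hypothetical names.

```typescript
interface RetrieveOptions {
  filters?: Record<string, unknown>;
  topK?: number;
  includeSegments?: boolean;
}

// Build the JSON body for POST /v1/retrieve.
// include_segments is added only when requested, so default calls stay small.
function retrieveBody(query: string, opts: RetrieveOptions = {}): string {
  const body: Record<string, unknown> = { query, top_k: opts.topK ?? 8 };
  if (opts.filters) body["filters"] = opts.filters;
  if (opts.includeSegments) body["include_segments"] = true;
  return JSON.stringify(body);
}

// Model-input call: no segments requested.
const answerCall = retrieveBody("How did data center revenue trend?");
// UI call: segments requested for rendering source context.
const uiCall = retrieveBody("How did data center revenue trend?", { includeSegments: true });
```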

Tips for good grounding

Use filters to keep context on-topic — a tighter retrieval beats a larger, noisier one. Pass the full chunk text to the model (it’s already the right granularity), and use evidenceText for highlighting in your UI rather than as the model input. Keep top_k modest (5–10) for focused answers; raise it only when you need breadth.

Prefer the model to retrieve itself?

If you work in Claude Desktop or Claude Code, the MCP server lets Claude call retrieval as a tool — no pipeline code required.
