RAG systems have multiple failure modes. When answers are wrong, the problem could be poor retrieval, LLM hallucination, or a weak system prompt. Without visibility into each step, you’re debugging blind. Observability helps you:
  • Find root causes — Determine if bad answers stem from missing chunks or generation errors.
  • Improve accuracy — Identify failing queries and adjust your prompts or search configuration.
  • Track latency — Measure time spent in retrieval vs generation to optimize the right component.
  • Collect feedback — Use thumbs up/down signals from users to surface problem areas.
Agentset integrates with observability tools like Langfuse and Helicone to trace your RAG pipeline.

Tracing with Langfuse

Enable the AI SDK's experimental_telemetry option to automatically trace each LLM call. Langfuse ingests these spans through its OpenTelemetry exporter, shown after the example.
import { Agentset } from "agentset";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const agentset = new Agentset();
const ns = agentset.namespace("YOUR_NAMESPACE_ID");

async function ragBot(question: string) {
  // Retrieve the most relevant chunks from the namespace
  const results = await ns.search(question);
  const context = results.map((r) => r.text).join("\n\n");

  const { text } = await generateText({
    model: openai("gpt-5.1"),
    system: `Answer based on this context:\n\n${context}`,
    prompt: question,
    // Emit an OpenTelemetry span for this call
    experimental_telemetry: { isEnabled: true },
  });

  return text;
}
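
For these telemetry spans to reach Langfuse, register its exporter with the OpenTelemetry SDK once at application startup. A minimal sketch, assuming the langfuse-vercel package and Langfuse credentials in the LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables:

import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseExporter } from "langfuse-vercel";

// Forward spans emitted by experimental_telemetry to your Langfuse project
const sdk = new NodeSDK({
  traceExporter: new LangfuseExporter(),
});
sdk.start();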

Logging search results

To inspect retrieval quality separately, log search results as a span within your trace.
import { Langfuse } from "langfuse";

const langfuse = new Langfuse();

async function ragBot(question: string) {
  const trace = langfuse.trace({ name: "rag-bot" });

  const retrievalSpan = trace.span({ name: "retrieval", input: question });
  const results = await ns.search(question);
  retrievalSpan.end({ output: results });

  // Continue with LLM call...
}
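
One way to complete the function is to record the LLM call as a generation on the same trace and, when the user reacts, attach a score to it. A sketch, reusing the openai and generateText imports from the first example; the "user-feedback" score name is an arbitrary label:

  const context = results.map((r) => r.text).join("\n\n");

  // Record the LLM call as a generation on the same trace
  const generation = trace.generation({
    name: "generation",
    model: "gpt-5.1",
    input: question,
  });

  const { text } = await generateText({
    model: openai("gpt-5.1"),
    system: `Answer based on this context:\n\n${context}`,
    prompt: question,
  });

  generation.end({ output: text });

  // Later, map a thumbs up/down to a score on this trace
  langfuse.score({ traceId: trace.id, name: "user-feedback", value: 1 });

  return text;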

Tracing with Helicone

Helicone traces LLM calls through a proxy. Change your OpenAI base URL to route requests through Helicone.
import { Agentset } from "agentset";
import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";

const agentset = new Agentset();
const ns = agentset.namespace("YOUR_NAMESPACE_ID");

// Route OpenAI requests through the Helicone proxy
const openai = createOpenAI({
  baseURL: "https://oai.helicone.ai/v1",
  headers: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});

async function ragBot(question: string) {
  const results = await ns.search(question);
  const context = results.map((r) => r.text).join("\n\n");

  const { text } = await generateText({
    model: openai("gpt-5.1"),
    system: `Answer based on this context:\n\n${context}`,
    prompt: question,
  });

  return text;
}
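
Helicone can also segment and group requests with custom headers, passed per call through generateText's headers option. A sketch, assuming a sessionId you track per conversation; the "Feature" property name is an arbitrary example:

const { text } = await generateText({
  model: openai("gpt-5.1"),
  system: `Answer based on this context:\n\n${context}`,
  prompt: question,
  // Forwarded with the request to the Helicone proxy
  headers: {
    "Helicone-Session-Id": sessionId, // groups a multi-turn conversation
    "Helicone-Property-Feature": "rag-bot", // custom property for filtering
  },
});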

Next steps

  • Search — Configure search parameters
  • Ranking — Improve retrieval quality with reranking
  • Data Segregation — Isolate data for multi-tenant applications