
# Observability

> Trace and monitor your RAG pipeline

RAG systems have multiple failure modes. When answers are wrong, the problem could be poor retrieval, LLM hallucination, or a weak system prompt. Without visibility into each step, you're debugging blind.

Observability helps you:

* **Find root causes** — Determine if bad answers stem from missing chunks or generation errors.
* **Improve accuracy** — Identify failing queries, then adjust your prompts or search configuration.
* **Track latency** — Measure time spent in retrieval vs generation to optimize the right component.
* **Collect feedback** — Use thumbs up/down signals from users to surface problem areas.
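Before wiring up a tracing tool, the latency split can be checked with a few lines of plain Python. This is a minimal sketch, not part of the Agentset or Langfuse APIs: `timed`, `fake_retrieve`, `fake_generate`, and `rag_bot_timed` are hypothetical names, and the two `fake_*` functions stand in for your real search and LLM calls.

```python
import time
from contextlib import contextmanager
from typing import Dict, List, Tuple


@contextmanager
def timed(step: str, timings: Dict[str, float]):
    """Add the wall-clock duration of the wrapped block to timings[step]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[step] = timings.get(step, 0.0) + (time.perf_counter() - start)


def fake_retrieve(question: str) -> List[str]:
    # Hypothetical stand-in for a search call.
    time.sleep(0.01)
    return [f"chunk about {question}"]


def fake_generate(question: str, chunks: List[str]) -> str:
    # Hypothetical stand-in for an LLM call.
    time.sleep(0.02)
    return f"Answer to {question!r} using {len(chunks)} chunk(s)"


def rag_bot_timed(question: str) -> Tuple[str, Dict[str, float]]:
    """Run both steps and return the answer with per-step timings in seconds."""
    timings: Dict[str, float] = {}
    with timed("retrieval", timings):
        chunks = fake_retrieve(question)
    with timed("generation", timings):
        answer = fake_generate(question, chunks)
    return answer, timings
```

Comparing `timings["retrieval"]` with `timings["generation"]` across a sample of queries shows which component is worth optimizing first.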

Agentset integrates with observability tools like [Langfuse](https://langfuse.com) and [Helicone](https://helicone.ai) to trace your RAG pipeline.

## Tracing with Langfuse

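In Python, wrap your RAG function with the `@observe` decorator to trace each invocation. LLM calls are traced automatically when you use the [Langfuse OpenAI wrapper](https://langfuse.com/docs/integrations/openai). In TypeScript, set `experimental_telemetry` on the AI SDK call so it emits OpenTelemetry spans, which Langfuse can ingest once an OpenTelemetry exporter for Langfuse is registered in your app.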

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { Agentset } from "agentset";
  import { generateText } from "ai";
  import { openai } from "@ai-sdk/openai";

  const agentset = new Agentset();
  const ns = agentset.namespace("YOUR_NAMESPACE_ID");

  async function ragBot(question: string) {
    const results = await ns.search(question);
    const context = results.map((r) => r.text).join("\n\n");

    const { text } = await generateText({
      model: openai("gpt-5.1"),
      system: `Answer based on this context:\n\n${context}`,
      prompt: question,
      experimental_telemetry: { isEnabled: true },
    });

    return text;
  }
  ```

  ```python Python theme={null}
  from langfuse.openai import openai
  from langfuse import observe
  from agentset import Agentset

  client = Agentset(namespace_id="YOUR_NAMESPACE_ID")

  @observe()
  def rag_bot(question: str):
      results = client.search.execute(query=question)
      context = "\n\n".join([r.text for r in results.data])

      response = openai.responses.create(
          model="gpt-5.1",
          input=[
              {"role": "system", "content": f"Answer based on this context:\n\n{context}"},
              {"role": "user", "content": question},
          ],
      )

      return response.output_text
  ```
</CodeGroup>

## Logging search results

To inspect retrieval quality separately, log search results as a span within your trace.

<CodeGroup>
  ```typescript TypeScript theme={null}
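  import { Agentset } from "agentset";
  import { Langfuse } from "langfuse";

  const agentset = new Agentset();
  const ns = agentset.namespace("YOUR_NAMESPACE_ID");
  // Reads Langfuse credentials from environment variables
  const langfuse = new Langfuse();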

  async function ragBot(question: string) {
    const trace = langfuse.trace({ name: "rag-bot" });

    const retrievalSpan = trace.span({ name: "retrieval", input: question });
    const results = await ns.search(question);
    retrievalSpan.end({ output: results });

    // Continue with LLM call...
  }
  ```

  ```python Python theme={null}
  from agentset import Agentset
  from langfuse import observe, get_client

  client = Agentset(namespace_id="YOUR_NAMESPACE_ID")
  langfuse = get_client()

  @observe()
  def rag_bot(question: str):
      with langfuse.start_as_current_observation(name="retrieval", input=question) as span:
          results = client.search.execute(query=question)
          span.update(output=[{"text": r.text, "score": r.score} for r in results.data])

      # Continue with LLM call...
  ```
</CodeGroup>
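Full chunk text can make span outputs hard to scan. The sketch below shows one way to condense results before logging them, assuming each result exposes `text` and `score` as in the examples above. `Result`, `summarize_results`, `max_chars`, and `min_score` are hypothetical names, and the `0.5` threshold is an arbitrary placeholder, since score scales depend on your search and ranking configuration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass
class Result:
    # Hypothetical stand-in for a search result with `text` and `score`.
    text: str
    score: float


def summarize_results(
    results: Iterable[Any], max_chars: int = 200, min_score: float = 0.5
) -> Dict[str, Any]:
    """Build a compact, JSON-friendly summary of search results for a span."""
    results = list(results)
    scores = [r.score for r in results]
    return {
        "count": len(results),
        "top_score": max(scores) if scores else None,
        # Number of results scoring below the (assumed) relevance threshold.
        "low_score_count": sum(1 for s in scores if s < min_score),
        "results": [
            {"rank": i, "score": round(r.score, 3), "text": r.text[:max_chars]}
            for i, r in enumerate(results, start=1)
        ],
    }
```

The returned dictionary can be passed as the span output in place of the raw result list. A trace with `count` of zero or a high `low_score_count` points at retrieval rather than generation.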

## Tracing with Helicone

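Helicone traces LLM calls through a proxy. Point your OpenAI client's base URL at Helicone and pass your Helicone API key in the `Helicone-Auth` header. Requests are then logged as they pass through the proxy on their way to OpenAI.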

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { Agentset } from "agentset";
  import { generateText } from "ai";
  import { createOpenAI } from "@ai-sdk/openai";

  const agentset = new Agentset();
  const ns = agentset.namespace("YOUR_NAMESPACE_ID");

  const openai = createOpenAI({
    baseURL: "https://oai.helicone.ai/v1",
    headers: {
      "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
    },
  });

  async function ragBot(question: string) {
    const results = await ns.search(question);
    const context = results.map((r) => r.text).join("\n\n");

    const { text } = await generateText({
      model: openai("gpt-5.1"),
      system: `Answer based on this context:\n\n${context}`,
      prompt: question,
    });

    return text;
  }
  ```

  ```python Python theme={null}
  from openai import OpenAI
  from agentset import Agentset
  import os

  client = Agentset(namespace_id="YOUR_NAMESPACE_ID")

  openai = OpenAI(
      base_url="https://oai.helicone.ai/v1",
      default_headers={"Helicone-Auth": f"Bearer {os.environ['HELICONE_API_KEY']}"},
  )

  def rag_bot(question: str):
      results = client.search.execute(query=question)
      context = "\n\n".join([r.text for r in results.data])

      response = openai.responses.create(
          model="gpt-5.1",
          input=[
              {"role": "system", "content": f"Answer based on this context:\n\n{context}"},
              {"role": "user", "content": question},
          ],
      )

      return response.output_text
  ```
</CodeGroup>
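To filter proxied requests by pipeline or environment, the headers can be built by a small helper. This sketch assumes Helicone's custom-property header convention (`Helicone-Property-<Name>`). Confirm the exact header names against the Helicone documentation before relying on them. `helicone_headers` and the property names used below are hypothetical.

```python
from typing import Dict, Optional


def helicone_headers(
    api_key: str, properties: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return the auth header plus one custom-property header per entry."""
    headers = {"Helicone-Auth": f"Bearer {api_key}"}
    for name, value in (properties or {}).items():
        # Assumed convention: each property is sent as its own header.
        headers[f"Helicone-Property-{name}"] = str(value)
    return headers
```

The result can be passed as `default_headers` when constructing the `OpenAI` client, for example `helicone_headers(os.environ["HELICONE_API_KEY"], {"Pipeline": "rag-bot"})`.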

## Next steps

* [Search](/search-and-retrieval/search) — Configure search parameters
* [Ranking](/search-and-retrieval/ranking) — Improve retrieval quality with reranking
* [Data Segregation](/production/data-segregation) — Isolate data for multi-tenant applications
