Chunk size
Set `chunkSize` to control the target number of characters each chunk contains. Smaller chunks are more precise, while larger chunks preserve more context. The default is 2048 characters.
Chunk boundaries are adjusted to preserve semantic coherence. Chunk sizes are designed to be close to the target value, but vary to achieve optimal splits.
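To make the tradeoff concrete, here is a minimal sketch that estimates how many chunks a document yields for a given target size. The `ChunkingConfig` shape and the `estimateChunkCount` helper are hypothetical and exist only for illustration; only the `chunkSize` name and its 2048-character default come from the text above. Because boundaries are adjusted for coherence, the real count may differ from this estimate.

```typescript
// Hypothetical config shape for illustration; not the library's actual type.
interface ChunkingConfig {
  chunkSize?: number; // target characters per chunk
}

// Default stated in the documentation above.
const DEFAULT_CHUNK_SIZE = 2048;

// Rough estimate of the number of chunks for a text of the given length.
// Actual chunk sizes vary around the target, so treat this as approximate.
function estimateChunkCount(textLength: number, config: ChunkingConfig = {}): number {
  const chunkSize = config.chunkSize ?? DEFAULT_CHUNK_SIZE;
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  return Math.ceil(textLength / chunkSize);
}

// A 10,000-character document at the default size versus a smaller target.
const withDefault = estimateChunkCount(10000);
const withSmaller = estimateChunkCount(10000, { chunkSize: 512 });
```

Under this estimate, quartering the chunk size roughly quadruples the number of chunks, so each one covers a narrower slice of the document.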
Processing mode
Set `mode` to control the tradeoff between speed and accuracy when processing documents.
| Mode | Description |
|---|---|
| `fast` | Fastest processing, suitable for simple documents |
| `balanced` | Default. Good balance of speed and quality |
| `accurate` | Best layout detection, ideal for complex documents with tables or figures |
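The three modes in the table can be captured as a string union so that a typo is caught before a request is sent. This is a sketch only: `ProcessingMode`, `PROCESSING_MODES`, and `resolveMode` are hypothetical names, and the only facts taken from the table are the three mode values and that `balanced` is the default.

```typescript
// Mode values from the table above; the type and helper names are assumptions.
const PROCESSING_MODES = ["fast", "balanced", "accurate"] as const;
type ProcessingMode = (typeof PROCESSING_MODES)[number];

// The table lists "balanced" as the default.
const DEFAULT_MODE: ProcessingMode = "balanced";

// Returns a valid mode, falling back to the default when none is given
// and rejecting anything that is not one of the three listed values.
function resolveMode(value?: string): ProcessingMode {
  if (value === undefined) return DEFAULT_MODE;
  const match = PROCESSING_MODES.find((m) => m === value);
  if (match === undefined) {
    throw new Error(
      `Unknown mode "${value}"; expected one of: ${PROCESSING_MODES.join(", ")}`
    );
  }
  return match;
}

const implicitMode = resolveMode();           // falls back to "balanced"
const explicitMode = resolveMode("accurate"); // passes through unchanged
```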
LLM-assisted parsing
Enable `useLlm` to improve extraction of tables, forms, inline math, and complex layouts. This is enabled by default.
Force OCR
Set `forceOcr` to run OCR even when selectable text exists. This is useful for scanned documents where the embedded text layer is unreliable.
Language
Specify `languageCode` to optimize text processing for a specific language. If omitted, the language is detected automatically.
Combining options
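As a rough illustration, the options described on this page can be modeled as one object merged over their defaults. The `ProcessingConfig` type and `buildConfig` helper below are hypothetical, not the library's API. The defaults for `chunkSize`, `mode`, and `useLlm` follow this page; the `false` default for `forceOcr` and the two-letter format of `languageCode` are assumptions.

```typescript
type ProcessingMode = "fast" | "balanced" | "accurate";

// Hypothetical shape gathering the options documented on this page.
interface ProcessingConfig {
  chunkSize: number;     // target characters per chunk
  mode: ProcessingMode;  // speed versus accuracy
  useLlm: boolean;       // LLM-assisted parsing
  forceOcr: boolean;     // run OCR even when selectable text exists
  languageCode?: string; // omitted means automatic detection
}

const DEFAULTS: ProcessingConfig = {
  chunkSize: 2048,  // documented default
  mode: "balanced", // documented default
  useLlm: true,     // documented as enabled by default
  forceOcr: false,  // assumption: the page does not state this default
};

// Later keys win, so any option the caller sets replaces its default.
function buildConfig(overrides: Partial<ProcessingConfig> = {}): ProcessingConfig {
  return { ...DEFAULTS, ...overrides };
}

// Example: a scanned German report with complex tables.
// The "de" code format is an assumption for illustration.
const scannedReport = buildConfig({
  mode: "accurate",
  forceOcr: true,
  languageCode: "de",
});
```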
Pass multiple config options together to customize processing.
Next steps
- API Reference — Chunking parameters and options
- Document Metadata — Attach metadata for filtering and citations
- Search — Query your uploaded content
- Ranking — Configure result ranking