## Single URL
Ingest a specific web page by providing its URL.
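A minimal sketch of the request body for ingesting one page. The field names (`type`, `url`) are illustrative assumptions, not the confirmed API contract — see the API Reference for the exact shape.

```python
# Illustrative request body for ingesting a single page.
# Field names are assumptions; check the API Reference.
import json

payload = {
    "type": "URL",
    "url": "https://example.com/docs/quickstart",
}

# POST this body to your ingest endpoint (see the API Reference).
print(json.dumps(payload))
```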
## Crawling

Crawl a website to ingest multiple pages automatically. Agentset follows links from a starting URL and processes each page it discovers.

### Basic crawl
Provide a starting URL to crawl a website.
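A basic crawl request can be sketched the same way as the single-URL case, with a crawl type so links are followed from the starting URL. Again, the field names are assumptions for illustration:

```python
# Illustrative crawl request body; field names are assumptions.
payload = {
    "type": "CRAWL",
    "url": "https://example.com/docs",  # the crawl starts here
}
print(payload)
```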
### Crawl options

Control how the crawler navigates your site with the `options` parameter.
| Option | Type | Default | Description |
|---|---|---|---|
| `maxDepth` | number | 5 | How many links deep to follow from the starting URL. Depth 1 crawls only the initial page. |
| `limit` | number | 50 | Maximum number of pages to crawl. |
| `includePaths` | string[] | — | Only crawl URLs matching these path prefixes. |
| `excludePaths` | string[] | — | Skip URLs matching these path prefixes. |
| `headers` | object | — | Custom HTTP headers to send with requests. |
### Limiting depth and pages
Set `maxDepth` and `limit` to control the scope of your crawl.
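To build intuition for how these two options interact, here is a toy breadth-first walk that models the semantics described in the table (depth 1 visits only the start page; `limit` caps total pages). This is a model of the behavior, not Agentset's crawler:

```python
# Toy breadth-first crawl bounded by max_depth and limit.
# Illustrates the option semantics only; not Agentset's implementation.
from collections import deque

def crawl(start: str, links: dict[str, list[str]],
          max_depth: int = 5, limit: int = 50) -> list[str]:
    """Visit pages breadth-first; depth 1 visits only the start page."""
    visited: list[str] = []
    seen = {start}
    queue = deque([(start, 1)])
    while queue and len(visited) < limit:
        url, depth = queue.popleft()
        visited.append(url)
        if depth < max_depth:  # stop following links at max_depth
            for nxt in links.get(url, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return visited

site = {"/": ["/docs", "/blog"], "/docs": ["/docs/api"]}
print(crawl("/", site, max_depth=1))           # depth 1: start page only
print(crawl("/", site, max_depth=2, limit=2))  # limit caps total pages
```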
### Filtering paths
Use `includePaths` to crawl only specific sections, or `excludePaths` to skip certain areas.
### Authenticated crawling
Pass custom headers to crawl pages that require authentication.
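A sketch of a crawl body that forwards an `Authorization` header on each request. The field names are assumptions for illustration, and `SITE_TOKEN` is a hypothetical environment variable:

```python
# Illustrative crawl body with custom headers for an authenticated site.
# Field names are assumptions; SITE_TOKEN is a hypothetical env var.
import os

payload = {
    "type": "CRAWL",
    "url": "https://internal.example.com/wiki",
    "options": {
        "headers": {
            # Use whatever your site expects (bearer token, cookie, ...).
            "Authorization": f"Bearer {os.environ.get('SITE_TOKEN', '')}",
        },
    },
}
```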
### With metadata

Attach metadata to ingested pages for filtering during search.

URL ingestion and crawls are processed asynchronously. Learn how to check upload status.
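Attaching metadata to a crawl can be sketched as below; every page the crawl ingests would then be filterable by these keys at search time. The `metadata` field name and the keys shown are illustrative assumptions:

```python
# Illustrative crawl body carrying metadata for search-time filtering.
# The "metadata" field name and its keys are assumptions.
payload = {
    "type": "CRAWL",
    "url": "https://example.com/docs",
    "metadata": {"source": "docs-site", "category": "documentation"},
}
```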
## Next steps
- API Reference — Crawl parameters and options
- Document Metadata — Learn more about metadata filtering
- Upload Status — Monitor crawl progress
- Search — Query your crawled content