RAG Pipelines
The most common Firecrawl integration pattern is feeding web content into retrieval-augmented generation (RAG) systems. Use Crawl to ingest entire documentation sites or knowledge bases, then chunk the markdown output and embed it into your vector database. Firecrawl's clean markdown output eliminates the need for custom HTML parsing and produces consistently structured content that chunks well.
Typical workflow:
- Map the site to discover all URLs
- Batch Scrape the relevant pages
- Chunk the markdown output (Firecrawl returns content that respects heading hierarchy, making it easy to split by sections)
- Embed the chunks into Pinecone/Weaviate/Chroma
- Retrieve relevant chunks when your LLM needs context
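The chunking step can be sketched in a few lines. This is an illustrative heading-aware splitter, not part of the Firecrawl SDK; the function name and `max_chars` limit are our own, and the embedding/retrieval steps are left to your vector database client:

```python
import re

def chunk_markdown(markdown: str, max_chars: int = 2000) -> list[str]:
    """Split markdown into chunks along heading boundaries."""
    # Split just before each ATX heading (#, ##, ...), so every heading
    # stays attached to the section body that follows it.
    sections = re.split(r"(?m)^(?=#{1,6} )", markdown)
    chunks: list[str] = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chars:
            chunks.append(section)
            continue
        # Oversized sections fall back to splitting on paragraph breaks.
        buf = ""
        for para in section.split("\n\n"):
            if buf and len(buf) + len(para) + 2 > max_chars:
                chunks.append(buf)
                buf = para
            else:
                buf = f"{buf}\n\n{para}" if buf else para
        if buf:
            chunks.append(buf)
    return chunks
```

Because Firecrawl's markdown keeps the page's heading hierarchy intact, splitting on headings like this usually yields semantically coherent chunks without any site-specific tuning.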
AI Agent Web Access via MCP
Firecrawl provides a dedicated MCP (Model Context Protocol) server that gives AI coding agents like Claude Code, Cursor, and Windsurf direct access to web scraping. The setup is a single command:
npx -y firecrawl-cli@latest init --all --browser
Once configured, your AI agent can scrape any URL, search the web, crawl sites, and extract structured data as part of its normal workflow. The MCP server exposes all Firecrawl endpoints as tools: firecrawl_scrape, firecrawl_map, firecrawl_search, firecrawl_crawl, firecrawl_extract, firecrawl_agent, and browser management tools.
The hosted remote MCP URL is: https://mcp.firecrawl.dev/{API_KEY}/v2/mcp
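For clients configured via a JSON file rather than the CLI, the hosted URL typically slots into an entry like the following. Exact key names vary by client; the `mcpServers`/`url` shape shown here is a common convention, offered as an assumption rather than a spec:

```json
{
  "mcpServers": {
    "firecrawl": {
      "url": "https://mcp.firecrawl.dev/{API_KEY}/v2/mcp"
    }
  }
}
```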
Framework Integrations
Firecrawl has native integrations with the major LLM frameworks:
- LangChain: the FireCrawlLoader document loader for feeding scraped pages into LangChain RAG chains
- LlamaIndex: the FireCrawlWebReader for ingesting web content into LlamaIndex indices
- OpenAI: function-calling-compatible responses for building OpenAI tool-use agents
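As a sketch of the LangChain path: the helper below wraps the community loader (it assumes the langchain-community package; the `mode` parameter follows the loader's documented interface, but treat exact parameter names as an assumption and the helper itself as hypothetical):

```python
def load_site_docs(url: str, api_key: str):
    """Hypothetical helper: load a site's pages as LangChain Documents.

    Imported lazily so defining this sketch does not require
    langchain-community to be installed.
    """
    from langchain_community.document_loaders import FireCrawlLoader

    # mode="crawl" walks the whole site; "scrape" fetches a single page.
    loader = FireCrawlLoader(url=url, api_key=api_key, mode="crawl")
    return loader.load()
```

The returned Documents carry Firecrawl's markdown as page content plus page metadata, so they drop directly into a standard LangChain splitter-embedder-retriever chain.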
Structured Extraction for Databases
Use the Extract endpoint with Pydantic schemas to pull typed, structured data directly from web pages into your database. For example, extract product names, prices, and descriptions from e-commerce sites into a consistent JSON format. This eliminates manual data cleaning and transformation steps.
Real-Time Context for AI Assistants
Use the Search endpoint to give AI assistants access to current web information. When a user asks a question that requires up-to-date data, search the web via Firecrawl and include the scraped content in the LLM context. This is how many "search-grounded" AI assistants work under the hood.
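The grounding step reduces to prompt assembly. The helper below is an illustrative sketch: it assumes search results have already been fetched as dicts with `url`, `title`, and `markdown` keys (roughly the shape of a search response with scraped content), and both the function name and the budget parameter are our own:

```python
def build_grounded_prompt(question: str, results: list[dict],
                          max_chars: int = 4000) -> str:
    """Assemble an LLM prompt from pre-fetched web search results."""
    sources = []
    used = 0
    for i, r in enumerate(results, 1):
        # Truncate scraped content to stay within the context budget.
        snippet = r.get("markdown", "")[: max_chars - used]
        if not snippet:
            break
        used += len(snippet)
        sources.append(f'[{i}] {r.get("title", "")} ({r.get("url", "")})\n{snippet}')
    context = "\n\n".join(sources)
    return (
        "Answer using only the sources below; cite them by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The numbered source list lets the assistant cite which page each claim came from, which is the core of the search-grounded pattern.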