Open Source vs. Cloud: What You Get with Each

AGPL 3.0 (open source) / Cloud hosted

Open Source (Self-Hosted)

Firecrawl's core is open source under AGPL 3.0 and available at github.com/firecrawl/firecrawl, which has 83,000+ stars and 138+ contributors. You can self-host it using Docker Compose. The self-hosted version includes the basic scraping engine, markdown conversion, crawling logic, and API structure.

However, the repository README explicitly describes the self-hosted version as "not fully ready for self hosted deployment". It is missing several production-critical features that are available only in the cloud version.
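
As a rough illustration of the API structure mentioned above, the sketch below builds a scrape request for a self-hosted instance without sending it. The base URL (http://localhost:3002), the /v1/scrape path, and the payload field names are assumptions for illustration, not details taken from this article; check them against the API version your instance exposes. The function name build_scrape_request is hypothetical.

```python
import json
import urllib.request

# Assumption: a self-hosted instance listening locally; adjust host and port
# to match your Docker Compose setup.
BASE_URL = "http://localhost:3002"


def build_scrape_request(target_url, base_url=BASE_URL, api_key=None):
    """Build (but do not send) a POST request asking for a page as markdown."""
    # Assumption: endpoint path and payload field names; verify them against
    # the API version your instance exposes.
    payload = {"url": target_url, "formats": ["markdown"]}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: bearer-token auth, as a hosted API would typically
        # require; a local instance may not need a key at all.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    req = build_scrape_request("https://example.com")
    print(req.get_method(), req.full_url)
    # To actually send it (requires a running instance):
    # with urllib.request.urlopen(req, timeout=60) as resp:
    #     print(json.load(resp))
```

Keeping request construction separate from sending makes it easy to point the same code at a local instance or a hosted endpoint by changing only base_url and api_key.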

Cloud Only Features

  • Fire engine: Firecrawl's proprietary scraping engine with significantly higher success rates on complex websites
  • Stealth proxies: Rotating proxy infrastructure with anti-bot bypass capabilities
  • Actions: Click, scroll, type, and interact with pages before extraction
  • Dashboard and analytics: Usage tracking, credit management, and activity logs
  • Enhanced mode: Higher success rate proxies for difficult websites
  • Agent endpoint: The autonomous AI research agent
  • Browser Sandbox: Managed browser environments for AI agents
  • SOC 2 Type II compliance: Enterprise security certification

When to Self-Host

Self-hosting makes sense when you have strict data residency requirements, need to process sensitive content that cannot leave your infrastructure, or want to modify the scraping logic for specialized use cases. For most production workloads, the cloud API is recommended because of the significant feature gap and the operational overhead of maintaining scraping infrastructure.
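
The trade-off above can be sketched as a small decision helper. This is only an illustration of the reasoning in this article, not an official tool: the feature labels, the function name recommend_deployment, and the "conflict" outcome are all hypothetical.

```python
# Cloud-only capabilities as listed in this article (labels are illustrative).
CLOUD_ONLY = frozenset({
    "fire_engine", "stealth_proxies", "actions", "dashboard",
    "enhanced_mode", "agent_endpoint", "browser_sandbox", "soc2",
})


def recommend_deployment(required_features, data_must_stay_internal=False,
                         needs_custom_scraping_logic=False):
    """Return (recommendation, blocked_features) for a set of requirements.

    blocked_features lists the required features that a self-hosted
    deployment would lack; it is only non-empty for a "conflict".
    """
    blocked = sorted(set(required_features) & CLOUD_ONLY)
    wants_self_host = data_must_stay_internal or needs_custom_scraping_logic
    if wants_self_host and blocked:
        # Self-hosting is called for, but some required features are cloud only.
        return "conflict", blocked
    if wants_self_host:
        return "self-hosted", []
    # No self-hosting driver: the article recommends the cloud API.
    return "cloud", []


if __name__ == "__main__":
    print(recommend_deployment({"actions"}, data_must_stay_internal=True))
```

A "conflict" result means the requirements pull in both directions, for example strict data residency combined with a need for Actions, and has to be resolved by dropping a requirement rather than by configuration.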

Licensing Note

The core repository is licensed under AGPL 3.0, which requires derivative works to be released under AGPL 3.0 when they are distributed and, under its network-use clause, when a modified version is made available to users over a network. The SDKs and some UI components use the more permissive MIT license. If AGPL is a concern for your use case, the cloud API avoids the issue entirely, since you are consuming a service rather than distributing the software.