Firecrawl
Authentication Type: API Key
Description: Extract structured data from websites using AI. Turn any website into LLM-ready data with a single API call.
Extract
Extract structured data from web pages using advanced AI models.
Synchronous Extract
Extract structured data from one or more URLs synchronously. Ideal for single pages or small datasets.
Operation Type: Mutation (Write)
Parameters:
- urls (array of strings, required): Array of URLs to extract from. For best results, use only 1 URL. Supports wildcards with /*
- prompt (string, required): Natural language prompt describing the data you want to extract. Use an empty string if not needed
- enableWebSearch (boolean, required): When true, extraction can follow links outside the specified domain. Defaults to false
- agent (object, nullable): Agent configuration for complex extraction tasks. Use null if not needed
  - model (string): The AI agent model to use for complex extraction tasks. Options: "FIRE-1"
Returns:
- The extracted structured data (flexible JSON structure based on your prompt)
Example Usage:
{
  "urls": ["https://techcrunch.com/2024/01/15/ai-startup-funding"],
  "prompt": "Extract the article title, author, publication date, main content, and any mentioned company names with their funding amounts",
  "enableWebSearch": false,
  "agent": {
    "model": "FIRE-1"
  }
}
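The parameter object above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named build_extract_params; it only builds and checks the object described in the parameter list and does not call Firecrawl. How the object is submitted depends on the client or integration in use.

```python
import json
from typing import Any, Optional


def build_extract_params(
    urls: list[str],
    prompt: str = "",
    enable_web_search: bool = False,
    agent_model: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the extract parameters (hypothetical helper, not part of Firecrawl)."""
    if not urls:
        raise ValueError("urls is required and must contain at least one URL")
    if agent_model is not None and agent_model != "FIRE-1":
        raise ValueError('the only agent model listed in this reference is "FIRE-1"')
    return {
        "urls": list(urls),
        # An empty string stands in for "no prompt", as the parameter list describes.
        "prompt": prompt,
        "enableWebSearch": enable_web_search,
        # null (None) means no agent configuration.
        "agent": {"model": agent_model} if agent_model is not None else None,
    }


params = build_extract_params(
    ["https://techcrunch.com/2024/01/15/ai-startup-funding"],
    prompt="Extract the article title, author, and publication date",
    agent_model="FIRE-1",
)
print(json.dumps(params, indent=2))
```

Keeping the agent argument optional mirrors the nullable agent field: callers who do not need an agent get "agent": null in the serialized object.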
Asynchronous Extract
Start an asynchronous extraction job for large datasets or multiple URLs. Returns a job ID for tracking progress.
Operation Type: Mutation (Write)
Parameters:
- prompt (string, required): Natural language prompt describing the data you want to extract. Use an empty string if not needed
- enableWebSearch (boolean, required): When true, extraction can follow links outside the specified domain. Defaults to false
- agent (object, nullable): Agent configuration for complex extraction tasks. Use null if not needed
  - model (string): The AI agent model to use for complex extraction tasks. Options: "FIRE-1"
- urls (array of strings, required): Array of URLs to extract from. For best results, use only 1 URL. Supports wildcards with /*
Returns:
- id (string): Job ID for tracking the extraction
- status (string): Current job status. Options: "completed", "processing", "failed", "cancelled"
- expiresAt (string): ISO timestamp when the job expires
Example Usage:
{
  "prompt": "Extract all product names, prices, descriptions, and availability status from e-commerce product pages",
  "enableWebSearch": false,
  "agent": {
    "model": "FIRE-1"
  },
  "urls": ["https://example-store.com/products/*"]
}
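The record returned when a job starts (id, status, expiresAt) can be turned into a typed object before it is stored or passed along. This is a rough sketch under stated assumptions: the ExtractJob class and parse_job function are hypothetical names, the sample record is invented for illustration, and treating "completed", "failed", and "cancelled" as final states is an inference from the status list rather than something this reference states.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumption: these statuses mean the job will not change again.
FINAL_STATUSES = {"completed", "failed", "cancelled"}


@dataclass
class ExtractJob:
    id: str
    status: str
    expires_at: datetime

    @property
    def is_finished(self) -> bool:
        return self.status in FINAL_STATUSES


def parse_job(record: dict) -> ExtractJob:
    """Convert the returned record into an ExtractJob (hypothetical helper)."""
    # Older Python versions reject a trailing "Z", so rewrite it as a UTC offset.
    raw_expiry = record["expiresAt"].replace("Z", "+00:00")
    return ExtractJob(
        id=record["id"],
        status=record["status"],
        expires_at=datetime.fromisoformat(raw_expiry),
    )


# Invented sample record, shaped like the Returns list above.
job = parse_job(
    {"id": "job_abc123def456", "status": "processing", "expiresAt": "2024-01-16T12:00:00Z"}
)
print(job.id, job.is_finished, job.expires_at.isoformat())
```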
Check Extraction Status
Check the status and retrieve results of an asynchronous extraction job using the job ID.
Operation Type: Query (Read)
Parameters:
- jobId (string, required): The job ID returned from async extraction
Returns:
- success (boolean): Whether the job completed successfully
- data (any, nullable): The extracted structured data (if job completed)
- status (string): Current job status. Options: "completed", "processing", "failed", "cancelled"
- expiresAt (string): ISO timestamp when the job expires
- error (string, nullable): Error message if job failed
- warning (string, nullable): Warning message if applicable
Example Usage:
{
  "jobId": "job_abc123def456"
}
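An asynchronous job is typically checked repeatedly until it leaves the "processing" state. The loop below is a minimal sketch, assuming a caller-supplied check_status function that accepts the {"jobId": ...} object shown above and returns the fields listed under Returns. The function name wait_for_extraction, the polling interval, and the timeout are illustrative choices, not part of Firecrawl.

```python
import time
from typing import Any, Callable


def wait_for_extraction(
    check_status: Callable[[dict], dict],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll until the job completes, fails, is cancelled, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = check_status({"jobId": job_id})
        status = result.get("status")
        if status == "completed":
            return result.get("data")
        if status in ("failed", "cancelled"):
            # Prefer the error message from the response when one is present.
            raise RuntimeError(result.get("error") or f"extraction {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout} seconds")
        sleep(interval)


# Stubbed status checker standing in for the real operation (illustration only).
_fake_responses = iter(
    [
        {"success": False, "status": "processing", "data": None},
        {"success": True, "status": "completed", "data": {"title": "Example"}},
    ]
)
data = wait_for_extraction(
    lambda params: next(_fake_responses), "job_abc123def456", sleep=lambda seconds: None
)
print(data)
```

Passing the status checker and the sleep function as arguments keeps the loop independent of any particular client and makes it easy to exercise without waiting.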
Extract from Prompt
Extract data using only a natural language prompt without specific URLs. Firecrawl will find and extract from relevant pages.
Operation Type: Mutation (Write)
Parameters:
- prompt (string, required): Natural language prompt for extraction without specific URLs
- enableWebSearch (boolean, required): When true, extraction can search for relevant URLs. Defaults to false
- agent (object, nullable): Agent configuration for complex extraction tasks. Use null if not needed
  - model (string): The AI agent model to use for complex extraction tasks. Options: "FIRE-1"
Returns:
- success (boolean): Whether the extraction was successful
- data (any): The extracted structured data
Example Usage:
{
  "prompt": "Find the latest earnings reports from major tech companies and extract revenue, profit, and growth metrics for Q4 2023",
  "enableWebSearch": true,
  "agent": {
    "model": "FIRE-1"
  }
}
Common Use Cases
Market Research and Competitive Analysis:
- Extract product information, pricing, and specifications from competitor websites for market analysis
- Monitor news articles and press releases for industry trends and company announcements
- Gather customer reviews and ratings from multiple platforms for sentiment analysis
Content Aggregation and Monitoring:
- Extract articles, blog posts, and news content for content curation and research purposes
- Monitor specific websites for changes in pricing, product availability, or policy updates
- Aggregate job listings from multiple career sites with detailed position information and requirements
Data Collection for AI and Analytics:
- Extract structured data from unstructured web pages for training datasets and machine learning models
- Gather financial data, stock information, and market metrics from financial news and reporting sites
- Collect real estate listings with detailed property information for market analysis and valuation models
Automated Business Intelligence:
- Extract contact information and company details from business directories and professional networks
- Monitor regulatory changes and compliance updates from government and industry websites
- Gather event information, conference details, and industry announcements for business planning and networking