# GitHub Action
ScrapingAnt GitHub Action lets you scrape any URL directly from your GitHub Actions workflows — get HTML, Markdown, or AI-extracted structured data with rotating proxies and headless Chrome.
## Installation
No installation required. Add the action to any workflow file in your repository:
```yaml
- name: Scrape a webpage
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
```
## Credentials setup
- Register at ScrapingAnt and get your API key (10,000 free credits/month)
- In your GitHub repository, go to Settings > Secrets and variables > Actions
- Click New repository secret
- Name it `SCRAPINGANT_API_KEY` and paste your API key
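With the secret in place, a complete minimal workflow might look like the following sketch (the file name and the `workflow_dispatch` trigger are illustrative choices, not requirements):

```yaml
# .github/workflows/scrape.yml (illustrative file name)
name: Scrape example
on:
  workflow_dispatch:   # run manually from the Actions tab
jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - name: Scrape a webpage
        uses: scrapingant/scrape-action@v1
        with:
          api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
          url: 'https://example.com'
```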
## Available operations
The action supports three output modes via the `output-type` input:
| Operation | API Endpoint | Description |
|---|---|---|
| HTML (`html`) | `/v2/general` | Scrape a webpage and return raw HTML. Supports headless Chrome, datacenter/residential proxies, and country targeting. |
| Markdown (`markdown`) | `/v2/markdown` | Convert a webpage to clean Markdown — optimized for LLM and RAG pipelines. |
| AI Extract (`extract`) | `/v2/extract` | Extract structured data from any page using natural language. Describe what you need and get JSON back. |
## Usage examples
### HTML scraping
```yaml
- name: Scrape HTML
  id: scrape
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    output-type: 'html'
    browser: 'false'

- name: Use result
  run: echo "${{ steps.scrape.outputs.content }}" | head -20
```
### Markdown for LLM/AI
```yaml
- name: Get page as Markdown
  id: markdown
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    output-type: 'markdown'

- name: Feed to AI
  run: echo "${{ steps.markdown.outputs.content }}"
```
### AI data extraction
```yaml
- name: Extract product data
  id: extract
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://www.amazon.com/dp/B0EXAMPLE'
    output-type: 'extract'
    extract-properties: 'product title, price, rating, availability'

- name: Use extracted JSON
  run: echo "${{ steps.extract.outputs.content }}"
```
### Save output to file
```yaml
- name: Scrape and save
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    output-type: 'markdown'
    output-file: 'scraped-content.md'
```
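The saved file lives only in the job's workspace. If you want to keep it after the job finishes, one option is to upload it as a workflow artifact with the standard `actions/upload-artifact` action, for example:

```yaml
- name: Upload scraped content
  uses: actions/upload-artifact@v4
  with:
    name: scraped-content
    path: scraped-content.md
```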
## Inputs
| Input | Required | Default | Description |
|---|---|---|---|
| `api-key` | Yes | — | ScrapingAnt API key |
| `url` | Yes | — | URL to scrape |
| `output-type` | No | `html` | Response type: `html`, `markdown`, or `extract` |
| `extract-properties` | No | — | Comma-separated fields to extract (for `extract` mode) |
| `browser` | No | `true` | Enable headless Chrome JS rendering |
| `proxy-type` | No | `datacenter` | Proxy type: `datacenter` or `residential` |
| `proxy-country` | No | — | Two-letter country code for geo-targeting (e.g. `us`, `uk`, `de`) |
| `timeout` | No | `60` | Maximum request time in seconds (5–60) |
| `output-file` | No | — | File path to save the output to |
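These inputs can be combined freely. As a sketch, a step that scrapes through a German residential proxy with a shorter timeout might look like this (values are illustrative):

```yaml
- name: Scrape via German residential proxy
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    proxy-type: 'residential'
    proxy-country: 'de'
    timeout: '30'
```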
## Outputs
| Output | Description |
|---|---|
| `content` | Scraped content (HTML, Markdown, or JSON, depending on `output-type`) |
| `status-code` | HTTP status code from the ScrapingAnt API |
| `url` | Final URL after redirects (for the `markdown` output type) |
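The `status-code` output can guard later steps. A minimal sketch, assuming you want the workflow to fail on anything other than a 200 response:

```yaml
- name: Scrape
  id: scrape
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'

- name: Fail on non-200 status
  if: steps.scrape.outputs.status-code != '200'
  run: exit 1
```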
## Real-world use cases
- Competitor price monitoring — schedule a cron workflow to scrape product pages daily and save results to your repo
- Content change detection — scrape a page, compare with the previous version, and alert on diff
- AI-powered data pipeline — scrape, extract structured data, and feed to an LLM in the next workflow step
- SEO monitoring — scrape your pages as Markdown and check for content issues
- Post-deploy verification — verify your site renders correctly after each deployment
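The price-monitoring case, for instance, could be driven by a cron trigger. A sketch, with an illustrative schedule, URL, and property list:

```yaml
name: Daily price check
on:
  schedule:
    - cron: '0 6 * * *'   # every day at 06:00 UTC
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - name: Extract price
        id: extract
        uses: scrapingant/scrape-action@v1
        with:
          api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
          url: 'https://example.com/product'
          output-type: 'extract'
          extract-properties: 'price'
      - name: Show result
        run: echo "${{ steps.extract.outputs.content }}"
```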
## Credit costs
Credit costs vary by rendering mode and proxy type; see the credit cost reference for current pricing. With `browser: true` (the default), each request costs 10 credits; with `browser: false`, each request costs 1 credit. Only successful responses are charged.
## Resources
## Issues and tracking
To help us improve the service, please don't hesitate to open an issue or feature request on GitHub.