Version: v2

GitHub Action

The ScrapingAnt GitHub Action lets you scrape any URL directly from your GitHub Actions workflows and get back HTML, Markdown, or AI-extracted structured data, using rotating proxies and headless Chrome.

Installation

No installation required. Add the action to any workflow file in your repository:

- name: Scrape a webpage
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
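For orientation, the step above sits inside a job in a workflow file. The following is a minimal sketch of a complete file, assuming a GitHub-hosted Linux runner; the file path, the workflow and job names, and the manual trigger are illustrative choices, not requirements of the action.

```yaml
# .github/workflows/scrape.yml (hypothetical file name)
name: Scrape example

on:
  workflow_dispatch: # run manually from the Actions tab

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - name: Scrape a webpage
        uses: scrapingant/scrape-action@v1
        with:
          api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
          url: 'https://example.com'
```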

Credentials setup

  1. Register at ScrapingAnt and get your API key (10,000 free credits/month)
  2. In your GitHub repository, go to Settings > Secrets and variables > Actions
  3. Click New repository secret
  4. Name it SCRAPINGANT_API_KEY and paste your API key

Available operations

The action supports three output modes via the output-type input:

| Operation | API Endpoint | Description |
| --- | --- | --- |
| HTML (html) | /v2/general | Scrape a webpage and return raw HTML. Supports headless Chrome, datacenter/residential proxies, and country targeting. |
| Markdown (markdown) | /v2/markdown | Convert a webpage to clean Markdown, optimized for LLM and RAG pipelines. |
| AI Extract (extract) | /v2/extract | Extract structured data from any page using natural language. Describe what you need and get back JSON. |

Usage examples

HTML scraping

- name: Scrape HTML
  id: scrape
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    output-type: 'html'
    browser: 'false'

- name: Use result
  env:
    CONTENT: ${{ steps.scrape.outputs.content }}
  run: printf '%s\n' "$CONTENT" | head -20

Markdown for LLM/AI

- name: Get page as Markdown
  id: markdown
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    output-type: 'markdown'

- name: Feed to AI
  env:
    CONTENT: ${{ steps.markdown.outputs.content }}
  run: printf '%s\n' "$CONTENT"

AI data extraction

- name: Extract product data
  id: extract
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://www.amazon.com/dp/B0EXAMPLE'
    output-type: 'extract'
    extract-properties: 'product title, price, rating, availability'

- name: Use extracted JSON
  env:
    CONTENT: ${{ steps.extract.outputs.content }}
  run: printf '%s\n' "$CONTENT"
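Because the extract mode returns JSON, a follow-up step can process it with jq, which is preinstalled on GitHub-hosted Ubuntu runners. The sketch below only pretty-prints the result; the step name and the EXTRACTED variable are illustrative, and the exact keys in the JSON depend on your extract-properties, so none are assumed here.

```yaml
- name: Pretty-print extracted JSON
  env:
    # Passing the output through an environment variable avoids
    # shell-quoting problems when scraped text contains quotes.
    EXTRACTED: ${{ steps.extract.outputs.content }}
  run: printf '%s' "$EXTRACTED" | jq .
```

If the content is not valid JSON, jq exits with a non-zero status and the step fails, which also serves as a basic sanity check.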

Save output to file

- name: Scrape and save
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    output-type: 'markdown'
    output-file: 'scraped-content.md'
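A file written via output-file exists only on the runner unless a later step stores it somewhere. One option, sketched here, is to upload it as a workflow artifact with the actions/upload-artifact action. The artifact name is an arbitrary choice, and the path assumes the file is written relative to the workspace root.

```yaml
- name: Upload scraped content
  uses: actions/upload-artifact@v4
  with:
    name: scraped-content      # hypothetical artifact name
    path: scraped-content.md   # must match the output-file value
```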

Inputs

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| api-key | Yes | | ScrapingAnt API key |
| url | Yes | | URL to scrape |
| output-type | No | html | Response type: html, markdown, or extract |
| extract-properties | No | | Comma-separated fields to extract (for extract mode) |
| browser | No | true | Enable headless Chrome JS rendering |
| proxy-type | No | datacenter | Proxy type: datacenter or residential |
| proxy-country | No | | Two-letter country code for geo-targeting (e.g. us, uk, de) |
| timeout | No | 60 | Max request time in seconds (5–60) |
| output-file | No | | File path to save output |
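As an illustration of how the optional inputs combine, the following sketch requests a JavaScript-rendered page through a residential proxy in Germany with a shorter timeout. All input names come from the table above; the specific values and the step name are chosen only for the example.

```yaml
- name: Scrape with residential proxy
  id: scrape
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'
    output-type: 'html'
    browser: 'true'            # default; shown for clarity
    proxy-type: 'residential'
    proxy-country: 'de'
    timeout: '30'              # seconds, within the 5-60 range
```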

Outputs

| Output | Description |
| --- | --- |
| content | Scraped content (HTML, Markdown, or JSON depending on output-type) |
| status-code | HTTP status code from ScrapingAnt API |
| url | Final URL after redirects (for markdown output-type) |
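The status-code output can gate later steps. The sketch below fails the job when the reported status is anything other than 200. Whether the action itself already fails the step on an error response is not stated here, so treat this as a defensive check; the step names and the STATUS variable are illustrative.

```yaml
- name: Scrape
  id: scrape
  uses: scrapingant/scrape-action@v1
  with:
    api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
    url: 'https://example.com'

- name: Fail on unexpected status
  # Step outputs are strings, so compare against '200'
  if: steps.scrape.outputs.status-code != '200'
  env:
    STATUS: ${{ steps.scrape.outputs.status-code }}
  run: |
    echo "Unexpected status code: $STATUS"
    exit 1
```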

Real-world use cases

  • Competitor price monitoring — schedule a cron workflow to scrape product pages daily and save results to your repo
  • Content change detection — scrape a page, compare with the previous version, and alert on diff
  • AI-powered data pipeline — scrape, extract structured data, and feed to an LLM in the next workflow step
  • SEO monitoring — scrape your pages as Markdown and check for content issues
  • Post-deploy verification — verify your site renders correctly after each deployment
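As a rough sketch of the scheduled monitoring and change detection ideas above, the workflow below scrapes a page once a day and commits the snapshot only when it differs from the previous one. The file names, schedule, commit message, and target URL are assumptions made for illustration, and it presumes the action writes output-file relative to the checked-out workspace.

```yaml
# .github/workflows/monitor.yml (hypothetical file name)
name: Daily page monitor

on:
  schedule:
    - cron: '0 6 * * *' # every day at 06:00 UTC

permissions:
  contents: write # lets the job push the updated snapshot

jobs:
  monitor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Scrape page as Markdown
        uses: scrapingant/scrape-action@v1
        with:
          api-key: ${{ secrets.SCRAPINGANT_API_KEY }}
          url: 'https://example.com'
          output-type: 'markdown'
          output-file: 'page-snapshot.md'

      - name: Commit snapshot if it changed
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git add page-snapshot.md
          # git diff --cached --quiet exits non-zero when staged changes exist
          if ! git diff --cached --quiet; then
            git commit -m "Update page snapshot"
            git push
          fi
```

Keeping the snapshot in the repository means the commit history doubles as a change log, and any notification step can be added inside the same if block.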

Credit costs

Credit costs vary by rendering mode and proxy type. See the credit cost reference for current pricing.

browser: true (default) costs 10 credits per request. browser: false costs 1 credit per request. Only successful responses are charged.

Resources

Issues and tracking

To help us improve the service, please open an issue or feature request on GitHub.