Version: v2

Markdown Transformation Endpoint

Welcome to the documentation for ScrapingAnt's new endpoint, which extracts data from websites and automatically converts the HTML output into Markdown. This feature is particularly useful for feeding results into Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) pipelines.

Endpoint Description

The Markdown Transformation Endpoint provides a seamless way to scrape web content, converting it directly from HTML to Markdown. This simplifies the process of integrating scraped data into text-based models and applications.

Features

  • HTML to Markdown Conversion: Automatically converts the extracted HTML content to Markdown, maintaining the essential structure and style in a simpler text format.
  • Easy Integration with LLMs and RAG: The output in Markdown format is ready to be used with various language models and retrieval systems without additional processing.

Pricing

The Markdown Transformation Endpoint is available as part of the ScrapingAnt API subscription plans. The cost of using this endpoint is based on the number of API credits consumed per request. The pricing details can be found in the API credits cost documentation.

Request Format

Endpoint

https://api.scrapingant.com/v2/markdown

Request data

The Markdown Transformation Endpoint accepts the same request structure as the general endpoint. The request data should include the URL of the website to scrape and any additional parameters required for the extraction.

You can find more details about the request structure in the Request and response format documentation.

Only two parameters are required:

  • url - The URL of the website to scrape and convert to Markdown.
  • x-api-key - Your ScrapingAnt API key. You can find it in your ScrapingAnt account.

The endpoint also supports the standard HTTP methods, such as GET, POST, PUT, and DELETE.
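As a minimal sketch of how the two required parameters fit together, the snippet below assembles a GET request URL using only Python's standard library. The helper name `build_markdown_request_url` is hypothetical, and passing `x-api-key` as a query parameter follows the usage example in this document. The function only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

MARKDOWN_ENDPOINT = "https://api.scrapingant.com/v2/markdown"


def build_markdown_request_url(target_url: str, api_key: str) -> str:
    """Return the endpoint URL with the two required parameters encoded.

    Hypothetical helper for illustration; it is not part of any ScrapingAnt SDK.
    """
    # urlencode percent-encodes the target URL, so its own "?" and "&"
    # characters are not confused with the endpoint's query string.
    query = urlencode({"url": target_url, "x-api-key": api_key})
    return f"{MARKDOWN_ENDPOINT}?{query}"


if __name__ == "__main__":
    print(build_markdown_request_url("https://example.com/?q=1&page=2",
                                     "YOUR_SCRAPINGANT_API_KEY"))
```

Encoding the target URL matters whenever it carries its own query string. Without it, a parameter such as `page=2` would be read as a parameter of the endpoint rather than of the page being scraped.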

Response Format

The Markdown Transformation Endpoint returns the extracted content in Markdown format. The response is a JSON object with the following properties:

{
  "url": "https://example.com",
  "markdown": "# Heading 1\n\nThis is a paragraph of text.\n\n## Heading 2\n\nAnother paragraph of text."
}

The markdown property contains the extracted content in Markdown format, ready to be used in LLMs and RAG systems.
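One common next step for RAG is to break the Markdown into smaller chunks before indexing. The sketch below parses the sample response shown above and splits the `markdown` value at each heading. The function name `split_markdown_by_heading` and the heading-based chunking strategy are assumptions for illustration, not something the endpoint prescribes.

```python
import json
import re


def split_markdown_by_heading(markdown: str) -> list[str]:
    """Split Markdown into chunks, starting a new chunk at each ATX heading."""
    # The zero-width lookahead splits just before a line that begins with
    # one to six "#" characters and a space, so each heading stays at the
    # start of its own chunk.
    parts = re.split(r"^(?=#{1,6} )", markdown, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]


# Sample response body copied from the example in this document.
raw_response = (
    r'{"url": "https://example.com", '
    r'"markdown": "# Heading 1\n\nThis is a paragraph of text.'
    r'\n\n## Heading 2\n\nAnother paragraph of text."}'
)

payload = json.loads(raw_response)
chunks = split_markdown_by_heading(payload["markdown"])

for chunk in chunks:
    print(repr(chunk))
```

This simple split treats any line starting with `# ` as a heading, including comment lines inside code blocks in the scraped page, so a real pipeline may need a proper Markdown parser.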

Example Usage

Here is an example of using the Markdown Transformation Endpoint to extract content from a website and convert it to Markdown:

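import requests

api_key = "YOUR_SCRAPINGANT_API_KEY"
url = "https://example.com"

# Send the target URL and the API key as query parameters.
# The 60-second timeout is an arbitrary choice so the call cannot hang forever.
response = requests.get(
    "https://api.scrapingant.com/v2/markdown",
    params={"url": url, "x-api-key": api_key},
    timeout=60,
)

if response.status_code == 200:
    # The converted page content is in the "markdown" property.
    markdown_content = response.json()["markdown"]
    print(markdown_content)
else:
    print("Error:", response.text)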