Orsa
WEB EXTRACTION

Any URL, clean Markdown.

Orsa's Scrape Markdown endpoint turns any webpage into LLM-ready Markdown fast. We handle proxy escalation, JS rendering, and HTML-to-Markdown conversion — you get text you can paste into prompts, vector stores, or docs.


Scrape any webpage and get clean markdown content.

Start free, no card required. Most teams ship a first integration in under ten minutes.

Why this endpoint exists

Not another JSON page. A production workflow in one call

One call in. Markdown out.

Without Orsa

  • A browser worker, proxy account, parser, retry queue, and schema contract to build and maintain for a single endpoint.
  • Product teams wait while platform teams debug website-specific failures.
  • Every new data field becomes another brittle scraper branch.

With Orsa

  • One call in, Markdown out, with retries, rendering, and validation handled behind the API.
  • The response shape is typed, documented, and ready for product code.
  • Teams combine it with adjacent Orsa endpoints without adding vendors.

How it works

Teach the workflow, then show the endpoint

Point Orsa at a URL, let the platform handle the web work, and receive a response your product can trust.

Input 01

Send a URL

https://notion.com/blog/introducing-projects

Orsa runtime 02

Render, retry, enrich

Orsa handles browser execution, proxy escalation, parsing, validation, caching, and typed response shaping behind the API.

Output 03

One call in. Markdown out.

Use the result in RAG knowledge bases without owning the extraction stack.

REST path
/api/v1/web/scrape/markdown
Same endpoint used by the SDK examples below.

Input shape
URL
https://notion.com/blog/introducing-projects

SDKs
TypeScript, Python, cURL
Start with TypeScript, Python, or direct cURL.

Best first use
RAG knowledge bases
Crawl a docs site, convert every page to Markdown, chunk it, and embed it. Orsa handles capture and cleanup.

Production numbers

Performance your product can actually plan around

These numbers answer the practical buying question: will this hold up once it leaves the demo?

p50 latency
890 ms
Measured as production API latency, not a static mock.

p99 latency
3.2 s
Built for the long tail of real websites and crawler paths.

quality bar
98.6% extraction cleanliness (production sample)

credits per call
1 key
Self-serve usage with predictable metering and no scraping infrastructure to own.

Feature layer

More than the response. The operating layer behind it

As Stripe's product pages do, this section covers the system around the API: inputs, retries, observability, adjacent products, and the code path.

  • Typed response contracts for product code and AI tools.
  • Browser, proxy, cache, and validation logic handled by Orsa.
  • Direct fit for RAG knowledge bases and AI agent context.

Endpoint

/api/v1/web/scrape/markdown

Example input

https://notion.com/blog/introducing-projects

Promise

One call in. Markdown out.

Pairs with

Scrape Sitemap, Crawl Website, Scrape HTML

Built for the job

What teams ship with any URL, clean Markdown.

These are the concrete jobs teams run on this endpoint, not a generic feature list.

01

RAG knowledge bases

Crawl a docs site, convert every page to Markdown, chunk it, embed it — Orsa handles capture and cleanup.
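
The "chunk it" step above could look roughly like the sketch below. It is an illustration only, not Orsa's method: `chunk_markdown` and its `max_chars` parameter are hypothetical names, and the splitting rule (break at blank lines, start a new chunk at each heading or when the size budget is exceeded) is one simple choice among many.

```python
import re


def chunk_markdown(markdown: str, max_chars: int = 1200) -> list[str]:
    """Split Markdown into chunks for embedding.

    A new chunk starts at every heading and whenever adding the next
    block would push the current chunk past max_chars.
    """
    # Split on blank lines so paragraphs and headings stay intact.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", markdown) if b.strip()]
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for block in blocks:
        starts_section = block.startswith("#")
        too_big = size + len(block) > max_chars
        if current and (starts_section or too_big):
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(block)
        size += len(block)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Two known limits of this sketch: a single block longer than `max_chars` is kept whole rather than split, and a line starting with `#` inside a fenced code block is treated as a heading.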

02

AI agent context

When your agent needs to read a webpage, Markdown is the format that actually works with LLMs.

03

Content migration

Point Orsa at your sitemap and get clean Markdown for every post without maintaining a scraper.
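
For a migration, each scraped page eventually has to land in a file. The sketch below assumes you already hold one decoded response per post, with the `url` and `markdown` fields shown in the sample response on this page. `slug_from_url` and `write_post` are hypothetical helper names, and naming files after the last URL path segment is an assumption, not an Orsa convention.

```python
from pathlib import Path
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Use the last non-empty path segment as the file name stem."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[-1] if segments else "index"


def write_post(result: dict, out_dir: Path) -> Path:
    """Write one scraped page to <out_dir>/<slug>.md and return the path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{slug_from_url(result['url'])}.md"
    path.write_text(result["markdown"], encoding="utf-8")
    return path
```

Two posts whose URLs end in the same segment would overwrite each other here, so a real migration would likely need a fuller naming scheme.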

Implementation

Keep the code small. Let Orsa do the messy part

Use the endpoint directly, then combine it with adjacent Orsa APIs as the workflow grows.
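
A direct call might be sketched as follows, using only the Python standard library. Only the path `/api/v1/web/scrape/markdown` comes from this page. The base URL, the POST method, the JSON body with a `url` field, and Bearer-token authentication are all assumptions made for illustration, so check the API reference before relying on any of them.

```python
import json
import urllib.request

# Path taken from this page; everything else below is an assumption.
SCRAPE_MARKDOWN_PATH = "/api/v1/web/scrape/markdown"


def build_scrape_request(target_url: str, api_key: str, base_url: str) -> urllib.request.Request:
    """Build (but do not send) a request for the Scrape Markdown endpoint.

    Assumed, not confirmed by this page: POST method, JSON body with a
    "url" field, and Bearer-token authentication.
    """
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + SCRAPE_MARKDOWN_PATH,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def scrape_markdown(target_url: str, api_key: str, base_url: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_scrape_request(target_url, api_key, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The 10-second default timeout is a placeholder chosen to sit above the 3.2 s p99 latency quoted on this page. Keeping request construction separate from sending makes the assumed request shape easy to inspect and adjust.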

Response (typed JSON)
{
  "url": "https://notion.com/blog/introducing-projects",
  "title": "Introducing Projects",
  "markdown": "# Introducing Projects\n\nProjects is the new way...",
  "word_count": 1247,
  "reading_time_seconds": 312,
  "published_at": "2026-01-14T09:00:00Z",
  "language": "en"
}
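
On the client side, the sample response above could be mapped onto a typed structure like this. The field names and types are read off the sample only. Whether any field can be missing or null is not stated on this page, so the sketch treats all of them as required, and `ScrapeMarkdownResult` and `parse_result` are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScrapeMarkdownResult:
    url: str
    title: str
    markdown: str
    word_count: int
    reading_time_seconds: int
    published_at: datetime
    language: str


def parse_result(payload: dict) -> ScrapeMarkdownResult:
    """Convert a decoded JSON payload into a typed result.

    Raises KeyError if a field shown in the sample response is missing.
    """
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    published = datetime.fromisoformat(payload["published_at"].replace("Z", "+00:00"))
    return ScrapeMarkdownResult(
        url=payload["url"],
        title=payload["title"],
        markdown=payload["markdown"],
        word_count=int(payload["word_count"]),
        reading_time_seconds=int(payload["reading_time_seconds"]),
        published_at=published,
        language=payload["language"],
    )


# The sample response from this page, kept as a raw string so the JSON
# "\n" escapes reach the JSON parser unchanged.
sample = json.loads(r"""
{
  "url": "https://notion.com/blog/introducing-projects",
  "title": "Introducing Projects",
  "markdown": "# Introducing Projects\n\nProjects is the new way...",
  "word_count": 1247,
  "reading_time_seconds": 312,
  "published_at": "2026-01-14T09:00:00Z",
  "language": "en"
}
""")
result = parse_result(sample)
```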

Combine with

Build the full workflow, not another point solution

The best product integrations usually combine two or three Orsa endpoints behind one customer experience.

Scrape Sitemap, Crawl Website, Scrape HTML

FAQ

The questions teams ask before shipping

Short answers for the practical details: rendering, limits, freshness, and how this fits into production.

Get started

Put this endpoint in your product today

Try the live endpoint, then wire the same response into your app with one API key.

Try this endpoint

One API key for every Orsa endpoint · No card required to start.

© 2026 Orsa Workshops, Inc.
