Orsa
WEB EXTRACTION

Every URL on
any domain.

Discover the full shape of a website in one call. Recursive index handling, gzip, and malformed XML are handled silently.


Discover all URLs from a website's sitemap.

Start free, no card required. Most teams ship a first integration in under ten minutes.

Why this endpoint exists

Not another JSON page.
A production workflow in one call

One call in. The full URL graph out.

Without Orsa


  • A browser worker, proxy account, parser, retry queue, and schema contract for one endpoint.
  • Product teams wait while platform teams debug website-specific failures.
  • Every new data field becomes another brittle scraper branch.
With Orsa


  • One call in, the full URL graph out, with retries, rendering, and validation handled behind the API.
  • The response shape is typed, documented, and ready for product code.
  • Teams combine it with adjacent Orsa endpoints without adding vendors.
How it works

Teach the workflow,
then show the endpoint

Point Orsa at a URL, let the platform handle the web work, and receive a response your product can trust.

01 Input

Send a URL

nytimes.com

02 Orsa runtime

Render, retry, enrich

Orsa handles browser execution, proxy escalation, parsing, validation, caching, and typed response shaping behind the API.

03 Output

One call in. The full URL graph out.

Use the result in crawl planning without owning the extraction stack.

REST path
/api/v1/web/scrape/sitemap
Same endpoint used by the SDK examples below.

Input shape
URL, for example nytimes.com

SDKs
TypeScript, Python, cURL
Start with TypeScript, Python, or direct cURL.

Best first use
Crawl planning
Seed crawls with the real surface area of a site, not just the homepage.
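
As a rough illustration of the REST path above, here is a minimal Python sketch that uses only the standard library. The path /api/v1/web/scrape/sitemap and the nytimes.com input come from this page. The base URL `https://api.orsa.example`, the GET method, the `url` query parameter, and the Bearer `Authorization` header are assumptions made for illustration, so check the API reference for the real values.

```python
import json
import urllib.parse
import urllib.request

# Path taken from this page.
SITEMAP_PATH = "/api/v1/web/scrape/sitemap"
# Hypothetical placeholder host; replace with the real API base URL.
BASE_URL = "https://api.orsa.example"


def build_sitemap_request(target: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request for the sitemap endpoint.

    Assumptions, not confirmed by this page: GET method, a `url` query
    parameter, and Bearer-token authentication.
    """
    query = urllib.parse.urlencode({"url": target})
    return urllib.request.Request(
        f"{BASE_URL}{SITEMAP_PATH}?{query}",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_sitemap(target: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_sitemap_request(target, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the assumed URL shape easy to inspect and adjust before anything is sent.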
Production numbers

Performance your product
can actually plan around

Every endpoint page should answer the practical buying question: will this hold up once it leaves the demo?

p50 latency: 410 ms. Measured as production API latency, not a static mock.
p99 latency: 1.9 s. Built for the long tail of real websites and crawler paths.
Quality bar: high recall on standard and nested sitemap indexes.
Credits per call: 1 key. Self-serve usage with predictable metering and no scraping infrastructure to own.
Feature layer

More than the response.
The operating layer behind it

Stripe pages teach the system around the API: inputs, retries, observability, adjacent products, and the code path. This section does the same for Orsa endpoints.

  • Typed response contracts for product code and AI tools.
  • Browser, proxy, cache, and validation logic handled by Orsa.
  • Direct fit for crawl planning and SEO ops.

Endpoint

/api/v1/web/scrape/sitemap

Example input

nytimes.com

Promise

One call in. The full URL graph out.

Pairs with

Crawl Website, Scrape Markdown, Scrape HTML

Built for the job

What teams ship with
every URL on any domain

Each product page now speaks to the real workflow behind the endpoint, with concrete jobs instead of a generic feature list.

01

Crawl planning

Seed crawls with the real surface area of a site, not just the homepage.

02

SEO ops

Diff sitemaps over time to catch indexing regressions early.

03

Docs mirrors

Pull every docs path before you snapshot content to Markdown.
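
The three jobs above mostly reduce to small list operations on the returned `urls` array. The sketch below is one possible way to write them in Python. The function names `crawl_seeds`, `diff_sitemaps`, and `urls_under` are hypothetical helpers, not part of any Orsa SDK.

```python
from urllib.parse import urlsplit


def crawl_seeds(urls):
    """Crawl planning: deduplicate URLs while preserving their order."""
    return list(dict.fromkeys(urls))


def diff_sitemaps(previous, current):
    """SEO ops: return (added, removed) URLs between two snapshots, sorted."""
    prev, curr = set(previous), set(current)
    return sorted(curr - prev), sorted(prev - curr)


def urls_under(urls, prefix):
    """Docs mirrors: keep URLs whose path equals or sits below `prefix`."""
    prefix = prefix.rstrip("/")
    kept = []
    for url in urls:
        path = urlsplit(url).path
        # Match "/docs" and "/docs/..." but not "/docsy".
        if path == prefix or path.startswith(prefix + "/"):
            kept.append(url)
    return kept
```

For the SEO ops case, a URL that shows up in the `removed` list between two runs is a candidate indexing regression worth checking by hand.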

Implementation

Keep the code small.
Let Orsa do the messy part

Use the endpoint directly, then combine it with adjacent Orsa APIs as the workflow grows.

Response (typed JSON)
{
  "domain": "nytimes.com",
  "urls": [
    "https://www.nytimes.com/",
    "https://www.nytimes.com/section/world"
  ],
  "sources": ["https://www.nytimes.com/sitemap.xml"]
}
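
To use this response in product code, one option is to validate it into a small typed object. The sketch below assumes the three fields shown in the sample response (`domain`, `urls`, `sources`) are always present. The `SitemapResult` class and `parse_sitemap_result` function are hypothetical names for illustration, not SDK types.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SitemapResult:
    domain: str
    urls: tuple[str, ...]
    sources: tuple[str, ...]


def parse_sitemap_result(body: str) -> SitemapResult:
    """Parse a JSON response body shaped like the sample on this page."""
    data = json.loads(body)
    if not isinstance(data["domain"], str):
        raise ValueError("'domain' must be a string")
    for key in ("urls", "sources"):
        # Reject non-string entries early instead of failing downstream.
        if not all(isinstance(item, str) for item in data[key]):
            raise ValueError(f"every entry in {key!r} must be a string")
    return SitemapResult(
        domain=data["domain"],
        urls=tuple(data["urls"]),
        sources=tuple(data["sources"]),
    )


# The sample response from this page, used as a fixture.
SAMPLE = """
{
  "domain": "nytimes.com",
  "urls": [
    "https://www.nytimes.com/",
    "https://www.nytimes.com/section/world"
  ],
  "sources": ["https://www.nytimes.com/sitemap.xml"]
}
"""

result = parse_sitemap_result(SAMPLE)
```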
Combine with

Build the full workflow,
not another point solution

The best product integrations usually combine two or three Orsa endpoints behind one customer experience.

Crawl Website, Scrape Markdown, Scrape HTML

Get started

Put this endpoint
in your product today

Try the live endpoint, then wire the same response into your app with one API key.


One API key for every Orsa endpoint · No card required to start.


© 2026 Orsa Workshops, Inc.
