The AI category brings together actors that feed, augment, or evaluate AI systems. You’ll find website crawlers optimized for LLM ingestion, transcript extractors that produce LLM-ready output, search scrapers that ground agents in real-time data, and purpose-built RAG construction tools. Together they supply the data layer that many AI-native agent architectures build on.
This category contains 2,887 APIs, updated daily from Apify’s marketplace.

Top APIs in this category

APIs are ranked by a Bayesian quality score, which balances average rating against review volume so that consistently high-quality actors rank above ones with a single glowing review.
| API | Rating (reviews) | Description |
| --- | --- | --- |
| Website Content Crawler | ⭐ 4.38 (179) | Crawls websites and extracts clean Markdown text to feed LLMs, vector databases, and RAG pipelines. Integrates with LangChain and LlamaIndex. |
| Google Search Results Scraper | ⭐ 4.81 (115) | Scrapes Google SERPs including organic results, AI overviews, ads, People Also Ask, and prices across any country or language. |
| Y Combinator Scraper | ⭐ 5.0 (33) | Extracts startup leads, founder emails, LinkedIn profiles, and hiring data from YC companies and founders. |
| Reddit Scraper Lite | ⭐ 4.43 (28) | Pay-per-result Reddit scraper for posts, comments, communities, and users without requiring a login. |
| YouTube Transcript Ninja | ⭐ 4.99 (23) | Extracts YouTube video transcripts in the specified format from any public video URL. |
| Video Transcript Scraper | ⭐ 4.39 (24) | Scrapes transcripts from YouTube, X, Facebook, TikTok, and more in any available language. Outputs in JSON and LLM-ready formats. |
| AlphaScrape | ⭐ 5.0 (20) | Analyzes earnings-day data, revenue trends, and executive commentary to predict stock movement with confidence scores. |
| Job Posting Scraper | ⭐ 5.0 (14) | Extracts real-time job postings at scale from Indeed, LinkedIn, and Google Jobs in a single run. |
| Web Accessibility Scanner | ⭐ 5.0 (13) | Scans websites for WCAG compliance issues with detailed error reports and support for login-gated pages. |
| Universal News Article Intelligence Agent | ⭐ 5.0 (11) | Extracts clean Markdown and full-text from paywalled news domains including Bloomberg, WSJ, and the FT. Success-only billing. |
| WHO Health Intelligence Scraper | ⭐ 4.99 (11) | Aggregates WHO health data from publications, GHO statistics, and ClinicalTrials.gov with NLP location detection. |
| Product Hunt Scraper | ⭐ 4.99 (12) | Scrapes product data and team members from Product Hunt daily launches. |
| Handshake Jobs Scraper | ⭐ 5.0 (11) | Extracts real-time job postings from Handshake at scale for recruitment and HR research. |
| Tester MCP Client | ⭐ 4.99 (8) | MCP client that connects to any MCP server via Streamable HTTP and displays the conversation in a chat-like UI for testing. |
| RAG Web Browser | ⭐ 2.72 (18) | Queries Google Search, scrapes the top N pages, and returns Markdown content for LLM processing, similar to ChatGPT’s browser. |
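
The exact formula behind the Bayesian quality score is not given on this page. One common form is the Bayesian average, which pulls each actor's rating toward a prior mean until enough reviews accumulate. The sketch below illustrates that idea only; `prior_mean` and `prior_weight` are assumed values chosen for the example, not the parameters used for this ranking.

```python
def bayesian_score(avg_rating, review_count, prior_mean=4.0, prior_weight=10):
    """Bayesian average: blend an actor's rating with a prior.

    prior_mean and prior_weight are illustrative assumptions; the real
    ranking may use different values or a different formula entirely.
    """
    total = prior_weight * prior_mean + review_count * avg_rating
    return total / (prior_weight + review_count)


# A single 5.0 review stays close to the prior mean...
single_review = bayesian_score(5.0, 1)
# ...while a slightly lower rating backed by many reviews scores higher.
many_reviews = bayesian_score(4.81, 115)

print(round(single_review, 3), round(many_reviews, 3))
```

With these assumed parameters, the actor with 115 reviews outranks the one with a single perfect review, which is the behavior the ranking description aims for.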

Use with your agent

The example below uses the Website Content Crawler to pull a site’s content as Markdown, ready to insert into a vector store or LLM context window.
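import requests

# Run the actor synchronously and receive its dataset items in one response.
response = requests.post(
    "https://api.apify.com/v2/acts/apify~website-content-crawler/run-sync-get-dataset-items",
    headers={"Authorization": "Bearer YOUR_APIFY_TOKEN"},
    json={
        "startUrls": [{"url": "https://docs.example.com"}],
        "maxCrawlPages": 100,
        # Markdown output flag; confirm the field name against the actor's input schema.
        "saveMarkdown": True,
    },
    timeout=600,  # crawls can take minutes; avoid waiting forever
)
response.raise_for_status()  # fail loudly on auth or input errors

pages = response.json()  # one item per crawled page
for page in pages:
    print(page["url"])
    print(page.get("markdown", ""))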
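
Before crawled pages go into a vector store, they are usually split into smaller chunks. The following is a minimal sketch of that step, assuming each page is a dict with `url` and `markdown` keys as in the example above. The helper names `chunk_text` and `pages_to_records`, the character-based splitting, and the record layout are all illustrative choices, not part of any Apify API.

```python
def chunk_text(text, max_chars=1000, overlap=100):
    """Split text into chunks of at most max_chars characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the two chunks.
    """
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def pages_to_records(pages, max_chars=1000, overlap=100):
    """Turn crawled pages into flat records ready for embedding."""
    records = []
    for page in pages:
        markdown = page.get("markdown") or ""  # skip pages without content
        for i, chunk in enumerate(chunk_text(markdown, max_chars, overlap)):
            records.append({
                "id": f"{page['url']}#chunk-{i}",  # hypothetical ID scheme
                "url": page["url"],
                "text": chunk,
            })
    return records


# Example with a stand-in page instead of a live crawl:
sample_pages = [{"url": "https://docs.example.com/intro", "markdown": "abcdefghij"}]
print(pages_to_records(sample_pages, max_chars=4, overlap=1))
```

Splitting on characters keeps the sketch dependency-free. A production pipeline would more likely split on headings or token counts.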

What you can build

  • RAG pipelines that crawl documentation sites and product pages, then chunk and index the output for retrieval-augmented generation
  • Knowledge graph builders that extract entities and relationships from web content and structure them for graph-based reasoning
  • AI model comparison tools that benchmark different LLM responses to the same prompts side by side
  • Real-time grounding systems that query Google Search and scrape top results before passing context to an LLM
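
For the real-time grounding case in the list above, the last step is to pack scraped results into an LLM prompt. Below is a rough sketch of that step only. It assumes the search and scrape stage has already produced a list of dicts with `url` and `markdown` keys; the function name `build_grounded_prompt`, the prompt wording, and the character budget are hypothetical choices for illustration.

```python
def build_grounded_prompt(question, sources, max_context_chars=4000):
    """Assemble a prompt that asks the model to answer from numbered sources.

    sources: list of dicts with 'url' and 'markdown' keys (assumed shape).
    Content is truncated so the combined excerpts fit the character budget.
    """
    blocks = []
    remaining = max_context_chars
    for index, source in enumerate(sources, start=1):
        if remaining <= 0:
            break  # budget used up; drop the remaining sources
        excerpt = (source.get("markdown") or "").strip()[:remaining]
        if not excerpt:
            continue  # nothing usable was scraped from this page
        blocks.append(f"[{index}] {source['url']}\n{excerpt}")
        remaining -= len(excerpt)
    context = "\n\n".join(blocks)
    return (
        "Answer the question using only the sources below. "
        "Cite sources by their bracketed number.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


# Example with stand-in sources instead of live search results:
demo_sources = [
    {"url": "https://example.com/a", "markdown": "First page content."},
    {"url": "https://example.com/b", "markdown": "Second page content."},
]
print(build_grounded_prompt("What do the pages say?", demo_sources))
```

Numbering the sources lets the model cite them and lets the caller map citations back to URLs.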

Agents APIs

Autonomous workflows and multi-step reasoning actors

Developer Tools APIs

Web scrapers, crawlers, and data extraction infrastructure
