Scrape and analyze websites with custom prompts using Gemini, Apify, and LangChain

Last update 4 months ago


๐Ÿ” AI-Powered Website Prompt Executor (Apify + OpenRouter)

This workflow combines Apify and OpenRouter to scrape website content and run any custom prompt against it with AI. You define what you want (extracting contact details, summarizing content, collecting job offers, or anything else), and the workflow processes the scraped pages and returns the results.

🚀 Overview

This workflow allows you to:

  1. Input a URL and define a prompt.
  2. Scrape the specified number of pages from the website.
  3. Process each page's metadata and Markdown content.
  4. Use AI to interpret and respond to the prompt on each page.
  5. Aggregate and return structured output.

🧠 How It Works

Input Example

{
  "enqueue": true,
  "maxPages": 5,
  "url": "https://apify.com",
  "method": "GET",
  "prompt": "collect all contact informations available on this website"
}
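
Before handing a payload like this to the workflow, it can help to validate it. The following is a minimal sketch, not part of the template itself: `WorkflowInput` and `parse_input` are hypothetical names, and the choice of required fields (`url`, `prompt`) and the defaults for the others are assumptions taken from the example values above.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkflowInput:
    """Typed mirror of the example payload (field meanings inferred from their names)."""
    url: str
    prompt: str
    maxPages: int = 5      # assumed default, copied from the example
    enqueue: bool = True   # assumed default, copied from the example
    method: str = "GET"    # assumed default, copied from the example


def parse_input(payload: dict) -> WorkflowInput:
    """Validate a raw payload and return a typed copy (hypothetical helper)."""
    # Assumption: only url and prompt are treated as mandatory.
    missing = [key for key in ("url", "prompt") if not payload.get(key)]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")

    parsed = urlparse(payload["url"])
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be an absolute http(s) URL")

    max_pages = payload.get("maxPages", 5)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_pages, bool) or not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("maxPages must be a positive integer")

    return WorkflowInput(
        url=payload["url"],
        prompt=payload["prompt"].strip(),
        maxPages=max_pages,
        enqueue=bool(payload.get("enqueue", True)),
        method=str(payload.get("method", "GET")).upper(),
    )
```

Rejecting a bad URL or a non-positive page limit up front avoids spending scraper and model calls on a request that cannot succeed.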

Workflow Steps

| Step | Action |
|------|--------|
| 1 | Triggered by another workflow with JSON input. |
| 2 | Calls the Apify actor firescraper-ai-website-content-markdown-scraper to scrape content. |
| 3 | Loops through the scraped pages. |
| 4 | AI analyzes each page based on the input prompt. |
| 5 | Aggregates AI outputs across all pages. |
| 6 | Final AI processing step to return a clean structured result. |
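
The steps above can be sketched in plain Python to show the control flow. This is an illustration under stated assumptions, not the workflow's actual node logic: `run_pipeline`, the `scrape` and `ask_model` callables, the page shape (`url` and `markdown` keys), and the wording of the final merge prompt are all hypothetical. The scraper and the model are injected as functions so the sketch runs without network access or credentials.

```python
from typing import Callable, Dict, List

# Assumed page shape: {"url": "...", "markdown": "..."}
Page = Dict[str, str]


def run_pipeline(
    url: str,
    prompt: str,
    max_pages: int,
    scrape: Callable[[str, int], List[Page]],
    ask_model: Callable[[str], str],
) -> str:
    """Scrape, analyze each page, aggregate, then run one final cleanup pass."""
    # Step 2: fetch pages; the slice enforces the limit even if the scraper over-delivers.
    pages = scrape(url, max_pages)[:max_pages]

    # Steps 3-4: apply the same user prompt to every page.
    per_page: List[str] = []
    for page in pages:
        question = f"{prompt}\n\nPage URL: {page['url']}\n\n{page['markdown']}"
        per_page.append(ask_model(question))

    # Step 5: aggregate the per-page answers into one text block.
    combined = "\n---\n".join(per_page)

    # Step 6: one final model call to merge and clean up the result.
    final_prompt = (
        "Merge the per-page answers below into one clean, structured result "
        f"for this request: {prompt}\n\n{combined}"
    )
    return ask_model(final_prompt)
```

With `max_pages` set to N, this sketch makes N + 1 model calls: one per page plus the final merge, which is worth keeping in mind when estimating cost.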

🛠 Technologies Used

  • Apify – Scrapes structured content and Markdown from websites.
  • OpenRouter – Provides access to advanced AI models like Gemini.
  • LangChain – Handles AI agent orchestration and prompt interpretation.

🔧 Customization

Customize the workflow via the following input fields:

  • url: Starting point for scraping
  • maxPages: Limit the number of pages to crawl
  • prompt: Define any instruction (e.g., "summarize this website," "extract product data," "list all emails," etc.)

Because the prompt is free-form, the same workflow can be reused across different use cases without modifying the workflow itself.


📦 Output

The workflow returns a JSON result that includes:

  • Processed prompt responses from each page
  • Aggregated AI insights
  • Structured and machine-readable format

🧪 Example Use Cases

  • ๐Ÿ” Extracting contact information from websites
  • ๐Ÿ“„ Summarizing articles or company profiles
  • ๐Ÿ›๏ธ Collecting product information
  • ๐Ÿ“‹ Extracting job listings or news
  • ๐Ÿ“ฌ Generating outreach lists from public data
  • ๐Ÿค– Used as a tool within other AI agents for real-time web analysis
  • ๐Ÿงฉ Integrated as an external tool in MCP (Multi-Component Prompt) servers to enhance AI capabilities

๐Ÿ” API Credentials Required

You will need:

  • Apify API token – For running the scraper actor
  • OpenRouter API key – For AI-powered prompt processing

Set these credentials in your environment or in the n8n credential manager before running the workflow.
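
If you reproduce the pipeline in a standalone script rather than in n8n, the environment route might look like the sketch below. The variable names `APIFY_TOKEN` and `OPENROUTER_API_KEY` and the helper `load_credentials` are assumptions made for illustration; use whatever names your deployment defines. Inside n8n, the credential manager handles this for you.

```python
import os


def load_credentials(env=None) -> dict:
    """Read both API secrets and fail early if either is missing (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumed variable names; adjust to match your own environment.
    names = {
        "apify_token": "APIFY_TOKEN",
        "openrouter_api_key": "OPENROUTER_API_KEY",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("missing environment variable(s): " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

Checking for both secrets before the first request means a missing key surfaces immediately instead of after the scraping step has already run.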