Combine web search with intelligent scraping using SearchGraph
The SearchGraph combines search engine capabilities with ScrapeGraphAI’s extraction power. It automatically searches the web, finds relevant pages, and extracts structured data.
Define a schema to get structured, validated results:
```python
import os
from typing import List

from dotenv import load_dotenv
from pydantic import BaseModel, Field

from scrapegraphai.graphs import SearchGraph
from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info

load_dotenv()

# Define the output schema for the graph
class Dish(BaseModel):
    name: str = Field(description="The name of the dish")
    description: str = Field(description="The description of the dish")

class Dishes(BaseModel):
    dishes: List[Dish]

# Define the configuration for the graph
openai_key = os.getenv("OPENAI_APIKEY")

graph_config = {
    "llm": {
        "api_key": openai_key,
        "model": "openai/gpt-4o"
    },
    "max_results": 2,
    "verbose": True,
}

# Create the SearchGraph instance and run it
search_graph = SearchGraph(
    prompt="List me Chioggia's famous dishes",
    config=graph_config,
    schema=Dishes
)

result = search_graph.run()
print(result)

# Get graph execution info
graph_exec_info = search_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))

# Save to json and csv
convert_to_csv(result, "result")
convert_to_json(result, "result")
```
Import SearchGraph and Pydantic for schema definition.
2. Define your schema
```python
class Dish(BaseModel):
    name: str = Field(description="The name of the dish")
    description: str = Field(description="The description of the dish")

class Dishes(BaseModel):
    dishes: List[Dish]
```
Create Pydantic models to structure the search results.
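Because these are standard Pydantic models, you can sanity-check the schema against a payload shaped like SearchGraph's output before running a full graph. A minimal sketch, assuming Pydantic v2 (`model_validate`; on v1 the equivalent is `parse_obj`) and a hypothetical sample payload:

```python
from typing import List
from pydantic import BaseModel, Field

class Dish(BaseModel):
    name: str = Field(description="The name of the dish")
    description: str = Field(description="The description of the dish")

class Dishes(BaseModel):
    dishes: List[Dish]

# Hypothetical payload shaped like a SearchGraph result
sample = {
    "dishes": [
        {"name": "Sarde in Saor", "description": "Sweet and sour sardines"},
    ]
}

# Raises pydantic.ValidationError if the shape or types are wrong
validated = Dishes.model_validate(sample)
print(validated.dishes[0].name)
```

Validating early like this catches schema mistakes (missing fields, wrong types) without spending an LLM call.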
3. Configure search parameters
```python
graph_config = {
    "llm": {
        "api_key": openai_key,
        "model": "openai/gpt-4o"
    },
    "max_results": 2,  # Number of search results to process
    "verbose": True,
}
```
Set `max_results` to control how many search results to scrape.
4. Create and run search graph
```python
search_graph = SearchGraph(
    prompt="List me Chioggia's famous dishes",
    config=graph_config,
    schema=Dishes
)

result = search_graph.run()
```
The graph automatically searches, finds relevant pages, and extracts data.
```json
{
  "dishes": [
    {
      "name": "Sarde in Saor",
      "description": "Traditional Venetian sweet and sour sardines with onions, pine nuts, and raisins"
    },
    {
      "name": "Risotto di Gò",
      "description": "Creamy risotto made with gò, a local lagoon fish"
    },
    {
      "name": "Moleche",
      "description": "Soft-shell crabs, a seasonal delicacy from the Venice lagoon"
    }
  ]
}
```
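The returned result is a plain Python dictionary, so you are not limited to the built-in `convert_to_csv`/`convert_to_json` helpers. A stdlib-only sketch of post-processing a result shaped like the output above (the sample dict here is illustrative, not live output):

```python
import csv
import io

# Illustrative result, shaped like the SearchGraph output above
result = {
    "dishes": [
        {"name": "Sarde in Saor", "description": "Traditional sweet and sour sardines"},
        {"name": "Risotto di Gò", "description": "Creamy risotto made with a local lagoon fish"},
    ]
}

# Flatten the nested "dishes" list into CSV rows
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "description"])
writer.writeheader()
writer.writerows(result["dishes"])
print(buf.getvalue())
```

This is handy when you need a custom export format or want to feed the data into another pipeline stage directly.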
If you want the URLs first and the scraping as a separate step, use SearchLinkGraph and then pass the links to SmartScraperMultiGraph:

```python
from scrapegraphai.graphs import SearchLinkGraph

search_link_graph = SearchLinkGraph(
    prompt="List me the best AI tools",
    config=graph_config,
)

# Returns URLs first
links = search_link_graph.run()
print("Found links:", links)

# Then you can scrape specific links
from scrapegraphai.graphs import SmartScraperMultiGraph

scraper = SmartScraperMultiGraph(
    prompt="Extract tool name, description, and pricing",
    source=links[:3],  # Scrape top 3 results
    config=graph_config,
)
result = scraper.run()
```