Overview
SpeechGraph is a pipeline that scrapes a web page, answers a given prompt from the extracted content, and generates an audio file of that answer. It combines web scraping with text-to-speech.
Class Signature
Constructor Parameters
The natural language prompt describing what information to extract and convert to speech.
The source to scrape. Can be:
- A URL starting with http:// or https://
- A local directory path for offline HTML files
Configuration parameters for the graph. Must include:
- llm: LLM configuration (e.g., {"model": "openai/gpt-4o"})
- tts_model: Text-to-speech model configuration
Optional keys:
- output_path (str): Path to save the audio file (default: "output.mp3")
- verbose (bool): Enable detailed logging
- headless (bool): Run browser in headless mode
- additional_info (str): Extra context for the LLM
Optional Pydantic model defining the expected output structure.
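The configuration keys described above can be checked before a graph is constructed. The following is a minimal sketch: `missing_config_keys` and `with_defaults` are hypothetical helpers, not library functions, and the `api_key`, `model`, and `voice` keys inside `tts_model` are assumptions about a typical OpenAI TTS setup.

```python
REQUIRED_KEYS = ("llm", "tts_model")
DEFAULT_OUTPUT_PATH = "output.mp3"  # default stated in the parameter description


def missing_config_keys(config: dict) -> list:
    """Return the required keys that are absent from config."""
    return [key for key in REQUIRED_KEYS if key not in config]


def with_defaults(config: dict) -> dict:
    """Return a copy of config with output_path filled in when it is not set."""
    completed = dict(config)
    completed.setdefault("output_path", DEFAULT_OUTPUT_PATH)
    return completed


example_config = with_defaults({
    "llm": {"model": "openai/gpt-4o"},
    # Assumed shape of the TTS settings; adjust to your provider.
    "tts_model": {"api_key": "YOUR_API_KEY", "model": "tts-1", "voice": "alloy"},
    "verbose": False,
    "headless": True,
})
```

Checking `missing_config_keys(config)` before building the graph reports a missing `llm` or `tts_model` entry up front, before any scraping starts.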
Attributes
The user’s extraction prompt.
The source URL or local directory path.
Configuration dictionary for the graph.
Optional output schema for structured data extraction.
The configured language model instance.
Either “url” or “local_dir” based on the source type.
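The source-type attribute follows from the two accepted source forms. A minimal sketch of that classification, assuming the decision depends only on the URL prefix; `infer_input_key` is a hypothetical name used for illustration, and the library's own check may differ.

```python
def infer_input_key(source: str) -> str:
    """Classify a source string as "url" or "local_dir" by its prefix."""
    if source.startswith(("http://", "https://")):
        return "url"
    # Anything that is not an HTTP(S) URL is treated as a local directory.
    return "local_dir"
```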
Methods
run()
Executes the scraping process, generates audio, and returns the text answer.
Returns the extracted information as a string. The audio file is saved to disk.
ValueError: If no audio was generated from the text.
Basic Usage
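A minimal usage sketch, assuming the class is importable as `scrapegraphai.graphs.SpeechGraph` and that the constructor accepts `prompt`, `source`, and `config` keyword arguments matching the parameters described earlier. The keys inside `tts_model`, the environment variable name, and the `create_audio_summary` wrapper are illustrative assumptions. The import sits inside the function so the configuration can be inspected without the library installed.

```python
import os

graph_config = {
    "llm": {
        "model": "openai/gpt-4o",
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
    },
    # Assumed TTS settings; key names may differ in your version.
    "tts_model": {
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
        "model": "tts-1",
        "voice": "alloy",
    },
    "output_path": "summary.mp3",
}


def create_audio_summary(source: str, prompt: str) -> str:
    """Scrape source, answer prompt, save the audio, and return the text answer."""
    # Assumed import path; requires the scraping library to be installed.
    from scrapegraphai.graphs import SpeechGraph

    graph = SpeechGraph(prompt=prompt, source=source, config=graph_config)
    return graph.run()
```

Calling `create_audio_summary("https://example.com", "Summarize the main points of this page.")` would return the answer text and write `summary.mp3`, provided a valid API key is set.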
Text-to-Speech Configuration
Using OpenAI TTS
Available Voices
OpenAI TTS offers six voice options:
- alloy: Neutral and balanced
- echo: Male voice
- fable: British accent
- onyx: Deep male voice
- nova: Female voice
- shimmer: Soft female voice
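A typo in the voice name is easy to make, so it can help to validate it when building the TTS settings. This is a sketch only: `make_tts_config` is a hypothetical helper, and the `api_key`, `model`, and `voice` key names are assumed rather than confirmed by this page.

```python
OPENAI_TTS_VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


def make_tts_config(api_key: str, voice: str = "alloy", model: str = "tts-1") -> dict:
    """Build a tts_model dictionary, rejecting voices not in the list above."""
    if voice not in OPENAI_TTS_VOICES:
        raise ValueError(
            f"Unknown voice {voice!r}; expected one of {', '.join(OPENAI_TTS_VOICES)}"
        )
    return {"api_key": api_key, "model": model, "voice": voice}
```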
Advanced Usage
Custom Output Path
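The `output_path` setting controls where the audio file is written. Whether the library creates missing parent directories is not stated here, so the sketch below creates them first. `prepare_output_path` and the folder name are hypothetical and chosen for illustration.

```python
import tempfile
from pathlib import Path


def prepare_output_path(directory, filename: str = "output.mp3") -> str:
    """Create the target directory if needed and return the full audio path."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    return str(target_dir / filename)


# Demonstration in the system temp directory; substitute your own folder.
example_path = prepare_output_path(
    Path(tempfile.gettempdir()) / "speechgraph_audio", "briefing.mp3"
)
config_update = {"output_path": example_path}
```

Merging `config_update` into the graph configuration directs the generated audio to that location.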
High-Quality Audio
Graph Workflow
The SpeechGraph uses the following node pipeline:
- FetchNode: Fetches the web page content
- ParseNode: Parses and chunks the content
- GenerateAnswerNode: Extracts information based on the prompt
- TextToSpeechNode: Converts the answer to audio
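The linear flow of the four nodes can be pictured as functions that each read from and add to a shared state dictionary. The sketch below uses stand-in functions with placeholder logic. It does not call the library's node classes, and the state key names are assumptions made for illustration.

```python
def fetch_node(state: dict) -> dict:
    # Stand-in for FetchNode: a real node would download or read the page.
    state["document"] = "Placeholder page text. It has two sentences."
    return state


def parse_node(state: dict) -> dict:
    # Stand-in for ParseNode: split the document into chunks (here, by sentence).
    state["chunks"] = [s.strip() for s in state["document"].split(".") if s.strip()]
    return state


def generate_answer_node(state: dict) -> dict:
    # Stand-in for GenerateAnswerNode: a real node would query the LLM with the prompt.
    state["answer"] = " ".join(state["chunks"])
    return state


def text_to_speech_node(state: dict) -> dict:
    # Stand-in for TextToSpeechNode: a real node would call the TTS model.
    state["audio"] = state["answer"].encode("utf-8")
    return state


PIPELINE = [fetch_node, parse_node, generate_answer_node, text_to_speech_node]


def run_pipeline(prompt: str, source: str) -> dict:
    """Pass one state dictionary through every node in order."""
    state = {"user_prompt": prompt, "source": source}
    for node in PIPELINE:
        state = node(state)
    return state
```

Each stage depends only on keys written by the stage before it, which is why the order of the four nodes matters.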
Use Cases
- Accessibility: Convert web content to audio for visually impaired users
- Learning: Create audio summaries of educational content
- News Briefings: Generate audio news summaries
- Podcast Generation: Create podcast episodes from articles
- Audiobooks: Convert written content to audio format
Example: News Briefing
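One way a news briefing might be set up, using `additional_info` to steer the answer toward text that reads well aloud. The URL is a placeholder, the prompt wording is illustrative, and the import path, constructor keywords, and `tts_model` keys are assumptions as in any sketch of this API.

```python
import os

NEWS_SOURCE = "https://example.com/news"  # placeholder URL
NEWS_PROMPT = "Summarize the top three stories on this page as a short spoken briefing."

news_config = {
    "llm": {
        "model": "openai/gpt-4o",
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
    },
    # Assumed TTS settings; key names may differ in your version.
    "tts_model": {
        "api_key": os.environ.get("OPENAI_API_KEY", ""),
        "model": "tts-1",
        "voice": "nova",
    },
    "output_path": "news_briefing.mp3",
    # Extra context for the LLM, aimed at audio-friendly phrasing.
    "additional_info": "Write in full sentences suitable for reading aloud. "
                       "Avoid bullet lists, URLs, and symbols.",
}


def make_news_briefing() -> str:
    """Run the graph and return the briefing text; the audio is saved to output_path."""
    # Assumed import path; requires the scraping library to be installed.
    from scrapegraphai.graphs import SpeechGraph

    graph = SpeechGraph(prompt=NEWS_PROMPT, source=NEWS_SOURCE, config=news_config)
    return graph.run()
```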
Example: Educational Summary
Accessing Results
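As described for `run()`, the method returns the answer text while the audio goes to the configured output path. A small sketch for inspecting both after a run; `describe_result` is a hypothetical helper, not a library function.

```python
from pathlib import Path


def describe_result(answer: str, output_path: str) -> dict:
    """Summarize the text answer and the audio file written by a run."""
    audio = Path(output_path)
    exists = audio.is_file()
    return {
        "answer_chars": len(answer),
        "audio_exists": exists,
        # Size is reported as 0 when the file is missing.
        "audio_bytes": audio.stat().st_size if exists else 0,
    }
```

For example, `describe_result(answer, config["output_path"])` can be logged after each run to confirm that an audio file was actually produced.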
Error Handling
Cost Considerations
OpenAI TTS pricing (as of 2024):
- tts-1: $0.015 per 1,000 characters
- tts-1-hd: $0.030 per 1,000 characters
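With the per-character rates listed above, the cost of a given answer can be estimated before it is sent for synthesis. A minimal sketch follows, with `estimate_tts_cost` as a hypothetical helper. It uses the 2024 figures quoted here, which may since have changed.

```python
# US dollars per 1,000 characters, taken from the list above.
PRICE_PER_1K_CHARS = {"tts-1": 0.015, "tts-1-hd": 0.030}


def estimate_tts_cost(text: str, model: str = "tts-1") -> float:
    """Estimate the TTS cost in US dollars for the given text and model."""
    return len(text) / 1000 * PRICE_PER_1K_CHARS[model]
```

A 2,000-character answer works out to roughly $0.03 with tts-1 and $0.06 with tts-1-hd.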
Performance Tips
- Use tts-1 for faster generation and lower cost
- Use tts-1-hd for higher quality audio when needed
- Keep prompts concise to reduce text length and TTS costs
- Use additional_info to guide the LLM toward audio-friendly output
- Test different voices to find the best fit for your use case
Limitations
- Audio generation adds processing time and cost
- Maximum text length depends on TTS provider limits
- Audio quality depends on the TTS model used
- Requires an OpenAI API key with TTS access
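Regarding the text-length limitation, one common workaround is to split long text into pieces that each fit the provider's limit. The sketch below is independent of SpeechGraph, and this page does not say whether the graph splits text itself. The default of 4096 characters is an assumption about the OpenAI speech endpoint and should be checked against current provider documentation. `split_for_tts` is a hypothetical helper.

```python
import re


def split_for_tts(text: str, max_chars: int = 4096) -> list:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that exceeds the limit on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The current chunk is full; start a new one with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```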
Related Graphs
- SmartScraperGraph - Text-only extraction
- SearchGraph - Search and scrape multiple sources
- OmniScraperGraph - Include image analysis
