From chatbot.py:481-560, the knowledge base is initialized interactively:
Do you want to initialize the Twitter knowledge base? (y/n): y
Do you want to clear the existing Twitter knowledge base? (y/n): y
Knowledge base cleared
Do you want to update the Twitter knowledge base with KOL tweets? (y/n): y
=== Starting Twitter Knowledge Base Update ===
Found 5 KOLs in character config
Selected 5 KOLs for processing...
From twitter_agent/twitter_knowledge_base.py:170-340, the update function:
twitter_agent/twitter_knowledge_base.py
async def update_knowledge_base(twitter_client: TwitterClient, knowledge_base, kol_list: List[Dict]):
    """Update the knowledge base with recent tweets from top KOLs."""
    TOP_KOLS = 5
    TWEETS_PER_KOL = 15
    REQUEST_DELAY = 5

    # Select random sample of KOLs
    selected_kols = random.sample(kol_list, min(TOP_KOLS, len(kol_list)))

    # Clear existing knowledge base
    knowledge_base.clear_collection()

    # Process each selected KOL
    for kol in selected_kols:
        tweets = await twitter_client.get_user_tweets(
            user_id=kol['user_id'],
            max_results=TWEETS_PER_KOL
        )

        if tweets:
            knowledge_base.add_tweets(tweets)

        # Rate limiting
        await asyncio.sleep(REQUEST_DELAY)
Configuration Constants:
TOP_KOLS = 5 - Number of KOLs to sample per update
TWEETS_PER_KOL = 15 - Maximum tweets to fetch per KOL
REQUEST_DELAY = 5 - Seconds between API calls (rate limiting)
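As a rough budget, these constants bound what a single update run can fetch and how long it spends sleeping. The sketch below only does that arithmetic; it assumes one API call per KOL and one delay after each call, and the helper name `update_budget` is hypothetical rather than part of the project.

```python
TOP_KOLS = 5          # KOLs sampled per update
TWEETS_PER_KOL = 15   # maximum tweets fetched per KOL
REQUEST_DELAY = 5     # seconds slept after each KOL


def update_budget(num_kols_available: int) -> dict:
    """Upper bounds for one update run (hypothetical helper, not project code)."""
    # The update never processes more KOLs than the config lists.
    selected = min(TOP_KOLS, num_kols_available)
    return {
        "kols_processed": selected,
        "max_tweets": selected * TWEETS_PER_KOL,
        "min_sleep_seconds": selected * REQUEST_DELAY,
    }


print(update_budget(5))
# {'kols_processed': 5, 'max_tweets': 75, 'min_sleep_seconds': 25}
```

With the defaults, one run adds at most 75 tweets and spends at least 25 seconds in delays, not counting the time of the API calls themselves.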
The knowledge base is registered as a tool in chatbot.py:304-308:
chatbot.py
if os.getenv("USE_TWITTER_KNOWLEDGE_BASE", "true").lower() == "true" and knowledge_base is not None:
    tools.append(Tool(
        name="query_twitter_knowledge_base",
        description="""Query the Twitter knowledge base for insights from key opinion leaders.
        Returns relevant tweets that match your query.
        Use this to understand current discussions, trends, and opinions from influential accounts.
        Example: query_twitter_knowledge_base("latest developments in AI")
        """,
        func=lambda query: knowledge_base.query_knowledge_base(query)
    ))
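The registration condition compares the environment variable against the literal string "true", so it is worth seeing which values actually enable the tool. The following sketch reproduces only that comparison; the helper name `flag_enabled` is hypothetical and does not exist in chatbot.py.

```python
import os


def flag_enabled(name: str, default: str = "true") -> bool:
    """Mirror the check above: only the string 'true' (any case) enables the tool."""
    return os.getenv(name, default).lower() == "true"


os.environ["USE_TWITTER_KNOWLEDGE_BASE"] = "True"
print(flag_enabled("USE_TWITTER_KNOWLEDGE_BASE"))  # True: case is ignored

os.environ["USE_TWITTER_KNOWLEDGE_BASE"] = "1"
print(flag_enabled("USE_TWITTER_KNOWLEDGE_BASE"))  # False: "1" is not "true"

del os.environ["USE_TWITTER_KNOWLEDGE_BASE"]
print(flag_enabled("USE_TWITTER_KNOWLEDGE_BASE"))  # True: unset falls back to the default
```

In practice this means values such as "1" or "yes" disable the tool, and leaving the variable unset keeps it enabled.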
User: What are KOLs saying about Ethereum scaling?
Agent: [Queries Twitter KB with "Ethereum scaling solutions"]
AI: Based on recent tweets from key opinion leaders, there's discussion about:
- Vitalik mentioned progress on Proto-Danksharding
- Several KOLs are excited about Layer 2 adoption metrics...
[
  {
    "speaker": "Host",
    "content": "Welcome to today's episode about blockchain technology..."
  },
  {
    "speaker": "Guest",
    "content": "Thanks for having me. I'm excited to discuss..."
  }
]
The default directory is youtube_scraper/jsonoutputs/
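Before starting the agent, it can help to check that transcript files match the structure shown above. This is a minimal sketch, assuming every segment must be an object with both `speaker` and `content` keys (the example suggests this, but the source does not state it); `validate_transcript` and `check_directory` are hypothetical helpers, not part of the project.

```python
import json
from pathlib import Path

# Assumption: both keys are required in every segment.
REQUIRED_KEYS = {"speaker", "content"}


def validate_transcript(data) -> list:
    """Return a list of problems; an empty list means the structure matches."""
    if not isinstance(data, list):
        return ["top-level value must be a list of segments"]
    problems = []
    for i, seg in enumerate(data):
        if not isinstance(seg, dict):
            problems.append(f"segment {i}: expected an object")
            continue
        missing = REQUIRED_KEYS - seg.keys()
        if missing:
            problems.append(f"segment {i}: missing {sorted(missing)}")
    return problems


def check_directory(directory: str = "youtube_scraper/jsonoutputs") -> dict:
    """Validate every *.json file in the directory; maps file name to problems."""
    results = {}
    for path in sorted(Path(directory).glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            results[path.name] = [f"invalid JSON: {exc}"]
            continue
        results[path.name] = validate_transcript(data)
    return results
```

Calling `check_directory()` returns a dictionary in which any file with a non-empty problem list needs fixing before it is ingested.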
3. Initialize Knowledge Base
When starting the agent:
Do you want to initialize the Podcast knowledge base? (y/n): y
Podcast knowledge base initialized successfully
Current podcast knowledge base stats: {'count': 0, 'last_update': ...}
Checking for new podcast transcripts...
Found 3 new JSON files to process
Added 245 segments to knowledge base
From chatbot.py:362-369, the podcast KB is registered as a tool:
chatbot.py
if os.getenv("USE_PODCAST_KNOWLEDGE_BASE", "true").lower() == "true" and podcast_knowledge_base is not None:
    tools.append(Tool(
        name="query_podcast_knowledge_base",
        func=lambda query: podcast_knowledge_base.format_query_results(
            podcast_knowledge_base.query_knowledge_base(query)
        ),
        description="""Query the podcast knowledge base for information from podcast transcripts.
        Returns relevant segments from podcast episodes.
        Use this to answer questions about topics discussed in the podcast.
        Example: query_podcast_knowledge_base("What did the guest say about DeFi?")
        """
    ))
Error getting tweets for user 12345: 429 Too Many Requests
Solution:
The framework already sets wait_on_rate_limit=True (line 30 of twitter_agent/custom_twitter_actions.py), so if the error persists:
Increase REQUEST_DELAY in update_knowledge_base
Reduce TWEETS_PER_KOL or TOP_KOLS
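If a fixed delay is not enough, another option is to retry a failed request with exponential backoff. The sketch below is not part of the framework: `fetch_with_backoff` is a hypothetical wrapper, and it catches a bare `Exception` only for illustration, since the exact exception the Twitter client raises on a 429 is not shown in the source.

```python
import asyncio
import random


async def fetch_with_backoff(fetch, max_attempts: int = 4,
                             base_delay: float = 5.0, sleep=asyncio.sleep):
    """Await fetch(), retrying with exponential backoff plus up to 1s of jitter."""
    for attempt in range(max_attempts):
        try:
            return await fetch()
        except Exception:
            # Assumption: in real code, narrow this to the client's rate-limit error.
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            await sleep(delay)
```

Inside the update loop it could wrap the existing call, for example `tweets = await fetch_with_backoff(lambda: twitter_client.get_user_tweets(user_id=kol['user_id'], max_results=TWEETS_PER_KOL))`.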
ChromaDB Persistence Issues
Error initializing collection: database is locked
Solution:
Ensure only one agent instance is running
Check for orphaned processes: ps aux | grep python
Delete lock file: rm chroma_db/*.lock
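Before running the `rm` command, it may be safer to see which files the pattern would match. This is a small sketch that only lists candidates and deletes nothing; the `chroma_db` directory name comes from the command above, and `find_lock_files` is a hypothetical helper.

```python
from pathlib import Path


def find_lock_files(db_dir: str = "chroma_db") -> list:
    """Return *.lock files directly inside db_dir, without deleting anything."""
    root = Path(db_dir)
    if not root.is_dir():
        return []  # nothing to report if the directory does not exist
    return sorted(root.glob("*.lock"))


for lock in find_lock_files():
    print(lock)
```

If the list is empty, the "database is locked" error probably comes from another running process rather than a stale lock file.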
Embedding Model Errors
Error loading model 'all-mpnet-base-v2'
Solution:
# Model downloads automatically on first use
# If download fails, try manually:
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-mpnet-base-v2')"
No Results Returned
No results found in knowledge base
Solution:
Verify KB is populated: knowledge_base.get_collection_stats()
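The check above can be turned into a quick guard. The sketch assumes `get_collection_stats()` returns a dictionary with a `count` key, as the podcast stats output (`{'count': 0, 'last_update': ...}`) suggests; the return shape for the Twitter knowledge base is an assumption, and `is_populated` is a hypothetical helper.

```python
def is_populated(stats: dict) -> bool:
    """True when the stats dict reports at least one stored item."""
    # Assumption: the item count lives under the "count" key.
    return int(stats.get("count", 0)) > 0


print(is_populated({"count": 245, "last_update": None}))  # True
print(is_populated({"count": 0, "last_update": None}))    # False
```

A call such as `is_populated(knowledge_base.get_collection_stats())` returning False indicates the update step has not added anything yet, so queries will come back empty.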