Master the art of defining custom analysis criteria to accurately identify the tweets you want to delete.
The quality of your tweet analysis depends entirely on how well you define your criteria. This guide covers advanced strategies for customizing your analysis rules.
Forbidden words trigger exact, case-insensitive matching. The following excerpt from analyzer.py:113-115 shows how the list is joined into a single criteria line:
if settings.criteria.forbidden_words:
    words = ", ".join(settings.criteria.forbidden_words)
    criteria_parts.append(f"Contains any of these words: {words}")
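To see what this excerpt produces, it can be exercised in isolation. The sketch below is a minimal stand-in, assuming a plain dataclass in place of the real settings object; `Criteria` and `build_forbidden_words_part` are hypothetical names used for illustration and are not part of analyzer.py.

```python
from dataclasses import dataclass, field


@dataclass
class Criteria:
    # Hypothetical stand-in for settings.criteria; only the field used in the excerpt.
    forbidden_words: list[str] = field(default_factory=list)


def build_forbidden_words_part(criteria: Criteria) -> list[str]:
    """Mirror the excerpt: emit one criteria line when the list is non-empty."""
    criteria_parts: list[str] = []
    if criteria.forbidden_words:
        words = ", ".join(criteria.forbidden_words)
        criteria_parts.append(f"Contains any of these words: {words}")
    return criteria_parts


parts = build_forbidden_words_part(Criteria(forbidden_words=["damn", "wtf"]))
print(parts)

# An empty list contributes nothing to the criteria.
empty = build_forbidden_words_part(Criteria())
print(empty)
```

Because the `if` guard skips empty lists, leaving `forbidden_words` as `[]` simply omits this line from the criteria.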
{
  "criteria": {
    "topics_to_exclude": [
      "Profanity or unprofessional language",
      "Personal attacks or insults directed at individuals",
      "Complaints about employers or coworkers",
      "Excessive personal drama or venting",
      "Controversial political statements",
      "Inappropriate jokes or humor"
    ]
  }
}
Political neutrality
Remove all political content:
config.json
{
  "criteria": {
    "topics_to_exclude": [
      "Political opinions or election commentary",
      "Partisan statements about political parties",
      "Policy advocacy or activism",
      "Government or political figure criticism",
      "Political memes or jokes"
    ]
  }
}
Technology focus
Keep only technical/professional content:
config.json
{
  "criteria": {
    "topics_to_exclude": [
      "Non-technical personal updates",
      "Entertainment or pop culture commentary",
      "Sports opinions or fandom",
      "Food, travel, or lifestyle content",
      "General social media conversations"
    ]
  }
}
Outdated opinions
Remove content that doesn’t reflect current views:
config.json
{
  "criteria": {
    "topics_to_exclude": [
      "Cryptocurrency or NFT enthusiasm from 2021-2022",
      "Hot takes or strong opinions that I no longer hold",
      "Technology predictions that proved incorrect",
      "Trend-chasing or hype-driven content",
      "Overly optimistic or pessimistic forecasts"
    ]
  }
}
{
  "criteria": {
    "tone_requirements": [
      "Professional and courteous language only",
      "Respectful disagreement without personal attacks",
      "Evidence-based claims with sources when possible",
      "Constructive criticism rather than negativity",
      "Thoughtful communication, not reactive hot takes"
    ]
  }
}
"additional_instructions": "Be especially strict with tweets from 2020-2021 during the pandemic and crypto boom. Content from this period often doesn't reflect my current views."
Threshold guidance
Set the bar for deletion:
"additional_instructions": "When in doubt, mark for deletion. I prefer a clean timeline over preserving borderline content. Err on the side of caution."
Or the opposite:
"additional_instructions": "Only flag tweets that are clearly problematic. Preserve content unless it strongly violates criteria. When uncertain, keep it."
Context-specific rules
Handle special cases:
"additional_instructions": "Technical tweets about blockchain technology are fine - only flag cryptocurrency speculation or financial advice. Distinguish between technology discussion and hype."
Reputation focus
Emphasize the goal:
"additional_instructions": "Focus on professional reputation protection. Would a future employer, client, or colleague be concerned by this tweet? If yes, flag it."
{
  "criteria": {
    "forbidden_words": [
      "damn",
      "hell",
      "wtf",
      "crap",
      "sucks"
    ],
    "topics_to_exclude": [
      "Profanity or unprofessional language",
      "Personal attacks on individuals or companies",
      "Complaints about employers or coworkers",
      "Controversial political or religious opinions",
      "Excessive personal drama or life updates",
      "Inappropriate jokes or off-color humor"
    ],
    "tone_requirements": [
      "Professional and courteous language",
      "Respectful disagreement without insults",
      "Constructive rather than purely critical",
      "Thoughtful analysis over hot takes"
    ],
    "additional_instructions": "Flag any content that could negatively impact professional opportunities. Would I want a hiring manager or client to see this? If no, delete it."
  }
}
{
  "criteria": {
    "forbidden_words": [
      "crypto",
      "NFT",
      "web3",
      "HODL",
      "wagmi",
      "gm"
    ],
    "topics_to_exclude": [
      "Cryptocurrency or NFT speculation",
      "Non-technical personal life updates",
      "Political opinions unrelated to technology policy",
      "Sports, entertainment, or pop culture",
      "Food, travel, or lifestyle content",
      "Clickbait or engagement-bait posts"
    ],
    "tone_requirements": [
      "Technical accuracy and precision",
      "Objective analysis over emotional reactions",
      "Nuanced discussion of trade-offs",
      "Educational or informative content",
      "Professional communication standards"
    ],
    "additional_instructions": "Keep only tweets about software development, technology, engineering, or related professional topics. Remove everything else to maintain a focused technical profile."
  }
}
{
  "criteria": {
    "forbidden_words": [],
    "topics_to_exclude": [
      "Political opinions or partisan statements",
      "Election commentary or predictions",
      "Policy advocacy or activism",
      "Criticism of political figures or parties",
      "Social issues with political dimensions",
      "Government actions or legislation commentary"
    ],
    "tone_requirements": [
      "Non-partisan communication",
      "Objective analysis over opinion",
      "Professional neutrality maintained"
    ],
    "additional_instructions": "Remove all political content regardless of viewpoint. Keep technology, professional, and neutral educational content only."
  }
}
{
  "criteria": {
    "forbidden_words": [
      "COVID",
      "pandemic",
      "quarantine",
      "lockdown",
      "crypto",
      "NFT",
      "web3",
      "metaverse"
    ],
    "topics_to_exclude": [
      "COVID-19 or pandemic-related hot takes",
      "Cryptocurrency or NFT enthusiasm",
      "Predictions about web3 or metaverse",
      "Work-from-home lifestyle tweets",
      "Quarantine or lockdown content",
      "Technology hype from 2020-2022 that aged poorly"
    ],
    "tone_requirements": [
      "Timeless content that remains relevant",
      "Measured opinions rather than reactive takes",
      "Evidence-based rather than speculative"
    ],
    "additional_instructions": "Focus on removing dated content from 2020-2022 that was reactive to temporary circumstances or speculative hype cycles. Preserve evergreen technical and professional content."
  }
}
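Before running an analysis with a profile like the ones above, it can help to sanity-check the shape of the file. The sketch below assumes the four criteria keys used in these examples, with the three list fields holding strings and `additional_instructions` holding a single string. `validate_criteria` is a hypothetical helper, not part of the tool.

```python
import json

LIST_KEYS = ("forbidden_words", "topics_to_exclude", "tone_requirements")


def validate_criteria(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks fine."""
    criteria = config.get("criteria")
    if not isinstance(criteria, dict):
        return ['missing or non-object "criteria" section']
    problems: list[str] = []
    for key in LIST_KEYS:
        value = criteria.get(key, [])  # a missing key is treated as an empty list
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f'"{key}" should be a list of strings')
    instructions = criteria.get("additional_instructions", "")
    if not isinstance(instructions, str):
        problems.append('"additional_instructions" should be a string')
    return problems


# Deliberately malformed sample: topics_to_exclude is a string, not a list.
sample = json.loads("""
{ "criteria": { "forbidden_words": [], "topics_to_exclude": "Political opinions",
  "additional_instructions": "Remove all political content regardless of viewpoint." } }
""")
problems = validate_criteria(sample)
print(problems)
```

Treating missing keys as empty is a design assumption here; the tool itself may require every key to be present.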
If the tool flags too many tweets:
Add “only flag if clearly problematic” to additional instructions
Review forbidden words for common false positives
Test with a smaller word list
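One way to review forbidden words for false positives is to count how many tweets each word would hit before running the full analysis. The sketch below is only an approximation: it assumes whole-word, case-insensitive matching, which may differ from how the analyzer interprets the list, and the sample tweets are made up for illustration.

```python
import re


def count_word_hits(tweets: list[str], forbidden_words: list[str]) -> dict[str, int]:
    """Count tweets containing each word as a whole word, ignoring case."""
    hits: dict[str, int] = {}
    for word in forbidden_words:
        # \b anchors avoid counting "hell" inside "Hello" or "shell".
        pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        hits[word] = sum(1 for text in tweets if pattern.search(text))
    return hits


# Made-up sample data for illustration only.
sample_tweets = [
    "Hello world, shipping a new shell script today",
    "What the hell happened to the build?",
    "gm everyone",
    "GM recalled another truck model",
]
hits = count_word_hits(sample_tweets, ["hell", "gm"])
print(hits)
```

In this sample, "gm" also matches a tweet about General Motors, which is the kind of collision that makes short words risky on a forbidden list.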
If the tool misses obvious deletions:
Add more forbidden words from missed tweets
Create more specific topic categories
Strengthen tone requirements
Add “when in doubt, flag it” to additional instructions
Review missed tweets for patterns
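Reviewing missed tweets for patterns can start with a simple word count. The sketch below assumes you have collected the text of the missed tweets into a list by hand; the stop-word set and the sample tweets are illustrative placeholders, and `common_terms` is a hypothetical helper.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set; extend it for real use.
STOP_WORDS = {"the", "a", "to", "is", "and", "of", "in", "this", "my", "so"}


def common_terms(missed_tweets: list[str], top_n: int = 5) -> list[tuple[str, int]]:
    """Return the most frequent non-stop-words across the missed tweets."""
    counts = Counter()
    for text in missed_tweets:
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up examples of tweets the analysis should have flagged.
missed = [
    "This airdrop is going to the moon",
    "Bought the airdrop dip again",
    "My airdrop bags are pumping",
]
terms = common_terms(missed, top_n=1)
print(terms)
```

Terms that recur across missed tweets are candidates for the forbidden word list or for a more specific topic category.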
If similar tweets get different decisions:
Make criteria more specific and unambiguous
Provide examples in additional instructions
Reduce overlapping topics
Test the same criteria multiple times (AI output can be non-deterministic)
AI analysis has inherent variability. Running the same tweet twice might yield different results. Focus on overall patterns, not individual edge cases.
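To gauge that variability, the same tweets can be analyzed several times and the decisions compared. The sketch below assumes a hypothetical `analyze(text) -> bool` callable that returns True when a tweet is flagged. The tool's real interface may differ, so a stub with deliberately inconsistent behavior stands in for it here.

```python
from collections.abc import Callable


def agreement_rate(tweets: list[str], analyze: Callable[[str], bool], runs: int = 3) -> float:
    """Fraction of tweets that receive the same decision on every run."""
    if not tweets:
        return 1.0
    stable = 0
    for text in tweets:
        decisions = {analyze(text) for _ in range(runs)}
        if len(decisions) == 1:  # every run agreed
            stable += 1
    return stable / len(tweets)


def make_flaky_stub() -> Callable[[str], bool]:
    """Stub analyzer: consistent on most tweets, alternates on ones containing 'maybe'."""
    calls = {"n": 0}

    def analyze(text: str) -> bool:
        calls["n"] += 1
        if "maybe" in text:
            return calls["n"] % 2 == 0  # decision flips from call to call
        return "wtf" in text

    return analyze


rate = agreement_rate(
    ["wtf is this", "nice release", "maybe a hot take", "ok"],
    make_flaky_stub(),
)
print(rate)
```

A low agreement rate suggests the criteria are ambiguous for a class of tweets, which points back to the steps above: tighten the wording and reduce overlapping topics.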
# After analysis, find tweets NOT in results
import pandas as pd

# Read IDs as strings so they compare equal to the IDs parsed from URLs below
all_tweets = pd.read_csv('data/tweets/transformed/tweets.csv', dtype={'id': str})
flagged = pd.read_csv('data/tweets/processed/results.csv')

# Extract IDs from URLs in flagged tweets (the last path segment)
flagged['id'] = flagged['tweet_url'].str.split('/').str[-1]

kept = all_tweets[~all_tweets['id'].isin(flagged['id'])]
print(f"Kept {len(kept)} tweets")

# Review a random sample of kept tweets (at most 20)
print(kept.sample(min(20, len(kept))))
"topics_to_exclude": [ "Political opinions or partisan statements"]
Vague additional instructions
❌ Problem: Instructions too open to interpretation
"additional_instructions": "Delete bad tweets"
✅ Solution: Be specific about what “bad” means
"additional_instructions": "Flag any content that could harm my professional reputation as a software engineer, including unprofessional language, controversial opinions, or outdated technical takes"