This guide walks you through a complete tweet audit workflow, from setting up your environment to reviewing the final results.

Prerequisites checklist

Before you begin, ensure you have:
1. X archive downloaded: request your archive from X.com → Settings → Download archive. This takes 24-48 hours.
2. Gemini API key ready: get a free API key from Google AI Studio.
3. Python 3.12+ installed: verify with `python --version` in your terminal.

Complete workflow

Step 1: Initial setup

Clone the repository and install dependencies:
```bash
git clone https://github.com/DanielPopoola/tweet-audit-impl.git
cd tweet-audit-impl
pip install -r requirements.txt
```

Step 2: Configure environment

Create a `.env` file in the project root with your credentials:

```bash
# .env

# Required: Your Gemini API key
GEMINI_API_KEY=AIzaSyD_example_key_here_abc123

# Required: Your X username
X_USERNAME=yourhandle

# Optional: Rate limiting in seconds (default: 1.0)
RATE_LIMIT_SECONDS=1.0

# Optional: Log level (default: INFO)
LOG_LEVEL=INFO
```
Start with the default RATE_LIMIT_SECONDS=1.0 setting. This processes ~3,600 tweets per hour while staying well within Gemini’s free tier limits (1,500 requests per day).
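To see how these variables fit together, here is a minimal sketch of a settings loader, assuming the `.env` values have already been exported into the process environment (for example via python-dotenv). The function name and dictionary keys are illustrative, not the tool's actual API:

```python
import os

def load_settings() -> dict:
    """Read tool settings from environment variables.

    GEMINI_API_KEY and X_USERNAME are required; the rest fall back
    to the defaults documented above.
    """
    missing = [name for name in ("GEMINI_API_KEY", "X_USERNAME")
               if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return {
        "api_key": os.environ["GEMINI_API_KEY"],
        "username": os.environ["X_USERNAME"],
        "rate_limit_seconds": float(os.environ.get("RATE_LIMIT_SECONDS", "1.0")),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
    }

# Demo with placeholder values:
os.environ.setdefault("GEMINI_API_KEY", "AIza_example")
os.environ.setdefault("X_USERNAME", "yourhandle")
settings = load_settings()
```

Failing fast on missing required keys keeps a misconfigured run from getting halfway through your archive before erroring out.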

Step 3: Create your criteria

Create a `config.json` file with your deletion criteria:

```json
{
  "criteria": {
    "forbidden_words": ["damn", "wtf", "crypto", "NFT"],
    "topics_to_exclude": [
      "Profanity or unprofessional language",
      "Personal attacks or insults",
      "Outdated political opinions"
    ],
    "tone_requirements": [
      "Professional language only",
      "Respectful communication"
    ],
    "additional_instructions": "Flag any content that could harm professional reputation"
  }
}
```

If you don't create `config.json`, the tool uses sensible defaults focused on maintaining professional content.
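The fallback behavior amounts to "use the file if present, otherwise use defaults." A sketch of that pattern, where `DEFAULT_CRITERIA` below is illustrative rather than the tool's real built-in defaults:

```python
import json
from pathlib import Path

# Illustrative fallback criteria; the tool's actual defaults may differ.
DEFAULT_CRITERIA = {
    "forbidden_words": [],
    "topics_to_exclude": ["Profanity or unprofessional language"],
    "tone_requirements": ["Professional language only"],
    "additional_instructions": "",
}

def load_criteria(path: str = "config.json") -> dict:
    """Return user criteria from config.json, or defaults if absent."""
    config_file = Path(path)
    if not config_file.exists():
        return DEFAULT_CRITERIA
    with config_file.open(encoding="utf-8") as f:
        return json.load(f)["criteria"]
```

Note the loader reads the nested `"criteria"` object, matching the file layout shown above.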

Step 4: Prepare your archive

Extract your X archive and place the tweets file in the correct location:

```bash
mkdir -p data/tweets
cp ~/Downloads/twitter-archive/data/tweets.json data/tweets/tweets.json
```

Verify your archive is in place:

```bash
ls -lh data/tweets/tweets.json
```

Expected output:

```
-rw-r--r-- 1 user user 25M Jan 15 10:30 data/tweets/tweets.json
```

Step 5: Extract tweets

Convert the JSON archive to CSV format for processing:

```bash
python src/main.py extract-tweets
```

Expected output:

```
Extracting tweets from archive...
Successfully extracted 5243 tweets
```

This creates `data/tweets/transformed/tweets.csv` with columns:
  • id - Tweet ID
  • content - Tweet text
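Conceptually, extraction just flattens the archive into those two columns. A sketch, assuming the archive is a JSON list of objects each carrying `id` and `full_text` fields (real X archive exports vary in shape, so the tool's actual parser is likely more defensive):

```python
import csv
import json
from pathlib import Path

def extract_tweets(archive_path: str, out_path: str) -> int:
    """Flatten a tweet archive into a two-column CSV (id, content)."""
    records = json.loads(Path(archive_path).read_text(encoding="utf-8"))
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "content"])
        for record in records:
            writer.writerow([record["id"], record["full_text"]])
    return len(records)

# Demo with a tiny throwaway archive:
Path("demo_archive.json").write_text(json.dumps(
    [{"id": "123", "full_text": "hello world"},
     {"id": "456", "full_text": "second tweet"}]))
count = extract_tweets("demo_archive.json", "demo_out/tweets.csv")
print(f"Successfully extracted {count} tweets")
```

Writing the CSV with an explicit header row is what lets the later analysis and review steps address columns by name.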

Step 6: Analyze tweets

Run the AI analysis to flag tweets for deletion:

```bash
python src/main.py analyze-tweets
```

Expected output:

```
Analyzing tweets...
Processing batch 1/525 (10 tweets)
Processing batch 2/525 (10 tweets)
...
Successfully analyzed 5243 tweets
```

The tool processes tweets in batches of 10 and saves progress automatically. If interrupted, simply run the command again to resume where you left off.
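The resume behavior can be pictured as a checkpoint file recording the last completed batch. A minimal sketch of the pattern; the `analyze_batch` placeholder stands in for the real Gemini call, and the checkpoint format is assumed, not taken from the tool:

```python
from pathlib import Path

BATCH_SIZE = 10
CHECKPOINT = Path("checkpoint.txt")

def analyze_batch(batch):
    """Placeholder for the real Gemini analysis call."""
    return [False] * len(batch)  # pretend nothing gets flagged

def analyze_with_resume(tweets):
    """Process tweets in batches, persisting progress after each one."""
    start = int(CHECKPOINT.read_text()) if CHECKPOINT.exists() else 0
    results = []
    batches = [tweets[i:i + BATCH_SIZE] for i in range(0, len(tweets), BATCH_SIZE)]
    for n, batch in enumerate(batches[start:], start=start):
        results.extend(analyze_batch(batch))
        CHECKPOINT.write_text(str(n + 1))  # resume point if interrupted
    return results

CHECKPOINT.unlink(missing_ok=True)  # start the demo fresh
demo = [f"tweet {i}" for i in range(25)]
flags = analyze_with_resume(demo)
```

Because the checkpoint is written only after a batch completes, a crash mid-batch costs at most one batch of rework on the next run.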

Step 7: Review results

Open the generated CSV file:

```bash
cat data/tweets/processed/results.csv
```

Example output:

```
tweet_url,deleted
https://x.com/yourhandle/status/1234567890,false
https://x.com/yourhandle/status/9876543210,false
https://x.com/yourhandle/status/5555555555,false
```

Each flagged tweet is listed with:
  • tweet_url - Direct link to the tweet
  • deleted - Status tracker (initially false)

Step 8: Manual deletion

Review and delete flagged tweets:
1. Open the tweet URL: click or copy the URL from the CSV to view the tweet on X.
2. Decide if you agree: read the tweet and determine whether you want to delete it.
3. Delete on X: if you agree, click the three dots on the tweet and select "Delete".
4. Track your progress: update the deleted column in the CSV to true for tweets you've removed.
The tool never deletes tweets automatically. You maintain complete control over what gets removed from your profile.
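If you would rather not edit the CSV by hand, a few lines of Python can flip the `deleted` flag for the URLs you have handled. A sketch, assuming the two-column `results.csv` layout shown above (the `mark_deleted` helper is hypothetical):

```python
import csv
from pathlib import Path

def mark_deleted(results_path: str, deleted_urls: set) -> None:
    """Set deleted=true for every row whose tweet_url is in deleted_urls."""
    path = Path(results_path)
    with path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        if row["tweet_url"] in deleted_urls:
            row["deleted"] = "true"
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["tweet_url", "deleted"])
        writer.writeheader()
        writer.writerows(rows)

# Demo with a throwaway file mirroring the results.csv layout:
Path("demo_results.csv").write_text(
    "tweet_url,deleted\n"
    "https://x.com/yourhandle/status/1234567890,false\n"
    "https://x.com/yourhandle/status/9876543210,false\n")
mark_deleted("demo_results.csv", {"https://x.com/yourhandle/status/1234567890"})
```

Keeping the tracking in the same CSV means your audit record stays in one file you can revisit later.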

Understanding the output

After analysis, your `data/` directory structure looks like:

```
data/
├── tweets/
│   ├── tweets.json              # Original X archive
│   ├── transformed/
│   │   └── tweets.csv           # Extracted tweets (5,243 rows)
│   └── processed/
│       └── results.csv          # Flagged tweets (127 rows)
└── checkpoint.txt               # Resume point (batch 525)
```

Result statistics

From the example above:
  • Total tweets: 5,243
  • Flagged for deletion: 127 (2.4%)
  • Processing time: ~1.5 hours at 1 req/sec
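These figures follow directly from the counts. A quick back-of-the-envelope check, assuming one tweet per second as the effective throughput:

```python
total = 5243
flagged = 127
rate_limit_seconds = 1.0

flag_rate = flagged / total * 100            # percentage flagged
hours = total * rate_limit_seconds / 3600    # wall-clock estimate

print(f"Flagged: {flag_rate:.1f}%")          # 2.4%
print(f"Estimated time: {hours:.1f} h")      # 1.5 h
```

Raising `RATE_LIMIT_SECONDS` scales the estimate linearly, so budget accordingly for larger archives.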

Next steps

  • Custom criteria - Learn how to fine-tune your deletion criteria
  • Large archives - Handle archives with 10,000+ tweets efficiently
