
Prerequisites

Before installing Tweet Audit Tool, ensure you have the following:
1. Python 3.12 or higher

Verify your Python installation:
python --version
# or
python3 --version
The tool requires Python 3.12+ for modern type hints and language features.
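The same check can be made from inside Python, which is convenient at the top of a script. This is a minimal sketch; the helper name `meets_minimum` is hypothetical and not part of the tool.

```python
import sys

MIN_VERSION = (3, 12)  # minimum stated in the prerequisites above


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); tuple comparison is lexicographic.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old, 3.12+ required"
    print(f"Python {current}: {status}")
```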
2. X (Twitter) archive

Request and download your complete X archive:
  1. Go to X.com → More → Settings and Privacy → Your Account
  2. Click “Download an archive of your data”
  3. Wait for confirmation email (typically 24-48 hours)
  4. Download and extract the ZIP file
Archive requests can take 24-48 hours to process. Request yours early!
The extracted archive should contain:
your-archive/
├── data/
│   └── tweets.json    # This is the file you need
└── ...
3. Gemini API key

Get a free API key from Google AI Studio:
  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Click “Create API Key”
  4. Copy your API key (you’ll need it for configuration)
Gemini 2.5 Flash’s free tier offers 15 requests per minute and 1,500 per day, enough to analyze 1,000+ tweets daily.
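To see what those limits mean in practice, the arithmetic can be sketched as below. It assumes one API request per tweet, which may not match how the tool actually batches its calls, and it takes the quota figures from the sentence above rather than from Google's current documentation. All function names are hypothetical.

```python
REQUESTS_PER_MINUTE = 15    # free-tier figure quoted above
REQUESTS_PER_DAY = 1_500    # free-tier figure quoted above


def min_seconds_between_calls(rpm=REQUESTS_PER_MINUTE):
    """Smallest evenly spaced delay that stays within a per-minute quota."""
    return 60.0 / rpm


def days_needed(tweet_count, per_day=REQUESTS_PER_DAY):
    """Days of quota required at one request per tweet (assumption), rounded up."""
    return -(-tweet_count // per_day)  # ceiling division on integers


def minutes_for_daily_quota(rpm=REQUESTS_PER_MINUTE, per_day=REQUESTS_PER_DAY):
    """Minimum minutes needed to spend the whole daily quota at the per-minute cap."""
    return per_day / rpm
```

Under these assumptions, evenly spaced calls need to be at least 4 seconds apart to stay within the per-minute limit, and an archive of 4,000 tweets would take 3 days of quota.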

Installation

1. Clone the repository

git clone https://github.com/DanielPopoola/tweet-audit-impl.git
cd tweet-audit-impl
2. Install dependencies

Install the dependencies with pip:
pip install -r requirements.txt
Alternatively, use uv, a modern Python package installer that is significantly faster than pip at dependency resolution and installation. It is the recommended option.
The tool requires these core dependencies:
  • google-genai>=1.63.0 - Google Gemini AI client
  • python-dotenv>=1.2.1 - Environment variable management
  • pytest>=9.0.2 - Testing framework
  • ruff>=0.15.1 - Code linting and formatting
3. Verify installation

Test that the tool is working:
python src/main.py
You should see the help message:
usage: main.py [-h] [{extract-tweets,analyze-tweets}]

Evaluate tweets against predetermined criteria

positional arguments:
  {extract-tweets,analyze-tweets}
                        Command to execute

Environment configuration

Create a .env file in the project root with your configuration:
1. Create .env file

touch .env
2. Add required variables

Open .env in your editor and add:
.env
# Required: Your Gemini API key from Google AI Studio
GEMINI_API_KEY=your_api_key_here

# Required: Your X (Twitter) username
X_USERNAME=your_x_username
Never commit your .env file to version control. It’s already included in .gitignore.
3. Add optional variables

Customize these settings as needed:
.env
# Optional: Gemini model to use (default: gemini-2.5-flash)
GEMINI_MODEL=gemini-2.5-flash

# Optional: Rate limit between API calls in seconds (default: 1.0)
RATE_LIMIT_SECONDS=1.0

# Optional: Log level (default: INFO)
# Options: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_LEVEL=INFO
RATE_LIMIT_SECONDS controls the delay between API calls. Increase this value (e.g., 2.0) if you encounter rate limiting errors.
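A fixed delay between calls can be sketched as follows. This illustrates the idea only and is not the tool's actual code: `read_rate_limit` and `throttled` are hypothetical names, and the fallback of 1.0 mirrors the default documented above.

```python
import os
import time


def read_rate_limit(env=None, default=1.0):
    """Parse RATE_LIMIT_SECONDS from an environment mapping, falling back to `default`."""
    env = os.environ if env is None else env
    raw = env.get("RATE_LIMIT_SECONDS", "")
    try:
        value = float(raw)
    except ValueError:
        return default  # unset or not a number
    # Reject negative values (and NaN, which fails the comparison).
    return value if value >= 0 else default


def throttled(items, delay, sleep=time.sleep):
    """Yield items one by one, pausing `delay` seconds between consecutive items."""
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay)  # no pause before the first item
        yield item
```

A loop such as `for tweet in throttled(tweets, read_rate_limit()):` would then space out whatever work happens in the loop body. The `sleep` parameter exists so the pause can be replaced in tests.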

Criteria configuration

Define what makes a tweet “unaligned” with your values:
1. Create config.json (optional)

Create a config.json file in the project root:
touch config.json
If you skip this step, the tool uses sensible defaults focused on professional content moderation.
2. Define your criteria

Add your custom criteria to config.json:
config.json
{
  "criteria": {
    "forbidden_words": [
      "crypto",
      "NFT",
      "web3",
      "damn",
      "wtf"
    ],
    "topics_to_exclude": [
      "Profanity or unprofessional language",
      "Personal attacks or insults",
      "Outdated political opinions",
      "Controversial statements",
      "Cryptocurrency hype"
    ],
    "tone_requirements": [
      "Professional language only",
      "Respectful communication",
      "No offensive humor",
      "Constructive criticism only"
    ],
    "additional_instructions": "Flag any content that could harm professional reputation or doesn't align with current values. Be extra cautious with tweets from 2020-2021."
  }
}
3. Understand the criteria types

forbidden_words
Exact word matches (case-insensitive). The AI flags tweets containing any of these words. Example:
"forbidden_words": ["damn", "wtf", "crypto"]
  • Flags: “Crypto is the future!” ✓
  • Keeps: “Cryptocurrency adoption” ✗ (different word)
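The difference between these two examples comes down to whole-word matching. The sketch below reproduces that behavior locally with a regular expression. It is only an illustration, since the tool delegates the decision to the AI model, and `contains_forbidden_word` is a hypothetical helper.

```python
import re


def contains_forbidden_word(text, forbidden_words):
    """Return True if any forbidden word appears in `text` as a whole word (case-insensitive)."""
    for word in forbidden_words:
        # \b anchors the match at word boundaries, so "crypto" does not match "cryptocurrency".
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False
```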
topics_to_exclude
High-level content categories. The AI interprets these broadly and flags tweets matching these themes. Example:
"topics_to_exclude": [
  "Political opinions",
  "Cryptocurrency hype",
  "Personal attacks"
]
tone_requirements
Stylistic rules for communication tone. The AI checks if tweets violate these standards. Example:
"tone_requirements": [
  "Professional language",
  "No sarcasm",
  "Constructive criticism only"
]
additional_instructions
Free-form guidance for the AI. Use this for nuanced instructions or context-specific rules. Example:
"additional_instructions": "Be extra cautious with tweets from 2020-2021 during controversial periods. Flag anything that could be misinterpreted out of context."

Default criteria

If you don’t create config.json, the tool uses these defaults (defined in src/config.py:51-64):
Criteria(
    forbidden_words=[],
    topics_to_exclude=[
        "Profanity or unprofessional language",
        "Personal attacks or insults",
        "Outdated political opinions",
    ],
    tone_requirements=[
        "Professional language only",
        "Respectful communication",
    ],
    additional_instructions="Flag any content that could harm professional reputation"
)
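One way to picture the fallback is a loader that returns the defaults when config.json is absent. This is a sketch under assumptions: the real `Criteria` class lives in src/config.py and may differ, `criteria_from_dict` and `load_criteria` are hypothetical names, and letting keys missing from the file keep their defaults is a guess about the behavior.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Criteria:
    # Field names follow the documented config.json keys; defaults copy the list above.
    forbidden_words: list = field(default_factory=list)
    topics_to_exclude: list = field(default_factory=lambda: [
        "Profanity or unprofessional language",
        "Personal attacks or insults",
        "Outdated political opinions",
    ])
    tone_requirements: list = field(default_factory=lambda: [
        "Professional language only",
        "Respectful communication",
    ])
    additional_instructions: str = (
        "Flag any content that could harm professional reputation"
    )


def criteria_from_dict(data):
    """Build Criteria from parsed JSON, ignoring unknown keys."""
    section = data.get("criteria", {})
    # Keys missing from the file keep their defaults (assumption).
    known = {k: v for k, v in section.items() if k in Criteria.__dataclass_fields__}
    return Criteria(**known)


def load_criteria(path="config.json"):
    """Return Criteria from `path`, or the defaults if the file does not exist."""
    config_file = Path(path)
    if not config_file.exists():
        return Criteria()
    return criteria_from_dict(json.loads(config_file.read_text(encoding="utf-8")))
```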

Prepare your Twitter archive

Place your extracted Twitter archive in the correct location:
1. Create data directory

mkdir -p data/tweets
2. Copy tweets.json

Copy the tweets.json file from your extracted X archive:
cp /path/to/your/archive/data/tweets.json data/tweets/tweets.json
The tool expects to find your tweets at data/tweets/tweets.json by default. This path is configured in src/config.py:35.
3. Verify archive structure

Your project should now look like:
tweet-audit-impl/
├── data/
│   └── tweets/
│       └── tweets.json    # Your X archive
├── src/
│   └── ...
├── .env                    # Your environment variables
├── config.json             # Your criteria (optional)
└── ...
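The layout can also be checked programmatically. The sketch below lists which of the expected files are missing under a project root. `missing_files` is a hypothetical helper, and config.json is left out of the check because it is optional.

```python
from pathlib import Path

# Paths taken from the tree above, relative to the project root.
REQUIRED_FILES = ("data/tweets/tweets.json", ".env")


def missing_files(root=".", required=REQUIRED_FILES):
    """Return the required paths that do not exist as files under `root`."""
    base = Path(root)
    return [rel for rel in required if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Project layout looks complete")
```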

Verify setup

Before running analysis, confirm that your API key loads from .env:
python -c "from dotenv import load_dotenv; import os; load_dotenv(); print('API key loaded' if os.getenv('GEMINI_API_KEY') else 'Missing GEMINI_API_KEY')"
If the command prints “API key loaded”, you’re ready to start analyzing tweets!

Troubleshooting

Missing dependencies
Dependencies weren’t installed correctly. Try:
pip install --upgrade -r requirements.txt

Missing API key
Your .env file is missing or doesn’t contain GEMINI_API_KEY. Verify:
grep GEMINI_API_KEY .env
Ensure the line reads:
GEMINI_API_KEY=your_actual_api_key_here

Archive not found
Your Twitter archive isn’t in the expected location. Check:
ls data/tweets/
Copy your archive to the correct location:
mkdir -p data/tweets
cp /path/to/archive/data/tweets.json data/tweets/

Unsupported Python version
The tool requires Python 3.12+. Check your version:
python --version
If you have multiple Python versions installed, run the tool with 3.12 explicitly:
python3.12 src/main.py

Next steps

Quick start

Now that you’re set up, run your first tweet analysis.
