This document provides an overview of twitter-cli’s architecture, module organization, and key design patterns.

Project Structure

The codebase is organized as a single Python package with clear separation of concerns:
twitter_cli/
├── __init__.py          # Package metadata and version
├── cli.py               # CLI entry point and command handlers
├── client.py            # Twitter GraphQL API client
├── auth.py              # Cookie authentication and extraction
├── config.py            # Configuration loading and validation
├── constants.py         # API constants and headers
├── filter.py            # Tweet scoring and filtering
├── formatter.py         # Terminal output formatting (rich tables)
├── models.py            # Data models (Tweet, Author, Metrics)
└── serialization.py     # JSON serialization/deserialization

Module Overview

cli.py — Command-Line Interface

Purpose: Entry point for all CLI commands, argument parsing, and command orchestration. Key responsibilities:
  • Define all CLI commands using Click decorators
  • Handle command-line arguments and options
  • Coordinate between client, config, filter, and formatter modules
  • Handle errors and emit user-facing messages
Key functions:
  • cli() — Main CLI group (entry point)
  • feed() — Fetch home timeline or following feed
  • favorites() — Fetch bookmarks
  • search() — Search tweets by keyword
  • user(), user_posts(), likes() — User-related commands
  • tweet() — Fetch tweet detail with replies
  • list_timeline() — Fetch list timeline
  • followers(), following() — Fetch user connections
  • post(), delete_tweet() — Write operations
  • like(), unlike(), retweet(), unretweet() — Engagement operations
  • favorite(), unfavorite() — Bookmark operations
Example flow:
# twitter feed --max 50 --filter
# cli.py:169-208
def feed(feed_type, max_count, as_json, input_file, output_file, do_filter):
    config = load_config()  # Load config.yaml
    client = _get_client(config)  # Authenticate via auth.py
    tweets = client.fetch_home_timeline(max_count)  # Fetch via client.py
    filtered = filter_tweets(tweets, config)  # Apply filter.py
    print_tweet_table(filtered, console)  # Display via formatter.py
Reference: cli.py:1-527

client.py — Twitter API Client

Purpose: Core GraphQL API client with anti-detection features, rate limiting, and pagination. Key responsibilities:
  • Make authenticated GraphQL GET/POST requests to Twitter’s API
  • TLS fingerprint impersonation using curl_cffi (Chrome 133)
  • Automatic query ID resolution (hardcoded fallback → GitHub → live extraction)
  • Parse GraphQL responses into domain models
  • Pagination and cursor management
  • Rate limit handling with exponential backoff
  • Generate x-client-transaction-id headers
  • Extract live feature flags from Twitter’s frontend
Key class: TwitterClient
Read operations:
  • fetch_home_timeline() — For You feed
  • fetch_following_feed() — Following feed (chronological)
  • fetch_bookmarks() — Bookmarked tweets
  • fetch_user() — User profile by screen name
  • fetch_user_tweets() — User’s tweets
  • fetch_user_likes() — Tweets liked by user
  • fetch_search() — Search tweets (Top/Latest/Photos/Videos)
  • fetch_tweet_detail() — Tweet + replies
  • fetch_list_timeline() — Twitter List timeline
  • fetch_followers() — User’s followers
  • fetch_following() — Users followed by user
Write operations:
  • create_tweet() — Post a new tweet or reply
  • delete_tweet() — Delete a tweet
  • like_tweet() / unlike_tweet() — Like/unlike
  • retweet() / unretweet() — Retweet/undo retweet
  • bookmark_tweet() / unbookmark_tweet() — Bookmark/unbookmark
Anti-detection features:
  • TLS impersonation: Uses curl_cffi with Chrome 133 fingerprint
  • Full cookie forwarding: Sends ALL browser cookies, not just auth tokens
  • Transaction ID generation: Dynamic x-client-transaction-id headers
  • Live feature extraction: Parses feature flags from x.com homepage
  • Request jitter: Randomized delays between requests (0.7-1.5× base delay)
  • Write delays: 1.5-4s random delay after write operations
Query ID resolution strategy:
  1. Check in-memory cache
  2. Use hardcoded fallback query IDs (fast path)
  3. On 404, fetch from GitHub (twitter-openapi project)
  4. Scan x.com JavaScript bundles for live query IDs
  5. Automatic retry with refreshed ID
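A minimal sketch of this fallback chain (function and cache names here are hypothetical; the real logic lives in TwitterClient):

```python
# Hypothetical sketch of the query ID fallback chain described above;
# the actual implementation is part of TwitterClient in client.py.
_QUERY_ID_CACHE: dict[str, str] = {}            # 1. in-memory cache
_FALLBACK_QUERY_IDS = {"HomeTimeline": "AAA"}   # 2. hardcoded fast path (dummy value)

def resolve_query_id(operation, fetch_github=None, scan_bundles=None):
    """Return a GraphQL query ID for `operation`, trying each source in order."""
    if operation in _QUERY_ID_CACHE:
        return _QUERY_ID_CACHE[operation]
    qid = _FALLBACK_QUERY_IDS.get(operation)
    if qid is None and fetch_github is not None:
        qid = fetch_github(operation)            # 3. twitter-openapi on GitHub
    if qid is None and scan_bundles is not None:
        qid = scan_bundles(operation)            # 4. live x.com JS bundles
    if qid is None:
        raise LookupError(f"no query ID for {operation}")
    _QUERY_ID_CACHE[operation] = qid             # cache for subsequent calls
    return qid
```

On a 404 the client would evict the cached entry and retry with a freshly resolved ID (step 5).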
Example:
# client.py:291-307
class TwitterClient:
    def __init__(self, auth_token, ct0, rate_limit_config=None, cookie_string=None):
        self._auth_token = auth_token
        self._ct0 = ct0
        self._cookie_string = cookie_string  # Full browser cookies
        self._request_delay = 2.5  # Configurable via config.yaml
        self._max_retries = 3
        self._client_transaction = ClientTransaction(...)  # For transaction IDs

    def fetch_home_timeline(self, count=20):
        return self._fetch_timeline(
            "HomeTimeline",
            count,
            lambda data: _deep_get(data, "data", "home", "home_timeline_urt", "instructions"),
        )
Reference: client.py:1-1109

auth.py — Cookie Authentication

Purpose: Manage Twitter cookie authentication with browser extraction and caching. Key responsibilities:
  • Extract cookies from local browsers (Chrome, Edge, Firefox, Brave)
  • Load cookies from environment variables
  • Cache cookies with 24h TTL
  • Verify cookie validity before use
  • Handle macOS Keychain access (required for Chrome cookie decryption)
Authentication priority:
  1. TWITTER_AUTH_TOKEN + TWITTER_CT0 environment variables
  2. File cache (~/.cache/twitter-cli/cookies.json, 24h TTL)
  3. Browser extraction (auto-detect)
Cookie extraction strategy:
  • In-process first: Required on macOS for Keychain access
  • Subprocess fallback: Handles SQLite lock when browser is running
  • Full cookie extraction: Extracts ALL Twitter cookies for full browser fingerprint
Key functions:
  • get_cookies() — Main entry point, returns {"auth_token": ..., "ct0": ..., "cookie_string": ...}
  • load_from_env() — Load from environment variables
  • extract_from_browser() — Auto-detect and extract from browsers
  • verify_cookies() — Verify tokens are valid via API call
Cookie cache:
  • Location: ~/.cache/twitter-cli/cookies.json
  • TTL: 24 hours
  • Automatically invalidated on 401/403 errors
  • Permissions: 0o600 (owner read/write only)
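A sketch of how the cache's TTL and permission handling might look (function names are illustrative, not the actual auth.py API):

```python
import json
import os
import time
from pathlib import Path

CACHE_PATH = Path.home() / ".cache" / "twitter-cli" / "cookies.json"
CACHE_TTL = 24 * 3600  # 24 hours, in seconds

def save_cookie_cache(cookies, path=CACHE_PATH):
    """Write cookies with a timestamp, readable only by the owner."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"saved_at": time.time(), "cookies": cookies}))
    os.chmod(path, 0o600)  # owner read/write only

def load_cookie_cache(path=CACHE_PATH):
    """Return cached cookies, or None if missing or older than the TTL."""
    if not path.exists():
        return None
    data = json.loads(path.read_text())
    if time.time() - data["saved_at"] > CACHE_TTL:
        return None  # expired — caller falls through to browser extraction
    return data["cookies"]
```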
Example:
# auth.py:254-299
def get_cookies():
    # 1. Try environment variables
    cookies = load_from_env()
    if cookies:
        return cookies

    # 2. Try cache (24h TTL)
    cookies = _load_cookie_cache()
    if cookies:
        return cookies

    # 3. Extract from browser
    cookies = extract_from_browser()
    if cookies:
        _save_cookie_cache(cookies)
        return cookies

    raise RuntimeError("No Twitter cookies found...")
Reference: auth.py:1-352

config.py — Configuration Management

Purpose: Load and validate configuration from YAML files with sensible defaults. Key responsibilities:
  • Load config.yaml from current directory or project root
  • Deep-merge user config with defaults
  • Validate and normalize configuration values
  • Provide type-safe config access
Configuration structure:
fetch:
  count: 50                  # Default fetch count

filter:
  mode: "topN"               # "topN" | "score" | "all"
  topN: 20
  minScore: 50
  lang: []                   # Language filter (empty = all)
  excludeRetweets: false
  weights:
    likes: 1.0
    retweets: 3.0
    replies: 2.0
    bookmarks: 5.0
    views_log: 0.5

rateLimit:
  requestDelay: 2.5          # Delay between requests (seconds)
  maxRetries: 3              # Retry count on rate limit
  retryBaseDelay: 5.0        # Base delay for exponential backoff
  maxCount: 200              # Hard cap on fetched items
Key functions:
  • load_config() — Main entry point, returns merged config dict
  • _resolve_config_path() — Find config file in standard locations
  • _deep_merge() — Recursively merge user config into defaults
  • _normalize_config() — Validate types and bounds
Config file search path:
  1. ./config.yaml (current working directory)
  2. <project-root>/config.yaml (parent of twitter_cli package)
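The deep-merge step can be sketched as follows (a generic recursive merge; the real _deep_merge in config.py may differ in detail):

```python
# Sketch of recursive config merging: user values override defaults,
# but nested dicts are merged key-by-key instead of replaced wholesale.
def deep_merge(defaults, overrides):
    """Recursively merge `overrides` into `defaults`, returning a new dict."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # recurse into sub-dicts
        else:
            merged[key] = value  # scalar or new key: override wins
    return merged
```

For example, a config.yaml containing only `filter: {topN: 5}` keeps every other filter default intact.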
Reference: config.py:1-167

filter.py — Tweet Scoring & Filtering

Purpose: Rank tweets by engagement metrics and apply user-defined filters. Key responsibilities:
  • Calculate engagement scores using weighted formula
  • Filter tweets by language, retweet status
  • Apply filtering modes: topN, score threshold, or all
  • Sort tweets by score in descending order
Scoring formula:
score = (likes × w_likes)
      + (retweets × w_retweets)
      + (replies × w_replies)
      + (bookmarks × w_bookmarks)
      + (log10(views) × w_views_log)
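A direct transcription of this formula (a sketch operating on a plain metrics dict rather than the Metrics dataclass):

```python
import math

# Sketch of the weighted scoring formula above; the real score_tweet
# in filter.py operates on Tweet/Metrics dataclasses.
def score(metrics, weights):
    views_term = math.log10(metrics["views"]) if metrics["views"] > 0 else 0.0
    return (metrics["likes"]     * weights.get("likes", 1.0)
          + metrics["retweets"]  * weights.get("retweets", 3.0)
          + metrics["replies"]   * weights.get("replies", 2.0)
          + metrics["bookmarks"] * weights.get("bookmarks", 5.0)
          + views_term           * weights.get("views_log", 0.5))
```

The log10 term keeps view counts (often orders of magnitude larger than likes) from dominating the score.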
Filter modes:
  • topN: Keep top N tweets by score (default: 20)
  • score: Keep tweets where score ≥ minScore (default: 50)
  • all: Return all tweets, sorted by score
Key functions:
  • score_tweet() — Calculate score for a single tweet
  • filter_tweets() — Apply all filters and return sorted list
Example:
# filter.py:48-87
from dataclasses import replace

def filter_tweets(tweets, config):
    mode = config.get("mode", "topN")
    weights = config.get("weights", {})

    # 1. Filter by language (empty list = all languages)
    filtered = tweets
    lang_set = set(config.get("lang") or [])
    if lang_set:
        filtered = [t for t in filtered if t.lang in lang_set]

    # 2. Exclude retweets
    if config.get("excludeRetweets"):
        filtered = [t for t in filtered if not t.is_retweet]

    # 3. Score all tweets
    scored = [replace(t, score=score_tweet(t, weights)) for t in filtered]

    # 4. Sort by score descending
    scored.sort(key=lambda t: t.score, reverse=True)

    # 5. Apply mode
    if mode == "topN":
        return scored[: config.get("topN", 20)]
    if mode == "score":
        return [t for t in scored if t.score >= config.get("minScore", 50)]
    return scored
Reference: filter.py:1-97

formatter.py — Terminal Output

Purpose: Format tweets and user profiles for terminal display using Rich library. Key responsibilities:
  • Render tweets as rich tables with emoji indicators
  • Display tweet details in panels
  • Format user profiles
  • Display filter statistics
  • Number formatting (K/M suffixes)
Key functions:
  • print_tweet_table() — Display tweets in a table format
  • print_tweet_detail() — Display single tweet with full details in a panel
  • print_user_profile() — Display user profile in a panel
  • print_user_table() — Display user list in a table
  • print_filter_stats() — Show before/after filter counts
  • format_number() — Convert numbers to readable format (1.2K, 3.5M)
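A plausible implementation of format_number's K/M suffixing (a sketch; the actual formatter.py version may round or truncate differently):

```python
# Sketch of compact number formatting as used in the stats column:
# 1234 -> "1.2K", 3500000 -> "3.5M", small numbers pass through unchanged.
def format_number(n: int) -> str:
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.1f}K"
    return str(n)
```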
Output features:
  • Verified badge (✓) for verified users
  • Retweet indicator (🔄)
  • Media type icons (📷 📹 🎞️)
  • Quoted tweet preview
  • Engagement metrics with emoji
  • Tweet links
Example table output:
┌──────────────────────────────────────────────────────────────┐
│ 📱 Twitter — 20 tweets                                       │
├───┬──────────┬─────────────────────────┬──────────┬────────┤
│ # │ Author   │ Tweet                   │ Stats    │ Score  │
├───┼──────────┼─────────────────────────┼──────────┼────────┤
│ 1 │ @user ✓  │ Tweet text...           │ ❤️ 1.2K  │  45.3  │
│   │          │ 🔗 x.com/user/status/id │ 🔄 234   │        │
│   │          │                         │ 💬 56    │        │
│   │          │                         │ 👁️ 45.2K │        │
└───┴──────────┴─────────────────────────┴──────────┴────────┘
Reference: formatter.py:1-239

models.py — Data Models

Purpose: Define core data structures as simple Python dataclasses. Key models:
Tweet — Represents a single tweet
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Tweet:
    id: str
    text: str
    author: Author
    metrics: Metrics
    created_at: str
    media: List[TweetMedia] = field(default_factory=list)
    urls: List[str] = field(default_factory=list)
    is_retweet: bool = False
    lang: str = ""
    retweeted_by: Optional[str] = None  # Screen name if retweet
    quoted_tweet: Optional["Tweet"] = None  # String annotation: forward reference
    score: Optional[float] = None  # Populated by filter.py
Author — Tweet author information
@dataclass
class Author:
    id: str
    name: str
    screen_name: str
    profile_image_url: str = ""
    verified: bool = False
Metrics — Engagement metrics
@dataclass
class Metrics:
    likes: int = 0
    retweets: int = 0
    replies: int = 0
    quotes: int = 0
    views: int = 0
    bookmarks: int = 0
TweetMedia — Media attachments
@dataclass
class TweetMedia:
    type: str  # "photo" | "video" | "animated_gif"
    url: str
    width: Optional[int] = None
    height: Optional[int] = None
UserProfile — User account information
@dataclass
class UserProfile:
    id: str
    name: str
    screen_name: str
    bio: str = ""
    location: str = ""
    url: str = ""
    followers_count: int = 0
    following_count: int = 0
    tweets_count: int = 0
    likes_count: int = 0
    verified: bool = False
    profile_image_url: str = ""
    created_at: str = ""
Reference: models.py:1-70

serialization.py — JSON Serialization

Purpose: Convert domain models to/from JSON for import/export functionality. Key responsibilities:
  • Serialize tweets to JSON (for --json and --output flags)
  • Deserialize tweets from JSON (for --input flag)
  • Serialize user profiles to JSON
  • Handle nested objects (Author, Metrics, quoted tweets)
Key functions:
  • tweet_to_dict() — Convert Tweet → dict
  • tweet_from_dict() — Convert dict → Tweet
  • tweets_to_json() — Convert List[Tweet] → JSON string
  • tweets_from_json() — Convert JSON string → List[Tweet]
  • user_profile_to_dict() — Convert UserProfile → dict
  • users_to_json() — Convert List[UserProfile] → JSON string
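The snake_case-to-camelCase key mapping visible in the schema that follows can be sketched as (helper names here are hypothetical, not the actual serialization.py API):

```python
# Hypothetical sketch of the key-renaming step inside tweet_to_dict:
# Python attributes use snake_case, the exported JSON uses camelCase.
def to_camel(name: str) -> str:
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)

def fields_to_json_keys(fields: dict) -> dict:
    """Rename a flat dict of snake_case fields to camelCase JSON keys."""
    return {to_camel(k): v for k, v in fields.items()}
```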
JSON schema example:
{
  "id": "1234567890",
  "text": "Tweet content",
  "author": {
    "id": "123456",
    "name": "Display Name",
    "screenName": "username",
    "verified": true
  },
  "metrics": {
    "likes": 1234,
    "retweets": 567,
    "replies": 89,
    "views": 12345
  },
  "createdAt": "2024-01-01T12:00:00Z",
  "media": [{"type": "photo", "url": "..."}],
  "urls": ["https://..."],
  "isRetweet": false,
  "score": 45.3
}
Reference: serialization.py:1-176

Key Design Patterns

1. Separation of Concerns

Each module has a single, well-defined responsibility:
  • cli.py: User interface
  • client.py: API communication
  • auth.py: Authentication
  • filter.py: Business logic
  • formatter.py: Presentation

2. Configuration-Driven Behavior

All runtime behavior is configurable via:
  • config.yaml for persistent settings
  • Environment variables for credentials and proxy
  • Command-line flags for per-run overrides

3. Anti-Detection Layering

Multiple techniques work together:
  • TLS fingerprinting (curl_cffi)
  • Full cookie forwarding
  • Dynamic header generation
  • Request timing randomization
  • Live feature flag extraction

4. Graceful Degradation

  • Query ID fallback chain (hardcoded → GitHub → live)
  • Multiple cookie extraction strategies (in-process → subprocess)
  • Automatic retry on rate limits
  • Silent feature flag extraction failures
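The retry behavior can be sketched as a generic pattern using the rateLimit defaults (the real client.py implementation reacts to specific HTTP 429 responses; RuntimeError stands in here):

```python
import random
import time

# Sketch of retry with exponential backoff and jitter, as configured by
# rateLimit.retryBaseDelay / rateLimit.maxRetries in config.yaml.
def with_retries(request, max_retries=3, base_delay=5.0, sleep=time.sleep):
    """Call `request` until it succeeds or retries are exhausted."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RuntimeError:                  # stand-in for a rate-limit error
            if attempt == max_retries:
                raise                         # out of retries: propagate
            delay = base_delay * (2 ** attempt)          # 5s, 10s, 20s, ...
            sleep(delay * random.uniform(0.7, 1.5))      # jittered backoff
```

Injecting `sleep` keeps the sketch testable without real delays.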

5. Data Flow

User Command (cli.py)

Load Config (config.py)

Authenticate (auth.py)

Fetch Data (client.py)

Filter/Score (filter.py)

Format Output (formatter.py)

Display to User

Testing & Development

The codebase prioritizes rapid iteration over exhaustive test coverage, so the commands below are the quickest way to exercise it during development. Debugging tips:
# Enable verbose logging
twitter --verbose feed

# Test with small datasets
twitter feed --max 5

# Use JSON for inspection
twitter feed --max 5 --json | python3 -m json.tool

# Test filters offline
twitter feed --input tweets.json --filter
Development workflow:
# Install dev dependencies
uv sync --extra dev

# Lint
uv run ruff check .

# Run tests
uv run pytest -q

Extension Points

To add new functionality:
  1. New CLI command: Add to cli.py with @cli.command() decorator
  2. New API endpoint: Add method to TwitterClient class in client.py
  3. New filter mode: Extend filter_tweets() in filter.py
  4. New output format: Add formatter function in formatter.py
Example: Adding a new command
# In cli.py
@cli.command()
@click.argument("topic")
def trending(topic):
    """Fetch trending tweets for a topic."""
    config = load_config()
    client = _get_client(config)
    tweets = client.fetch_trending(topic)  # New client method
    print_tweet_table(tweets, console)

Dependencies

Key external libraries:
  • click: CLI framework
  • rich: Terminal formatting
  • curl_cffi: TLS fingerprint impersonation
  • browser-cookie3: Cookie extraction from browsers
  • x-client-transaction: Transaction ID generation
  • pyyaml: Configuration parsing
  • beautifulsoup4: HTML parsing for header extraction

Performance Considerations

  1. Startup time: 3-5s due to x.com initialization (required for anti-detection)
  2. Pagination: Automatic with jittered delays to avoid rate limits
  3. Cookie caching: 24h TTL reduces browser access overhead
  4. Query ID caching: In-memory cache reduces bundle scanning

Security Notes

  1. Cookies never leave the local machine — no external uploads
  2. Cookie cache permissions: 0o600 (owner-only)
  3. Cookie cache location: ~/.cache/twitter-cli/cookies.json
  4. Proxy support: All requests route through TWITTER_PROXY if set
  5. No API keys required — uses your browser’s existing session

References

For detailed implementation:
  • CLI commands: cli.py:116-526
  • API client core: client.py:291-826
  • Authentication flow: auth.py:254-300
  • Filter algorithm: filter.py:23-87
  • Output formatting: formatter.py:21-238
