Data Ingestion Best Practices

This guide covers best practices for implementing scalable, maintainable tracking in Mixpanel. Follow these guidelines to ensure high-quality data and avoid common pitfalls.

Create a Tracking Plan

A tracking plan is a centralized document that defines what data you’re collecting and why. It should:
  • Define your business goals and KPIs
  • Outline events, event properties, and user profile properties
  • Serve as the source of truth for your implementation
  • Be continuously updated as your product evolves
  • Be shared across product, marketing, and engineering teams

Tracking Plan Template

Download our tracking plan template to get started

Tracking Plan Methodology

  1. Define KPIs
     Start with your top KPIs and metrics that measure success. Don’t try to track everything - prioritize what matters most.
  2. Map User Flows
     Map each KPI to the user actions that influence it. Consider different paths users take to achieve outcomes.
  3. Translate to Events
     Break down user flows into specific events and properties. Each event should represent a meaningful user action.
  4. Iterate
     Start with your most critical data and iterate. Tracking everything leads to wasted effort and unused data.
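
One way to keep a tracking plan checkable is to mirror part of it in code. The sketch below is an illustration only: the `TRACKING_PLAN` structure, the events listed in it, and the `validate_event` helper are hypothetical and are not part of any Mixpanel SDK.

```python
# Hypothetical tracking plan: event name -> required properties and their types.
TRACKING_PLAN = {
    "Video Played": {"video_title": str, "video_duration": (int, float)},
    "Purchase Completed": {"product_name": str, "price": (int, float), "is_trial": bool},
}


def validate_event(event_name, properties, plan=TRACKING_PLAN):
    """Return a list of problems; an empty list means the event matches the plan."""
    if event_name not in plan:
        return [f"unknown event: {event_name}"]
    problems = []
    for name, expected_type in plan[event_name].items():
        if name not in properties:
            problems.append(f"missing property: {name}")
        elif isinstance(properties[name], bool) and expected_type is not bool:
            # bool is a subclass of int in Python, so reject it for numeric fields
            problems.append(f"wrong type for {name}")
        elif not isinstance(properties[name], expected_type):
            problems.append(f"wrong type for {name}")
    return problems
```

A check like this could run in unit tests or in a thin wrapper around your tracking call, so that events drifting from the plan are caught before they reach production.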

Industry-Specific Templates

Mixpanel provides tracking plan templates for different industries.

Server-Side Best Practices

Server-side tracking is generally more reliable than client-side tracking because it is not affected by ad blockers or browser privacy restrictions. Follow these practices to get the most value from it:

Track Browser, Device, and OS

Mixpanel’s web and mobile SDKs automatically parse the User-Agent header. For server-side tracking, you need to do this manually:
from mixpanel import Mixpanel
from ua_parser import user_agent_parser

mp = Mixpanel("YOUR_TOKEN")

def track_to_mp(request, event_name, properties):
    # Parse the User-Agent header (empty string if the header is missing)
    parsed = user_agent_parser.Parse(request.headers.get("User-Agent", ""))
    
    # Add browser, device, and OS properties
    properties.update({
        "$browser": parsed["user_agent"]["family"],
        "$device": parsed["device"]["family"],
        "$os": parsed["os"]["family"],
    })
    
    # Set client IP for geolocation
    properties["ip"] = request.remote_addr
    
    # request.user_id is assumed to be set by your own authentication layer
    mp.track(request.user_id, event_name, properties)

Track UTM Parameters and Referrer

Capture marketing attribution data by parsing URL parameters and headers:
from urllib.parse import urlparse
from mixpanel import Mixpanel

mp = Mixpanel("YOUR_TOKEN")

def track_to_mp(request, event_name, properties):
    # Capture referrer (the HTTP header name is spelled "Referer")
    referrer = request.headers.get("Referer")
    if referrer:
        properties.update({
            "$referrer": referrer,
            "$referring_domain": urlparse(referrer).hostname
        })
    
    # Capture UTM parameters
    utm_keys = ["utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term"]
    utm_values = {key: request.args[key] for key in utm_keys if request.args.get(key)}
    properties.update(utm_values)
    
    # Set client IP
    properties["ip"] = request.remote_addr
    
    mp.track(request.user_id, event_name, properties)

Track Page Views Consistently

For server-side implementations:
  • Use a single event name for all page views (e.g., “Page Viewed”)
  • Track page name as a property, not as different events
  • Fire page view events only on successful responses
  • Handle both anonymous and identified users
  • Parse headers and URL for analytics properties
from mixpanel import Mixpanel
from ua_parser import user_agent_parser

mp = Mixpanel("YOUR_TOKEN")

def track_page_view(request):
    properties = {
        "page_name": request.path,
        "page_url": request.url,
        "$referrer": request.headers.get("Referer"),  # HTTP spells this header "Referer"
        "ip": request.remote_addr
    }
    
    # Add parsed user agent (empty string if the header is missing)
    parsed = user_agent_parser.Parse(request.headers.get("User-Agent", ""))
    properties.update({
        "$browser": parsed["user_agent"]["family"],
        "$os": parsed["os"]["family"]
    })
    
    mp.track(request.user_id, "Page Viewed", properties)

Handle Geolocation

By default, Mixpanel geolocates events using the IP address of the incoming request. With server-side tracking that is your server’s IP, so pass the client’s IP explicitly:
# Pass client IP for accurate geolocation
mp.track('user_123', 'Button Clicked', {
    'button_name': 'Sign Up',
    'ip': request.remote_addr  # Client's IP, not server's
})

# Or skip geolocation entirely
mp.track('user_123', 'Button Clicked', {
    'button_name': 'Sign Up',
    'ip': 0  # Skips geolocation
})
Read our full guide on managing geolocation.
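
If your application runs behind a load balancer or reverse proxy, the socket address is often the proxy rather than the client. The sketch below shows one common way to recover the client IP from the `X-Forwarded-For` header. The `client_ip` helper is hypothetical and takes a plain dict of headers, and the header should only be trusted when requests arrive through a proxy you control, because clients can set it themselves.

```python
def client_ip(headers, remote_addr):
    """Best-effort client IP: first X-Forwarded-For entry, else the socket address."""
    forwarded = headers.get("X-Forwarded-For", "")
    # The header is a comma-separated chain; the left-most entry is the original client.
    first = forwarded.split(",")[0].strip()
    return first or remote_addr
```

The result can then be passed as the `ip` property in place of `request.remote_addr`.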

Identity Management

Server-Side Identity Management

Server-side SDKs don’t generate IDs automatically. You’re responsible for:
  • Generating unique IDs for users
  • Maintaining ID persistence across requests
  • Linking anonymous users to identified users
Read our Server-side ID Management guide.
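
As a rough illustration of those responsibilities, the sketch below generates an anonymous ID on first visit and picks which ID to send with an event. The cookie name and both helper functions are hypothetical, and `cookies` stands in for whatever dict-like cookie mapping your web framework provides.

```python
import uuid

DEVICE_COOKIE = "device_id"  # hypothetical cookie name


def get_or_create_device_id(cookies):
    """Return (device_id, is_new) for a dict-like mapping of request cookies."""
    existing = cookies.get(DEVICE_COOKIE)
    if existing:
        return existing, False
    # uuid4 is random, so the ID carries no information about the user
    return str(uuid.uuid4()), True


def resolve_distinct_id(user_id, device_id):
    """Use the stable user ID once the user is known, else the anonymous device ID."""
    return user_id if user_id else device_id
```

When `is_new` is true, the caller would set the cookie on the response so the same ID comes back on later requests.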

Best Practices for IDs

  • Choose a format and stick with it (e.g., database IDs, UUIDs)
  • Never change a user’s distinct_id after it’s set
  • Use the same ID across all platforms (web, mobile, server)
For anonymous users:
  • Generate a unique device_id on first visit
  • Store it in a cookie or local storage
  • Use it as distinct_id until user identifies
  • Call $identify to merge when user signs up
// Before signup - anonymous
mixpanel.track('Page Viewed');

// User signs up
mixpanel.identify('user_123');

// After signup - all events linked to user_123
mixpanel.track('Profile Updated');
When a user signs up or logs in:
# Option 1: Simplified ID Merge (recommended)
mp.track('user_123', '$identify', {
    '$anon_id': 'device_xyz'
})

# This merges all events from device_xyz into user_123
Only merge once per user to avoid data issues.

Event Design Best Practices

Naming Conventions

Use Object + Action format: “Video Played”, “Purchase Completed”, “Profile Updated”. This makes events easier to read and organize.
Good Examples:
  • “Video Played”
  • “Purchase Completed”
  • “Page Viewed”
  • “Form Submitted”
Bad Examples:
  • “play_video” (inconsistent format)
  • “user clicked the signup button” (too verbose)
  • “event_123” (meaningless)
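
A convention like this can be partly enforced in code. The sketch below is a hypothetical helper, not a Mixpanel feature: it checks only the shape of the name (two to four capitalized words), not whether the first word is really an object and the last an action.

```python
import re

# Two to four capitalized words separated by single spaces, e.g. "Video Played".
EVENT_NAME_PATTERN = re.compile(r"[A-Z][A-Za-z0-9]*( [A-Z][A-Za-z0-9]*){1,3}")


def is_valid_event_name(name):
    """Return True when the whole name matches the Title Case word pattern."""
    return EVENT_NAME_PATTERN.fullmatch(name) is not None
```

With this pattern the good examples above pass and all three bad examples are rejected.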

Property Best Practices

  • Choose a naming convention (snake_case or camelCase) and stick with it
  • Use the same property name across all events when referring to the same thing
  • Document your naming conventions in your tracking plan
// Good - consistent naming
mixpanel.track('Video Played', {
    video_title: 'Getting Started',
    video_duration: 120,
    video_category: 'Tutorial'
});

// Bad - inconsistent naming
mixpanel.track('Video Played', {
    title: 'Getting Started',
    videoDuration: 120,
    Category: 'Tutorial'
});
Use the appropriate data type for each property:
  • Strings: Names, categories, IDs
  • Numbers: Counts, prices, durations
  • Booleans: Yes/no flags
  • Dates: ISO 8601 format or Unix timestamps
# Good - correct data types
mp.track('user_123', 'Purchase Completed', {
    'product_name': 'Premium Plan',      # String
    'price': 29.99,                       # Number
    'is_trial': False,                    # Boolean
    'timestamp': 1698023982               # Unix timestamp
})
Use consistent values to make filtering and segmentation easier:
// Good - consistent values
mixpanel.track('Button Clicked', {
    button_location: 'header'  // Always lowercase
});

// Bad - inconsistent values
// Sometimes 'Header', sometimes 'header', sometimes 'HEADER'
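
On the server side, one way to keep values consistent is to normalize categorical properties just before sending. The sketch below is a minimal illustration. Both helpers are hypothetical, and which keys to normalize is a decision for your tracking plan.

```python
def normalize_value(value):
    """Trim and lowercase strings; leave numbers, booleans, and other types untouched."""
    if isinstance(value, str):
        return value.strip().lower()
    return value


def normalize_properties(properties, keys):
    """Return a copy of `properties` with the listed categorical keys normalized."""
    return {k: normalize_value(v) if k in keys else v for k, v in properties.items()}
```

Free-text values such as titles are deliberately left alone by passing only categorical keys in `keys`.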
Don’t send personally identifiable information:
  • ❌ Full names, addresses, phone numbers
  • ❌ Credit card numbers, SSNs
  • ❌ Passwords or tokens
  • ✅ User IDs, anonymized identifiers
  • ✅ Email (if properly hashed)
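
What counts as "properly hashed" depends on your privacy requirements. As one possible approach, the sketch below normalizes the address and applies a keyed SHA-256 hash (HMAC) using Python's standard library. The `hash_email` helper and the secret key are assumptions for illustration, and the key would need to be stored server-side like any other secret.

```python
import hashlib
import hmac


def hash_email(email, secret_key):
    """Return a keyed SHA-256 digest of the normalized email as a hex string."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
```

Normalizing first means the same address always maps to the same value regardless of casing or stray whitespace.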

Debugging Your Implementation

Create a Test Project

Always create a separate development project to validate your implementation without contaminating production data.

Use Events View

The Events view shows a live feed of incoming events:
  1. Fire test events from your own device
  2. Search by your distinct_id or device_id
  3. Expand events to inspect all properties
  4. Verify event and property names are correct
  5. Check that property values have correct data types

Enable Debug Mode

Most client-side SDKs support debug mode:
// JavaScript
mixpanel.init('YOUR_TOKEN', {
    debug: true
});
// Swift
Mixpanel.initialize(token: "YOUR_TOKEN", trackAutomaticEvents: true)
Mixpanel.mainInstance().loggingEnabled = true
// Android (Kotlin)
val mixpanel = MixpanelAPI.getInstance(context, "YOUR_TOKEN", true)
mixpanel.setEnableLogging(true)

Check Browser Console (Web)

For web implementations:
  1. Enable debug mode
  2. Open browser developer console
  3. Go to Network > Fetch/XHR tab
  4. Perform actions that trigger events
  5. Look for requests to Mixpanel API:
    • US: api.mixpanel.com/track
    • EU: api-eu.mixpanel.com/track
    • India: api-in.mixpanel.com/track
  6. Verify the token and payload are correct
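
For server-side code, keeping the residency hosts in one place helps avoid sending data to the wrong endpoint. The sketch below simply encodes the three hosts listed above. The region codes and the `track_url` helper are hypothetical.

```python
# Ingestion hosts listed above, keyed by a hypothetical region code.
API_HOSTS = {
    "us": "api.mixpanel.com",
    "eu": "api-eu.mixpanel.com",
    "in": "api-in.mixpanel.com",
}


def track_url(region):
    """Build the /track URL for a region; raises KeyError for an unknown region."""
    return f"https://{API_HOSTS[region.lower()]}/track"
```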

Common Issues

Events not showing up
Possible causes:
  • Wrong project token
  • Wrong API endpoint for data residency
  • Events older than 5 days sent to /track (use /import instead)
  • Ad-blockers (for client-side tracking)
  • Events are hidden in Lexicon
Solutions:
  • Verify project token in Project Settings
  • Check that you’re using the correct API endpoint
  • For old events, use the /import endpoint
  • Check Lexicon for hidden events
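
The 5-day rule can be applied automatically when replaying or backfilling events. The sketch below is a minimal illustration with a hypothetical `endpoint_for` helper. It only picks the path, and the two endpoints may differ in authentication and payload requirements, so check the API reference before routing real traffic.

```python
import time

MAX_TRACK_AGE_SECONDS = 5 * 24 * 60 * 60  # 5 days, per the limit noted above


def endpoint_for(event_time, now=None):
    """Return "/track" for recent events and "/import" for events older than 5 days."""
    now = time.time() if now is None else now
    return "/import" if now - event_time > MAX_TRACK_AGE_SECONDS else "/track"
```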
Incorrect geolocation
Cause: Server-side tracking uses the server’s IP by default
Solution: Pass the client’s IP:
mp.track('user_123', 'Event', {
    'ip': request.remote_addr  # Client IP
})
Duplicate events
Cause: Events sent multiple times or from multiple sources
Solution: Use $insert_id to deduplicate:
mixpanel.track('Purchase', {
    $insert_id: 'purchase_abc_123',  // Unique ID
    amount: 99.99
});
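
For deduplication to work across retries, the same logical event has to carry the same $insert_id each time. One way to get that is to derive the ID from fields that identify the occurrence instead of generating a random value per send. The sketch below assumes a hypothetical `make_insert_id` helper. Truncating to 32 hex characters is a choice made here to keep the ID short, so confirm the allowed length and character set in the API reference.

```python
import hashlib


def make_insert_id(distinct_id, event_name, event_time, unique_key=""):
    """Derive a stable 32-character hex ID from fields that identify one occurrence."""
    raw = "|".join([str(distinct_id), event_name, str(event_time), str(unique_key)])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]
```

Passing something like an order ID as `unique_key` keeps two distinct purchases in the same second from colliding.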
Delayed events on mobile
Cause: Mobile SDKs batch events and flush periodically
Solution:
  • iOS: Flushes every 60 seconds or when app backgrounds
  • Android: Flushes every 60 seconds or after 40 events
  • Call flush() manually for important events
// Swift
Mixpanel.mainInstance().track(event: "Sign Up")
Mixpanel.mainInstance().flush()  // Send immediately

Performance & Reliability

Use Batching

Batch multiple events in a single request:
import requests

events = [
    {"event": "Event 1", "properties": {...}},
    {"event": "Event 2", "properties": {...}},
    # Up to 2000 events
]

requests.post(
    "https://api.mixpanel.com/track",
    json=events
)
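
When there are more events than fit in one request, they need to be split into batches first. The sketch below is a small, hypothetical helper for that step. The default size follows the limit mentioned in the comment above and should be adjusted to whatever the endpoint you use accepts.

```python
def chunk_events(events, batch_size=2000):
    """Yield successive lists of at most `batch_size` events."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]
```

Each yielded batch can then be posted in its own request, as in the example above.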

Implement Retry Logic

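Retry rate-limited requests and transient server errors with exponential backoff:
import time
import requests

def send_with_retry(url, payload, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(url, json=payload, timeout=10)
            if response.status_code == 200:
                return response
            # Only 429 (rate limited) and 5xx responses are worth retrying;
            # return other errors so the caller can inspect them
            if response.status_code != 429 and response.status_code < 500:
                return response
        except requests.exceptions.RequestException:
            if attempt == max_retries - 1:
                raise
        # Exponential backoff before the next attempt: 1s, 2s, 4s, ...
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)
    return None  # Retries exhausted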

Queue Events

For high-volume applications, queue events and send asynchronously:
import logging
import queue
import threading
from mixpanel import Mixpanel

class MixpanelQueue:
    def __init__(self, token):
        self.mp = Mixpanel(token)
        self.queue = queue.Queue()
        # Daemon thread so the worker does not block interpreter shutdown;
        # call self.queue.join() before exiting to drain pending events
        self.worker = threading.Thread(target=self._process_queue, daemon=True)
        self.worker.start()
    
    def track(self, distinct_id, event, properties):
        self.queue.put((distinct_id, event, properties))
    
    def _process_queue(self):
        while True:
            distinct_id, event, properties = self.queue.get()
            try:
                self.mp.track(distinct_id, event, properties)
            except Exception:
                # Log the failure; add retry logic here if needed
                logging.exception("Failed to send event %r", event)
            finally:
                self.queue.task_done()

Security Best Practices

Never expose API secrets in client-side code. Your project token is safe to use in client-side code, but your API secret should only be used server-side.
  • ✅ Use project token in web/mobile apps
  • ✅ Use API secret only on servers
  • ❌ Don’t commit API secrets to version control
  • ❌ Don’t send sensitive data (PII, passwords, etc.)

Data Quality Checklist

Before launching:
  • Created a tracking plan
  • Tested in a development project
  • Verified events appear in Events view
  • Checked event and property names follow conventions
  • Confirmed property data types are correct
  • Tested identity merge for signup/login flows
  • Verified geolocation is accurate
  • Set up separate dev/staging/production projects
  • Documented implementation for team
  • No PII or sensitive data in events

Next Steps

ID Management

Learn advanced identity management strategies

Debugging Guide

Detailed debugging and troubleshooting guide
