The known_badAgents() function tests your network's ability to detect and block HTTP requests that carry known malicious user agent strings. These user agents are associated with spam bots, web scrapers, vulnerability scanners, and other automated malicious activity.

Data Source

This module uses a comprehensive list of bad user agents:
  • Bad User Agents List - https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/_generator_lists/bad-user-agents.list
This list is maintained by the nginx-ultimate-bad-bot-blocker project and contains user agent strings from known malicious bots, scrapers, spammers, and attack tools.

How It Works

1. Download Bad User Agents List

Downloads the latest list of known malicious user agent strings from the GitHub repository.
import requests

list_url = 'https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/_generator_lists/bad-user-agents.list'
response = requests.get(list_url)
if response.status_code == 200:
    file_name = list_url.split("/")[-1]  # e.g. 'bad-user-agents.list'
    with open(file_name, "w") as f:
        f.write(response.text)
2. Random Sampling

Randomly samples 15 user agent strings from the downloaded list (sampling is with replacement, so duplicates are possible).
import random

sampleAgent = []
for file in saved_files:
    with open(file, 'r') as f:
        lines = f.readlines()
    for _ in range(15):
        sampleAgent.append(random.choice(lines))
sampleAgent = [x.strip() for x in sampleAgent]  # drop trailing newlines
3. HTTP Requests with Bad User Agents

Sends an HTTPS request to Google for each sampled string, set as the User-Agent header.
import requests

url = 'https://google.com'
for agent in sampleAgent:
    headers = {'User-Agent': agent}
    response = requests.get(url, headers=headers)
4. Results Logging

Logs all requests with timestamps to Agent_Results.txt.
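The logging step can be sketched on its own as follows; the sample agents are illustrative, and the line format mirrors the Output Format section below.

```python
import time

# Minimal standalone sketch of the logging step. The file name matches the
# tool's (Agent_Results.txt); the sample agents are illustrative.
sampleAgent = ["sqlmap/1.0-dev (http://sqlmap.org)", "masscan/1.0"]

with open("Agent_Results.txt", mode="a+") as log:
    for agent in sampleAgent:
        current_time = time.strftime("%X")  # local time as HH:MM:SS
        log.write(f"Timestamp:{current_time} UserAgent:{agent} test DONE\n")
```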
5. Cleanup

Removes temporary downloaded files after testing completes.
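A minimal sketch of the cleanup step; a placeholder file is created first so the sketch is self-contained.

```python
import os

# In the tool, saved_files holds the downloaded list file name(s).
saved_files = ["bad-user-agents.list"]

# Create a placeholder so this sketch can run on its own.
for name in saved_files:
    with open(name, "w") as f:
        f.write("placeholder\n")

# Cleanup: delete each temporary file if it exists.
for file_name in saved_files:
    if os.path.exists(file_name):  # tolerate a download that never happened
        os.remove(file_name)
```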

Output Format

Results are saved to Agent_Results.txt with the following format:
Timestamp:14:55:10 UserAgent:Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MRA 4.3 (build 01218); .NET CLR 1.1.4322) test DONE
Timestamp:14:55:12 UserAgent:sqlmap/1.0-dev-nongit-20150902 (http://sqlmap.org) test DONE
Timestamp:14:55:14 UserAgent:masscan/1.0 (https://github.com/robertdavidgraham/masscan) test DONE
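Result lines in this format can be parsed back into (timestamp, user agent) pairs, which is useful when correlating with WAF or server logs. parse_result below is a hypothetical helper, not part of Somnium; it reads the field label from the line itself, so either label spelling works.

```python
def parse_result(line: str):
    """Hypothetical helper: split one Agent_Results.txt line of the form
    'Timestamp:HH:MM:SS <label>:<agent string> test DONE' into
    (timestamp, agent)."""
    prefix = "Timestamp:"
    ts = line[len(prefix):len(prefix) + 8]   # fixed-width HH:MM:SS
    rest = line[len(prefix) + 8:].strip()
    _label, _, agent = rest.partition(":")   # drop the field label
    if agent.endswith(" test DONE"):
        agent = agent[:-len(" test DONE")]
    return ts, agent

ts, agent = parse_result(
    "Timestamp:14:55:12 URL:sqlmap/1.0-dev-nongit-20150902 (http://sqlmap.org) test DONE"
)
```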
import os
import random
import time

import requests
from tqdm import tqdm


def known_badAgents():
    print("Simulating traffic using known bad User-Agents (spam, botnet, etc.)")
    list_url = 'https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/_generator_lists/bad-user-agents.list'
    saved_files = []
    # Step 1: download the latest bad user agents list.
    response = requests.get(list_url)
    if response.status_code == 200:
        file_name = list_url.split("/")[-1]
        with open(file_name, "w") as f:
            f.write(response.text)
        saved_files.append(file_name)
    # Step 2: randomly sample 15 user agent strings.
    sampleAgent = []
    for file in saved_files:
        with open(file, 'r') as f:
            lines = f.readlines()
        for _ in tqdm(range(15), desc='Sampling user agents'):
            sampleAgent.append(random.choice(lines))
    sampleAgent = [x.strip() for x in sampleAgent]
    # Steps 3 and 4: send the requests and log each one with a timestamp.
    url = 'https://google.com'
    with open("Agent_Results.txt", mode="a+") as myFile:
        for agent in tqdm(sampleAgent, desc='Sending HTTPS requests to Google with known bad User-Agents'):
            headers = {'User-Agent': agent}
            requests.get(url, headers=headers)
            current_time = time.strftime("%X")
            myFile.write(f"Timestamp:{current_time} UserAgent:{agent} test DONE\n")
    # Step 5: remove the downloaded list file(s).
    for file_name in saved_files:
        os.remove(file_name)

What to Monitor

Web Application Firewalls

WAF should block or flag requests with known malicious user agent strings.

Web Server Logs

Review access logs for requests using suspicious or malicious user agents.
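As a rough sketch of such a review, the check below flags access-log lines whose user agent matches a small blocklist. The log lines and patterns are illustrative, not real data.

```python
# Illustrative blocklist and access-log lines.
blocklist = {"sqlmap", "masscan", "nikto"}

access_log = [
    '1.2.3.4 - - [01/Jan/2025:14:55:12 +0000] "GET / HTTP/1.1" 200 512 "-" "sqlmap/1.0-dev"',
    '5.6.7.8 - - [01/Jan/2025:14:55:14 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

# Flag any line containing a known bad pattern (case-insensitive).
flagged = [
    line for line in access_log
    if any(pattern in line.lower() for pattern in blocklist)
]
```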

API Gateways

API security should validate user agents and block known malicious ones.

SIEM Correlation

Security monitoring should correlate bad user agents with other attack indicators.

Types of Malicious User Agents

Vulnerability Scanners

Automated tools that scan for security vulnerabilities:
  • sqlmap - SQL injection testing tool
  • Nikto - Web server scanner
  • Acunetix - Commercial vulnerability scanner
  • masscan - Port scanner
  • ZmEu - Vulnerability scanner
These tools are often used by attackers to find exploitable weaknesses.

Content Scrapers

Bots that steal website content:
  • Email harvesting bots
  • Copyright infringement scrapers
  • Competitor intelligence bots
  • SEO manipulation tools
These bots may violate terms of service and intellectual property rights.

Spam Bots

Automated tools for spreading spam:
  • Comment spam bots
  • Form submission bots
  • Registration bots
  • Referrer spam bots
Spam bots degrade user experience and consume resources.

Malware Agents

User agents associated with malicious software:
  • Botnet command & control
  • Backdoor communication
  • Data exfiltration tools
  • Trojan downloaders
These user agents indicate compromised systems or active attacks.

DDoS Tools

Distributed denial of service attack tools:
  • Low Orbit Ion Cannon (LOIC)
  • High Orbit Ion Cannon (HOIC)
  • Slowloris variants
  • HTTP flood tools
These tools are used to overwhelm web services with traffic.

Common Bad User Agent Examples

The test may include user agents from tools like:
  • sqlmap - Automated SQL injection tool
  • Nikto - Web server security scanner
  • masscan - High-speed port scanner
  • Python-urllib - Often used by malicious scripts
  • Go-http-client - Frequently abused for automated attacks
  • curl - Command-line tool (legitimate but often abused)
  • Wget - Download tool (legitimate but often abused)
  • ZmEu - Known vulnerability scanner
  • MJ12bot - Aggressive crawler often blocked
  • AhrefsBot - SEO crawler sometimes unwanted
While some tools like curl and wget have legitimate uses, they frequently appear in attack traffic and are often blocked by default in security policies.

Testing Workflow

# Run Somnium and select option 8
python main.py
# Choose: #8 Test connection using known bad user agents

# Review results
cat Agent_Results.txt

# Check your security controls
# - WAF logs for blocked requests
# - Web server access logs
# - API gateway security events
# - IPS alerts for malicious user agents
If requests with malicious user agents reach your web servers without being flagged or blocked, this indicates:
  • Inadequate WAF configuration
  • Missing user agent validation
  • Potential vulnerability to automated attacks
  • Risk of content scraping and data theft

Detection and Blocking Strategies

1. WAF Rules

Configure Web Application Firewall rules to block known malicious user agent patterns.
2. User Agent Validation

Implement application-level user agent validation and blocking logic.
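A minimal sketch of such validation, assuming a substring blocklist; is_blocked is a hypothetical name, not an API from any specific framework.

```python
def is_blocked(user_agent: str, blocklist) -> bool:
    """Hypothetical check: case-insensitive substring match of the
    User-Agent header against known bad patterns."""
    ua = user_agent.lower()
    return any(pattern.lower() in ua for pattern in blocklist)

# Illustrative patterns; in practice, load these from a maintained blocklist.
blocklist = ["sqlmap", "masscan", "zmeu", "nikto"]
```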
3. Rate Limiting

Apply aggressive rate limiting to suspicious user agents.
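One way to sketch this is a sliding-window limiter keyed by user agent string; the class name, limit, and window below are illustrative.

```python
import time
from collections import defaultdict, deque
from typing import Optional

class UserAgentRateLimiter:
    """Illustrative sliding-window limiter: allow at most `limit` requests
    per `window` seconds for each user agent string."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # user agent -> recent request times

    def allow(self, user_agent: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits[user_agent]
        while q and now - q[0] > self.window:  # evict hits outside the window
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```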
4. Behavioral Analysis

Monitor for bot-like behavior patterns regardless of user agent string.

Security Controls to Validate

  • Web Application Firewalls (WAF) - Should block requests with known bad user agents
  • API Gateways - Should validate user agents for API requests
  • Bot Management Solutions - Should identify and block malicious bots
  • Rate Limiting - Should throttle or block automated requests
  • SIEM Rules - Should alert on malicious user agent patterns
  • CDN Security - Should filter bad bots at the edge

Advanced Evasion Techniques

Be aware that sophisticated attackers may:
  • Rotate user agents to mimic legitimate browsers
  • Use user agent strings from real browsers
  • Randomize user agents to avoid pattern detection
  • Combine user agent spoofing with other evasion techniques
Effective bot protection requires multiple layers:
  1. User agent filtering (basic)
  2. JavaScript challenges
  3. CAPTCHA for suspicious requests
  4. Behavioral analysis
  5. Machine learning-based detection
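The layers above can be combined into a simple scoring decision. Every name, weight, and threshold below is illustrative, not a recommended policy.

```python
def bot_score(user_agent: str, requests_per_minute: int, ran_javascript: bool) -> int:
    """Combine signals from several layers into one score (illustrative)."""
    score = 0
    if any(bad in user_agent.lower() for bad in ("sqlmap", "masscan", "nikto")):
        score += 60                      # layer 1: user agent filtering
    if not ran_javascript:
        score += 20                      # layer 2: failed the JS challenge
    if requests_per_minute > 120:
        score += 30                      # layer 4: bot-like request rate
    return score

def decision(score: int) -> str:
    """Map a score to an action; thresholds are illustrative."""
    if score >= 60:
        return "block"
    if score >= 30:
        return "captcha"                 # layer 3: challenge suspicious requests
    return "allow"
```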

Impact of Malicious Bots

Uncontrolled bot traffic can cause:
  • Performance Degradation - Excessive resource consumption
  • Increased Costs - Higher bandwidth and infrastructure expenses
  • Data Theft - Scraping of proprietary content and user data
  • Security Risks - Vulnerability scanning leading to exploits
  • Analytics Pollution - Skewed metrics and reporting
  • Competitive Intelligence - Unauthorized data collection by competitors

Best Practices

  1. Maintain Blocklists - Keep user agent blocklists updated regularly
  2. Monitor Trends - Track emerging malicious user agents
  3. Layer Defense - Don’t rely solely on user agent filtering
  4. Log Everything - Maintain detailed logs for forensic analysis
  5. Regular Testing - Use tools like Somnium to validate controls
  6. Tune Rules - Balance security with legitimate automated access
