Overview
The DNS lookup phase is the most resource-intensive part of the BlackWeb update process. It validates millions of domains through actual DNS queries to exclude nonexistent or invalid domains from the final blocklist.
Warning: High resource consumption. This process runs DNS queries in parallel and can saturate CPU and network bandwidth, especially on constrained connections such as satellite links.
Why DNS Validation?
Many public blocklist sources contain:
Expired domains
Nonexistent domains (typos in original lists)
Domains that never existed
Domains that have been taken down
By validating each domain via DNS, BlackWeb ensures only active, resolvable domains are blocked, reducing false positives and improving performance.
Two-Step Validation Process
The script performs DNS lookup in two steps with different timeout values:
Step 1: Initial Lookup (1-second timeout)
# STEP 1:
if [ ! -e "$bwupdate"/dnslookup1 ]; then
    echo "${bw08[$lang]}"
    sed 's/^\.//g' finalclean | sort -u > step1
    total=$(wc -l < step1)
    (
        while sleep 1; do
            processed=$(wc -l < dnslookup1 2>/dev/null)
            percent=$(awk -v p="$processed" -v t="$total" 'BEGIN { if (t > 0) printf "%.2f", (p/t)*100; else print 100 }')
            printf "Processed: %d / %d (%s%%)\r" "$processed" "$total" "$percent"
        done
    ) &
    progress_pid=$!
    if [ -s dnslookup1 ]; then
        awk 'FNR==NR {seen[$2]=1;next} seen[$1]!=1' dnslookup1 step1
    else
        cat step1
    fi | xargs -I {} -P "$PROCS" sh -c "if host -W 1 {} >/dev/null; then echo HIT {}; else echo FAULT {}; fi" >> dnslookup1
    kill "$progress_pid" 2>/dev/null
    echo
    sed '/^FAULT/d' dnslookup1 | awk '{print $2}' | awk '{print "." $1}' | sort -u > hit.txt
    sed '/^HIT/d' dnslookup1 | awk '{print $2}' | awk '{print "." $1}' | sort -u >> fault.txt
    sort -o fault.txt -u fault.txt
    echo "OK"
fi
Key Features:
Uses host -W 1 (1-second timeout)
Marks domains as HIT (resolved) or FAULT (failed)
Runs in parallel using xargs -P $PROCS
Real-time progress display
Resumes from checkpoint if interrupted
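The HIT/FAULT split at the end of Step 1 can be tried on sample data without generating any DNS traffic. This is a minimal sketch rather than the script's own code: the temporary directory, the sample domains, and the single-pass awk form are assumptions made for illustration.

```shell
# Work in a throwaway directory so no real BlackWeb files are touched.
workdir=$(mktemp -d)

# Sample results in the "STATUS domain" format written by the xargs workers.
cat > "$workdir/dnslookup1" <<'EOF'
HIT example.com
FAULT no-such-domain.invalid
HIT example.org
HIT example.com
EOF

# Resolved domains get a leading dot and are de-duplicated.
awk '$1 == "HIT" { print "." $2 }' "$workdir/dnslookup1" | sort -u > "$workdir/hit.txt"

# Failed domains are collected the same way for the Step 2 retry.
awk '$1 == "FAULT" { print "." $2 }' "$workdir/dnslookup1" | sort -u > "$workdir/fault.txt"

cat "$workdir/hit.txt"
```

Matching on the first field produces the same two lists as the sed and awk chain in the script, which may make the intent easier to see when reading the results by hand.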
Step 2: Retry Failed Domains (2-second timeout)
sleep 10
# STEP 2:
echo "${bw09[$lang]}"
sed 's/^\.//g' fault.txt | sort -u > step2
total=$(wc -l < step2)
(
    while sleep 1; do
        processed=$(wc -l < dnslookup2 2>/dev/null)
        percent=$(awk -v p="$processed" -v t="$total" 'BEGIN { if (t > 0) printf "%.2f", (p/t)*100; else print 100 }')
        printf "Processed: %d / %d (%s%%)\r" "$processed" "$total" "$percent"
    done
) &
progress_pid=$!
if [ -s dnslookup2 ]; then
    awk 'FNR==NR {seen[$2]=1;next} seen[$1]!=1' dnslookup2 step2
else
    cat step2
fi | xargs -I {} -P "$PROCS" sh -c "if host -W 2 {} >/dev/null; then echo HIT {}; else echo FAULT {}; fi" >> dnslookup2
kill "$progress_pid" 2>/dev/null
echo
sed '/^FAULT/d' dnslookup2 | awk '{print $2}' | awk '{print "." $1}' | sort -u >> hit.txt
sed '/^HIT/d' dnslookup2 | awk '{print $2}' | awk '{print "." $1}' | sort -u > fault.txt
echo "OK"
Why Two Steps?
First pass (1s timeout): Quickly confirms domains that resolve promptly; everything else is marked FAULT
10-second pause: Prevents overwhelming DNS infrastructure between passes
Second pass (2s timeout): Retries only the FAULT domains, giving slower-responding ones a second chance
Parallel Processing Configuration
The number of parallel DNS queries is controlled by the PROCS variable.
Understanding PROCS
The formula is:
PROCS = Number of Logical CPUs × Multiplier
Recommended Settings
Conservative
PROCS=$(($(nproc))) # Network-friendly
Best for:
Limited bandwidth connections
Satellite or metered internet
Shared DNS servers
Low-power systems

Balanced
PROCS=$(($(nproc) * 2)) # Balanced approach
Best for:
Standard broadband connections
Typical home/office setups
General use cases

Aggressive (Default)
PROCS=$(($(nproc) * 4)) # Fast but intensive
Best for:
High-bandwidth connections
Dedicated servers
Powerful CPUs
Local DNS caching

Extreme
PROCS=$(($(nproc) * 8)) # Maximum speed, use with caution
Best for:
Data center environments
Very high bandwidth
Dedicated hardware
Warning: May overwhelm DNS servers or hit rate limits
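The four settings above differ only in the multiplier applied to nproc, so they can be expressed as a small lookup. The sketch below is illustrative: the function name procs_for_profile and the lowercase profile names are hypothetical and do not appear in bwupdate.sh.

```shell
# Hypothetical helper: map a profile name to the multipliers listed above.
procs_for_profile() {
    case "$1" in
        conservative) mult=1 ;;
        balanced)     mult=2 ;;
        aggressive)   mult=4 ;;
        extreme)      mult=8 ;;
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
    # Logical CPU count times the chosen multiplier.
    echo $(( $(nproc) * mult ))
}

PROCS=$(procs_for_profile balanced)
echo "PROCS=$PROCS"
```

Rejecting unknown names instead of falling back to a default keeps a typo from silently selecting an aggressive setting.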
Example: Core i5 CPU
For a Core i5 with 4 physical cores and 8 threads (Hyper-Threading):
nproc → 8
PROCS=$((8 * 4)) → 32 parallel queries
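To get a feel for what 32 parallel queries means in wall-clock terms, the arithmetic can be sketched as follows. This is a rough estimate under stated assumptions, not a measurement: it assumes every query takes its full timeout and that each worker handles one query at a time. The helper name worst_case_seconds is hypothetical, and host may retry or try several name servers, so a single failing lookup can take longer than its -W value.

```shell
# Hypothetical helper: rough duration in seconds for one pass, assuming every
# query runs to its full timeout and each worker handles one query at a time.
worst_case_seconds() {
    total=$1; procs=$2; timeout=$3
    # Integer ceiling of total/procs, multiplied by the per-query timeout.
    echo $(( (total + procs - 1) / procs * timeout ))
}

# 7244989 domains, 32 workers, 1-second timeout
worst_case_seconds 7244989 32 1    # 226406
```

That is about 63 hours if nothing resolved at all. Most lookups return far sooner than the timeout, so a real pass should be much shorter, but the figure shows why the multiplier matters on very large lists.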
Checking Your CPU Configuration
# Physical cores
grep '^core id' /proc/cpuinfo | sort -u | wc -l
# Logical CPUs (threads)
nproc
# xargs command-line size limits (GNU xargs; this does not report a -P limit)
xargs --show-limits < /dev/null
Real-Time Progress Display
The script shows live processing statistics:
Processed: 2463489 / 7244989 (34.00%)
This progress indicator:
Updates every second
Shows domains processed vs. total
Displays percentage completion
Runs in a background process
Implementation
total=$(wc -l < step1)
(
    while sleep 1; do
        processed=$(wc -l < dnslookup1 2>/dev/null)
        percent=$(awk -v p="$processed" -v t="$total" 'BEGIN { if (t > 0) printf "%.2f", (p/t)*100; else print 100 }')
        printf "Processed: %d / %d (%s%%)\r" "$processed" "$total" "$percent"
    done
) &
progress_pid=$!
The progress monitor is killed after completion:
kill "$progress_pid" 2>/dev/null
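The percentage calculation can be checked in isolation. The sketch below wraps the same awk expression in a helper; the name progress_percent is hypothetical and is not part of the script.

```shell
# Same awk expression as the progress monitor, wrapped in a hypothetical helper.
# $1 = lines processed so far, $2 = total lines to process.
progress_percent() {
    awk -v p="$1" -v t="$2" 'BEGIN { if (t > 0) printf "%.2f", (p/t)*100; else print 100 }'
}

echo "$(progress_percent 2463489 7244989)%"   # 34.00%
echo "$(progress_percent 0 0)%"               # 100% (an empty list counts as done)
```

The t > 0 guard avoids a division by zero when the input list is empty, which would otherwise abort the awk program.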
DNS Query Results
HIT (Domain Resolved)
HIT google.com
google.com has address 142.251.35.238
google.com has IPv6 address 2607:f8b0:4008:80b::200e
google.com mail is handled by 10 smtp.google.com.
A “HIT” means:
Domain exists
DNS resolves successfully
Domain will be included in final blocklist
FAULT (Domain Failed)
FAULT testfaultdomain.com
Host testfaultdomain.com not found: 3(NXDOMAIN)
A “FAULT” means the lookup failed for one of these reasons:
Domain doesn’t exist (NXDOMAIN)
DNS query timed out
Temporary DNS failure
In each case, the domain will be excluded from the blocklist
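The script decides between HIT and FAULT purely from the exit status of the lookup command, which is why the three failure causes above cannot be told apart in the results files. A minimal sketch of that decision follows. The helper name classify is hypothetical, and true and false stand in for a successful and a failed host call so the example runs without network access.

```shell
# Hypothetical helper: run any lookup command and label the domain by its
# exit status, in the same way the sh -c snippet in the script does with host.
classify() {
    domain=$1
    shift
    if "$@" >/dev/null; then
        echo "HIT $domain"
    else
        echo "FAULT $domain"
    fi
}

# Stand-ins for a successful and a failed lookup (no network needed).
classify example.com true
classify no-such-domain.invalid false

# Real use would pass the resolver command instead, for example:
#   classify example.com host -W 1 example.com
```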
Resume Capability
The script can resume DNS lookup if interrupted:
if [ -s dnslookup1 ]; then
    awk 'FNR==NR {seen[$2]=1;next} seen[$1]!=1' dnslookup1 step1
else
    cat step1
fi | xargs -I {} -P "$PROCS" sh -c "..."
This logic:
Checks if dnslookup1 file exists and has content
Excludes already-processed domains
Only queries remaining domains
Prevents duplicate work
If you interrupt the script during DNS lookup (Ctrl+C), it automatically resumes from where it left off on the next run.
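The resume filter can be exercised on its own with a few made-up domains. The sketch below is illustrative only: the temporary directory and the sample names are assumptions, while the awk program is the one shown above.

```shell
resume_dir=$(mktemp -d)

# Full list of domains to check (step1) and a partial results file left by
# an interrupted run (dnslookup1). Both contain made-up sample data.
printf '%s\n' a.example b.example c.example d.example > "$resume_dir/step1"
printf '%s\n' 'HIT a.example' 'FAULT b.example' > "$resume_dir/dnslookup1"

# While reading dnslookup1 (FNR==NR), record every processed domain (field 2).
# While reading step1, print only the domains that were not recorded.
remaining=$(awk 'FNR==NR {seen[$2]=1; next} seen[$1]!=1' \
    "$resume_dir/dnslookup1" "$resume_dir/step1")

echo "$remaining"
```

This prints c.example and d.example. The -s test in the script matters because the FNR==NR idiom breaks when the first file is empty: every line of step1 would then be treated as part of the first file and nothing would be printed.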
Adjusting for Your Environment
Factors to Consider
| Factor | Lower PROCS | Higher PROCS |
|---|---|---|
| CPU | Older/slower CPU | Modern multi-core CPU |
| Network | Satellite, metered, slow | Fiber, unlimited, fast |
| DNS Server | Public DNS (8.8.8.8) | Local caching DNS |
| System Load | Production server | Dedicated test machine |
| Priority | Minimize impact | Maximize speed |
Recommended Adjustments
Edit bwupdate.sh line 388:
# Change this line based on your needs:
PROCS=$(($(nproc) * 4)) # Default: Aggressive
Replace with your preferred setting:
PROCS=$(($(nproc))) # Conservative
PROCS=$(($(nproc) * 2)) # Balanced
PROCS=$(($(nproc) * 8)) # Extreme
Warning: Network saturation. High PROCS values can saturate DNS servers, causing:
Rate limiting
Temporary bans
Increased FAULT results (false negatives)
Slower overall performance
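One way to watch for the rise in FAULT results described above is to track the share of FAULT lines in the results file while a pass is running. This is a minimal sketch: the helper name fault_percent is hypothetical, and the document gives no threshold for what counts as too high, so the number is only useful for comparing runs at different PROCS values.

```shell
# Hypothetical helper: percentage of FAULT lines in a results file.
fault_percent() {
    awk '$1 == "FAULT" { f++ }
         END { if (NR > 0) printf "%.2f\n", (f / NR) * 100; else print "0.00" }' "$1"
}

# Made-up sample: three resolved domains and one failure.
sample=$(mktemp)
printf '%s\n' 'HIT a.example' 'HIT b.example' 'HIT c.example' 'FAULT d.example' > "$sample"

fault_percent "$sample"    # 25.00
```

Against a live run, the same helper would be pointed at dnslookup1 or dnslookup2.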
While the script runs, monitor:
# CPU usage
htop
# Network traffic
iftop
# DNS query rate
watch -n 1 'wc -l dnslookup1'
# System load
uptime
Troubleshooting
Many FAULT results or timeouts
Reduce PROCS value (network/DNS overload)
Check DNS server responsiveness
Verify internet connection stability
Consider using local caching DNS (dnsmasq, unbound)

Lookup runs too slowly
Increase PROCS value (if system can handle it)
Use faster DNS servers (Cloudflare 1.1.1.1, Google 8.8.8.8)
Check for bandwidth throttling
Verify CPU isn’t maxed out

System becomes unresponsive
Immediately reduce PROCS value
Kill the script and restart with lower parallelism
Monitor system resources before restarting
Consider running on dedicated hardware

DNS server rate limiting or bans
Use local recursive DNS resolver
Reduce PROCS significantly
Add delays between queries
Spread queries across multiple DNS servers
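Spreading queries across several DNS servers could look like the sketch below. It is not part of bwupdate.sh: the DNS_SERVERS variable and the pick_server helper are hypothetical, and the two addresses are simply the public resolvers named earlier in this section. The host command accepts an optional server argument after the name, which is what the final comment relies on.

```shell
# Hypothetical server list; replace with resolvers you are permitted to use.
DNS_SERVERS="1.1.1.1 8.8.8.8"

# Pick a server for a domain from a checksum of its name, so the choice is
# stable across runs and spread roughly evenly over the list.
pick_server() {
    sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    count=$(echo $DNS_SERVERS | wc -w)
    index=$(( sum % count + 1 ))
    echo $DNS_SERVERS | cut -d ' ' -f "$index"
}

server=$(pick_server example.com)
echo "example.com -> $server"

# A lookup against the chosen server would then be:
#   host -W 1 example.com "$server"
```

Choosing by checksum rather than at random means a resumed run sends each domain to the same server as before.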
Next Steps
Domain Debugging: Learn about domain validation, TLD checking, Punycode conversion, and ASCII cleanup.