A lightweight, zero-dependency Deep Packet Inspection engine written in pure Python. It reads .pcap network capture files, identifies application traffic using TLS SNI, HTTP Host headers, and DNS queries, and filters out packets based on configurable blocking rules.
dpi_python/
├── main.py # CLI entry point
├── generate_test_pcap.py # Synthetic traffic generator for testing
├── test_dpi.pcap # Sample test capture file
├── src/
│ ├── types.py # AppType enum, FiveTuple, Flow, PCAP structs
│ ├── pcap_reader.py # Binary PCAP reader + writer (pure struct)
│ ├── packet_parser.py # Ethernet / IPv4 / TCP / UDP parser
│ ├── sni_extractor.py # TLS SNI · HTTP Host · DNS · QUIC extractors
│ ├── rule_manager.py # Thread-safe IP / App / Domain rules
│ ├── connection_tracker.py # Flow state tracker with expiry
│ ├── dpi_engine.py # Single-threaded engine
│ └── dpi_engine_mt.py # Multi-threaded load-balancer → fingerprinter pipeline
└── tests/
└── test_dpi.py # 34 unit + integration tests (all passing ✅)
- Python 3.8 or higher
- No external packages needed — 100% Python standard library (
struct,threading,queue)
Check your Python version:
python --versionunzip Deep Packet Inspection (DPI) Engine
cd DPIA sample test_dpi.pcap is already included, but you can regenerate it:
python generate_test_pcap.py test_dpi.pcappython main.py test_dpi.pcap output.pcapThe engine will analyze every packet, print a live report to the terminal, and write a filtered output.pcap containing only the allowed traffic.
python main.py <input.pcap> <output.pcap> [options]
| Option | Description | Example |
|---|---|---|
--block-app <name> |
Block traffic by app name | --block-app YouTube |
--block-ip <ip> |
Block traffic by IP address | --block-ip 192.168.1.50 |
--block-domain <keyword> |
Block traffic by domain keyword | --block-domain facebook |
--mt |
Enable multi-threaded mode | --mt |
--lbs <n> |
Number of load-balancer threads (MT mode) | --lbs 2 |
--fps <n> |
Number of fingerprinter threads (MT mode) | --fps 2 |
Block a single app:
python main.py test_dpi.pcap output.pcap --block-app YouTubeBlock multiple apps:
python main.py test_dpi.pcap output.pcap --block-app TikTok --block-app InstagramBlock by IP address:
python main.py test_dpi.pcap output.pcap --block-ip 192.168.1.50Block by domain keyword:
python main.py test_dpi.pcap output.pcap --block-domain facebookCombine all rule types:
python main.py test_dpi.pcap output.pcap \
--block-app YouTube \
--block-app TikTok \
--block-ip 192.168.1.50 \
--block-domain facebookMulti-threaded mode (faster for large files):
python main.py test_dpi.pcap output.pcap --block-app Netflix --mt --lbs 2 --fps 2| App | App | App |
|---|---|---|
| YouTube | TikTok | |
| Netflix | Amazon | Microsoft |
| Apple | Telegram | Spotify |
| Zoom | Discord | GitHub |
| Cloudflare | — |
python tests/test_dpi.pyExpected output:
test_dns_response_ignored ... ok
test_extract_domain ... ok
test_extract_host ... ok
...
Ran 34 tests in X.XXXs
OK
All 34 tests cover unit and integration scenarios including TLS SNI extraction, HTTP Host parsing, DNS extraction, rule management, and the full single-threaded pipeline.
The engine uses Deep Packet Inspection (DPI) to look inside network packets and identify which application generated the traffic. It works in three stages:
Each raw packet from the .pcap file is parsed layer by layer:
- Ethernet → extracts MAC addresses
- IPv4 → extracts source/destination IP addresses
- TCP/UDP → extracts port numbers and payload
The payload is analyzed using three methods:
| Method | Protocol | How it works |
|---|---|---|
| TLS SNI Extraction | HTTPS | Reads the Server Name Indication field from the TLS Client Hello handshake |
| HTTP Host Extraction | HTTP | Reads the Host: header from plain HTTP requests |
| DNS Query Extraction | DNS | Reads the queried domain name from DNS request packets |
The identified hostname (e.g., www.youtube.com) is then matched against a list of 20+ application signatures to identify the app.
If a packet matches a blocking rule (by app, IP, or domain keyword), it is dropped. All remaining packets are written to the output .pcap file.
When --mt is used, the engine runs a producer-consumer pipeline:
PCAP Reader → Load Balancers (distribute flows) → Fingerprinters (analyze & filter) → Output Writer
Flows are consistently hashed to the same thread to preserve packet ordering within each connection.
A .pcap file is a Packet Capture file — a binary recording of all network traffic on a connection or interface, similar to a "video recording" of your internet activity.
- Create one: Use Wireshark (File → Record) or
tcpdump -w capture.pcap - Open one: Use Wireshark — it shows every packet with source/destination IPs, protocols, and payload details
- This project: Reads your
.pcap, identifies and filters app traffic, writes a clean.pcapback out
| Criterion | Score | Notes |
|---|---|---|
| Architecture | 9/10 | Clean separation of components across files |
| Documentation | 10/10 | Best-in-class README for a learning project |
| Code Quality | 8/10 | Clean, readable Python throughout |
| Features | 8/10 | TLS SNI, HTTP Host, DNS, 20+ app signatures, MT pipeline |
| Testing | 9/10 | 34 passing unit + integration tests |
| Dependencies | 10/10 | Zero external packages — pure stdlib |
Overall: 8.5 / 10
This project is for educational and research purposes.