Parsing TCP Packets with BPF and Python


Apr 22, 2025 - 21:16

Want to inspect live TCP traffic from Python without depending on Wireshark, libpcap, or other heavy tools? This post shows how to capture TCP receive events in real time using eBPF with the BCC (BPF Compiler Collection) Python bindings. The approach suits custom observability tooling, socket-behavior monitoring, or a custom network profiler.

Setup


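You'll need a Linux system and root privileges. Kprobe attachment for BPF arrived in kernel 4.1 and the perf event output helper used below in 4.4, so treat 4.4 as the practical minimum. Install BCC and its Python bindings:

sudo apt install bpfcc-tools linux-headers-$(uname -r) python3-bcc

On Debian and Ubuntu the python3-bcc package provides the Python bindings, so a separate pip install should not be needed.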
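Before loading any BPF program, it can help to check the two prerequisites up front. The following is a minimal sketch, not part of BCC: `parse_kernel_release`, `preflight`, and `MIN_KERNEL` are hypothetical names, and the 4.4 floor is an assumption based on when the perf event output helper became available.

```python
import os
import platform
import re

# Assumed floor: BPF perf event output is available from kernel 4.4.
MIN_KERNEL = (4, 4)

def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))

def preflight(release, euid):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if parse_kernel_release(release) < MIN_KERNEL:
        problems.append(
            f"kernel {release} is older than {MIN_KERNEL[0]}.{MIN_KERNEL[1]}"
        )
    if euid != 0:
        problems.append("not running as root")
    return problems

if __name__ == "__main__":
    # Check the machine this script is actually running on.
    for problem in preflight(platform.release(), os.geteuid()):
        print("preflight:", problem)
```

Passing the release string and effective UID in as arguments keeps the check easy to exercise on machines other than the one you plan to trace.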

Full Working Code


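from bcc import BPF
from socket import inet_ntop, AF_INET
from struct import pack
import ctypes

bpf_code = """
// NOTE: the header names were lost from the original listing. These four are
// an assumption based on what the program uses (pt_regs, struct sock, ntohs).
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <bcc/proto.h>
#include <linux/tcp.h>

struct tcp_event_t {
    u32 pid;
    u32 saddr;  // local IPv4 address, network byte order
    u32 daddr;  // remote IPv4 address, network byte order
    u16 sport;  // local port, host byte order
    u16 dport;  // remote port, host byte order after ntohs()
};
BPF_PERF_OUTPUT(events);

int trace_tcp_rcv(struct pt_regs *ctx, struct sock *sk) {
    struct tcp_event_t data = {};
    u16 sport = 0, dport = 0;

    // The upper 32 bits hold the TGID, i.e. the user-space process ID.
    data.pid = bpf_get_current_pid_tgid() >> 32;

    bpf_probe_read_kernel(&data.saddr, sizeof(data.saddr), &sk->__sk_common.skc_rcv_saddr);
    bpf_probe_read_kernel(&data.daddr, sizeof(data.daddr), &sk->__sk_common.skc_daddr);
    bpf_probe_read_kernel(&sport, sizeof(sport), &sk->__sk_common.skc_num);
    bpf_probe_read_kernel(&dport, sizeof(dport), &sk->__sk_common.skc_dport);

    data.sport = sport;         // skc_num is already in host byte order
    data.dport = ntohs(dport);  // skc_dport is in network byte order

    events.perf_submit(ctx, &data, sizeof(data));
    return 0;
}
"""

class TcpEvent(ctypes.Structure):
    # Field order and sizes must mirror struct tcp_event_t above.
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("saddr", ctypes.c_uint32),
        ("daddr", ctypes.c_uint32),
        ("sport", ctypes.c_uint16),
        ("dport", ctypes.c_uint16),
    ]

b = BPF(text=bpf_code)
b.attach_kprobe(event="tcp_rcv_established", fn_name="trace_tcp_rcv")

def handle_event(cpu, data, size):
    event = ctypes.cast(data, ctypes.POINTER(TcpEvent)).contents
    # The addresses sit in memory in network byte order, so pack them with
    # native order ("I") to hand the same bytes back to inet_ntop. Packing
    # with ">I" would reverse the octets on little-endian machines.
    saddr = inet_ntop(AF_INET, pack("I", event.saddr))
    daddr = inet_ntop(AF_INET, pack("I", event.daddr))
    print(f"PID {event.pid} {saddr}:{event.sport} → {daddr}:{event.dport}")

print("Tracing TCP connections... Press Ctrl-C to stop.")
b["events"].open_perf_buffer(handle_event)

try:
    while True:
        b.perf_buffer_poll()
except KeyboardInterrupt:
    pass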

Explanation


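This code attaches a kprobe to tcp_rcv_established, the kernel function that handles incoming segments on connections in the ESTABLISHED state. On each call it reads the IPv4 addresses and ports from the kernel socket structure (struct sock): saddr and sport are the local end of the socket, daddr and dport the remote end. The local port (skc_num) is stored in host byte order and the remote port (skc_dport) in network byte order, which is why only dport goes through ntohs. The PID is that of whatever task is running when the probe fires; because receive processing often runs in softirq context, it is not guaranteed to be the process that owns the socket. Events are emitted to user space via BPF's perf buffer, where Python receives and prints them in near real time.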
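The decoding step can be tried without root or a kernel probe. The sketch below builds a made-up event in the same layout as tcp_event_t, turns it into raw bytes as the perf buffer would deliver them, and decodes it again. `kernel_u32` and `format_event` are hypothetical helpers for illustration, and the addresses and ports are invented sample values.

```python
import ctypes
import socket
import struct
import sys

class TcpEvent(ctypes.Structure):
    # Same layout as struct tcp_event_t in the BPF program.
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("saddr", ctypes.c_uint32),
        ("daddr", ctypes.c_uint32),
        ("sport", ctypes.c_uint16),
        ("dport", ctypes.c_uint16),
    ]

def kernel_u32(dotted):
    """Value a native u32 load yields for an address stored in network order."""
    return int.from_bytes(socket.inet_aton(dotted), sys.byteorder)

def format_event(event):
    # Native-order pack restores the original in-memory (network-order) bytes.
    saddr = socket.inet_ntop(socket.AF_INET, struct.pack("I", event.saddr))
    daddr = socket.inet_ntop(socket.AF_INET, struct.pack("I", event.daddr))
    return f"PID {event.pid} {saddr}:{event.sport} -> {daddr}:{event.dport}"

# Build a made-up event the way the kernel side would fill it in.
fake = TcpEvent(
    pid=1234,
    saddr=kernel_u32("10.0.0.5"),
    daddr=kernel_u32("192.0.2.7"),
    sport=22,
    dport=51514,
)
raw = bytes(fake)                         # bytes as the perf buffer would carry them
decoded = TcpEvent.from_buffer_copy(raw)  # what the callback reconstructs
print(format_event(decoded))              # PID 1234 10.0.0.5:22 -> 192.0.2.7:51514
```

Packing with native order is the detail that matters here: the u32 already holds network-order bytes, so forcing big-endian packing would reverse the octets on little-endian hosts such as x86.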

Pros & Cons

✅ Pros


  • No libpcap or raw socket setup
  • High-performance and low-overhead
  • Accessible from user-space Python
  • Works on live production systems

⚠️ Cons


  • Linux-only (requires BPF syscalls)
  • Root privileges needed
  • Metadata only, with no payload inspection; this version also reads only IPv4 addresses
  • Python BCC can be finicky across distros

Wrap-Up


This technique gives you programmable access to TCP stream metadata without leaving Python or relying on bloated tools. Ideal for custom observability agents, connection profiling, or learning how sockets work at the kernel level.
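As one possible starting point for connection profiling, the events can be aggregated instead of printed. This is a rough sketch under stated assumptions: `ConnectionProfile` is a hypothetical class, and the sample tuples stand in for values that the perf buffer callback would supply.

```python
from collections import Counter

class ConnectionProfile:
    """Counts receive events per (pid, local endpoint, remote endpoint)."""

    def __init__(self):
        self.counts = Counter()

    def record(self, pid, local, remote):
        self.counts[(pid, local, remote)] += 1

    def top(self, n=5):
        """Return the n busiest connections as ((pid, local, remote), count)."""
        return self.counts.most_common(n)

# Invented sample events standing in for real perf buffer callbacks.
samples = [
    (1234, "10.0.0.5:22", "192.0.2.7:51514"),
    (1234, "10.0.0.5:22", "192.0.2.7:51514"),
    (987, "10.0.0.5:443", "198.51.100.3:40022"),
]

profile = ConnectionProfile()
for pid, local, remote in samples:
    profile.record(pid, local, remote)

for key, count in profile.top():
    print(count, key)
```

In the tracer, the callback would call `profile.record(...)` with the formatted endpoints, and a periodic timer or the Ctrl-C handler would print `profile.top()`.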
