eBPF Use Cases: Revolutionizing Security, Networking, and Observability#

eBPF has transformed from an interesting kernel technology to a production-ready solution powering critical infrastructure worldwide. This post explores how eBPF is being used in three major domains: security, networking, and observability, with real-world examples and practical implementations.

Security: The New Frontier of Runtime Protection#

Runtime Security Monitoring#

eBPF enables unprecedented visibility into system behavior without modifying applications or requiring kernel modules.

Example: Detecting Suspicious Process Behavior#

SEC("tracepoint/sched/sched_process_exec")
int detect_suspicious_exec(struct trace_event_raw_sched_process_exec *ctx) {
    char comm[TASK_COMM_LEN];
    char filename[256];

    bpf_get_current_comm(&comm, sizeof(comm));

    // filename is a __data_loc field: resolve its offset, then copy it
    unsigned int fname_off = ctx->__data_loc_filename & 0xFFFF;
    bpf_probe_read_str(filename, sizeof(filename), (void *)ctx + fname_off);

    // Detect execution from /tmp or /dev/shm. There is no strstr()
    // in eBPF, so compare fixed prefixes instead.
    if (__builtin_memcmp(filename, "/tmp/", 5) == 0 ||
        __builtin_memcmp(filename, "/dev/shm/", 9) == 0) {
        // Alert on suspicious execution location
        bpf_printk("ALERT: Suspicious exec of %s by %s\n", filename, comm);
    }
    return 0;
}
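The same policy can be mirrored in user space when post-processing events, keeping the in-kernel program minimal. A small sketch (the `/var/tmp/` prefix is an added assumption, not part of the program above):

```python
# Userspace mirror of the in-kernel prefix check; /var/tmp/ is an
# added assumption, not part of the eBPF program above.
SUSPICIOUS_PREFIXES = ("/tmp/", "/dev/shm/", "/var/tmp/")

def is_suspicious_exec(path: str) -> bool:
    """Flag executions launched from world-writable locations."""
    return path.startswith(SUSPICIOUS_PREFIXES)

if __name__ == "__main__":
    print(is_suspicious_exec("/tmp/payload"))  # True
    print(is_suspicious_exec("/usr/bin/ls"))   # False
```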

Container Security#

eBPF provides deep visibility into container operations at the kernel level, enabling security tools to:

  • Monitor system calls per container
  • Track network connections
  • Detect container escapes
  • Enforce security policies

Production Example: Falco#

# Falco rule using eBPF for container security
- rule: Container Drift Detected
  desc: Detect new executables in a running container
  condition: >
    container and proc.name != <known_processes>
    and evt.type = execve
  output: >
    New process launched in container
    (user=%user.name command=%proc.cmdline container=%container.name)

Network Security#

DDoS Mitigation with XDP#

SEC("xdp")
int xdp_ddos_mitigate(struct xdp_md *ctx) {
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    // Rate limiting per source IP
    struct iphdr *ip = data + sizeof(*eth);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    __u32 src_ip = ip->saddr;
    struct rate_limit *rl = bpf_map_lookup_elem(&rate_map, &src_ip);
    if (rl && rl->packets > THRESHOLD) {
        // Drop packets exceeding threshold
        return XDP_DROP;
    }

    // Update rate limit counter
    update_rate_limit(src_ip);
    return XDP_PASS;
}
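The `update_rate_limit()` helper above is left undefined; its logic can be sketched in user space as a fixed-window counter per source IP (`THRESHOLD` and the one-second window are assumptions):

```python
from collections import defaultdict

THRESHOLD = 1000           # packets allowed per window (assumption)
WINDOW_NS = 1_000_000_000  # 1-second window (assumption)

class RateLimiter:
    """Userspace sketch of the per-source-IP counter the XDP program
    keeps in rate_map: count packets per window, drop above THRESHOLD."""

    def __init__(self):
        # ip -> [window_start_ns, packet_count]
        self.counters = defaultdict(lambda: [0, 0])

    def allow(self, src_ip: int, now_ns: int) -> bool:
        entry = self.counters[src_ip]
        if now_ns - entry[0] >= WINDOW_NS:
            entry[0], entry[1] = now_ns, 0  # new window, reset count
        entry[1] += 1
        return entry[1] <= THRESHOLD        # False maps to XDP_DROP

if __name__ == "__main__":
    rl = RateLimiter()
    verdicts = [rl.allow(0x0A000001, 0) for _ in range(1001)]
    print(verdicts.count(True), verdicts.count(False))  # 1000 1
```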

Real-World Security Deployments#

1. Cloudflare - L4Drop#

  • Challenge: Mitigate DDoS attacks at line rate
  • Solution: XDP-based packet filtering
  • Result: Drops malicious packets in < 1 microsecond

2. Netflix - Cloud Security#

  • Challenge: Monitor thousands of microservices
  • Solution: eBPF-based system call monitoring
  • Result: Real-time security insights with < 1% overhead

3. Google - gVisor TLS Inspector#

  • Challenge: Inspect encrypted traffic for security policies
  • Solution: eBPF program capturing TLS handshake SNI
  • Result: Policy enforcement without decryption

Networking: Performance at Scale#

High-Performance Load Balancing#

Facebook’s Katran#

// Simplified version of Facebook's Katran load balancer
SEC("xdp")
int xdp_load_balancer(struct xdp_md *ctx) {
    // Parse packet headers
    struct header_pointers hdr = parse_headers(ctx);
    if (!hdr.ip)
        return XDP_PASS;

    // Compute hash for consistent hashing
    __u32 hash = jhash_2words(hdr.ip->saddr,
                              hdr.tcp ? hdr.tcp->source : 0,
                              HASH_SEED);

    // Select backend based on hash
    struct backend *backend = select_backend(hash);
    if (!backend)
        return XDP_PASS;

    // Encapsulate and redirect
    encap_and_redirect(ctx, backend);
    return XDP_REDIRECT;
}

Performance Impact:

  • 10x more packets per second vs IPVS
  • Sub-microsecond latency
  • Linear scaling with CPU cores
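The `select_backend()` step relies on consistent hashing so a flow keeps hitting the same backend even as backends change. A userspace sketch using a virtual-node hash ring (the backend IPs and `vnodes` count are illustrative; Katran itself uses a Maglev-style table):

```python
import hashlib
from bisect import bisect

class HashRing:
    """Sketch of the consistent-hashing step behind select_backend():
    the same flow hash always maps to the same backend, and removing
    one backend only remaps roughly 1/N of the flows."""

    def __init__(self, backends, vnodes=100):
        # Each backend gets `vnodes` points on the ring for balance
        self.ring = sorted(
            (self._h(f"{b}#{v}"), b)
            for b in backends for v in range(vnodes)
        )
        self.keys = [k for k, _ in self.ring]

    @staticmethod
    def _h(s: str) -> int:
        return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")

    def select_backend(self, flow_hash: int) -> str:
        # First ring point clockwise from the flow hash
        idx = bisect(self.keys, flow_hash) % len(self.keys)
        return self.ring[idx][1]

if __name__ == "__main__":
    ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    print(ring.select_backend(0xDEADBEEF))  # stable for this flow hash
```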

Service Mesh Data Plane#

Cilium - eBPF-powered Kubernetes Networking#

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-access
spec:
  endpointSelector:
    matchLabels:
      app: api-service
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: "GET"
                path: "/api/v1/.*"

Benefits:

  • No sidecar proxy overhead
  • Kernel-level policy enforcement
  • Connection-level load balancing
  • Transparent service mesh

Network Performance Optimization#

TCP Congestion Control#

// Custom TCP congestion control algorithm registered via struct_ops
SEC(".struct_ops")
struct tcp_congestion_ops my_ebpf_cc = {
    .name       = "my_ebpf_cc",
    .init       = (void *)ebpf_cc_init,
    .cong_avoid = (void *)ebpf_cong_avoid,
    .set_state  = (void *)ebpf_set_state,
    .undo_cwnd  = (void *)ebpf_undo_cwnd,
};

Production Networking Examples#

1. LinkedIn - TCP Optimization#

  • Challenge: Optimize TCP for datacenter networks
  • Solution: eBPF-based congestion control
  • Result: 30% reduction in tail latency

2. Shopify - Edge Load Balancing#

  • Challenge: Handle Black Friday traffic spikes
  • Solution: XDP-based load balancer
  • Result: 5x increase in capacity

Observability: Deep Insights with Minimal Overhead#

Distributed Tracing#

Trace HTTP Requests Across Services#

struct trace_event {
    __u64 timestamp;
    __u32 pid;
    __u32 tid;
    char comm[16];
    __u64 duration_ns;
};

SEC("uprobe/http_handler")
int trace_http_start(struct pt_regs *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();
    __u64 ts = bpf_ktime_get_ns();

    // Store start timestamp
    bpf_map_update_elem(&start_times, &pid_tgid, &ts, BPF_ANY);
    return 0;
}

SEC("uretprobe/http_handler")
int trace_http_end(struct pt_regs *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();
    __u64 *start_ts = bpf_map_lookup_elem(&start_times, &pid_tgid);
    if (!start_ts)
        return 0;

    struct trace_event event = {};
    event.timestamp = bpf_ktime_get_ns();
    event.duration_ns = event.timestamp - *start_ts;
    event.pid = pid_tgid >> 32;
    event.tid = (__u32)pid_tgid;
    bpf_get_current_comm(&event.comm, sizeof(event.comm));

    // Send to user space
    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU,
                          &event, sizeof(event));
    return 0;
}

Continuous Profiling#

CPU Profiling with Stack Traces#

SEC("perf_event")
int profile_cpu(struct bpf_perf_event_data *ctx) {
    __u32 pid = bpf_get_current_pid_tgid() >> 32;

    // Get user-space stack trace (negative return means lookup failed)
    int stack_id = bpf_get_stackid(ctx, &stacks, BPF_F_USER_STACK);
    if (stack_id < 0)
        return 0;

    // Count samples per unique stack
    struct stack_count_key key = {
        .pid = pid,
        .kernel_stack_id = -1,
        .user_stack_id = stack_id,
    };
    increment_stack_count(&key);
    return 0;
}
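User space then drains the stack and count maps and folds them into flame-graph input. A sketch of that fold, producing the "collapsed" format that flamegraph.pl consumes (the sample tuples here are illustrative):

```python
from collections import Counter

def collapse_stacks(samples):
    """Fold (pid, stack_frames) samples into flamegraph 'collapsed'
    lines: frames joined by ';' followed by a sample count."""
    counts = Counter(";".join(stack) for _, stack in samples)
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]

if __name__ == "__main__":
    samples = [
        (42, ["main", "handle_request", "parse_json"]),
        (42, ["main", "handle_request", "parse_json"]),
        (42, ["main", "handle_request", "db_query"]),
    ]
    for line in collapse_stacks(samples):
        print(line)
```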

Custom Metrics and Monitoring#

Application-Specific Metrics#

# Using bcc to create custom metrics
from bcc import BPF

program = """
BPF_HASH(start_times, u64, u64);
BPF_HISTOGRAM(latency_hist, u64);

int trace_db_query(struct pt_regs *ctx) {
    u64 ts = bpf_ktime_get_ns();
    u64 pid = bpf_get_current_pid_tgid();
    start_times.update(&pid, &ts);
    return 0;
}

int trace_db_query_return(struct pt_regs *ctx) {
    u64 pid = bpf_get_current_pid_tgid();
    u64 *start = start_times.lookup(&pid);
    if (start) {
        u64 delta = bpf_ktime_get_ns() - *start;
        latency_hist.increment(bpf_log2l(delta));
        start_times.delete(&pid);
    }
    return 0;
}
"""

b = BPF(text=program)
b.attach_uprobe(name="mysql", sym="mysql_query", fn_name="trace_db_query")
b.attach_uretprobe(name="mysql", sym="mysql_query", fn_name="trace_db_query_return")

Real-World Observability Deployments#

1. Netflix - FlameScope#

  • Challenge: Understand performance variability
  • Solution: eBPF-based subsecond offset heat maps
  • Result: Identified previously invisible performance patterns

2. Datadog - NPM (Network Performance Monitoring)#

  • Challenge: Monitor network flows without agents
  • Solution: eBPF programs tracking connections
  • Result: Complete network visibility with minimal overhead

3. New Relic - Pixie#

  • Challenge: Instant Kubernetes observability
  • Solution: eBPF-based automatic instrumentation
  • Result: No code changes required for full observability

Advanced Use Cases#

1. Chaos Engineering#

// Inject controlled failures for testing. Note: a kprobe's return
// value cannot fail the probed call itself; error injection requires
// bpf_override_return() plus CONFIG_BPF_KPROBE_OVERRIDE and a
// function annotated as error-injectable.
SEC("kprobe/tcp_sendmsg")
int inject_network_delay(struct pt_regs *ctx) {
    if (should_inject_fault()) {
        // Make the probed function return -EAGAIN to force a retry
        bpf_override_return(ctx, -EAGAIN);
    }
    return 0;
}

2. Smart Traffic Shaping#

// Prioritize traffic based on application behavior
SEC("tc")
int shape_traffic(struct __sk_buff *skb) {
    // Identify application based on payload
    if (is_video_streaming(skb)) {
        skb->priority = TC_PRIO_INTERACTIVE;
    } else if (is_bulk_transfer(skb)) {
        skb->priority = TC_PRIO_BULK;
    }
    return TC_ACT_OK;
}

3. Database Query Optimization#

Monitor and optimize database queries in real-time:

  • Track query patterns
  • Identify slow queries
  • Suggest index improvements
  • Cache frequently accessed data
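A hypothetical userspace aggregator for the first two bullets, consuming the (pattern, duration) pairs an eBPF uprobe on the database driver would emit (`SLOW_NS` is an assumed threshold):

```python
from collections import defaultdict

SLOW_NS = 50_000_000  # 50 ms slow-query threshold (assumption)

class QueryStats:
    """Sketch of the userspace half of query monitoring: aggregate
    per-query durations and surface the slowest patterns."""

    def __init__(self):
        self.durations = defaultdict(list)

    def record(self, pattern: str, duration_ns: int):
        self.durations[pattern].append(duration_ns)

    def slow_queries(self):
        # Worst observed latency per normalized query pattern
        worst = ((max(d), p) for p, d in self.durations.items())
        return [(p, ns) for ns, p in sorted(worst, reverse=True)
                if ns > SLOW_NS]

if __name__ == "__main__":
    qs = QueryStats()
    qs.record("SELECT * FROM users WHERE id = ?", 2_000_000)
    qs.record("SELECT * FROM orders WHERE status = ?", 120_000_000)
    print(qs.slow_queries())
```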

Best Practices for Production eBPF#

1. Performance Considerations#

  • Use per-CPU maps to avoid contention
  • Minimize work in eBPF programs
  • Batch operations when possible
  • Profile eBPF programs themselves
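On the per-CPU-maps point: each CPU updates only its own slot lock-free, so user space must sum the slots on read (the pattern behind bcc's per-CPU table helpers). A minimal sketch of that merge with an illustrative snapshot:

```python
def merge_percpu_map(percpu_map):
    """percpu_map: key -> [value_cpu0, value_cpu1, ...]. Each CPU
    writes only its own slot; a read sums the slots into one counter."""
    return {key: sum(vals) for key, vals in percpu_map.items()}

if __name__ == "__main__":
    # Illustrative snapshot of a 4-CPU per-CPU counter map
    snapshot = {
        "rx_packets": [120, 98, 103, 77],
        "tx_packets": [50, 61, 44, 58],
    }
    print(merge_percpu_map(snapshot))  # {'rx_packets': 398, 'tx_packets': 213}
```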

2. Security Guidelines#

  • Implement proper RBAC for eBPF access
  • Audit eBPF program loading
  • Use signed eBPF programs
  • Monitor eBPF-related syscalls

3. Reliability Patterns#

  • Implement circuit breakers
  • Use ringbuffer maps for reliable event delivery
  • Handle map full conditions gracefully
  • Test with kernel version variations

Future Directions#

  1. eBPF for Databases: Query optimization and caching
  2. Edge Computing: Running eBPF on IoT devices
  3. Hardware Offload: SmartNIC eBPF processing
  4. Cross-Platform: Windows and macOS support

Integration with Modern Stack#

  • Service Mesh: Replacing sidecars with eBPF
  • Serverless: Fast cold starts with eBPF
  • GitOps: eBPF program deployment via Git
  • AIOps: ML-driven eBPF program generation

Conclusion#

eBPF has proven itself as a transformative technology across security, networking, and observability domains. Its ability to provide deep kernel-level insights and control with minimal overhead has made it the foundation for next-generation infrastructure tools.

From Cloudflare’s DDoS protection to Netflix’s performance monitoring, eBPF is powering critical systems at scale. As the ecosystem matures and tools become more accessible, we can expect eBPF adoption to accelerate across all aspects of systems engineering.

The examples in this post demonstrate that eBPF is not just a research project or experimental technology—it’s production-ready and solving real problems today. Whether you’re building security tools, optimizing networks, or improving observability, eBPF provides the primitives to innovate at the kernel level.


Getting Started with These Use Cases#

  1. Security: Start with Falco or Tetragon for runtime security
  2. Networking: Try Cilium for Kubernetes or Katran for load balancing
  3. Observability: Use bpftrace for ad-hoc analysis or Pixie for Kubernetes

Next up: A hands-on guide to eBPF programming with practical examples and exercises!

https://mranv.pages.dev/posts/ebpf-use-cases-security-networking-observability/
Author
Anubhav Gain
Published at
2025-01-16
License
CC BY-NC-SA 4.0