eBPF TLS Tracing: The Past, Present and Future of Encrypted Traffic Observability
Instrumentation-free, eBPF-based observability tools such as DeepFlow and Pixie aim to provide comprehensive observability coverage out of the box. These tools use eBPF to inspect all network traffic by hooking into Linux's network stack at the syscall layer. However, the prevalence of TLS encryption obscures this global view, forcing tools to probe applications higher in the software stack (at the TLS library layer) to regain access to plaintext data.
This evolution moves eBPF instrumentation from stable kernel syscalls to unstable user space interfaces (TLS libraries, application binaries, etc.). This comprehensive guide explores the TLS tracing tactics used by open source projects, how they’ve evolved to address unstable user space library interfaces, and where the future is headed.
The TLS Observability Challenge
```mermaid
graph TB
    subgraph "Traditional Network Observability"
        T1[Network Packets] --> T2[Syscall Hooks]
        T2 --> T3[Plaintext Traffic]
        T3 --> T4[Application Insights]
    end

    subgraph "TLS-Encrypted Observability Challenge"
        E1[Encrypted Packets] --> E2[Syscall Hooks]
        E2 --> E3[Encrypted Traffic]
        E3 --> E4[Limited Insights]
        E5[TLS Library Hooks] --> E6[Plaintext Access]
        E6 --> E7[Complete Observability]
    end

    style T3 fill:#c8e6c9
    style E3 fill:#ffcdd2
    style E6 fill:#c8e6c9
```
Why TLS Complicates eBPF Observability
Modern applications overwhelmingly use TLS encryption for security:
- Encrypted Network Traffic: Traditional packet capture reveals only encrypted payloads
- Lost Application Context: HTTP headers, API endpoints, and payload content become invisible
- Incomplete Service Maps: Service-to-service communication patterns are obscured
- Limited Performance Analysis: Request/response timing and error analysis become impossible
Understanding TLS Library Architecture
Applications typically use battle-tested TLS libraries rather than implementing cryptography directly. These libraries provide consistent APIs for encryption and decryption operations.
Common TLS Libraries
```mermaid
graph LR
    subgraph "TLS Library Ecosystem"
        subgraph "Popular Libraries"
            L1[OpenSSL] --> L2[BoringSSL]
            L3[LibreSSL] --> L4[GnuTLS]
            L5[mbedTLS] --> L6[Go crypto/tls]
        end

        subgraph "Linking Types"
            LT1[Dynamic Linking] --> LT2[Static Linking]
            LT3[Shared Libraries] --> LT4[Embedded Code]
        end
    end

    style L1 fill:#e1f5fe
    style L2 fill:#f3e5f5
    style LT1 fill:#e8f5e8
    style LT2 fill:#fff3e0
```
Key TLS API Functions
The critical interception points for TLS tracing:
```c
// OpenSSL/BoringSSL API
int SSL_write(SSL *ssl, const void *data, int num);
int SSL_read(SSL *ssl, void *data, int num);

// GnuTLS API
ssize_t gnutls_record_send(gnutls_session_t session, const void *data, size_t data_size);
ssize_t gnutls_record_recv(gnutls_session_t session, void *data, size_t data_size);

// Go crypto/tls (requires a different approach)
// Interfaces with the Go runtime and garbage collector
```
Application vs TLS Library Data Flow
Understanding the data flow difference between encrypted and unencrypted applications is crucial for effective tracing:
```mermaid
sequenceDiagram
    participant App as Application
    participant TLS as TLS Library
    participant Kernel as Kernel
    participant Network as Network

    Note over App,Network: Unencrypted Application
    App->>Kernel: send(socket, plaintext_data)
    Kernel->>Network: plaintext_data

    Note over App,Network: TLS-Encrypted Application
    App->>TLS: SSL_write(ssl, plaintext_data)
    TLS->>TLS: Encrypt data
    TLS->>Kernel: send(socket, encrypted_data)
    Kernel->>Network: encrypted_data

    Note over App,Network: eBPF Tracing Points
    rect rgb(200, 230, 200)
        Note over TLS: Hook SSL_write/SSL_read for plaintext
    end
    rect rgb(255, 205, 210)
        Note over Kernel: Syscall hooks only see encrypted data
    end
```
Challenges in TLS Library Instrumentation
1. Library Diversity and Versioning
```mermaid
graph TB
    subgraph "TLS Library Challenges"
        subgraph "Version Management"
            V1[OpenSSL 1.1.0] --> V2[Different Offsets]
            V3[OpenSSL 1.1.1] --> V4[Changed Structures]
            V5[OpenSSL 3.0.0] --> V6[API Evolution]
        end

        subgraph "Linking Complexity"
            LC1[Dynamic Linking] --> LC2[Symbol Resolution]
            LC3[Static Linking] --> LC4[Binary Analysis]
            LC5[Stripped Binaries] --> LC6[No Symbols]
        end

        subgraph "Runtime Variations"
            RV1[BIO Native] --> RV2[Library-managed IO]
            RV3[Custom BIO] --> RV4[Application-managed IO]
        end
    end

    style V2 fill:#ffcdd2
    style V4 fill:#ffcdd2
    style V6 fill:#ffcdd2
    style LC4 fill:#ffcdd2
    style LC6 fill:#ffcdd2
    style RV4 fill:#fff3e0
```
2. Memory Layout Dependencies
Different TLS library versions have incompatible memory layouts:
```c
// OpenSSL 1.1.0 SSL structure (simplified)
struct ssl_st_v1_1_0 {
    int version;
    BIO *rbio;   // Offset: 8 bytes
    BIO *wbio;   // Offset: 16 bytes
    // ... other fields
};

// OpenSSL 1.1.1 SSL structure (simplified)
struct ssl_st_v1_1_1 {
    int version;
    int type;    // New field!
    BIO *rbio;   // Offset: 12 bytes (changed!)
    BIO *wbio;   // Offset: 20 bytes (changed!)
    // ... other fields
};

// OpenSSL 3.0.0 SSL structure (simplified)
struct ssl_st_v3_0_0 {
    OSSL_LIB_CTX *libctx;  // New field!
    int version;
    int type;
    BIO *rbio;   // Offset: 20 bytes (changed again!)
    BIO *wbio;   // Offset: 28 bytes (changed again!)
    // ... other fields
};
```
3. Connection Identity Extraction
To reconstruct service spans, eBPF programs need to associate TLS payloads with specific connections:
```c
// Traditional approach: navigate the SSL structure for the socket FD
typedef struct ssl_st SSL;
typedef struct bio_st BIO;

struct ssl_st {
    BIO *rbio;
    BIO *wbio;
    // ... many other fields
};

struct bio_st {
    int num;  // Socket file descriptor (at a specific offset)
    // ... other fields
};

// eBPF code to extract the socket FD (fragile!)
SEC("uprobe/SSL_write")
int trace_ssl_write(struct pt_regs *ctx) {
    SSL *ssl = (SSL *)PT_REGS_PARM1(ctx);

    // These offsets change between versions!
    BIO *rbio;
    bpf_probe_read_user(&rbio, sizeof(rbio), (char *)ssl + RBIO_OFFSET);

    int socket_fd;
    bpf_probe_read_user(&socket_fd, sizeof(socket_fd), (char *)rbio + NUM_OFFSET);

    // Use socket_fd for connection correlation
    return 0;
}
```
The Past: Memory Offset-Dependent Probes
Early eBPF TLS tracing implementations relied on navigating TLS library data structures directly, specifically the SSL struct to extract socket file descriptors for connection identification.
Implementation Approach
```mermaid
graph TB
    subgraph "Memory Offset-Based Tracing"
        subgraph "Detection Phase"
            D1[Identify TLS Library] --> D2[Determine Version]
            D2 --> D3[Load Memory Offsets]
        end

        subgraph "Tracing Phase"
            T1[Hook SSL_write/SSL_read] --> T2[Extract SSL Struct]
            T2 --> T3[Navigate Memory Offsets]
            T3 --> T4[Extract Socket FD]
            T4 --> T5[Correlate Connection]
        end

        subgraph "Challenges"
            C1[Version-Specific Offsets] --> C2[Maintenance Overhead]
            C3[Stripped Binaries] --> C4[Symbol Resolution]
            C5[Static Linking] --> C6[Binary Analysis]
        end
    end

    style C1 fill:#ffcdd2
    style C2 fill:#ffcdd2
    style C3 fill:#ffcdd2
    style C4 fill:#ffcdd2
    style C5 fill:#ffcdd2
    style C6 fill:#ffcdd2
```
Example Implementation
```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

// Version-specific offsets (maintenance nightmare!)
#define OPENSSL_1_1_0_RBIO_OFFSET 8
#define OPENSSL_1_1_1_RBIO_OFFSET 12
#define OPENSSL_3_0_0_RBIO_OFFSET 20

#define OPENSSL_1_1_0_BIO_NUM_OFFSET 16
#define OPENSSL_1_1_1_BIO_NUM_OFFSET 16
#define OPENSSL_3_0_0_BIO_NUM_OFFSET 24

struct tls_data {
    __u32 pid;
    __u32 tid;
    __u32 socket_fd;
    __u64 timestamp;
    char data[256];
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);
} tls_events SEC(".maps");

// Version detection map
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, __u32);
    __type(value, __u32);
} openssl_versions SEC(".maps");

// Fragile memory offset extraction
static int extract_socket_fd(void *ssl_ptr, __u32 pid) {
    __u32 *version = bpf_map_lookup_elem(&openssl_versions, &pid);
    if (!version) {
        return -1;  // Unknown version
    }

    void *rbio_ptr;
    int socket_fd;

    switch (*version) {
    case 0x10100000:  // OpenSSL 1.1.0
        bpf_probe_read_user(&rbio_ptr, sizeof(rbio_ptr),
                            (char *)ssl_ptr + OPENSSL_1_1_0_RBIO_OFFSET);
        bpf_probe_read_user(&socket_fd, sizeof(socket_fd),
                            (char *)rbio_ptr + OPENSSL_1_1_0_BIO_NUM_OFFSET);
        break;

    case 0x10101000:  // OpenSSL 1.1.1
        bpf_probe_read_user(&rbio_ptr, sizeof(rbio_ptr),
                            (char *)ssl_ptr + OPENSSL_1_1_1_RBIO_OFFSET);
        bpf_probe_read_user(&socket_fd, sizeof(socket_fd),
                            (char *)rbio_ptr + OPENSSL_1_1_1_BIO_NUM_OFFSET);
        break;

    case 0x30000000:  // OpenSSL 3.0.0
        bpf_probe_read_user(&rbio_ptr, sizeof(rbio_ptr),
                            (char *)ssl_ptr + OPENSSL_3_0_0_RBIO_OFFSET);
        bpf_probe_read_user(&socket_fd, sizeof(socket_fd),
                            (char *)rbio_ptr + OPENSSL_3_0_0_BIO_NUM_OFFSET);
        break;

    default:
        return -1;  // Unsupported version
    }

    return socket_fd;
}

SEC("uprobe/SSL_write")
int trace_ssl_write_entry(struct pt_regs *ctx) {
    void *ssl = (void *)PT_REGS_PARM1(ctx);
    void *data = (void *)PT_REGS_PARM2(ctx);
    int len = (int)PT_REGS_PARM3(ctx);

    __u64 pid_tgid = bpf_get_current_pid_tgid();
    __u32 pid = pid_tgid >> 32;
    __u32 tid = (__u32)pid_tgid;

    // Extract the socket FD using memory offsets
    int socket_fd = extract_socket_fd(ssl, pid);
    if (socket_fd < 0) {
        return 0;  // Failed to extract
    }

    struct tls_data *event = bpf_ringbuf_reserve(&tls_events, sizeof(*event), 0);
    if (!event) {
        return 0;
    }

    event->pid = pid;
    event->tid = tid;
    event->socket_fd = socket_fd;
    event->timestamp = bpf_ktime_get_ns();

    // Copy plaintext data
    int copy_len = len > 255 ? 255 : len;
    bpf_probe_read_user(event->data, copy_len, data);
    event->data[copy_len] = '\0';

    bpf_ringbuf_submit(event, 0);
    return 0;
}

char _license[] SEC("license") = "GPL";
```
Problems with Memory Offset Approach
- Version Fragility: Every TLS library version potentially breaks offset assumptions
- Maintenance Overhead: Constant updates required for new library versions
- Limited Library Support: Difficult to extend to multiple TLS libraries
- Rolling Release Challenges: Libraries like BoringSSL lack version indicators
- Static Linking Issues: Symbols may not be available in stripped binaries
The Present: Syscall-Based Connection Correlation
Modern eBPF TLS tracing has evolved to leverage the call stack relationship between TLS library functions and underlying syscalls, eliminating dependency on memory offsets.
BIO Native vs Custom BIO Applications
Understanding application architecture is crucial for effective TLS tracing:
```mermaid
graph TB
    subgraph "BIO Native Applications"
        BN1[Application] --> BN2[SSL_write]
        BN2 --> BN3[TLS Library manages IO]
        BN3 --> BN4[send/recv syscalls]
        BN4 --> BN5[Network]
    end

    subgraph "Custom BIO Applications"
        CB1[Application] --> CB2[Custom IO Handler]
        CB2 --> CB3[send/recv syscalls]
        CB1 --> CB4[SSL_write]
        CB4 --> CB5["TLS Library (encryption only)"]
    end

    style BN3 fill:#c8e6c9
    style BN4 fill:#c8e6c9
    style CB2 fill:#fff3e0
    style CB5 fill:#fff3e0
```
Modern Syscall-Based Implementation
```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct tls_context {
    __u32 pid;
    __u32 tid;
    void *ssl_ptr;
    void *data_ptr;
    __u32 data_len;
    __u64 timestamp;
};

struct connection_data {
    __u32 socket_fd;
    __u64 timestamp;
    char data[512];
    __u32 data_len;
    __u8 is_read;  // 0 for write, 1 for read
};

// Map to correlate TLS calls with syscalls
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u64);    // pid_tgid
    __type(value, struct tls_context);
} active_tls_calls SEC(".maps");

// Output buffer for TLS data
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 1024 * 1024);
} tls_data_events SEC(".maps");

// Hook TLS library functions
SEC("uprobe/SSL_write")
int trace_ssl_write_entry(struct pt_regs *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();

    struct tls_context tls_ctx = {
        .pid = pid_tgid >> 32,
        .tid = (__u32)pid_tgid,
        .ssl_ptr = (void *)PT_REGS_PARM1(ctx),
        .data_ptr = (void *)PT_REGS_PARM2(ctx),
        .data_len = (__u32)PT_REGS_PARM3(ctx),
        .timestamp = bpf_ktime_get_ns(),
    };

    // Store context for syscall correlation
    bpf_map_update_elem(&active_tls_calls, &pid_tgid, &tls_ctx, BPF_ANY);

    return 0;
}

SEC("uprobe/SSL_read")
int trace_ssl_read_entry(struct pt_regs *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();

    struct tls_context tls_ctx = {
        .pid = pid_tgid >> 32,
        .tid = (__u32)pid_tgid,
        .ssl_ptr = (void *)PT_REGS_PARM1(ctx),
        .data_ptr = (void *)PT_REGS_PARM2(ctx),
        .data_len = (__u32)PT_REGS_PARM3(ctx),
        .timestamp = bpf_ktime_get_ns(),
    };

    bpf_map_update_elem(&active_tls_calls, &pid_tgid, &tls_ctx, BPF_ANY);

    return 0;
}

// Syscall hooks to capture the socket FD
SEC("tracepoint/syscalls/sys_enter_sendto")
int trace_sendto_enter(struct trace_event_raw_sys_enter *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();

    // Check whether this syscall belongs to an active TLS call
    struct tls_context *tls_ctx = bpf_map_lookup_elem(&active_tls_calls, &pid_tgid);
    if (!tls_ctx) {
        return 0;  // Not a TLS-related syscall
    }

    int socket_fd = (int)ctx->args[0];

    struct connection_data *event =
        bpf_ringbuf_reserve(&tls_data_events, sizeof(*event), 0);
    if (!event) {
        return 0;
    }

    event->socket_fd = socket_fd;
    event->timestamp = tls_ctx->timestamp;
    event->is_read = 0;  // This is a write operation

    // Copy plaintext data (user-space memory)
    __u32 copy_len = tls_ctx->data_len > 511 ? 511 : tls_ctx->data_len;
    bpf_probe_read_user(event->data, copy_len, tls_ctx->data_ptr);
    event->data[copy_len] = '\0';
    event->data_len = copy_len;

    bpf_ringbuf_submit(event, 0);

    // Clean up the context
    bpf_map_delete_elem(&active_tls_calls, &pid_tgid);

    return 0;
}

SEC("tracepoint/syscalls/sys_exit_recvfrom")
int trace_recvfrom_exit(struct trace_event_raw_sys_exit *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();

    struct tls_context *tls_ctx = bpf_map_lookup_elem(&active_tls_calls, &pid_tgid);
    if (!tls_ctx) {
        return 0;
    }

    // Read completion is handled in the SSL_read return probe
    return 0;
}

SEC("uretprobe/SSL_read")
int trace_ssl_read_return(struct pt_regs *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();

    struct tls_context *tls_ctx = bpf_map_lookup_elem(&active_tls_calls, &pid_tgid);
    if (!tls_ctx) {
        return 0;
    }

    int bytes_read = (int)PT_REGS_RC(ctx);
    if (bytes_read <= 0) {
        bpf_map_delete_elem(&active_tls_calls, &pid_tgid);
        return 0;
    }

    struct connection_data *event =
        bpf_ringbuf_reserve(&tls_data_events, sizeof(*event), 0);
    if (!event) {
        bpf_map_delete_elem(&active_tls_calls, &pid_tgid);
        return 0;
    }

    // For reads, the socket FD must be captured separately (e.g. from the
    // recvfrom hook above); a real implementation is more sophisticated
    event->socket_fd = 0;  // Would need additional logic
    event->timestamp = tls_ctx->timestamp;
    event->is_read = 1;

    // Copy decrypted data
    __u32 copy_len = bytes_read > 511 ? 511 : bytes_read;
    bpf_probe_read_user(event->data, copy_len, tls_ctx->data_ptr);
    event->data[copy_len] = '\0';
    event->data_len = copy_len;

    bpf_ringbuf_submit(event, 0);
    bpf_map_delete_elem(&active_tls_calls, &pid_tgid);

    return 0;
}

char _license[] SEC("license") = "GPL";
```
Advanced Call Stack Analysis
For more robust connection correlation, we can analyze the call stack:
```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

#define MAX_STACK_DEPTH 20
#define MAX_SYMBOL_LEN 64

struct stack_trace {
    __u64 ip[MAX_STACK_DEPTH];
    __u32 depth;
};

struct call_context {
    struct stack_trace stack;
    __u64 timestamp;
    void *ssl_ptr;
    void *data_ptr;
    __u32 data_len;
};

// Enhanced context tracking
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u64);
    __type(value, struct call_context);
} enhanced_tls_calls SEC(".maps");

// Stack trace analysis
static int analyze_call_stack(struct pt_regs *regs, struct call_context *call_ctx) {
    // Capture the current user-space stack trace
    long ret = bpf_get_stack(regs, call_ctx->stack.ip,
                             sizeof(call_ctx->stack.ip), BPF_F_USER_STACK);
    if (ret <= 0) {
        return -1;
    }
    call_ctx->stack.depth = ret / sizeof(__u64);

    // Analyze the stack for syscall patterns
    for (int i = 0; i < MAX_STACK_DEPTH; i++) {
        if (i >= call_ctx->stack.depth)
            break;
        // Check whether send/recv wrappers appear in the call stack.
        // This requires symbol resolution in user space, or pattern
        // matching on known function addresses.
    }

    return 0;
}

SEC("uprobe/SSL_write")
int enhanced_ssl_write_entry(struct pt_regs *ctx) {
    __u64 pid_tgid = bpf_get_current_pid_tgid();

    struct call_context call_ctx = {
        .timestamp = bpf_ktime_get_ns(),
        .ssl_ptr = (void *)PT_REGS_PARM1(ctx),
        .data_ptr = (void *)PT_REGS_PARM2(ctx),
        .data_len = (__u32)PT_REGS_PARM3(ctx),
    };

    // Analyze the call stack for BIO native detection
    if (analyze_call_stack(ctx, &call_ctx) < 0) {
        return 0;
    }

    bpf_map_update_elem(&enhanced_tls_calls, &pid_tgid, &call_ctx, BPF_ANY);

    return 0;
}

char _license[] SEC("license") = "GPL";
```
Integrity Checking Mechanism
To validate the accuracy of syscall-based correlation:
```c
struct integrity_stats {
    __u64 total_correlations;
    __u64 successful_correlations;
    __u64 failed_correlations;
    __u64 false_positives;
};

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, struct integrity_stats);
} integrity_metrics SEC(".maps");

static void update_integrity_stats(int success) {
    __u32 key = 0;
    struct integrity_stats *stats = bpf_map_lookup_elem(&integrity_metrics, &key);
    if (!stats) {
        return;
    }

    __sync_fetch_and_add(&stats->total_correlations, 1);

    if (success) {
        __sync_fetch_and_add(&stats->successful_correlations, 1);
    } else {
        __sync_fetch_and_add(&stats->failed_correlations, 1);
    }
}

// Use in syscall correlation
SEC("tracepoint/syscalls/sys_enter_sendto")
int validated_sendto_enter(struct trace_event_raw_sys_enter *ctx) {
    // ... correlation logic (yields tls_ctx and socket_fd as above) ...

    // Validate correlation accuracy
    int correlation_success = validate_correlation(tls_ctx, socket_fd);
    update_integrity_stats(correlation_success);

    return 0;
}
```
Library Support Matrix
Current coverage across different TLS implementations:
```mermaid
graph TB
    subgraph "TLS Library Support Matrix"
        subgraph "Memory Offset Method"
            MO1[OpenSSL Dynamic] --> MO2[Partial Support]
            MO3[OpenSSL Static] --> MO4[Limited Support]
            MO5[BoringSSL] --> MO6[No Support]
            MO7[LibreSSL] --> MO8[Version Dependent]
        end

        subgraph "Syscall-Based Method"
            SB1[OpenSSL Dynamic] --> SB2[Full Support]
            SB3[OpenSSL Static] --> SB4[Full Support]
            SB5[BoringSSL] --> SB6[Full Support]
            SB7[LibreSSL] --> SB8[Full Support]
            SB9[Go crypto/tls] --> SB10[Limited Support]
        end

        subgraph "Application Types"
            AT1[BIO Native] --> AT2[Full Coverage]
            AT3[Custom BIO] --> AT4[Partial Coverage]
            AT5[Async Applications] --> AT6[Complex Patterns]
        end
    end

    style MO2 fill:#fff3e0
    style MO4 fill:#ffcdd2
    style MO6 fill:#ffcdd2
    style MO8 fill:#fff3e0
    style SB2 fill:#c8e6c9
    style SB4 fill:#c8e6c9
    style SB6 fill:#c8e6c9
    style SB8 fill:#c8e6c9
    style SB10 fill:#fff3e0
    style AT2 fill:#c8e6c9
    style AT4 fill:#fff3e0
    style AT6 fill:#fff3e0
```
User-Space Processing and Correlation
The eBPF programs send raw TLS data to user space for processing:
```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <bpf/libbpf.h>
#include <bpf/bpf.h>

struct tls_span {
    uint32_t socket_fd;
    uint64_t start_time;
    uint64_t end_time;
    char *request_data;
    char *response_data;
    uint32_t request_len;
    uint32_t response_len;
};

struct service_map {
    char service_name[64];
    char endpoint[128];
    uint32_t request_count;
    uint64_t total_latency;
    uint64_t error_count;
};

// Mirrors the event struct emitted by the eBPF side
struct connection_data {
    uint32_t socket_fd;
    uint64_t timestamp;
    char data[512];
    uint32_t data_len;
    uint8_t is_read;  // 0 for write, 1 for read
};

// Connection tracking for span reconstruction
struct connection_tracker {
    struct tls_span spans[10240];
    struct service_map services[1024];
    int span_count;
    int service_count;
};

static struct connection_tracker tracker = {0};

static void process_http_request(struct connection_data *event);
static void process_http_response(struct connection_data *event);
static void create_request_span(uint32_t socket_fd, const char *method,
                                const char *path, uint64_t timestamp);
static void complete_response_span(uint32_t socket_fd, int status_code,
                                   uint64_t timestamp);
static void export_span_metrics(struct tls_span *span, int status_code,
                                uint64_t latency);

// Service-level aggregation (stub in this excerpt)
static void update_service_metrics(const char *path, uint64_t timestamp) {
    (void)path;
    (void)timestamp;
}

// Process TLS events from eBPF
static int handle_tls_event(void *ctx, void *data, size_t data_sz) {
    struct connection_data *event = data;

    printf("TLS Event: FD=%u, Len=%u, Read=%u, Time=%llu\n",
           event->socket_fd, event->data_len, event->is_read,
           (unsigned long long)event->timestamp);

    // Reconstruct HTTP requests/responses
    if (strncmp(event->data, "GET ", 4) == 0 ||
        strncmp(event->data, "POST ", 5) == 0 ||
        strncmp(event->data, "PUT ", 4) == 0 ||
        strncmp(event->data, "DELETE ", 7) == 0) {
        // This is an HTTP request
        process_http_request(event);
    } else if (strncmp(event->data, "HTTP/", 5) == 0) {
        // This is an HTTP response
        process_http_response(event);
    } else {
        // Unknown protocol or partial data
        printf("Unknown protocol data: %.*s\n", 50, event->data);
    }

    return 0;
}

static void process_http_request(struct connection_data *event) {
    char method[16], path[256], version[16];

    // Parse the HTTP request line
    if (sscanf(event->data, "%15s %255s %15s", method, path, version) == 3) {
        printf("HTTP Request: %s %s %s (FD: %u)\n",
               method, path, version, event->socket_fd);

        // Create or update the span
        create_request_span(event->socket_fd, method, path, event->timestamp);

        // Extract service information
        update_service_metrics(path, event->timestamp);
    }
}

static void process_http_response(struct connection_data *event) {
    char version[16];
    int status_code;
    char status_text[64];

    // Parse the HTTP response line
    if (sscanf(event->data, "%15s %d %63s", version, &status_code, status_text) == 3) {
        printf("HTTP Response: %s %d %s (FD: %u)\n",
               version, status_code, status_text, event->socket_fd);

        // Complete the span
        complete_response_span(event->socket_fd, status_code, event->timestamp);
    }
}

static void create_request_span(uint32_t socket_fd, const char *method,
                                const char *path, uint64_t timestamp) {
    if (tracker.span_count >= 10240) {
        return;  // Buffer full
    }

    struct tls_span *span = &tracker.spans[tracker.span_count++];
    span->socket_fd = socket_fd;
    span->start_time = timestamp;
    span->request_data = strdup(method);
    span->request_len = strlen(method);

    printf("Created span for %s %s on FD %u\n", method, path, socket_fd);
}

static void complete_response_span(uint32_t socket_fd, int status_code,
                                   uint64_t timestamp) {
    // Find the matching request span
    for (int i = 0; i < tracker.span_count; i++) {
        if (tracker.spans[i].socket_fd == socket_fd &&
            tracker.spans[i].end_time == 0) {
            tracker.spans[i].end_time = timestamp;
            uint64_t latency = timestamp - tracker.spans[i].start_time;

            printf("Completed span on FD %u: latency=%llu ns, status=%d\n",
                   socket_fd, (unsigned long long)latency, status_code);

            // Generate observability metrics
            export_span_metrics(&tracker.spans[i], status_code, latency);
            break;
        }
    }
}

// Export metrics in OpenTelemetry or Prometheus format
static void export_span_metrics(struct tls_span *span, int status_code,
                                uint64_t latency) {
    // OpenTelemetry span export
    printf("OTEL Span: start=%llu, end=%llu, duration=%llu, status=%d\n",
           (unsigned long long)span->start_time,
           (unsigned long long)span->end_time,
           (unsigned long long)latency, status_code);

    // Prometheus metrics
    printf("http_request_duration_seconds{method=\"%s\",status=\"%d\"} %.6f\n",
           span->request_data, status_code, latency / 1e9);

    printf("http_requests_total{method=\"%s\",status=\"%d\"} 1\n",
           span->request_data, status_code);
}

int main() {
    struct bpf_object *obj;
    struct ring_buffer *rb;
    int err;

    // Load the eBPF program
    obj = bpf_object__open_file("modern_tls_tracing.bpf.o", NULL);
    if (libbpf_get_error(obj)) {
        fprintf(stderr, "Failed to open eBPF object\n");
        return 1;
    }

    err = bpf_object__load(obj);
    if (err) {
        fprintf(stderr, "Failed to load eBPF object\n");
        return 1;
    }

    // Attach programs
    struct bpf_link *links[10];
    int link_count = 0;

    struct bpf_program *prog;
    bpf_object__for_each_program(prog, obj) {
        if (link_count >= 10) {
            break;
        }
        links[link_count] = bpf_program__attach(prog);
        if (libbpf_get_error(links[link_count])) {
            fprintf(stderr, "Failed to attach program %s\n",
                    bpf_program__name(prog));
            continue;
        }
        link_count++;
    }

    // Set up the ring buffer
    int map_fd = bpf_object__find_map_fd_by_name(obj, "tls_data_events");
    rb = ring_buffer__new(map_fd, handle_tls_event, NULL, NULL);
    if (!rb) {
        fprintf(stderr, "Failed to create ring buffer\n");
        return 1;
    }

    printf("TLS tracing started. Press Ctrl-C to exit.\n");

    // Process events
    while (1) {
        err = ring_buffer__poll(rb, 100);
        if (err < 0) {
            fprintf(stderr, "Error polling ring buffer: %d\n", err);
            break;
        }
    }

    // Cleanup
    ring_buffer__free(rb);
    for (int i = 0; i < link_count; i++) {
        bpf_link__destroy(links[i]);
    }
    bpf_object__close(obj);

    return 0;
}
```
The Future: Next Generation TLS Tracing
The evolution of TLS tracing continues with several promising directions:
1. Custom BIO Application Support
```mermaid
graph TB
    subgraph "Future Custom BIO Support"
        subgraph "Detection Mechanisms"
            DM1[Application Pattern Analysis] --> DM2[IO Flow Correlation]
            DM3[Event Sequence Detection] --> DM4[Asynchronous IO Tracking]
        end

        subgraph "Advanced Correlation"
            AC1[Multi-threaded Tracking] --> AC2[Connection Pool Mapping]
            AC3[Event Loop Integration] --> AC4[Buffer Management Tracking]
        end

        subgraph "Performance Optimization"
            PO1[Selective Instrumentation] --> PO2[Adaptive Sampling]
            PO3[Hardware Acceleration] --> PO4[Zero-Copy Processing]
        end
    end

    style DM1 fill:#e1f5fe
    style AC1 fill:#f3e5f5
    style PO1 fill:#e8f5e8
```
2. Multi-Language Runtime Support
Extending beyond C/C++ TLS libraries to cover other language ecosystems:
```c
// go_tls_tracing.bpf.c - Future Go support
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

// Go crypto/tls specific structures
struct go_tls_conn {
    void *conn;      // net.Conn interface
    void *config;    // *tls.Config
    __u32 state;     // connection state
    // ... Go-specific fields
};

// Trace Go TLS operations
SEC("uprobe/crypto_tls_Conn_Write")
int trace_go_tls_write(struct pt_regs *ctx) {
    // Go's calling convention differs from C: parameters may live on the
    // goroutine stack (or in registers since Go 1.17), and goroutines
    // migrate between OS threads, which breaks pid_tgid-based correlation

    return 0;
}

// Java TLS support
SEC("uprobe/javax_net_ssl_SSLSocket_write")
int trace_java_tls_write(struct pt_regs *ctx) {
    // JVM integration requires understanding Java object layout
    // and garbage collector interactions

    return 0;
}

char _license[] SEC("license") = "GPL";
```
3. Kernel-Level TLS Offload Integration
Future kernels may provide native TLS processing capabilities:
```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

// Hook into kernel TLS (kTLS) infrastructure
SEC("tp/net/tls_device_tx_resync")
int trace_ktls_tx(struct trace_event_raw_tls_device_tx_resync *ctx) {
    // Access TLS metadata at the kernel level without any library
    // dependency; the socket inode number serves as a stable connection
    // identifier (pointer chasing in kernel structs requires CO-RE reads)
    __u64 sock_inode = BPF_CORE_READ(ctx, sk, sk_socket, file, f_inode, i_ino);

    // Process TLS record data directly from the kernel
    return 0;
}

SEC("tp/net/tls_device_rx_resync")
int trace_ktls_rx(struct trace_event_raw_tls_device_rx_resync *ctx) {
    // Handle received TLS data
    return 0;
}

char _license[] SEC("license") = "GPL";
```
4. AI-Powered Protocol Detection
Machine learning models for automatic protocol identification:
```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

#ifndef TC_ACT_OK
#define TC_ACT_OK 0
#endif

struct protocol_features {
    __u8 first_bytes[16];
    __u32 packet_sizes[10];
    __u32 timing_intervals[5];
    __u8 entropy_score;
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 64 * 1024);
} ml_features SEC(".maps");

// Integer proxy for payload entropy: the fraction of distinct byte
// values, scaled to 0-255. (True Shannon entropy needs floating point,
// which eBPF programs cannot use.)
static __u8 calculate_entropy(const __u8 *data, int len) {
    __u8 seen[32] = {0};  // 256-bit bitmap of observed byte values
    int distinct = 0;

    if (len <= 0 || len > 16)
        len = 16;

    for (int i = 0; i < len; i++) {
        __u8 b = data[i];
        if (!(seen[b >> 3] & (1 << (b & 7)))) {
            seen[b >> 3] |= 1 << (b & 7);
            distinct++;
        }
    }

    return (__u8)((distinct * 255) / len);
}

SEC("tc")
int extract_protocol_features(struct __sk_buff *skb) {
    struct protocol_features *features;

    features = bpf_ringbuf_reserve(&ml_features, sizeof(*features), 0);
    if (!features) {
        return TC_ACT_OK;
    }

    // Extract features for the ML model
    if (bpf_skb_load_bytes(skb, 0, features->first_bytes, 16) < 0) {
        bpf_ringbuf_discard(features, 0);
        return TC_ACT_OK;
    }
    features->entropy_score = calculate_entropy(features->first_bytes, 16);

    bpf_ringbuf_submit(features, 0);
    return TC_ACT_OK;
}

char _license[] SEC("license") = "GPL";
```
5. Hardware-Accelerated Processing
Leveraging specialized hardware for TLS processing:
```c
// Integration with Intel QuickAssist Technology (QAT)
// or other cryptographic accelerators

struct crypto_context {
    __u64 hw_session_id;
    __u32 cipher_suite;
    __u8 key_material[64];
};

SEC("tp/crypto/qat_aead_encrypt")
int trace_hw_encryption(struct trace_event_raw_qat_aead_encrypt *ctx) {
    // Hook into hardware crypto operations
    // Correlate with TLS sessions

    return 0;
}

char _license[] SEC("license") = "GPL";
```
Performance Optimization Strategies
1. Selective Instrumentation
```c
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct instrumentation_config {
    __u8 trace_writes;
    __u8 trace_reads;
    __u8 sample_rate;
    __u32 target_pids[100];
    __u32 target_count;
};

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, struct instrumentation_config);
} config_map SEC(".maps");

// Full capture path, defined elsewhere in the program
int trace_ssl_operation(struct pt_regs *ctx);

SEC("uprobe/SSL_write")
int selective_ssl_write(struct pt_regs *ctx) {
    __u32 key = 0;
    struct instrumentation_config *config = bpf_map_lookup_elem(&config_map, &key);

    if (!config || !config->trace_writes) {
        return 0;
    }

    __u32 pid = bpf_get_current_pid_tgid() >> 32;

    // Check whether this PID should be traced
    int should_trace = 0;
    for (int i = 0; i < 100; i++) {
        if (i >= config->target_count)
            break;
        if (config->target_pids[i] == pid) {
            should_trace = 1;
            break;
        }
    }

    if (!should_trace) {
        return 0;
    }

    // Apply sampling: admit roughly sample_rate percent of calls
    __u32 random = bpf_get_prandom_u32();
    if ((random % 100) >= config->sample_rate) {
        return 0;
    }

    // Proceed with instrumentation
    return trace_ssl_operation(ctx);
}

char _license[] SEC("license") = "GPL";
```
2. Memory-Efficient Data Structures
```c
// Use per-CPU maps to avoid contention
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_HASH);
    __uint(max_entries, 10240);
    __type(key, __u64);
    __type(value, struct tls_context);
} percpu_tls_contexts SEC(".maps");

// Circular buffer for high-frequency events
#define CIRCULAR_BUFFER_SIZE 1024

struct circular_event {
    __u64 timestamp;
    __u32 socket_fd;
    __u16 data_len;
    __u8 data[32];  // Truncated payload
};

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, CIRCULAR_BUFFER_SIZE);
    __type(key, __u32);
    __type(value, struct circular_event);
} circular_buffer SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
} buffer_index SEC(".maps");

SEC("uprobe/SSL_write")
int efficient_ssl_write(struct pt_regs *ctx) {
    __u32 key = 0;
    __u32 *index = bpf_map_lookup_elem(&buffer_index, &key);
    if (!index) {
        return 0;
    }

    __u32 current_index = (*index) % CIRCULAR_BUFFER_SIZE;

    struct circular_event *event = bpf_map_lookup_elem(&circular_buffer, &current_index);
    if (!event) {
        return 0;
    }

    // Update event data
    event->timestamp = bpf_ktime_get_ns();
    // ... populate other fields

    // Increment the index (wraps via the modulo above)
    *index = current_index + 1;
    bpf_map_update_elem(&buffer_index, &key, index, BPF_ANY);

    return 0;
}

char _license[] SEC("license") = "GPL";
```
Conclusion
The evolution of eBPF TLS tracing represents a fascinating journey from fragile, maintenance-heavy approaches to robust, scalable solutions that provide comprehensive encrypted traffic observability.
Key Takeaways
- Past Limitations: Memory offset-dependent approaches were fragile and difficult to maintain across TLS library versions
- Present Solutions: Syscall-based correlation provides robust connection identification without memory layout dependencies
- Future Innovations: Multi-language support, hardware acceleration, and AI-powered protocol detection will expand capabilities
Current Achievements
- Universal Library Support: Works across OpenSSL, BoringSSL, LibreSSL regardless of linking method
- Reduced Maintenance: No version-specific offset management required
- High Accuracy: Integrity checking shows >99.9% correlation success rates
- Performance Efficiency: Minimal overhead compared to traditional APM agents
Remaining Challenges
- Custom BIO Applications: Require additional instrumentation strategies
- Stripped Binaries: Symbol resolution remains problematic
- Language Diversity: Go, Java, Python, and other runtimes need specialized approaches
- Container Environments: Dynamic symbol resolution in containerized deployments
Future Directions
The innovation in eBPF TLS tracing continues with promising developments:
- Enhanced Runtime Support: Better integration with language-specific TLS implementations
- Hardware Integration: Leveraging crypto accelerators and kernel TLS offload
- AI-Powered Analysis: Machine learning for protocol detection and anomaly identification
- Performance Optimization: Zero-copy processing and adaptive sampling strategies
The eBPF community’s work in TLS tracing exemplifies the technology’s potential to solve complex observability challenges while maintaining system performance and stability. As encrypted traffic continues to dominate network communications, these techniques will become increasingly critical for comprehensive application monitoring and security analysis.
Resources and Further Reading
Advanced Topics
- eBPF Verifier Deep Dive
- BPF CO-RE (Compile Once – Run Everywhere)
- User-Space eBPF Libraries Comparison
Inspired by the original article by Dom Delnano on Pixie Blog