Writing eBPF Kprobe Programs with Rust Aya: Complete Developer Guide
Kernel Probe (Kprobe) is a powerful debugging and tracing mechanism for the Linux kernel that, when combined with eBPF and Rust’s Aya framework, provides a robust platform for kernel-level observability and performance analysis. This comprehensive guide walks through creating production-ready eBPF Kprobe programs using Rust.
Introduction to Kprobes and eBPF
```mermaid
graph TB
    subgraph "Kernel Function Tracing"
        subgraph "Kprobe Types"
            K1[kprobe - Function Entry] --> Exec[Function Execution]
            K2[kretprobe - Function Exit] --> Exec
        end

        subgraph "BPF Integration"
            Exec --> BPF[eBPF Program]
            BPF --> Analysis[Real-time Analysis]
            BPF --> Metrics[Metrics Collection]
        end

        subgraph "Rust Aya Framework"
            Analysis --> Aya[Aya Runtime]
            Metrics --> Aya
            Aya --> Safe[Memory Safety]
            Aya --> Performance[High Performance]
        end
    end

    style K1 fill:#e1f5fe
    style K2 fill:#e1f5fe
    style BPF fill:#f3e5f5
    style Aya fill:#fff3e0
```
Understanding Kprobes
Kprobe (Kernel Probe) is a debugging mechanism that allows dynamic insertion of breakpoints into running kernel code. When combined with eBPF, it enables:
- Dynamic Instrumentation: Insert probes without kernel recompilation
- Function Entry Monitoring: Execute eBPF programs when functions are called
- Function Exit Monitoring: Execute eBPF programs when functions return
- Argument Access: Read function parameters and return values
- Performance Analysis: Measure execution time and resource usage
Kprobe Execution Points
```mermaid
sequenceDiagram
    participant User as User Space
    participant Kernel as Kernel Function
    participant Kprobe as Kprobe Handler
    participant eBPF as eBPF Program

    User->>Kernel: System call triggers function
    Kernel->>Kprobe: Function entry (kprobe)
    Kprobe->>eBPF: Execute eBPF program
    eBPF->>eBPF: Process arguments, collect data
    eBPF->>Kprobe: Return result
    Kprobe->>Kernel: Continue execution

    Note over Kernel: Function executes normally

    Kernel->>Kprobe: Function exit (kretprobe)
    Kprobe->>eBPF: Execute eBPF program
    eBPF->>eBPF: Process return value
    eBPF->>Kprobe: Return result
    Kprobe->>User: Function completes
```
Key Limitations and Considerations
⚠️ Important Considerations:
- Kernel Version Compatibility: Kprobes may behave differently across kernel versions
- Function Availability: Not all kernel functions are available for probing
- Performance Impact: Excessive probing can affect system performance
- Security Restrictions: Some systems may restrict eBPF program loading
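Before loading anything, it is worth confirming that the running kernel actually permits kprobes and eBPF. A quick sanity check might look like the following (sysctl keys and kernel config paths vary by distribution, so treat this as a sketch rather than a definitive check):

```shell
# Value 1 or 2 means unprivileged BPF is restricted (root or CAP_BPF required)
sysctl kernel.unprivileged_bpf_disabled 2>/dev/null || echo "sysctl key not present"

# Confirm kprobes are compiled into the running kernel (config path varies by distro)
grep -s CONFIG_KPROBES=y "/boot/config-$(uname -r)" || echo "kernel config not found at /boot"
```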
Development Environment Setup
Prerequisites
```bash
# Install Rust nightly toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup install nightly
rustup default nightly

# Install eBPF development tools
cargo install bpf-linker
cargo install bindgen-cli

# Install system dependencies (Ubuntu/Debian)
sudo apt update
sudo apt install -y \
    clang \
    llvm \
    libelf-dev \
    libz-dev \
    libbpf-dev \
    linux-headers-$(uname -r) \
    bpftool

# Install optional monitoring tools
cargo install bpftop  # Netflix's eBPF monitoring tool
```
macOS Development Setup (Lima VM)
For macOS developers, you can use Lima to create a Linux development environment:
```yaml
# lima-aya-dev.yaml
arch: "x86_64"
cpus: 4
memory: "8GiB"
disk: "50GiB"

images:
  - location: "https://cloud-images.ubuntu.com/releases/22.04/release-20240821/ubuntu-22.04-server-cloudimg-amd64.img"
    arch: "x86_64"

provision:
  - mode: system
    script: |
      #!/bin/bash
      apt-get update
      apt-get install -y curl build-essential

      # Install Rust
      curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
      source /root/.cargo/env
      rustup install nightly
      rustup default nightly

      # Install eBPF tools
      cargo install bpf-linker bindgen-cli

      # Install system dependencies
      apt-get install -y clang llvm libelf-dev libz-dev libbpf-dev linux-headers-generic bpftool
  - mode: user
    script: |
      #!/bin/bash
      curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
```

```bash
# Start Lima VM
brew install lima
limactl start lima-aya-dev.yaml
limactl shell lima-aya-dev
```
Project Structure and Setup
Creating a New Aya Project
```bash
# Create project directory
mkdir ebpf-kprobe-tutorial
cd ebpf-kprobe-tutorial

# Initialize Cargo workspace
cargo init --name kprobe-observer
```
Project Structure
```
ebpf-kprobe-tutorial/
├── Cargo.toml
├── Cargo.lock
├── src/
│   └── main.rs
├── ebpf/
│   ├── Cargo.toml
│   └── src/
│       ├── main.rs
│       └── bindings.rs
├── xtask/
│   ├── Cargo.toml
│   └── src/
│       ├── main.rs
│       ├── build.rs
│       └── codegen.rs
└── README.md
```
Workspace Configuration
```toml
# Cargo.toml (workspace root)
[workspace]
members = ["ebpf", "xtask"]
default-members = ["ebpf"]

[package]
name = "kprobe-observer"
version = "0.1.0"
edition = "2021"

[dependencies]
aya = { version = "0.12", features = ["async_tokio"] }
aya-log = "0.2"
clap = { version = "4.0", features = ["derive"] }
env_logger = "0.10"
log = "0.4"
tokio = { version = "1.25", features = ["macros", "rt", "rt-multi-thread", "net", "signal"] }
anyhow = "1.0"
bytes = "1.4"
# Used by the root-privilege check and the advanced observer shown later
nix = "0.27"
prometheus = "0.13"
warp = "0.3"

[[bin]]
name = "kprobe-observer"
path = "src/main.rs"

[profile.release]
debug = true
```
```toml
# ebpf/Cargo.toml
[package]
name = "kprobe-observer-ebpf"
version = "0.1.0"
edition = "2021"

[dependencies]
aya-ebpf = "0.1"
aya-log-ebpf = "0.1"

[[bin]]
name = "kprobe-observer"
path = "src/main.rs"

[profile.dev]
opt-level = 3
debug = false
debug-assertions = false
overflow-checks = false
lto = true
panic = "abort"
incremental = false
codegen-units = 1
rpath = false

[profile.release]
lto = true
panic = "abort"
codegen-units = 1
```
```toml
# xtask/Cargo.toml
[package]
name = "xtask"
version = "0.1.0"
edition = "2021"

[dependencies]
anyhow = "1.0"
clap = { version = "4.0", features = ["derive"] }
aya-tool = "0.1"
```
Kernel Function Analysis and Target Selection
Checking Available Kprobes
```bash
# List available kernel functions for probing
sudo cat /sys/kernel/debug/tracing/available_filter_functions | head -20

# Search for specific function patterns
grep wake_up /sys/kernel/debug/tracing/available_filter_functions
grep schedule /sys/kernel/debug/tracing/available_filter_functions
grep sys_open /sys/kernel/debug/tracing/available_filter_functions

# Check if our target function is available
grep wake_up_new_task /sys/kernel/debug/tracing/available_filter_functions
# Expected output: wake_up_new_task
```
Understanding Function Signatures
For this tutorial, we’ll target the wake_up_new_task function. Let’s examine its signature:
```c
/*
 * wake_up_new_task - wake up a newly created task for the first time.
 *
 * This function will do some initial scheduler statistics housekeeping
 * that must be done for every newly created context, then puts the task
 * on the runqueue and wakes it.
 */
void wake_up_new_task(struct task_struct *p)
{
    struct rq_flags rf;
    struct rq *rq;

    // ... function implementation
}
```
Key insights:
- Function: wake_up_new_task
- Arguments: one parameter, a pointer to task_struct
- Purpose: called when a new task is woken up for the first time
- Use Case: well suited for monitoring process creation and scheduling
Generating Kernel Type Definitions
Implementing Code Generation
```rust
// xtask/src/codegen.rs
use anyhow::Result;
use aya_tool::generate::InputFile;
use std::fs::File;
use std::io::Write;
use std::path::PathBuf;

pub fn generate_bindings() -> Result<()> {
    println!("Generating kernel bindings...");

    let dir = PathBuf::from("ebpf/src");
    let names: Vec<&str> = vec![
        "task_struct",
        "pid_t",
        "cred",
        "mm_struct",
        "files_struct",
        "fs_struct",
        "signal_struct",
        "sighand_struct",
        "thread_info",
        "cpu_context_save",
        "thread_struct",
    ];

    let bindings = aya_tool::generate(
        InputFile::Btf(PathBuf::from("/sys/kernel/btf/vmlinux")),
        &names,
        &[],
    )?;

    let mut output = File::create(dir.join("bindings.rs"))?;
    write!(output, "{}", bindings)?;

    println!("Kernel bindings generated successfully!");
    Ok(())
}
```
```rust
// xtask/src/build.rs
use anyhow::Result;
use std::process::Command;

pub fn build_ebpf() -> Result<()> {
    println!("Building eBPF program...");

    let output = Command::new("cargo")
        .args(&["build", "--target=bpfel-unknown-none", "--release"])
        .current_dir("ebpf")
        .output()?;

    if !output.status.success() {
        anyhow::bail!(
            "Failed to build eBPF program:\n{}",
            String::from_utf8_lossy(&output.stderr)
        );
    }

    println!("eBPF program built successfully!");
    Ok(())
}
```
```rust
// xtask/src/main.rs
mod build;
mod codegen;

use anyhow::Result;
use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "xtask")]
#[command(about = "Build and development tasks")]
struct Cli {
    #[command(subcommand)]
    command: Commands,
}

#[derive(Subcommand)]
enum Commands {
    /// Generate kernel type bindings
    Codegen,
    /// Build eBPF program
    Build,
    /// Build and run the observer
    Run,
}

fn main() -> Result<()> {
    let cli = Cli::parse();

    match cli.command {
        Commands::Codegen => codegen::generate_bindings(),
        Commands::Build => build::build_ebpf(),
        Commands::Run => {
            build::build_ebpf()?;
            println!("Running observer...");
            let output = std::process::Command::new("sudo")
                .args(&["./target/release/kprobe-observer"])
                .output()?;

            println!("{}", String::from_utf8_lossy(&output.stdout));
            if !output.stderr.is_empty() {
                eprintln!("{}", String::from_utf8_lossy(&output.stderr));
            }
            Ok(())
        }
    }
}
```
eBPF Program Implementation
Core eBPF Program
```rust
// ebpf/src/main.rs
#![no_std]
#![no_main]

mod bindings;

use aya_ebpf::{
    helpers::bpf_get_current_pid_tgid,
    macros::{kprobe, map},
    maps::RingBuf,
    programs::ProbeContext,
    EbpfContext,
};
use aya_log_ebpf::info;
use bindings::{pid_t, task_struct};

// Event structure for user space communication
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TaskEvent {
    pub caller_pid: u32,
    pub caller_tgid: u32,
    pub new_task_pid: u32,
    pub new_task_tgid: u32,
    pub timestamp: u64,
    pub comm: [u8; 16],
}

// Ring buffer for efficient event streaming
#[map]
static TASK_EVENTS: RingBuf = RingBuf::with_byte_size(1024 * 1024, 0);

// Statistics tracking
#[map]
static mut STATS: aya_ebpf::maps::Array<u64> = aya_ebpf::maps::Array::with_max_entries(4, 0);

// Statistics indices
const STAT_TOTAL_EVENTS: u32 = 0;
const STAT_SUCCESSFUL_EVENTS: u32 = 1;
const STAT_ERROR_EVENTS: u32 = 2;
const STAT_LAST_TIMESTAMP: u32 = 3;

#[kprobe]
pub fn wake_up_new_task(ctx: ProbeContext) -> u32 {
    match try_wake_up_new_task(ctx) {
        Ok(ret) => ret,
        Err(ret) => {
            // Update error statistics (Array::get_ptr_mut returns Option, not Result)
            unsafe {
                if let Some(stat) = STATS.get_ptr_mut(STAT_ERROR_EVENTS) {
                    *stat += 1;
                }
            }
            ret
        }
    }
}

fn try_wake_up_new_task(ctx: ProbeContext) -> Result<u32, u32> {
    // Get the task_struct pointer from the first argument
    let task: *const task_struct = ctx.arg(0).ok_or(1u32)?;

    // Read task information safely
    let new_task_pid = unsafe { core::ptr::read_volatile(&(*task).pid as *const pid_t) };
    let new_task_tgid = unsafe { core::ptr::read_volatile(&(*task).tgid as *const pid_t) };

    // Get caller information
    let caller_pid_tgid = bpf_get_current_pid_tgid();
    let caller_pid = (caller_pid_tgid & 0xFFFFFFFF) as u32;
    let caller_tgid = (caller_pid_tgid >> 32) as u32;

    // Get current timestamp
    let timestamp = unsafe { aya_ebpf::helpers::bpf_ktime_get_ns() };

    // Read process name from task_struct
    let mut comm = [0u8; 16];
    unsafe {
        let comm_ptr = &(*task).comm as *const [aya_ebpf::cty::c_char; 16];
        for i in 0..16 {
            let c = core::ptr::read_volatile(&(*comm_ptr)[i]);
            comm[i] = c as u8;
            if c == 0 {
                break;
            }
        }
    }

    // Create event for user space
    let event = TaskEvent {
        caller_pid,
        caller_tgid,
        new_task_pid: new_task_pid as u32,
        new_task_tgid: new_task_tgid as u32,
        timestamp,
        comm,
    };

    // Send event to user space via ring buffer
    if let Some(mut entry) = TASK_EVENTS.reserve::<TaskEvent>(0) {
        entry.write(event);
        entry.submit(0);

        // Update statistics
        unsafe {
            if let Some(stat) = STATS.get_ptr_mut(STAT_SUCCESSFUL_EVENTS) {
                *stat += 1;
            }
            if let Some(stat) = STATS.get_ptr_mut(STAT_LAST_TIMESTAMP) {
                *stat = timestamp;
            }
        }
    }

    // Update total events counter
    unsafe {
        if let Some(stat) = STATS.get_ptr_mut(STAT_TOTAL_EVENTS) {
            *stat += 1;
        }
    }

    // Log the event (visible in /sys/kernel/debug/tracing/trace_pipe)
    info!(
        &ctx,
        "wake_up_new_task: caller PID {}, new task PID {}, TGID {}",
        caller_pid,
        new_task_pid,
        new_task_tgid
    );

    Ok(0)
}

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    unsafe { core::hint::unreachable_unchecked() }
}
```
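The caller PID/TGID unpacking used in the program relies on the kernel packing the thread-group id (what user space calls the process id) into the upper 32 bits of `bpf_get_current_pid_tgid()` and the thread id into the lower 32 bits. The bit arithmetic can be sanity-checked in plain Rust with a simulated value (nothing here reads a live kernel):

```rust
// Split the packed u64 returned by bpf_get_current_pid_tgid():
// upper 32 bits = tgid (process id), lower 32 bits = pid (thread id).
fn split_pid_tgid(pid_tgid: u64) -> (u32, u32) {
    let pid = (pid_tgid & 0xFFFF_FFFF) as u32;
    let tgid = (pid_tgid >> 32) as u32;
    (pid, tgid)
}

fn main() {
    // Simulated value: thread 4243 belonging to process 4242
    let raw = (4242u64 << 32) | 4243;
    let (pid, tgid) = split_pid_tgid(raw);
    assert_eq!((pid, tgid), (4243, 4242));
    println!("pid={pid}, tgid={tgid}"); // prints "pid=4243, tgid=4242"
}
```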
Enhanced eBPF Program with Advanced Features
```rust
// ebpf/src/advanced.rs (alternative implementation)
#![no_std]
#![no_main]

mod bindings;

use aya_ebpf::{
    helpers::{bpf_get_current_pid_tgid, bpf_ktime_get_ns, bpf_probe_read_kernel},
    macros::{kprobe, kretprobe, map},
    maps::{Array, HashMap, RingBuf},
    programs::ProbeContext,
    EbpfContext,
};
use aya_log_ebpf::info;
use bindings::{cred, mm_struct, pid_t, task_struct};

// Enhanced event structure
#[repr(C)]
#[derive(Clone, Copy)]
pub struct EnhancedTaskEvent {
    pub caller_pid: u32,
    pub caller_tgid: u32,
    pub new_task_pid: u32,
    pub new_task_tgid: u32,
    pub parent_pid: u32,
    pub timestamp: u64,
    pub comm: [u8; 16],
    pub uid: u32,
    pub gid: u32,
    pub memory_usage: u64,
    pub cpu_id: u32,
    pub event_type: u8, // 0 = entry, 1 = exit
}

// Task tracking for entry/exit correlation
#[repr(C)]
#[derive(Clone, Copy)]
pub struct TaskContext {
    pub entry_time: u64,
    pub caller_pid: u32,
    pub task_ptr: u64,
}

// Ring buffer for events
#[map]
static TASK_EVENTS: RingBuf = RingBuf::with_byte_size(2 * 1024 * 1024, 0);

// Track ongoing task creations
#[map]
static PENDING_TASKS: HashMap<u32, TaskContext> = HashMap::with_max_entries(1024, 0);

// Process statistics
#[map]
static PROCESS_STATS: HashMap<u32, u64> = HashMap::with_max_entries(10000, 0);

// System-wide statistics
#[map]
static mut SYSTEM_STATS: Array<u64> = Array::with_max_entries(8, 0);

// Enhanced kprobe with detailed tracking
#[kprobe]
pub fn wake_up_new_task_enhanced(ctx: ProbeContext) -> u32 {
    match try_wake_up_new_task_enhanced(ctx) {
        Ok(ret) => ret,
        Err(ret) => ret,
    }
}

fn try_wake_up_new_task_enhanced(ctx: ProbeContext) -> Result<u32, u32> {
    let task: *const task_struct = ctx.arg(0).ok_or(1u32)?;
    let timestamp = unsafe { bpf_ktime_get_ns() };

    // Read task information
    let new_task_pid =
        unsafe { bpf_probe_read_kernel(&(*task).pid as *const pid_t).map_err(|_| 1u32)? };
    let new_task_tgid =
        unsafe { bpf_probe_read_kernel(&(*task).tgid as *const pid_t).map_err(|_| 1u32)? };
    let parent_pid = unsafe {
        let parent = bpf_probe_read_kernel(&(*task).real_parent as *const *const task_struct)
            .map_err(|_| 1u32)?;
        if !parent.is_null() {
            bpf_probe_read_kernel(&(*parent).pid as *const pid_t).map_err(|_| 1u32)?
        } else {
            0
        }
    };

    // Get caller information
    let caller_pid_tgid = bpf_get_current_pid_tgid();
    let caller_pid = (caller_pid_tgid & 0xFFFFFFFF) as u32;
    let caller_tgid = (caller_pid_tgid >> 32) as u32;

    // Read process credentials
    let (uid, gid) = unsafe {
        let cred_ptr = bpf_probe_read_kernel(&(*task).real_cred as *const *const cred)
            .map_err(|_| 1u32)?;
        if !cred_ptr.is_null() {
            let uid = bpf_probe_read_kernel(&(*cred_ptr).uid.val).unwrap_or(0);
            let gid = bpf_probe_read_kernel(&(*cred_ptr).gid.val).unwrap_or(0);
            (uid, gid)
        } else {
            (0, 0)
        }
    };

    // Read memory information
    let memory_usage = unsafe {
        let mm_ptr = bpf_probe_read_kernel(&(*task).mm as *const *const mm_struct)
            .map_err(|_| 1u32)?;
        if !mm_ptr.is_null() {
            // Simplified memory usage calculation: mapped pages * page size
            bpf_probe_read_kernel(&(*mm_ptr).total_vm).unwrap_or(0) * 4096
        } else {
            0
        }
    };

    // Read process name
    let mut comm = [0u8; 16];
    unsafe {
        let comm_array =
            bpf_probe_read_kernel(&(*task).comm as *const [i8; 16]).map_err(|_| 1u32)?;
        for i in 0..16 {
            comm[i] = comm_array[i] as u8;
            if comm_array[i] == 0 {
                break;
            }
        }
    }

    // Get current CPU
    let cpu_id = unsafe { aya_ebpf::helpers::bpf_get_smp_processor_id() };

    // Create enhanced event
    let event = EnhancedTaskEvent {
        caller_pid,
        caller_tgid,
        new_task_pid: new_task_pid as u32,
        new_task_tgid: new_task_tgid as u32,
        parent_pid: parent_pid as u32,
        timestamp,
        comm,
        uid,
        gid,
        memory_usage,
        cpu_id,
        event_type: 0, // Entry event
    };

    // Track task for exit correlation
    let context = TaskContext {
        entry_time: timestamp,
        caller_pid,
        task_ptr: task as u64,
    };
    PENDING_TASKS.insert(&(new_task_pid as u32), &context, 0).ok();

    // Update process statistics (HashMap::get is an unsafe API in aya-ebpf)
    let mut count = unsafe { PROCESS_STATS.get(&caller_pid).copied().unwrap_or(0) };
    count += 1;
    PROCESS_STATS.insert(&caller_pid, &count, 0).ok();

    // Send event to user space
    if let Some(mut entry) = TASK_EVENTS.reserve::<EnhancedTaskEvent>(0) {
        entry.write(event);
        entry.submit(0);
    }

    // Update system statistics (Array::get_ptr_mut returns Option, not Result)
    unsafe {
        if let Some(stat) = SYSTEM_STATS.get_ptr_mut(0) {
            *stat += 1; // Total events
        }
        if let Some(stat) = SYSTEM_STATS.get_ptr_mut(1) {
            *stat = timestamp; // Last event time
        }
    }

    info!(
        &ctx,
        "Enhanced wake_up_new_task: PID {} (parent: {}) by caller {}, mem: {} KB",
        new_task_pid,
        parent_pid,
        caller_pid,
        memory_usage / 1024
    );

    Ok(0)
}

// Track task completion with kretprobe
#[kretprobe]
pub fn wake_up_new_task_exit(ctx: ProbeContext) -> u32 {
    let caller_pid_tgid = bpf_get_current_pid_tgid();
    let caller_pid = (caller_pid_tgid & 0xFFFFFFFF) as u32;
    let timestamp = unsafe { bpf_ktime_get_ns() };

    // Look for a pending task creation from this caller
    if let Some(context) = unsafe { PENDING_TASKS.get(&caller_pid) } {
        let duration = timestamp - context.entry_time;

        // Update system statistics
        unsafe {
            if let Some(stat) = SYSTEM_STATS.get_ptr_mut(2) {
                *stat = duration; // Last task creation duration
            }
            if let Some(stat) = SYSTEM_STATS.get_ptr_mut(3) {
                *stat += 1; // Completed task creations
            }
        }

        // Clean up tracking
        PENDING_TASKS.remove(&caller_pid).ok();

        info!(
            &ctx,
            "Task creation completed: caller {} duration {} μs",
            caller_pid,
            duration / 1000
        );
    }

    0
}

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    unsafe { core::hint::unreachable_unchecked() }
}
```
User-Space Application Implementation
Basic Observer Implementation
```rust
// src/main.rs
use anyhow::Result;
use aya::{
    include_bytes_aligned,
    maps::{MapData, RingBuf},
    programs::KProbe,
    Bpf,
};
use aya_log::BpfLogger;
use clap::Parser;
use log::{error, info, warn};
use std::{
    convert::TryInto,
    sync::{
        atomic::{AtomicBool, Ordering},
        Arc,
    },
    time::{Duration, SystemTime},
};
use tokio::{signal, time::sleep};

#[derive(Parser, Debug)]
#[command(name = "kprobe-observer")]
#[command(about = "eBPF Kprobe observer for wake_up_new_task")]
struct Args {
    /// Enable verbose logging
    #[arg(short, long)]
    verbose: bool,

    /// Statistics reporting interval (seconds)
    #[arg(short, long, default_value = "30")]
    stats_interval: u64,

    /// Maximum events to process (0 = unlimited)
    #[arg(short, long, default_value = "0")]
    max_events: u64,
}

// Event structure matching the eBPF program
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct TaskEvent {
    pub caller_pid: u32,
    pub caller_tgid: u32,
    pub new_task_pid: u32,
    pub new_task_tgid: u32,
    pub timestamp: u64,
    pub comm: [u8; 16],
}

// Application statistics
#[derive(Default, Debug)]
struct Statistics {
    total_events: u64,
    events_per_second: f64,
    unique_callers: std::collections::HashSet<u32>,
    unique_processes: std::collections::HashSet<String>,
    start_time: Option<SystemTime>,
    last_event_time: Option<SystemTime>,
}

impl Statistics {
    fn new() -> Self {
        Self {
            start_time: Some(SystemTime::now()),
            ..Default::default()
        }
    }

    fn update(&mut self, event: &TaskEvent) {
        self.total_events += 1;
        self.unique_callers.insert(event.caller_pid);

        // Convert comm to string
        let comm = String::from_utf8_lossy(&event.comm)
            .trim_end_matches('\0')
            .to_string();
        if !comm.is_empty() {
            self.unique_processes.insert(comm);
        }

        self.last_event_time = Some(SystemTime::now());

        // Calculate events per second
        if let Some(start) = self.start_time {
            if let Ok(duration) = SystemTime::now().duration_since(start) {
                if duration.as_secs() > 0 {
                    self.events_per_second =
                        self.total_events as f64 / duration.as_secs() as f64;
                }
            }
        }
    }

    fn print_summary(&self) {
        println!("\n=== eBPF Kprobe Observer Statistics ===");
        println!("Total events processed: {}", self.total_events);
        println!("Unique callers: {}", self.unique_callers.len());
        println!("Unique processes: {}", self.unique_processes.len());
        println!("Events per second: {:.2}", self.events_per_second);

        if let Some(start) = self.start_time {
            if let Ok(duration) = SystemTime::now().duration_since(start) {
                println!("Runtime: {:.2} seconds", duration.as_secs_f64());
            }
        }

        println!("======================================\n");
    }
}

async fn load_and_attach_ebpf() -> Result<(Bpf, RingBuf<MapData>)> {
    // Load the eBPF program
    let mut bpf = Bpf::load(include_bytes_aligned!(
        "../target/bpfel-unknown-none/release/kprobe-observer"
    ))?;

    // Initialize BPF logger for eBPF program logs
    if let Err(e) = BpfLogger::init(&mut bpf) {
        warn!("Failed to initialize BPF logger: {}", e);
    }

    // Load and attach the kprobe program
    let program: &mut KProbe = bpf.program_mut("wake_up_new_task").unwrap().try_into()?;
    program.load()?;
    program.attach("wake_up_new_task", 0)?;

    info!("eBPF program loaded and attached successfully");

    // Take ownership of the ring buffer map so it can outlive this function
    let ring_buf = RingBuf::try_from(bpf.take_map("TASK_EVENTS").unwrap())?;

    Ok((bpf, ring_buf))
}

fn parse_task_event(data: &[u8]) -> Result<TaskEvent> {
    if data.len() < std::mem::size_of::<TaskEvent>() {
        anyhow::bail!("Invalid event data size: {} bytes", data.len());
    }

    let event = unsafe { std::ptr::read_unaligned(data.as_ptr() as *const TaskEvent) };

    Ok(event)
}

async fn process_events(
    mut ring_buf: RingBuf<MapData>,
    running: Arc<AtomicBool>,
    args: &Args,
) -> Result<()> {
    let mut stats = Statistics::new();
    let mut events_processed = 0u64;

    info!("Starting event processing...");

    // Spawn statistics reporting task
    let stats_running = running.clone();
    let stats_interval = args.stats_interval;
    let stats_handle = tokio::spawn(async move {
        let mut interval = tokio::time::interval(Duration::from_secs(stats_interval));

        while stats_running.load(Ordering::Relaxed) {
            interval.tick().await;
            // Statistics are printed by the main loop
        }
    });

    while running.load(Ordering::Relaxed) {
        // Poll for events
        match ring_buf.next() {
            Some(item) => match parse_task_event(&item) {
                Ok(event) => {
                    events_processed += 1;
                    stats.update(&event);

                    if args.verbose {
                        print_event_details(&event);
                    } else {
                        print_event_summary(&event);
                    }

                    // Stop if the max_events limit is reached
                    if args.max_events > 0 && events_processed >= args.max_events {
                        info!(
                            "Reached maximum events limit ({}), stopping...",
                            args.max_events
                        );
                        break;
                    }
                }
                Err(e) => {
                    warn!("Failed to parse event: {}", e);
                }
            },
            None => {
                // No events available, sleep briefly
                sleep(Duration::from_millis(10)).await;
            }
        }

        // Print periodic statistics
        if events_processed % 1000 == 0 && events_processed > 0 {
            stats.print_summary();
        }
    }

    // Final statistics
    stats.print_summary();

    // Clean up
    stats_handle.abort();

    Ok(())
}

fn print_event_summary(event: &TaskEvent) {
    // Bind the Cow before trimming so the borrow outlives the statement
    let comm_cow = String::from_utf8_lossy(&event.comm);
    let comm = comm_cow.trim_end_matches('\0');
    let timestamp_secs = event.timestamp / 1_000_000_000;
    let timestamp_us = (event.timestamp % 1_000_000_000) / 1_000;

    println!(
        "[{}.{:06}] wake_up_new_task: caller PID {}, new task PID {} ({}), TGID {}",
        timestamp_secs,
        timestamp_us,
        event.caller_pid,
        event.new_task_pid,
        comm,
        event.new_task_tgid
    );
}

fn print_event_details(event: &TaskEvent) {
    let comm_cow = String::from_utf8_lossy(&event.comm);
    let comm = comm_cow.trim_end_matches('\0');
    let timestamp_secs = event.timestamp / 1_000_000_000;
    let timestamp_us = (event.timestamp % 1_000_000_000) / 1_000;

    println!("=== Task Wake-up Event ===");
    println!("Timestamp: {}.{:06}", timestamp_secs, timestamp_us);
    println!("Caller PID: {}", event.caller_pid);
    println!("Caller TGID: {}", event.caller_tgid);
    println!("New Task PID: {}", event.new_task_pid);
    println!("New Task TGID: {}", event.new_task_tgid);
    println!("Process Name: {}", comm);
    println!("========================\n");
}

#[tokio::main]
async fn main() -> Result<()> {
    let args = Args::parse();

    // Initialize logging
    env_logger::Builder::from_default_env()
        .filter_level(if args.verbose {
            log::LevelFilter::Debug
        } else {
            log::LevelFilter::Info
        })
        .init();

    info!("Starting eBPF Kprobe Observer");

    // Check if we're running as root (uses the `nix` crate)
    if !nix::unistd::Uid::effective().is_root() {
        error!("This program requires root privileges to load eBPF programs");
        std::process::exit(1);
    }

    // Load and attach eBPF program
    let (_bpf, ring_buf) = load_and_attach_ebpf().await?;

    // Set up signal handling
    let running = Arc::new(AtomicBool::new(true));
    let r = running.clone();

    tokio::spawn(async move {
        signal::ctrl_c().await.expect("Failed to listen for Ctrl+C");
        info!("Received Ctrl+C, shutting down...");
        r.store(false, Ordering::Relaxed);
    });

    info!("Monitoring wake_up_new_task events... Press Ctrl+C to exit");

    // Process events
    process_events(ring_buf, running, &args).await?;

    info!("Observer shutting down");

    Ok(())
}
```
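The observer converts the fixed 16-byte comm array into a Rust string by decoding it lossily and trimming trailing NULs. Since this conversion is easy to get subtly wrong (the temporary `Cow` must outlive the trimmed slice), here is the logic in isolation as a small standalone sketch:

```rust
/// Convert a kernel-style fixed-size, NUL-padded comm field into a String.
fn comm_to_string(comm: &[u8; 16]) -> String {
    String::from_utf8_lossy(comm)
        .trim_end_matches('\0')
        .to_string()
}

fn main() {
    // "date" followed by NUL padding, as the kernel stores task->comm
    let mut comm = [0u8; 16];
    comm[..4].copy_from_slice(b"date");
    assert_eq!(comm_to_string(&comm), "date");
    println!("{}", comm_to_string(&comm)); // prints "date"
}
```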
Advanced Observer with Metrics and Alerting
```rust
// Advanced observer (extends src/main.rs; reuses TaskEvent, parse_task_event,
// and load_and_attach_ebpf defined there)
use anyhow::Result;
use log::{error, info, warn};
use prometheus::{Encoder, Gauge, Histogram, IntCounter, IntGauge, Registry, TextEncoder};
use std::{
    collections::HashMap as StdHashMap,
    sync::{Arc, Mutex},
    time::{Duration, Instant},
};
use tokio::time::interval;
use warp::Filter;

// Prometheus metrics
#[derive(Clone)]
struct Metrics {
    events_total: IntCounter,
    events_per_second: Gauge,
    unique_processes: IntGauge,
    processing_duration: Histogram,
    registry: Registry,
}

impl Metrics {
    fn new() -> Result<Self> {
        let registry = Registry::new();

        let events_total = IntCounter::new(
            "ebpf_wake_up_events_total",
            "Total number of wake_up_new_task events",
        )?;

        let events_per_second = Gauge::new(
            "ebpf_wake_up_events_per_second",
            "Events processed per second",
        )?;

        let unique_processes = IntGauge::new(
            "ebpf_unique_processes",
            "Number of unique processes observed",
        )?;

        let processing_duration = Histogram::with_opts(
            prometheus::HistogramOpts::new(
                "ebpf_event_processing_duration_seconds",
                "Time spent processing each event",
            )
            .buckets(vec![0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1]),
        )?;

        registry.register(Box::new(events_total.clone()))?;
        registry.register(Box::new(events_per_second.clone()))?;
        registry.register(Box::new(unique_processes.clone()))?;
        registry.register(Box::new(processing_duration.clone()))?;

        Ok(Self {
            events_total,
            events_per_second,
            unique_processes,
            processing_duration,
            registry,
        })
    }
}

// Enhanced event processor with metrics
struct AdvancedEventProcessor {
    metrics: Metrics,
    process_tracker: Arc<Mutex<StdHashMap<String, ProcessInfo>>>,
    rate_calculator: RateCalculator,
}

#[derive(Debug, Clone)]
struct ProcessInfo {
    pid: u32,
    name: String,
    first_seen: Instant,
    last_seen: Instant,
    event_count: u64,
}

struct RateCalculator {
    window_size: Duration,
    events: Arc<Mutex<Vec<Instant>>>,
}

impl RateCalculator {
    fn new(window_size: Duration) -> Self {
        Self {
            window_size,
            events: Arc::new(Mutex::new(Vec::new())),
        }
    }

    fn add_event(&self) {
        let now = Instant::now();
        let mut events = self.events.lock().unwrap();
        events.push(now);

        // Remove old events outside the window
        events.retain(|&event_time| now.duration_since(event_time) <= self.window_size);
    }

    fn get_rate(&self) -> f64 {
        let events = self.events.lock().unwrap();
        events.len() as f64 / self.window_size.as_secs_f64()
    }
}

impl AdvancedEventProcessor {
    fn new() -> Result<Self> {
        Ok(Self {
            metrics: Metrics::new()?,
            process_tracker: Arc::new(Mutex::new(StdHashMap::new())),
            rate_calculator: RateCalculator::new(Duration::from_secs(60)),
        })
    }

    fn process_event(&self, event: &TaskEvent) -> Result<()> {
        let start = Instant::now();

        // Update metrics
        self.metrics.events_total.inc();
        self.rate_calculator.add_event();

        // Extract process name
        let comm = String::from_utf8_lossy(&event.comm)
            .trim_end_matches('\0')
            .to_string();

        // Update process tracking
        {
            let mut tracker = self.process_tracker.lock().unwrap();
            let now = Instant::now();

            let process_info = tracker.entry(comm.clone()).or_insert_with(|| ProcessInfo {
                pid: event.new_task_pid,
                name: comm.clone(),
                first_seen: now,
                last_seen: now,
                event_count: 0,
            });

            process_info.last_seen = now;
            process_info.event_count += 1;

            // Update unique processes metric
            self.metrics.unique_processes.set(tracker.len() as i64);
        }

        // Record processing time
        let processing_time = start.elapsed().as_secs_f64();
        self.metrics.processing_duration.observe(processing_time);

        // Check for anomalies
        self.detect_anomalies(event)?;

        Ok(())
    }

    fn detect_anomalies(&self, event: &TaskEvent) -> Result<()> {
        // Simple anomaly detection: rapid process creation
        let rate = self.rate_calculator.get_rate();
        if rate > 100.0 {
            // More than 100 events per second
            warn!("High process creation rate detected: {:.2} events/sec", rate);
        }

        // Check for suspicious process names
        let comm_cow = String::from_utf8_lossy(&event.comm);
        let comm = comm_cow.trim_end_matches('\0');
        if comm.contains("nc") || comm.contains("bash") || comm.contains("sh") {
            info!(
                "Potentially interesting process: {} (PID: {})",
                comm, event.new_task_pid
            );
        }

        Ok(())
    }

    fn update_rates(&self) {
        let rate = self.rate_calculator.get_rate();
        self.metrics.events_per_second.set(rate);
    }
}

// Metrics HTTP server
async fn serve_metrics(metrics: Metrics, port: u16) -> Result<()> {
    let metrics_route = warp::path("metrics").map(move || {
        let encoder = TextEncoder::new();
        let metric_families = metrics.registry.gather();
        let mut buffer = Vec::new();
        encoder.encode(&metric_families, &mut buffer).unwrap();
        String::from_utf8(buffer).unwrap()
    });

    let health_route = warp::path("health").map(|| "OK");

    let routes = metrics_route.or(health_route);

    info!("Starting metrics server on port {}", port);
    warp::serve(routes).run(([0, 0, 0, 0], port)).await;

    Ok(())
}

// Main advanced observer function
pub async fn run_advanced_observer() -> Result<()> {
    // Initialize the event processor
    let processor = AdvancedEventProcessor::new()?;
    let metrics = processor.metrics.clone();

    // Load eBPF program
    let (_bpf, mut ring_buf) = load_and_attach_ebpf().await?;

    // Start metrics server
    tokio::spawn(async move {
        if let Err(e) = serve_metrics(metrics, 9090).await {
            error!("Metrics server error: {}", e);
        }
    });

    // Start rate update task
    let processor_clone = Arc::new(processor);
    let rate_processor = processor_clone.clone();
    tokio::spawn(async move {
        let mut interval = interval(Duration::from_secs(1));
        loop {
            interval.tick().await;
            rate_processor.update_rates();
        }
    });

    info!("Advanced observer started. Metrics available at http://localhost:9090/metrics");

    // Process events
    loop {
        if let Some(item) = ring_buf.next() {
            match parse_task_event(&item) {
                Ok(event) => {
                    if let Err(e) = processor_clone.process_event(&event) {
                        warn!("Error processing event: {}", e);
                    }
                }
                Err(e) => {
                    warn!("Failed to parse event: {}", e);
                }
            }
        } else {
            tokio::time::sleep(Duration::from_millis(1)).await;
        }
    }
}
```
Testing and Validation
Basic Testing Workflow
```bash
# Generate kernel bindings
cargo xtask codegen

# Build eBPF program
cargo xtask build

# Run the observer (requires root)
sudo ./target/release/kprobe-observer --verbose

# In another terminal, trigger events
echo $$    # Note your shell PID
date &     # Run a background command
sleep 1 &  # Another background command
```
Expected Output
```
[1725456973.392816] wake_up_new_task: caller PID 21479, new task PID 22367 (date), TGID 22367
[1725456973.393142] wake_up_new_task: caller PID 21479, new task PID 22368 (sleep), TGID 22368
```
Advanced Testing with Load Generation
```bash
#!/bin/bash
echo "Generating process creation load for eBPF testing..."

# Function to create background processes
generate_load() {
    local duration=$1
    local processes_per_second=$2
    local end_time=$((SECONDS + duration))

    while [ $SECONDS -lt $end_time ]; do
        for ((i=0; i<processes_per_second; i++)); do
            true &  # Minimal background process
        done
        sleep 1
    done
}

# Light load test
echo "Starting light load test (5 processes/sec for 30 seconds)..."
generate_load 30 5 &

# Medium load test
echo "Starting medium load test (20 processes/sec for 30 seconds)..."
generate_load 30 20 &

# Wait for tests to complete
wait

echo "Load generation complete"
```
Validation with bpftool
```bash
# Check if the eBPF program is loaded
sudo bpftool prog list | grep wake_up_new_task

# Check program statistics
sudo bpftool prog show id <PROGRAM_ID> --pretty

# Check maps
sudo bpftool map list

# Dump ring buffer contents (if applicable)
sudo bpftool map dump id <MAP_ID>
```
Production Deployment and Monitoring
Systemd Service Configuration
```ini
[Unit]
Description=eBPF Kprobe Observer for Task Monitoring
After=network.target
Wants=network.target

[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/bin/kprobe-observer --stats-interval=60
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

# Security settings
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/sys/kernel/debug /sys/fs/bpf

# Required for eBPF operations
CapabilityBoundingSet=CAP_SYS_ADMIN CAP_BPF CAP_PERFMON
AmbientCapabilities=CAP_SYS_ADMIN CAP_BPF CAP_PERFMON

[Install]
WantedBy=multi-user.target
```
Container Deployment
```dockerfile
# Dockerfile
FROM rust:1.70-slim as builder

# Install build dependencies
RUN apt-get update && apt-get install -y \
    clang \
    llvm \
    libelf-dev \
    libz-dev \
    libbpf-dev \
    linux-headers-generic \
    && rm -rf /var/lib/apt/lists/*

# Install Rust nightly and eBPF tools
RUN rustup install nightly && rustup default nightly
RUN cargo install bpf-linker bindgen-cli

WORKDIR /app
COPY . .

# Build the application
RUN cargo xtask codegen
RUN cargo xtask build
RUN cargo build --release

# Runtime image
FROM ubuntu:22.04

# Install runtime dependencies
RUN apt-get update && apt-get install -y \
    libbpf0 \
    && rm -rf /var/lib/apt/lists/*

# Copy binary
COPY --from=builder /app/target/release/kprobe-observer /usr/local/bin/

# Required for eBPF
VOLUME ["/sys/kernel/debug", "/sys/fs/bpf"]

CMD ["kprobe-observer"]
```
Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ebpf-kprobe-observer
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: ebpf-kprobe-observer
  template:
    metadata:
      labels:
        app: ebpf-kprobe-observer
    spec:
      hostNetwork: true
      hostPID: true
      serviceAccountName: ebpf-kprobe-observer
      containers:
      - name: observer
        image: ebpf-kprobe-observer:latest
        securityContext:
          privileged: true
          capabilities:
            add: ["SYS_ADMIN", "BPF", "PERFMON"]
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
        volumeMounts:
        - name: debugfs
          mountPath: /sys/kernel/debug
        - name: bpf
          mountPath: /sys/fs/bpf
        - name: tracefs
          mountPath: /sys/kernel/tracing
      volumes:
      - name: debugfs
        hostPath:
          path: /sys/kernel/debug
      - name: bpf
        hostPath:
          path: /sys/fs/bpf
      - name: tracefs
        hostPath:
          path: /sys/kernel/tracing
      tolerations:
      - operator: Exists
        effect: NoSchedule
```
Monitoring and Alerting
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ebpf-kprobe-observer-alerts
  namespace: monitoring
spec:
  groups:
  - name: ebpf.rules
    rules:
    - alert: HighProcessCreationRate
      expr: ebpf_wake_up_events_per_second > 100
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High process creation rate detected"
        description: "Process creation rate is {{ $value }} events/sec on {{ $labels.instance }}"

    - alert: eBPFObserverDown
      expr: up{job="ebpf-kprobe-observer"} == 0
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: "eBPF Kprobe Observer is down"
        description: "eBPF Kprobe Observer has been down for more than 1 minute"

    - alert: EventProcessingLatency
      expr: histogram_quantile(0.95, rate(ebpf_event_processing_duration_seconds_bucket[5m])) > 0.001
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High event processing latency"
        description: "95th percentile processing latency is {{ $value }} seconds"
```
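These rules assume the observer exports metrics such as `ebpf_wake_up_events_per_second`. In practice the `prometheus` crate is the usual choice, but as a sketch, the text exposition format Prometheus scrapes is simple enough to emit by hand (metric name and HELP text here are assumptions):

```rust
/// Render a single gauge in Prometheus text exposition format:
/// a HELP line, a TYPE line, then "name value".
pub fn render_gauge(name: &str, help: &str, value: f64) -> String {
    format!(
        "# HELP {name} {help}\n# TYPE {name} gauge\n{name} {value}\n",
        name = name,
        help = help,
        value = value
    )
}
```

Serving this string on an HTTP `/metrics` endpoint is all a scrape target needs.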
Performance Optimization and Best Practices
eBPF Program Optimization
```rust
// Optimized eBPF program patterns
#![no_std]
#![no_main]

// Use efficient data structures
#[map]
static RING_BUF: RingBuf = RingBuf::with_byte_size(1024 * 1024, 0); // 1MB buffer

// Implement sampling for high-frequency events
static mut SAMPLE_RATE: u32 = 10; // Keep 1 in 10 events

#[kprobe]
pub fn optimized_wake_up_new_task(ctx: ProbeContext) -> u32 {
    // Quick sampling check
    let sample_counter = unsafe {
        static mut COUNTER: u32 = 0;
        COUNTER += 1;
        COUNTER
    };

    if sample_counter % unsafe { SAMPLE_RATE } != 0 {
        return 0; // Skip this event
    }

    // Continue with normal processing...
    match try_wake_up_new_task(ctx) {
        Ok(ret) => ret,
        Err(ret) => ret,
    }
}

// Use efficient memory access patterns
fn try_wake_up_new_task(ctx: ProbeContext) -> Result<u32, u32> {
    let task: *const task_struct = ctx.arg(0).ok_or(1)?;

    // Group the field reads together so the hot path stays compact
    let (pid, tgid, comm) = unsafe {
        let pid = core::ptr::read_volatile(&(*task).pid);
        let tgid = core::ptr::read_volatile(&(*task).tgid);
        let comm = core::ptr::read_volatile(&(*task).comm);
        (pid, tgid, comm)
    };

    // Use a compact event structure to reduce memory usage
    let compact_event = CompactTaskEvent {
        pid_tgid: ((tgid as u64) << 32) | (pid as u64),
        timestamp: unsafe { aya_ebpf::helpers::bpf_ktime_get_ns() },
        comm_hash: calculate_comm_hash(&comm), // Hash instead of full string
    };

    // Submit to ring buffer
    if let Some(mut entry) = RING_BUF.reserve::<CompactTaskEvent>(0) {
        entry.write(compact_event);
        entry.submit(0);
    }

    Ok(0)
}

// Efficient string hashing (djb2) for process names
fn calculate_comm_hash(comm: &[i8; 16]) -> u32 {
    let mut hash: u32 = 5381;
    for &c in comm.iter() {
        if c == 0 {
            break;
        }
        hash = ((hash << 5).wrapping_add(hash)).wrapping_add(c as u32);
    }
    hash
}

// Compact event structure
#[repr(C)]
#[derive(Clone, Copy)]
struct CompactTaskEvent {
    pid_tgid: u64, // Combined PID and TGID
    timestamp: u64,
    comm_hash: u32, // Hash of process name
}
```
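Because the eBPF side sends only `comm_hash`, the user-space consumer must compute the same djb2 hash over candidate process names to match events, and it must undo the `pid_tgid` packing. A sketch of those two user-space mirrors (for ASCII names the byte-wise hash below agrees with the kernel-side `i8` version; non-ASCII bytes would sign-extend differently):

```rust
/// User-space mirror of the eBPF `calculate_comm_hash` (djb2 over bytes,
/// stopping at the first NUL). Agrees with the i8 version for ASCII names.
pub fn comm_hash(comm: &str) -> u32 {
    let mut hash: u32 = 5381;
    for &b in comm.as_bytes() {
        if b == 0 {
            break;
        }
        hash = ((hash << 5).wrapping_add(hash)).wrapping_add(b as u32);
    }
    hash
}

/// Undo the `((tgid as u64) << 32) | (pid as u64)` packing: returns (pid, tgid).
pub fn unpack_pid_tgid(pid_tgid: u64) -> (u32, u32) {
    let pid = (pid_tgid & 0xffff_ffff) as u32;
    let tgid = (pid_tgid >> 32) as u32;
    (pid, tgid)
}
```

A consumer typically pre-hashes the names it cares about into a `HashMap<u32, String>` and looks incoming `comm_hash` values up there.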
User-Space Optimization
```rust
// High-performance event processing
use rayon::prelude::*;

struct OptimizedProcessor {
    event_buffer: Vec<TaskEvent>,
    batch_size: usize,
    worker_pool: rayon::ThreadPool,
}

impl OptimizedProcessor {
    fn new() -> Self {
        let worker_pool = rayon::ThreadPoolBuilder::new()
            .num_threads(num_cpus::get())
            .build()
            .unwrap();

        Self {
            event_buffer: Vec::with_capacity(1000),
            batch_size: 1000,
            worker_pool,
        }
    }

    fn process_events_batch(&mut self, events: Vec<TaskEvent>) {
        // Process events in parallel
        self.worker_pool.install(|| {
            events.par_iter().for_each(|event| {
                self.process_single_event(event);
            });
        });
    }

    fn process_single_event(&self, event: &TaskEvent) {
        // Optimized single event processing
        // Minimize allocations and expensive operations
    }
}
```
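The `batch_size` field implies an accumulate-then-flush pattern: buffer incoming events and hand off a full batch at once. A minimal generic sketch of that batching logic, independent of any eBPF types (the `Batcher` name is illustrative):

```rust
/// Accumulate items and hand back a full batch once the buffer fills.
pub struct Batcher<T> {
    buffer: Vec<T>,
    batch_size: usize,
}

impl<T> Batcher<T> {
    pub fn new(batch_size: usize) -> Self {
        Self {
            buffer: Vec::with_capacity(batch_size),
            batch_size,
        }
    }

    /// Push an item; returns Some(batch) when the buffer reaches batch_size.
    pub fn push(&mut self, item: T) -> Option<Vec<T>> {
        self.buffer.push(item);
        if self.buffer.len() >= self.batch_size {
            // Swap the full buffer out, leaving an empty one in place.
            Some(std::mem::take(&mut self.buffer))
        } else {
            None
        }
    }
}
```

Each returned batch can then be fed to something like `process_events_batch` for parallel processing.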
// Memory pool for event objectsuse std::sync::Mutex;
struct EventPool { pool: Mutex<Vec<Box<TaskEvent>>>,}
impl EventPool { fn new() -> Self { Self { pool: Mutex::new(Vec::with_capacity(1000)), } }
fn get(&self) -> Box<TaskEvent> { self.pool.lock().unwrap().pop() .unwrap_or_else(|| Box::new(unsafe { std::mem::zeroed() })) }
fn return_event(&self, event: Box<TaskEvent>) { let mut pool = self.pool.lock().unwrap(); if pool.len() < 1000 { pool.push(event); } }}
Troubleshooting and Debugging
Common Issues and Solutions
1. Permission Denied Errors
```bash
# Error: Permission denied when loading eBPF program
# Solution: ensure you are running as root or with the proper capabilities

# Check current user
id

# Run with sudo
sudo ./target/release/kprobe-observer

# For containers, ensure privileged mode or specific capabilities
docker run --privileged ...
# or
docker run --cap-add=SYS_ADMIN --cap-add=BPF ...
```
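When a container still fails despite the right flags, it helps to inspect the effective capability mask in `/proc/self/status` (the `CapEff` line). A sketch that decodes that hex mask, using the kernel's capability bit positions (CAP_SYS_ADMIN = 21, CAP_PERFMON = 38, CAP_BPF = 39; the latter two exist on kernels >= 5.8):

```rust
pub const CAP_SYS_ADMIN: u32 = 21;
pub const CAP_PERFMON: u32 = 38;
pub const CAP_BPF: u32 = 39;

/// Check whether a capability bit is set in a CapEff hex mask,
/// e.g. the "000001ffffffffff" value from /proc/self/status.
pub fn has_capability(cap_eff_hex: &str, cap_bit: u32) -> bool {
    match u64::from_str_radix(cap_eff_hex.trim(), 16) {
        Ok(mask) => mask & (1u64 << cap_bit) != 0,
        Err(_) => false,
    }
}
```

In a real program the mask would come from reading `/proc/self/status` and extracting the `CapEff:` line.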
2. Function Not Found Errors
```bash
# Error: Function wake_up_new_task not found
# Solution: check the kernel version and available functions

# Check kernel version
uname -r

# Verify function availability
grep wake_up_new_task /sys/kernel/debug/tracing/available_filter_functions

# Alternative: use a different function
grep -E "do_fork|_do_fork|kernel_clone" /sys/kernel/debug/tracing/available_filter_functions
```
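The same check can be automated before attaching the probe: read `/sys/kernel/debug/tracing/available_filter_functions` and look the symbol up. A sketch of the lookup over the file's contents (each line is a symbol name, sometimes followed by a `[module]` suffix):

```rust
/// Return true if `symbol` appears in available_filter_functions content.
/// Lines look like "wake_up_new_task" or "some_fn [module]".
pub fn symbol_available(contents: &str, symbol: &str) -> bool {
    contents
        .lines()
        .any(|line| line.split_whitespace().next() == Some(symbol))
}
```

A loader can call this with `std::fs::read_to_string` on the tracefs path and fall back to an alternative symbol (e.g. `kernel_clone`) when the primary one is missing.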
3. Build Errors
```bash
# Error: bpf-linker not found
# Solution: install bpf-linker
cargo install bpf-linker

# Error: vmlinux.h not found
# Solution: generate or download vmlinux.h
# Option 1: Generate from the running kernel
sudo bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

# Option 2: Use bindgen with kernel headers
# Already handled by our xtask codegen
```
Debug Logging and Tracing
```rust
// Enhanced debugging in the eBPF program
use aya_log_ebpf::{debug, error, info, warn};

#[kprobe]
pub fn debug_wake_up_new_task(ctx: ProbeContext) -> u32 {
    debug!(&ctx, "Kprobe triggered");

    let task: *const task_struct = match ctx.arg(0) {
        Some(task) => {
            debug!(&ctx, "Got task_struct pointer: {:p}", task);
            task
        }
        None => {
            error!(&ctx, "Failed to get task_struct argument");
            return 1;
        }
    };

    let pid = unsafe { core::ptr::read_volatile(&(*task).pid) };
    debug!(&ctx, "Read PID: {}", pid);

    info!(&ctx, "Processing task with PID: {}", pid);

    0
}
```
```bash
# View eBPF program logs
sudo cat /sys/kernel/debug/tracing/trace_pipe

# Clear trace buffer
sudo truncate -s 0 /sys/kernel/debug/tracing/trace

# Enable specific trace events
# (the redirection runs in your own shell, so `sudo echo 1 > file` fails; use tee)
echo 1 | sudo tee /sys/kernel/debug/tracing/events/bpf_trace/enable
```
Performance Debugging
```rust
// Performance monitoring in user space
use std::time::{Duration, Instant};

struct PerformanceMonitor {
    event_count: u64,
    processing_times: Vec<Duration>,
    start_time: Instant,
}

impl PerformanceMonitor {
    fn new() -> Self {
        Self {
            event_count: 0,
            processing_times: Vec::new(),
            start_time: Instant::now(),
        }
    }

    fn record_event_processing(&mut self, duration: Duration) {
        self.event_count += 1;
        self.processing_times.push(duration);

        // Keep only recent measurements
        if self.processing_times.len() > 1000 {
            self.processing_times.drain(0..100);
        }

        // Print statistics every 1000 events
        if self.event_count % 1000 == 0 {
            self.print_statistics();
        }
    }

    fn print_statistics(&self) {
        let total_time = self.start_time.elapsed();
        let avg_processing_time: Duration = self.processing_times.iter().sum::<Duration>()
            / self.processing_times.len() as u32;

        println!("Performance Statistics:");
        println!("  Events processed: {}", self.event_count);
        println!("  Total runtime: {:.2}s", total_time.as_secs_f64());
        println!("  Events/sec: {:.2}", self.event_count as f64 / total_time.as_secs_f64());
        println!("  Avg processing time: {:.2}μs", avg_processing_time.as_nanos() as f64 / 1000.0);
    }
}
```
Conclusion
This comprehensive guide has covered the essential aspects of writing eBPF Kprobe programs using Rust and the Aya framework. From basic setup to advanced production deployment, you now have the knowledge to build robust kernel monitoring solutions.
Key Takeaways
- Kprobes provide powerful kernel function instrumentation without requiring kernel modifications
- Rust and Aya offer memory safety and performance for eBPF development
- Proper argument handling and type generation are crucial for reliable programs
- Production deployment requires careful consideration of security, monitoring, and performance
- Comprehensive testing and debugging ensure reliable operation in production environments
Next Steps
- Experiment with different kernel functions to understand various system behaviors
- Implement more complex analysis logic for your specific use cases
- Integrate with existing monitoring infrastructure using Prometheus metrics
- Explore other eBPF program types like tracepoints, XDP, and socket filters
- Contribute to the Aya ecosystem and share your learning with the community
The combination of eBPF’s kernel-level observability and Rust’s safety guarantees provides a powerful platform for building the next generation of system monitoring and observability tools.
Resources and Further Reading
Official Documentation
- Aya Book - Comprehensive Aya framework guide
- eBPF.io - Official eBPF portal and documentation
- Linux Kernel eBPF Documentation
Rust and eBPF Resources
- Aya GitHub Repository
- Rust eBPF Community
- Cilium eBPF Go Library - Alternative for Go developers
Advanced Topics
Tools and Utilities
- bpftool - Essential eBPF debugging tool
- bpftop - Real-time eBPF program monitoring
- Cilium CLI - Kubernetes networking with eBPF
Based on the tutorial by Yuki Nakamura from Yuki Nakamura’s Blog