
Wazuh Log Collection and Transmission: Complete Architecture Guide

Published at 01:25 AM


This guide explores Wazuh’s log collection and transmission architecture, detailing how logs flow from various sources through the agent to the manager for security analysis and alerting.

Overview of the Log Collection and Transmission Process

Wazuh is designed to ingest logs from multiple sources, providing comprehensive security monitoring across enterprise environments. The system handles several input types, including:

  * Log files (flat files such as syslog-formatted system and application logs)
  * Syslog messages received over the network
  * Journald output on systemd-based Linux systems
  * Windows event channels
  * Output of scheduled commands
  * File Integrity Monitoring (FIM) events

These inputs are handled by the agent, which uses modular components to read, preprocess, and optionally compress the data before securely forwarding it to the manager.

Architecture Overview

graph TB
    subgraph "Data Sources"
        LF[Log Files]
        SL[Syslog]
        JD[Journald]
        WE[Windows Events]
        CO[Command Output]
        FIM[File Integrity Monitoring]
    end

    subgraph "Wazuh Agent"
        LC[Log Collector]
        PP[Preprocessor]
        CP[Compression Engine]
        EN[Encryption Module]
        TR[Transmission Module]
    end

    subgraph "Network"
        TLS[TLS Channel]
    end

    subgraph "Wazuh Manager"
        NL[Network Listener]
        DC[Decompression]
        PA[Parser & Analyzer]
        RE[Rules Engine]
        AL[Alerting]
        IX[Indexer Integration]
    end

    LF --> LC
    SL --> LC
    JD --> LC
    WE --> LC
    CO --> LC
    FIM --> LC

    LC --> PP
    PP --> CP
    CP --> EN
    EN --> TR
    TR --> TLS
    TLS --> NL
    NL --> DC
    DC --> PA
    PA --> RE
    RE --> AL
    RE --> IX

How Logs Are Processed and Sent

1. Collection at the Agent

Log Data Collection Engine

The agent continuously monitors log files, syslog streams, and journald outputs using dedicated modules. Each module handles its source type through tailored configuration parameters:

File Monitoring Module:

<ossec_config>
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/auth.log</location>
  </localfile>

  <localfile>
    <log_format>apache</log_format>
    <location>/var/log/apache2/access.log</location>
  </localfile>
</ossec_config>

Syslog Module:

Note that the <remote> syslog listener shown below is part of the manager’s configuration (it is served by wazuh-remoted); agents normally pick up syslog messages from local files through <localfile> entries like the ones above.

<ossec_config>
  <remote>
    <connection>syslog</connection>
    <port>514</port>
    <protocol>udp</protocol>
    <allowed-ips>192.168.1.0/24</allowed-ips>
  </remote>
</ossec_config>
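
The same <localfile> mechanism covers the remaining sources from the architecture diagram. The snippets below are illustrative rather than taken from a real deployment: the Windows event channel syntax is standard, while native journald collection is only available in recent agent versions and its filter syntax may vary by release.

<ossec_config>
  <!-- Windows agent: read the Security event channel, filtered to logon events -->
  <localfile>
    <log_format>eventchannel</log_format>
    <location>Security</location>
    <query>Event/System[EventID=4624 or EventID=4625]</query>
  </localfile>

  <!-- Linux agent with native journald support: follow the SSH service unit -->
  <localfile>
    <log_format>journald</log_format>
    <location>journald</location>
    <filter field="_SYSTEMD_UNIT">^sshd\.service$</filter>
  </localfile>
</ossec_config>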

Preprocessing and Parsing

Once data is collected, it is preprocessed and parsed. This step ensures that logs are structured, enriched, and validated before further processing:

// Simplified C code representation of log preprocessing
typedef struct {
    char *raw_log;
    char *source;
    time_t timestamp;
    char *hostname;
    int severity;
    char agent_id[64];               /* filled in by the manager when the log is received */
    char *parsed_fields[MAX_FIELDS];
} log_entry_t;

int preprocess_log(log_entry_t *entry) {
    // Extract timestamp
    parse_timestamp(entry->raw_log, &entry->timestamp);

    // Extract hostname
    extract_hostname(entry->raw_log, entry->hostname);

    // Parse structured fields
    parse_fields(entry->raw_log, entry->parsed_fields);

    // Validate and sanitize
    return validate_log_entry(entry);
}

2. Optional Compression

Compression for Efficiency

To reduce network overhead and improve performance, the agent can compress log payloads using standard compression libraries:

#include <zlib.h>

typedef struct {
    unsigned char *data;
    size_t size;
    size_t compressed_size;
    int compression_level;
} compression_buffer_t;

// Compression function using zlib
int compress_log_data(const char *input, size_t input_size,
                      char *output, size_t *output_size) {

    uLongf compressed_size = *output_size;
    int result = compress2((Bytef *)output, &compressed_size,
                          (const Bytef *)input, input_size,
                          Z_DEFAULT_COMPRESSION);

    if (result == Z_OK) {
        *output_size = compressed_size;
        return 0; // Success
    }

    return -1; // Error
}

// Example usage in agent
void send_compressed_log(const char* log_message) {
    char compressed_data[MAX_COMPRESSED_SIZE];
    size_t compressed_size = sizeof(compressed_data);

    if (compress_log_data(log_message, strlen(log_message),
                         compressed_data, &compressed_size) == 0) {

        // Add compression header
        compression_header_t header = {
            .magic = COMPRESSION_MAGIC,
            .algorithm = ZLIB_COMPRESSION,
            .original_size = strlen(log_message),
            .compressed_size = compressed_size
        };

        // Send header + compressed data
        send_secure_data(&header, sizeof(header));
        send_secure_data(compressed_data, compressed_size);
    }
}

Benefits of Compression

  1. Bandwidth Reduction: Text logs are highly repetitive, so reductions of roughly 70-80% in network traffic are achievable (see the measurement sketch after this list)
  2. Faster Transmission: Less data to transmit over WAN connections
  3. Cost Savings: Reduced bandwidth costs for cloud deployments
  4. Scalability: Support for more agents with same network infrastructure
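
The exact ratio depends on the data, so it is worth measuring against a representative sample of your own logs. A minimal standalone zlib sketch like the one below (compile with -lz) gives a rough number; the sample line and repetition count are arbitrary placeholders:

#include <stdio.h>
#include <string.h>
#include <zlib.h>

// Compress a batch of similar log lines and report the space saved.
int main(void) {
    const char *sample =
        "Jan 10 12:00:01 host sshd[1234]: Failed password for root from 10.0.0.5 port 2222 ssh2\n";

    // Repeat the line to simulate a batch of similar events; repetition is what zlib exploits.
    char input[8192] = "";
    for (int i = 0; i < 80; i++) {
        strcat(input, sample);
    }

    unsigned char output[16384];
    uLongf out_size = sizeof(output);
    if (compress2(output, &out_size, (const Bytef *)input, strlen(input),
                  Z_DEFAULT_COMPRESSION) != Z_OK) {
        return 1;
    }

    printf("original: %zu bytes, compressed: %lu bytes, saved: %.1f%%\n",
           strlen(input), (unsigned long)out_size,
           100.0 * (1.0 - (double)out_size / (double)strlen(input)));
    return 0;
}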

3. Secure Transmission

Transport Layer Security

The agent forwards the (compressed) log data over a secure channel using TLS encryption:

#include <stdlib.h>
#include <string.h>
#include <openssl/ssl.h>
#include <openssl/err.h>

typedef struct {
    SSL_CTX *ctx;
    SSL *ssl;
    int socket_fd;
    char *server_cert;
    char *client_cert;
    char *client_key;
} secure_connection_t;

// Establish a secure connection to the manager
// (illustrative sketch: certificate and key paths are supplied by the caller)
secure_connection_t* establish_secure_connection(const char *server_address, int port,
                                                 const char *client_cert, const char *client_key) {
    secure_connection_t *conn = calloc(1, sizeof(secure_connection_t));
    conn->client_cert = strdup(client_cert);
    conn->client_key = strdup(client_key);

    // Initialize SSL context (TLS_client_method() negotiates the highest mutually supported version)
    conn->ctx = SSL_CTX_new(TLS_client_method());

    // Load the client certificate and private key used to authenticate to the manager
    SSL_CTX_use_certificate_file(conn->ctx, conn->client_cert, SSL_FILETYPE_PEM);
    SSL_CTX_use_PrivateKey_file(conn->ctx, conn->client_key, SSL_FILETYPE_PEM);

    // Create socket and connect
    conn->socket_fd = create_socket_connection(server_address, port);

    // Create SSL connection
    conn->ssl = SSL_new(conn->ctx);
    SSL_set_fd(conn->ssl, conn->socket_fd);

    if (SSL_connect(conn->ssl) != 1) {
        // Handle connection error
        cleanup_connection(conn);
        return NULL;
    }

    return conn;
}

// Send encrypted log data
int send_encrypted_log(secure_connection_t *conn, const void *data, size_t size) {
    return SSL_write(conn->ssl, data, size);
}

Multiple-Socket Outputs

Wazuh’s architecture supports multiple-socket outputs for enhanced reliability:

typedef struct {
    secure_connection_t *primary;
    secure_connection_t *secondary;
    secure_connection_t *backup;
    int active_connection;
} multi_socket_manager_t;

int send_with_failover(multi_socket_manager_t *manager, const void *data, size_t size) {
    secure_connection_t *connections[] = {
        manager->primary,
        manager->secondary,
        manager->backup
    };

    for (int i = 0; i < 3; i++) {
        if (connections[i] && send_encrypted_log(connections[i], data, size) > 0) {
            manager->active_connection = i;
            return 1; // Success
        }
    }

    return 0; // All connections failed
}

4. Reception and Processing at the Manager

Receiving the Data

The manager runs a network listener that accepts incoming connections from agents:

#include <pthread.h>

typedef struct {
    int port;
    SSL_CTX *ssl_ctx;
    pthread_t listener_thread;
    agent_pool_t *agent_pool;
} network_listener_t;

void* connection_handler(void *arg) {
    connection_context_t *ctx = (connection_context_t *)arg;

    char buffer[MAX_BUFFER_SIZE];
    int bytes_received;

    while ((bytes_received = SSL_read(ctx->ssl, buffer, sizeof(buffer))) > 0) {
        // Process received data
        process_agent_data(ctx->agent_id, buffer, bytes_received);
    }

    cleanup_connection(ctx);
    return NULL;
}

// Main listener function
void start_network_listener(network_listener_t *listener) {
    int server_socket = create_server_socket(listener->port);

    while (1) {
        int client_socket = accept(server_socket, NULL, NULL);

        // Create SSL connection
        SSL *ssl = SSL_new(listener->ssl_ctx);
        SSL_set_fd(ssl, client_socket);

        if (SSL_accept(ssl) == 1) {
            // Authenticate agent
            char agent_id[MAX_AGENT_ID];
            if (authenticate_agent(ssl, agent_id)) {
                // Create connection context
                connection_context_t *ctx = create_connection_context(ssl, agent_id);

                // Spawn handler thread
                pthread_create(&ctx->thread, NULL, connection_handler, ctx);
            }
        }
    }
}

Decompression and Parsing

If logs were compressed, the manager uses complementary decompression routines:

// Decompression function using zlib
int decompress_log_data(const char *input, size_t input_size,
                       char *output, size_t *output_size) {

    uLongf decompressed_size = *output_size;
    int result = uncompress((Bytef *)output, &decompressed_size,
                           (const Bytef *)input, input_size);

    if (result == Z_OK) {
        *output_size = decompressed_size;
        return 0; // Success
    }

    return -1; // Error
}

// Process incoming agent data
void process_agent_data(const char *agent_id, const char *data, size_t size) {
    // Check if data is compressed
    compression_header_t *header = (compression_header_t *)data;

    if (header->magic == COMPRESSION_MAGIC) {
        // Decompress data
        char *decompressed_data = malloc(header->original_size);
        size_t decompressed_size = header->original_size;

        if (decompress_log_data(data + sizeof(compression_header_t),
                               header->compressed_size,
                               decompressed_data,
                               &decompressed_size) == 0) {

            // Parse decompressed log
            parse_and_analyze_log(agent_id, decompressed_data, decompressed_size);
        }

        free(decompressed_data);
    } else {
        // Process uncompressed data
        parse_and_analyze_log(agent_id, data, size);
    }
}

Log Analysis and Alerting

The manager applies its log analysis engine using configuration rules:

typedef struct {
    int rule_id;
    char *pattern;
    int severity;
    char *description;
    regex_t compiled_regex;
} analysis_rule_t;

typedef struct {
    analysis_rule_t *rules;
    int rule_count;
    hash_table_t *rule_cache;
} rules_engine_t;

// Analyze log against rules
alert_t* analyze_log_entry(rules_engine_t *engine, log_entry_t *log) {
    for (int i = 0; i < engine->rule_count; i++) {
        analysis_rule_t *rule = &engine->rules[i];

        if (regexec(&rule->compiled_regex, log->raw_log, 0, NULL, 0) == 0) {
            // Rule matched - create alert
            alert_t *alert = create_alert(rule, log);

            // Enrich alert with additional context
            enrich_alert(alert, log);

            return alert;
        }
    }

    return NULL; // No rules matched
}

// Main log processing function
void parse_and_analyze_log(const char *agent_id, const char *log_data, size_t size) {
    log_entry_t *log = parse_log_entry(log_data, size);

    if (log) {
        // Add agent context
        strcpy(log->agent_id, agent_id);

        // Analyze against rules
        alert_t *alert = analyze_log_entry(&global_rules_engine, log);

        if (alert) {
            // Forward alert to output modules
            forward_alert(alert);

            // Store in database/indexer
            store_alert(alert);
        }

        // Always store raw log for forensics
        store_raw_log(log);

        free_log_entry(log);
    }
}

Advanced Features

1. Real-time Analysis Pipeline

sequenceDiagram
    participant Agent
    participant Manager
    participant RulesEngine
    participant Alerting
    participant Indexer

    Agent->>Manager: Send compressed log
    Manager->>Manager: Decompress & parse
    Manager->>RulesEngine: Analyze log
    RulesEngine-->>Manager: Rule match found
    Manager->>Alerting: Generate alert
    Manager->>Indexer: Store log & alert
    Alerting->>Alerting: Send notifications

2. Log Correlation and Intelligence

typedef struct correlation_context {
    hash_table_t *event_cache;
    time_window_t *time_windows;
    correlation_rule_t *rules;
    int rule_count;
} correlation_context_t;

// Advanced correlation analysis
alert_t* correlate_events(correlation_context_t *ctx, log_entry_t *new_log) {
    // Check for related events in time window
    event_list_t *related_events = find_related_events(ctx->event_cache, new_log);

    if (related_events->count > 0) {
        // Apply correlation rules
        for (int i = 0; i < ctx->rule_count; i++) {
            correlation_rule_t *rule = &ctx->rules[i];

            if (evaluate_correlation_rule(rule, related_events, new_log)) {
                // Create correlation alert
                return create_correlation_alert(rule, related_events, new_log);
            }
        }
    }

    // Cache this event for future correlation
    cache_event(ctx->event_cache, new_log);

    return NULL;
}

3. Performance Optimization

// Lock-free ring buffer for high-throughput log processing
// (uses C11 atomics; QUEUE_SIZE is assumed to be a power of two, with mask = QUEUE_SIZE - 1)
#include <stdatomic.h>
#include <stdint.h>

typedef struct {
    atomic_uint head;
    atomic_uint tail;
    log_entry_t *buffer[QUEUE_SIZE];
    uint32_t mask;
} lockfree_queue_t;

// Multi-threaded log processor
typedef struct {
    lockfree_queue_t *input_queue;
    pthread_t *worker_threads;
    int worker_count;
    rules_engine_t *rules_engine;
} log_processor_t;

void* log_worker_thread(void *arg) {
    log_processor_t *processor = (log_processor_t *)arg;
    log_entry_t *log;

    while (1) {
        // Non-blocking dequeue
        if (dequeue(processor->input_queue, &log)) {
            // Process log entry
            alert_t *alert = analyze_log_entry(processor->rules_engine, log);

            if (alert) {
                forward_alert(alert);
            }

            free_log_entry(log);
        } else {
            // No work available, yield CPU
            sched_yield();
        }
    }

    return NULL;
}
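
The dequeue() call in the worker thread is left undefined above. Assuming a single-producer/single-consumer ring buffer, with QUEUE_SIZE a power of two and mask set to QUEUE_SIZE - 1, a minimal pair of queue operations could look like this sketch:

#include <stdatomic.h>
#include <stdbool.h>

// Producer side: called by the thread that receives logs from agents.
bool enqueue(lockfree_queue_t *q, log_entry_t *entry) {
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head - tail >= QUEUE_SIZE) {
        return false;                    // Queue full: caller retries, buffers, or drops
    }

    q->buffer[head & q->mask] = entry;
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

// Consumer side: called by a single worker thread.
bool dequeue(lockfree_queue_t *q, log_entry_t **entry) {
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (tail == head) {
        return false;                    // Queue empty
    }

    *entry = q->buffer[tail & q->mask];
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

With several worker threads consuming from the same queue, as in log_processor_t above, the tail update would need a compare-and-swap loop rather than a plain store.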

Configuration Examples

Agent Configuration

<ossec_config>
  <client>
    <server>
      <address>wazuh-manager.company.com</address>
      <port>1514</port>
      <protocol>tcp</protocol>
    </server>
    <config-profile>centos, centos7</config-profile>
    <notify_time>10</notify_time>
    <time-reconnect>60</time-reconnect>
    <auto_restart>yes</auto_restart>
    <crypto_method>aes</crypto_method>
  </client>

  <client_buffer>
    <disabled>no</disabled>
    <queue_size>5000</queue_size>
    <events_per_second>500</events_per_second>
  </client_buffer>

  <!-- Log file monitoring -->
  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/auth.log</location>
  </localfile>

  <localfile>
    <log_format>syslog</log_format>
    <location>/var/log/syslog</location>
  </localfile>

  <localfile>
    <log_format>apache</log_format>
    <location>/var/log/apache2/access.log</location>
  </localfile>

  <!-- Command monitoring -->
  <localfile>
    <log_format>command</log_format>
    <command>df -P</command>
    <alias>df -P</alias>
    <frequency>360</frequency>
  </localfile>

  <!-- File integrity monitoring -->
  <syscheck>
    <directories>/etc,/usr/bin,/usr/sbin</directories>
    <directories>/bin,/sbin,/boot</directories>
    <ignore>/etc/mtab</ignore>
    <ignore>/etc/hosts.deny</ignore>
    <ignore>/etc/mail/statistics</ignore>
    <ignore>/etc/random-seed</ignore>
    <frequency>79200</frequency>
  </syscheck>
</ossec_config>
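
After editing the agent’s ossec.conf, a restart applies the changes. On a default Linux installation the logcollector normally records an “Analyzing file” line in ossec.log for each monitored path, which makes for a quick sanity check (service name and paths may vary by platform and version):

# Apply the new agent configuration and confirm the monitored files were picked up
sudo systemctl restart wazuh-agent
sudo grep "Analyzing file" /var/ossec/logs/ossec.log | tail -n 5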

Manager Configuration

<ossec_config>
  <global>
    <jsonout_output>yes</jsonout_output>
    <alerts_log>yes</alerts_log>
    <logall>no</logall>
    <logall_json>no</logall_json>
    <email_notification>yes</email_notification>
    <smtp_server>smtp.company.com</smtp_server>
    <email_from>wazuh@company.com</email_from>
    <email_to>security@company.com</email_to>
    <hostname>wazuh-manager</hostname>
    <queue_size>131072</queue_size>
  </global>

  <remote>
    <connection>secure</connection>
    <port>1514</port>
    <protocol>tcp</protocol>
    <queue_size>16384</queue_size>
  </remote>

  <analysisd>
    <memory_size>8192</memory_size>
    <log_fw>yes</log_fw>
    <pre_match>yes</pre_match>
  </analysisd>

  <!-- Logging settings -->
  <logging>
    <log_format>plain</log_format>
  </logging>

  <!-- Integration with indexer -->
  <integration>
    <name>opensearch</name>
    <level>3</level>
    <alert_format>json</alert_format>
    <api_url>https://wazuh-indexer:9200</api_url>
    <api_user>admin</api_user>
    <api_pass>SecurePassword123!</api_pass>
  </integration>
</ossec_config>
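
A similar check applies on the manager side. After a restart, wazuh-control reports whether all manager daemons came back up (assuming the default /var/ossec installation path):

# Restart the manager and verify that its daemons are running
sudo systemctl restart wazuh-manager
sudo /var/ossec/bin/wazuh-control status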

Performance Metrics and Monitoring

Key Performance Indicators

#!/bin/bash
# Wazuh performance monitoring script
# Note: state/stats file locations and daemon names differ between Wazuh versions;
# adjust the paths below to match your installation.

# Agent metrics
echo "=== Agent Performance ==="
echo "Events per second: $(wazuh-logtest -q | grep 'Events processed' | tail -1)"
echo "Queue utilization: $(cat /var/ossec/var/run/ossec-agent.stats | grep queue)"

# Manager metrics
echo "=== Manager Performance ==="
echo "Alerts per second: $(cat /var/ossec/logs/alerts/alerts.log | tail -100 | wc -l)"
echo "Analysis queue: $(cat /var/ossec/var/run/ossec-analysisd.stats | grep queue)"
echo "Remote queue: $(cat /var/ossec/var/run/ossec-remoted.stats | grep queue)"

# Compression metrics
echo "=== Compression Performance ==="
echo "Compression ratio: $(cat /var/ossec/logs/ossec.log | grep 'compression' | tail -1)"
echo "Network bytes saved: $(cat /var/ossec/var/run/compression.stats 2>/dev/null || echo 'N/A')"

Summary

Wazuh’s log collection and transmission process provides:

  1. Comprehensive Data Ingestion: Support for multiple log sources and formats
  2. Efficient Compression: Significant bandwidth reduction using zlib compression
  3. Secure Transmission: TLS-encrypted communication with certificate-based authentication
  4. Robust Processing: Multi-threaded analysis with real-time correlation
  5. High Availability: Multiple-socket outputs and failover mechanisms
  6. Scalable Architecture: Lock-free queues and parallel processing for high throughput

The modular architecture ensures that each component can be optimized independently while maintaining overall system reliability and performance. The optional compression significantly reduces network overhead, making Wazuh suitable for large-scale deployments across WAN connections.

This end-to-end process enables organizations to collect, transmit, and analyze security events in real-time, providing comprehensive visibility into their security posture while maintaining high performance and reliability standards.