Table of Contents
Introduction
BlueChI is a systemd-based multi-node orchestration framework that provides deterministic service control across distributed systems. This guide presents production-ready automation scripts for deploying BlueChI in various configurations, from single-node setups to complex multi-node deployments.
The scripts featured here emphasize security, reliability, and ease of deployment, making them suitable for enterprise environments where predictable behavior and minimal overhead are critical requirements.
Script Overview
We’ll cover three main deployment scenarios:
- All-in-One Installation: Single script for quick deployments
- Multi-Node Installation: Flexible script supporting controller and agent roles
- Production Deployment: Enhanced scripts with monitoring, HA, and security features
All-in-One BlueChI Installation Script
This script installs all BlueChI components on a single node, ideal for development and testing:
#!/bin/bash
# BlueChI All-in-One Installation Script# Supports: Rocky Linux, AlmaLinux, RHEL, Amazon Linux# Version: 1.0
# Secure script setupset -e # Exit immediately if a command exits with a non-zero statusset -u # Treat unset variables as an errorset -o pipefail # Return value of a pipeline is the value of the last command
# Global variablesTEMP_DIRS=()LOG_FILE="/var/log/bluechi_install.log"
# Function for logging with colorslog() { local message="[$(date +'%Y-%m-%d %H:%M:%S')] $1" echo -e "\e[0;32m${message}\e[0m" echo "${message}" >> "$LOG_FILE"}
warn() { local message="[$(date +'%Y-%m-%d %H:%M:%S')] WARNING: $1" echo -e "\e[0;33m${message}\e[0m" echo "${message}" >> "$LOG_FILE"}
# Function for error handlingerror_exit() { local message="[$(date +'%Y-%m-%d %H:%M:%S')] ERROR: $1" echo -e "\e[0;31m${message}\e[0m" echo "${message}" >> "$LOG_FILE" cleanup exit 1}
# Cleanup functioncleanup() { log "Cleaning up temporary files..." for dir in "${TEMP_DIRS[@]}"; do if [ -d "$dir" ]; then rm -rf "$dir" fi done}
# Set trap for cleanuptrap cleanup EXIT INT TERM
# Function to check if running as rootcheck_root() { if [ "$(id -u)" -ne 0 ]; then error_exit "This script must be run as root" fi}
# Function to detect OSdetect_os() { if [ -f /etc/os-release ]; then . /etc/os-release OS=$ID VERSION=$VERSION_ID log "Detected OS: $OS $VERSION" else error_exit "Cannot detect OS, /etc/os-release file not found" fi}
# Function to install wgetinstall_wget() { log "Installing wget..."
case $OS in "rocky" | "almalinux" | "rhel" | "centos") dnf install -y wget || error_exit "Failed to install wget on $OS" ;; "amzn") yum install -y wget || error_exit "Failed to install wget on Amazon Linux" ;; *) error_exit "Unsupported OS: $OS" ;; esac
log "wget installed successfully"}
# Function to handle dependencies for BlueChIinstall_dependencies() { log "Installing dependencies..."
# Dependencies for BlueChI DEPENDENCIES=( "dbus" "systemd" "python3" "dbus-broker" "device-mapper" "device-mapper-libs" "libfdisk" "util-linux" "systemd-pam" )
case $OS in "rocky" | "almalinux" | "rhel" | "centos") for dep in "${DEPENDENCIES[@]}"; do log "Installing dependency: $dep" dnf install -y "$dep" || warn "Could not install $dep, but continuing" done ;; "amzn") for dep in "${DEPENDENCIES[@]}"; do log "Installing dependency: $dep" yum install -y "$dep" || warn "Could not install $dep, but continuing" done ;; *) error_exit "Unsupported OS: $OS" ;; esac
log "Dependencies installation completed"}
# Function to download file with proper security validationsecure_download() { local url=$1 local output_file=$2
# Use wget with proper SSL/TLS security wget --https-only --secure-protocol=TLSv1_2 "$url" -O "$output_file" || return 1
# Verify file exists and is not empty if [ ! -s "$output_file" ]; then return 1 fi
return 0}
# Function to download and install a packagedownload_and_install() { local url=$1 local filename=$(basename "$url") local temp_dir=$(mktemp -d) TEMP_DIRS+=("$temp_dir") local output_file="$temp_dir/$filename"
log "Downloading $filename..." if ! secure_download "$url" "$output_file"; then error_exit "Failed to download $filename securely" fi
log "Installing $filename..." if ! rpm -Uvh --nodeps "$output_file"; then warn "Failed to install $filename with standard options, trying with --force" rpm -Uvh --nodeps --force "$output_file" || error_exit "Failed to install $filename even with --force" fi
log "$filename installed successfully"}
# Function to check if all required commands are availablecheck_requirements() { log "Checking if required commands are available..."
local missing_commands=()
# Commands to check local commands=("rpm")
for cmd in "${commands[@]}"; do if ! command -v "$cmd" &> /dev/null; then missing_commands+=("$cmd") fi done
if [ ${#missing_commands[@]} -gt 0 ]; then error_exit "Required commands not found: ${missing_commands[*]}" fi
log "All required commands are available"}
# Function to verify service statusverify_services() { log "Verifying BlueChI services..."
if systemctl is-active --quiet bluechi-controller; then log "BlueChI controller service is active" else warn "BlueChI controller service is not active" fi
if systemctl is-active --quiet bluechi-agent; then log "BlueChI agent service is active" else warn "BlueChI agent service is not active" fi}
# Main functionmain() { log "Starting BlueChI installation script"
check_root detect_os check_requirements
# Check if wget is already installed, if not install it if ! command -v wget &> /dev/null; then install_wget else log "wget is already installed" fi
# Install dependencies install_dependencies
# BlueChI package URLs PACKAGES=( "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-agent-0.10.2-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-controller-0.10.2-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-ctl-0.10.2-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-is-online-0.10.2-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-selinux-0.10.2-1.el9.noarch.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/p/python3-bluechi-0.10.2-1.el9.noarch.rpm" )
# Install each package for package in "${PACKAGES[@]}"; do download_and_install "$package" done
log "BlueChI packages installed successfully"
# Configure BlueChI for all-in-one setup log "Configuring BlueChI..."
# Create configuration directories mkdir -p /etc/bluechi/controller.conf.d mkdir -p /etc/bluechi/agent.conf.d
# Create controller configuration cat > /etc/bluechi/controller.conf.d/1.conf << EOF[bluechi-controller]ControllerPort=2020AllowedNodeNames=$(hostname)EOF
# Create agent configuration for local connection cat > /etc/bluechi/agent.conf.d/1.conf << EOF[bluechi-agent]ControllerAddress=unix:path=/run/bluechi/bluechi.sockNodeName=$(hostname)EOF
# Enable and start BlueChI services log "Enabling and starting BlueChI services..." systemctl enable --now bluechi-controller bluechi-agent || warn "Failed to enable and start BlueChI services"
# Verify services verify_services
# Display summary log "BlueChI installation and configuration completed" log "Installation log available at: $LOG_FILE"
echo "" echo "========================================" echo "BlueChI Installation Complete!" echo "========================================" echo "" echo "Verify installation with:" echo " bluechictl list-nodes" echo " bluechictl list-units $(hostname)" echo "" echo "Service status:" systemctl status bluechi-controller --no-pager | grep Active systemctl status bluechi-agent --no-pager | grep Active echo ""}
# Execute main functionmain
Multi-Node BlueChI Installation Script
This enhanced script supports both controller and agent installations with flexible configurations:
#!/bin/bash
# BlueChI Multi-Node Installation Script# Supports: Rocky Linux, AlmaLinux, RHEL, Amazon Linux# Version: 2.0
# Secure script setupset -e # Exit immediately if a command exits with a non-zero statusset -u # Treat unset variables as an errorset -o pipefail # Return value of a pipeline is the value of the last command
# Global variablesTEMP_DIRS=()LOG_FILE="/var/log/bluechi_install.log"CONTROLLER_PORT="2020" # Non-privileged port for easier testing/deploymentBLUECHI_VERSION="0.10.2"INSTALL_MODE=""CONTROLLER_IP=""ALLOWED_NODES=""NODE_NAME=""
# Function for logging with colorslog() { local message="[$(date +'%Y-%m-%d %H:%M:%S')] $1" echo -e "\e[0;32m${message}\e[0m" echo "${message}" >> "$LOG_FILE"}
warn() { local message="[$(date +'%Y-%m-%d %H:%M:%S')] WARNING: $1" echo -e "\e[0;33m${message}\e[0m" echo "${message}" >> "$LOG_FILE"}
# Function for error handlingerror_exit() { local message="[$(date +'%Y-%m-%d %H:%M:%S')] ERROR: $1" echo -e "\e[0;31m${message}\e[0m" echo "${message}" >> "$LOG_FILE" cleanup exit 1}
# Cleanup functioncleanup() { log "Cleaning up temporary files..." for dir in "${TEMP_DIRS[@]}"; do if [ -d "$dir" ]; then rm -rf "$dir" fi done}
# Set trap for cleanuptrap cleanup EXIT INT TERM
# Function to check if running as rootcheck_root() { if [ "$(id -u)" -ne 0 ]; then error_exit "This script must be run as root" fi}
# Function to detect OSdetect_os() { if [ -f /etc/os-release ]; then . /etc/os-release OS=$ID VERSION=$VERSION_ID log "Detected OS: $OS $VERSION" else error_exit "Cannot detect OS, /etc/os-release file not found" fi}
# Function to install wgetinstall_wget() { log "Installing wget..."
case $OS in "rocky" | "almalinux" | "rhel" | "centos") dnf install -y wget || error_exit "Failed to install wget on $OS" ;; "amzn") yum install -y wget || error_exit "Failed to install wget on Amazon Linux" ;; *) error_exit "Unsupported OS: $OS" ;; esac
log "wget installed successfully"}
# Function to handle dependencies for BlueChIinstall_dependencies() { log "Installing dependencies..."
DEPENDENCIES=( "dbus" "systemd" "python3" "dbus-broker" "device-mapper" "device-mapper-libs" "libfdisk" "util-linux" "systemd-pam" )
case $OS in "rocky" | "almalinux" | "rhel" | "centos") for dep in "${DEPENDENCIES[@]}"; do log "Installing dependency: $dep" dnf install -y "$dep" || warn "Could not install $dep, but continuing" done ;; "amzn") for dep in "${DEPENDENCIES[@]}"; do log "Installing dependency: $dep" yum install -y "$dep" || warn "Could not install $dep, but continuing" done ;; *) error_exit "Unsupported OS: $OS" ;; esac
log "Dependencies installation completed"}
# Function to download file with proper security validationsecure_download() { local url=$1 local output_file=$2
# Use wget with proper SSL/TLS security wget --https-only --secure-protocol=TLSv1_2 "$url" -O "$output_file" || return 1
# Verify file exists and is not empty if [ ! -s "$output_file" ]; then return 1 fi
return 0}
# Function to download and install a packagedownload_and_install() { local url=$1 local filename=$(basename "$url") local temp_dir=$(mktemp -d) TEMP_DIRS+=("$temp_dir") local output_file="$temp_dir/$filename"
log "Downloading $filename..." if ! secure_download "$url" "$output_file"; then error_exit "Failed to download $filename securely" fi
log "Installing $filename..." if ! rpm -Uvh --nodeps "$output_file"; then warn "Failed to install $filename with standard options, trying with --force" rpm -Uvh --nodeps --force "$output_file" || error_exit "Failed to install $filename even with --force" fi
log "$filename installed successfully"}
# Function to check if all required commands are availablecheck_requirements() { log "Checking if required commands are available..."
local missing_commands=() local commands=("rpm" "hostname" "systemctl")
for cmd in "${commands[@]}"; do if ! command -v "$cmd" &> /dev/null; then missing_commands+=("$cmd") fi done
if [ ${#missing_commands[@]} -gt 0 ]; then error_exit "Required commands not found: ${missing_commands[*]}" fi
log "All required commands are available"}
# Function to ensure config directories existensure_config_dirs() { log "Ensuring configuration directories exist..."
mkdir -p /etc/bluechi/controller.conf.d mkdir -p /etc/bluechi/agent.conf.d
log "Configuration directories created"}
# Function to configure controllerconfigure_controller() { log "Configuring BlueChI controller..."
# Create configuration cat > /etc/bluechi/controller.conf.d/1.conf << EOF[bluechi-controller]ControllerPort=${CONTROLLER_PORT}AllowedNodeNames=${ALLOWED_NODES}LogLevel=infoLogTarget=journaldEOF
log "Controller configuration written to /etc/bluechi/controller.conf.d/1.conf"
# Configure local agent to use Unix Domain Socket cat > /etc/bluechi/agent.conf.d/1.conf << EOF[bluechi-agent]ControllerAddress=unix:path=/run/bluechi/bluechi.sockNodeName=$(hostname)LogLevel=infoLogTarget=journaldEOF
log "Local agent configuration written to /etc/bluechi/agent.conf.d/1.conf"}
# Function to configure agentconfigure_agent() { log "Configuring BlueChI agent..."
# Create configuration cat > /etc/bluechi/agent.conf.d/1.conf << EOF[bluechi-agent]ControllerHost=${CONTROLLER_IP}ControllerPort=${CONTROLLER_PORT}NodeName=${NODE_NAME}LogLevel=infoLogTarget=journaldHeartbeatInterval=5EOF
log "Agent configuration written to /etc/bluechi/agent.conf.d/1.conf"}
# Function to verify service statusverify_services() { log "Verifying BlueChI services..."
if [ "$INSTALL_MODE" = "controller" ]; then if systemctl is-active --quiet bluechi-controller; then log "BlueChI controller service is active" else warn "BlueChI controller service is not active" fi fi
if systemctl is-active --quiet bluechi-agent; then log "BlueChI agent service is active" else warn "BlueChI agent service is not active" fi}
# Function to open firewall port for controllerconfigure_firewall() { log "Configuring firewall for BlueChI..."
if command -v firewall-cmd &> /dev/null; then log "Using firewalld..." if systemctl is-active --quiet firewalld; then log "Opening port ${CONTROLLER_PORT} for BlueChI controller" firewall-cmd --permanent --add-port=${CONTROLLER_PORT}/tcp firewall-cmd --reload else warn "Firewalld is installed but not running, skipping firewall configuration" fi else warn "firewall-cmd not found, unable to configure firewall automatically" log "Please ensure port ${CONTROLLER_PORT} is open in your firewall for BlueChI to function properly" fi}
# Function to install controllerinstall_controller() { log "Installing BlueChI controller components..."
# Controller packages PACKAGES=( "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-agent-${BLUECHI_VERSION}-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-controller-${BLUECHI_VERSION}-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-ctl-${BLUECHI_VERSION}-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-selinux-${BLUECHI_VERSION}-1.el9.noarch.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/p/python3-bluechi-${BLUECHI_VERSION}-1.el9.noarch.rpm" )
# Install each package for package in "${PACKAGES[@]}"; do download_and_install "$package" done
# If online component was requested, install it if [[ "$*" == *"--with-online"* ]]; then download_and_install "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-is-online-${BLUECHI_VERSION}-1.el9.x86_64.rpm" fi
ensure_config_dirs configure_controller
if [[ "$*" == *"--configure-firewall"* ]]; then configure_firewall fi
log "BlueChI controller installation completed"}
# Function to install agentinstall_agent() { log "Installing BlueChI agent components..."
# Agent packages PACKAGES=( "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-agent-${BLUECHI_VERSION}-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-selinux-${BLUECHI_VERSION}-1.el9.noarch.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/p/python3-bluechi-${BLUECHI_VERSION}-1.el9.noarch.rpm" )
# Install each package for package in "${PACKAGES[@]}"; do download_and_install "$package" done
# If online component was requested, install it if [[ "$*" == *"--with-online"* ]]; then download_and_install "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-is-online-${BLUECHI_VERSION}-1.el9.x86_64.rpm" fi
ensure_config_dirs configure_agent
log "BlueChI agent installation completed"}
# Function to start servicesstart_services() { log "Starting BlueChI services..."
if [ "$INSTALL_MODE" = "controller" ]; then log "Starting controller and agent services..." systemctl enable --now bluechi-controller bluechi-agent || warn "Failed to enable and start BlueChI services" else log "Starting agent service..." systemctl enable --now bluechi-agent || warn "Failed to enable and start BlueChI agent service" fi
verify_services}
# Function to show usageusage() { cat << EOFUsage: $0 [OPTIONS]
BlueChI Multi-Node Installation Script
OPTIONS: -m, --mode MODE Installation mode (controller or agent) -i, --controller-ip IP IP address of the controller node (required for agent mode) -n, --node-name NAME Name of this node (required for agent mode) -a, --allowed-nodes NODES Comma-separated list of allowed node names (required for controller mode) -p, --port PORT Controller port (default: 2020) -v, --version VERSION BlueChI version to install (default: ${BLUECHI_VERSION}) --with-online Install bluechi-is-online package --configure-firewall Configure firewall to allow BlueChi traffic (controller only) -h, --help Show this help message and exit
EXAMPLES: # Install controller with local hostname and pi as allowed nodes $0 --mode controller --allowed-nodes \$(hostname),pi
# Install agent to connect to controller at 192.168.1.10 $0 --mode agent --controller-ip 192.168.1.10 --node-name pi
# Install controller with firewall configuration $0 --mode controller --allowed-nodes node1,node2,node3 --configure-firewall
# Install agent with custom port $0 --mode agent --controller-ip 192.168.1.10 --node-name worker1 --port 2021
NOTES: - This script must be run as root - Supported OS: Rocky Linux, AlmaLinux, RHEL, Amazon Linux - Logs are written to: $LOG_FILE - Configuration files are in: /etc/bluechi/
For more information, visit: https://github.com/eclipse-bluechi/bluechiEOF exit 0}
# Parse command line argumentsparse_args() { while [[ $# -gt 0 ]]; do case $1 in -m|--mode) INSTALL_MODE="$2" shift 2 ;; -i|--controller-ip) CONTROLLER_IP="$2" shift 2 ;; -n|--node-name) NODE_NAME="$2" shift 2 ;; -a|--allowed-nodes) ALLOWED_NODES="$2" shift 2 ;; -p|--port) CONTROLLER_PORT="$2" shift 2 ;; -v|--version) BLUECHI_VERSION="$2" shift 2 ;; --with-online) # This is a flag, no value needed shift ;; --configure-firewall) # This is a flag, no value needed shift ;; -h|--help) usage ;; *) warn "Unknown option: $1" usage ;; esac done
# Validate required arguments if [ -z "$INSTALL_MODE" ]; then error_exit "Installation mode (--mode) must be specified" fi
if [ "$INSTALL_MODE" != "controller" ] && [ "$INSTALL_MODE" != "agent" ]; then error_exit "Installation mode must be either 'controller' or 'agent'" fi
if [ "$INSTALL_MODE" = "controller" ] && [ -z "$ALLOWED_NODES" ]; then error_exit "Allowed nodes (--allowed-nodes) must be specified for controller mode" fi
if [ "$INSTALL_MODE" = "agent" ]; then if [ -z "$CONTROLLER_IP" ]; then error_exit "Controller IP (--controller-ip) must be specified for agent mode" fi
if [ -z "$NODE_NAME" ]; then error_exit "Node name (--node-name) must be specified for agent mode" fi fi}
# Main functionmain() { log "Starting BlueChI installation script in $INSTALL_MODE mode"
check_root detect_os check_requirements
# Check if wget is already installed, if not install it if ! command -v wget &> /dev/null; then install_wget else log "wget is already installed" fi
# Install dependencies install_dependencies
# Install either controller or agent based on mode if [ "$INSTALL_MODE" = "controller" ]; then install_controller "$@" else install_agent "$@" fi
# Start services start_services
# Display summary echo "" echo "========================================" echo "BlueChI Installation Complete!" echo "========================================" echo ""
if [ "$INSTALL_MODE" = "controller" ]; then echo "Controller installation summary:" echo " - Listening on port: $CONTROLLER_PORT" echo " - Allowed nodes: $ALLOWED_NODES" echo " - Configuration: /etc/bluechi/controller.conf.d/" echo "" echo "Verify with: bluechictl list-nodes" else echo "Agent installation summary:" echo " - Node name: $NODE_NAME" echo " - Controller: $CONTROLLER_IP:$CONTROLLER_PORT" echo " - Configuration: /etc/bluechi/agent.conf.d/" echo "" echo "Check logs: journalctl -u bluechi-agent -f" fi
echo "" echo "Service status:" if [ "$INSTALL_MODE" = "controller" ]; then systemctl status bluechi-controller --no-pager | grep Active || true fi systemctl status bluechi-agent --no-pager | grep Active || true echo "" echo "Installation log: $LOG_FILE" echo ""}
# Parse arguments firstparse_args "$@"
# Execute main function with all argumentsmain "$@"
Production Deployment Script
This comprehensive script includes monitoring, high availability, and enhanced security features:
#!/bin/bash
# BlueChI Production Deployment Script# Enterprise-ready installation with monitoring and HA support# Version: 3.0
set -euo pipefail
# Configurationreadonly SCRIPT_VERSION="3.0"readonly LOG_DIR="/var/log/bluechi"readonly BACKUP_DIR="/var/backups/bluechi"readonly CONFIG_DIR="/etc/bluechi"readonly MONITORING_DIR="/opt/bluechi-monitoring"readonly HA_CONFIG_DIR="/etc/bluechi-ha"
# Create necessary directoriesmkdir -p "$LOG_DIR" "$BACKUP_DIR" "$CONFIG_DIR" "$MONITORING_DIR" "$HA_CONFIG_DIR"
# Logging setupreadonly LOG_FILE="$LOG_DIR/deployment_$(date +%Y%m%d_%H%M%S).log"exec 1> >(tee -a "$LOG_FILE")exec 2>&1
# Color codesreadonly RED='\033[0;31m'readonly GREEN='\033[0;32m'readonly YELLOW='\033[1;33m'readonly BLUE='\033[0;34m'readonly NC='\033[0m'
# Functionslog() { echo -e "${GREEN}[$(date +'%Y-%m-%d %H:%M:%S')]${NC} $*"}
warn() { echo -e "${YELLOW}[$(date +'%Y-%m-%d %H:%M:%S')] WARNING:${NC} $*"}
error() { echo -e "${RED}[$(date +'%Y-%m-%d %H:%M:%S')] ERROR:${NC} $*" exit 1}
info() { echo -e "${BLUE}[$(date +'%Y-%m-%d %H:%M:%S')] INFO:${NC} $*"}
# Check prerequisitescheck_prerequisites() { log "Checking prerequisites..."
# Check if running as root if [[ $EUID -ne 0 ]]; then error "This script must be run as root" fi
# Check OS if [[ ! -f /etc/os-release ]]; then error "Cannot detect OS version" fi
# Check required tools local required_tools=("systemctl" "firewall-cmd" "sestatus" "python3") for tool in "${required_tools[@]}"; do if ! command -v "$tool" &> /dev/null; then warn "$tool is not installed" fi done
log "Prerequisites check completed"}
# Install BlueChI with enhanced securityinstall_bluechi_secure() { local mode=$1 local controller_ip=${2:-} local node_name=${3:-$(hostname)}
log "Installing BlueChI in $mode mode with enhanced security..."
# Create dedicated user for BlueChI if ! id -u bluechi &>/dev/null; then useradd -r -s /sbin/nologin -d /var/lib/bluechi -m bluechi log "Created bluechi user" fi
# Install packages local packages=( "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-agent-0.10.2-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-selinux-0.10.2-1.el9.noarch.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/p/python3-bluechi-0.10.2-1.el9.noarch.rpm" )
if [[ "$mode" == "controller" ]]; then packages+=( "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-controller-0.10.2-1.el9.x86_64.rpm" "https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/b/bluechi-ctl-0.10.2-1.el9.x86_64.rpm" ) fi
for package in "${packages[@]}"; do local pkg_name=$(basename "$package") log "Downloading $pkg_name..." wget -q --https-only "$package" -O "/tmp/$pkg_name" rpm -Uvh "/tmp/$pkg_name" || warn "Package $pkg_name might already be installed" rm -f "/tmp/$pkg_name" done
# Secure configuration setup_secure_config "$mode" "$controller_ip" "$node_name"
# SELinux configuration setup_selinux
# Firewall configuration setup_firewall "$mode"
log "BlueChI secure installation completed"}
# Setup secure configurationsetup_secure_config() { local mode=$1 local controller_ip=$2 local node_name=$3
log "Setting up secure configuration..."
# Create config directories with proper permissions mkdir -p "$CONFIG_DIR"/{controller.conf.d,agent.conf.d} chown -R bluechi:bluechi "$CONFIG_DIR" chmod 750 "$CONFIG_DIR"
if [[ "$mode" == "controller" ]]; then # Controller configuration with security settings cat > "$CONFIG_DIR/controller.conf.d/security.conf" << EOF[bluechi-controller]ControllerPort=2020AllowedNodeNames=${ALLOWED_NODES:-$node_name}LogLevel=infoLogTarget=journald
# Security settingsMaxConnections=100ConnectionTimeout=30AuthenticationMethod=certificateCertificateFile=/etc/bluechi/certs/controller.crtKeyFile=/etc/bluechi/certs/controller.keyCAFile=/etc/bluechi/certs/ca.crtEOF
# Local agent config cat > "$CONFIG_DIR/agent.conf.d/local.conf" << EOF[bluechi-agent]ControllerAddress=unix:path=/run/bluechi/bluechi.sockNodeName=$node_nameLogLevel=infoLogTarget=journaldEOF else # Agent configuration cat > "$CONFIG_DIR/agent.conf.d/remote.conf" << EOF[bluechi-agent]ControllerHost=$controller_ipControllerPort=2020NodeName=$node_nameLogLevel=infoLogTarget=journaldHeartbeatInterval=5ReconnectInterval=10
# Security settingsCertificateFile=/etc/bluechi/certs/agent.crtKeyFile=/etc/bluechi/certs/agent.keyCAFile=/etc/bluechi/certs/ca.crtEOF fi
# Set proper permissions chmod 640 "$CONFIG_DIR"/*/*.conf chown -R root:bluechi "$CONFIG_DIR"}
# Setup monitoringsetup_monitoring() { log "Setting up monitoring..."
# Create monitoring scripts cat > "$MONITORING_DIR/bluechi-monitor.py" << 'EOF'#!/usr/bin/env python3
import subprocessimport jsonimport timeimport socketfrom datetime import datetime
class BlueChiMonitor: def __init__(self): self.metrics = { 'nodes': {}, 'services': {}, 'errors': [] }
def collect_metrics(self): # Collect node metrics try: result = subprocess.run(['bluechictl', 'list-nodes'], capture_output=True, text=True) if result.returncode == 0: self.parse_nodes(result.stdout) except Exception as e: self.metrics['errors'].append(f"Failed to collect nodes: {e}")
# Collect service metrics for each node for node in self.metrics['nodes']: try: result = subprocess.run(['bluechictl', 'list-units', node], capture_output=True, text=True) if result.returncode == 0: self.parse_services(node, result.stdout) except Exception as e: self.metrics['errors'].append(f"Failed to collect services for {node}: {e}")
def parse_nodes(self, output): lines = output.strip().split('\n')[1:] # Skip header for line in lines: parts = line.split() if len(parts) >= 3: node_name = parts[0] state = parts[1] last_seen = ' '.join(parts[2:]) self.metrics['nodes'][node_name] = { 'state': state, 'last_seen': last_seen, 'timestamp': datetime.now().isoformat() }
def parse_services(self, node, output): if node not in self.metrics['services']: self.metrics['services'][node] = {}
lines = output.strip().split('\n')[1:] # Skip header for line in lines: parts = line.split(None, 3) if len(parts) >= 3: service_name = parts[0] state = parts[1] self.metrics['services'][node][service_name] = { 'state': state, 'timestamp': datetime.now().isoformat() }
def export_prometheus(self): # Export metrics in Prometheus format output = []
# Node metrics for node, data in self.metrics['nodes'].items(): state_value = 1 if data['state'] == 'online' else 0 output.append(f'bluechi_node_online{{node="{node}"}} {state_value}')
# Service metrics for node, services in self.metrics['services'].items(): running = sum(1 for s in services.values() if s['state'] == 'running') failed = sum(1 for s in services.values() if s['state'] == 'failed') output.append(f'bluechi_services_running{{node="{node}"}} {running}') output.append(f'bluechi_services_failed{{node="{node}"}} {failed}')
# Error count output.append(f'bluechi_monitor_errors {len(self.metrics["errors"])}')
return '\n'.join(output)
def serve_metrics(self, port=9101): # Simple HTTP server for Prometheus scraping server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) server_socket.bind(('0.0.0.0', port)) server_socket.listen(5)
print(f"Serving metrics on port {port}")
while True: client_socket, addr = server_socket.accept()
# Collect fresh metrics self.collect_metrics() metrics = self.export_prometheus()
# Send HTTP response response = f"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n{metrics}" client_socket.send(response.encode()) client_socket.close()
if __name__ == '__main__': monitor = BlueChiMonitor() monitor.serve_metrics()EOF
chmod +x "$MONITORING_DIR/bluechi-monitor.py"
# Create systemd service for monitoring cat > /etc/systemd/system/bluechi-monitor.service << EOF[Unit]Description=BlueChI Monitoring ServiceAfter=network.target bluechi-controller.service
[Service]Type=simpleExecStart=$MONITORING_DIR/bluechi-monitor.pyRestart=alwaysUser=bluechiGroup=bluechi
[Install]WantedBy=multi-user.targetEOF
# Create alert rules cat > "$MONITORING_DIR/alerts.yaml" << EOFgroups: - name: bluechi_alerts rules: - alert: BlueChiNodeDown expr: bluechi_node_online == 0 for: 5m labels: severity: critical annotations: summary: "BlueChI node {{ \$labels.node }} is down" description: "Node {{ \$labels.node }} has been offline for more than 5 minutes"
- alert: BlueChiServicesFailed expr: bluechi_services_failed > 0 for: 10m labels: severity: warning annotations: summary: "Failed services on node {{ \$labels.node }}" description: "{{ \$value }} services are in failed state on {{ \$labels.node }}"
- alert: BlueChiMonitorErrors expr: bluechi_monitor_errors > 0 for: 5m labels: severity: warning annotations: summary: "BlueChI monitoring errors detected" description: "The monitoring system has encountered {{ \$value }} errors"EOF
systemctl daemon-reload systemctl enable bluechi-monitor.service
log "Monitoring setup completed"}
# Setup High Availabilitysetup_ha() { local role=$1 # master or backup local vip=$2 local interface=$3
log "Setting up High Availability as $role..."
# Install keepalived dnf install -y keepalived || yum install -y keepalived
# Configure keepalived local priority=100 if [[ "$role" == "backup" ]]; then priority=90 fi
cat > /etc/keepalived/keepalived.conf << EOFglobal_defs { notification_email { admin@example.com } notification_email_from bluechi@$(hostname) smtp_server localhost smtp_connect_timeout 30}
vrrp_script check_bluechi { script "/usr/local/bin/check_bluechi_health.sh" interval 5 weight -10 fall 2 rise 2}
vrrp_instance BLUECHI_VIP { state $role interface $interface virtual_router_id 51 priority $priority advert_int 1 authentication { auth_type PASS auth_pass $(openssl rand -hex 16) } virtual_ipaddress { $vip/24 } track_script { check_bluechi } notify_master "/usr/local/bin/bluechi_ha_notify.sh master" notify_backup "/usr/local/bin/bluechi_ha_notify.sh backup" notify_fault "/usr/local/bin/bluechi_ha_notify.sh fault"}EOF
# Create health check script cat > /usr/local/bin/check_bluechi_health.sh << 'EOF'#!/bin/bash
# Check if BlueChI controller is runningif systemctl is-active --quiet bluechi-controller; then # Check if we can list nodes if bluechictl list-nodes &>/dev/null; then exit 0 fifi
exit 1EOF
chmod +x /usr/local/bin/check_bluechi_health.sh
# Create notification script cat > /usr/local/bin/bluechi_ha_notify.sh << 'EOF'#!/bin/bash
STATE=$1TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
case $STATE in master) logger -t bluechi-ha "[$TIMESTAMP] Became MASTER - starting services" systemctl start bluechi-controller # Notify monitoring system curl -X POST http://monitoring.example.com/api/alerts \ -H "Content-Type: application/json" \ -d "{\"alert\": \"BlueChI HA failover\", \"host\": \"$(hostname)\", \"state\": \"master\"}" ;; backup) logger -t bluechi-ha "[$TIMESTAMP] Became BACKUP - stopping controller" systemctl stop bluechi-controller ;; fault) logger -t bluechi-ha "[$TIMESTAMP] Entered FAULT state" ;;esacEOF
chmod +x /usr/local/bin/bluechi_ha_notify.sh
# Enable and start keepalived systemctl enable --now keepalived
log "High Availability setup completed"}
# Setup backup and recoverysetup_backup() { log "Setting up backup and recovery..."
# Create backup script cat > /usr/local/bin/bluechi_backup.sh << 'EOF'#!/bin/bash
BACKUP_DIR="/var/backups/bluechi"TIMESTAMP=$(date +%Y%m%d_%H%M%S)BACKUP_FILE="$BACKUP_DIR/bluechi_backup_$TIMESTAMP.tar.gz"
# Create backup directorymkdir -p "$BACKUP_DIR"
# Backup componentstar -czf "$BACKUP_FILE" \ /etc/bluechi/ \ /var/lib/bluechi/ \ /etc/systemd/system/bluechi*.service \ /usr/local/bin/bluechi*.sh \ 2>/dev/null
# Backup node statesbluechictl list-nodes > "$BACKUP_DIR/nodes_$TIMESTAMP.txt"for node in $(bluechictl list-nodes | tail -n +2 | awk '{print $1}'); do bluechictl list-units "$node" > "$BACKUP_DIR/units_${node}_$TIMESTAMP.txt"done
# Rotate old backups (keep last 30 days)find "$BACKUP_DIR" -name "bluechi_backup_*.tar.gz" -mtime +30 -deletefind "$BACKUP_DIR" -name "*.txt" -mtime +30 -delete
echo "Backup completed: $BACKUP_FILE"
# Upload to remote storage (optional)# aws s3 cp "$BACKUP_FILE" s3://backup-bucket/bluechi/EOF
chmod +x /usr/local/bin/bluechi_backup.sh
# Create recovery script cat > /usr/local/bin/bluechi_recover.sh << 'EOF'#!/bin/bash
if [ $# -ne 1 ]; then echo "Usage: $0 <backup_file>" exit 1fi
BACKUP_FILE=$1
if [ ! -f "$BACKUP_FILE" ]; then echo "Backup file not found: $BACKUP_FILE" exit 1fi
# Stop servicessystemctl stop bluechi-controller bluechi-agent
# Extract backuptar -xzf "$BACKUP_FILE" -C /
# Restore permissionschown -R root:bluechi /etc/bluechichmod 750 /etc/bluechichmod 640 /etc/bluechi/*/*.conf
# Restart servicessystemctl daemon-reloadsystemctl start bluechi-controller bluechi-agent
echo "Recovery completed from: $BACKUP_FILE"EOF
chmod +x /usr/local/bin/bluechi_recover.sh
# Schedule daily backups cat > /etc/cron.d/bluechi-backup << EOF0 2 * * * root /usr/local/bin/bluechi_backup.sh >> $LOG_DIR/backup.log 2>&1EOF
log "Backup and recovery setup completed"}
# Setup SELinuxsetup_selinux() { log "Configuring SELinux..."
if command -v getenforce &> /dev/null; then if [[ $(getenforce) != "Disabled" ]]; then # Set proper SELinux contexts semanage fcontext -a -t systemd_unit_file_t "/etc/bluechi(/.*)?" restorecon -Rv /etc/bluechi
# Allow BlueChI to bind to port semanage port -a -t bluechi_port_t -p tcp 2020 2>/dev/null || true
log "SELinux configuration completed" else warn "SELinux is disabled" fi else warn "SELinux tools not found" fi}
# Setup firewallsetup_firewall() { local mode=$1
log "Configuring firewall..."
if command -v firewall-cmd &> /dev/null; then if systemctl is-active --quiet firewalld; then if [[ "$mode" == "controller" ]]; then # Open controller port firewall-cmd --permanent --add-port=2020/tcp # Open monitoring port firewall-cmd --permanent --add-port=9101/tcp fi
# Reload firewall firewall-cmd --reload
log "Firewall configuration completed" else warn "Firewalld is not running" fi else warn "firewall-cmd not found" fi}
# Main deployment functiondeploy_production() { local deployment_type=$1
case "$deployment_type" in "controller-ha") log "Deploying BlueChI Controller with HA" check_prerequisites install_bluechi_secure "controller" setup_monitoring setup_ha "master" "192.168.1.100" "eth0" setup_backup ;;
"controller") log "Deploying BlueChI Controller" check_prerequisites install_bluechi_secure "controller" setup_monitoring setup_backup ;;
"agent") log "Deploying BlueChI Agent" check_prerequisites local controller_ip=$2 local node_name=$3 install_bluechi_secure "agent" "$controller_ip" "$node_name" ;;
*) error "Unknown deployment type: $deployment_type" ;; esac
# Start services log "Starting BlueChI services..." if [[ "$deployment_type" == "controller-ha" ]] || [[ "$deployment_type" == "controller" ]]; then systemctl enable --now bluechi-controller bluechi-agent systemctl enable --now bluechi-monitor else systemctl enable --now bluechi-agent fi
# Final verification log "Verifying deployment..." sleep 5
if [[ "$deployment_type" == "controller-ha" ]] || [[ "$deployment_type" == "controller" ]]; then if bluechictl list-nodes &>/dev/null; then info "BlueChI Controller is operational" bluechictl list-nodes else warn "BlueChI Controller verification failed" fi fi
if systemctl is-active --quiet bluechi-agent; then info "BlueChI Agent is running" else warn "BlueChI Agent is not running" fi
log "Production deployment completed successfully" info "Deployment log: $LOG_FILE" info "Configuration: $CONFIG_DIR" info "Monitoring: http://$(hostname):9101/metrics"}
# Usage informationusage() { cat << EOFBlueChI Production Deployment Script v${SCRIPT_VERSION}
Usage: $0 <deployment-type> [options]
Deployment Types: controller Deploy BlueChI controller controller-ha Deploy BlueChI controller with HA agent Deploy BlueChI agent
Options for agent deployment: $0 agent <controller-ip> <node-name>
Examples: # Deploy controller with HA $0 controller-ha
# Deploy simple controller $0 controller
# Deploy agent $0 agent 192.168.1.10 worker-01
Features: - Enhanced security configuration - Monitoring and metrics export - High Availability support - Automated backup and recovery - SELinux and firewall configuration - Production-ready logging
For more information, check the deployment log at: $LOG_FILEEOF exit 0}
# Script entry pointif [[ $# -lt 1 ]]; then usagefi
# Export required variables for controller modeif [[ "$1" == "controller" ]] || [[ "$1" == "controller-ha" ]]; then export ALLOWED_NODES="controller,agent1,agent2,agent3"fi
# Run deploymentdeploy_production "$@"
Security Hardening Script
Additional script for security hardening of BlueChI deployments:
#!/bin/bash
# BlueChI Security Hardening Script# Implements defense-in-depth security measures# Version: 1.0
set -euo pipefail
readonly CERT_DIR="/etc/bluechi/certs"readonly AUDIT_LOG="/var/log/bluechi/audit.log"
# Create certificate infrastructuresetup_pki() { log "Setting up PKI infrastructure..."
mkdir -p "$CERT_DIR" chmod 700 "$CERT_DIR"
# Generate CA openssl req -x509 -newkey rsa:4096 -days 3650 -nodes \ -keyout "$CERT_DIR/ca.key" \ -out "$CERT_DIR/ca.crt" \ -subj "/C=US/ST=State/L=City/O=Organization/CN=BlueChI CA"
# Generate controller certificate openssl req -newkey rsa:4096 -nodes \ -keyout "$CERT_DIR/controller.key" \ -out "$CERT_DIR/controller.csr" \ -subj "/C=US/ST=State/L=City/O=Organization/CN=bluechi-controller"
openssl x509 -req -in "$CERT_DIR/controller.csr" \ -CA "$CERT_DIR/ca.crt" -CAkey "$CERT_DIR/ca.key" \ -CAcreateserial -out "$CERT_DIR/controller.crt" \ -days 365 -sha256
# Set permissions chmod 600 "$CERT_DIR"/*.key chmod 644 "$CERT_DIR"/*.crt chown -R root:bluechi "$CERT_DIR"
log "PKI infrastructure setup completed"}
# Configure audit loggingsetup_audit() { log "Configuring audit logging..."
# Create audit rules cat > /etc/audit/rules.d/bluechi.rules << 'EOF'# BlueChI audit rules-w /usr/bin/bluechictl -p x -k bluechi_commands-w /etc/bluechi/ -p wa -k bluechi_config-w /var/lib/bluechi/ -p wa -k bluechi_data-w /etc/systemd/system/bluechi*.service -p wa -k bluechi_service
# Monitor D-Bus calls-a always,exit -F arch=b64 -S connect -F a0=1 -F path=/var/run/dbus/system_bus_socket -k bluechi_dbusEOF
# Reload audit rules augenrules --load systemctl restart auditd
# Create audit report script cat > /usr/local/bin/bluechi_audit_report.sh << 'EOF'#!/bin/bash
echo "BlueChI Security Audit Report - $(date)"echo "========================================="echo ""
echo "Recent BlueChI Commands:"ausearch -k bluechi_commands -ts recent --raw | aureport -x --summaryecho ""
echo "Configuration Changes:"ausearch -k bluechi_config -ts recent --raw | aureport -f --summaryecho ""
echo "Service Modifications:"ausearch -k bluechi_service -ts recent --raw | aureport -f --summaryecho ""
echo "Failed Authentication Attempts:"journalctl -u bluechi-controller --since "1 day ago" | grep -i "auth.*fail" || echo "None found"echo ""
echo "Suspicious Activities:"ausearch -k bluechi_dbus -ts recent --raw | aureport --summaryEOF
chmod +x /usr/local/bin/bluechi_audit_report.sh
log "Audit logging configuration completed"}
# Harden system configurationharden_system() { log "Hardening system configuration..."
# Kernel parameters for security cat >> /etc/sysctl.d/99-bluechi-security.conf << 'EOF'# BlueChI Security Hardeningnet.ipv4.tcp_syncookies = 1net.ipv4.conf.all.rp_filter = 1net.ipv4.conf.default.rp_filter = 1net.ipv4.icmp_echo_ignore_broadcasts = 1net.ipv4.conf.all.accept_source_route = 0net.ipv6.conf.all.accept_source_route = 0net.ipv4.conf.all.send_redirects = 0net.ipv4.conf.all.accept_redirects = 0net.ipv6.conf.all.accept_redirects = 0net.ipv4.conf.all.log_martians = 1kernel.randomize_va_space = 2kernel.yama.ptrace_scope = 1EOF
sysctl -p /etc/sysctl.d/99-bluechi-security.conf
# Restrict core dumps echo "* hard core 0" >> /etc/security/limits.conf
# Configure fail2ban for BlueChI if command -v fail2ban-client &> /dev/null; then cat > /etc/fail2ban/jail.d/bluechi.conf << 'EOF'[bluechi]enabled = trueport = 2020filter = bluechilogpath = /var/log/bluechi/*.logmaxretry = 5bantime = 3600EOF
cat > /etc/fail2ban/filter.d/bluechi.conf << 'EOF'[Definition]failregex = Authentication failed for .* from <HOST> Connection refused from <HOST> Invalid certificate from <HOST>ignoreregex =EOF
systemctl restart fail2ban fi
log "System hardening completed"}
# Main executionmain() { log "Starting BlueChI security hardening..."
setup_pki setup_audit harden_system
log "Security hardening completed successfully"}
# Run main functionmain "$@"
Usage Examples
Quick Development Setup
# Single node for testingcurl -sSL https://raw.githubusercontent.com/example/bluechi-scripts/main/install-all-in-one.sh | sudo bash
Production Controller Deployment
# Download scriptwget https://raw.githubusercontent.com/example/bluechi-scripts/main/install-production.shchmod +x install-production.sh
# Deploy controller with HAsudo ./install-production.sh controller-ha
# Deploy simple controllersudo ./install-production.sh controller
Production Agent Deployment
# Deploy agent connected to controllersudo ./install-production.sh agent 192.168.1.100 worker-01
Multi-Node Setup Example
# On controller nodesudo ./install-multi-node.sh \ --mode controller \ --allowed-nodes controller,worker1,worker2,worker3 \ --configure-firewall
# On worker nodessudo ./install-multi-node.sh \ --mode agent \ --controller-ip 192.168.1.100 \ --node-name worker1
Post-Installation Verification
After installation, verify your BlueChI deployment:
# Check servicessystemctl status bluechi-controller bluechi-agent
# List connected nodesbluechictl list-nodes
# Check node connectivityfor node in $(bluechictl list-nodes | tail -n +2 | awk '{print $1}'); do echo "Node: $node" bluechictl list-units $node | head -5done
# Verify monitoring (if enabled)curl -s http://localhost:9101/metrics | grep bluechi
# Check logsjournalctl -u bluechi-controller -u bluechi-agent --since "10 minutes ago"
# Test service controlbluechictl start <node> <service>bluechictl status <node> <service>
Troubleshooting
Common issues and solutions:
Installation Failures
# Check installation logtail -f /var/log/bluechi_install.log
# Verify package installationrpm -qa | grep bluechi
# Check for conflictsrpm -Va | grep bluechi
Connection Issues
# Test network connectivitytelnet controller-ip 2020
# Check firewallfirewall-cmd --list-all
# Verify certificates (if using TLS)openssl s_client -connect controller-ip:2020 -CAfile /etc/bluechi/certs/ca.crt
Service Control Issues
# Check D-Busbusctl status
# Verify systemd integrationsystemctl show bluechi-agent | grep -E "User|Group|PrivateDevices"
# Test local controlsystemctl status httpdbluechictl status $(hostname) httpd.service
Best Practices
-
Security First
- Always use TLS in production
- Implement proper firewall rules
- Enable SELinux where possible
- Regular security audits
-
Monitoring
- Deploy monitoring from day one
- Set up alerting for critical events
- Regular health checks
- Capacity planning
-
Backup and Recovery
- Automate daily backups
- Test recovery procedures
- Document recovery steps
- Off-site backup storage
-
Documentation
- Keep deployment records
- Document custom configurations
- Maintain runbooks
- Update as needed
Conclusion
These automation scripts provide a comprehensive solution for deploying BlueChI in various environments, from development to production. The scripts emphasize:
- Security: Multiple layers of security controls
- Reliability: Health checks and monitoring
- Scalability: Support for multi-node deployments
- Maintainability: Automated backup and recovery
By using these scripts, organizations can quickly deploy a production-ready BlueChI infrastructure that meets enterprise requirements for security, monitoring, and high availability. The modular approach allows for customization based on specific needs while maintaining best practices for system orchestration.