Single Node CoreOS Kubernetes: Complete Setup Guide
This comprehensive guide walks you through setting up a production-ready single-node Kubernetes cluster on CoreOS. Learn to configure networking, storage, security, and monitoring for a robust container orchestration platform suitable for development, testing, and small-scale production workloads.
Introduction to Single Node Kubernetes
Single-node Kubernetes clusters offer several advantages for specific use cases:
- Development Environment: Complete Kubernetes API for application development
- Edge Computing: Lightweight orchestration for edge deployments
- Learning Platform: Full Kubernetes features for education and training
- Small Workloads: Cost-effective solution for lightweight applications
- CI/CD Pipeline: Dedicated cluster for testing and deployment automation
CoreOS Advantages for Kubernetes
CoreOS provides an optimal foundation for Kubernetes:
- Container-Optimized: Minimal OS designed specifically for containers
- Automatic Updates: Seamless OS updates without service disruption
- Immutable Infrastructure: Read-only root filesystem for enhanced security
- Systemd Integration: Native process management and service orchestration
- etcd Built-in: Distributed key-value store for Kubernetes state management
Prerequisites and System Requirements
Hardware Requirements
Minimum specifications:
- CPU: 2 cores (4 cores recommended)
- RAM: 4GB (8GB recommended)
- Storage: 20GB SSD (50GB recommended)
- Network: Stable internet connection for image downloads
Production specifications:
- CPU: 4+ cores with virtualization support
- RAM: 16GB+ for application workloads
- Storage: 100GB+ NVMe SSD with backup storage
- Network: High-bandwidth connection with static IP
Software Prerequisites
# Check system requirements
lscpu | grep -E "(Architecture|CPU|Thread|Core)"
free -h
df -h
ip addr show

# Verify virtualization support
grep -E "(vmx|svm)" /proc/cpuinfo
CoreOS Installation and Setup
Initial CoreOS Configuration
# config.bu - Butane configuration (transpiled to Ignition with the butane tool)
variant: fcos
version: 1.4.0
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-rsa AAAAB3NzaC1yc2EAAAA...  # Your SSH public key
      groups:
        - sudo
        - docker
      shell: /bin/bash
systemd:
  units:
    - name: docker.service
      enabled: true
    - name: kubelet.service
      enabled: true
    - name: k8s-setup.service
      enabled: true
      contents: |
        [Unit]
        Description=Kubernetes Setup Service
        After=docker.service
        Requires=docker.service

        [Service]
        Type=oneshot
        ExecStart=/usr/local/bin/setup-kubernetes.sh
        RemainAfterExit=yes

        [Install]
        WantedBy=multi-user.target

storage:
  directories:
    - path: /opt/kubernetes
      mode: 0755
    - path: /var/lib/etcd
      mode: 0700
    - path: /etc/kubernetes
      mode: 0755
    - path: /var/log/pods
      mode: 0755
  files:
    - path: /usr/local/bin/setup-kubernetes.sh
      mode: 0755
      contents:
        inline: |
          #!/bin/bash
          set -euxo pipefail

          # Install kubeadm, kubelet, kubectl
          curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
          echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
          apt-get update
          apt-get install -y kubelet kubeadm kubectl
          apt-mark hold kubelet kubeadm kubectl
          # Configure kubelet (--container-runtime was removed from recent kubelets;
          # the CRI socket is set in the kubeadm configuration below)
          echo 'KUBELET_EXTRA_ARGS="--fail-swap-on=false"' > /etc/default/kubelet
          systemctl daemon-reload
          systemctl restart kubelet
    - path: /etc/kubernetes/kubeadm-config.yaml
      mode: 0644
      contents:
        inline: |
          apiVersion: kubeadm.k8s.io/v1beta3
          kind: InitConfiguration
          localAPIEndpoint:
            advertiseAddress: "0.0.0.0"
            bindPort: 6443
          nodeRegistration:
            # dockershim was removed in Kubernetes 1.24; cri-dockerd provides the CRI socket for Docker Engine
            criSocket: "unix:///var/run/cri-dockerd.sock"
            kubeletExtraArgs:
              fail-swap-on: "false"
          ---
          apiVersion: kubeadm.k8s.io/v1beta3
          kind: ClusterConfiguration
          kubernetesVersion: "v1.28.0"
          controlPlaneEndpoint: "127.0.0.1:6443"
          networking:
            serviceSubnet: "10.96.0.0/12"
            podSubnet: "10.244.0.0/16"
            dnsDomain: "cluster.local"
          etcd:
            local:
              dataDir: "/var/lib/etcd"
          apiServer:
            extraArgs:
              enable-admission-plugins: "NodeRestriction,ResourceQuota,LimitRanger"
          controllerManager:
            extraArgs:
              bind-address: "0.0.0.0"
          scheduler:
            extraArgs:
              bind-address: "0.0.0.0"
          ---
          apiVersion: kubelet.config.k8s.io/v1beta1
          kind: KubeletConfiguration
          failSwapOn: false
          containerRuntimeEndpoint: "unix:///var/run/cri-dockerd.sock"
CoreOS Installation Process
# Download the CoreOS live ISO
curl -LO https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/38.20230918.3.0/x86_64/fedora-coreos-38.20230918.3.0-live.x86_64.iso

# Create a bootable USB (replace /dev/sdX with your USB device)
sudo dd if=fedora-coreos-38.20230918.3.0-live.x86_64.iso of=/dev/sdX bs=4M status=progress oflag=sync

# Transpile the Butane config into an Ignition file
butane --pretty --strict config.bu > config.ign

# Boot from the USB and install to the target disk
sudo coreos-installer install /dev/sda --ignition-file config.ign
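If you would rather not type the install command by hand on the live system, recent coreos-installer releases can bake the target device and Ignition config into a customized ISO. Treat this as an optional sketch; the exact flags can vary between releases.

# Optional: build an ISO that installs to /dev/sda unattended and applies config.ign to the installed system
coreos-installer iso customize \
    --dest-device /dev/sda \
    --dest-ignition config.ign \
    -o fcos-autoinstall.iso \
    fedora-coreos-38.20230918.3.0-live.x86_64.iso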
Kubernetes Installation and Initialization
Automated Installation Script
#!/bin/bash
set -euo pipefail

KUBERNETES_VERSION="1.28.0"
CNI_PLUGIN="flannel"
LOG_FILE="/var/log/k8s-setup.log"

# Logging function
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Error handling
trap 'log "ERROR: Script failed at line $LINENO"' ERR

log "Starting Kubernetes single-node setup"

# System preparation
prepare_system() {
    log "Preparing system for Kubernetes"

    # Disable swap
    swapoff -a
    sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

    # Load required kernel modules
    modprobe br_netfilter
    modprobe ip_vs
    modprobe ip_vs_rr
    modprobe ip_vs_wrr
    modprobe ip_vs_sh
    modprobe nf_conntrack

    # Make modules persistent
    cat > /etc/modules-load.d/k8s.conf << EOF
br_netfilter
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF

    # Configure sysctl settings
    cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF

    sysctl --system

    log "System preparation completed"
}
# Install Docker
install_docker() {
    log "Installing Docker"

    # Install Docker from the official repository
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
    add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
    apt-get update
    apt-get install -y docker-ce docker-ce-cli containerd.io

    # Configure the Docker daemon
    mkdir -p /etc/docker
    cat > /etc/docker/daemon.json << EOF
{
    "exec-opts": ["native.cgroupdriver=systemd"],
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "100m",
        "max-file": "3"
    },
    "storage-driver": "overlay2",
    "storage-opts": [
        "overlay2.override_kernel_check=true"
    ],
    "registry-mirrors": ["https://mirror.gcr.io"],
    "insecure-registries": ["localhost:5000"]
}
EOF

    systemctl daemon-reload
    systemctl enable docker
    systemctl start docker

    # Add the core user to the docker group
    usermod -aG docker core

    log "Docker installation completed"
}

# Install Kubernetes components
install_kubernetes() {
    log "Installing Kubernetes components"

    # Add the Kubernetes repository
    curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
    echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list

    apt-get update
    apt-get install -y kubelet=$KUBERNETES_VERSION-00 kubeadm=$KUBERNETES_VERSION-00 kubectl=$KUBERNETES_VERSION-00
    apt-mark hold kubelet kubeadm kubectl
    # Configure kubelet (the --container-runtime flag was removed in recent kubelet releases;
    # the CRI socket is supplied through the kubeadm configuration instead)
    cat > /etc/default/kubelet << EOF
KUBELET_EXTRA_ARGS="--fail-swap-on=false --cgroup-driver=systemd"
EOF
    systemctl daemon-reload
    systemctl enable kubelet

    log "Kubernetes components installed"
}

# Initialize the Kubernetes cluster
initialize_cluster() {
    log "Initializing Kubernetes cluster"

    # Initialize the cluster with kubeadm
    kubeadm init \
        --config=/etc/kubernetes/kubeadm-config.yaml \
        --upload-certs \
        --v=5 2>&1 | tee -a "$LOG_FILE"

    # Set up kubectl for the core user
    mkdir -p /home/core/.kube
    cp -i /etc/kubernetes/admin.conf /home/core/.kube/config
    chown core:core /home/core/.kube/config

    # Set up kubectl for root
    export KUBECONFIG=/etc/kubernetes/admin.conf
    echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /root/.bashrc

    log "Cluster initialization completed"
}

# Remove taints from the control-plane node (single-node setup)
configure_single_node() {
    log "Configuring single-node cluster"

    # Remove control-plane taints so workloads can be scheduled
    kubectl taint nodes --all node-role.kubernetes.io/control-plane- || true
    kubectl taint nodes --all node-role.kubernetes.io/master- || true

    log "Single-node configuration completed"
}

# Install the CNI plugin
install_cni() {
    log "Installing CNI plugin: $CNI_PLUGIN"

    case $CNI_PLUGIN in
        "flannel")
            kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
            ;;
        "calico")
            kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml
            kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/custom-resources.yaml
            ;;
        "weave")
            kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
            ;;
    esac

    log "CNI plugin installation completed"
}

# Verify the cluster setup
verify_cluster() {
    log "Verifying cluster setup"
    # Wait for the node to become Ready (word match so NotReady does not count)
    timeout=300
    while [[ $timeout -gt 0 ]]; do
        if kubectl get nodes | grep -qw "Ready"; then
            break
        fi
        sleep 10
        ((timeout -= 10)) || true
    done
    # Display cluster information
    kubectl cluster-info
    kubectl get nodes -o wide
    kubectl get pods --all-namespaces

    log "Cluster verification completed"
}

# Main execution
main() {
    log "Starting Kubernetes single-node cluster setup"

    prepare_system
    install_docker
    install_kubernetes
    initialize_cluster
    configure_single_node
    install_cni
    verify_cluster

    log "Kubernetes single-node cluster setup completed successfully"
    log "Run 'kubectl get nodes' to verify cluster status"
    log "Run 'kubectl get pods --all-namespaces' to see system pods"
}

# Execute main function
main "$@"
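Assuming the script above is saved as setup-kubernetes.sh (the Ignition unit expects it at /usr/local/bin/setup-kubernetes.sh), a typical run and first check look like this:

# Run as root; progress is mirrored to the log file defined in the script
chmod +x setup-kubernetes.sh
sudo ./setup-kubernetes.sh

# Follow the log from a second terminal
tail -f /var/log/k8s-setup.log

# When the script finishes, the single node should report Ready
kubectl get nodes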
Networking Configuration
Flannel CNI Setup
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
        - operator: Exists
          effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
        - name: install-cni-plugin
          image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
          command:
            - cp
          args:
            - -f
            - /flannel
            - /opt/cni/bin/flannel
          volumeMounts:
            - name: cni-plugin
              mountPath: /opt/cni/bin
        - name: install-cni
          image: rancher/mirrored-flannelcni-flannel:v0.19.2
          command:
            - cp
          args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
          volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
      containers:
        - name: kube-flannel
          image: rancher/mirrored-flannelcni-flannel:v0.19.2
          command:
            - /opt/bin/flanneld
          args:
            - --ip-masq
            - --kube-subnet-mgr
          resources:
            requests:
              cpu: "100m"
              memory: "50Mi"
            limits:
              cpu: "100m"
              memory: "50Mi"
          securityContext:
            privileged: false
            capabilities:
              add: ["NET_ADMIN", "NET_RAW"]
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: EVENT_QUEUE_DEPTH
              value: "5000"
          volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
            - name: xtables-lock
              mountPath: /run/xtables.lock
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni-plugin
          hostPath:
            path: /opt/cni/bin
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
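Note that the DaemonSet references a flannel ServiceAccount that this manifest does not define, so it must already exist (the upstream kube-flannel.yml used by the setup script creates it along with its RBAC). After applying the manifest, saved here as kube-flannel.yaml for the sake of the example, confirm the rollout and the node's pod CIDR:

kubectl apply -f kube-flannel.yaml
kubectl -n kube-system rollout status daemonset/kube-flannel-ds
kubectl -n kube-system get pods -l app=flannel -o wide

# The node should advertise a pod CIDR carved out of 10.244.0.0/16
kubectl get node -o jsonpath='{.items[0].spec.podCIDR}{"\n"}'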
Network Policy Implementation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: default
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: default
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to: []
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-kube-system
  namespace: default
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: kube-system
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
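A quick way to confirm the policies behave as intended; the pod and file names here are illustrative, and note that Flannel must be running for policies to be enforced by a policy-capable CNI or companion such as Calico:

kubectl apply -f network-policies.yaml

# DNS should still resolve thanks to the allow-dns policy
kubectl run np-dns --image=busybox --rm -it --restart=Never -- nslookup kubernetes.default

# Other egress from the default namespace should now be blocked by default-deny-all
kubectl run np-egress --image=busybox --rm -it --restart=Never -- wget -T 5 -qO- http://example.com \
    || echo "egress blocked as expected"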
Storage Configuration
Local Storage Setup
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-1
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /opt/local-storage/pv1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - coreos-single-node
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-2
spec:
  capacity:
    storage: 20Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /opt/local-storage/pv2
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - coreos-single-node
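Because the class uses WaitForFirstConsumer, a claim against it stays Pending until a consuming pod is scheduled. A minimal sketch, with an illustrative claim name:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-local-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 5Gi
EOF

# Pending is expected here until a pod references the claim
kubectl get pvc demo-local-claim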
Dynamic Storage Provisioning
#!/bin/bash
# Create directories for local storage
mkdir -p /opt/local-storage/{pv1,pv2,pv3,pv4,pv5}

# Set proper permissions
chmod 755 /opt/local-storage/*
chown -R root:root /opt/local-storage

# Install local-path-provisioner for dynamic provisioning
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.24/deploy/local-path-storage.yaml

# Set local-path as the default storage class
kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# Verify storage setup
kubectl get storageclass
kubectl get pv
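To confirm the provisioner is actually running and the default-class annotation took effect (the local-path-storage namespace is created by the upstream manifest):

kubectl -n local-path-storage get pods
kubectl get storageclass local-path \
    -o jsonpath='{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}'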
Security Hardening
RBAC Configuration
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-readonly
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: dashboard-readonly
rules:
  - apiGroups: [""]
    resources: ["*"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps", "extensions"]
    resources: ["*"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-readonly
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: dashboard-readonly
subjects:
  - kind: ServiceAccount
    name: dashboard-readonly
    namespace: kube-system
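With these accounts in place, you can mint a short-lived token for the admin account and confirm that the read-only role really is read-only; kubectl create token requires Kubernetes 1.24 or newer:

# Short-lived bearer token for the admin-user ServiceAccount
kubectl -n kube-system create token admin-user

# Verify what dashboard-readonly may and may not do
kubectl auth can-i list pods --as=system:serviceaccount:kube-system:dashboard-readonly
kubectl auth can-i delete pods --as=system:serviceaccount:kube-system:dashboard-readonly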
Pod Security Standards
apiVersion: v1
kind: Namespace
metadata:
  name: secure-namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits
  namespace: secure-namespace
spec:
  limits:
    - default:
        cpu: "200m"
        memory: "256Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      type: Container
    - max:
        cpu: "1"
        memory: "1Gi"
      min:
        cpu: "50m"
        memory: "64Mi"
      type: Container
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: resource-quota
  namespace: secure-namespace
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "10"
    services: "5"
    persistentvolumeclaims: "3"
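A pod that satisfies the restricted profile needs a fairly strict securityContext. A minimal sketch that should be admitted into secure-namespace; the image and names are illustrative:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pss-demo
  namespace: secure-namespace
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
        seccompProfile:
          type: RuntimeDefault
EOF

# The LimitRange fills in default CPU/memory requests and limits
kubectl -n secure-namespace get pod pss-demo -o jsonpath='{.spec.containers[0].resources}{"\n"}'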
Security Policies
#!/bin/bash
# Enable audit logging
mkdir -p /var/log/kubernetes

cat > /etc/kubernetes/audit-policy.yaml << EOF
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
- level: RequestResponse
  resources:
  - group: ""
    resources: ["pods", "services"]
- level: Request
  namespaces: ["kube-system"]
EOF
# Add audit flags to the kube-apiserver static pod manifest.
# Note: PodSecurityPolicy was removed in Kubernetes 1.25, so it is no longer appended to the admission
# plugins; the Pod Security Standards labels configured earlier replace it. For the audit flags to take
# effect, the policy file and log directory must also be mounted into the kube-apiserver pod as hostPath volumes.
sed -i '/--enable-admission-plugins/a\    - --audit-log-path=/var/log/kubernetes/audit.log' /etc/kubernetes/manifests/kube-apiserver.yaml
sed -i '/--audit-log-path/a\    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml' /etc/kubernetes/manifests/kube-apiserver.yaml
sed -i '/--audit-policy-file/a\    - --audit-log-maxage=30' /etc/kubernetes/manifests/kube-apiserver.yaml
sed -i '/--audit-log-maxage/a\    - --audit-log-maxbackup=3' /etc/kubernetes/manifests/kube-apiserver.yaml
sed -i '/--audit-log-maxbackup/a\    - --audit-log-maxsize=100' /etc/kubernetes/manifests/kube-apiserver.yaml

# The kubelet detects static pod manifest changes automatically; restarting it forces an immediate reload
systemctl restart kubelet
echo "Security hardening applied. Monitor /var/log/kubernetes/audit.log for security events."
Monitoring and Observability
Prometheus and Grafana Setup
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:v2.45.0
          ports:
            - containerPort: 9090
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
            - "--web.console.libraries=/etc/prometheus/console_libraries"
            - "--web.console.templates=/etc/prometheus/consoles"
            - "--storage.tsdb.retention.time=200h"
            - "--web.enable-lifecycle"
          volumeMounts:
            - name: prometheus-config
              mountPath: /etc/prometheus/
            - name: prometheus-storage
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config
          configMap:
            name: prometheus-config
        - name: prometheus-storage
          emptyDir: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    rule_files:
      - "first_rules.yml"

    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https

      - job_name: 'kubernetes-nodes'
        kubernetes_sd_configs:
          - role: node
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics

      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  type: NodePort
  ports:
    - port: 9090
      targetPort: 9090
      nodePort: 30090
  selector:
    app: prometheus
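The kubernetes-apiservers, kubernetes-nodes, and kubernetes-pods jobs authenticate with the pod's in-cluster service-account token, so Prometheus needs RBAC permissions that the Deployment above does not grant. A minimal sketch; the names are illustrative, and the Deployment would additionally need spec.template.spec.serviceAccountName set to prometheus:

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""]
    resources: ["nodes", "nodes/proxy", "nodes/metrics", "services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: monitoring
EOF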
Node Exporter Deployment
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9100"
    spec:
      hostPID: true
      hostIPC: true
      hostNetwork: true
      containers:
        - name: node-exporter
          image: prom/node-exporter:v1.6.0
          ports:
            - containerPort: 9100
          args:
            - "--path.sysfs=/host/sys"
            - "--path.rootfs=/host/root"
            - "--no-collector.wifi"
            - "--no-collector.hwmon"
            # node_exporter v1.3+ renamed the ignored-mount-points/ignored-fs-types flags
            - "--collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+)($|/)"
            - "--collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$"
          resources:
            requests:
              memory: 30Mi
              cpu: 100m
            limits:
              memory: 50Mi
              cpu: 200m
          volumeMounts:
            - name: dev
              mountPath: /host/dev
            - name: proc
              mountPath: /host/proc
            - name: sys
              mountPath: /host/sys
            - name: rootfs
              mountPath: /host/root
      tolerations:
        - operator: Exists
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: dev
          hostPath:
            path: /dev
        - name: sys
          hostPath:
            path: /sys
        - name: rootfs
          hostPath:
            path: /
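With Prometheus exposed on NodePort 30090 (or reachable through a port-forward), you can check that the node-exporter target is actually being scraped; these commands assume they are run on a machine with kubectl and curl access to the cluster:

# Port-forward in the background, then query the Prometheus targets API
kubectl -n monitoring port-forward svc/prometheus 9090:9090 &
sleep 2
curl -s http://localhost:9090/api/v1/targets | grep -c '"health":"up"'

# Query a node metric to confirm end-to-end collection
curl -s 'http://localhost:9090/api/v1/query?query=node_load1'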
Troubleshooting and Maintenance
Common Issues and Solutions
#!/bin/bash
# Comprehensive troubleshooting script for single-node Kubernetes

# Check cluster health
check_cluster_health() {
    echo "=== Cluster Health Check ==="

    echo "Node Status:"
    kubectl get nodes -o wide

    echo "System Pods Status:"
    kubectl get pods -n kube-system

    echo "API Server Health:"
    kubectl get --raw='/readyz'

    echo "etcd Health:"
    kubectl get pods -n kube-system -l component=etcd

    echo "Component Status:"
    kubectl get componentstatuses
}

# Check resource usage
check_resources() {
    echo "=== Resource Usage ==="

    echo "Node Resource Usage:"
    kubectl top nodes

    echo "Pod Resource Usage:"
    kubectl top pods --all-namespaces

    echo "Disk Usage:"
    df -h

    echo "Memory Usage:"
    free -h

    echo "Docker Images:"
    docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"
}

# Check networking
check_networking() {
    echo "=== Networking Check ==="

    echo "Pod Network Status:"
    kubectl get pods -n kube-system -l app=flannel

    echo "Service Status:"
    kubectl get svc --all-namespaces

    echo "Network Policies:"
    kubectl get networkpolicies --all-namespaces

    echo "DNS Test:"
    kubectl run test-dns --image=busybox --rm -it --restart=Never -- nslookup kubernetes.default
}

# Check logs
check_logs() {
    echo "=== System Logs ==="

    echo "Kubelet Logs:"
    journalctl -u kubelet --no-pager -n 20

    echo "Docker Logs:"
    journalctl -u docker --no-pager -n 10

    echo "Failed Pods:"
    kubectl get pods --all-namespaces --field-selector=status.phase=Failed
}

# Clean up resources
cleanup_resources() {
    echo "=== Cleanup Resources ==="

    echo "Removing failed pods:"
    kubectl delete pods --all-namespaces --field-selector=status.phase=Failed

    echo "Cleaning Docker:"
    docker system prune -f

    echo "Cleaning unused images:"
    docker image prune -f

    echo "Restart kubelet if needed:"
    read -p "Restart kubelet? (y/N): " restart_kubelet
    if [[ $restart_kubelet == "y" ]]; then
        systemctl restart kubelet
    fi
}

# Performance optimization
optimize_performance() {
    echo "=== Performance Optimization ==="

    # Optimize kernel parameters
    echo "net.core.somaxconn = 32768" >> /etc/sysctl.conf
    echo "net.ipv4.tcp_max_syn_backlog = 32768" >> /etc/sysctl.conf
    echo "net.core.netdev_max_backlog = 32768" >> /etc/sysctl.conf
    sysctl -p
    # Configure Docker logging; keep the systemd cgroup driver that was configured during installation
    cat > /etc/docker/daemon.json << EOF
{
    "exec-opts": ["native.cgroupdriver=systemd"],
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "100m",
        "max-file": "3"
    }
}
EOF
    systemctl restart docker

    echo "Performance optimizations applied"
}

# Main menu
main() {
    while true; do
        echo "=== Kubernetes Troubleshooting Toolkit ==="
        echo "1. Check Cluster Health"
        echo "2. Check Resource Usage"
        echo "3. Check Networking"
        echo "4. Check Logs"
        echo "5. Cleanup Resources"
        echo "6. Optimize Performance"
        echo "7. Exit"

        read -p "Select option (1-7): " choice

        case $choice in
            1) check_cluster_health ;;
            2) check_resources ;;
            3) check_networking ;;
            4) check_logs ;;
            5) cleanup_resources ;;
            6) optimize_performance ;;
            7) exit 0 ;;
            *) echo "Invalid option" ;;
        esac

        echo
        read -p "Press Enter to continue..."
        clear
    done
}
main "$@"
Backup and Recovery
#!/bin/bash
BACKUP_DIR="/opt/kubernetes-backups"
DATE=$(date +%Y%m%d_%H%M%S)

# Create backup
create_backup() {
    echo "Creating Kubernetes cluster backup..."

    mkdir -p "$BACKUP_DIR/$DATE"

    # Backup etcd
    kubectl exec -n kube-system etcd-$(hostname) -- etcdctl snapshot save /tmp/etcd-snapshot.db \
        --endpoints=https://127.0.0.1:2379 \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key

    kubectl cp kube-system/etcd-$(hostname):/tmp/etcd-snapshot.db "$BACKUP_DIR/$DATE/etcd-snapshot.db"

    # Backup certificates
    cp -r /etc/kubernetes/pki "$BACKUP_DIR/$DATE/"

    # Backup configuration
    cp -r /etc/kubernetes/*.conf "$BACKUP_DIR/$DATE/"

    # Backup resources
    kubectl get all --all-namespaces -o yaml > "$BACKUP_DIR/$DATE/all-resources.yaml"

    echo "Backup completed: $BACKUP_DIR/$DATE"
}

# Restore from backup
restore_backup() {
    local backup_path="$1"

    if [[ ! -d "$backup_path" ]]; then
        echo "Backup directory not found: $backup_path"
        return 1
    fi

    echo "Restoring from backup: $backup_path"

    # Stop kubelet
    systemctl stop kubelet

    # Restore the etcd snapshot
    etcdctl snapshot restore "$backup_path/etcd-snapshot.db" \
        --data-dir=/var/lib/etcd-backup

    # Replace the etcd data directory
    rm -rf /var/lib/etcd
    mv /var/lib/etcd-backup /var/lib/etcd

    # Restore certificates and configuration
    cp -r "$backup_path/pki" /etc/kubernetes/
    cp "$backup_path"/*.conf /etc/kubernetes/

    # Start kubelet
    systemctl start kubelet

    echo "Restore completed"
}

# List available backups
list_backups() {
    echo "Available backups:"
    ls -la "$BACKUP_DIR"
}

case $1 in
    "backup")
        create_backup
        ;;
    "restore")
        restore_backup "$2"
        ;;
    "list")
        list_backups
        ;;
    *)
        echo "Usage: $0 {backup|restore <path>|list}"
        ;;
esac
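To keep backups current without manual intervention, the script can be scheduled with a systemd timer, which fits CoreOS's systemd-centric design. A minimal sketch, assuming the script is installed as /usr/local/bin/k8s-backup.sh:

# Oneshot service that runs the backup
cat > /etc/systemd/system/k8s-backup.service << 'EOF'
[Unit]
Description=Kubernetes etcd and configuration backup

[Service]
Type=oneshot
ExecStart=/usr/local/bin/k8s-backup.sh backup
EOF

# Nightly timer at 02:00
cat > /etc/systemd/system/k8s-backup.timer << 'EOF'
[Unit]
Description=Nightly Kubernetes backup

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

systemctl daemon-reload
systemctl enable --now k8s-backup.timer

# Prune backups older than 14 days
find /opt/kubernetes-backups -maxdepth 1 -mindepth 1 -type d -mtime +14 -exec rm -rf {} +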
Best Practices and Recommendations
Production Readiness Checklist
- Security Hardening
  - Enable RBAC and Pod Security Standards
  - Configure network policies
  - Implement audit logging
  - Regular security scanning
- Monitoring and Observability
  - Deploy Prometheus and Grafana
  - Configure alerts for critical metrics
  - Implement log aggregation
  - Monitor resource usage
- Backup and Recovery
  - Automated etcd backups
  - Configuration backups
  - Tested recovery procedures
  - Disaster recovery plan
- Resource Management
  - Configure resource limits and quotas
  - Implement horizontal pod autoscaling
  - Monitor disk usage
  - Optimize container images
- Maintenance Procedures (see the command sketch after this checklist)
  - Regular cluster updates
  - Certificate renewal
  - Node maintenance windows
  - Performance optimization
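The routine maintenance items above map onto a handful of kubeadm and kubectl commands; this is a sketch for a kubeadm-based single node, with the node name and version following the examples used earlier in the guide:

# Certificate expiry and renewal
kubeadm certs check-expiration
kubeadm certs renew all

# Plan a minor-version upgrade, then apply it after upgrading the kubeadm package
kubeadm upgrade plan
# kubeadm upgrade apply v1.28.x

# Drain the node for a maintenance window, then return it to service
kubectl drain coreos-single-node --ignore-daemonsets --delete-emptydir-data
kubectl uncordon coreos-single-node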
Performance Optimization
#!/bin/bash
# Optimize the system for Kubernetes workloads
optimize_system() {
    # Kernel parameters
    cat >> /etc/sysctl.conf << EOF
# Kubernetes optimizations
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 1
vm.max_map_count = 262144
fs.file-max = 2097152
net.core.somaxconn = 32768
net.ipv4.tcp_max_syn_backlog = 32768
net.core.netdev_max_backlog = 32768
EOF

    sysctl -p
    # Tune kubelet throughput; note this appends a second KUBELET_EXTRA_ARGS line,
    # which overrides the flags written during installation
    cat >> /etc/default/kubelet << EOF
KUBELET_EXTRA_ARGS="--max-pods=250 --kube-api-qps=100 --kube-api-burst=100"
EOF

    systemctl restart kubelet
}
optimize_system
Conclusion
Setting up a single-node Kubernetes cluster on CoreOS provides a robust foundation for container orchestration in development, testing, and small-scale production environments. This configuration offers:
- Complete Kubernetes Experience: Full API compatibility for development and testing
- Enhanced Security: CoreOS’s immutable infrastructure and security hardening
- Monitoring and Observability: Comprehensive monitoring stack with Prometheus and Grafana
- Production Readiness: Backup, recovery, and maintenance procedures
- Scalability Path: Easy migration to multi-node clusters when needed
Remember to regularly update your cluster components, monitor security advisories, and maintain backup procedures to ensure a reliable and secure Kubernetes environment.
This single-node setup is ideal for development, testing, and learning purposes. For production workloads requiring high availability, consider implementing a multi-node cluster with proper load balancing and redundancy.