A service of Daily Data, Inc.
unix:freebsd:system_builds:airgap:buildairgap

unix:freebsd:system_builds:airgap:buildairgap [2026/01/20 01:14] – removed - external edit (Unknown date) 127.0.0.1unix:freebsd:system_builds:airgap:buildairgap [2026/01/20 01:14] (current) – ↷ Links adapted because of a move operation rodolico
 +====== Building an Air Gap Server ======
  
 +<WRAP center round important 60%>
 +**Purpose:** Secure long-term backup storage isolated from network threats\\
 +**Key Requirements:** Full disk encryption, physical security, validated data transfer\\
 +**Target Environment:** FreeBSD with ZFS and GELI encryption\\
 +**Status:** Powered off when not actively receiving updates
 +</WRAP>
 +
 +===== Overview =====
 +
 +An **Air Gap Server** is a server that operates without network connectivity to protect critical backup data from remote attacks. A modified approach allows temporary network access for system updates while maintaining security boundaries.
 +
 +The primary purpose is to store long-term backups with a significantly reduced attack surface. Physical isolation combined with encryption provides defense-in-depth against:
 +  * Remote network attacks (ransomware, unauthorized access)
 +  * Physical theft or unauthorized access
 +  * Compromised source systems
 +
 +<wrap round important>
 +**Critical:** If the server must be stored in an unsecured location, full disk encryption is **mandatory**, not optional.
 +</wrap>
 +
 +===== Core Security Principles =====
 +
 +  * **Physical Isolation** — Geographic separation from primary servers with controlled access
 +  * **Defense in Depth** — Multiple security layers protect data at rest and in transit
 +  * **Data Validation** — Verify integrity of all data transfers and scripts
 +  * **Automated Reporting** — Track all operations for audit and monitoring
 +  * **Powered Off by Default** — Server only active during updates or maintenance
 +
 +===== Implementation Guidelines =====
 +
 +==== Physical Security and Access Control ====
 +
 +**Ideal Configuration:**
 +  * Store in secure facility requiring authenticated access (e.g., Network Operations Center)
 +  * Geographic separation from primary production servers
 +  * Documented access procedures and audit logs
 +
 +**Fallback for Insecure Locations:**
 +
 +When secure facilities are unavailable:
 +  * **Mandatory:** Full disk encryption (''GELI'' or equivalent)
 +  * **Mandatory:** Documented key management procedures
 +  * **Recommended:** Physical locks, tamper-evident seals
 +  * **Recommended:** Motion detection or access logging
 +
 +<wrap round tip>
 +Store encryption keys in a different physical location than the server. Consider splitting keys across multiple secure locations.
 +</wrap>
 +
 +==== Encryption Strategy ====
 +
 +This implementation uses FreeBSD with ''GELI'' disk encryption backing a ''ZFS'' filesystem.
 +
 +**At Rest Protection:**
 +  * Full disk encryption using ''GELI'' (minimum requirement)
 +  * All data pools encrypted with strong passphrases or key files
 +  * Keys never stored on the server itself
 +
 +**Split-Key Architecture:**
 +
 +For enhanced security, consider split-key encryption, where the final encryption key is derived by combining two separate key components. Neither component alone can reconstruct the ''GELI'' key, so the components can be held by different parties while a copy of the actual key is kept in secure off-site storage:
 +
 +  * **Two-Operator Model:** Each key component held by different operators
 +    * Requires both operators present to unlock encrypted data
 +    * Maximum security: no single person can access data alone
 +    * Higher operational overhead
 +  
 +  * **Operator + Automated Model:** One key with operator, one on server
 +    * Operator key: Physically carried by trusted operator
 +    * Server key: Stored on automated script (never on target server)
 +    * Keys combined via XOR or similar operation at decrypt time
 +    * Balances security with automation needs
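
The XOR combination described above can be sketched in plain shell. This is a minimal illustration, assuming both key components are stored as equal-length hex strings; the function name ''xor_hex'' is hypothetical and is not taken from the deployed scripts.

```shell
# xor_hex A B: XOR two equal-length hex strings, print the result as hex.
# Hypothetical helper for illustration only.
xor_hex() {
    a=$1
    b=$2
    # Both components must have the same length and hold whole bytes.
    if [ "${#a}" -ne "${#b}" ] || [ $(( ${#a} % 2 )) -ne 0 ]; then
        echo "xor_hex: components must be equal-length hex strings" >&2
        return 1
    fi
    out=""
    while [ -n "$a" ]; do
        # Take the next byte (two hex digits) from the front of each string.
        byte_a=${a%"${a#??}"}
        byte_b=${b%"${b#??}"}
        a=${a#??}
        b=${b#??}
        out="$out$(printf '%02x' $(( 0x$byte_a ^ 0x$byte_b )))"
    done
    printf '%s\n' "$out"
}
```

Because XOR is its own inverse, applying the same component twice returns the original value: ''xor_hex "$(xor_hex a1b2 0f0f)" 0f0f'' prints ''a1b2''. The same property means a known final key and one component are enough to derive the other component.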
 +
 +**In Transit Protection:**
 +  * Transport media (external drives) fully encrypted
 +  * Delta data encrypted before writing to transport media
 +  * Encryption keys validated at both source and destination
 +
 +**Example GELI Setup:**
 +<code bash>
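 +# Generate a random key file (512 bytes of keying material)
 +openssl rand 512 > /secure/path/geli.key
 +chmod 400 /secure/path/geli.key
 +
 +# Initialize GELI on the disk: 256-bit AES-XTS key (-l), 4096-byte sectors (-s).
 +# Also prompts for a passphrase, which is combined with the key file.
 +geli init -e AES-XTS -l 256 -s 4096 -K /secure/path/geli.key /dev/ada0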
 +
 +# Attach encrypted device
 +geli attach -k /secure/path/geli.key /dev/ada0
 +
 +# Create ZFS pool on encrypted device
 +zpool create backup /dev/ada0.eli
 +</code>
 +
 +<wrap round info>
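 +The 512-byte key file is keying material, not the cipher key length: ''GELI'' uses AES-XTS with a 128-bit key by default, or a 256-bit key with ''-l 256''. The ''-s 4096'' option sets the sector size. Store the key file securely and back it up to a separate location. By default ''geli init'' also prompts for a passphrase, so both the passphrase and the key file are needed to attach; pass ''-P'' to ''geli init'' (and ''-p'' to ''geli attach'') only if the key file should be the sole component.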
 +</wrap>
 +
 +==== Data Transfer Validation ====
 +
 +**Transport Media Requirements:**
 +  * Large capacity drives (match expected delta sizes)
 +  * Encrypted filesystem (''GELI'', ''LUKS'', or BitLocker), or per-file encryption of the data in transit
 +  * Labeled with GPT labels for automated mounting
 +
 +**Delta Monitoring:**
 +
 +Monitor transfer sizes to detect anomalies:
 +  * Establish baseline delta sizes for normal operations
 +  * Alert on deltas exceeding 150-200% of baseline
 +  * **Large deltas may indicate ransomware on source system**
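
The baseline comparison above reduces to a single integer check. This is a minimal sketch, assuming sizes are given in bytes and the threshold is a percentage of the baseline; the function name and the default of 200% are illustrative choices, not values taken from the deployed scripts.

```shell
# delta_within_baseline DELTA_BYTES BASELINE_BYTES [MAX_PERCENT]
# Succeeds when the delta is at most MAX_PERCENT (default 200) of the baseline.
delta_within_baseline() {
    delta=$1
    baseline=$2
    max_percent=${3:-200}
    # Integer-only comparison: delta / baseline <= max_percent / 100
    [ $(( delta * 100 )) -le $(( baseline * max_percent )) ]
}
```

For example, ''delta_within_baseline "$(wc -c < /mnt/transport/delta.zfs)" "$baseline" 150 || echo "ALERT: delta exceeds baseline"'' would flag a stream larger than 1.5 times the recorded baseline, where ''$baseline'' is a value the operator has established from earlier runs.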
 +
 +**Data Integrity Verification:**
 +
 +<code bash>
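 +# On the source: write the stream and record its SHA-256 next to it
 +# (process substitution requires bash)
 +zfs send pool/dataset@snapshot | \
 +  tee >(sha256 > /mnt/transport/delta.zfs.sha256) > /mnt/transport/delta.zfs
 +
 +# On the air gap server: recompute and compare before importing
 +[ "$(sha256 -q /mnt/transport/delta.zfs)" = "$(cat /mnt/transport/delta.zfs.sha256)" ] \
 +  || { echo "Checksum mismatch - aborting" >&2; exit 1; }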
 +</code>
 +
 +**Data Validation:**
 +  * All data encrypted with symmetric key at source
 +  * Decryption failure automatically rejects the data
 +  * Failed decryption indicates corruption or tampering
 +  * Process terminates on any decryption failure
 +
 +==== Script Validation and Maintenance ====
 +
 +Air gap servers require special consideration for maintenance since they lack network access for updates.
 +
 +**Validated Script Execution:**
 +
 +Scripts may be deployed to perform maintenance tasks:
 +  * ''ZFS'' scrubs and pool health checks
 +  * Snapshot cleanup and rotation
 +  * SMART disk monitoring
 +  * System updates (if temporarily networked)
 +
 +**Script Deployment Process:**
 +  - Scripts stored on source server and version controlled
 +  - Scripts encrypted with symmetric key before transfer
 +  - Air gap server must successfully decrypt before execution
 +  - Decryption failure prevents script execution and terminates process
 +  - Scripts run automatically during replication operations
 +
 +**Example Script Encryption/Decryption:**
 +<code bash>
 +# On source server: encrypt script (-pbkdf2 requires OpenSSL 1.1.1 or later)
 +openssl enc -aes-256-cbc -pbkdf2 -salt -in cleanup_script.sh \
 +  -out cleanup_script.sh.enc -pass file:/secure/transport.key
 +
 +# On air gap server: decrypt, then execute only if decryption succeeded
 +if openssl enc -aes-256-cbc -pbkdf2 -d -in cleanup_script.sh.enc \
 +     -out cleanup_script.sh -pass file:/secure/transport.key; then
 +  sh cleanup_script.sh
 +else
 +  echo "Decryption failed - aborting" >&2
 +  exit 1
 +fi
 +</code>
 +
 +<wrap round important>
 +**Security through decryption:** Scripts that cannot be decrypted with the correct symmetric key are rejected. Any decryption failure terminates the entire process to prevent execution of potentially tampered scripts.
 +</wrap>
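
CBC mode on its own is not authenticated: a wrong key or gross corruption usually shows up as a padding error, but a successful decryption is not proof that the ciphertext was left untouched. One common way to strengthen the check is to compute an HMAC over the encrypted file at the source and verify it before decrypting. The sketch below assumes a shared MAC key stored in a file (ideally separate from the encryption key); the function names are hypothetical. Note that ''openssl dgst -hmac'' takes the key on the command line, where it may be visible to other local processes.

```shell
# hmac_sha256 KEYFILE FILE: print the hex HMAC-SHA256 of FILE.
# Hypothetical helper; the key is read from KEYFILE and passed to openssl.
hmac_sha256() {
    openssl dgst -sha256 -hmac "$(cat "$1")" < "$2" | awk '{print $NF}'
}

# verify_hmac KEYFILE FILE EXPECTED_HEX: succeed only if the MAC matches.
verify_hmac() {
    [ "$(hmac_sha256 "$1" "$2")" = "$3" ]
}
```

At the source, the MAC could be written next to the encrypted script (for example to a file such as ''cleanup_script.sh.enc.hmac'', a name chosen here for illustration). On the air gap server, ''verify_hmac'' would run first, and decryption and execution would proceed only if it succeeds.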
 +
 +==== Reporting and Audit Trail ====
 +
 +**Reporting Challenges:**
 +  * Air gap servers cannot send email reports
 +  * No network access for remote monitoring
 +  * Reports must be physically retrieved
 +
 +**Solution — Report Drive:**
 +  * Dedicated removable media for reports (USB drive, small HDD)
 +  * Reports written to the report drive after each operation
 +  * Administrator retrieves and processes reports manually
 +
 +**Report Contents:**
 +  * Timestamp of operation
 +  * Data volumes transferred (size, snapshot names)
 +  * Success/failure status of each operation
 +  * Disk health (SMART status, ZFS pool health)
 +  * Script execution results
 +  * Any errors or warnings
 +
 +**Example Report Structure:**
 +<code>
 +=== Air Gap Backup Report ===
 +Date: 2026-01-18 03:00:00
 +Operation: Incremental Backup
 +Source: production.example.com
 +Target: airgap-backup01
 +
 +Datasets Processed:
 +  - pool/data: 45.2 GB transferred
 +    Latest: pool/data@2026-01-18_02:00:00
 +  - pool/databases: 12.8 GB transferred  
 +    Latest: pool/databases@2026-01-18_02:00:00
 +
 +Pool Health: ONLINE
 +Disk Status: All disks PASSED SMART checks
 +
 +Maintenance Scripts Executed:
 +  - snapshot_cleanup.sh: SUCCESS (removed 3 old snapshots)
 +  - zfs_scrub.sh: SUCCESS (no errors found)
 +
 +System Shutdown: 2026-01-18 03:45:00
 +Next Expected Update: 2026-01-25
 +</code>
 +
 +==== Power Management ====
 +
 +**Default State: Powered Off**
 +
 +The air gap server should remain powered off except during:
 +  * Scheduled data imports
 +  * Manual maintenance operations
 +  * Security audits
 +
 +**Benefits of Power-Off Strategy:**
 +  * Encrypted drives are locked (keys in memory are cleared)
 +  * Eliminates risk of remote exploitation during off time
 +  * Reduces hardware wear and power consumption
 +  * Limits window of opportunity for physical attacks
 +
 +**Automated Shutdown:**
 +
 +The final script in the maintenance chain should power off the system:
 +<code bash>
 +#!/bin/sh
 +# Final maintenance script - shutdown system
 +
 +# Verify all operations completed successfully
 +if [ -f /var/run/backup_complete ]; then
 +    # Write final report
 +    echo "Backup completed successfully at $(date)" >> /mnt/report/status.log
 +    
 +    # Sync all filesystem buffers
 +    sync
 +    
 +    # Unmount transport media
 +    umount /mnt/transport
 +    umount /mnt/report
 +    
 +    # Power off system
 +    shutdown -p now
 +else
 +    echo "ERROR: Backup did not complete. Manual intervention required." >> /mnt/report/error.log
 +    # Do NOT shutdown - leave powered on for troubleshooting
 +fi
 +</code>
 +
 +<wrap round important>
 +**Do not** configure automatic shutdown if backups fail. A powered-on system indicates problems requiring manual investigation.
 +</wrap>
 +
 +===== Example Workflow =====
 +
 +A typical weekly backup cycle:
 +
 +**Day 1 (Monday) — Source Server:**
 +  - Automated script takes ZFS snapshots of all datasets
 +  - Calculates incremental changes since last backup
 +  - Encrypts delta data to transport drive with symmetric key
 +  - Encrypts maintenance scripts with same symmetric key
 +  - Operator notified that transport drive is ready
 +
 +**Day 2 (Tuesday) — Physical Transport:**
 +  - Operator removes transport drive from source server
 +  - Drive physically transported to air gap location
 +  - Transport logged in access control system
 +
 +**Day 3 (Wednesday) — Air Gap Server:**
 +  - Operator inserts transport drive and powers on server
 +  - Server boots, mounts transport drive
 +  - Automated script begins:
 +    * Attempts to decrypt delta files with symmetric key
 +    * Validates delta sizes against baseline
 +    * Imports ZFS datasets (decryption happens during import)
 +    * Attempts to decrypt and run maintenance scripts
 +    * Any decryption failure terminates the entire process
 +    * Generates report to report drive
 +    * Powers off system (only if all operations succeed)
 +  - Operator retrieves report drive for later review
 +
 +**Day 4 (Thursday) — Report Processing:**
 +  - Operator reviews reports from air gap server
 +  - Verifies all backups completed successfully
 +  - Archives reports for audit trail
 +  - Updates monitoring dashboard
 +
 +**Day 8 (Next Monday):**
 +  - Process repeats with fresh delta data
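
The Day 3 sequence depends on one rule: any failed step stops the run, and the power-off happens only when everything succeeded. A small sketch of that control flow follows, assuming each step is available as a command or shell function; ''run_steps'' and the step names in the comment are hypothetical.

```shell
# run_steps STEP...: run each step (a command or function name) in order.
# Stops at the first failure and returns 1; later steps are not run.
run_steps() {
    for step in "$@"; do
        if ! "$step"; then
            echo "FAILED: $step" >&2
            return 1
        fi
        echo "OK: $step"
    done
}

# Possible use on the air gap server (step names are placeholders):
#   if run_steps decrypt_deltas check_delta_sizes import_datasets \
#                run_maintenance write_report; then
#       shutdown -p now
#   fi
```

Keeping the shutdown outside the step list means a failure leaves the machine powered on, which matches the guidance that a running server signals a problem needing manual investigation.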
 +
 +===== Pre-Implementation Checklist =====
 +
 +<code>
 +[ ] Physical Security
 +    [ ] Secure location identified and documented
 +    [ ] Access procedures established
 +    [ ] Key storage locations determined
 +    
 +[ ] Hardware
 +    [ ] Air gap server procured and tested
 +    [ ] Transport drives procured (minimum 2 for rotation)
 +    [ ] Report drive procured
 +    [ ] All drives labeled appropriately
 +    
 +[ ] Encryption
 +    [ ] GELI encryption configured and tested
 +    [ ] Encryption keys generated and stored securely
 +    [ ] Key recovery procedures documented
 +    [ ] Transport drives encrypted
 +    
 +[ ] Software
 +    [ ] FreeBSD installed and hardened
 +    [ ] ZFS pools created and tested
 +    [ ] Replication scripts developed and tested
 +    [ ] Maintenance scripts developed and tested
 +    [ ] Symmetric transport keys generated and deployed
 +    
 +[ ] Procedures
 +    [ ] Backup schedule documented
 +    [ ] Transport procedures documented
 +    [ ] Report review procedures documented
 +    [ ] Key rotation schedule established
 +    [ ] Disaster recovery plan created
 +    
 +[ ] Testing
 +    [ ] Full backup cycle tested end-to-end
 +    [ ] Recovery procedures tested
 +    [ ] Failure scenarios tested
 +    [ ] Report generation verified
 +    [ ] Automated shutdown verified
 +</code>
 +
 +===== Security Considerations =====
 +
 +**Threat Model:**
 +
 +This design protects against:
 +  * ✓ Remote network attacks (ransomware, unauthorized access)
 +  * ✓ Compromised source systems
 +  * ✓ Physical theft (with encryption)
 +  * ✓ Unauthorized physical access (with encryption)
 +
 +This design does NOT fully protect against:
 +  * ✗ Sophisticated attackers with physical access and unlimited time
 +  * ✗ Compromised encryption keys
 +  * ✗ Attacks on the transport process itself
 +  * ✗ Insider threats with authorized access
 +
 +**Best Practices:**
 +  * Rotate encryption keys annually
 +  * Test recovery procedures quarterly
 +  * Review audit logs monthly
 +  * Update maintenance scripts as needed
 +  * Keep offline backups of critical configuration
 +
 +===== Troubleshooting =====
 +
 +**Common Issues:**
 +
 +^ Problem ^ Symptom ^ Solution ^
 +| Transport drive not mounting | Server unable to find ''/dev/gpt/label'' | Verify GPT label, check dmesg for device detection |
 +| Decryption fails | OpenSSL reports bad decrypt error | Verify correct symmetric key in use, check file integrity, investigate potential tampering or corruption |
 +| Large delta size | Delta exceeds baseline by 200%+ | **Do not import** — investigate source system for compromise or legitimate growth |
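 +| Server won't shut down | Remains powered on after backup | Check ''/var/run/backup_complete'' flag, review error logs on report drive |
 +| ZFS pool won't import | Import command fails | Verify the encryption key and that the ''GELI'' providers are attached (''geli status''), then list importable pools with ''zpool import''; use ''zpool import -F'' only as a last-resort recovery, since it can discard the most recent transactions |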
 +
 +===== Real-World Implementation =====
 +
 +==== Client Requirements ====
 +
 +A production deployment required the following specifications:
 +
 +^ Requirement ^ Implementation ^
 +| Replication Schedule | Monthly updates from in-house backup server to air gap server |
 +| Transport Media | 3× 1.9TB SSD drives in rotation |
 +| Drive Rotation | One at source, one at target, one in transit — minimizes site visits |
 +| Security Model | Multi-layer encryption with split-key architecture |
 +| Location | Air gap server in unsecured location (mandatory encryption) |
 +| Automation | Fully automated with maintenance script execution |
 +
 +==== Security Architecture ====
 +
 +**Encryption Layers:**
 +
 +  - **At Rest (Target):** ''GELI'' full disk encryption on air gap server
 +  - **In Transit:** Symmetric key encryption for all data on transport drives
 +  - **Maintenance Scripts:** Encrypted with same symmetric key
 +  - **Split-Key Design:** Target ''GELI'' key derived from:
 +    * Server-resident key component (stored locally)
 +    * Operator-carried key component (physical transport)
 +    * Combined via XOR bitwise operation at decrypt time
 +    * Target GELI key stored securely to facilitate key rotation and recovery
 +
 +<wrap round tip>
 +**Split-key advantage:** Neither component alone can decrypt the air gap server. Compromise of a single key (server or transport) does not expose data.
 +</wrap>
 +
 +==== Implementation Scripts ====
 +
 +Custom automation scripts handle the complete workflow. Source code is available via Subversion:
 +
 +**Repository URL:** ''http://svn.dailydata.net/svn/zfs_utils/trunk''\\
 +**Sub-project:** ''sneakernet''
 +
 +**Export the project:**
 +<code bash>
 +mkdir -p /usr/local/opt
 +svn export http://svn.dailydata.net/svn/zfs_utils/trunk /usr/local/opt/zfs_utils
 +</code>
 +
 +**Source Server Workflow:**
 +
 +  - Auto-detect operating mode (source vs. target)
 +  - Mount transport drive using GPT label detection
 +  - Verify transport drive processed by target (check status file)
 +  - Securely erase previous data from transport drive
 +  - Calculate incremental ZFS replication stream
 +  - Encrypt and write replication data to transport drive
 +  - Record latest snapshots sent (update status file)
 +  - Encrypt and write maintenance scripts to transport drive
 +  - Unmount transport drive
 +  - Email completion report to administrators
 +
 +**Target Server Workflow:**
 +
 +  - Mount transport drive
 +  - Detect operator-provided secure key (USB/separate media)
 +  - Combine server key with operator key (XOR operation)
 +  - Unlock ''GELI'' encrypted disks using combined key
 +  - Import ZFS pool
 +  - Save current snapshot list to state file (enable rollback if needed)
 +  - Decrypt and import replication streams from transport
 +  - Collect system statistics (pool health, disk status, capacity)
 +  - Decrypt and execute maintenance scripts
 +  - Generate detailed report and write to report drive
 +  - Unmount all media
 +  - Power off system
 +
 +<code bash>
 +# Example: simplified mode-detection logic.
 +# sneakernet performs this detection automatically, so this step is not
 +# required; it only illustrates the logic used.
 +HOSTNAME=$(hostname -s)
 +if [ "$HOSTNAME" = "backup-source" ]; then
 +    # Source mode
 +    /usr/local/sbin/sneakernet --mode=source
 +elif [ "$HOSTNAME" = "airgap-target" ]; then
 +    # Target mode
 +    /usr/local/sbin/sneakernet --mode=target
 +else
 +    echo "ERROR: Unknown host" >&2
 +    exit 1
 +fi
 +</code>
 +
 +==== Three-Drive Rotation Strategy ====
 +
 +The three-drive rotation minimizes operational overhead:
 +
 +**Normal Operation Cycle:**
 +
 +^ Month ^ Drive A ^ Drive B ^ Drive C ^ Action Required ^
 +| 1 | At Source (ready) | At Target | In Transit to Target | Operator: Deliver Drive C to target |
 +| 2 | At Source (ready) | In Transit to Source | At Target (ready) | Operator: Collect Drive B from target |
 +| 3 | In Transit to Target | At Source (ready) | At Target | Operator: Deliver Drive A to target |
 +| 4 | At Target | At Source (ready) | In Transit to Source | Operator: Collect Drive C from target |
 +
 +**Benefits:**
 +  * Each site visit handles both delivery and pickup
 +  * No waiting time for drive processing
 +  * Reduced frequency of site access (security benefit)
 +  * Built-in offline backup (data exists on multiple drives)
 +
 +==== Key Management Strategy ====
 +
 +**Symmetric Transport Key:**
 +  * Unique key per deployment
 +  * Stored on both source and target servers
 +  * Used to encrypt data and scripts on transport drives
 +
 +**Split GELI Key:**
 +  * Server component: Stored on target server (never leaves facility)
 +  * Operator component: Carried by trusted operator (never stored at target)
 +  * Combined at runtime via XOR: ''final_key = server_key ⊕ operator_key''
 +
 +**Key Rotation Procedures:**
 +
 +If a transport drive is compromised:
 +<code bash>
 +# Generate new symmetric key
 +openssl rand 32 | xxd -p | tr -d '\n' > /secure/path/new_transport.key
 +
 +# Deploy as maintenance script on next run
 +# Old data on compromised drive remains encrypted with old key
 +</code>
 +
 +If the operator key is compromised, retrieve the GELI key from secure storage in hex format, then run the following commands:
 +<code bash>
 +# Generate a new operator key component and a hex copy of it
 +openssl rand 32 > operator.key
 +xxd -p -c 999 operator.key > operator.key.hex
 +# Untested sketch. server.key.hex is the hex GELI key retrieved from secure
 +# storage. Assuming final_key = server_key XOR operator_key, XORing it with the
 +# new operator component prints the matching new server-resident component.
 +# Inputs and output are hex strings (use xxd -r -p to convert back to binary).
 +perl -e '
 +    # Walk both hex strings one byte (two hex digits) at a time
 +    print join("",
 +        map {
 +            # XOR the byte at hex offset 2*$_ from each key
 +            sprintf("%02x",
 +                hex(substr($ARGV[0], 2 * $_, 2)) ^
 +                hex(substr($ARGV[1], 2 * $_, 2))
 +            )
 +        }
 +        0 .. (length($ARGV[0]) / 2 - 1)  # One iteration per byte
 +    ) . "\n"  # Print the result
 +' "$(cat operator.key.hex)" "$(cat server.key.hex)"
 +
 +# Operator must use new key on next visit after updating the key on the air gap server
 +# If server.key is ever lost, must 
 +</code>
 +
 +<wrap round important>
 +**Key rotation can be automated** through maintenance scripts. New keys can be deployed during normal replication cycles without requiring emergency site visits.
 +</wrap>
 +
 +==== Operational Benefits ====
 +
 +This implementation balances security with operational efficiency:
 +
 +**Security Advantages:**
 +  * No single point of key compromise
 +  * Lost transport drive: data remains encrypted
 +  * Lost operator key: server data still protected
 +  * Automated key rotation capability
 +  * Audit trail via detailed reports
 +
 +**Operational Advantages:**
 +  * Minimal site visits (monthly vs. weekly)
 +  * No waiting time for processing
 +  * Fully automated operation (no manual commands)
 +  * Email reports from source (connected)
 +  * Physical reports from target (air-gapped)
 +  * Automated maintenance without network access
 +
 +
 +
 +===== References =====
 +
 +  * [[https://docs.freebsd.org/en/books/handbook/disks/#disks-encrypting-geli|FreeBSD GELI Encryption Documentation]]
 +  * [[https://docs.freebsd.org/en/books/handbook/zfs/|FreeBSD ZFS Administration Guide]]
 +  * [[https://www.openssl.org/docs/|OpenSSL Documentation]] — For symmetric encryption operations
 +  * [[https://www.nist.gov/publications/guide-storage-encryption-technologies-end-user-devices|NIST Storage Encryption Guide]]
 +
 +===== Related Documentation =====
 +
 +  * [[..:zfs_replication_scripts|ZFS Replication Scripts]] — Automated snapshot and transfer scripts
 +  * [[..:key_management_procedures|Encryption Key Management]] — Key generation, storage, and rotation
 +  * [[..:incident_response_airgap|Air Gap Incident Response Plan]] — What to do if compromise suspected
 +  * [[..:disaster_recovery_procedures|Disaster Recovery Procedures]] — Restoring from air gap backups