Planned Storage Prompt
[FOR FUTURE SESSION - DO NOT RUN TONIGHT]
You are conducting a comprehensive read-only investigation of all storage layers in the homelab. This is investigation only - do not modify any storage config, RAID config, filesystem state, or run any operation that could affect data integrity.
GOAL: Build a complete picture of storage architecture, health, and risks. Output feeds into a separate decision-making session about storage improvements.
LAYER 1: Proxmox host local storage
- pvesm status (all configured storages and their states)
- pvs / vgs / lvs (LVM structure)
- df -h (filesystem usage on host)
- lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT,SERIAL,MODEL,WWN
- cat /proc/mdstat (any mdadm arrays on host)
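- P410i RAID controller status via ssacli (or hpssacli on older installs) if installed:
  - ssacli controller all show config detail
  - Note: if neither tool is installed, skip this step - do not install anything during this session
- SMART data for each physical drive on the host:
  - smartctl -a /dev/sdX for each drive the OS sees directly
  - Drives behind the P410i are usually not exposed as individual /dev/sdX devices; query those with smartctl -a -d cciss,N /dev/sdX (N = drive index on the controller, starting at 0)
  - Focus on: Reallocated_Sector_Ct, Power_On_Hours, Wear_Leveling_Count, Media_Wearout_Indicator
  - SAS drives do not report these ATA attribute names; for them, record the SMART health status and the grown defect list count instead
  - The Crucial M4 SSDs (Bay 5-8) are 14 years old, special attention needed
  - The EG0450 SAS HDDs (Bay 2-4) and SPCC SSD (Bay 1) - basic health check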
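A minimal sketch of how the focus attributes could be pulled out of smartctl output, assuming the drive reports the standard ATA attribute table (attribute name in column 2, raw value in column 10). The function name smart_key_attrs is hypothetical; it only filters text on stdin and runs nothing against the drives.

```sh
# Hypothetical helper: reads `smartctl -A` text on stdin and prints
# NAME=RAW for the four focus attributes. Assumes the ATA attribute
# table layout (column 2 = attribute name, column 10 = raw value).
smart_key_attrs() {
  awk '
    $2 == "Reallocated_Sector_Ct" ||
    $2 == "Power_On_Hours" ||
    $2 == "Wear_Leveling_Count" ||
    $2 == "Media_Wearout_Indicator" { print $2 "=" $10 }
  '
}
```

Possible usage: smartctl -A /dev/sdX | smart_key_attrs. An attribute the drive does not report simply produces no line, which is itself worth noting in the inventory.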
LAYER 2: VM 189 homeNas internal storage (read-only via SSH to VM 189)
- ssh homenas "lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT" - what disks are passed through
- ssh homenas "cat /proc/mdstat" - mdadm arrays state
- ssh homenas "mdadm --detail /dev/md0 /dev/md1" - array health (substitute the array names actually listed in /proc/mdstat)
- ssh homenas "btrfs filesystem show" - btrfs filesystem state
- ssh homenas "btrfs filesystem df /path/to/mount" - btrfs usage detail
- ssh homenas "btrfs scrub status /path/to/mount" - last scrub state
- ssh homenas "smartctl -a /dev/sdX" for each passthrough disk
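As a sketch of how the mdstat output could be screened for degraded arrays, assuming the usual /proc/mdstat layout where each array's status line ends in a member map such as [UU] or [U_]. The function name mdstat_degraded is hypothetical, and it only reads text on stdin.

```sh
# Hypothetical helper: reads /proc/mdstat text on stdin and prints one
# line per array whose member map contains "_" (a missing member).
mdstat_degraded() {
  awk '
    /^md[^ ]* :/ { name = $1 }
    name != "" && match($0, /\[[U_]+\]/) {
      state = substr($0, RSTART, RLENGTH)
      if (index(state, "_") > 0) print name " degraded " state
      name = ""
    }
  '
}
```

Possible usage: ssh homenas "cat /proc/mdstat" | mdstat_degraded. Empty output means no array reported a missing member; mdadm --detail remains the authoritative source for array state.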
LAYER 3: NFS exports configuration (read-only)
- ssh homenas "cat /etc/exports" - what's currently exported
- ssh homenas "showmount -e localhost 2>/dev/null" - exports as seen by clients
- From PVE host: grep -A 5 "^nfs:" /etc/pve/storage.cfg - how Proxmox sees them
- Identify which NFS exports are referenced by destroyed VMs (orphan tracking)
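A possible first step for the orphan tracking, sketched under assumptions: it lists exports that no nfs: storage in storage.cfg references, working from saved copies of both files. All three function names are hypothetical, export paths are assumed to contain no spaces, and mapping the remaining exports to destroyed VMs still needs manual review.

```sh
# Hypothetical helpers; all three only read text and change nothing.

# Export paths from /etc/exports text on stdin (first field of each
# non-comment line). Assumes paths contain no spaces.
exports_paths() {
  awk '$1 ~ /^#/ { next } NF { print $1 }' | sort -u
}

# "export" values from the nfs: stanzas of storage.cfg text on stdin.
pve_nfs_exports() {
  awk '
    /^[a-z]+:/ { in_nfs = ($1 == "nfs:") }
    in_nfs && $1 == "export" { print $2 }
  ' | sort -u
}

# Prints exports that no nfs: storage references.
# $1 = saved copy of /etc/exports, $2 = saved copy of storage.cfg
unreferenced_exports() {
  referenced=$(pve_nfs_exports < "$2")
  exports_paths < "$1" | while read -r p; do
    printf '%s\n' "$referenced" | grep -Fxq -- "$p" || printf '%s\n' "$p"
  done
}
```

Working from copies (for example, ssh homenas "cat /etc/exports" > exports.copy) keeps the comparison entirely off the live systems.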
LAYER 4: Backup state
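- Check for any existing Proxmox Backup Server connections: grep -A 5 "^pbs:" /etc/pve/storage.cfg
- Check for any cron/systemd backup jobs: ls /etc/cron.* | grep -i backup; systemctl list-timers --all | grep -i backup
- Check for any vzdump scheduled tasks: cat /etc/pve/jobs.cfg 2>/dev/null (older Proxmox releases keep these in /etc/pve/vzdump.cron)
- Expected finding: no backups exist anywhere. Report what is actually found, even if it contradicts this expectation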
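The backup checks could be collected into one read-only summary along these lines. This is a sketch, not a definitive check: the function name backup_evidence and its optional root-prefix argument are hypothetical, and it assumes vzdump jobs live in /etc/pve/jobs.cfg or, on older Proxmox releases, /etc/pve/vzdump.cron.

```sh
# Hypothetical helper: prints one line per piece of backup configuration
# found under an optional root prefix ($1, empty for the live host).
# Read-only: it only greps existing files.
backup_evidence() {
  root=${1:-}
  found=0
  if grep -q '^pbs:' "$root/etc/pve/storage.cfg" 2>/dev/null; then
    echo "pbs storage configured"
    found=1
  fi
  for f in "$root/etc/pve/jobs.cfg" "$root/etc/pve/vzdump.cron"; do
    # Ignore comment lines so a file header mentioning vzdump does not count.
    if grep -Eq '^[^#]*vzdump' "$f" 2>/dev/null; then
      echo "vzdump job in ${f#"$root"}"
      found=1
    fi
  done
  [ "$found" -eq 1 ] || echo "no backup configuration found"
}
```

A result of "no backup configuration found" covers only these three files; cron entries and systemd timers still need the separate checks listed above.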
LAYER 5: Filesystem health on Proxmox host
- dmesg | grep -iE "ata|sata|sas|scsi|raid|btrfs|ext4|error|fail" | tail -50 - look for storage errors in kernel log
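- journalctl _TRANSPORT=kernel --since "30 days ago" | grep -iE "i/o error|medium error|unrecovered" - serious storage errors in last 30 days (journalctl -k would limit output to the current boot; earlier boots are available only if the journal is persistent)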
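To make the kernel-log findings easier to report per drive, the matching lines could be grouped by device. A minimal sketch, assuming messages name the device as "dev sdX" (many block-layer errors do, but not all); the function name io_errors_by_dev is hypothetical.

```sh
# Hypothetical helper: reads kernel log text on stdin and counts lines
# matching the error patterns above, grouped by the "dev sdX" name when
# the message carries one ("unknown" otherwise).
io_errors_by_dev() {
  awk '
    tolower($0) ~ /i\/o error|medium error|unrecovered/ {
      dev = "unknown"
      if (match($0, /dev [a-z0-9]+/)) dev = substr($0, RSTART + 4, RLENGTH - 4)
      count[dev]++
    }
    END { for (dev in count) print dev, count[dev] }
  ' | sort
}
```

Possible usage: pipe the journalctl or dmesg output into io_errors_by_dev. Lines counted under "unknown" should be read individually, since the device is named in a different form there.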
REPORTING (no recommendations yet, just facts):
Section A: Physical drive inventory with health metrics
- Per drive: model, serial, capacity, power-on hours, wear level if SSD, reallocated sectors, SMART overall verdict
- Highlight drives showing any concerning indicator
Section B: RAID/array health
- P410i logical drive states
- mdadm array states (clean / degraded / resyncing / missing devices)
- btrfs filesystem state (no errors / errors detected / scrub overdue)
Section C: Capacity and utilization
- Per storage layer: total / used / available
- Highlight any approaching capacity limits
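One way to flag filesystems approaching capacity, sketched under assumptions: it reads df -P output and prints mount points at or above a usage threshold. The function name df_over_threshold is hypothetical and the default of 80 percent is an arbitrary placeholder, not a limit taken from this plan.

```sh
# Hypothetical helper: reads `df -P` text on stdin and prints mount point
# and usage for filesystems at or above a threshold (default 80, an
# arbitrary placeholder). Assumes mount points contain no spaces.
df_over_threshold() {
  awk -v limit="${1:-80}" '
    NR > 1 {
      used = $5
      sub(/%/, "", used)
      if (used + 0 >= limit + 0) print $6, $5
    }
  '
}
```

Possible usage: df -P | df_over_threshold 90. The same filter works on output captured from the VM, for example ssh homenas "df -P" | df_over_threshold 90. Note that btrfs usage reported by df can be misleading, so the btrfs filesystem df output should be reported alongside it.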
Section D: Configuration documentation
- The complete picture of how storage flows: physical drives → RAID → filesystem → exports → consumers
- This is the documentation that should have existed but probably doesn't
Section E: Honest risk assessment
- Identified single points of failure
- Hardware showing age (M4 SSDs specifically)
- Architectural concerns (btrfs-on-mdadm nesting inside the VM, no backups)
- Which data loss scenarios exist today
Section F: Open questions for the user
- Specific decisions needed about retention, restoration, architecture changes
- Things the investigation surfaced that need human judgment
This is pure investigation. No changes. Just facts and honest assessment.