Backblaze storage pods
When a hard disk in a storage pod develops problems, it can be difficult to identify the physical disk from the error messages. Linux error messages identify disks by ATA number (such as ata7.00) or by sdX name (such as sdc). For hardware maintenance, disks need to be identified by backplane number and socket.
A report like this allows error messages to be translated to physical location:
    Backplane Socket  sdx      ata Serial
            1      1  sdc  ata7.00 Y6N1KE53FTMB
            2      1  sdd  ata8.00 WD-WMC1T1802712
            3      1  sde  ata9.00 WD-WCC4ELZ72L40
            5      1  sdf ata11.00 Y6N1KE56FTMB
            7      1  sdg ata13.00 VDGKJ03D
            8      1  sdh ata14.00 WD-WX21D25R5PP4
           10      1  sdi ata16.00 19P1K5R6FTMB

The report is generated by this bash function. msg is a messaging function which terminates the script when called with E for error.
    #--------------------------
    # Name: generate_report
    # Purpose:
    #   * Generates SATA backplane usage report
    # Usage: generate_report
    # Global variable set: none
    # Outputs: writes dated report file; removes if same as last one
    # Returns:
    #   0 on success and warning.  Does not return on error
    #--------------------------
    function generate_report {
        local buf cmd last_out_fn oIFS out out_dir out_fn rc
        declare -A ata

        # Get the ATA number for each sdX
        # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        # Based on syntaxerror's script in
        # https://serverfault.com/questions/244944/linux-ata-errors-translating-to-a-device-name
        oIFS=$IFS
        while IFS=' ' read Path HostFull sdx
        do
            IFS=: h=($HostFull)
            HostMain=${h[0]}; HostMid=${h[1]}; HostSub=${h[2]}
            if echo $Path | grep -q '/usb[0-9]*/'; then
                msg I "Device $sdx is not an ATA device, it is a USB device"
            else
                ata["$sdx"]=ata$(< "$Path/host$HostMain/scsi_host/host$HostMain/unique_id").$HostMid$HostSub
            fi
        done < <(
            for i in /sys/block/sd*
            do
                readlink $i \
                    | sed \
                        -e 's|\.\./devices|/sys/devices|' \
                        -e 's|/host[0-9]\{1,2\}/target| |' \
                        -e 's|/[0-9]\{1,2\}\(:[0-9]\)\{3\}/block/| |'
            done
        )
        IFS=$oIFS

        # Generate report data
        # ~~~~~~~~~~~~~~~~~~~~
        out=$'Backplane Socket  sdx      ata Serial\n'
        out+=$(
            while read backplane socket sdx serno
            do
                msg D "backplane: $backplane, socket: $socket, sdx: $sdx, serno: $serno"
                ((backplane=backplane-6+1))
                ((socket++))
                printf '%9s %6s %4s %8s %s\n' $backplane $socket $sdx ${ata[$sdx]} $serno
            done < <(
                lshw -class disk 2>&1 \
                    | grep -E '^ +(bus info|logical name|serial)' \
                    | sed -e 's/^[[:space:]]*//' \
                    | xargs -L 3 \
                    | grep -Ev 'scsi@(0|1):0.0.0' \
                    | sed -e 's/bus info: scsi@//' -e 's|logical name: /dev/||' \
                    | sed -e 's/serial: //' -e 's/:/ /' -e 's/\.0\.0//' \
                    | sort -n
            )
        )

        # Check reports directory
        # ~~~~~~~~~~~~~~~~~~~~~~~
        out_dir=/var/backup/sata_backplane_usage
        buf=$(ck_file "$out_dir" d:rwx 2>&1)
        [[ $buf != '' ]] && msg E "$buf"

        # Get name of most recent report
        # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        last_out_fn=$(ls -1rt "$out_dir" | tail -1)
        [[ $last_out_fn != '' ]] && last_out_fn=$out_dir/$last_out_fn

        # Write report to file
        # ~~~~~~~~~~~~~~~~~~~~
        out_fn=$out_dir/$(date +%Y-%m-%d@%H:%M:%S).report
        msg I "Writing report $out_fn"
        buf=$(echo "$out" > "$out_fn" 2>&1)
        [[ $buf != '' ]] && msg E "Writing report: $buf"

        # Remove report file if same as previous one
        # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        if [[ $last_out_fn != '' ]]; then
            cmd=(diff --brief "$last_out_fn" "$out_fn")
            buf=$("${cmd[@]}" 2>&1)
            rc=$?    # Save diff's exit status; each ((...)) test below overwrites $?
            if ((rc==0)); then
                msg I "Removing report $out_fn because identical to the last report"
                buf=$(rm "$out_fn" 2>&1)
                [[ $buf != '' ]] && msg E "Removing $out_fn: $buf"
            elif ((rc==1)); then
                msg W "SATA backplane usage changed. Reports in $out_dir"
            elif ((rc==2)); then
                msg E "${cmd[*]}: $buf"
            fi
        fi

        return 0
    }  # end of function generate_report

The full script is available at "A Backblaze storage pod storage management utility for Linux".
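Once a report exists, translating an error message to a physical location is a lookup in the report file. The following is a minimal sketch of such a lookup, not part of the original script: the function name locate_disk is hypothetical, and it assumes the five-column report layout shown above (Backplane, Socket, sdx, ata, Serial). It accepts an sdX name, a full ATA device such as ata7.00, or a bare ATA port such as ata7.

```shell
# Hypothetical helper (not in the original script).
# Usage: locate_disk <report_file> <sdX | ataN.NN | ataN>
# Prints the backplane, socket and serial number of each matching disk.
# Returns 0 if at least one row matched, 1 otherwise.
locate_disk() {
    local report=$1 key=$2
    awk -v key="$key" '
        NR == 1 { next }                      # skip the header line
        # Match the sdx column, the ata column exactly, or the ata column
        # by port prefix ("ata7" matches "ata7.00" but not "ata70.00")
        $3 == key || $4 == key || index($4, key ".") == 1 {
            printf "backplane %s, socket %s, serial %s\n", $1, $2, $5
            found = 1
        }
        END { exit !found }
    ' "$report"
}
```

For the example report above, calling locate_disk with the report file and ata7.00 would print "backplane 1, socket 1, serial Y6N1KE53FTMB". Matching on whitespace-separated fields rather than fixed column positions keeps the lookup independent of the exact padding in the report.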