Storage Server
My notes on building a storage server and Proxmox host.
Parts
- X10DSC+ Motherboard
- AOM-S3108M-H8L RAID/HBA
- AOC-MTG-I4S quad SFP+ NIC
- 8x 32 GB RAM (256 GB)
- 24x Seagate ST16000NM002G Exos X16 16TB 12Gb/s SAS
- 375 GB Optane SSD (SSDPEL1K375GA01) for L2ARC/ZIL
- 3x 4 TB M.2 SSD for special devices
Disk Layout
ZFS raidz2
Zpool
- 3x 6-disk raidz2
- 1x 5-disk raidz2
- Special: 3x M.2 flash
- ZIL: 16 GiB on Optane
- L2ARC: 333 GiB on Optane
Note: the Optane is shared between ZIL and L2ARC.
Usable space should be 15 data disks of 16 TB once raidz2 parity is subtracted, or 240 TB. The vdevs use 23 of the 24 disks, which leaves one disk open as a hot spare.
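The capacity arithmetic can be checked with a quick sketch. It assumes each raidz2 vdev gives up two disks to parity and ignores ZFS metadata and padding overhead, so real usable space will come out somewhat lower. The function name is made up for illustration.

```shell
#!/bin/sh
# Rough raidz2 capacity estimate: each vdev loses 2 disks to parity.
# Ignores ZFS metadata/padding overhead (assumption for illustration).
DISK_TB=16

# data_disks WIDTH... -> total number of data disks across raidz2 vdevs
data_disks() {
    total=0
    for width in "$@"; do
        total=$((total + width - 2))
    done
    echo "$total"
}

DATA=$(data_disks 6 6 6 5)   # 3x 6-disk + 1x 5-disk raidz2
USED=$((6 + 6 + 6 + 5))      # disks consumed by the vdevs

echo "data disks: $DATA"                          # 15
echo "raw data capacity: $((DATA * DISK_TB)) TB"  # 240
echo "hot spares left of 24: $((24 - USED))"      # 1
```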
The SATA disks will be Linux boot disks in a standard Linux RAID (maybe a ZFS mirror?).
SAS controller
The built-in controller supports HBA mode, per Supermicro.
https://docs.broadcom.com/doc/pub-005110
https://www.reddit.com/r/homelab/comments/iqz7xc/supermicro_s3108_in_jbod_mode/
Doesn't work in JBOD mode; everything is proxied.
Updating Disks
Firmware
The current firmware for these disks is E004, but the disks have E002 on them.
for i in $(seq 2 25); do SeaChest_Firmware --downloadFW ./EvansExosX16SAS-STD-512E-E004.LOD -d /dev/sg$i; done

# SeaChest_Firmware -s
==========================================================================================
 SeaChest_Firmware - Seagate drive utilities - NVMe Enabled
 Copyright (c) 2014-2021 Seagate Technology LLC and/or its Affiliates, All Rights Reserved
 SeaChest_Firmware Version: 3.0.0-2_2_1 X86_64
 Build Date: Apr 27 2021
 Today: Tue Oct 10 22:14:38 2023        User: root
==========================================================================================
nvme_ioctl_id: Inappropriate ioctl for device
Vendor   Handle       Model Number             Serial Number    FwRev
LSI      /dev/sg0     SAS3x28                                   0705
LSI      /dev/sg1     SAS3x28                                   0705
SEAGATE  /dev/sg10    ST16000NM002G            ZL20AJ3P         E004
SEAGATE  /dev/sg11    ST16000NM002G            ZL231860         E004
SEAGATE  /dev/sg12    ST16000NM002G            ZL21T6Q1         E004
SEAGATE  /dev/sg13    ST16000NM002G            ZL231PV0         E004
SEAGATE  /dev/sg14    ST16000NM002G            ZL21TDHH         E004
SEAGATE  /dev/sg15    ST16000NM002G            ZL21Y85R         E004
SEAGATE  /dev/sg16    ST16000NM002G            ZL21S9FQ         E004
SEAGATE  /dev/sg17    ST16000NM002G            ZL21T38M         E004
SEAGATE  /dev/sg18    ST16000NM002G            ZL22C3MJ         E004
SEAGATE  /dev/sg19    ST16000NM002G            ZL21S9J3         E004
SEAGATE  /dev/sg2     ST16000NM002G            ZL21S9WK         E004
SEAGATE  /dev/sg20    ST16000NM002G            ZL21S9AW         E004
SEAGATE  /dev/sg21    ST16000NM002G            ZL21TGY1         E004
SEAGATE  /dev/sg22    ST16000NM002G            ZL20CRL7         E004
SEAGATE  /dev/sg23    ST16000NM002G            ZL21RP9E         E004
SEAGATE  /dev/sg24    ST16000NM002G            ZL21RNZW         E004
SEAGATE  /dev/sg25    ST16000NM002G            ZL21JYXF         E004
ATA      /dev/sg26    Samsung SSD 870 EVO 1TB  S75BNL0W812633P  SVT03B6Q
ATA      /dev/sg27    Samsung SSD 870 EVO 1TB  S75BNS0W642820L  SVT03B6Q
SEAGATE  /dev/sg3     ST16000NM002G            ZL21V48W         E004
SEAGATE  /dev/sg4     ST16000NM002G            ZL21T7XK         E004
SEAGATE  /dev/sg5     ST16000NM002G            ZL21T8HS         E004
SEAGATE  /dev/sg6     ST16000NM002G            ZL21SBMS         E004
SEAGATE  /dev/sg7     ST16000NM002G            ZL21SRYP         E004
SEAGATE  /dev/sg8     ST16000NM002G            ZL21LVPQ         E004
SEAGATE  /dev/sg9     ST16000NM002G            ZL21TB40         E004
NVMe     /dev/nvme0n1 Samsung SSD 990 PRO 4TB  S7KGNJ0W912464T  0B2QJXG7
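To confirm that no drive was missed, the scan listing can be filtered for Seagate drives that are not yet on the target revision. This is a minimal sketch that assumes the column layout of the listing above (vendor first, handle second, firmware revision last). The function name and the two sample lines are made up for illustration.

```shell
#!/bin/sh
# Print the handle of every SEAGATE drive in a `SeaChest_Firmware -s`
# listing whose firmware revision (last column) differs from the target.
# Reads the listing on stdin; the column layout is an assumption.
stale_firmware() {
    awk -v want="$1" '$1 == "SEAGATE" && $NF != want { print $2 }'
}

# Example with a fabricated two-line listing (hypothetical serials):
printf '%s\n' \
    'SEAGATE /dev/sg2 ST16000NM002G XXXXXXX1 E004' \
    'SEAGATE /dev/sg3 ST16000NM002G XXXXXXX2 E002' |
    stale_firmware E004
```

On the real host this would be run as `SeaChest_Firmware -s | stale_firmware E004`; empty output means every Seagate drive reports E004.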
Low Level Format
These disks need a low-level format to turn off Protection Type 2, test the sectors of the drive, and switch to a 4096-byte sector size.
# SeaChest_Format --protectionType 0 --formatUnit 4096 --confirm this-will-erase-data --poll -d /dev/sg2
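The command above covers a single disk. A dry-run sketch like the following prints the same command for a range of devices so the list can be reviewed before anything is erased. The /dev/sg2 to /dev/sg25 range is assumed from the firmware loop and should be re-checked with a scan first; the function name is hypothetical.

```shell
#!/bin/sh
# Dry run: print one SeaChest_Format command per data disk instead of
# running it. Flags are copied from the single-disk command; the device
# range is an assumption taken from the firmware loop.
format_cmds() {
    first=$1
    last=$2
    i=$first
    while [ "$i" -le "$last" ]; do
        echo "SeaChest_Format --protectionType 0 --formatUnit 4096 --confirm this-will-erase-data --poll -d /dev/sg$i"
        i=$((i + 1))
    done
}

format_cmds 2 25
```

Once the printed list looks right, it can be piped to `sh` to run the formats one after another.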
NVMe Config
This was the most difficult part. The onboard NVMe ports under the power supply will take a standard NVMe cable, but do not supply any power. The M.2 to SFF-8612 adapters don't have a power connector, so I added 3.3 V inputs on the filter cap. This was kinda hacky, but it works.
Optane
The Optane SSD needs to be changed to 4k sectors, but the nvme command doesn't work. Intel's intelmas program can do it: https://community.intel.com/t5/Intel-Optane-Solid-State-Drives/4K-format-on-Optane-SSD-P1600X/m-p/1477181
root@pve01:~# nvme id-ns /dev/nvme2n1 -H -n 1
<snip>
LBA Format  0 : Metadata Size: 0   bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good (in use)
LBA Format  1 : Metadata Size: 8   bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good
LBA Format  2 : Metadata Size: 16  bytes - Data Size: 512 bytes - Relative Performance: 0x2 Good
LBA Format  3 : Metadata Size: 0   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best
LBA Format  4 : Metadata Size: 8   bytes - Data Size: 4096 bytes - Relative Performance: 0 Best
LBA Format  5 : Metadata Size: 64  bytes - Data Size: 4096 bytes - Relative Performance: 0 Best
LBA Format  6 : Metadata Size: 128 bytes - Data Size: 4096 bytes - Relative Performance: 0 Best
root@pve01:~# intelmas start -intelssd 2 -nvmeformat LBAFormat=3
WARNING! You have selected to format the drive!
Proceed with the format? (Y|N): y
Formatting...(This can take several minutes to complete)

- Intel Optane(TM) SSD DC P4801X Series PHKM2051009H375A -

Status : NVMeFormat successful.

root@pve01:~# nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme3n1          /dev/ng3n1            S7KGNJ0W912446D      Samsung SSD 990 PRO 4TB                  1         449.25 GB / 4.00 TB        512 B + 0 B      0B2QJXG7
/dev/nvme2n1          /dev/ng2n1            PHKM2051009H375A     INTEL SSDPEL1K375GA                      1         375.08 GB / 375.08 GB      4 KiB + 0 B      E2010600
/dev/nvme1n1          /dev/ng1n1            S7KGNJ0W912452X      Samsung SSD 990 PRO 4TB                  1         449.27 GB / 4.00 TB        512 B + 0 B      0B2QJXG7
/dev/nvme0n1          /dev/ng0n1            S7KGNJ0W912464T      Samsung SSD 990 PRO 4TB                  1         449.27 GB / 4.00 TB        512 B + 0 B      0B2QJXG7
Partition the disk
Disk /dev/nvme2n1: 91573146 sectors, 349.3 GiB
Model: INTEL SSDPEL1K375GA
Sector size (logical/physical): 4096/4096 bytes
Disk identifier (GUID): 70C34BA9-1175-42A5-B00E-16CFC022861F
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 5
First usable sector is 6, last usable sector is 91573140
Partitions will be aligned on 256-sector boundaries
Total free space is 1395599 sectors (5.3 GiB)

Number  Start (sector)    End (sector)  Size        Code  Name
   1             256         8388863    32.0 GiB    BF01  Solaris /usr & Mac ZFS
   2         8388864        90177791    312.0 GiB   BF01  Solaris /usr & Mac ZFS

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING PARTITIONS!!

Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/nvme2n1.
The operation has completed successfully.
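Because the disk now uses 4096-byte logical sectors, the sector numbers in the partition table are easy to misread. A small sketch for converting a sector range to GiB, using the start and end values from the listing above (the function name is hypothetical):

```shell
#!/bin/sh
# Convert a gdisk sector range to GiB for a given logical sector size.
# Both start and end sectors are inclusive, as gdisk prints them.
part_gib() {
    start=$1
    end=$2
    sector_bytes=$3
    echo $(( (end - start + 1) * sector_bytes / 1073741824 ))
}

part_gib 256 8388863 4096        # partition 1 -> 32
part_gib 8388864 90177791 4096   # partition 2 -> 312
```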
Physical layout
Mandatory config
The disks don't cope well with the older NVMe support on this box (it's listed as OCuLink 0.91, not 1.0), and there are some bugs in Linux around this and power handling.
These options must be added to the kernel cmdline:
vim /etc/kernel/cmdline
Append this to the existing single line:
nvme_core.default_ps_max_latency_us=0 pcie_aspm=off
Then refresh the boot configuration and reboot:
proxmox-boot-tool refresh
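The new options only take effect after a reboot. A minimal sketch for checking that the running kernel picked them up by looking for each option in /proc/cmdline (the function name is made up for illustration):

```shell
#!/bin/sh
# Return success if every given option appears as a whole word in the
# kernel command line string passed as the first argument.
cmdline_has() {
    line=$1
    shift
    for opt in "$@"; do
        case " $line " in
            *" $opt "*) ;;
            *) echo "missing: $opt"; return 1 ;;
        esac
    done
    return 0
}

# Check the running kernel; prints the first missing option, if any.
if [ -r /proc/cmdline ]; then
    cmdline_has "$(cat /proc/cmdline)" \
        nvme_core.default_ps_max_latency_us=0 pcie_aspm=off || true
fi
```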
udev rules
https://www.reactivated.net/writing_udev_rules.html#strsubst
https://github.com/bkus/by-enclosure-slot
iscsi target
https://forum.level1techs.com/t/has-anyone-here-tried-to-create-an-iscsi-target-in-proxmox/193862
https://www.reddit.com/r/homelab/comments/ih374t/poor_linux_iscsi_target_performance_tips/
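Set the MTU! Every NIC and switch port on the path has to support jumbo frames, and this command does not persist across reboots: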
sudo ip link set eth1 mtu 9000
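To check that jumbo frames actually pass end to end, a ping with fragmentation prohibited and a payload sized to fill the MTU is a common test. The sketch below only works out the payload size and prints the command; TARGET is a placeholder for the iSCSI peer's address, and the function name is hypothetical.

```shell
#!/bin/sh
# Largest ICMP payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

SIZE=$(icmp_payload 9000)   # 8972
# TARGET is a placeholder; -M do forbids fragmentation (iputils ping).
echo "ping -M do -c 3 -s $SIZE TARGET"
```

If the printed ping fails while a smaller payload succeeds, some hop on the path is probably still at a lower MTU.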
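Error to watch for when naming backstores: "Backstore name is too long for INQUIRY_MODEL". The SCSI INQUIRY product identification field is only 16 bytes, so keeping backstore names to 16 characters or fewer should avoid it.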