Difference between revisions of "HP DL360 G7 Hypervisor"
Revision as of 22:55, 3 September 2024
This describes the build process for Proxmox 8.2 on existing Compaq/HP DL360 G7 servers.
These have served me well for several years, and while they could stand to be upgraded, they are maxed out on CPU, RAM and disk and have plenty of life left in them. As I'd installed Proxmox years ago, before I knew what I know now about naming local storage for ZFS pools, I decided to rebuild them during the 8.2 upgrade.
Config
The Servers consist of the following:
- 2x Xeon X5675 @ 3.07 GHz (12 real/24 virtual cores)
- 192 GB (12x 16 GB DDR3 1600 MHz) RAM
- 8x SAS 1.6 TB SSD HUSMM141CLAR1600
- LSI SAS2308 storage controller - Firmware link
- Mellanox MT27520 ConnectX-3 Pro dual 10G NIC
Disk Layout
I use local ZFS storage on most of my HVs, and Proxmox cannot migrate or replicate VMs or CTs between nodes with different pool names. By default everything lives in rpool, which is also the root pool; on my other, newer servers rpool is used for booting only. This means that one of the core features of Proxmox will not function unless it's fixed.
As I don't want to invest more in these servers or figure out how to fit a boot disk, we must boot off the existing SSDs. This has the limitation of requiring 512-byte sector disks and GRUB, as the servers are too old for UEFI boot and native 4K disks. Proxmox does write to its root disk, so a VMware-style boot device (i.e. USB boot or a SATA DOM) will not work, but the IO requirements are not that intense. The decision was made to use part of each disk for a raidz3 root pool, with 16 GB per disk making 80 GB reserved for the root device.
The localDataStore pool will be raidz1 across the remaining space on each device. raidz1 was chosen because each VM is fully backed up, and even if more than one disk fails the server will still boot, since the root pool can lose 3 disks before it becomes unbootable. I've not lost a disk on these servers before either, and cold spares are on site.
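The 80 GB figure follows from the usual raidz rule of thumb, sketched below as a small shell function. The function name raidz_usable is made up for illustration, and the result ignores ZFS metadata, padding and slop space, so real usable space will be somewhat lower.

```shell
#!/bin/sh
# Rough usable capacity of one raidz vdev: (disks - parity) * per-disk size.
# This is an approximation only; ZFS overhead is not modelled.
raidz_usable() {
    disks=$1; parity=$2; size=$3
    echo $(( (disks - parity) * size ))
}

# Root pool on this page: 8 partitions of 16 GB each in raidz3.
raidz_usable 8 3 16    # prints 80
```

The same function with parity 1 gives the data-disk share for the raidz1 localDataStore pool (7 of the 8 partitions).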
We'll also pad out the space between the root pool partition and the localDataStore partition with a 16 GiB partition, allowing easy expansion of the root pool if needed in the future. Finally, we'll leave 16 MiB of space free at the end of each disk.
Thus we'll have the following:
Disk /dev/sda: 3125627568 sectors, 1.5 TiB
Model: HUSMM141CLAR1600
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): 7E89E66A-F81F-431A-B71A-F8BA7583CDC1
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 3125627534
Partitions will be aligned on 8-sector boundaries
Total free space is 32774 sectors (16.0 MiB)

Number  Start (sector)    End (sector)  Size        Code  Name
   1              34            2047    1007.0 KiB  EF02
   2            2048         2099199    1024.0 MiB  EF00
   3         2099200        33554432    15.0 GiB    BF01
   4        33554440        67108871    16.0 GiB    BF01  Solaris /usr & Mac ZFS
   5        67108872      3125594767    1.4 TiB     BF01  Solaris /usr & Mac ZFS
# zpool list -v
NAME                               SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ  FRAG    CAP  DEDUP  HEALTH  ALTROOT
localDataStore                    11.4T  2.27T  9.12T        -         -    0%    19%  1.00x  ONLINE  -
  raidz1-0                        11.4T  2.27T  9.12T        -         -    0%  19.9%      -  ONLINE
    scsi-35000cca0504cf90c-part5  1.42T      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca05053df48-part5  1.42T      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504d201c-part5  1.42T      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0506d06e0-part5  1.42T      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504d3378-part5  1.42T      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0506a76bc-part5  1.42T      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504c33c8-part5  1.42T      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504d1e3c-part5  1.42T      -      -        -         -     -      -      -  ONLINE
rpool                              119G  66.4G  52.6G        -         -    1%    55%  1.00x  ONLINE  -
  raidz3-0                         119G  66.4G  52.6G        -         -    1%  55.8%      -  ONLINE
    scsi-35000cca0504cf90c-part3  15.0G      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca05053df48-part3  15.0G      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504d201c-part3  15.0G      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0506d06e0-part3  15.0G      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504d3378-part3  15.0G      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0506a76bc-part3  15.0G      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504c33c8-part3  15.0G      -      -        -         -     -      -      -  ONLINE
    scsi-35000cca0504d1e3c-part3  15.0G      -      -        -         -     -      -      -  ONLINE
Preparation
As these are currently running in the cluster, we need to:
- Migrate VMs off of the HV
- Ensure shut-down VMs are backed up in PBS
- Back up the remaining rpool
- Check that the iLO works
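Before wiping the node it can help to confirm that nothing is still running on it. The sketch below filters `qm list` output for running guests; the helper name running_vmids is hypothetical, and it assumes the STATUS value is the third column of `qm list` output. Containers (`pct list`) use a different column order and are not handled here.

```shell
#!/bin/sh
# Print the VMIDs that `qm list` reports as running.
# Reads `qm list` output on stdin and skips the header line.
running_vmids() {
    awk 'NR > 1 && $3 == "running" { print $1 }'
}

# Usage on the node (not executed here):
#   qm list | running_vmids
# An empty result means no VM is still running on this HV.
```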
ZFS backup to FatTony
Back up /etc/pve to /root/pve-etc
cp -av /etc/pve/ /root/pve-etc
Make a dataset to store this on FatTony
ssh root@192.168.8.184 zfs create testpool/`hostname`-old
Make a snapshot of the rootfs
zfs snapshot rpool/ROOT/pve-1@20240903-01
Send it to the host
zfs send rpool/ROOT/pve-1@20240903-01 |pv| ssh -c aes128-gcm@openssh.com 192.168.8.184 zfs recv -Fdu -o canmount=noauto testpool/`hostname`-old
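If the backup has to be repeated, the snapshot name can be generated rather than typed. This is a minimal sketch; the helper snap_name is made up for illustration and simply reproduces the dataset@YYYYMMDD-NN pattern used above.

```shell
#!/bin/sh
# Build a snapshot name of the form dataset@YYYYMMDD-NN.
# Pass the sequence number without a leading zero (1, not 01).
snap_name() {
    dataset=$1; day=$2; seq=$3
    printf '%s@%s-%02d\n' "$dataset" "$day" "$seq"
}

snap_name rpool/ROOT/pve-1 20240903 1   # prints rpool/ROOT/pve-1@20240903-01

# Possible use on the node (not executed here):
#   zfs snapshot "$(snap_name rpool/ROOT/pve-1 "$(date +%Y%m%d)" 1)"
# After the send completes, the received snapshot can be listed on FatTony:
#   ssh root@192.168.8.184 zfs list -t snapshot -r testpool/`hostname`-old
```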
Do the Install
Put the 8.2 ISO on the server and connect it to the iLO CD drive http://192.168.8.150/~bryan/zfs/proxmox-ve_8.2-2.iso
This may take 5-10 min to load; look at the HTTP server logs to watch progress.
At the GRUB prompt you will need to add
video=1024x768@60 nomodeset
at the end of the linux line of the GUI installer entry.
Press 'e', edit the line, then hit Ctrl-X to boot.
accept the GNU license
Setup the Disks
- zfs RAIDZ-3
- all disks
- ashift = 12
- compress on
- checksum on
- copies 1
- ARC max size 16384
- hdsize 16.0 GB
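Once the node has booted, the ARC limit can be checked against the value entered here. The sketch assumes the installer's ARC max size is given in MiB, while the zfs_arc_max module parameter is expressed in bytes; mib_to_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a size in MiB to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 16384   # prints 17179869184

# Possible check on the installed node (not executed here):
#   [ "$(cat /sys/module/zfs/parameters/zfs_arc_max)" = "$(mib_to_bytes 16384)" ] \
#       && echo "ARC limit matches the installer setting"
```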
Set the country and time zone
- United States
- America/New_York (you need to use arrow keys, using the mouse doesn't work)
- Keyboard U.S. English
Set the root password and email
Set up the network
We have to set this manually as it needs a VLAN tag.
Install
- reboot when done
- don't forget to remove the CD
post reboot
network
The NICs are bonded, so we need to set up LACP bonding
brctl delif vmbr0 ens2
ip link add bond0 type bond mode 802.3ad
ip link set ens2 down
ip link set ens2 master bond0
ip link set ens2d1 master bond0
ip link add link bond0 name bond0.42 type vlan id 42
ip link set bond0 up
ip link set bond0.42 up
brctl addif vmbr0 bond0.42
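To confirm that both ports actually joined the bond, the kernel's bonding status file can be inspected. A small sketch, assuming the usual /proc/net/bonding/bond0 layout with one "Slave Interface:" line per member; the helper name bond_slaves is made up for illustration.

```shell
#!/bin/sh
# Print the member interfaces listed in a bonding status file.
bond_slaves() {
    awk -F': ' '/^Slave Interface:/ { print $2 }' "$1"
}

# Usage on the node (not executed here):
#   bond_slaves /proc/net/bonding/bond0
# With the commands above, ens2 and ens2d1 would be expected in the output.
```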
The network should be up and you can use ssh from here
Now push the old config from FatTony
scp -o UserKnownHostsFile=/dev/null /etc/hosts Moleman:/etc/hosts
scp -o UserKnownHostsFile=/dev/null /testpool/MoleMan-old/ROOT/pve-1/etc/ssh/ssh_host_* MoleMan:/etc/ssh/
Restart ssh on MoleMan
scp /root/.ssh/authorized_keys root@MoleMan:/root/.ssh/authorized_keys
scp /testpool/MoleMan-old/ROOT/pve-1/etc/network/interfaces root@MoleMan:/etc/network/interfaces
scp /testpool/MoleMan-old/ROOT/pve-1/etc/udev/rules.d/70-persistent-net.rules root@MoleMan:/etc/udev/rules.d/
scp /testpool/MoleMan-old/ROOT/pve-1/etc/network/if-up.d/00-keekles.sh MoleMan:/etc/network/if-up.d/00-keekles.sh
Reboot now and verify the network is up
Add Software
Configure the repos
mv /etc/apt/sources.list.d/ceph.list /etc/apt/sources.list.d/ceph.list.bak
mv /etc/apt/sources.list.d/pve-enterprise.list /etc/apt/sources.list.d/pve-enterprise.bak
echo -e "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription \ndeb http://download.proxmox.com/debian bookworm pve-no-subscription" >/etc/apt/sources.list.d/pve-no-subscription.list
apt-get update
apt-get install sg3-utils-udev sdparm ledmon lsscsi net-tools nvme-cli lldpd rsyslog ipmitool vim unzip git fio sudo locate screen snmpd libsnmp-dev mstflint pv
Run through the standard configs
https://wiki.w9cr.net/index.php/Hypervisor_1RU_2024#Software
Add users
useradd -s /bin/bash -m -U -G sudo bryan
useradd -s /bin/bash -m -U -G sudo stacy
echo "bryan ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers.d/00-admins
echo "stacy ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers.d/00-admins
From Fattony
rsync -avz /testpool/carbonrod-old/ROOT/pve-1/home/bryan/.ssh root@moleman:/home/bryan/
rsync -avz /testpool/carbonrod-old/ROOT/pve-1/home/stacy/.ssh root@moleman:/home/stacy/
Configure the MTA
Configure Observium
This has multiple parts, and the hardest is the config of the HP tools for the iLO3. This requires a dummy package and some other custom links to get stuff working.
As part of the SNMP config we'll be making this snmpv3 only.
- Install needed packages
sudo apt-get install snmpd php perl curl xinetd snmp libsnmp-dev libwww-perl rrdtool mailgraph
dpkg-reconfigure mailgraph
- install the current distro program on the target:
sudo curl -o /usr/local/bin/distro https://gitlab.com/observium/distroscript/raw/master/distro
sudo chmod +x /usr/local/bin/distro
- Do this from eyes
export SERVER=<hostname>
scp /opt/observium/scripts/observium_agent_xinetd $SERVER:/etc/xinetd.d/observium_agent_xinetd
scp /opt/observium/scripts/observium_agent $SERVER:/usr/bin/observium_agent
ssh $SERVER mkdir -p /usr/lib/observium_agent
ssh $SERVER mkdir -p /usr/lib/observium_agent/scripts-available
ssh $SERVER mkdir -p /usr/lib/observium_agent/scripts-enabled
scp /opt/observium/scripts/agent-local/* $SERVER:/usr/lib/observium_agent/scripts-available
- on target enable the various allowed items
ln -s /usr/lib/observium_agent/scripts-available/os /usr/lib/observium_agent/scripts-enabled
#ln -s /usr/lib/observium_agent/scripts-available/zimbra /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/dpkg /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/ntpd /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/virt-what /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/proxmox-qemu /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/postfix_mailgraph /usr/lib/observium_agent/scripts-enabled
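The same symlinks can be created with a loop, which makes the list of enabled scripts easier to change per host. This is only a sketch: the function name enable_agent_scripts is made up for illustration, and it uses ln -sf so that re-running it replaces existing links instead of failing.

```shell
#!/bin/sh
# Symlink each named agent script from the available directory into the
# enabled directory. Names missing from the source directory are skipped
# with a message on stderr.
enable_agent_scripts() {
    avail=$1; enabled=$2; shift 2
    for name in "$@"; do
        if [ -e "$avail/$name" ]; then
            ln -sf "$avail/$name" "$enabled/$name"
        else
            echo "skipping $name: not found in $avail" >&2
        fi
    done
}

# Usage on the target (not executed here):
#   enable_agent_scripts /usr/lib/observium_agent/scripts-available \
#       /usr/lib/observium_agent/scripts-enabled \
#       os dpkg ntpd virt-what proxmox-qemu postfix_mailgraph
```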
- Edit /etc/xinetd.d/observium_agent_xinetd so the Observium server is allowed to connect. You can do this by substituting 127.0.0.1, or place your IP after it, separated by a space. Make sure to restart xinetd afterwards so the configuration file is read.
sudo service xinetd restart
- Test from eyes
telnet $SERVER 36602
Setup SNMP and HP utils
- configure the HP 10.80 repo for the HPutils
echo "deb http://downloads.linux.hpe.com/SDR/repo/mcp bionic/10.80 non-free" > /etc/apt/sources.list.d/hp-mcp.list
curl -sS https://downloads.linux.hpe.com/SDR/hpPublicKey2048.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/hpPublicKey2048.gpg
curl -sS https://downloads.linux.hpe.com/SDR/hpPublicKey2048_key1.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/hpPublicKey2048_key1.gpg
curl -sS https://downloads.linux.hpe.com/SDR/hpePublicKey2048_key1.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/hpePublicKey2048_key1.gpg
- install the libc6-i386 equivs package from carbonrod
scp /root/libc6-i686_2.36_all.deb moleman:/root/
Now on the host
apt-get install libc6-i386
dpkg -i ./libc6-i686_2.36_all.deb
- Install the HP utils
apt-get update
apt-get install hp-snmp-agents hp-health
- The following symlinks are needed for the older version to work.
ln -s /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.40 /usr/lib/x86_64-linux-gnu/libnetsnmpmibs.so.30
ln -s /usr/lib/x86_64-linux-gnu/libnetsnmpagent.so.40 /usr/lib/x86_64-linux-gnu/libnetsnmpagent.so.30
ln -s /usr/lib/x86_64-linux-gnu/libnetsnmp.so.40 /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
- copy over the old snmpd.conf from FatTony
scp /testpool/MoleMan-old/ROOT/pve-1/etc/snmp/snmpd.conf root@moleman:/etc/snmp/snmpd.conf
- edit the snmpd.conf on MoleMan
Remove the 'rocommunity' line
vim /etc/snmp/snmpd.conf
- Make a SNMP User
sudo service snmpd stop
sudo service hp-health stop
sudo service hp-asrd stop
sudo service hp-snmp-agents stop
sudo net-snmp-config --create-snmpv3-user -ro -A keeklesSNMPpasswd -a SHA -x AES -X keeklesSNMPpasswd KeeklesSNMP
sudo service hp-health start
sudo service hp-asrd start
sudo service hp-snmp-agents start
sudo service snmpd start
- test it from eyes
snmpwalk -v3 -u KeeklesSNMP -l authPriv -a SHA -A keeklesSNMPpasswd -x aes -X keeklesSNMPpasswd MoleMan 1.3.6.1.4.1.232
localDataStore config
Configure the disks
For sda through sdh we need to add some partitions. They must be aligned on 4096-byte boundaries.
Total free space is 32774 sectors (16.0 MiB)

Number  Start (sector)    End (sector)  Size        Code  Name
   1              34            2047    1007.0 KiB  EF02
   2            2048         2099199    1024.0 MiB  EF00
   3         2099200        33554432    15.0 GiB    BF01
   4        33554440        67108871    16.0 GiB    BF01  Solaris /usr & Mac ZFS
   5        67108872      3125594767    1.4 TiB     BF01  Solaris /usr & Mac ZFS

16 MiB is left free at the end of each disk: 1024*1024*16/512 = 32768 sectors back from the last usable sector, 3125627534.
Partition 5 should therefore run from sector 67108872 to sector 3125594767, a span of 3125594767 - 67108872 = 3,058,485,895 sectors (3,058,485,896 counting both end sectors).
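The sector arithmetic can be re-checked with shell arithmetic, which is handy if a replacement disk reports a different last usable sector. The numbers are taken from the partition table above; the variable names are illustrative only.

```shell
#!/bin/sh
sector=512                 # logical sector size in bytes
last_usable=3125627534     # last usable sector reported for the disk
p4_start=33554440
p5_start=67108872
p5_end=3125594767

reserve=$(( 16 * 1024 * 1024 / sector ))   # 16 MiB expressed in sectors
p5_count=$(( p5_end - p5_start + 1 ))      # sectors in partition 5, inclusive
tail_free=$(( last_usable - p5_end ))      # free sectors after partition 5

echo "reserve=$reserve p5_count=$p5_count tail_free=$tail_free"
# prints: reserve=32768 p5_count=3058485896 tail_free=32767

# A start sector is 4096-byte aligned when it is a multiple of 8 (8 * 512).
echo "p4 aligned: $(( p4_start % 8 == 0 ))  p5 aligned: $(( p5_start % 8 == 0 ))"
# prints: p4 aligned: 1  p5 aligned: 1
```

The 7-sector gap between partitions 3 and 4 plus the 32767 free sectors at the end account for the 32774 free sectors reported in the table.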
- install parted (which provides partprobe) and re-read the partition tables
apt-get install parted
partprobe
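The page does not record the exact commands used to create partitions 4 and 5, so the following is only a sketch of one way to do it, assuming sgdisk (from the gdisk package) is installed. The helper print_partition_cmds is hypothetical and only prints the commands so they can be reviewed first; -a 8 sets 8-sector alignment to match the table, and the sector numbers and BF01 type code are the ones shown above.

```shell
#!/bin/sh
# Print (do not run) one sgdisk command per disk that would create
# partition 4 (16 GiB spare) and partition 5 (localDataStore member).
print_partition_cmds() {
    for disk in "$@"; do
        echo "sgdisk -a 8 -n 4:33554440:67108871 -t 4:BF01 -n 5:67108872:3125594767 -t 5:BF01 $disk"
    done
}

# Usage (not executed here):
#   print_partition_cmds /dev/sd[a-h]          # review the output
#   print_partition_cmds /dev/sd[a-h] | sh     # apply it once it looks right
# Running partprobe afterwards makes the kernel re-read the partition tables.
```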
Configure the ZFS for localDataStore
zpool create -f -o ashift=12 -O compression=lz4 -O atime=off -O xattr=sa localDataStore \
  raidz1 /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5 /dev/sdf5 /dev/sdg5 /dev/sdh5
zpool export localDataStore
zpool import -d /dev/disk/by-id/ localDataStore
zfs create localDataStore/proxmox
zfs snapshot rpool/var-lib-vz@xfer
zfs send -vR rpool/var-lib-vz@xfer | zfs receive -Fdu localDataStore
zfs umount rpool/var-lib-vz
zfs destroy -r rpool/var-lib-vz
zfs mount localDataStore/var-lib-vz
zfs destroy -v localDataStore/var-lib-vz@xfer
zfs set mountpoint=none rpool/data
zfs set mountpoint=/localDataStore/proxmox localDataStore/proxmox
- reboot the node now
Add node to cluster
- from a node in the cluster delete the old node
pvecm delnode MoleMan
- now add the node
pvecm add fink
Please enter superuser (root) password for 'fink': **********
Establishing API connection with host 'fink'
The authenticity of host 'fink' can't be established.
X509 SHA256 key fingerprint is D2:5C:21:E4:CE:8B:3A:6C:27:31:79:B3:52:5A:FE:83:3B:8F:8C:A2:28:F2:4D:E1:E6:DB:03:14:40:BA:F1:B6.
Are you sure you want to continue connecting (yes/no)? yes
Login succeeded.
check cluster join API version
No cluster network links passed explicitly, fallback to local node IP '192.168.8.181'
Request addition of this node
Join request OK, finishing setup locally
stopping pve-cluster service
backup old database to '/var/lib/pve-cluster/backup/config-1725421258.sql.gz'
waiting for quorum...OK
(re)generate node files
generate new node certificate
merge authorized SSH keys
generated new node certificate, restart pveproxy and pvedaemon services
successfully added node 'MoleMan' to cluster.
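After the join it is worth confirming that the cluster is quorate before touching storage.cfg. A minimal sketch, assuming `pvecm status` prints a "Quorate:" line with Yes or No; the helper name is_quorate is made up for illustration.

```shell
#!/bin/sh
# Succeed (exit 0) if the `pvecm status` output on stdin reports quorum.
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}

# Usage on any cluster node (not executed here):
#   pvecm status | is_quorate && echo "cluster is quorate"
```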
Fix the /etc/pve/storage.cfg
vim /etc/pve/storage.cfg
Move MoleMan from local-zfs to localDataStore
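The same change can probably be made without an editor by rewriting the storage node lists with pvesm. This is a sketch under assumptions: the node lists shown are invented examples rather than this cluster's real configuration, and drop_node is a hypothetical helper that only edits a comma-separated string.

```shell
#!/bin/sh
# Remove one node name from a comma-separated node list, as found on a
# "nodes" line in /etc/pve/storage.cfg.
drop_node() {
    list=$1; node=$2
    printf '%s\n' "$list" | tr ',' '\n' | grep -vxF "$node" | paste -sd, -
}

drop_node "fink,MoleMan,carbonrod" MoleMan   # prints fink,carbonrod

# Possible use (not executed here; node lists are assumptions):
#   pvesm set local-zfs --nodes "$(drop_node "fink,MoleMan,carbonrod" MoleMan)"
#   pvesm set localDataStore --nodes "MoleMan"
```

Note that `pvesm set --nodes` replaces the whole list, so the second command would need every node that should keep using localDataStore, not just MoleMan.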
Reboot
Reboot, check that everything is working on the cluster, and then move the VMs.