Hypervisor 1RU 2024

This page documents the 2024 hypervisor build.

This is a 1 RU server with fast local SSD and approximately 18 TB of storage.

BIOS settings

A patched BIOS is needed for this machine to boot via UEFI and to expose the PCI hotplug menu in the BIOS setup. I was able to edit this and have posted the latest version here.



Disk Layout

Front (looking at the front):
  Slot001  Slot003  Slot005  Slot007  Slot009
  Slot000  Slot002  Slot004  Slot006  Slot008

Rear (looking at the rear):
  40g NIC  Upper-2  Upper-3
           Lower-0  Lower-1

boot disk

Boot disk = 128 GB ZFS mirror

Number  Start (sector)    End (sector)  Size        Code
  1              34            2047   1007.0 KiB  EF02
  2            2048         2099199   1024.0 MiB  EF00
  3         2099200       838860800    128.0 GiB  BF01
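
If this layout ever needs to be recreated by hand, one sgdisk call per partition is one way to do it. This is only a sketch: /dev/sda is an assumed device name (repeat on the mirror partner), and on a normal Proxmox install the installer lays this out for you.

# illustrative only -- device name is an assumption
sgdisk -a1 -n1:34:2047           -t1:EF02 /dev/sda   # BIOS boot partition
sgdisk     -n2:2048:2099199      -t2:EF00 /dev/sda   # EFI system partition
sgdisk     -n3:2099200:838860800 -t3:BF01 /dev/sda   # Solaris/ZFS partition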

ZFS storage

/data

5 mirrored vdevs of the SAS disks
1 mirrored special vdev of two NVMe drives, using a 384 GB partition on each but with the whole disk dedicated to it; this can grow
optional log and L2ARC on the boot NVMe drives

partition the NVME

root@pve:~# gdisk /dev/nvme0n1
GPT fdisk (gdisk) version 1.0.9

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries in memory.

Command (? for help): n
Partition number (1-128, default 1): 1
First sector (6-390624994, default = 256) or {+-}size{KMGTP}: 2048
Last sector (2048-390624994, default = 390624767) or {+-}size{KMGTP}: +384G
Current type is 8300 (Linux filesystem)
Hex code or GUID (L to show codes, Enter = 8300): BF01
Changed type of partition to 'Solaris /usr & Mac ZFS'

Command (? for help): p
Disk /dev/nvme0n1: 390625000 sectors, 1.5 TiB
Model: MZ1LB1T9HBLS-000FB
Sector size (logical/physical): 4096/4096 bytes
Disk identifier (GUID): 5A039EE7-96B6-4D53-BEA4-56BCF48F4ABA
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 5
First usable sector is 6, last usable sector is 390624994
Partitions will be aligned on 256-sector boundaries
Total free space is 289961693 sectors (1.1 TiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048       100665343   384.0 GiB   BF01  Solaris /usr & Mac ZFS

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!

Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/nvme0n1.
The operation has completed successfully.

create the pool

Note that slots 8 and 9 are on their own two ports of an eight-port group, so they run faster than the rest and would fill unequally if mirrored with each other; hence the slot006/slot008 and slot007/slot009 pairings below.

zpool create -f -o ashift=12 -O compression=lz4 -O atime=off -O xattr=sa localDataStore \
mirror /dev/disk/by-enclosure-slot/front-slot000 /dev/disk/by-enclosure-slot/front-slot001 \
mirror /dev/disk/by-enclosure-slot/front-slot002 /dev/disk/by-enclosure-slot/front-slot003 \
mirror /dev/disk/by-enclosure-slot/front-slot004 /dev/disk/by-enclosure-slot/front-slot005 \
mirror /dev/disk/by-enclosure-slot/front-slot006 /dev/disk/by-enclosure-slot/front-slot008 \
mirror /dev/disk/by-enclosure-slot/front-slot007 /dev/disk/by-enclosure-slot/front-slot009 \
special mirror /dev/disk/by-enclosure-slot/nvme-upper-2-part1 /dev/disk/by-enclosure-slot/nvme-lower-0-part1
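
A quick status check afterwards should show the five mirrored data vdevs plus the mirrored NVMe special vdev:

zpool status localDataStore
zpool list -v localDataStore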

Backplane

The onboard backplane is a BPN-SAS3-116A-N2, which has 8 SAS bays plus 2 bays that take NVMe or SAS. However, that last part does not hold if you want to run 10 SAS disks: the right-most NVMe/SAS ports are called "SAS2" ports on the backplane, but are really SATA ports connected to the onboard SATA controller. As this is a plain backplane, not an expander, each physical SAS port from the controller connects to one drive. Since the included controller only had 8 ports, a 16-port controller is used.


NVME name spaces

https://narasimhan-v.github.io/2020/06/12/Managing-NVMe-Namespaces.html

The NVMe drives come set up as 1.88 TB disks:

tnvmcap   : 1,880,375,648,256
unvmcap   : 375,648,256

I suspect this is a 2.0 TiB (2,199,023,255,552 byte) device provisioned down in the controller to roughly 85%. Moving it to a 1.6 TB namespace under-provisions it further and should make it perform better if we use it as a log device or for other write-intensive work.

Ensure we're on 4096 bytes

nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"
nvme id-ns -H /dev/nvme1n1 | grep "LBA Format"
nvme id-ns -H /dev/nvme2n1 | grep "LBA Format"
nvme id-ns -H /dev/nvme3n1 | grep "LBA Format"
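
If any of them report a 512-byte format in use, nvme format can usually switch them to the 4 KiB format (this erases the namespace). The --lbaf index below is an assumption: pick whichever entry in the "LBA Format" list reports a 4096-byte data size.

# DESTRUCTIVE: reformats the namespace; verify the --lbaf index against the id-ns output first
nvme format /dev/nvme0n1 --lbaf=1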

Detach the namespace

nvme detach-ns /dev/nvme1 --namespace-id=1 --controllers=4
nvme detach-ns /dev/nvme3 --namespace-id=1 --controllers=4
nvme detach-ns /dev/nvme2 --namespace-id=1 --controllers=4
nvme detach-ns /dev/nvme0 --namespace-id=1 --controllers=4

Delete the namespace

nvme delete-ns /dev/nvme1 --namespace-id=1
nvme delete-ns /dev/nvme3 --namespace-id=1
nvme delete-ns /dev/nvme2 --namespace-id=1
nvme delete-ns /dev/nvme0 --namespace-id=1

Make the new namespace

nvme create-ns /dev/nvme1 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0
nvme create-ns /dev/nvme3 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0
nvme create-ns /dev/nvme2 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0
nvme create-ns /dev/nvme0 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0

Attach the namespace to the controller

nvme attach-ns /dev/nvme1 --namespace-id=1 --controllers=4
nvme attach-ns /dev/nvme3 --namespace-id=1 --controllers=4
nvme attach-ns /dev/nvme2 --namespace-id=1 --controllers=4
nvme attach-ns /dev/nvme0 --namespace-id=1 --controllers=4

reset the controller to make it visible to the OS

nvme reset /dev/nvme1
nvme reset /dev/nvme3
nvme reset /dev/nvme2
nvme reset /dev/nvme0

Confirm it

nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme4n1          /dev/ng4n1            S64ANS0T515282K      Samsung SSD 980 1TB                      1         114.21  GB /   1.00  TB    512   B +  0 B   2B4QFXO7
/dev/nvme3n1          /dev/ng3n1            S5XANA0R537286       MZ1LB1T9HBLS-000FB                       1           0.00   B /   1.60  TB      4 KiB +  0 B   EDW73F2Q
/dev/nvme2n1          /dev/ng2n1            S5XANA0R694994       MZ1LB1T9HBLS-000FB                       1         157.62  GB /   1.88  TB      4 KiB +  0 B   EDW73F2Q
/dev/nvme1n1          /dev/ng1n1            S5XANA0R682634       MZ1LB1T9HBLS-000FB                       1         157.47  GB /   1.88  TB      4 KiB +  0 B   EDW73F2Q
/dev/nvme0n1          /dev/ng0n1            S5XANA0R682645       MZ1LB1T9HBLS-000FB                       1           0.00   B /   1.60  TB      4 KiB +  0 B   EDW73F2Q

udev rules

Look here for the info on setting this up

Software

apt-get install sg3-utils-udev sdparm ledmon lsscsi net-tools nvme-cli lldpd rsyslog ipmitool vim unzip git fio sudo locate screen snmpd libsnmp-dev mstflint

configs

Screen

echo -e "#Bryan Config for scroll back buffer\ntermcapinfo xterm|xterms|xs|rxvt ti@:te@" >>/etc/screenrc

Bash Completion

configure bash completion for interactive shells

vim /etc/bash.bashrc
uncomment the block starting with
# enable bash completion in interactive shells
then add in ZFS completion:
. /usr/share/bash-completion/completions/zfs
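
For reference, this is roughly what the stock Debian stanza looks like once uncommented, with the ZFS completion line appended (exact contents vary slightly by release):

# enable bash completion in interactive shells
if ! shopt -oq posix; then
  if [ -f /usr/share/bash-completion/bash_completion ]; then
    . /usr/share/bash-completion/bash_completion
  elif [ -f /etc/bash_completion ]; then
    . /etc/bash_completion
  fi
fi
# add in zfs completion
. /usr/share/bash-completion/completions/zfs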

root profile

# You may uncomment the following lines if you want `ls' to be colorized:
export LS_OPTIONS='--color=auto'
eval "$(dircolors)"
alias ls='ls $LS_OPTIONS'

hostname

echo Fink >/etc/hostname
  • Edit the /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.8.186 Fink.keekles.org Fink
  • Reboot

/etc/hosts

This needs to be the same across the cluster

scp /etc/hosts root@fink:/etc/hosts

SSHD

Configure SSH so that root and regular users can only log in with public keys (no password authentication).
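
A minimal sketch of the relevant sshd settings, assuming "PSK" here means public-key authentication; the drop-in filename is arbitrary and not from this page:

# /etc/ssh/sshd_config.d/10-keys-only.conf  (filename is an assumption)
PubkeyAuthentication yes
PasswordAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin prohibit-password
# then reload: systemctl reload ssh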

snmp

  • Install SNMP
sudo apt-get -y install snmp snmpd libsnmp-dev
  • Stop the snmpd service so we can add a user
sudo service snmpd stop
  • Add a SNMPv3 user:
sudo net-snmp-config --create-snmpv3-user -ro -A keeklesSNMPpasswd -a SHA -x AES -X keeklesSNMPpasswd KeeklesSNMP
  • copy the /etc/snmp/snmpd.conf from carbonrod
scp /etc/snmp/snmpd.conf root@fink:/etc/snmp/snmpd.conf


  • edit the /etc/snmp/snmpd.conf
vim /etc/snmp/snmpd.conf
Update the syslocation and such
  • restart snmpd
service snmpd start
  • Test SNMP
# TEST
snmpwalk -v3 -u KeeklesSNMP -l authPriv -a SHA -A keeklespasswd -x aes -X keeklespasswd 192.168.8.186

observium client

  • Do this from eyes
  export SERVER=<hostname>
  • on target server:
sudo apt-get install snmpd php perl curl xinetd snmp libsnmp-dev libwww-perl 
#only needed for postfix
apt-get -y install rrdtool mailgraph
dpkg-reconfigure mailgraph
  • install the current distro program on the target:
 sudo curl -o /usr/local/bin/distro https://gitlab.com/observium/distroscript/raw/master/distro
 sudo chmod +x /usr/local/bin/distro


  • From eyes
scp /opt/observium/scripts/observium_agent_xinetd $SERVER:/etc/xinetd.d/observium_agent_xinetd
scp /opt/observium/scripts/observium_agent $SERVER:/usr/bin/observium_agent
ssh $SERVER mkdir -p /usr/lib/observium_agent
ssh $SERVER mkdir -p /usr/lib/observium_agent/scripts-available
ssh $SERVER mkdir -p /usr/lib/observium_agent/scripts-enabled
scp /opt/observium/scripts/agent-local/* $SERVER:/usr/lib/observium_agent/scripts-available
  • on target enable the various allowed items
ln -s /usr/lib/observium_agent/scripts-available/os /usr/lib/observium_agent/scripts-enabled
   #ln -s /usr/lib/observium_agent/scripts-available/zimbra /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/dpkg /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/ntpd /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/virt-what /usr/lib/observium_agent/scripts-enabled 
ln -s /usr/lib/observium_agent/scripts-available/proxmox-qemu /usr/lib/observium_agent/scripts-enabled
ln -s /usr/lib/observium_agent/scripts-available/postfix_mailgraph /usr/lib/observium_agent/scripts-enabled


  • Edit /etc/xinetd.d/observium_agent_xinetd so the Observium server is allowed to connect. You can do this by replacing 127.0.0.1, or by placing the Observium server's IP after it, separated by a space (the relevant line is shown after the test commands below). Make sure to restart xinetd afterwards so the configuration file is read.
sudo service xinetd restart
  • Test from eyes
telnet $SERVER 36602
snmpwalk -v3 -u KeeklesSNMP -l authNoPriv -a MD5 -A keeklespasswd $SERVER
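
For reference, the xinetd edit mentioned above comes down to a single line in /etc/xinetd.d/observium_agent_xinetd; 192.0.2.10 is a placeholder, substitute eyes' real address:

# inside the service stanza of /etc/xinetd.d/observium_agent_xinetd
only_from       = 127.0.0.1 192.0.2.10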

default editor

update-alternatives --config editor 
Then select #3 vim.basic

timezone

sudo timedatectl set-timezone UTC

sudo config

add local user accounts

useradd -s /bin/bash -m -U -G sudo bryan
  • copy over ssh
rsync -avz /home/bryan/.ssh root@192.168.8.186:/home/bryan/
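
If the UIDs differ between the servers (or ownership otherwise comes out wrong after the rsync), fix it on the new host; sshd will refuse keys with loose permissions:

chown -R bryan:bryan /home/bryan/.ssh
chmod 700 /home/bryan/.ssh
chmod 600 /home/bryan/.ssh/authorized_keys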

Configure sudo

echo "bryan ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers.d/00-admins

Configure Postfix

Postfix is installed to forward mail for root to an SMTP host.

apt-get install postfix mailutils

This will run an installer with a curses interface in which you must select Satellite System. Check that the system mail name is the hostname of the server and that the SMTP relay host is morty.keekles.org. Root and postmaster mail should go to rootmail@allstarlink.org.

Should you need to reconfigure this use:

dpkg-reconfigure postfix

Other aliases are set up in /etc/aliases. You must run newaliases after that file is updated for them to take effect.
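
An illustrative /etc/aliases fragment matching the addresses above; anything beyond these two entries is site-specific:

# /etc/aliases -- run newaliases after editing
postmaster: root
root: rootmail@allstarlink.org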

Network

NIC

The MLX4 2x40G NIC needs an option set in its kernel module so the ports come up in Ethernet mode rather than InfiniBand mode.

echo "options mlx4_core port_type_array=2,2" >/etc/modprobe.d/mlx4.conf

You'll need the card's PCI ID for the next commands, which set the port type to 2 (Ethernet) in the NIC firmware.

mstconfig -d 82:00.0 s LINK_TYPE_P1=2
mstconfig -d 82:00.0 s LINK_TYPE_P2=2
mstfwreset -d 82:00.0 -l3 -y reset
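
To find the PCI ID and confirm the change took, something along these lines should work with the mstflint tools:

lspci | grep -i mellanox                     # find the card's PCI address (82:00.0 here)
mstconfig -d 82:00.0 query | grep LINK_TYPE  # both ports should report type 2 (Ethernet) after the reset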

Now set up the mapping of MAC address to interface name for the qe0 and qe1 interfaces:

echo -e \
'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:02:c9:37:bc:80", NAME="qe0"'"\n"\
'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:02:c9:37:bc:81", NAME="qe1"' \
 >/etc/udev/rules.d/70-persistent-net.rules
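
After writing the rule, reloading udev and checking the link names is a reasonable sanity check; a reboot may still be needed before the rename applies to interfaces that are already up:

udevadm control --reload
udevadm trigger --subsystem-match=net --action=add
ip -br link | grep -E '^qe[01]'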

/etc/network/interfaces

Set this up based on the other servers; a rough, illustrative sketch follows.
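
The sketch below shows only the usual Proxmox shape: the bridge name, gateway, and single-port bridge are assumptions, and only Fink's 192.168.8.186 address comes from this page. Copy the real file from another node.

# /etc/network/interfaces -- illustrative sketch only, not the real config
auto lo
iface lo inet loopback

auto qe0
iface qe0 inet manual

auto vmbr0
iface vmbr0 inet static
        address 192.168.8.186/24
        gateway 192.168.8.1          # placeholder, not taken from this page
        bridge-ports qe0
        bridge-stp off
        bridge-fd 0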

/etc/network/if-up.d/00-keekles.sh

Copy this file to the server from one of the others.

/etc/pve/localfirewall.sh

Copy this from the other servers. This probably is not needed, since once the node joins the cluster it should appear there anyway.

domain names

Add these records to the DNS to set it up:

Fink.keekles.org        3600    IN      A       23.149.104.16
Fink.keekles.org        3600    IN      AAAA    2602:2af:0:1::a16

proxmox config

remove the nag

From https://johnscs.com/remove-proxmox51-subscription-notice/

sed -Ezi.bak "s/(function\(orig_cmd\) \{)/\1\n\torig_cmd\(\);\n\treturn;/g" /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js && systemctl restart pveproxy.service

Join Server to existing cluster

pvecm add 192.168.8.180
root@Fink:~# pvecm add 192.168.8.180
Please enter superuser (root) password for '192.168.8.180': **********
Establishing API connection with host '192.168.8.180'
The authenticity of host '192.168.8.180' can't be established.
X509 SHA256 key fingerprint is 53:FD:4B:EE:AC:7A:2C:10:60:05:71:58:99:45:26:EA:26:07:62:C0:6C:1B:46:F6:8A:DC:3D:32:99:E0:55:51.
Are you sure you want to continue connecting (yes/no)? yes
Login succeeded.
check cluster join API version
No cluster network links passed explicitly, fallback to local node IP '192.168.8.186'
Request addition of this node
Join request OK, finishing setup locally
stopping pve-cluster service
backup old database to '/var/lib/pve-cluster/backup/config-1724515764.sql.gz'
waiting for quorum...OK
(re)generate node files
generate new node certificate
merge authorized SSH keys
generated new node certificate, restart pveproxy and pvedaemon services
successfully added node 'Fink' to cluster.
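
A quick check from the new node that it really joined and the cluster is quorate:

pvecm status
pvecm nodes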

local storage

localDataStore is the pool for this

make a dataset for this

zfs create localDataStore/proxmox

copy it over from the rpool

zfs snapshot rpool/var-lib-vz@xfer
zfs send -vR rpool/var-lib-vz@xfer | zfs receive -Fdu localDataStore
zfs umount rpool/var-lib-vz
zfs destroy -r rpool/var-lib-vz
zfs mount localDataStore/var-lib-vz
zfs destroy -v localDataStore/var-lib-vz@xfer
zfs set mountpoint=none rpool/data
zfs set mountpoint=/localDataStore/proxmox localDataStore/proxmox
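
A quick check that the datasets and mountpoints ended up where expected:

zfs list -r -o name,used,mountpoint localDataStore rpool/data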


/etc/pve/storage.cfg

Add a new storage definition LDS-zfs and assign it only to the new node.

zfspool: local-zfs
        pool rpool/data
        blocksize 64K
        content rootdir,images
        sparse 1
        nodes SpiderPig Moleman CarbonRod

zfspool: LDS-zfs
        pool localDataStore/proxmox
        blocksize 32K
        content rootdir,images
        sparse 1
        nodes Fink

Reference material

Samsung SSD 845DC 04 Over-provisioning

QNAP - SSD Over-provisioning White Paper

Innodisk SATADOM-SL Datasheet

https://medium.com/@reefland/over-provisioning-ssd-for-increased-performance-and-write-endurance-142feb015b4e