Hypervisor 1RU 2024

From W9CR
Revision as of 11:29, 24 August 2024

This page documents the hypervisor build for 2024.

It is a 1 RU server with fast local SSD and approximately 18 TB of storage.

Bios settings

A patched BIOS is needed for this to boot via UEFI and to enable the PCI hotplug menu in the BIOS setup. I was able to edit this and have posted the latest version here.



Disk Layout

Front (looking at the front):

 Slot001  Slot003  Slot005  Slot007  Slot009
 Slot000  Slot002  Slot004  Slot006  Slot008

Rear (looking at the rear):

 40g nic  Upper-2  Upper-3
          Lower-0  Lower-1

boot disk

Boot disk = 128 GB ZFS mirror

 Number  Start (sector)    End (sector)  Size        Code
   1              34            2047     1007.0 KiB  EF02
   2            2048         2099199     1024.0 MiB  EF00
   3         2099200       838860800      128.0 GiB  BF01
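The same layout can be reproduced non-interactively with sgdisk, the scriptable sibling of the gdisk tool used below; the sector numbers are copied from the table above. The scratch image name and size here are placeholders for a rehearsal run, so point it at the real mirror members when partitioning for real.

```shell
# Rehearse the boot-disk GPT against a scratch image first; substitute the
# real device (e.g. a boot mirror member) when doing it for real.
DISK=boot-rehearsal.img                        # placeholder target
truncate -s 1T "$DISK"                         # sparse file, uses no real space
sgdisk -a1 -n1:34:2047 -t1:EF02 "$DISK"        # BIOS boot, 1007 KiB (-a1 allows the unaligned start)
sgdisk -n2:2048:2099199 -t2:EF00 "$DISK"       # EFI system, 1 GiB
sgdisk -n3:2099200:838860800 -t3:BF01 "$DISK"  # ZFS boot pool
sgdisk -p "$DISK"                              # print the table to confirm
```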

ZFS storage

/data

5 mirrored vdevs of the SAS disks
2-way mirror special vdev on 384 GB NVMe partitions; the partition sits on an otherwise dedicated disk, so it can grow later
optional log and L2ARC on the boot NVMe drives

partition the NVME

root@pve:~# gdisk /dev/nvme0n1
GPT fdisk (gdisk) version 1.0.9

Partition table scan:
  MBR: not present
  BSD: not present
  APM: not present
  GPT: not present

Creating new GPT entries in memory.

Command (? for help): n
Partition number (1-128, default 1): 1
First sector (6-390624994, default = 256) or {+-}size{KMGTP}: 2048
Last sector (2048-390624994, default = 390624767) or {+-}size{KMGTP}: +384G
Current type is 8300 (Linux filesystem)
Hex code or GUID (L to show codes, Enter = 8300): BF01
Changed type of partition to 'Solaris /usr & Mac ZFS'

Command (? for help): p
Disk /dev/nvme0n1: 390625000 sectors, 1.5 TiB
Model: MZ1LB1T9HBLS-000FB
Sector size (logical/physical): 4096/4096 bytes
Disk identifier (GUID): 5A039EE7-96B6-4D53-BEA4-56BCF48F4ABA
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 5
First usable sector is 6, last usable sector is 390624994
Partitions will be aligned on 256-sector boundaries
Total free space is 289961693 sectors (1.1 TiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048       100665343   384.0 GiB   BF01  Solaris /usr & Mac ZFS

Command (? for help): w

Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!

Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/nvme0n1.
The operation has completed successfully.

create the pool

Note that slots 8 and 9 are on their own 2 ports, separate from the other 8, so they run faster than the rest and would fill unequally if paired with each other. Each is therefore mirrored against a slower slot instead.

zpool create -f -o ashift=12 -O compression=lz4 -O atime=off -O xattr=sa localDataStore \
mirror /dev/disk/by-enclosure-slot/front-slot000 /dev/disk/by-enclosure-slot/front-slot001 \
mirror /dev/disk/by-enclosure-slot/front-slot002 /dev/disk/by-enclosure-slot/front-slot003 \
mirror /dev/disk/by-enclosure-slot/front-slot004 /dev/disk/by-enclosure-slot/front-slot005 \
mirror /dev/disk/by-enclosure-slot/front-slot006 /dev/disk/by-enclosure-slot/front-slot008 \
mirror /dev/disk/by-enclosure-slot/front-slot007 /dev/disk/by-enclosure-slot/front-slot009 \
special mirror /dev/disk/by-enclosure-slot/nvme-upper-2-part1 /dev/disk/by-enclosure-slot/nvme-lower-0-part1
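A note on the options above: ashift is the log2 of the sector size ZFS assumes for a vdev, and it is fixed for the life of the vdev, so it must match the physical 4096-byte sectors at creation time.

```shell
# ashift=12 pins the pool's minimum block size to 2^12 bytes, matching the
# 4K-sector SAS drives and the NVMe namespaces formatted below.
echo $((2 ** 12))    # prints 4096
```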

Backplane

The onboard backplane is a BPN-SAS3-116A-N2, which takes 8 SAS disks plus 2 bays that are nominally NVMe or SAS. The SAS option on those two bays does not hold up if you want to run 10 SAS disks: the right-most NVMe/SAS ports are labeled "SAS2" on the backplane, but they are really SATA ports wired to the onboard SATA controller. As this is a plain backplane, not an expander, each physical SAS port from the controller connects to one drive, and since the included controller only had 8 ports, a 16-port controller is used.


NVMe namespaces

https://narasimhan-v.github.io/2020/06/12/Managing-NVMe-Namespaces.html

The NVMe drives come set up as 1.88 TB disks:

tnvmcap   : 1,880,375,648,256
unvmcap   : 375,648,256

I suspect this is a 2.0 TiB (2,199,023,255,552 B) raw device provisioned down in the controller, to about 85%. Re-creating it as a 1.6 TB namespace will under-provision it further and make it perform better in the event we use it as a log device or for write-intensive workloads.
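The arithmetic behind that estimate, with sizes in bytes (the tnvmcap figure is from the output above; the 2.0 TiB raw figure is my assumption):

```shell
# Over-provisioning ratios: exposed capacity as a fraction of the assumed
# 2.0 TiB of raw flash, for the factory 1.88 TB namespace and the 1.6 TB one.
awk 'BEGIN {
  raw = 2199023255552                                    # 2.0 TiB (assumed)
  printf "factory: %.1f%%\n", 100 * 1880375648256 / raw  # tnvmcap
  printf "resized: %.1f%%\n", 100 * 1600000000000 / raw  # 1.6 TB namespace
}'
# prints: factory: 85.5%  then  resized: 72.8%
```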

Ensure we're on 4096-byte sectors

nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"
nvme id-ns -H /dev/nvme1n1 | grep "LBA Format"
nvme id-ns -H /dev/nvme2n1 | grep "LBA Format"
nvme id-ns -H /dev/nvme3n1 | grep "LBA Format"

Detach the namespace

nvme detach-ns /dev/nvme1 --namespace-id=1 --controllers=4
nvme detach-ns /dev/nvme3 --namespace-id=1 --controllers=4
nvme detach-ns /dev/nvme2 --namespace-id=1 --controllers=4
nvme detach-ns /dev/nvme0 --namespace-id=1 --controllers=4

Delete the namespace

nvme delete-ns /dev/nvme1 --namespace-id=1
nvme delete-ns /dev/nvme3 --namespace-id=1
nvme delete-ns /dev/nvme2 --namespace-id=1
nvme delete-ns /dev/nvme0 --namespace-id=1

Make the new namespace

nvme create-ns /dev/nvme1 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0
nvme create-ns /dev/nvme3 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0
nvme create-ns /dev/nvme2 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0
nvme create-ns /dev/nvme0 --nsze-si=1.6T --ncap-si=1.6T --flbas=0 --dps=0 --nmic=0

Attach the namespace to the controller

nvme attach-ns /dev/nvme1 --namespace-id=1 --controllers=4
nvme attach-ns /dev/nvme3 --namespace-id=1 --controllers=4
nvme attach-ns /dev/nvme2 --namespace-id=1 --controllers=4
nvme attach-ns /dev/nvme0 --namespace-id=1 --controllers=4

Reset the controllers to make the new namespaces visible to the OS

nvme reset /dev/nvme1
nvme reset /dev/nvme3
nvme reset /dev/nvme2
nvme reset /dev/nvme0

Confirm it

nvme list
Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme4n1          /dev/ng4n1            S64ANS0T515282K      Samsung SSD 980 1TB                      1         114.21  GB /   1.00  TB    512   B +  0 B   2B4QFXO7
/dev/nvme3n1          /dev/ng3n1            S5XANA0R537286       MZ1LB1T9HBLS-000FB                       1           0.00   B /   1.60  TB      4 KiB +  0 B   EDW73F2Q
/dev/nvme2n1          /dev/ng2n1            S5XANA0R694994       MZ1LB1T9HBLS-000FB                       1         157.62  GB /   1.88  TB      4 KiB +  0 B   EDW73F2Q
/dev/nvme1n1          /dev/ng1n1            S5XANA0R682634       MZ1LB1T9HBLS-000FB                       1         157.47  GB /   1.88  TB      4 KiB +  0 B   EDW73F2Q
/dev/nvme0n1          /dev/ng0n1            S5XANA0R682645       MZ1LB1T9HBLS-000FB                       1           0.00   B /   1.60  TB      4 KiB +  0 B   EDW73F2Q

udev rules

Look here for the info on setting this up

Software

apt-get install sg3-utils-udev sdparm ledmon lsscsi net-tools nvme-cli lldpd rsyslog ipmitool vim unzip git fio sudo locate screen snmpd libsnmp-dev 

configs

Screen

echo -e "#Bryan Config for scroll back buffer\ntermcapinfo xterm|xterms|xs|rxvt ti@:te@" >>/etc/screenrc

Bash Completion

configure bash completion for interactive shells

vim /etc/bash.bashrc
Uncomment the block below:
# enable bash completion in interactive shells
# add in zfs completion
. /usr/share/bash-completion/completions/zfs

root profile

# You may uncomment the following lines if you want `ls' to be colorized:
export LS_OPTIONS='--color=auto'
eval "$(dircolors)"
alias ls='ls $LS_OPTIONS'

hostname

echo Fink >/etc/hostname
  • Edit the /etc/hosts
127.0.0.1 localhost.localdomain localhost
192.168.8.186 Fink.keekles.org Fink
  • Reboot

snmp

  • Install SNMP
sudo apt-get -y install snmp snmpd libsnmp-dev
  • Stop the snmpd service so we can add a user
sudo service snmpd stop
  • Add a SNMPv3 user:
sudo net-snmp-config --create-snmpv3-user -ro -A AuthPassword -a MD5 -x AES privUser

sudo net-snmp-config --create-snmpv3-user -ro -A keeklespasswd -a MD5 -x AES KeeklesSNMP
  • copy the /etc/snmp/snmpd.conf from eyes to the subject server
scp /etc/snmp/snmpd.conf bryan@p25hub.w9cr.net:/home/bryan
mv /home/bryan/snmpd.conf /etc/snmp/snmpd.conf
  • edit the /etc/snmp/snmpd.conf
vim /etc/snmp/snmpd.conf
Update the syslocation and such
  • restart snmpd
service snmpd start
  • Test SNMP
# TEST
snmpwalk -v3 -u KeeklesSNMP -l authNoPriv -a MD5 -A keeklespasswd 127.0.0.1


observium client

  export SERVER=<hostname>
  • on target server:
   sudo apt-get install snmpd php perl curl xinetd snmp libsnmp-dev libwww-perl 
   #only needed for postfix
   apt-get -y install rrdtool mailgraph
   dpkg-reconfigure mailgraph
  • install the current distro program on the target:
   sudo curl -o /usr/local/bin/distro https://gitlab.com/observium/distroscript/raw/master/distro
   sudo chmod +x /usr/local/bin/distro
  • From eyes
   scp /opt/observium/scripts/observium_agent_xinetd $SERVER:/etc/xinetd.d/observium_agent_xinetd
   scp /opt/observium/scripts/observium_agent $SERVER:/usr/bin/observium_agent
   ssh $SERVER mkdir -p /usr/lib/observium_agent
   ssh $SERVER mkdir -p /usr/lib/observium_agent/scripts-available
   ssh $SERVER mkdir -p /usr/lib/observium_agent/scripts-enabled
   scp /opt/observium/scripts/agent-local/* $SERVER:/usr/lib/observium_agent/scripts-available
  • on target enable the various allowed items
   ln -s /usr/lib/observium_agent/scripts-available/os /usr/lib/observium_agent/scripts-enabled
   #ln -s /usr/lib/observium_agent/scripts-available/zimbra /usr/lib/observium_agent/scripts-enabled
   ln -s /usr/lib/observium_agent/scripts-available/dpkg /usr/lib/observium_agent/scripts-enabled
   ln -s /usr/lib/observium_agent/scripts-available/ntpd /usr/lib/observium_agent/scripts-enabled
   ln -s /usr/lib/observium_agent/scripts-available/virt-what /usr/lib/observium_agent/scripts-enabled 
   #ln -s /usr/lib/observium_agent/scripts-available/postfix_mailgraph /usr/lib/observium_agent/scripts-enabled


  • Edit /etc/xinetd.d/observium_agent_xinetd so the Observium server is allowed to connect. You can do this by substituting 127.0.0.1, or place your IP after it, separated by a space. Make sure to restart xinetd afterwards so the configuration file is read.
   sudo service xinetd restart
  • Test from eyes
   telnet $SERVER 36602
   snmpwalk -v3 -u KeeklesSNMP -l authNoPriv -a MD5 -A keeklespasswd $SERVER


default editor

update-alternatives --config editor 
Then select #3 vim.basic

timezone

sudo timedatectl set-timezone UTC

sudo config

add local user accounts

useradd -s /bin/bash -m -U -G sudo bryan
  • copy over ssh
rsync -avz /home/bryan/.ssh root@192.168.8.186:/home/bryan/

Configure sudo

echo "bryan ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers.d/00-admins

Configure Postfix

Postfix is installed to forward mail for root to an SMTP host.

apt-get install postfix mailutils

This will run an installer with a curses interface, where you must select "Satellite system". Check that the system mail name is the hostname of the server and that the SMTP relay host is morty.keekles.org. Root and postmaster mail should go to rootmail@allstarlink.org.
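If you would rather skip the curses screens, the same answers can be preseeded with debconf-set-selections before installing. The template names below are the standard postfix debconf questions; the values mirror the choices described above.

```
# feed to: debconf-set-selections, before "apt-get install postfix mailutils"
postfix postfix/main_mailer_type select Satellite system
postfix postfix/mailname        string Fink.keekles.org
postfix postfix/relayhost       string morty.keekles.org
postfix postfix/root_address    string rootmail@allstarlink.org
```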

Should you need to reconfigure this use:

dpkg-reconfigure postfix

Other aliases are set up in /etc/aliases. You must run newaliases after the file is updated for them to take effect.
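For example, a minimal /etc/aliases; the forwarding address matches the postfix setup above, and the other entries are illustrative:

```
# /etc/aliases -- run "newaliases" after editing
postmaster: root
webmaster:  root
root:       rootmail@allstarlink.org
```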

Network

NIC

The MLX4 2x40G NIC needs an option on its kernel module to put the ports in Ethernet rather than InfiniBand mode. Reload the mlx4_core module or reboot for the change to take effect.

echo "options mlx4_core port_type_array=2,2" >/etc/modprobe.d/mlx4.conf

Now set up the mapping of port to MAC for the qe0 and qe1 interfaces:

echo -e \
'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:02:c9:37:bc:80", NAME="qe0"'"\n"\
'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:02:c9:37:bc:81", NAME="qe1"' \
 >/etc/udev/rules.d/70-persistent-net.rules

/etc/network/interfaces

Set this up based on the other servers.

/etc/network/if-up.d/00-keekles.sh

Copy this file to the server from one of the others.

/etc/pve/localfirewall.sh

Copy this from the other servers. This probably is not needed, since it should appear once the node joins the cluster.

domain names

Edit the DNS to set this up:

Fink.keekles.org        3600    IN      A       23.149.104.16
Fink.keekles.org        3600    IN      AAAA    2602:2af:0:1::a16

proxmox config

remove the nag

From https://johnscs.com/remove-proxmox51-subscription-notice/

sed -Ezi.bak "s/(function\(orig_cmd\) \{)/\1\n\torig_cmd\(\);\n\treturn;/g" /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js && systemctl restart pveproxy.service

local storage

localDataStore is the pool for this

make a dataset for this

zfs create localDataStore/proxmox

copy it over from the rpool

zfs snapshot rpool/var-lib-vz@xfer
zfs send -vR rpool/var-lib-vz@xfer | zfs receive -Fdu localDataStore
zfs umount rpool/var-lib-vz
zfs destroy -r rpool/var-lib-vz
zfs mount localDataStore/var-lib-vz
zfs destroy -v localDataStore/var-lib-vz@xfer

edit /etc/pve/storage.cfg

dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

zfspool: local-zfs
        pool localDataStore/proxmox
        sparse
        content images,rootdir

Reference material

Samsung SSD 845DC 04 Over-provisioning

QNAP - SSD Over-provisioning White Paper

Innodisk SATADOM-SL Datasheet

https://medium.com/@reefland/over-provisioning-ssd-for-increased-performance-and-write-endurance-142feb015b4e