Monday, May 17, 2021

HPE H240 - the new value king of HBAs

With the release of ESXi 7.0 came a farewell to the vmklinux driver stack. Device vendors focused on writing native drivers for current-generation hardware, meaning that a large subset of network cards and storage controllers were left behind (a near-complete list of deprecated devices can be found on vDan's blog: https://vdan.cz/deprecated-devices-supported-by-vmklinux-drivers-in-esxi-7-0/). One casualty in particular was the LSI 92xx series of HBAs, which utilized the mpt2sas driver. These cards are widely used for vSAN 6.x as well as other storage server operating systems. While the 92xx series can still be used in BSD- and Linux-based systems, this leaves a gap for those who want to run vSAN 7.0. The LSI 93xx is readily available and uses the lsi_msgpt3 native driver, but typically runs in the $80-100+ range.


The new value king is... an unlikely candidate. It isn't vendor neutral, although in my testing it should work on most mainstream systems. It advertises 12Gb/s connectivity, but still uses the old-style mini-SAS SFF-8087 connectors. The new value king for vSAN 7.0 usage is the HPE Smart HBA H240. At a price of $22-30 (as of this writing) depending on the bracket needed, the H240 proves to be a pretty capable card. It supports RAID 0, 1, and 5, though I wouldn't recommend it for that purpose since it has no cache. What is critical about this card is that it has a native driver, which is supported in ESXi 7.0 and is part of the standard ESXi image.
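If you want to double-check that ESXi has claimed a controller with a native driver once it's installed, a generic query from the host shell lists every adapter along with the driver in use (this is standard esxcli, not anything H240-specific):

    esxcli storage core adapter list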


The major concern I had was whether this card would work in a non-HPE system. My homelab is composed of unorthodox, whiteboxed machines. The Cisco UCS C220 M4 is the only complete server that I have - the 2-node vSAN cluster I ran on 6.x consisted of a Dell Precision 7810 and a SuperMicro X10 motherboard in an E-ATX case. Introducing the card to both systems went without issue - all drives were detected, and the card defaulted to drive passthrough mode (non-RAID). One caveat is that I am using directly cabled drives - the only backplane I have to test with is the Cisco's, and it doesn't appear to support hot swap. The other issue I've found is that you cannot set the controller as a boot device, although I didn't purchase it for that purpose. If you're looking for these capabilities, I would suggest sticking with HPE ProLiant servers, or finding a cheaper LSI 93xx series controller.


For my use case, the HPE H240 was a drop-in replacement that brought my old 2-node vSAN cluster onto 7.0 without much drama. The H8SCM micro-ATX server remains on 6.7, but is more than capable of running the 7.0 witness appliance. Here are a few shots of the environment post-HBA swap:




Wednesday, May 12, 2021

Cisco UCS M4 - Out of band management update post-Flash era (the easy way)

Historically, the easiest way to update Cisco UCS CIMC firmware has been to load the update ISO in the KVM virtual media, reboot the server, and check off the updates required. Now that Flash has been discontinued, folks can no longer log in to the CIMC and access the KVM. The only official workaround is to boot a supported Linux distribution and use a custom bash script to dd the update image onto a USB drive.

Fortunately, there is a much easier workaround that involves using your CIMC login credentials to download the JNLP file required for KVM. Enter the following into your browser:

https://<CIMC_IP>/kvm.jnlp?cimcAddr=<CIMC_IP>&tkn1=<CIMC_username>&tkn2=<CIMC_Password>


Doing so downloads the JNLP file and allows you to access the KVM without having to log into the Flash-based CIMC interface. I'd suggest updating as soon as possible; the HTML5 client is excellent. Unfortunately, the UCS M3 CIMC will always be Flash-based, but the same workaround can be used to access its KVM.
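If you'd rather script it, something like the following should work - a rough sketch with a placeholder address and credentials, using -k because the CIMC certificate is usually self-signed, and assuming Java Web Start (javaws) is installed locally:

    curl -k -o kvm.jnlp "https://192.168.1.50/kvm.jnlp?cimcAddr=192.168.1.50&tkn1=admin&tkn2=password"
    javaws kvm.jnlp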

Wednesday, February 17, 2021

Quality of life guide to homelabbing

 





Intel NUC

Short for Next Unit of Computing, these tiny boxes are desktops built on mobile (and sometimes desktop) processors. They don't produce much noise or heat, but you wind up paying a premium for that. While they scale out quite nicely, their scale-up ability is hampered by the form factor, and they will require external storage of some sort. The NUC makes a great candidate for hosting vCenter and a vSAN witness node for 2-node clusters.

AMD-based systems in a NUC-like form factor exist as well, but they lack Thunderbolt (for now). There are also third-party vendors that make tiny form factor desktops, but these often use Realtek NICs, which (again, for now) are not compatible with ESXi. If you buy one of these, be sure to also purchase a compatible USB NIC and load the USB Network Native Driver Fling.
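On ESXi 7.0, installing the fling is typically a one-liner from the host shell once the component zip has been uploaded to a datastore (the path and filename below are illustrative):

    esxcli software component apply -d /vmfs/volumes/datastore1/usb-nic-fling-component.zip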

It is worth noting that second-hand workstation laptops can often be had at a similar price point. While these sacrifice form factor, they allow for a little more storage, have a display, and include a built-in battery. The same number of USB/Thunderbolt ports can be had as well.


Rack server

Referring back to my 2020 Homelab Buyer's Guide, rackmount servers are the go-to for most datacenters. Second-hand servers can be found on eBay for a fraction of what they sold for new, as e-cyclers resell off-lease and retired equipment. While this hits the sweet spot in terms of price, capacity, and performance, it is easily the loudest and least space-efficient option on the chart. Rackmount servers... well, require a rack to mount them to. You can get away with stacking one on top of another, but you'll be cursing yourself when Murphy's Law takes out a memory module on the bottom server. 2U servers tend to be quieter, and some have better fan controls than others. Unless you have a dedicated room or shed that is properly prepared for such an endeavor, I'd suggest looking at other options.


Tower workstation

Although workstations are not servers, they have a lot in common. The Precision T7810 and HP Z840 are both examples that share processors with 13G/G10 servers and have a large number of PCI-E slots to work with. Workstations have less memory capacity, require dedicated graphics (no onboard VGA chipset - a critical callout if you buy a barebones setup), and cannot hold as many drives (usually 3x 3.5", cabled). For these reasons, they come at a significant discount, as they are viewed as prosumer equipment. Noise is practically a non-issue, since the large form factor supports bigger, quieter fans. Proper planning should be done if more than one is purchased - these are not apartment friendly. Workstations will often check most of the boxes for the performance/dollar/noise ratio.


Tower server

Tower servers are intended for small offices - for those who need the power of a server that can comfortably sit under a desk. Unlike workstations, these have much larger capacities in terms of memory and hard drives. For instance, the PowerEdge T630 can hold up to eighteen 3.5" disks or thirty-two 2.5" disks. This makes it a great candidate for a DIY NAS. Tower servers can often be converted to a rackmount configuration as well, giving them more versatility should you choose to incorporate rackmount servers later. Because tower servers have much greater memory and disk density, they command a bit more of a premium than tower workstations. Again, towers are not apartment friendly in terms of space.


Cloud instance

Instead of camping on eBay, waiting on a shipment (if it ever comes), planning where you're going to put the things when they arrive, cursing out network equipment, finding out the server requires 240V, etc., etc... why not just spin up an instance in the cloud? The big three (GCP, AWS, Azure) all have a free tier/trial period if you just want to learn how to run specific apps.

So why is pricing in red? Well, public cloud is easily the most expensive way to get started with homelabbing. The free tiers are great for running one app at a time, but if you're looking to do anything more than that, you'll need to open up your wallet. I hesitated to mark all but "space" green here; while capacity and performance are top notch, you really need to be mindful of how much you use. If I could add one thing to my Amazon wishlist, it would be a homelab tier that dumped everything on the lowest-tier equipment, without SLA guarantees, for a deep discount.

Lightsail might be the closest thing to that "homelab tier", as you can get started with a Linux machine for $3.50 a month. The price scales with the resources required, but Lightsail is probably one of the more turnkey options on the market.

Wednesday, September 30, 2020

CentOS 7 Kubernetes Installation Guide

Kubernetes has been a bit of a buzzword lately. I haven't had much exposure to it, having focused my energy on VMware products over the last few years. Today, we'll cover the Kubernetes install process on my go-to Linux distribution, CentOS 7.

Step -1: Why?

I like to work through exercises to get a better understanding. A lot of the documentation surrounding Kubernetes, like most technologies, covers the "what" but not the "why". To add some color to "why containers", we'll solve a problem I actually have for personal use: a Plex media server.

It used to be that Plex had to be installed on bare-metal Windows or Linux. Virtualization helped by allowing a VM to be right-sized for this purpose, freeing up resources for other machines. Containers simplified this even more; instead of having to install a VM from scratch or from a template, update the OS, and then install the application, I can deploy Plex with a single command. Kubernetes takes this one step further by adding management and orchestration - similar to a command-line-only version of vCenter managing ESXi hosts.
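To make "a single command" concrete, here's roughly what that looks like with Docker and the official plexinc/pms-docker image (the host paths are illustrative):

    docker run -d --name plex -p 32400:32400 -v /opt/plex/config:/config -v /mnt/media:/data plexinc/pms-docker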

Step 0: Prerequisites

For this exercise, I'm going to be using an ESXi 7.0 host with an NFS datastore mounted. Any type of datastore will do; we just need a place to put the VMDKs.

We'll be creating 3 virtual machines for this: one master node and two worker nodes. The minimum requirements for each VM are as follows:

    - 2 vCPUs for the master node, 1 for each worker node (official requirements are 1.5 and 0.7)

    - 2GB of memory for the master, 1GB for workers

    - I'm using 100GB of storage each but could probably get away with less

When in the installation environment, be sure to select the "Virtualization Host" option. This should install Docker automagically.

I'll include the step when the time comes in the guide, but a big callout that should be made is that installing Kubernetes requires disabling SELinux. I initially wasn't comfortable with this; when I have an issue caused by SELinux, journalctl usually gives me the reason why and the steps to fix it. However, the kubeadm notes indicate that SELinux must be disabled so that containers can access the host filesystem. I don't plan on putting anything else on these VMs, and neither should you.

While this isn't an official requirement, I would strongly recommend configuring NTP. Kubernetes requires the nodes to maintain the same time. NTP is a great way to maintain consistent time between VMs without much manual intervention. 

After installing CentOS 7, run the following commands on all three VMs:

Install ntp and ntpdate (ntp sync)

        yum install ntp ntpdate

Start the service

        systemctl start ntpd

Enable the service

        systemctl enable ntpd

Perform a one-time sync against the CentOS pool servers

        ntpdate -u -s 0.centos.pool.ntp.org 1.centos.pool.ntp.org 2.centos.pool.ntp.org

Restart the service

        systemctl restart ntpd

Check time, ensure that it's the same on all three VMs

        timedatectl

Set the hardware clock to match current system time

        hwclock -w


Step 1: Configure Kubernetes Repository

Kubernetes packages are not in the official CentOS 7 repositories, so we'll add the Kubernetes repo ourselves. Skipping this will cause step 2 to fail. Feel free to use the text editor of your choice to add the repository; in this tutorial, I'll be using vi. Be sure to carry out the following on all of the nodes.

    vi /etc/yum.repos.d/kubernetes.repo

    [kubernetes]
    name=Kubernetes
    baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg

Save and quit.
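To verify the repo was picked up, a quick check can't hurt (nothing Kubernetes-specific here, just standard yum):

    yum repolist enabled | grep -i kubernetes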

Step 2: Install required packages

Install the following packages on each node as a super user:

    yum install -y kubelet kubeadm kubectl

Enable kubelet on each node:

    systemctl enable kubelet

And finally, start kubelet on each node:

    systemctl start kubelet
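Don't be alarmed if kubelet shows as failing or restarting at this point - it has no configuration to work with until kubeadm init runs in step 8. You can watch its state with:

    systemctl status kubelet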

Step 3: Set hostname

Record the IP addresses of each VM (we'll need these later to update the hosts file):

    ip addr

Choose one of the VMs to be the master node, and run the following command:

    hostnamectl set-hostname master-node

Choose which VMs will be worker-node1 and worker-node2. Run this on the first:

    hostnamectl set-hostname worker-node1

And this on the second:

    hostnamectl set-hostname worker-node2

Once complete, be sure to edit /etc/hosts to reflect these changes (using the IPs recorded previously):

    vi /etc/hosts
    192.168.0.10    master-node
    192.168.0.11    worker-node1
    192.168.0.12    worker-node2

Save and quit.
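A quick sanity check from the master node confirms that the names resolve:

    ping -c 2 worker-node1
    ping -c 2 worker-node2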

Step 4: Configure firewall-cmd

We'll need to open up some ports to allow the nodes to communicate with one another.

Run the following on the master node:

    firewall-cmd --permanent --add-port=6443/tcp
    firewall-cmd --permanent --add-port=2379-2380/tcp
    firewall-cmd --permanent --add-port=10250/tcp
    firewall-cmd --permanent --add-port=10251/tcp
    firewall-cmd --permanent --add-port=10252/tcp
    firewall-cmd --permanent --add-port=10255/tcp
    firewall-cmd --reload

On each of the worker nodes:

    firewall-cmd --permanent --add-port=6783/tcp
    firewall-cmd --permanent --add-port=10250/tcp
    firewall-cmd --permanent --add-port=10255/tcp
    firewall-cmd --permanent --add-port=30000-32767/tcp
    firewall-cmd --reload
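To confirm the rules took effect, you can list the open ports on each node:

    firewall-cmd --list-ports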

Step 5: Update Kubernetes firewall configuration

Kubernetes uses iptables by default. To ensure that bridged traffic is passed to iptables (and thus that our firewalld rules work), we'll need to modify the sysctl.conf file. Run this command on each node:

    echo 'net.bridge.bridge-nf-call-iptables=1' | sudo tee -a /etc/sysctl.conf

This will persist through reboots. If for some reason you only want it for the current session, run:

    echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
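If sysctl complains that the key doesn't exist, the br_netfilter kernel module probably isn't loaded yet; loading it and re-reading sysctl.conf should sort it out:

    modprobe br_netfilter
    sysctl -p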

Step 6: Disable SELinux

As mentioned in Step 0 above, we need to disable SELinux (in other words, set it to permissive mode) so containers can access the host filesystem. Run the following on each node:

    setenforce 0
    sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

Step 7: Disable swap

We'll need to disable swap on each node:
    
    swapoff -a
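Note that swapoff -a only lasts until the next reboot. To make the change permanent, comment out the swap entry in /etc/fstab as well - a one-liner like the following works (a sketch; eyeball your fstab before and after running it):

    sed -i '/ swap / s/^/#/' /etc/fstab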

Step 8: Deploy the cluster

Use the following command to initialize the cluster:

    kubeadm init

This will take a few minutes, and should generate a token that we'll need to copy for future use, similar to:
    --token i5tx56.q5q6tx37m9j2acea --discovery-token-ca-cert-hash sha256:2a0c84ce92c6185009941730aac1955e7a445d5115b768c5955dab356e645fd5
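Don't panic if you misplace it - tokens expire after 24 hours by default, and a fresh join command can be generated on the master at any time:

    kubeadm token create --print-join-command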

Before joining the worker nodes, we need to install a pod network, which allows the nodes to communicate. There are several options to choose from; for this guide, we'll use Weave Net.

Capture your current kubectl version, base64-encoded, in an environment variable:

    export kubever=$(kubectl version | base64 | tr -d '\n')

Then run this command to apply the weave network:

    kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"

Once complete, join the worker nodes by running the following command on each (using the token documented above as an example):

    kubeadm join 192.168.0.10:6443 --token i5tx56.q5q6tx37m9j2acea --discovery-token-ca-cert-hash sha256:2a0c84ce92c6185009941730aac1955e7a445d5115b768c5955dab356e645fd5

Once the workers have joined, you should be able to check the status of the cluster. Run this on the master node:

    kubectl get nodes
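If everything went well, you should see output along these lines (illustrative - names, ages, and versions will differ, and nodes may show NotReady for a minute or two while the Weave pods start):

    NAME           STATUS   ROLES    AGE   VERSION
    master-node    Ready    master   10m   v1.19.2
    worker-node1   Ready    <none>   2m    v1.19.2
    worker-node2   Ready    <none>   2m    v1.19.2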

Finally, we need to set up kubectl access for a user. I ran all of the previous commands as root, and used:

    mkdir -p $HOME/.kube
    cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    chown $(id -u):$(id -g) $HOME/.kube/config

If you're using a sudo enabled user, run:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

And that should do it! In the next blog post, I'll be checking out a project that goes back to the example used previously... can we get a Plex transcode running on a Kubernetes cluster?

Monday, June 22, 2020

ZOTAC P106-90 Folding@home in a VM

Folding@home has been a long running passion of mine and was probably responsible for most of my system building shenanigans. Sure, I had built gaming systems before, but Folding@home brought about my introduction to open loop liquid cooling, Linux, and set me on the trajectory of my current career path.

GPU folding has been the most efficient way to fold for years now. When Fermi (the GTX 400 series) was released, CPU bigadv units were all the rage, but over time projects adapted to GPUs. Unfortunately, bigadv units are no longer available, and now graphics cards are king.

Here's a quick price-to-performance comparison after a bit of googling:
Threadripper 3990X - $3449 - 1.1M PPD
RTX 2080 Ti - $1200 - 2M PPD

Of course, the title of the blog wouldn't have us comparing top-of-the-line hardware purchased for the express purpose of folding, but rather what we can do on a limited budget. Today, I'll be covering a particular graphics card that has been flooding the eBay market thanks to the rising difficulty of cryptocurrency mining. At ~$60 used on eBay, let's see what kind of performance we can expect!

Step 0: Requirements

    - Find a ZOTAC P106-90 on eBay. As of this writing, there are several stores selling them. I opted for a card that costs ~$60 and ships domestically; other sellers have them for even less, but with the state of tariffs and the risk involved with overseas shipping, I opted for the more expensive option.

    - Make sure you have an available PCI-E slot; most of the computers in my homelab do. If you only have an x8 or x4 slot available, you will need an adapter. Don't worry about using an x4 slot - the performance impact will likely be minimal.

    - Ensure that your power supply can handle the extra power draw of the GPU and that there is an available 6-pin PCI-E connector. Some servers don't have one, but can work with either a SATA to PCI-E power adapter or a Molex (MATE-N-LOK) 4-pin adapter.

Step 1: PCI Passthrough

Once you've installed the graphics card, we'll need to enable PCI passthrough on the host in question.

From vCenter, navigate to Host > Configure > Hardware > PCI Devices > Configure Passthrough


This should provide a list of PCI devices that can be configured - scroll down until you see the device in question, then place a check mark next to it.

The host will have to be rebooted for the change to take effect.

Step 2: Create the VM

Pretty straightforward: follow the New Virtual Machine wizard and create a new VM. On step 7 (hardware customization), you'll want to add a new PCI device and select the GPU in question.


  • First caveat: some folks have indicated that they have had issues getting the card to pass through to a VM after installing the driver. The workarounds are to either install a supported bare-metal OS or modify the VM configuration. If you hit an issue such as Code 10 in Windows Device Manager, or nvidia-smi isn't able to manage the device in Linux, then you'll probably need to do the following:



  1. Power off the VM
  2. Edit virtual machine settings
  3. Click on VM Options
  4. Expand advanced, and select Configuration Parameters: Edit Configuration
  5. Add a new parameter:
    1. Name: Hypervisor.CPUID.v0
    2. Value: FALSE
  6. Click OK
  7. Power on the VM
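For reference, the parameter ends up in the VM's .vmx file as a single line like this:

    Hypervisor.CPUID.v0 = "FALSE"

This hides the hypervisor from the guest, which is what allows the NVIDIA driver to initialize the passed-through card.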



If it's still having issues, test the card with a bare-metal operating system and see if it works there.
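Inside a Linux guest (or a bare-metal Linux install), a quick sanity check that the card is visible and the driver can talk to it looks like this:

    lspci | grep -i nvidia
    nvidia-smi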

Step 3: Fold!

If you plan on using Windows, download and install the FAH client here.
You can also skip most of the VM configuration by using VMware's fling; it should auto-configure for the GPU.

Results:

After two days, I average anywhere from 160-230k PPD depending on the project being worked:


I could probably get more if I switched to Linux, but the ability to RDP is useful for me. If you don't mind local console or VNC, I would implore you to use Linux instead. The OVA fling linked above is based on VMware's Photon OS; Ubuntu also works pretty well.

In conclusion, while this card won't be breaking any world records, it certainly doesn't break the bank. $60 for ~15-20% of the performance of a $3449 processor, at lower power consumption, seems like a great fit for my lab. If you have open PCI-E slots available in your servers, this could be a tremendous help in fighting COVID-19 and other diseases for not a lot of cash.
