Monday, June 22, 2020

ZOTAC P106-90 Folding@home in a VM

Folding@home has been a long-running passion of mine and was probably responsible for most of my system building shenanigans. Sure, I had built gaming systems before, but Folding@home introduced me to open-loop liquid cooling and Linux, and set me on the trajectory of my current career path.

GPU folding has been the most efficient way to fold for years now. When Fermi was released (GTX 400 series), CPU bigadv units were all the rage, but over time projects adapted to GPUs. Unfortunately, bigadv units are no longer available, and now graphics cards are king.

Here's a quick price to performance comparison after a bit of googling:
Threadripper 3990X - $3449 - 1.1M PPD
RTX 2080 Ti - $1200 - 2M PPD
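
To put those two lines on the same scale, here's a rough sketch that divides points per day by purchase price. The numbers are just the ones quoted above; the dictionary layout and the `ppd_per_dollar` helper are my own naming for illustration, and it ignores power draw and the rest of the system.

```python
# Prices (USD) and points-per-day figures quoted in the comparison above.
hardware = {
    "Threadripper 3990X": {"price_usd": 3449, "ppd": 1_100_000},
    "RTX 2080 Ti": {"price_usd": 1200, "ppd": 2_000_000},
}


def ppd_per_dollar(price_usd, ppd):
    """Return points per day earned for each dollar of purchase price."""
    return ppd / price_usd


for name, spec in hardware.items():
    print(f"{name}: {ppd_per_dollar(**spec):,.0f} PPD per dollar")
```

By this measure the GPU comes out at roughly five times the points per dollar of the CPU.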

Of course, the title of the blog wouldn't have us comparing top of the line hardware for the express purpose of folding, but rather what we can do with a limited budget. Today, I'll be covering a particular graphics card that has been flooding the eBay market, thanks to rising cryptocurrency mining difficulty. At $60 used on eBay, let's see what kind of performance we can expect!

Step 0: Requirements

Find a ZOTAC P106-90 on eBay - as of this writing, there are several stores selling them. I opted for a card that costs ~$60 and ships domestically. Other sellers list them for even less, but given the state of tariffs and the risk involved with overseas shipping, I went with the more expensive option.
Most of the computers in my homelab have available PCI-E slots. If you only have an x8 or x4 slot available, you will need an adapter. Don't worry about using this card in an x4 slot - the performance impact will likely be minimal.
Ensure that your power supply can handle the extra power draw of the GPU and that there is an available 6-pin PCI-E connector. Some servers don't have one, but can work with either a SATA to PCI-E power adapter or a Molex (MATE-N-LOK) 4-pin adapter.

Step 1: PCI Passthrough

Once you've installed the graphics card, we'll need to enable PCI passthrough on the host in question.

From vCenter, navigate to Host > Configure > Hardware > PCI Devices > Configure Passthrough


This should provide a list of PCI devices that can be configured - scroll down until you see the device in question, then place a check mark next to it.

The host will have to be rebooted for the change to take effect.

Step 2: Create the VM

Pretty straightforward; follow the New Virtual Machine wizard and create a new VM. On step 7 (hardware customization), you'll want to add a new device, and select the GPU in question.


  • First caveat: Some folks have reported issues getting the card to pass through to a VM after installing the driver. The workarounds are to either install a supported bare-metal OS, or modify the VM configuration. If you hit an issue such as Code 10 in Windows Device Manager, or nvidia-smi isn't able to manage the device in Linux, then you'll probably need to do the following:



  1. Power off the VM
  2. Edit virtual machine settings
  3. Click on VM Options
  4. Expand advanced, and select Configuration Parameters: Edit Configuration
  5. Add a new parameter:
    1. Name: Hypervisor.CPUID.v0
    2. Value: FALSE
  6. Click OK
  7. Power on the VM
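
If you'd rather script this than click through the UI, the same parameter can, as far as I know, be written into the VM's .vmx file while the VM is powered off. Here's a minimal sketch of that idea, working on the file contents as a string. The `set_vmx_param` helper and the sample contents are made up for illustration, and matching existing keys case-insensitively is my assumption to avoid adding a duplicate entry.

```python
def set_vmx_param(vmx_text, key, value):
    """Return vmx_text with `key = "value"` set, replacing any existing entry."""
    new_line = f'{key} = "{value}"'
    lines = vmx_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        # .vmx entries look like: name = "value"
        name = line.split("=", 1)[0].strip()
        # Assumption: treat key names as case-insensitive to avoid duplicates.
        if "=" in line and name.lower() == key.lower():
            lines[i] = new_line
            found = True
    if not found:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical .vmx contents, only here to show the result.
sample = 'displayName = "fah-vm"\nmemSize = "4096"\n'
print(set_vmx_param(sample, "Hypervisor.CPUID.v0", "FALSE"))
```

Back up the .vmx before touching it, and write the result back only with the VM powered off.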



If it's still having issues, test the card with a bare-metal operating system and see if it works there.
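
For a Linux guest, a quick way to confirm the driver can actually see the card is to ask nvidia-smi for the GPU names. This is a small sketch under the assumption that the NVIDIA driver (and therefore nvidia-smi) is installed in the guest; the `visible_gpus` and `parse_gpu_names` helpers are names I picked for illustration. An empty list means the tool is missing or could not talk to the device.

```python
import shutil
import subprocess


def parse_gpu_names(csv_output):
    """Turn `nvidia-smi --query-gpu=name --format=csv,noheader` output into a list."""
    return [line.strip() for line in csv_output.splitlines() if line.strip()]


def visible_gpus():
    """Return GPU names reported by nvidia-smi, or [] if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    return parse_gpu_names(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to the driver:", gpus or "none")
```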

Step 3: Fold!

If you plan on using Windows, download and install the FAH client here.
You can also skip most of the VM configuration by using VMware's fling. It should auto configure for the GPU.

Results:

After two days, I average anywhere from 160-230k PPD depending on the project being worked:


I could probably get more if I switched to Linux, but the ability to RDP is useful for me. If you don't mind local console or VNC, I'd encourage you to use Linux instead. The OVA fling linked above is based on VMware's Photon OS, and Ubuntu also works pretty well.

In conclusion, while this card won't be breaking any world records, it certainly doesn't break the bank. $60 for ~15-20% of the performance of a $3449 processor and lower power consumption seems like a great fit for my lab. If you have open PCI-E slots available in your servers, this could be a tremendous help in fighting COVID-19 along with other diseases for not a lot of cash.
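
For anyone who wants to check the math behind that conclusion, here's a quick sketch using only figures from this post: the $60 price, the 160-230k PPD range I observed, and the 1.1M PPD quoted for the 3990X. The constant names and the `share_of` helper are just my own labels.

```python
# Figures quoted in this post.
P106_PRICE_USD = 60
P106_PPD_RANGE = (160_000, 230_000)
THREADRIPPER_PPD = 1_100_000


def share_of(reference_ppd, ppd):
    """Return ppd as a percentage of reference_ppd."""
    return 100 * ppd / reference_ppd


low, high = P106_PPD_RANGE
print(
    f"Share of 3990X output: {share_of(THREADRIPPER_PPD, low):.1f}% "
    f"to {share_of(THREADRIPPER_PPD, high):.1f}%"
)
print(f"PPD per dollar: {low / P106_PRICE_USD:,.0f} to {high / P106_PRICE_USD:,.0f}")
```

That works out to about 14.5% to 20.9% of the 3990X figure, or roughly 2,667 to 3,833 PPD per dollar of card.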

Evacuate ESXi host without DRS

One of the biggest draws to vSphere Enterprise Plus licensing is the Distributed Resource Scheduler feature. DRS allows for recommendations ...