
Alexander Bochmann Aug 14, 2018

Looks like we'll be planning for the next round of VMware updates due to #L1TF tomorrow: https://www.vmware.com/security/advisories/VMSA-2018-0020.html

"vCenter Server, ESXi, Workstation, and Fusion updates include Hypervisor-Specific Mitigations for L1 Terminal Fault - VMM. This issue may allow a malicious VM running on a given CPU core to effectively read the hypervisor’s or another VM’s privileged information that resides sequentially or concurrently in the same core’s L1 Data cache."

#infosec

VMSA-2018-0020

VMware vSphere, Workstation, and Fusion updates enable Hypervisor-Specific Mitigations for L1 Terminal Fault - VMM vulnerability.


Alexander Bochmann Aug 15, 2018

So, #L1TF is kinda the "hyperthreading is dead" one... VMware introduces a new "ESXi Side-Channel-Aware Scheduler".
Quote: "Currently, this scheduler provides the Hyper-Threading-aware mitigation by scheduling on only one Hyper-Thread of a Hyper-Thread-enabled core. As described in more detail below, careful capacity planning is required prior to enabling the ESXi Side-Channel-Aware Scheduler as it could have a performance impact for enterprise applications."

https://kb.vmware.com/s/article/55767


Alexander Bochmann Aug 22, 2018

#VMware has released a (PowerShell) tool to help assess the effects of activating the "Side-Channel-Aware Scheduler" on the hosts of an ESXi cluster: https://kb.vmware.com/s/article/56931

Not sure how useful the output is since I only ran it against our test cluster up to now, which, as I was reminded by the tool, has CPUs that don't do hyperthreading anyways 🙄

Still waiting for results on the first production cluster...


Alexander Bochmann Aug 22, 2018

According to the tool, we should mostly be fine after activating the ESXi SCA scheduler.
It generates a report for each host, listing how much time the host spent in a band of CPU utilisation, and warns if there were times of more than 70% usage. Also finds VMs that have more vCPUs than cores available on each host (minus HT).
We have two hosts with CPU usage above the threshold, and will need to juggle some resources. Don't expect any problems with that.

On to patching >50 ESXi hosts then.
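
For anyone who wants to see roughly what those two checks amount to, here's a minimal Python sketch. It is not the VMware tool (that one is PowerShell and I haven't read its internals); the band edges, function names and sample numbers are all made up for illustration. Only the 70% warning threshold and the "vCPUs vs. cores minus HT" comparison come from the report described above.

```python
from collections import Counter

# Hypothetical band edges in percent; the real tool's bands are not stated in the thread.
BANDS = [(0, 30), (30, 70), (70, 100)]
WARN_THRESHOLD = 70.0  # from the thread: warn when usage exceeded 70%


def band_label(usage):
    """Map a CPU usage percentage to the label of the band it falls in."""
    for low, high in BANDS:
        # Bands are half-open [low, high); exactly 100% counts into the top band.
        if low <= usage < high or (high == 100 and usage == 100):
            return f"{low}-{high}%"
    raise ValueError(f"usage out of range: {usage}")


def summarise_host(samples, interval_minutes):
    """Return (minutes spent per band, whether any sample exceeded the threshold)."""
    minutes = Counter()
    for usage in samples:
        minutes[band_label(usage)] += interval_minutes
    warn = any(usage > WARN_THRESHOLD for usage in samples)
    return dict(minutes), warn


def oversized_vms(vm_vcpus, physical_cores):
    """Names of VMs with more vCPUs than the host has physical cores (HT not counted)."""
    return sorted(name for name, vcpus in vm_vcpus.items() if vcpus > physical_cores)


if __name__ == "__main__":
    # Made-up samples for one host, one reading every 5 minutes.
    bands, warn = summarise_host([10, 45, 72, 95, 69.9], interval_minutes=5)
    print(bands, "WARN" if warn else "ok")
    # Made-up VM inventory on a host with 16 physical cores.
    print(oversized_vms({"db01": 24, "web01": 4, "app01": 16}, physical_cores=16))
```

A VM with exactly as many vCPUs as the host has cores is not flagged here; whether the real tool treats that case as a problem is an assumption on my part.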


The Gibson 🅅 Aug 22, 2018

@galaxis

um, not my experience...

after patching a few test environments, we were unable to vMotion due to high CPU.

even on hosts that weren't using Hyperthreading to start with...

do some testing.


Alexander Bochmann Aug 22, 2018

@thegibson Yeah, planning on that. Tool or not, there has to be some visible fallout from removing half of the CPU threads from a cluster (and ESXi seemed to treat hyperthreads mostly as full cores up to now).
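
Back-of-the-envelope version of the "half of the CPU threads" point, as a rough Python sketch. The cluster layout and the function name are invented; the only thing taken from the KB quote is that the SCA scheduler uses one hyperthread per core. It counts schedulable threads only, and since a hyperthread never delivered a full core's worth of throughput, the real performance loss should be smaller than the drop in thread count suggests.

```python
def schedulable_threads(hosts, sca_enabled):
    """Count threads the scheduler can place vCPUs on across a cluster.

    hosts: list of (physical_cores, threads_per_core) tuples, one per host.
    With the side-channel-aware scheduler enabled, only one thread per core is used.
    """
    total = 0
    for cores, threads_per_core in hosts:
        total += cores * (1 if sca_enabled else threads_per_core)
    return total


if __name__ == "__main__":
    # Hypothetical cluster: four 16-core hosts with HT, one 8-core host without.
    cluster = [(16, 2)] * 4 + [(8, 1)]
    before = schedulable_threads(cluster, sca_enabled=False)
    after = schedulable_threads(cluster, sca_enabled=True)
    print(f"{before} threads before, {after} after")
```

Hosts without hyperthreading contribute the same count either way in this model, so by itself it does not explain a hit on non-HT hosts.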


The Gibson 🅅

@galaxis It is a HUGE hit on a well-used host.

You know your stuff.. proceed cautiously.