The perfect server environment

In this blog post I list the essential features a server environment should have to be considered almost perfect:

  • 100% automated image build and deployment to a store (Azure Compute Gallery, VMware Content Library, etc.).
  • 100% automated server build mechanism, so no human is involved in server delivery.
  • 100% automated server destruction mechanism.
  • Servers are shut down when not in use to save energy; hosts can then be shut down as well, and in the cloud this saves cost.
  • Server capacity is well used: right-sizing, automated size reviews, storage sizing, storage tiers (archive, cold, hot, performance), and scheduled tier reviews with auto-adjustment.
  • Predictable stability: if your servers are all built the same way, with the same tools and the same versions, you are in control of the environment where you make modifications, and you only have to take the software exceptions into account.
  • All server functions/roles are deployed by automation, so you can rebuild a server if needed and migrate to a new OS with a high level of confidence.
  • Recipes for auto-adaptability that cover most of the scenarios your enterprise can have, e.g. automatic storage provisioning in a controlled environment if a batch needs it, or extra CPU at the end of the month for some compliance reports.
  • All network flows (in/out) are controlled and well known, and they are validated from the server’s point of view.
  • Server utilization is controlled and well known, e.g. accounts that authenticate, protocol usage, number of requests per account, per protocol and per port, CPU usage, memory usage, network usage.
  • Servers are 100% compliant with the business standards. All exceptions are documented in the script that does the validation; this script is maintained by automation (server destruction, function/role deployment) and runs the validation every week.
  • All server functions/roles are tested and monitored by automation, and transactions are validated.
  • All access for the teams that support the functions/roles is delegated remotely, with no interactive logon, so deployments, support logs, and software/infrastructure restarts are delegated directly or through a pipeline.
  • The environment is auto-documented: server names, network flows, server specs, account access, etc.
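The compliance bullet above (a weekly validation script that carries its own documented exceptions) can be sketched in a few lines. This is only an illustration in Python: the standard, the server names, the settings and the exception reason are all hypothetical, and a real script would collect the settings remotely instead of taking them as a dictionary.

```python
# Hypothetical business standard: setting name -> required value.
STANDARD = {
    "os_version": "2019",
    "edr_agent": "installed",
    "interactive_logon": "disabled",
}

# Hypothetical documented exceptions: (server, setting) -> reason.
# Automation would add or remove entries when servers are built or destroyed.
EXCEPTIONS = {
    ("sql01", "os_version"): "vendor application not yet certified on 2019",
}


def validate(server, settings):
    """Return the undocumented deviations from STANDARD for one server."""
    findings = []
    for name, required in STANDARD.items():
        actual = settings.get(name)
        if actual == required:
            continue  # compliant
        if (server, name) in EXCEPTIONS:
            continue  # deviation is documented, so it is accepted
        findings.append(f"{server}: {name} is {actual!r}, expected {required!r}")
    return findings
```

An empty list means the server is compliant or every deviation is documented; anything else is a finding to fix or to document as a new exception.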

In that list I haven’t talked about backup, EDR, SIEM, patches, tooling, etc., only the server functions. We should take server management to a level where we almost never have to log on to a server.
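As an illustration of the automated size review mentioned in the list, here is a minimal Python sketch. The function name, the 20% and 80% thresholds and the halve/double rule are assumptions chosen for the example, not a recommendation; a real review would also look at memory, storage and longer history.

```python
def recommend_vcpus(current_vcpus, cpu_samples_percent, low=20.0, high=80.0):
    """Suggest a vCPU count from the peak of recent CPU usage samples.

    The thresholds and the halve/double rule are illustrative assumptions.
    """
    if not cpu_samples_percent:
        return current_vcpus  # no data, change nothing
    peak = max(cpu_samples_percent)
    if peak < low and current_vcpus > 1:
        return max(1, current_vcpus // 2)  # over-provisioned: shrink
    if peak > high:
        return current_vcpus * 2  # under-provisioned: grow
    return current_vcpus  # size looks right
```

Run on a schedule against the collected metrics, a function like this produces the list of servers to resize, which the build automation can then apply.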

Hot-add CPU issue with Windows Server 2016/2019 on VMware ESXi => BSOD

There’s a known bug where hot-adding CPUs to a virtual machine running Windows Server 2016 or Windows Server 2019 results in a BSOD. Updating to the latest VMware Tools or VM hardware version doesn’t fix the issue, and neither does rolling back those items. Microsoft is working on a fix that should be released this month.

vmStatsProvider event 256 – 258

If you use the VMware platform and often look at your Application log in Windows, you are probably aware of the spam that vmStatsProvider generates with event IDs 256 and 258. There have been a lot of forum threads on this issue over many years. Since I got angry about this spam in the Event Viewer, which I try to keep clean, I worked to find the cause. I used my favorite Sysinternals tools (ProcMon, Process Explorer) to demonstrate that the calls made by our monitoring tools to the VM performance counters (which use the VMware DLL behind the scenes) were the source of those events. So I had a talk with a VMware developer, who kindly agreed to modify the behavior of the DLL, and the change is now included in VMware Tools 10.2.5 (March 28 2018)!!!

Many stupid solutions, like removing the performance counters from your VMware Tools installation, were published over the past years, and I’m really proud to have contributed to the Clean EventViewer Community of SysAdmins.
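To find which machines still need the update, comparing the installed VMware Tools version against 10.2.5 is enough. A small Python sketch, assuming you already have the version as a dotted string from your inventory (how you collect it is up to your tooling, and the function names are made up for the example):

```python
# First VMware Tools release that contains the vmStatsProvider change.
FIXED_IN = (10, 2, 5)


def parse_version(text):
    """Turn '10.2.5' or '10.2.5.8068406' into a tuple of the first three numbers."""
    return tuple(int(part) for part in text.strip().split(".")[:3])


def has_vmstatsprovider_fix(tools_version):
    """True when the given VMware Tools version is 10.2.5 or later."""
    return parse_version(tools_version) >= FIXED_IN
```

Comparing tuples of integers avoids the classic string-comparison trap where "10.10.0" would sort before "10.2.5".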

Enjoy!

CPU Speculation Control vulnerabilities

What better than a vulnerability that affects almost everything to start the new year! We’ll be working hard this year, since this vulnerability requires patching the hardware, the OS (clients and servers), mobile devices, hypervisors, cloud machines, etc. I don’t know about other cloud providers, but I can say that Microsoft was pretty fast at patching the Azure hardware and hypervisor. First we need to look at how big the performance impact is, especially on database and hypervisor servers.

There’s also a PowerShell module to test your system! Just run an elevated PowerShell session and install the module:

Install-Module SpeculationControl

Get-SpeculationControlSettings

After the Microsoft patch and a reboot I got

The Microsoft update is already available on Windows Update (ahead of Patch Tuesday)

http://www.catalog.update.microsoft.com/Search.aspx?q=2018-01

To get "Windows OS support for branch target injection mitigation is enabled: True", you need to update the BIOS with the latest patch

After the BIOS update from the tablet manufacturer

On an older machine I got

Now it will be interesting to see how fast my motherboard manufacturer will be, since it’s a 2012 motherboard.

So suit up and get to work!