Runtime Zero

vCenter 8 Performance Tuning: The Settings That Actually Matter

vCenter's default configuration is conservative. These tuning adjustments — from database maintenance to DRS sensitivity and alarm storm suppression — make a measurable difference in large environments.

CS

vCenter Server Appliance (VCSA) runs on embedded Postgres and a collection of Java services. Out of the box, it's tuned for a wide range of deployment sizes. In large environments (500+ VMs, 20+ hosts), several defaults become bottlenecks. Here's what to adjust.

Database Maintenance: Don't Skip This

The embedded Postgres database accumulates stale task and event records. With default retention settings, the database in a large environment can grow to multiple gigabytes, which slows the vCenter UI.

Configure database retention via the VCSA appliance management UI (port 5480) or via API:

# Set task and event retention to 30 days
curl -X PUT "https://vcenter/api/appliance/vcenter-settings/v1/config/components/database" \
  -H "vmware-api-session-id: $SESSION" \
  -H "Content-Type: application/json" \
  -d '{"task_cleanup_enabled": true, "task_retention_period": 30,
       "event_cleanup_enabled": true, "event_retention_period": 30}'

Run this on a schedule via a cron job on the VCSA. The first cleanup after a long period can take 30-60 minutes — schedule it during a maintenance window.
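
If you script the call, it helps to validate the retention value before anything is sent. The following is a minimal sketch, assuming the same endpoint and field names as the curl example above (neither is verified here against the API reference). `VCENTER` and `SESSION` are hypothetical environment variables, and `retention_payload` and `apply_retention` are illustrative helper names:

```shell
# Build the JSON body for the retention call.
# Field names are copied from the curl example above (assumption, unverified).
retention_payload() {
  days="$1"
  # Accept only positive integers without leading zeros
  case "$days" in
    ''|0*|*[!0-9]*)
      echo "retention days must be a positive integer" >&2
      return 1
      ;;
  esac
  printf '{"task_cleanup_enabled": true, "task_retention_period": %s, "event_cleanup_enabled": true, "event_retention_period": %s}\n' "$days" "$days"
}

# Send the setting. VCENTER and SESSION are hypothetical environment variables.
apply_retention() {
  body="$(retention_payload "$1")" || return 1
  curl -sS -X PUT "https://${VCENTER}/api/appliance/vcenter-settings/v1/config/components/database" \
    -H "vmware-api-session-id: ${SESSION}" \
    -H "Content-Type: application/json" \
    -d "$body"
}
```

Calling `apply_retention 30` would send the same body as the example above, while `apply_retention abc` fails before any request is made.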

DRS Sensitivity: Understand What "Aggressive" Means

The DRS migration threshold runs from 1 (Conservative) to 5 (Aggressive). The default is 3. The level determines the minimum imbalance that triggers a migration recommendation:

Level          Threshold                   Moves for...
1              Very high imbalance only    Severe overcommit
3 (default)    Moderate imbalance          Most environments
5              Any improvement             Constant migrations

For environments with NUMA-sensitive workloads (databases, latency-sensitive apps), keep DRS at 3 or even 2. Frequent migrations flush CPU caches and break NUMA locality, so the load-balancing benefit may be outweighed by workload disruption.

For general compute workloads, 4 is reasonable.
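
If you manage many clusters, the guidance above can be written down as a small lookup so the choice is consistent across them. This is only a sketch: the workload class names are hypothetical labels, and the function returns the threshold as shown in the vSphere Client rather than applying it to any cluster:

```shell
# Map a workload class to a DRS migration threshold (UI scale, 1 to 5).
# Class names are hypothetical; the levels follow the guidance above.
recommended_drs_level() {
  case "$1" in
    numa-sensitive|latency-sensitive)
      echo 3   # drop to 2 if migrations still disrupt the workload
      ;;
    general)
      echo 4
      ;;
    *)
      echo "unknown workload class: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `recommended_drs_level general` prints `4`.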

Alarm Storm Suppression

In large environments, a single host disconnection can trigger hundreds of alarms as every VM on that host transitions to invalid states. vCenter's default alarm configuration has no dampening.

Configure alarm actions with a repeat interval:

Alarm → Edit → Actions → Reset to Green
  Repeat: Every 24 hours

More importantly, use alarm acknowledgment workflows. An acknowledged alarm stops generating repeat notifications, which prevents on-call engineers from being paged dozens of times for a single event. Integrate vCenter alarms with your alerting platform (PagerDuty, OpsGenie) via the vCenter alarm webhook, not via email.
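
On the receiving side, the same dampening idea can be sketched as a filter that forwards only the first alarm per host inside a suppression window. The input format (`<epoch_seconds> <host> <alarm>`, one event per line) and the function name are assumptions for illustration, not something vCenter emits in this form:

```shell
# Forward only the first alarm per host within each suppression window.
# Input lines: "<epoch_seconds> <host> <alarm>" (hypothetical format).
# Usage: suppress_alarm_storm [window_seconds] < events.txt
suppress_alarm_storm() {
  window="${1:-300}"
  awk -v window="$window" '
    {
      ts = $1
      host = $2
      # Forward if this host has not alerted yet or its window has elapsed
      if (!(host in last) || ts - last[host] >= window) {
        print
        last[host] = ts
      }
    }'
}
```

With a 300-second window, a host that raises dozens of VM alarms within a few seconds produces a single forwarded line, and the next one only appears once the window has passed.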

vCenter Log Level

The default info log level generates significant disk I/O on busy clusters. Switch non-critical services to warning:

# Via vmon-cli on the VCSA shell
/usr/lib/vmware-vmon/vmon-cli --get-log-levels

Reduce vpxd log verbosity to warning unless you're actively troubleshooting — the vpxd.log is the highest-volume log on most vCenter instances.
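
Before changing levels, it is worth confirming which services actually produce the most log data on your appliance. A minimal sketch using standard `du`, `sort`, and `head`, assuming the logs live under `/var/log/vmware` (the usual VCSA location); `largest_logs` is an illustrative helper name:

```shell
# Print the largest entries under a log directory, biggest first.
# Usage: largest_logs [directory] [count]
largest_logs() {
  dir="${1:-/var/log/vmware}"
  count="${2:-5}"
  # du -sk reports kilobytes per entry; sort numerically, largest first
  du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Running `largest_logs /var/log/vmware 5` on the VCSA shell lists the five largest log directories with their sizes in kilobytes.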

Linked Mode and Enhanced Linked Mode

In multi-vCenter environments (ELM), the PSC replication traffic between vCenters adds up. Keep ELM site connections on a low-latency link (under 10 ms). High latency between ELM sites degrades vCenter UI responsiveness globally, not just for cross-site operations.

If you're on VCF, SDDC Manager handles the ELM topology automatically. If you're running standalone vCenters in ELM, document the replication topology and include it in network change reviews: a routing change that adds 20 ms of latency between PSC nodes will generate helpdesk tickets within hours.
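
A simple way to keep an eye on that latency budget is to measure round-trip time between nodes and compare it with the 10 ms target. The sketch below parses the summary line of standard `ping` output; the helper names and the peer hostname in the usage note are placeholders, and ICMP must be allowed between the nodes:

```shell
# Extract the average RTT in ms from ping's "min/avg/max" summary line (stdin).
avg_rtt_ms() {
  awk -F'=' '/min\/avg\/max/ {
    split($2, parts, "/")
    gsub(/ /, "", parts[2])
    print parts[2]
  }'
}

# Exit 0 when the measured RTT is below the budget (default 10 ms).
within_latency_budget() {
  measured="$1"
  budget="${2:-10}"
  awk -v rtt="$measured" -v budget="$budget" 'BEGIN { exit !(rtt + 0 < budget + 0) }'
}

# Example wiring: ping a peer five times and report against the 10 ms target.
check_elm_peer() {
  peer_rtt="$(ping -c 5 "$1" | avg_rtt_ms)"
  if [ -z "$peer_rtt" ]; then
    echo "WARN: no RTT measured for $1"
  elif within_latency_budget "$peer_rtt" 10; then
    echo "OK: $1 avg RTT ${peer_rtt} ms"
  else
    echo "WARN: $1 avg RTT ${peer_rtt} ms exceeds 10 ms"
  fi
}
```

Something like `check_elm_peer vcenter-b.example.com` could run before and after a network change to catch a latency regression early.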