
CPU Usage Estimator

Performance

Enter your request rate, per-request CPU time, and core count to instantly estimate CPU utilization. Useful for capacity planning, auto-scaling thresholds, and load-test analysis.

Last updated: April 2026

This calculator is designed for real-world usage based on typical engineering scenarios and publicly available documentation.

A CPU usage estimator tells you what fraction of your server's processing capacity a given workload will consume, before you deploy it or a traffic spike hits. The core idea is simple: multiply how many requests arrive each second by how long each one burns the CPU, then divide by the total CPU time your server can deliver per second across all cores. Backend engineers use this calculation when sizing EC2 instances, setting Kubernetes resource requests, or deciding whether a new endpoint can share a pod with an existing service. Getting it wrong in either direction costs money (over-provisioned) or causes outages (under-provisioned).

The CPU time per request value comes from profiling or APM data — look for the mean CPU duration in tools like Datadog, Pyroscope, or AWS CloudWatch Contributor Insights. If you only have wall-clock latency, subtract I/O wait time (network, DB queries) to isolate true CPU burn. This estimator works for any language and runtime: Node.js, Python, Go, JVM, or native. For I/O-bound services where threads block, pair this with the <a href="/calculators/concurrency-calculator">Concurrency Calculator</a> to understand thread-pool headroom alongside CPU headroom.

How the CPU Usage Estimator Calculates Utilization

[Diagram: CPU usage — how it works]

1. Measure or estimate the average CPU time your service spends per request in milliseconds — use profiling data, APM traces, or a load test.
2. Know your request arrival rate in requests per second (RPS) — from logs, an APM dashboard, or projected growth.
3. Enter the number of logical CPU cores available to the process (vCPUs on cloud instances count as cores).
4. The calculator multiplies RPS × CPU ms to get total CPU work per second, then divides by cores × 1000 ms to get the fraction of capacity consumed.
5. The result is the estimated CPU utilization as a percentage. Estimates above 70–80% warrant adding cores or optimising the hot path.

Formula

CPU Usage (%) = (RPS × CPU ms per request) / (Cores × 1000) × 100

RPS              — requests arriving per second
CPU ms/request   — average CPU time consumed per request (milliseconds)
Cores            — number of logical CPU cores available
1000             — milliseconds per second (normalisation constant)
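The formula can be sketched directly in Python (a minimal sketch; the function and variable names are illustrative, not from any library):

```python
def cpu_usage_percent(rps: float, cpu_ms_per_request: float, cores: int) -> float:
    """Estimate CPU utilization as a percentage.

    rps                -- requests arriving per second
    cpu_ms_per_request -- average CPU time burned per request (milliseconds)
    cores              -- logical CPU cores available to the process
    """
    work_ms_per_sec = rps * cpu_ms_per_request   # total CPU work demanded per second
    capacity_ms_per_sec = cores * 1000           # CPU time deliverable per second
    return work_ms_per_sec / capacity_ms_per_sec * 100

# 100 RPS at 5 ms/request on 4 cores
print(cpu_usage_percent(100, 5, 4))  # 12.5
```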

Example CPU Usage Calculations

Example 1 — Node.js API at moderate load

RPS: 100   ×   CPU ms: 5 ms   =   500 ms/s of CPU work
Cores: 4   →   capacity = 4 × 1,000 = 4,000 ms/s
CPU Usage = 500 / 4,000 × 100 = 12.5%
— Well within safe limits; plenty of headroom for traffic spikes.

Example 2 — Python Flask service approaching saturation

RPS: 80   ×   CPU ms: 45 ms   =   3,600 ms/s of CPU work
Cores: 4   →   capacity = 4,000 ms/s
CPU Usage = 3,600 / 4,000 × 100 = 90%
— Dangerously high. Add 2 more cores (3,600 / 6,000 = 60%) or reduce CPU ms to ~35 ms to stay below 70%.
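The remediation numbers can be checked by solving the formula for cores or for the per-request CPU budget (a minimal sketch; function names are illustrative):

```python
import math

def cores_needed(rps: float, cpu_ms: float, target_pct: float) -> int:
    """Smallest whole core count that keeps estimated utilisation at or below target_pct."""
    return math.ceil(rps * cpu_ms / (target_pct / 100 * 1000))

def max_cpu_ms(rps: float, cores: int, target_pct: float) -> float:
    """Largest per-request CPU budget (ms) that stays at or below target_pct."""
    return target_pct / 100 * cores * 1000 / rps

# The Flask service above: 80 RPS at 45 ms/request on 4 cores = 90%
print(cores_needed(80, 45, 70))  # 6 cores -> 3,600 / 6,000 = 60%
print(max_cpu_ms(80, 4, 70))     # 35.0 ms per request on the existing 4 cores
```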

Example 3 — Go microservice on a 16-core instance

RPS: 5,000   ×   CPU ms: 0.8 ms   =   4,000 ms/s of CPU work
Cores: 16   →   capacity = 16,000 ms/s
CPU Usage = 4,000 / 16,000 × 100 = 25%
— Efficient. Could right-size to an 8-core instance (50% usage) and halve the compute bill.
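The right-sizing check in Example 3 generalises: walk the typical instance size ladder and pick the smallest core count that keeps estimated usage under a ceiling (a sketch; the size ladder and the 70% default are assumptions, adjust for your provider):

```python
def right_size(rps: float, cpu_ms: float, max_pct: float = 70.0,
               sizes=(2, 4, 8, 16, 32, 64)) -> int:
    """Smallest core count (from common instance sizes) keeping estimated
    CPU usage at or below max_pct. Raises if no listed size fits."""
    for cores in sizes:
        usage = rps * cpu_ms / (cores * 1000) * 100
        if usage <= max_pct:
            return cores
    raise ValueError("workload exceeds the largest listed instance size")

# The Go service above: 5,000 RPS at 0.8 ms/request
print(right_size(5000, 0.8))  # 8 cores -> 4,000 / 8,000 = 50% usage
```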


Frequently Asked Questions

What is a safe CPU usage percentage for a production server?
Keep steady-state CPU below 60–70% to leave room for traffic spikes, GC pauses, and background jobs. At 80%+ average utilisation, any burst can saturate the CPU and cause latency to spike or requests to queue. Cloud auto-scaling typically triggers at 70–75% to allow new instances to provision before saturation.
How do I measure CPU time per request for my service?
Use APM tools like Datadog, New Relic, or Pyroscope to get per-request CPU flame graphs. For a quick estimate, run a load test with a CPU profiler attached and divide total CPU time by total requests processed. Subtract I/O wait time — you want CPU burn only, not wall-clock latency, which includes network and database round trips.
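The quick-estimate approach, total CPU time divided by requests processed, can be sketched with Python's process-wide CPU clock (`handle_request` here is a placeholder for your real handler):

```python
import time

def handle_request() -> int:
    # Placeholder for a real endpoint handler; burns a little CPU.
    return sum(i * i for i in range(5_000))

def cpu_ms_per_request(n_requests: int = 200) -> float:
    """Average CPU milliseconds per request via the process CPU clock.
    time.process_time() counts CPU burn only, excluding sleep and I/O wait."""
    start = time.process_time()
    for _ in range(n_requests):
        handle_request()
    return (time.process_time() - start) / n_requests * 1000

print(f"{cpu_ms_per_request():.3f} ms CPU per request")
```

Because `process_time()` excludes blocked time, the result is the CPU-burn figure this calculator expects, not wall-clock latency.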
Does this formula work for multi-threaded or async services?
Yes for CPU-bound work. Async and non-blocking I/O reduce thread count but do not change how many CPU cycles are consumed per request. Enter the actual CPU ms burned per request from profiling data. For services where requests block threads waiting on I/O, also check the Concurrency Calculator to size your thread pool correctly alongside CPU.
How many vCPUs does a cloud instance actually give me?
On most x86 instance types at AWS, GCP, and Azure, one vCPU equals one hyperthread on a physical core — roughly half a physical core's compute. On Arm-based instances such as AWS Graviton, one vCPU is a full physical core. For CPU-intensive workloads, physical core count matters; for typical web services, vCPU count is the right number to enter here. Check your instance's advertised vCPU count in the cloud provider's instance type documentation.
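To see the logical core count a process can actually use, a quick Python check (note that inside containers, cgroup CPU limits are not reflected by either call):

```python
import os

# Logical cores visible to the OS (vCPUs on cloud instances).
logical = os.cpu_count()

# On Linux, the process may be pinned to a subset (taskset, cpusets);
# sched_getaffinity reports what this process can actually run on.
try:
    usable = len(os.sched_getaffinity(0))
except AttributeError:  # not available on macOS/Windows
    usable = logical

print(logical, usable)
```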
What if my CPU usage estimate exceeds 100%?
It means the workload exceeds your available CPU capacity. Requests will queue, latency will increase, and eventually timeouts will cascade into errors. Fix it by adding CPU cores (scale out or up), reducing CPU time per request via code optimisation or caching, or shedding load with rate limiting. Use the QPS Calculator to find the maximum safe request rate for your current core count.
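Rearranging the formula for RPS gives the maximum request rate your current cores can absorb at a chosen utilisation ceiling, which is the QPS-style calculation mentioned above (a sketch; the function name is illustrative):

```python
def max_safe_rps(cores: int, cpu_ms: float, target_pct: float = 70.0) -> float:
    """Highest request rate that keeps estimated CPU usage at target_pct."""
    return target_pct / 100 * cores * 1000 / cpu_ms

# The 4-core, 45 ms/request service from Example 2:
print(max_safe_rps(4, 45))  # ~62.2 RPS before crossing 70%
```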