
Concurrency Calculator


Enter your requests per second and average response time to instantly calculate how many concurrent connections or workers your service needs. Based on Little's Law.

Last updated: April 2026

This calculator is designed for real-world usage based on typical engineering scenarios and publicly available documentation.

The concurrency calculator uses Little's Law — one of the most important results in queuing theory — to determine how many simultaneous connections, threads, or workers a system needs to sustain a target throughput. If you're sizing a connection pool, configuring a thread pool, or setting max_workers on a worker fleet, this is the number you need. The formula is simple: Concurrency = RPS × Average Latency (in seconds). At 100 requests per second with a 200 ms average response time, your service must hold 20 requests in flight at any moment.

Underestimate this and requests queue up; your p99 latency spikes before your CPU does. Engineers use this calculator when provisioning database connection pools (pgBouncer, HikariCP), sizing HTTP client pools (httpx, axios), setting Gunicorn or Uvicorn worker counts, or configuring concurrency limits in async job queues like Celery or BullMQ.

The safety factor field adds headroom above the theoretical minimum — a 1.5× multiplier is common for production services to absorb traffic spikes and GC pauses without exhausting the pool.
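The worked figure above (100 RPS at 200 ms) can be checked in a couple of lines of plain Python:

```python
# Quick check of the figure above: 100 req/s at 200 ms average latency.
rps = 100
avg_latency_s = 0.200

# Little's Law: items in the system = arrival rate × time in system.
in_flight = rps * avg_latency_s
print(in_flight)  # 20 requests in flight at any moment
```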

How to Calculate API Concurrency with Little's Law

Concurrency — how it works diagram

1. Measure or estimate your peak requests per second (RPS). Use your APM, load balancer logs, or a target from a load test.
2. Measure average response time in milliseconds. Use the p50 latency from your observability stack — not p99, which inflates the pool size unnecessarily.
3. Plug both values into the calculator. The base concurrency is RPS × (latency ÷ 1000).
4. Apply a safety factor (1.25–2.0×) to absorb burst traffic, GC pauses, and downstream slowdowns.
5. Round up to the next integer — that's your minimum pool or worker count. Set your connection pool max to at least this value.

Formula

Concurrency = RPS × (Avg Latency ms ÷ 1000)
Safe Concurrency = ⌈Concurrency × Safety Factor⌉

RPS            — requests per second at peak load
Avg Latency ms — average response time in milliseconds (use p50)
Safety Factor  — headroom multiplier, typically 1.25–2.0
⌈ ⌉            — ceiling (round up to nearest integer)
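The two formulas above translate directly into a small helper. This is a sketch; `safe_concurrency` is our own name, not part of any library:

```python
import math

def safe_concurrency(rps: float, avg_latency_ms: float,
                     safety_factor: float = 1.5) -> int:
    """Concurrency = RPS × (latency ÷ 1000), with headroom, rounded up."""
    base = rps * (avg_latency_ms / 1000)
    return math.ceil(base * safety_factor)

print(safe_concurrency(200, 150, 1.5))  # → 45 (Example 1 below)
```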

Example Concurrency Calculations

Example 1 — REST API with database queries

RPS: 200   Avg latency: 150 ms   Safety: 1.5×

Base concurrency = 200 × (150 ÷ 1000) = 200 × 0.15 = 30
Safe concurrency = ⌈30 × 1.5⌉ = ⌈45⌉ = 45

Set max_connections = 45 in your connection pool config.
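One way to apply this result in application code, sketched with SQLAlchemy's built-in pool as an illustration; the connection URL and timeout are placeholders, and the 45 comes from the calculation above:

```python
from sqlalchemy import create_engine

# Example 1's result: safe concurrency = 45.
# pool_size is the number of persistent connections; max_overflow adds
# temporary burst capacity on top (set to 0 to enforce a hard cap).
engine = create_engine(
    "postgresql+psycopg2://user:pass@db:5432/app",  # placeholder URL
    pool_size=45,
    max_overflow=0,
    pool_timeout=5,  # seconds to wait for a free connection before erroring
)
```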

Example 2 — High-throughput event ingestion service

RPS: 5,000   Avg latency: 20 ms   Safety: 2.0×

Base concurrency = 5000 × (20 ÷ 1000) = 5000 × 0.02 = 100
Safe concurrency = ⌈100 × 2.0⌉ = 200

Deploy 200 async workers or set semaphore limit to 200.

Example 3 — Background job worker fleet (slow LLM calls)

RPS: 10   Avg latency: 8,000 ms (8 s LLM response)   Safety: 1.25×

Base concurrency = 10 × (8000 ÷ 1000) = 10 × 8 = 80
Safe concurrency = ⌈80 × 1.25⌉ = 100

Run 100 worker processes to keep the queue drained.
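A thread pool gives the same effect for blocking I/O like HTTP calls to an LLM API. A sketch under stated assumptions: `call_llm` is a stand-in, and the sleep is shortened from the real ~8 s for demonstration:

```python
from concurrent.futures import ThreadPoolExecutor
import time

WORKERS = 100  # Example 3: ceil(10 × 8 × 1.25)

def call_llm(prompt: str) -> str:
    time.sleep(0.01)  # stand-in for an ~8 s blocking LLM call
    return prompt.upper()

# With 100 workers each holding a job for ~8 s, the fleet sustains
# 100 ÷ 8 = 12.5 jobs/s, comfortably above the 10 jobs/s arrival rate.
with ThreadPoolExecutor(max_workers=WORKERS) as pool:
    outputs = list(pool.map(call_llm, [f"job-{i}" for i in range(200)]))

print(len(outputs))
```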


Frequently Asked Questions

What is Little's Law and how does it apply to API concurrency?
Little's Law states that the average number of items in a system equals the arrival rate multiplied by the average time each item spends in the system. For APIs: Concurrency = RPS × Avg Response Time (seconds). It applies to any stable system — connection pools, thread pools, message queues. If either RPS or latency rises, required concurrency rises proportionally.
How do I find my average response time for the calculation?
Use the p50 (median) latency from your observability tool — Datadog, Grafana, New Relic, or your load balancer access logs. Avoid using p99 or max latency, which inflate the result and lead to oversized, wasteful pools. If you don't have live data yet, run a load test with a tool like k6 or Locust and read the p50 from the output.
What safety factor should I use?
A 1.5× factor is a safe default for most production services. Use 1.25× for internal services with predictable traffic. Use 2.0× for consumer-facing endpoints, services with GC-heavy runtimes (JVM, Go), or anything calling a slow external dependency like an LLM API. Never use 1.0× — your theoretical minimum leaves zero headroom for any latency spike.
How many connections should I set in my database connection pool?
Calculate concurrency using this tool, then use that as your pool max. For PostgreSQL, another common heuristic is pool size = (core count × 2) + effective_spindle_count, but Little's Law gives you a demand-driven number. Set your application pool to the calculated safe concurrency, and ensure it stays under 80% of the DB's max_connections. Use the QPS Calculator to verify your queries-per-second headroom.
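Combining the demand-driven number with the 80% cap mentioned above can be sketched as a small helper; `db_pool_size` is our own name and the inputs are examples:

```python
import math

def db_pool_size(rps: float, avg_latency_ms: float,
                 safety: float, db_max_connections: int) -> int:
    """Demand-driven pool size, capped at 80% of the DB's connection limit."""
    demand = math.ceil(rps * (avg_latency_ms / 1000) * safety)
    cap = math.floor(db_max_connections * 0.8)
    return min(demand, cap)

print(db_pool_size(200, 150, 1.5, 100))  # demand 45, cap 80 → 45
```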
What happens if my concurrency is too low or too high?
Too low: requests queue waiting for a free connection, p99 latency spikes, and clients time out even though your service is healthy. Too high: you exhaust database connection limits, waste memory on idle threads, and increase context-switching overhead. Size it right with this calculator, then monitor pool wait time in production to confirm. See the API Rate Limit Calculator to pair concurrency limits with rate limit planning.