Performance Testing Basics: Load, Stress, and Spike Testing Explained

A database query that runs in 30ms with 100 rows can time out with 1 million, and functional tests will never catch it because they run one scenario at a time. Performance testing simulates the concurrent traffic that reveals these failures: load testing for normal traffic, stress testing to find the breaking point, spike testing for sudden surges, and soak testing to expose memory leaks that only appear after hours. This guide covers k6, a load testing tool whose tests are written in JavaScript. It runs from the CLI, exits with a non-zero code when thresholds aren't met, and integrates directly into GitHub Actions.

Why Performance Testing Matters

A login form that takes 200ms is fine. One that takes 5 seconds loses users. One that takes 30 seconds under load brings down the server.

Performance bugs are often among the hardest to fix, because the remedy is frequently an architectural change rather than a local code fix. Finding them early matters.

Common performance failures:

Database queries that work fine with 100 rows but time out with 1 million
API endpoints that handle 10 concurrent users but fail at 100
Memory leaks that accumulate over hours of usage
Third-party integrations that become bottlenecks under load

Types of Performance Tests

Load Testing

Simulates expected production load to verify the system performs within acceptable limits.

Question: Does the system handle normal traffic?

Example: Our app has 500 concurrent users during business hours. Run 500 virtual users for 30 minutes and measure response times.

Acceptable results:

95th percentile response time < 1 second
Error rate < 1%
No memory leaks
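
As a rough sketch, the example above could be written as a k6 options object (k6 is introduced later in this guide). The threshold values simply mirror the acceptance criteria listed above, and the variable name `loadTestOptions` is an assumption made for illustration. Memory leaks cannot be asserted from k6 itself and would have to be watched on the server side.

```javascript
// Options for the 500-user, 30-minute load test example.
const loadTestOptions = {
  vus: 500,         // expected concurrent users during business hours
  duration: '30m',  // hold the load for 30 minutes
  thresholds: {
    http_req_duration: ['p(95)<1000'], // 95th percentile under 1 second
    http_req_failed: ['rate<0.01'],    // error rate under 1%
  },
};

// In a k6 script, expose the object with:
//   export const options = loadTestOptions;
console.log(JSON.stringify(loadTestOptions, null, 2));
```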

Stress Testing

Push the system beyond its limits to find the breaking point.

Question: How does the system behave when overloaded? Does it fail gracefully?

Example: Start with 100 users, increase by 100 every minute until the system breaks. Observe when errors start, how the system fails, and whether it recovers.
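
One way to sketch this step pattern is a small helper that generates k6-style stages. The helper name `buildStressStages` and the cap of 1,000 users are assumptions for illustration; in a real run you would stop once the system breaks. Note that k6 ramps gradually toward each stage target, so every step is a one-minute climb rather than an instant jump.

```javascript
// Build k6-style stages that raise the target by `step` users every
// `stepDuration` until `maxUsers` is reached, then ramp down to 0 so
// recovery can be observed.
function buildStressStages(start, step, maxUsers, stepDuration = '1m') {
  const stages = [];
  for (let target = start; target <= maxUsers; target += step) {
    stages.push({ duration: stepDuration, target });
  }
  stages.push({ duration: stepDuration, target: 0 }); // recovery phase
  return stages;
}

// 100 users, +100 every minute, capped at 1,000 (the cap is an assumption).
const stressOptions = { stages: buildStressStages(100, 100, 1000) };

// In a k6 script: export const options = stressOptions;
console.log(stressOptions.stages.length); // 11 (10 ramp steps + 1 ramp-down)
```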

Spike Testing

Sudden massive load increase, then return to normal.

Question: Can the system handle sudden traffic spikes?

Example: Normal load is 100 users. Suddenly jump to 1,000 for 2 minutes, then back to 100.

Typical causes: social media mentions, news articles, flash sales.
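
The spike shape can be sketched as k6-style stages. The helper name `buildSpikeStages`, the 2-minute warm-up, the 10-second ramps, and the 3-minute recovery window are all assumptions chosen for illustration; only the 100 and 1,000 user levels and the 2-minute hold come from the example above.

```javascript
// Stages for a spike: steady baseline, abrupt jump, hold, abrupt drop, recovery.
function buildSpikeStages(baseline, peak, holdPeak = '2m') {
  return [
    { duration: '2m', target: baseline },  // warm up to normal load
    { duration: '10s', target: peak },     // near-instant surge
    { duration: holdPeak, target: peak },  // hold the spike
    { duration: '10s', target: baseline }, // drop back to normal
    { duration: '3m', target: baseline },  // watch whether the system recovers
  ];
}

const spikeOptions = { stages: buildSpikeStages(100, 1000) };

// In a k6 script: export const options = spikeOptions;
console.log(spikeOptions.stages.map((s) => s.target).join(' -> '));
```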

Soak/Endurance Testing

Sustained load over an extended period.

Question: Are there memory leaks or performance degradation over time?

Example: Run 200 concurrent users for 8 hours. Monitor memory usage and response time trends.
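
To judge whether memory is trending upward rather than just fluctuating, one option is to fit a straight line through the samples and look at its slope. The sketch below uses an ordinary least-squares slope; the hourly memory figures are made-up sample data, not measurements.

```javascript
// Least-squares slope of ys over xs: a positive slope means the metric grows.
function slope(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    num += (xs[i] - meanX) * (ys[i] - meanY);
    den += (xs[i] - meanX) ** 2;
  }
  return num / den;
}

// Hypothetical hourly memory samples (MB) from an 8-hour soak test.
const hours = [0, 1, 2, 3, 4, 5, 6, 7, 8];
const memoryMb = [512, 530, 549, 566, 585, 603, 620, 639, 657];

const growthPerHour = slope(hours, memoryMb);
console.log(`~${growthPerHour.toFixed(1)} MB/hour`); // ~18.1 MB/hour
```

A steady positive slope that never flattens after garbage collection is a hint worth investigating, not proof of a leak on its own.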

k6: The Modern Load Testing Tool

k6 is a QA-friendly load testing tool: tests are written in JavaScript, run from the CLI, and integrate with CI.

Installation

# macOS
brew install k6

# Windows (winget)
winget install k6

# Docker
docker pull grafana/k6

Your First Load Test

// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

// Test configuration
export const options = {
  vus: 50,           // Virtual users
  duration: '30s',  // Run for 30 seconds
};

export default function() {
  // Make a request
  const response = http.get('https://lab.becomeqa.com/api/products');
  
  // Assertions
  check(response, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
    'has products': (r) => JSON.parse(r.body).length > 0,
  });
  
  // Wait between iterations (simulates real user behavior)
  sleep(1);
}

Run:

k6 run load-test.js

Output (abridged; exact numbers vary by run):

✓ status is 200         100.00% ✓ 1500  ✗ 0
✗ response time < 500ms 95.33%  ✓ 1430  ✗ 70
✓ has products          100.00% ✓ 1500  ✗ 0

http_req_duration..............: avg=185ms min=45ms  med=160ms  max=1.2s    p(90)=350ms p(95)=480ms
http_reqs......................: 1500   49.91/s

Ramp-Up Scenarios

More realistic than hitting full load immediately:

export const options = {
  stages: [
    { duration: '2m', target: 100 },  // Ramp up to 100 users over 2 minutes
    { duration: '5m', target: 100 },  // Hold 100 users for 5 minutes
    { duration: '2m', target: 200 },  // Ramp up to 200
    { duration: '5m', target: 200 },  // Hold 200 users
    { duration: '2m', target: 0 },    // Ramp down
  ],
};
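
To reason about how many users are active at a given moment of a staged run, the following sketch interpolates linearly between stage targets. It is a simplified model, not k6's scheduler: the helper names `toSeconds` and `vusAt` are made up for illustration, only single-unit durations are parsed, and the starting value of 0 users is an assumption (k6's own starting value may differ).

```javascript
// Convert a duration string such as '30s' or '2m' to seconds.
// Only single-unit strings with s, m, or h are handled in this sketch.
function toSeconds(duration) {
  const match = /^(\d+)(s|m|h)$/.exec(duration);
  if (!match) throw new Error(`Unsupported duration: ${duration}`);
  const unit = { s: 1, m: 60, h: 3600 }[match[2]];
  return Number(match[1]) * unit;
}

// Approximate target VU count at time t (seconds), assuming linear ramps
// between stage targets and `startVUs` users at t = 0.
function vusAt(stages, t, startVUs = 0) {
  let elapsed = 0;
  let previous = startVUs;
  for (const stage of stages) {
    const length = toSeconds(stage.duration);
    if (t <= elapsed + length) {
      const progress = (t - elapsed) / length;
      return Math.round(previous + (stage.target - previous) * progress);
    }
    elapsed += length;
    previous = stage.target;
  }
  return previous; // after the last stage
}

const stages = [
  { duration: '2m', target: 100 },
  { duration: '5m', target: 100 },
  { duration: '2m', target: 200 },
  { duration: '5m', target: 200 },
  { duration: '2m', target: 0 },
];

console.log(vusAt(stages, 60));  // 50: halfway through the first ramp
console.log(vusAt(stages, 480)); // 150: halfway between 100 and 200
```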

Testing an API Endpoint with Authentication

import http from 'k6/http';
import { check, group, sleep } from 'k6';  // sleep is used for think time below

export const options = {
  stages: [
    { duration: '1m', target: 50 },
    { duration: '3m', target: 50 },
    { duration: '1m', target: 0 },
  ],
  thresholds: {
    'http_req_duration': ['p(95)<500'],  // 95% of requests under 500ms
    'http_req_failed': ['rate<0.01'],     // Less than 1% errors
  },
};

export function setup() {
  // Login once, return token for all virtual users
  const res = http.post('https://api.myapp.com/auth/login', JSON.stringify({
    email: 'load-test@myapp.com',
    password: 'LoadTestPass1',
  }), { headers: { 'Content-Type': 'application/json' } });
  
  return { token: JSON.parse(res.body).token };
}

export default function(data) {
  const headers = {
    'Authorization': `Bearer ${data.token}`,
    'Content-Type': 'application/json',
  };
  
  group('List products', () => {
    const res = http.get('https://api.myapp.com/products', { headers });
    check(res, { 'products loaded': (r) => r.status === 200 });
  });
  
  group('Get product detail', () => {
    const res = http.get('https://api.myapp.com/products/1', { headers });
    check(res, { 'product detail loaded': (r) => r.status === 200 });
  });
  
  sleep(Math.random() * 3 + 1);  // Random think time 1-4 seconds
}

Performance Thresholds

Define pass/fail criteria:

export const options = {
  thresholds: {
    // 95% of requests must complete below 500ms
    'http_req_duration': ['p(95)<500'],
    
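    // Specific endpoint threshold (matches only requests tagged with
    // { tags: { name: 'products' } }; by default the name tag is the request URL)
    'http_req_duration{name:products}': ['p(95)<300'],
    
    // Less than 0.1% errors allowed
    'http_req_failed': ['rate<0.001'],
    
    // Custom metric threshold (checkout_duration must be defined in the
    // script, for example as a Trend from 'k6/metrics')
    'checkout_duration': ['p(90)<2000'],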
  },
};

If any threshold is not met, k6 exits with a non-zero code (99), so the CI step fails automatically.
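
The tagged and custom-metric thresholds only work if the script produces matching data. The sketch below shows one way to do that under k6: it tags a request with `name: 'products'` and records a `checkout_duration` Trend. The checkout endpoint and its payload are hypothetical and used only for illustration.

```javascript
// checkout-metric.js: a sketch of a tagged request and a custom Trend metric
import http from 'k6/http';
import { Trend } from 'k6/metrics';

// The second argument marks the metric as a time value (milliseconds).
const checkoutDuration = new Trend('checkout_duration', true);

export const options = {
  thresholds: {
    'http_req_duration{name:products}': ['p(95)<300'],
    checkout_duration: ['p(90)<2000'], // 90% of checkouts under 2 seconds
  },
};

export default function () {
  // Tag the request so the {name:products} threshold can match it.
  http.get('https://api.myapp.com/products', { tags: { name: 'products' } });

  // Hypothetical checkout endpoint and payload.
  const res = http.post(
    'https://api.myapp.com/checkout',
    JSON.stringify({ productId: 1, quantity: 1 }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  checkoutDuration.add(res.timings.duration);
}
```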

What to Measure

Response time percentiles:

p50 (median): 50% of requests faster than this
p90: 90% of requests faster than this
p95: 95% of requests faster than this
p99: 99% of requests faster than this; the slow tail that the worst 1% experience

Error rate: Percentage of requests that fail (5xx responses, timeouts)
Throughput: Requests per second the system handles
Concurrent users: How many users are active simultaneously
Resource utilization: CPU, memory, database connections (measured on the server side)
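
The following sketch shows why percentiles are reported instead of an average: a single slow request drags the mean far above what most users experienced. It uses the nearest-rank definition of a percentile, which is an assumption for simplicity; load testing tools may interpolate between samples and report slightly different values. The response times are made-up sample data.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical response times in milliseconds for 10 requests.
const durations = [120, 95, 110, 105, 2400, 130, 100, 115, 125, 90];

const mean = durations.reduce((a, b) => a + b, 0) / durations.length;
console.log(mean);                      // 339
console.log(percentile(durations, 50)); // 110
console.log(percentile(durations, 90)); // 130
console.log(percentile(durations, 99)); // 2400
```

Here the mean (339ms) describes no real request, while p50 and p90 show that typical requests finished in about 110 to 130ms and p99 exposes the 2.4-second outlier.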

Integrating with CI

# GitHub Actions
performance-test:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    
    - name: Install k6
      run: |
        curl https://github.com/grafana/k6/releases/download/v0.50.0/k6-v0.50.0-linux-amd64.tar.gz -L | tar xvz
        sudo mv k6-v0.50.0-linux-amd64/k6 /usr/local/bin
    
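    - name: Run load test
      # --out json writes the results file that the upload step expects
      run: k6 run --out json=k6-results.json load-tests/api-load-test.js
      env:
        BASE_URL: ${{ secrets.STAGING_URL }}  # read in the script as __ENV.BASE_URL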
    
    - name: Upload results
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: k6-results
        path: k6-results.json
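
The workflow passes `BASE_URL` as an environment variable, so the script has to read it. A minimal sketch of what `load-tests/api-load-test.js` might look like follows; the fallback URL, the VU count, and the duration are assumptions for illustration. It runs under k6, where `__ENV` is a global object holding environment variables.

```javascript
// load-tests/api-load-test.js: reads the target host from the environment
import http from 'k6/http';
import { check, sleep } from 'k6';

// The fallback URL is an assumption for local runs without BASE_URL set.
const BASE_URL = __ENV.BASE_URL || 'https://api.myapp.com';

export const options = {
  vus: 10,
  duration: '1m',
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests under 500ms
    http_req_failed: ['rate<0.01'],   // less than 1% errors
  },
};

export default function () {
  const res = http.get(`${BASE_URL}/products`);
  check(res, { 'status is 200': (r) => r.status === 200 });
  sleep(1);
}
```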

Common Performance Issues to Look For

Slow database queries:

Monitor query execution time under load
N+1 queries (1 query per row instead of 1 total)
Missing indexes
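
The N+1 pattern is easiest to see by counting queries. The sketch below uses a fake in-memory database (all names and data are hypothetical) to compare a per-row lookup with a single batched lookup.

```javascript
// A fake in-memory database that counts how many queries it receives.
function createFakeDb() {
  const authors = new Map([[1, 'Ada'], [2, 'Grace'], [3, 'Linus']]);
  const posts = [
    { id: 10, authorId: 1 },
    { id: 11, authorId: 2 },
    { id: 12, authorId: 3 },
  ];
  const db = {
    queryCount: 0,
    getPosts() { db.queryCount++; return posts; },
    getAuthor(id) { db.queryCount++; return authors.get(id); },
    getAuthors(ids) {
      db.queryCount++;
      return new Map(ids.map((id) => [id, authors.get(id)]));
    },
  };
  return db;
}

// N+1: one query for the list, then one more query per row.
function listPostsNPlusOne(db) {
  return db.getPosts().map((p) => ({ id: p.id, author: db.getAuthor(p.authorId) }));
}

// Batched: one query for the list, one query for all authors.
function listPostsBatched(db) {
  const posts = db.getPosts();
  const authors = db.getAuthors(posts.map((p) => p.authorId));
  return posts.map((p) => ({ id: p.id, author: authors.get(p.authorId) }));
}

const slowDb = createFakeDb();
listPostsNPlusOne(slowDb);
console.log(slowDb.queryCount); // 4 (1 list query + 3 per-row queries)

const fastDb = createFakeDb();
listPostsBatched(fastDb);
console.log(fastDb.queryCount); // 2
```

With 3 rows the difference is 4 queries versus 2; with 10,000 rows it is 10,001 versus 2, which is why the problem tends to surface only under realistic data volumes.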

Memory leaks:

Memory usage growing over time without decreasing
Observed in soak tests

Connection pool exhaustion:

Database connections run out under high concurrency
Error: "too many connections"
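
A back-of-the-envelope way to see why pools run dry is Little's law: the average number of connections in use is roughly the request rate multiplied by how long each request holds a connection. The pool size, request rate, and hold times below are assumptions chosen for illustration.

```javascript
// Little's law: average connections in use = arrival rate * hold time.
function connectionsInUse(requestsPerSecond, holdMs) {
  return (requestsPerSecond * holdMs) / 1000;
}

const poolSize = 20; // assumed pool limit

// Fast queries: 200 req/s, each holding a connection for 50 ms.
const normal = connectionsInUse(200, 50);
console.log(normal, normal <= poolSize); // 10 true

// Same traffic after queries slow to 250 ms under load.
const degraded = connectionsInUse(200, 250);
console.log(degraded, degraded <= poolSize); // 50 false
```

The traffic did not change between the two cases; slower queries alone pushed demand from 10 connections to 50, which is how a slow query can show up as a connection error.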

Third-party service bottlenecks:

External API that becomes slow under load
Payment gateway with rate limits

Missing caching:

Same data fetched from the DB repeatedly when it should be cached
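
As an illustration of the fix, here is a minimal time-to-live cache wrapped around a loader function. It is a sketch under simplifying assumptions (single process, no size limit, no invalidation on writes); the names `withTtlCache` and `loadProduct` are hypothetical.

```javascript
// Wrap a loader in a small time-to-live cache keyed by argument.
// `now` is injectable so the expiry logic can be exercised without waiting.
function withTtlCache(loader, ttlMs, now = () => Date.now()) {
  const entries = new Map();
  return function cached(key) {
    const hit = entries.get(key);
    if (hit && now() - hit.storedAt < ttlMs) {
      return hit.value; // still fresh: skip the loader
    }
    const value = loader(key);
    entries.set(key, { value, storedAt: now() });
    return value;
  };
}

// Hypothetical loader standing in for a database read.
let dbReads = 0;
function loadProduct(id) {
  dbReads++;
  return { id, name: `Product ${id}` };
}

const getProduct = withTtlCache(loadProduct, 60000);
getProduct(1);
getProduct(1);
getProduct(1);
console.log(dbReads); // 1: the second and third calls were served from cache
```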

Summary

| Test type | Purpose | Duration |
|-----------|---------|----------|
| Load test | Verify normal-load performance | 30 min to 2 h |
| Stress test | Find breaking point | Until failure |
| Spike test | Handle sudden traffic surges | Short spikes |
| Soak test | Detect memory leaks | Hours |

k6 basics:

vus — virtual users
duration — how long to run
stages — ramp-up patterns
thresholds — pass/fail criteria
check() — per-request assertions

Performance testing is separate from functional testing — different tools, different goals. Functional tests verify correctness; performance tests verify speed and reliability under load. Both are needed before shipping.