API Rate Limiting & Throttling

MediaMath APIs implement rate limiting and throttling to ensure platform stability and fair resource allocation across all users. This page explains the limits, how they work, and how to handle rate-limited responses in your integrations.

Overview

The MediaMath API uses a multi-layer protection system to maintain service reliability. When limits are exceeded, the API returns appropriate HTTP status codes with details to help you adjust your request patterns.

Rate Limits

Limit Type | Guaranteed Safe Limit | Recommended Target | Theoretical Maximum | Notes
--- | --- | --- | --- | ---
Per-User Rate Limit* | 10 req/s | 30 req/s | 50 req/s | Sustained request rate per authenticated user (token bucket refill rate)
Burst Capacity | 50 requests | 150 requests | 250 requests | Maximum requests that can be sent instantly after idle period (token bucket capacity)
Unauthenticated Requests | 50 req/s | 150 req/s | 250 req/s | Global limit for requests without authentication

* Per-user rate limiting is currently disabled by default.

How Burst Capacity Works:

  • Uses a token bucket algorithm with continuous refill (not fixed time windows)
  • Each request consumes 1 token; tokens refill at the rate limit (e.g., 10 tokens/second = 1 token per 100ms)
  • After an idle period of ≥5 seconds, the bucket fills to capacity (50 tokens at 10 tokens/second)
  • Example: A user can send 50 requests instantly, then must wait for tokens to refill at 10 req/s
  • A sustained rate at or below the limit never depletes burst capacity
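
The bullets above can be sketched as a small simulation. This is only an illustration of the token bucket idea using the guaranteed safe values from the table (10 tokens/second, capacity 50); the class name TokenBucket and its methods are hypothetical and say nothing about how the MediaMath servers actually implement it. Time is passed in explicitly so the result is deterministic.

```javascript
// Illustrative token bucket (hypothetical names, not part of the MediaMath API).
// Timestamps are plain millisecond values supplied by the caller.
class TokenBucket {
  constructor(ratePerSecond, capacity, nowMs = 0) {
    this.ratePerSecond = ratePerSecond;
    this.capacity = capacity;
    this.tokens = capacity; // start full, as after an idle period
    this.lastMs = nowMs;
  }

  // Continuous refill: add tokens in proportion to elapsed time, up to capacity.
  refill(nowMs) {
    const elapsedSeconds = (nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.ratePerSecond);
    this.lastMs = nowMs;
  }

  // Returns true if the request is allowed (consumes 1 token), false if it would be throttled.
  tryRequest(nowMs) {
    this.refill(nowMs);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// 60 requests sent at the same instant: only the burst capacity gets through.
const bucket = new TokenBucket(10, 50);
let allowed = 0;
for (let i = 0; i < 60; i++) {
  if (bucket.tryRequest(0)) allowed++;
}
console.log(allowed);                // 50
console.log(bucket.tryRequest(100)); // true  (1 token refilled after 100 ms)
console.log(bucket.tryRequest(100)); // false (bucket is empty again)
```

A client can run the same model locally to estimate whether a planned request pattern would stay within the limits before sending anything.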

Safe Operating Guidelines:

  • At or below recommended targets (30 req/s sustained, 150 request burst capacity): Low risk of throttling under normal variation in request timing
  • Between recommended and maximum: Intermittent throttling is possible when requests are unevenly distributed
  • Above theoretical maximum: Requests will be throttled

Concurrent Write Operations

Standard tier: Up to 50 concurrent write requests

Priority tier: Up to 30 concurrent write requests

During periods of high usage, available concurrency is distributed fairly among active clients to ensure equitable access.

Applies to: POST, PUT, PATCH, DELETE methods only. GET requests are not subject to the concurrent write limit.
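
One way to stay under a concurrent write limit is to cap the number of in-flight write requests on the client side. The sketch below is a generic worker-pool pattern, not a MediaMath SDK feature; the names isWriteMethod and runWithConcurrency are hypothetical, and the cap you pass in is your own choice.

```javascript
// Methods that count toward the concurrent write limit, per the text above.
const WRITE_METHODS = new Set(['POST', 'PUT', 'PATCH', 'DELETE']);

// Hypothetical helper: true if a request with this method is a write.
function isWriteMethod(method) {
  return WRITE_METHODS.has(String(method).toUpperCase());
}

// Hypothetical helper: runs async task functions with at most
// `maxConcurrent` of them in flight at any time. Results keep task order.
async function runWithConcurrency(tasks, maxConcurrent) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker pulls the next unclaimed task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(maxConcurrent, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each task is a function that returns a promise, for example `() => fetch(url, { method: 'PUT', body })`. If any task rejects, the returned promise rejects as well, so callers that want partial results should catch errors inside each task.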

Error Responses

HTTP 429 Too Many Requests

Returned when the rate limit or the concurrent write limit is exceeded.

{
  "meta": {
    "status": "error",
    "uuid": "request-uuid"
  },
  "errors": [{
    "code": "rate-limit-exceeded",
    "message": "Rate limit exceeded, please slow down",
    "details": {
      "limit": 10,
      "window": "1s"
    }
  }]
}

Or for concurrent write limit:

{
  "meta": {
    "status": "error",
    "uuid": "request-uuid"
  },
  "errors": [{
    "code": "too-many-concurrent-writes",
    "message": "Too many concurrent write operations, please retry",
    "details": {
      "limit": 10
    }
  }]
}

Headers returned:

  • Retry-After: 1 - Suggested wait time in seconds before retrying
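
Because both kinds of 429 share the same envelope, a client can tell them apart by the error code. The following is a minimal sketch that assumes the body has the shape shown in the examples above; the function name describeThrottle and the 1-second fallback when Retry-After is absent are assumptions made for illustration.

```javascript
// Hypothetical helper: summarizes a 429 response.
//   status           - HTTP status code
//   body             - parsed JSON body (shape as in the examples above)
//   retryAfterHeader - raw Retry-After header value, or null if absent
// Returns null for any status other than 429.
function describeThrottle(status, body, retryAfterHeader) {
  if (status !== 429) return null;

  const error = (body && Array.isArray(body.errors) && body.errors[0]) || {};
  const seconds = parseInt(retryAfterHeader, 10);

  return {
    code: error.code || 'unknown',
    limit: error.details ? error.details.limit : undefined,
    requestId: body && body.meta ? body.meta.uuid : undefined,
    // Assumption: wait 1 second when the header is missing or not numeric.
    retryAfterMs: (Number.isNaN(seconds) ? 1 : seconds) * 1000,
  };
}
```

A caller might slow its overall request rate when the code is rate-limit-exceeded, and reduce the number of parallel writes when it is too-many-concurrent-writes.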

HTTP 503 Service Unavailable

Returned when the API is under heavy load or experiencing issues.

{
  "meta": {
    "status": "error",
    "uuid": "request-uuid"
  },
  "errors": [{
    "code": "service-overloaded",
    "message": "Service temporarily overloaded, please retry later"
  }]
}

Headers returned:

  • Retry-After: 5 or Retry-After: 10 - Suggested wait time in seconds

Best Practices

  1. Implement Exponential Backoff:

    • On 429 or 503, wait for the number of seconds given in the Retry-After header before retrying
    • If the retry fails, double the wait time on each further attempt (max 60 seconds)
    • Example: 1s → 2s → 4s → 8s → 16s → 32s → 60s
  2. Respect Rate Limits:

    • Space out requests to stay under 10 req/s per user
    • Batch operations when possible instead of many small requests
  3. Handle Concurrent Write Limits:

    • For bulk updates, submit requests sequentially or in small batches
    • Don't fire 20+ simultaneous write requests
  4. Monitor Response Headers:

    • X-RateLimit-Limit: The rate limit in effect (when enabled)
    • X-RateLimit-Remaining: Remaining requests in current window
    • Retry-After: Seconds to wait before retrying
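
The headers listed under Monitor Response Headers can be read from any fetch-style response. The sketch below is illustrative: the function names readRateLimitHeaders and shouldSlowDown and the threshold of 5 are assumptions, and the X-RateLimit-* headers may be absent when per-user limiting is disabled.

```javascript
// Hypothetical helper: reads rate-limit headers from any object with a
// get(name) method, such as the headers of a fetch Response.
// Headers that are missing or not numeric come back as null.
function readRateLimitHeaders(headers) {
  const toInt = (value) => {
    const n = parseInt(value, 10);
    return Number.isNaN(n) ? null : n;
  };
  return {
    limit: toInt(headers.get('X-RateLimit-Limit')),
    remaining: toInt(headers.get('X-RateLimit-Remaining')),
    retryAfterSeconds: toInt(headers.get('Retry-After')),
  };
}

// Hypothetical policy: suggest slowing down once fewer than `threshold`
// requests remain. The default of 5 is an arbitrary example value.
function shouldSlowDown(info, threshold = 5) {
  return info.remaining !== null && info.remaining < threshold;
}
```

Typical use would be `const info = readRateLimitHeaders(response.headers);` after each call, pausing or reducing the request rate when `shouldSlowDown(info)` returns true.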

Example: Retry Logic (JavaScript)

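// Sends a request and retries on 429/503, making up to maxRetries attempts in total.
async function apiRequestWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    // Anything other than 429/503 is returned to the caller as-is.
    if (response.status !== 429 && response.status !== 503) {
      return response;
    }

    // Skip the wait after the final attempt; there is nothing left to retry.
    if (attempt === maxRetries - 1) break;

    // Retry-After is in seconds; fall back to 1 s if it is missing or not a number.
    const retryAfter = parseInt(response.headers.get('Retry-After'), 10) || 1;
    // Double the wait on each attempt, capped at 60 seconds.
    const waitTime = Math.min(retryAfter * 1000 * 2 ** attempt, 60000);
    await new Promise(resolve => setTimeout(resolve, waitTime));
  }
  throw new Error('Max retries exceeded');
}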
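
To see which wait times the retry logic above produces, the doubling rule can be evaluated on its own. This is a small sketch; the function name backoffScheduleSeconds is hypothetical, and the 60-second cap is taken from the best practices list.

```javascript
// Hypothetical helper: wait time in seconds before each retry, starting from
// the Retry-After value and doubling each time, capped at maxSeconds.
function backoffScheduleSeconds(retryAfterSeconds, attempts, maxSeconds = 60) {
  return Array.from({ length: attempts }, (_, attempt) =>
    Math.min(retryAfterSeconds * 2 ** attempt, maxSeconds));
}

console.log(backoffScheduleSeconds(1, 7)); // [ 1, 2, 4, 8, 16, 32, 60 ]
console.log(backoffScheduleSeconds(5, 5)); // [ 5, 10, 20, 40, 60 ]
```

The first call reproduces the 1s → 2s → 4s → 8s → 16s → 32s → 60s example; the second shows the schedule after a 503 that returned Retry-After: 5.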

Endpoint Exceptions

The following endpoint is excluded from rate limiting:

  • /healthcheck - Service health verification

FAQ

Why am I getting rate limited at low request volumes?

Rate limits are per-user, not per-application. If you have multiple applications or scripts using the same API credentials, they share the same rate limit bucket.

How do I request a higher rate limit?

Contact MediaMath Support to discuss your use case. Higher limits may be available for specific approved integrations.

Are rate limits applied to read and write operations equally?

Rate limits (429 responses) apply to all requests equally. However, write concurrency limits only apply to POST, PUT, PATCH, and DELETE operations.

What timezone are rate limit windows based on?

Rate limits do not depend on any timezone. Windows are rolling and measured from the moment of each request (timestamps are in UTC), not aligned to calendar boundaries such as the start of a minute or hour.