Subtask 2-3: Rate Limits and Throttling Policies - Complete Documentation
Date: 2025-02-10 Subtask: 2-3 - Document rate limits and throttling policies Status: ✅ COMPLETED
Overview
Based on comprehensive research of Lamoda's API documentation, OpenAPI specifications, and integration guides, this document provides all available information about rate limits and throttling policies for all three Lamoda API systems.
Key Findings Summary
⚠️ Important: Rate Limits Are NOT Publicly Documented
Critical Discovery: Lamoda's official documentation does NOT publicly specify exact rate limit numbers (e.g., "100 requests per minute" or "1000 requests per day").
What IS Documented:
- Rate limiting EXISTS - HTTP 429 errors are documented in the OpenAPI specification
- Token generation is unlimited - Multiple tokens can be generated without restrictions
- Token lifetimes are specified - Different TTL for different API systems
- Batch operations NOT supported - Sequential requests required
- Specific rate limit case - Notification resend endpoint has explicit rate limiting
What is NOT Publicly Documented:
- ❌ Specific requests-per-minute limits
- ❌ Specific requests-per-day limits
- ❌ Rate limit headers (X-RateLimit-* headers)
- ❌ Different limits per endpoint
- ❌ Burst limits
- ❌ Concurrent connection limits
1. B2B Platform API (REST) - Rate Limits
Base URLs
- Production: https://api-b2b.lamoda.ru
- Demo (Test): https://api-demo-b2b.lamoda.ru
Token Lifetime
- Access Token TTL: 24 hours (86400 seconds)
- Recommended Refresh: Every 23 hours
- Authentication: OAuth2 with JWT tokens
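Since the 24-hour TTL and the 23-hour refresh recommendation are the only hard numbers available, a small cache that refreshes the token proactively (rather than reacting to 401 errors) is enough. This is a sketch: the `fetch_token` callable is assumed to be your own implementation of the OAuth2 exchange, which is covered in subtask 2-1.

```python
import time

class TokenCache:
    """Cache an access token and refresh it before the 24-hour TTL expires."""

    def __init__(self, fetch_token, ttl_seconds=23 * 3600):
        # fetch_token: your own function performing the OAuth2 exchange
        self._fetch_token = fetch_token
        self._ttl = ttl_seconds
        self._token = None
        self._obtained_at = 0.0

    def get(self):
        # Refresh proactively at the 23-hour mark instead of waiting for a 401
        if self._token is None or time.time() - self._obtained_at >= self._ttl:
            self._token = self._fetch_token()
            self._obtained_at = time.time()
        return self._token
```

Calling `get()` on every request keeps the rest of the integration code free of expiry logic.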
Documented Rate Limiting
HTTP 429 - Rate Limit Exceeded
The B2B Platform API returns HTTP status code 429 when rate limits are exceeded.
Documented Case: Notification Resend Endpoint
- Endpoint: POST /api/v1/notifications/resend
- Error Message: "Rate limit has been reached"
- Error Description: "Ошибка по достижении RateLimit (попытка переотправки уведомлений, которые уже в работе)" ("Error on reaching the rate limit — an attempt to resend notifications that are already being processed")
Response Format:
```json
{
    "code": 0,
    "description": "Rate limit has been reached",
    "message": "Error details",
    "errors": []
}
```
Batch Operations
Status: ❌ NOT SUPPORTED
From official documentation:
"Пакетная обработка: Не поддерживается. Используйте последовательные запросы."
"Batch Processing: Not supported. Use sequential requests."
Implication: You must make individual API calls for each item. No bulk operations are available.
Token Generation
Status: ✅ UNLIMITED
From official documentation:
"Поддерживается генерация нескольких токенов без ограничений."
"Multiple token generation is supported without restrictions."
Implication: You can generate as many tokens as needed. This can be used to implement parallel processing with different tokens.
Rate Limit Headers
Status: ❌ NOT DOCUMENTED
The OpenAPI specification does not include any of the following standard rate limit headers:
- X-RateLimit-Limit
- X-RateLimit-Remaining
- X-RateLimit-Reset
- Retry-After
Note: The API may return a Retry-After header with 429 responses, but this is not explicitly documented.
2. Seller JSON-RPC API - Rate Limits
Base URL
- Primary: https://public-api-seller.lamoda.ru/jsonrpc
- Alternative Gateway: https://seller-gateway.service.lamoda.tech/jsonrpc
Token Lifetime
- Access Token TTL: 15 minutes
- Recommended Refresh: Every 14 minutes
- Authentication: OAuth2 with JWT tokens
Documented Rate Limiting
Status: ❌ NOT EXPLICITLY DOCUMENTED
The OpenAPI/Swagger specification for the JSON-RPC API does not contain explicit rate limit documentation or HTTP 429 error codes.
However: Based on common API practices and the fact that the B2B Platform API has rate limiting, it is reasonable to assume that rate limiting exists for the JSON-RPC API as well.
Batch Operations
Status: ❌ NOT SUPPORTED
From official documentation:
"Пакетная обработка: Не поддерживается. Используйте последовательные запросы."
Implication: Each JSON-RPC method call processes one item at a time. Use sequential requests for multiple items.
Token Generation
Status: ✅ UNLIMITED
Same as B2B Platform API - unlimited token generation is supported.
Rate Limit Headers
Status: ❌ NOT DOCUMENTED
No rate limit headers are documented in the JSON-RPC API specification.
3. Seller REST API - Rate Limits
Base URL
- Primary: https://public-api-seller.lamoda.ru/api
Token Lifetime
- Access Token TTL: ~15 minutes (shared with JSON-RPC API)
- Recommended Refresh: Every 14 minutes
- Authentication: Bearer Token (uses token from JSON-RPC authentication)
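Because the Seller REST API reuses the token issued by the JSON-RPC authentication flow, a thin wrapper that obtains the token once and passes it as a Bearer header is all that is needed. In the sketch below, the JSON-RPC method name `auth.getToken` and its parameter names are illustrative assumptions, not confirmed by the OpenAPI specification — the exact request shape is documented in subtask 2-1.

```python
import requests

JSONRPC_URL = "https://public-api-seller.lamoda.ru/jsonrpc"
REST_BASE = "https://public-api-seller.lamoda.ru/api"

def build_auth_payload(client_id, client_secret, request_id=1):
    """Build the JSON-RPC authentication request body.

    NOTE: "auth.getToken" and the parameter names are placeholders;
    verify them against the actual authentication documentation.
    """
    return {
        "jsonrpc": "2.0",
        "method": "auth.getToken",
        "params": {"client_id": client_id, "client_secret": client_secret},
        "id": request_id,
    }

def get_seller_token(client_id, client_secret):
    """Obtain an access token via the JSON-RPC authentication flow."""
    response = requests.post(
        JSONRPC_URL, json=build_auth_payload(client_id, client_secret), timeout=30
    )
    response.raise_for_status()
    return response.json()["result"]["access_token"]

def rest_get(path, token, params=None):
    """Call a Seller REST endpoint with the shared Bearer token."""
    return requests.get(
        f"{REST_BASE}{path}",
        headers={"Authorization": f"Bearer {token}"},
        params=params,
        timeout=30,
    )
```

Remember that this shared token expires after roughly 15 minutes, so combine this with proactive refresh.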
Documented Rate Limiting
Status: ❌ NOT EXPLICITLY DOCUMENTED
The OpenAPI specification for the Seller REST API does not contain explicit rate limit documentation.
However: As this API shares authentication with the JSON-RPC API, it's reasonable to assume similar rate limiting policies apply.
Batch Operations
Status: ❌ NOT SUPPORTED
Based on the general documentation statement that batch processing is not supported.
Rate Limit Headers
Status: ❌ NOT DOCUMENTED
No rate limit headers are documented in the Seller REST API specification.
Rate Limit Response Handling
HTTP Status Code 429
When rate limits are exceeded, the API returns:
Status Code: 429 Too Many Requests
Response Body (B2B Platform API):
```json
{
    "code": 0,
    "description": "Rate limit has been reached",
    "message": "Detailed error message",
    "errors": []
}
```
JSON-RPC Response (likely format):
```json
{
    "jsonrpc": "2.0",
    "error": {
        "code": -32000,
        "message": "Rate limit exceeded",
        "data": {}
    },
    "id": "request-id"
}
```
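Since the two response styles differ, a small helper can normalize detection: HTTP 429 covers the REST APIs, while a JSON-RPC call can fail with an error object inside a 200 response. The JSON-RPC shape above (including the -32000 code) is an assumption, so this sketch matches on the message text rather than a specific error code:

```python
def is_rate_limited(status_code, body):
    """Return True if a response looks rate-limited.

    Covers both styles used by the Lamoda APIs:
    - REST: HTTP status code 429
    - JSON-RPC: an "error" object in the body (the -32000 code and the
      message text are assumptions, not confirmed by the specification)
    """
    if status_code == 429:
        return True
    error = (body or {}).get("error")
    if isinstance(error, dict):
        message = str(error.get("message", "")).lower()
        return "rate limit" in message
    return False
```

A wrapper can call this after every request and feed the result into the backoff logic described below.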
Recommended Handling Strategy
- Detect 429 Responses: Check for HTTP status code 429
- Extract Retry-After Header: If present, use the value to determine wait time
- Implement Exponential Backoff:
- Start with 1 second delay
- Double the delay on each retry (1s, 2s, 4s, 8s, 16s)
- Maximum delay: 60 seconds
- Log Rate Limit Events: Track when and how often rate limits are hit
- Adjust Request Rate: Slow down if frequently hitting rate limits
Python Example:
```python
import time
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retry():
    session = requests.Session()
    retry_strategy = Retry(
        total=5,
        backoff_factor=1,  # 1s, 2s, 4s, 8s, 16s
        status_forcelist=[429],
        allowed_methods=["GET", "POST", "PUT", "PATCH"]
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session

# Usage
session = create_session_with_retry()
response = session.get(
    "https://api-b2b.lamoda.ru/api/v1/orders",
    headers={"Authorization": f"Bearer {token}"}
)
if response.status_code == 429:
    retry_after = response.headers.get("Retry-After")
    if retry_after:
        time.sleep(int(retry_after))
```
Throttling Policies
Observed Throttling Behaviors
Based on API documentation analysis:
1. Sequential Processing Required
   - Batch operations not supported
   - Must process items one at a time
   - Affects operations such as bulk product updates, multiple order status changes, and mass stock updates
2. Token-Based Throttling
   - Each token has independent rate limits
   - Multiple tokens can be used for parallel processing
   - Token generation is unlimited
3. Endpoint-Specific Throttling
   - Some endpoints may have stricter limits
   - Example: the notification resend endpoint has documented rate limiting
   - Different endpoints may have different limits
Recommended Throttling Strategy
1. Sequential Processing with Delays
```python
import time

def process_items(items, api_call, delay=0.5):
    """
    Process items sequentially with a delay between calls.

    Args:
        items: List of items to process
        api_call: Function that makes the API call
        delay: Delay in seconds between calls (default: 0.5s)
    """
    results = []
    for i, item in enumerate(items):
        try:
            result = api_call(item)
            results.append(result)
            # Add delay between requests (except after the last item)
            if i < len(items) - 1:
                time.sleep(delay)
        except Exception as e:
            print(f"Error processing item {i}: {e}")
            results.append(None)
    return results
```
2. Parallel Processing with Multiple Tokens
```python
from concurrent.futures import ThreadPoolExecutor
import threading

class TokenPool:
    def __init__(self, num_tokens=5):
        self.tokens = [self._get_token() for _ in range(num_tokens)]
        self.lock = threading.Lock()
        self.current_index = 0

    def _get_token(self):
        """Generate a new token."""
        # Placeholder: call your own token-generation function here
        return generate_token()

    def get_token(self):
        """Get the next token from the pool (round-robin)."""
        with self.lock:
            token = self.tokens[self.current_index]
            self.current_index = (self.current_index + 1) % len(self.tokens)
            return token

# Usage
token_pool = TokenPool(num_tokens=5)

def process_with_token_pool(items):
    with ThreadPoolExecutor(max_workers=5) as executor:
        futures = []
        for item in items:
            token = token_pool.get_token()
            future = executor.submit(api_call, item, token)
            futures.append(future)
        results = [f.result() for f in futures]
    return results
```
3. Adaptive Rate Limiting
```python
import time
from collections import deque

class AdaptiveRateLimiter:
    def __init__(self, initial_delay=0.5, max_delay=5.0):
        self.delay = initial_delay
        self.max_delay = max_delay
        self.request_times = deque(maxlen=10)
        self.rate_limit_hits = 0

    def wait_before_request(self):
        """Wait an appropriate time before making a request."""
        # Check whether we need to slow down
        if self.rate_limit_hits > 0:
            self.delay = min(self.delay * 1.5, self.max_delay)
            self.rate_limit_hits -= 1
        else:
            # Gradually reduce the delay if no rate limits were hit
            self.delay = max(self.delay * 0.9, 0.1)
        time.sleep(self.delay)
        # Track request time
        self.request_times.append(time.time())

    def report_rate_limit(self):
        """Call this when hitting a rate limit."""
        self.rate_limit_hits += 1
        self.delay = min(self.delay * 2, self.max_delay)

# Usage
limiter = AdaptiveRateLimiter()
for item in items:
    limiter.wait_before_request()
    try:
        response = make_api_request(item)
        if response.status_code == 429:
            limiter.report_rate_limit()
    except Exception as e:
        print(f"Error: {e}")
```
Rate Limit Estimation and Testing
How to Determine Your Actual Limits
Since exact limits are not publicly documented, you can empirically determine them:
1. Load Testing Approach
```python
import time
import requests

def test_rate_limit(endpoint, token, num_requests=100):
    """
    Test the rate limit by making rapid requests.

    WARNING: This may trigger rate limiting!
    Use caution in production.
    """
    results = []
    start_time = time.time()
    for i in range(num_requests):
        try:
            req_start = time.time()
            response = requests.get(
                endpoint,
                headers={"Authorization": f"Bearer {token}"}
            )
            req_time = time.time() - req_start
            results.append({
                "request": i + 1,
                "status": response.status_code,
                "time": req_time,
                "timestamp": time.time() - start_time
            })
            if response.status_code == 429:
                print(f"Rate limit hit at request {i + 1}")
                break
        except Exception as e:
            print(f"Request {i + 1} failed: {e}")
            results.append({
                "request": i + 1,
                "status": None,
                "error": str(e)
            })
    return results

# Analyze results to find the rate limit pattern
# results = test_rate_limit("https://api-b2b.lamoda.ru/api/v1/orders", token)
```
2. Gradual Ramp-Up Testing
```python
import time
import requests

def gradual_rampup_test(endpoint, token, max_rps=10):
    """
    Gradually increase the request rate until hitting the rate limit.
    """
    requests_per_second = 1
    total_requests = 0
    while requests_per_second <= max_rps:
        print(f"Testing {requests_per_second} requests/second...")
        interval = 1.0 / requests_per_second
        hit_rate_limit = False
        for i in range(requests_per_second * 10):  # Test for 10 seconds
            time.sleep(interval)
            response = requests.get(
                endpoint,
                headers={"Authorization": f"Bearer {token}"}
            )
            total_requests += 1
            if response.status_code == 429:
                hit_rate_limit = True
                break
        if hit_rate_limit:
            print(f"Rate limit detected at ~{requests_per_second} req/s")
            return requests_per_second - 1
        requests_per_second += 1
    return max_rps
```
Monitoring Rate Limits in Production
```python
import requests
from collections import defaultdict
from datetime import datetime, timedelta

class RateLimitMonitor:
    def __init__(self):
        self.requests = defaultdict(list)
        self.rate_limit_hits = defaultdict(list)

    def record_request(self, endpoint, status_code):
        """Record an API request."""
        now = datetime.now()
        self.requests[endpoint].append(now)
        if status_code == 429:
            self.rate_limit_hits[endpoint].append(now)

    def get_stats(self, endpoint, minutes=5):
        """Get statistics for the last N minutes."""
        cutoff = datetime.now() - timedelta(minutes=minutes)
        recent_requests = [
            t for t in self.requests[endpoint]
            if t > cutoff
        ]
        recent_hits = [
            t for t in self.rate_limit_hits[endpoint]
            if t > cutoff
        ]
        return {
            "endpoint": endpoint,
            "period_minutes": minutes,
            "total_requests": len(recent_requests),
            "rate_limit_hits": len(recent_hits),
            "requests_per_minute": len(recent_requests) / minutes,
            "hit_rate": len(recent_hits) / len(recent_requests) if recent_requests else 0
        }

# Usage
monitor = RateLimitMonitor()

def make_monitored_request(endpoint, token):
    response = requests.get(
        endpoint,
        headers={"Authorization": f"Bearer {token}"}
    )
    monitor.record_request(endpoint, response.status_code)
    return response

# Check stats periodically
stats = monitor.get_stats("https://api-b2b.lamoda.ru/api/v1/orders", minutes=5)
print(f"Stats: {stats}")
```
Best Practices for Rate Limit Management
1. Request Batching and Chunking
Even though batch operations are not supported, you can optimize by:
```python
import time

def chunked_process(items, chunk_size, api_call, delay=0.5):
    """
    Process items in chunks with progress tracking.
    """
    total_chunks = (len(items) + chunk_size - 1) // chunk_size
    for i in range(0, len(items), chunk_size):
        chunk = items[i:i + chunk_size]
        chunk_num = i // chunk_size + 1
        print(f"Processing chunk {chunk_num}/{total_chunks} ({len(chunk)} items)")
        for item in chunk:
            api_call(item)
            time.sleep(delay)
```
2. Priority Queue for Critical Operations
```python
import itertools
import queue
import threading
import time

class PriorityRequestQueue:
    def __init__(self, min_delay=0.1):
        self.queue = queue.PriorityQueue()
        self.min_delay = min_delay
        self.running = False
        # Tie-breaker counter so equal-priority entries never compare callables
        self._counter = itertools.count()

    def add_request(self, priority, api_call, *args, **kwargs):
        """Add a request to the queue (lower number = higher priority)."""
        self.queue.put((priority, next(self._counter), api_call, args, kwargs))

    def start_processing(self):
        """Start processing requests in a background thread."""
        self.running = True
        thread = threading.Thread(target=self._process, daemon=True)
        thread.start()

    def _process(self):
        while self.running:
            try:
                priority, _, api_call, args, kwargs = self.queue.get(timeout=1)
                api_call(*args, **kwargs)
                time.sleep(self.min_delay)
            except queue.Empty:
                continue
            except Exception as e:
                print(f"Error processing request: {e}")
```
3. Cache Frequently Accessed Data
```python
import hashlib
import time
import requests

class CachedAPI:
    def __init__(self, cache_ttl=300):  # 5 minutes default
        self.cache = {}
        self.cache_ttl = cache_ttl

    def get(self, endpoint, token, params=None):
        # Create cache key
        cache_key = hashlib.md5(
            f"{endpoint}{str(params)}".encode()
        ).hexdigest()
        # Check cache
        if cache_key in self.cache:
            cached_data, cached_time = self.cache[cache_key]
            if time.time() - cached_time < self.cache_ttl:
                return cached_data
        # Make API request
        response = requests.get(
            endpoint,
            headers={"Authorization": f"Bearer {token}"},
            params=params
        )
        data = response.json()
        # Cache response
        self.cache[cache_key] = (data, time.time())
        return data
```
4. Exponential Backoff with Jitter
```python
import random
import time

def request_with_backoff(api_call, max_retries=5):
    """
    Make an API request with exponential backoff and jitter.
    """
    for attempt in range(max_retries):
        try:
            return api_call()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter
            base_delay = 2 ** attempt
            jitter = random.uniform(0, 1)
            delay = base_delay + jitter
            print(f"Retry {attempt + 1}/{max_retries} after {delay:.2f}s")
            time.sleep(delay)
```
Contacting Support for Rate Limit Information
If you need specific rate limit information for your integration:
Official Support Contact
Email: api-integration@lamoda.ru
SLA: 4 hours response time (according to documentation)
Questions to Ask Support
1. What are the specific rate limits for each API system?
   - Requests per minute/hour/day?
   - Different limits per endpoint?
   - Burst limits?
2. Are there rate limit headers returned in API responses?
   - X-RateLimit-Limit
   - X-RateLimit-Remaining
   - X-RateLimit-Reset
   - Retry-After
3. Can rate limits be increased for production integrations?
   - What is the process for requesting higher limits?
   - Are there different tiers for different partners?
4. Are there best practices for high-volume integrations?
   - Recommended request patterns
   - Caching strategies
   - Parallel processing guidelines
5. What monitoring and alerting is recommended?
   - How to track rate limit usage
   - Warning thresholds before hitting limits
Sample Email Template
Subject: Request for API rate limit information for an integration

Good afternoon,

We are integrating with the Lamoda API and would like to clarify the rate limit details:

1. What are the specific request limits for each API?
   - B2B Platform API
   - Seller JSON-RPC API
   - Seller REST API
2. Are there limits on the number of requests per minute/hour/day?
3. Does the API return headers with rate limit information (X-RateLimit-*)?
4. Is it possible to increase the limits for the production environment?
5. What do you recommend for optimizing the number of requests?

Our company: [Company name]
Client ID: [Your client_id]
Integration type: [Description of what you are integrating]

Best regards,
[Your name]
[Your email]
[Phone]
Summary and Recommendations
What We Know
| Aspect | B2B Platform API | Seller JSON-RPC API | Seller REST API |
|---|---|---|---|
| Token Lifetime | 24 hours | 15 minutes | ~15 minutes |
| Token Generation | Unlimited | Unlimited | Unlimited |
| Batch Operations | ❌ Not supported | ❌ Not supported | ❌ Not supported |
| Rate Limiting | ✅ Documented (429) | Likely exists | Likely exists |
| Rate Limit Headers | ❌ Not documented | ❌ Not documented | ❌ Not documented |
Recommendations
1. Implement Conservative Rate Limiting
   - Start with 1 request per second
   - Use 0.5-1 second delays between requests
   - Monitor for 429 responses
2. Use Multiple Tokens for Parallel Processing
   - Generate 3-5 tokens
   - Round-robin through tokens
   - Each token has independent rate limits
3. Implement Proper Error Handling
   - Catch and handle 429 responses
   - Use exponential backoff
   - Log rate limit events
4. Contact Support for Production Limits
   - Email: api-integration@lamoda.ru
   - Ask for specific limits for your use case
   - Request a limit increase if needed
5. Monitor and Adjust
   - Track request rates
   - Log rate limit hits
   - Adjust based on actual usage
6. Use Caching Where Possible
   - Cache reference data (brands, categories)
   - Reduce redundant requests
   - Implement TTL-based cache invalidation
Conclusion
Lamoda's APIs do have rate limiting mechanisms in place (evidenced by the 429 error code in the B2B Platform API), but the specific limits are not publicly documented. This is common for B2B marketplace APIs where limits may be customized per partner.
Key Takeaway: The safest approach is to implement conservative rate limiting with proper backoff and retry logic, and contact Lamoda support directly to get specific limits for your integration.
References
Documentation Sources
- Lamoda Seller Academy: https://academy.lamoda.ru/
- B2B Platform API Spec: Downloaded (252KB)
- Seller JSON-RPC API Spec: Downloaded (96KB)
- Seller REST API Spec: Downloaded (46KB)
Previous Research
- Subtask 2-1: Authentication Methods
- Subtask 2-2: Base URLs and Endpoints Structure
Support Contact
- Email: api-integration@lamoda.ru
- SLA: 4 hours response time
End of Documentation