# Usage Analytics

> Monitor inference request analytics and performance metrics in your on-prem deployment.

The usage analytics endpoint reports metrics for the inference requests processed by your on-premises container. Use it to monitor request volumes, success rates, processing-time statistics, and the current queue state.

<Info>
  This endpoint is only available in on-premises deployments and requires a valid license.
</Info>

## Endpoint

```bash theme={null}
GET /api/v1/usage
```

## Query Parameters

| Parameter    | Type              | Default      | Description                       |
| ------------ | ----------------- | ------------ | --------------------------------- |
| `start_date` | string (ISO 8601) | 24 hours ago | Start of time range for analytics |
| `end_date`   | string (ISO 8601) | Now          | End of time range for analytics   |

<Warning>
  The time range cannot exceed 7 days. Requests with larger ranges will return a 400 error.
</Warning>
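
To cover a period longer than 7 days, one option is to split it into consecutive windows and query each window separately. The sketch below only builds the windows; the helper name `split_into_windows` is hypothetical and not part of any Datalab API.

```python
from datetime import datetime, timedelta, timezone

MAX_WINDOW = timedelta(days=7)  # limit stated in the warning above

def split_into_windows(start, end, max_window=MAX_WINDOW):
    """Split [start, end] into consecutive windows no longer than max_window."""
    if start >= end:
        raise ValueError("start must be before end")
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + max_window, end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows

start = datetime(2024, 6, 1, tzinfo=timezone.utc)
end = datetime(2024, 6, 20, tzinfo=timezone.utc)

# Each pair can be passed as start_date / end_date in its own request.
for window_start, window_end in split_into_windows(start, end):
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Request and page counts from separate windows can be added together, but percentiles cannot be combined this way. Adjacent windows share a boundary timestamp, and whether the endpoint treats range bounds as inclusive is not documented here, so a request completing exactly on a boundary may need special handling.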

## Authentication

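This endpoint requires a valid on-premises license. If your license is invalid or expired, the endpoint returns a 423 (Locked) status code. The `X-API-Key` header used in the examples is not validated in on-prem deployments, so any value is accepted.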

## Response Structure

The endpoint returns a JSON analytics object with five sections:

### Period

The effective time range for the query (normalized to UTC):

```json theme={null}
{
  "period": {
    "start_date": "2024-06-01T00:00:00+00:00",
    "end_date": "2024-06-01T23:59:59+00:00"
  }
}
```
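
As a rough illustration of this normalization, the sketch below converts timestamps to UTC on the client side before they are sent. The helper name `to_utc` is hypothetical. Treating naive datetimes as UTC mirrors the behavior listed under Implementation Notes; it is not taken from the server's actual code.

```python
from datetime import datetime, timedelta, timezone

def to_utc(dt):
    """Return dt as a timezone-aware UTC datetime.

    Naive datetimes are assumed to already be in UTC.
    """
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)

naive = datetime(2024, 6, 1, 0, 0, 0)
offset = datetime(2024, 6, 1, 2, 0, 0, tzinfo=timezone(timedelta(hours=2)))

print(to_utc(naive).isoformat())   # 2024-06-01T00:00:00+00:00
print(to_utc(offset).isoformat())  # 2024-06-01T00:00:00+00:00
```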

### Summary

Aggregate statistics across all request types:

```json theme={null}
{
  "summary": {
    "total_requests": 1250,
    "successful_requests": 1200,
    "failed_requests": 50,
    "successful_pages_processed": 15000,
    "failed_pages_processed": 500,
    "success_rate": 0.96
  }
}
```

| Field                        | Type  | Description                                 |
| ---------------------------- | ----- | ------------------------------------------- |
| `total_requests`             | int   | Total completed requests in time range      |
| `successful_requests`        | int   | Requests completed without errors           |
| `failed_requests`            | int   | Requests that failed with errors            |
| `successful_pages_processed` | int   | Total pages from successful requests        |
| `failed_pages_processed`     | int   | Total pages from failed requests            |
| `success_rate`               | float | Ratio of successful to total requests (0-1) |

### By Request Type

Request and page counts broken down by request type:

```json theme={null}
{
  "by_request_type": {
    "marker": {
      "total_requests": 1000,
      "successful_requests": 980,
      "failed_requests": 20,
      "successful_pages_processed": 12000,
      "failed_pages_processed": 200
    },
    "ocr": {
      "total_requests": 250,
      "successful_requests": 220,
      "failed_requests": 30,
      "successful_pages_processed": 3000,
      "failed_pages_processed": 300
    }
  }
}
```
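
The example above does not include a per-type `success_rate`, but one can be derived from the counts. The sketch below uses the example values; the helper names are hypothetical, and it assumes every request type carries the same five count fields.

```python
# Example values copied from the by_request_type response above.
by_request_type = {
    "marker": {
        "total_requests": 1000,
        "successful_requests": 980,
        "failed_requests": 20,
        "successful_pages_processed": 12000,
        "failed_pages_processed": 200,
    },
    "ocr": {
        "total_requests": 250,
        "successful_requests": 220,
        "failed_requests": 30,
        "successful_pages_processed": 3000,
        "failed_pages_processed": 300,
    },
}

def per_type_success_rates(by_type):
    """Map each request type to successful / total, skipping empty types."""
    return {
        name: metrics["successful_requests"] / metrics["total_requests"]
        for name, metrics in by_type.items()
        if metrics["total_requests"] > 0
    }

def sum_field(by_type, field):
    """Add one count field across all request types."""
    return sum(metrics[field] for metrics in by_type.values())

for name, rate in per_type_success_rates(by_request_type).items():
    print(f"{name}: {rate:.2%}")

print(sum_field(by_request_type, "total_requests"))  # 1250
```

In these examples the per-type counts add up to the summary totals (1000 + 250 = 1250 requests), which gives a simple consistency check.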

### Performance

Processing time and queue wait statistics (only includes successful requests):

```json theme={null}
{
  "performance": {
    "average_processing_time_secs": 12.5,
    "median_processing_time_secs": 10.2,
    "p95_processing_time_secs": 25.8,
    "p99_processing_time_secs": 35.4,
    "average_queue_wait_secs": 2.3
  }
}
```

| Field                          | Type  | Description                        |
| ------------------------------ | ----- | ---------------------------------- |
| `average_processing_time_secs` | float | Mean time from start to completion |
| `median_processing_time_secs`  | float | 50th percentile processing time    |
| `p95_processing_time_secs`     | float | 95th percentile processing time    |
| `p99_processing_time_secs`     | float | 99th percentile processing time    |
| `average_queue_wait_secs`      | float | Mean time from submission to start |

<Info>
  Performance metrics are `null` when there are no successful requests in the time range. Failed requests are excluded from performance calculations.
</Info>
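
Because these fields can be `null`, client code should compare against `None` explicitly rather than rely on truthiness, since a legitimate value of `0.0` is also falsy in Python. A minimal sketch, with the hypothetical helper name `format_secs`:

```python
def format_secs(value):
    """Format a duration in seconds, or 'n/a' when the metric is null."""
    if value is None:
        return "n/a"
    return f"{value:.2f}s"

# Shape of the performance section when no request succeeded in the range.
empty_perf = {
    "average_processing_time_secs": None,
    "median_processing_time_secs": None,
    "p95_processing_time_secs": None,
    "p99_processing_time_secs": None,
    "average_queue_wait_secs": None,
}

for name, value in empty_perf.items():
    print(f"{name}: {format_secs(value)}")

print(format_secs(12.5))  # 12.50s
print(format_secs(0.0))   # 0.00s
```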

### Current Status

Live snapshot of in-progress and queued requests (not filtered by time range):

```json theme={null}
{
  "current_status": {
    "requests_in_progress": 5,
    "requests_queued": 12
  }
}
```

| Field                  | Type | Description                        |
| ---------------------- | ---- | ---------------------------------- |
| `requests_in_progress` | int  | Requests currently being processed |
| `requests_queued`      | int  | Requests waiting to be processed   |

## Examples

### Basic Usage (Default 24-Hour Window)

<CodeGroup>
  ```python Python SDK theme={null}
  # The Python SDK does not yet support the usage endpoint
  # Use the requests library directly
  import requests

  response = requests.get(
      "http://localhost:8000/api/v1/usage",
      headers={"X-API-Key": "any-value"}  # Not validated in on-prem
  )

  data = response.json()
  print(f"Total requests: {data['summary']['total_requests']}")
  print(f"Success rate: {data['summary']['success_rate']:.2%}")
  ```

  ```bash cURL theme={null}
  curl -X GET http://localhost:8000/api/v1/usage \
    -H "X-API-Key: any-value"
  ```

  ```python Python (requests) theme={null}
  import requests

  response = requests.get(
      "http://localhost:8000/api/v1/usage",
      headers={"X-API-Key": "any-value"}
  )

  data = response.json()

  # Print summary
  summary = data["summary"]
  print(f"Total: {summary['total_requests']}")
  print(f"Success: {summary['successful_requests']}")
  print(f"Failed: {summary['failed_requests']}")
  print(f"Success rate: {summary['success_rate']:.2%}")

  # Print performance metrics
  perf = data["performance"]
  if perf["average_processing_time_secs"] is not None:
      print(f"\nAvg processing time: {perf['average_processing_time_secs']:.2f}s")
      print(f"P95 processing time: {perf['p95_processing_time_secs']:.2f}s")
  ```
</CodeGroup>

### Custom Time Range

<CodeGroup>
  ```python Python SDK theme={null}
  import requests
  from datetime import datetime, timedelta, timezone

  # Query last 7 days
  end_date = datetime.now(timezone.utc)
  start_date = end_date - timedelta(days=7)

  response = requests.get(
      "http://localhost:8000/api/v1/usage",
      params={
          "start_date": start_date.isoformat(),
          "end_date": end_date.isoformat()
      },
      headers={"X-API-Key": "any-value"}
  )

  data = response.json()
  ```

  ```bash cURL theme={null}
  # Query specific date range
  curl -X GET "http://localhost:8000/api/v1/usage?start_date=2024-06-01T00:00:00Z&end_date=2024-06-07T23:59:59Z" \
    -H "X-API-Key: any-value"
  ```

  ```python Python (requests) theme={null}
  import requests
  from datetime import datetime, timedelta, timezone

  # Query last 3 days
  end_date = datetime.now(timezone.utc)
  start_date = end_date - timedelta(days=3)

  response = requests.get(
      "http://localhost:8000/api/v1/usage",
      params={
          "start_date": start_date.isoformat(),
          "end_date": end_date.isoformat()
      },
      headers={"X-API-Key": "any-value"}
  )

  data = response.json()
  print(f"Period: {data['period']['start_date']} to {data['period']['end_date']}")
  ```
</CodeGroup>

### Monitoring Dashboard Example

<CodeGroup>
  ```python Python SDK theme={null}
  import requests

  def get_usage_metrics():
      """Fetch current usage metrics for monitoring dashboard."""
      response = requests.get(
          "http://localhost:8000/api/v1/usage",
          headers={"X-API-Key": "any-value"}
      )
      
      if response.status_code != 200:
          raise Exception(f"Failed to fetch metrics: {response.status_code}")
      
      return response.json()

  def print_dashboard():
      """Print a simple monitoring dashboard."""
      data = get_usage_metrics()
      
      print("=" * 60)
      print("DATALAB ON-PREM USAGE DASHBOARD")
      print("=" * 60)
      
      # Summary
      summary = data["summary"]
      print(f"\n📊 SUMMARY (Last 24 Hours)")
      print(f"  Total Requests:     {summary['total_requests']:,}")
      print(f"  Successful:         {summary['successful_requests']:,}")
      print(f"  Failed:             {summary['failed_requests']:,}")
      print(f"  Success Rate:       {summary['success_rate']:.2%}")
      print(f"  Pages Processed:    {summary['successful_pages_processed']:,}")
      
      # By type
      print(f"\n📈 BY REQUEST TYPE")
      for req_type, metrics in data["by_request_type"].items():
          print(f"  {req_type.upper()}:")
          print(f"    Requests: {metrics['total_requests']:,} ({metrics['successful_requests']:,} successful)")
          print(f"    Pages: {metrics['successful_pages_processed']:,}")
      
      # Performance
      perf = data["performance"]
      if perf["average_processing_time_secs"] is not None:
          print(f"\n⚡ PERFORMANCE")
          print(f"  Avg Processing:     {perf['average_processing_time_secs']:.2f}s")
          print(f"  Median Processing:  {perf['median_processing_time_secs']:.2f}s")
          print(f"  P95 Processing:     {perf['p95_processing_time_secs']:.2f}s")
          print(f"  P99 Processing:     {perf['p99_processing_time_secs']:.2f}s")
          print(f"  Avg Queue Wait:     {perf['average_queue_wait_secs']:.2f}s")
      
      # Current status
      status = data["current_status"]
      print(f"\n🔄 CURRENT STATUS")
      print(f"  In Progress:        {status['requests_in_progress']}")
      print(f"  Queued:             {status['requests_queued']}")
      
      print("=" * 60)

  if __name__ == "__main__":
      print_dashboard()
  ```

  ```bash cURL theme={null}
  # Simple monitoring script
  curl -s http://localhost:8000/api/v1/usage \
    -H "X-API-Key: any-value" | \
    jq '{
      total: .summary.total_requests,
      success_rate: .summary.success_rate,
      in_progress: .current_status.requests_in_progress,
      queued: .current_status.requests_queued
    }'
  ```

  ```python Python (requests) theme={null}
  import requests

  def monitor_system_health():
      """Check system health based on usage metrics."""
      response = requests.get(
          "http://localhost:8000/api/v1/usage",
          headers={"X-API-Key": "any-value"}
      )
      
      response.raise_for_status()  # surface 400/423 errors before reading the body
      data = response.json()
      summary = data["summary"]
      status = data["current_status"]
      perf = data["performance"]
      
      # Check success rate
      if summary["success_rate"] < 0.95:
          print(f"⚠️  WARNING: Success rate is {summary['success_rate']:.2%}")
      
      # Check queue depth
      if status["requests_queued"] > 100:
          print(f"⚠️  WARNING: {status['requests_queued']} requests queued")
      
      # Check processing time
      if perf["p95_processing_time_secs"] and perf["p95_processing_time_secs"] > 60:
          print(f"⚠️  WARNING: P95 processing time is {perf['p95_processing_time_secs']:.1f}s")
      
      print("✅ System health check complete")

  monitor_system_health()
  ```
</CodeGroup>

## Error Responses

### 400 Bad Request

Invalid query parameters:

```json theme={null}
{
  "detail": "start_date must be before end_date."
}
```

```json theme={null}
{
  "detail": "Time range must not exceed 7 days."
}
```

### 423 Locked

License validation failed:

```json theme={null}
{
  "detail": "License validation failed"
}
```
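
A client can branch on these status codes and surface the `detail` message. The sketch below is a pure function so it can be exercised without a running container; the name `explain_usage_response` and its return strings are hypothetical.

```python
def explain_usage_response(status_code, body):
    """Turn a usage-endpoint status code and JSON body into a short message."""
    detail = body.get("detail", "") if isinstance(body, dict) else ""
    if status_code == 200:
        return "ok"
    if status_code == 400:
        return f"invalid query parameters: {detail}"
    if status_code == 423:
        return f"license problem: {detail}"
    return f"unexpected status {status_code}"

print(explain_usage_response(400, {"detail": "Time range must not exceed 7 days."}))
print(explain_usage_response(423, {"detail": "License validation failed"}))
```

With `requests`, the two arguments would be `response.status_code` and `response.json()`.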

## Implementation Notes

* Only **completed requests** (with `end_time` set) are included in summary statistics
* Failed requests are counted in totals but excluded from performance metrics
* Performance percentiles (median, P95, P99) are computed using linear interpolation
* Queue wait time is calculated as `start_time - submission_time`
* Processing time is calculated as `end_time - start_time`
* Naive datetimes (without timezone) are treated as UTC
* The `current_status` section provides a live snapshot and is not filtered by the time range
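
The timing rules above can be sketched as follows. This is an illustration under stated assumptions, not the container's implementation: the record layout is invented for the example, and the interpolation convention (rank `(n - 1) * p / 100` between neighboring sorted values) is one common choice that the notes do not confirm.

```python
from datetime import datetime, timedelta, timezone

def percentile(values, pct):
    """Percentile with linear interpolation between neighboring sorted values."""
    if not values:
        return None  # mirrors the null metrics when nothing succeeded
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)

def make_record(base, wait_secs, processing_secs):
    """Build a hypothetical successful request record from two durations."""
    start = base + timedelta(seconds=wait_secs)
    return {
        "submission_time": base,
        "start_time": start,
        "end_time": start + timedelta(seconds=processing_secs),
    }

t0 = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    make_record(t0, 2, 10),
    make_record(t0, 4, 20),
    make_record(t0, 1, 30),
    make_record(t0, 1, 40),
]

# Processing time is end_time - start_time; queue wait is start_time - submission_time.
processing = [(r["end_time"] - r["start_time"]).total_seconds() for r in records]
queue_wait = [(r["start_time"] - r["submission_time"]).total_seconds() for r in records]

print(percentile(processing, 50))        # 25.0
print(sum(queue_wait) / len(queue_wait)) # 2.0
```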

## Use Cases

### Capacity Planning

Monitor request volumes and processing times to plan infrastructure scaling:

```python theme={null}
import requests
from datetime import datetime, timedelta, timezone

# Get last 7 days of data
end = datetime.now(timezone.utc)
start = end - timedelta(days=7)

response = requests.get(
    "http://localhost:8000/api/v1/usage",
    params={"start_date": start.isoformat(), "end_date": end.isoformat()},
    headers={"X-API-Key": "any-value"}
)

data = response.json()
avg_daily_requests = data["summary"]["total_requests"] / 7
avg_daily_pages = data["summary"]["successful_pages_processed"] / 7

print(f"Average daily requests: {avg_daily_requests:.0f}")
print(f"Average daily pages: {avg_daily_pages:.0f}")
```

### Performance Monitoring

Track processing times to identify performance degradation:

```python theme={null}
import requests

response = requests.get(
    "http://localhost:8000/api/v1/usage",
    headers={"X-API-Key": "any-value"}
)

perf = response.json()["performance"]

# Alert if P95 exceeds threshold
if perf["p95_processing_time_secs"] and perf["p95_processing_time_secs"] > 30:
    print(f"ALERT: P95 processing time is {perf['p95_processing_time_secs']:.1f}s")
```

### Queue Monitoring

Monitor queue depth to detect bottlenecks:

```python theme={null}
import requests

response = requests.get(
    "http://localhost:8000/api/v1/usage",
    headers={"X-API-Key": "any-value"}
)

status = response.json()["current_status"]

if status["requests_queued"] > 50:
    print(f"WARNING: {status['requests_queued']} requests in queue")
```

## Next Steps

<CardGroup cols={2}>
  <Card title="On-Prem API" icon="server" href="/docs/on-prem/api">
    Full API reference for the on-prem container.
  </Card>

  <Card title="Running the Container" icon="play" href="/docs/on-prem/running-the-container">
    Get the on-prem container up and running.
  </Card>

  <Card title="Error Codes" icon="circle-exclamation" href="/platform/errors">
    Understand HTTP error codes and troubleshooting.
  </Card>

  <Card title="On-Prem Overview" icon="building" href="/docs/on-prem/overview">
    Compare open-source and paid on-prem options.
  </Card>
</CardGroup>
