
Retention Policies

Retention policies let you manage your data lifecycle by defining how long data should be kept.

Manual Execution Only

Automatic execution is reserved for Arc Enterprise. In Arc OSS, retention policies must be manually triggered via the API.

Overview

Retention policies in Arc help you:

  • Define data retention periods at database or measurement level
  • Clean up old data through manually triggered execution
  • Reduce storage costs by removing unnecessary historical data
  • Maintain compliance with data retention requirements
  • Test deletion operations safely with dry-run mode

How It Works

Arc implements retention through physical file deletion:

  1. Scanning: Examines Parquet files in measurement directories
  2. Metadata Analysis: Reads file metadata to find maximum timestamps
  3. Identification: Locates files where all rows are older than the cutoff date
  4. Deletion: Physically removes entire files from disk

Cutoff Calculation: cutoff_date = today - retention_days - buffer_days
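
For example, the cutoff for a policy with retention_days: 90 and buffer_days: 7 can be reproduced with a few lines of Python (a sketch of the arithmetic only; Arc performs this calculation internally):

from datetime import date, timedelta

# Illustrative values matching the examples on this page
retention_days = 90
buffer_days = 7

cutoff_date = date.today() - timedelta(days=retention_days + buffer_days)
print(f"Files whose newest row is before {cutoff_date} are eligible for deletion")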

API Endpoints

Create Policy

Create a new retention policy:

POST /api/v1/retention

Request Body:

{
  "name": "delete_old_metrics",
  "database": "telegraf",
  "measurement": "cpu",
  "retention_days": 90,
  "buffer_days": 7,
  "is_active": true
}

Parameters:

  • name (string, required): Unique policy identifier
  • database (string, required): Target database name
  • measurement (string, optional): Target measurement (null for database-wide)
  • retention_days (integer, required): Number of days to retain data
  • buffer_days (integer, required): Safety margin in days
  • is_active (boolean, required): Enable/disable the policy

List Policies

Retrieve all retention policies:

GET /api/v1/retention

Response:

[
  {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "name": "delete_old_metrics",
    "database": "telegraf",
    "measurement": "cpu",
    "retention_days": 90,
    "buffer_days": 7,
    "is_active": true,
    "created_at": "2024-01-15T10:30:00Z",
    "last_executed_at": "2024-01-20T02:00:00Z",
    "last_deleted_count": 1500
  }
]

Get Single Policy

Retrieve a specific retention policy:

GET /api/v1/retention/{policy_id}
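
A minimal Python sketch, using the same base URL and bearer-token auth as the examples later on this page (the UUID is illustrative):

import requests

policy_id = "550e8400-e29b-41d4-a716-446655440000"  # illustrative UUID
policy = requests.get(
    f"http://localhost:8000/api/v1/retention/{policy_id}",
    headers={"Authorization": "Bearer YOUR_TOKEN"}
).json()
print(policy["name"], policy["retention_days"])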

Update Policy

Update an existing retention policy:

PUT /api/v1/retention/{policy_id}

Request Body: Same as create policy
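
A minimal sketch of an update, assuming the same auth scheme as the other examples; the full policy body is resent with the changed fields:

import requests

policy_id = "550e8400-e29b-41d4-a716-446655440000"  # illustrative UUID

# Resend the complete policy definition with the updated values
requests.put(
    f"http://localhost:8000/api/v1/retention/{policy_id}",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={
        "name": "delete_old_metrics",
        "database": "telegraf",
        "measurement": "cpu",
        "retention_days": 120,  # extended from 90 days
        "buffer_days": 7,
        "is_active": True
    }
)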

Delete Policy

Remove a retention policy:

DELETE /api/v1/retention/{policy_id}
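
A minimal sketch (the UUID is illustrative):

import requests

policy_id = "550e8400-e29b-41d4-a716-446655440000"  # illustrative UUID
requests.delete(
    f"http://localhost:8000/api/v1/retention/{policy_id}",
    headers={"Authorization": "Bearer YOUR_TOKEN"}
)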

Execute Policy

Manually trigger a retention policy:

POST /api/v1/retention/{policy_id}/execute

Request Body:

{
  "dry_run": false,
  "confirm": true
}

Dry Run Example:

{
  "dry_run": true,
  "confirm": false
}

Response:

{
  "policy_id": "550e8400-e29b-41d4-a716-446655440000",
  "cutoff_date": "2023-10-22T00:00:00Z",
  "files_to_delete": [
    "/data/telegraf/cpu/2023-10-15.parquet",
    "/data/telegraf/cpu/2023-10-20.parquet"
  ],
  "total_files": 2,
  "dry_run": true
}

View Execution History

View past executions of a retention policy:

GET /api/v1/retention/{policy_id}/executions?limit=50

Response:

[
  {
    "execution_id": "abc123",
    "executed_at": "2024-01-20T02:00:00Z",
    "deleted_count": 1500,
    "execution_time_ms": 2500,
    "status": "success"
  }
]

Configuration Parameters

Retention Days

The number of days to keep data before it becomes eligible for deletion. Choose based on:

  • Business requirements
  • Compliance regulations
  • Storage capacity
  • Query patterns

Example: retention_days: 90 keeps data for 90 days.

Buffer Days

A safety margin added to the retention period to prevent accidental deletion of recent data.

Recommended Values:

  • Development: 7 days
  • Production: 14-30 days

Example: With retention_days: 90 and buffer_days: 7, data older than 97 days will be deleted.

Database vs Measurement Level

Database-wide policy:

{
  "database": "telegraf",
  "measurement": null,
  "retention_days": 365
}

Measurement-specific policy:

{
  "database": "telegraf",
  "measurement": "cpu",
  "retention_days": 90
}

Use measurement-specific policies for granular control over different data types.

Usage Examples

Example 1: Clean Old Metrics

import requests

# Create a retention policy for old CPU metrics
response = requests.post(
    "http://localhost:8000/api/v1/retention",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={
        "name": "cpu_cleanup",
        "database": "telegraf",
        "measurement": "cpu",
        "retention_days": 90,
        "buffer_days": 7,
        "is_active": True
    }
)

policy_id = response.json()["id"]

# Test with dry run first
dry_run = requests.post(
    f"http://localhost:8000/api/v1/retention/{policy_id}/execute",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={"dry_run": True, "confirm": False}
)

print(f"Would delete {dry_run.json()['total_files']} files")

# Execute for real
if input("Proceed? (yes/no): ") == "yes":
    result = requests.post(
        f"http://localhost:8000/api/v1/retention/{policy_id}/execute",
        headers={"Authorization": "Bearer YOUR_TOKEN"},
        json={"dry_run": False, "confirm": True}
    )
    print(f"Deleted {result.json()['total_files']} files")

Example 2: Database-wide Retention

# Apply retention to all measurements in a database
response = requests.post(
    "http://localhost:8000/api/v1/retention",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={
        "name": "database_cleanup",
        "database": "telegraf",
        "measurement": None,  # Apply to all measurements
        "retention_days": 180,
        "buffer_days": 14,
        "is_active": True
    }
)

Example 3: List and Monitor Policies

# List all policies
policies = requests.get(
    "http://localhost:8000/api/v1/retention",
    headers={"Authorization": "Bearer YOUR_TOKEN"}
)

for policy in policies.json():
    print(f"Policy: {policy['name']}")
    print(f"  Last executed: {policy['last_executed_at']}")
    print(f"  Last deleted: {policy['last_deleted_count']} rows")

    # Get execution history for this policy
    history = requests.get(
        f"http://localhost:8000/api/v1/retention/{policy['id']}/executions?limit=10",
        headers={"Authorization": "Bearer YOUR_TOKEN"}
    )
    print(f"  Recent executions: {len(history.json())}")

Best Practices

1. Always Test First

Use dry-run mode before executing retention policies:

# Always start with dry run
result = requests.post(
    f"http://localhost:8000/api/v1/retention/{policy_id}/execute",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={"dry_run": True, "confirm": False}
)

# Review what will be deleted
print(f"Files to delete: {result.json()['files_to_delete']}")

2. Use Buffer Days

Implement a safety buffer to prevent accidental deletion:

{
  "retention_days": 90,
  "buffer_days": 14  // 14-day safety margin
}

3. Start Conservative

Begin with longer retention periods and gradually shorten:

// Start here
{"retention_days": 365, "buffer_days": 30}

// After monitoring, reduce if needed
{"retention_days": 180, "buffer_days": 14}

4. Test in Non-Production

Create and test policies in a development environment first:

# Development environment
export ARC_ENV=dev
# Test policies thoroughly before production

5. Monitor Execution History

Regularly check the last_deleted_count field:

# Check if deletion counts are as expected
policy = requests.get(
    f"http://localhost:8000/api/v1/retention/{policy_id}",
    headers={"Authorization": "Bearer YOUR_TOKEN"}
).json()
if policy['last_deleted_count'] > 10000:
    print("Warning: Large deletion detected!")

6. Use Measurement-Specific Policies

Create granular policies for different data types:

# High-frequency metrics - shorter retention
{"measurement": "cpu", "retention_days": 30}

# Business metrics - longer retention
{"measurement": "revenue", "retention_days": 730}

Important Limitations

Local Storage Only

Currently, retention policies only work with local filesystem storage. Cloud storage backends (S3, MinIO, GCS) are not yet implemented.

File-Level Granularity

Retention operates at the file level, not row level. A file is only deleted if all rows are older than the cutoff date.

Caution: For optimal retention policy effectiveness, ensure your data is properly compacted. Files with mixed timestamps may not be eligible for deletion.
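
To illustrate the file-level rule, here is a conceptual sketch (not Arc's internal code) of the eligibility check: a file is deleted only when its maximum timestamp falls before the cutoff.

from datetime import datetime

cutoff = datetime(2023, 10, 22)

# Hypothetical per-file maximum timestamps, as read from Parquet metadata
files = {
    "2023-10-15.parquet": datetime(2023, 10, 15, 23, 59),       # all rows older -> deleted
    "2023-10-20.parquet": datetime(2023, 10, 20, 12, 0),        # all rows older -> deleted
    "2023-10-21_to_11-05.parquet": datetime(2023, 11, 5),       # mixed range -> kept whole
}

eligible = [name for name, max_ts in files.items() if max_ts < cutoff]
print(eligible)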

No Rollback

Deleted data cannot be recovered. Always:

  1. Use dry-run mode first
  2. Maintain backups of critical data
  3. Test in non-production environments

Sequential Processing

Retention policies process measurements sequentially. Large databases may take time to process.

Works Best with Compacted Files

Retention policies are most effective when files contain data from similar time periods. Enable automatic compaction for better results.

Troubleshooting

No Files Being Deleted

Problem: Dry run shows 0 files to delete.

Solutions:

  • Check that data actually exists older than retention_days + buffer_days
  • Verify the policy targets the correct database and measurement
  • Ensure files are fully older than the cutoff (file-level granularity)

Policy Not Executing

Problem: Manual execution returns an error.

Solutions:

  • Verify the policy is_active is set to true
  • Check that confirm: true is set for actual execution
  • Ensure you have write permissions on the data directory

Unexpected File Count

Problem: More/fewer files than expected are being deleted.

Solutions:

  • Remember: Only files where all rows are older than cutoff are deleted
  • Check file timestamps using ls -l on the measurement directory
  • Review recent compaction activity that may have merged files