Performance Benchmarks

Arc delivers industry-leading performance for time-series workloads. This page documents our benchmark methodology and results.

Summary

| Metric | Result |
| --- | --- |
| Ingestion (MessagePack) | 9.47M records/sec |
| Query (Arrow) | 2.88M rows/sec |
| Query (JSON) | 2.23M rows/sec |
| Line Protocol | 1.92M records/sec |

Test Hardware

Apple MacBook Pro (2023)

  • CPU: M3 Max (14 cores)
  • RAM: 36GB unified memory
  • Storage: 1TB NVMe SSD

Ingestion Performance

Arc's fastest ingestion path uses binary MessagePack with a columnar data layout.

| Metric | Value |
| --- | --- |
| Throughput | 9.47M records/sec |
| p50 Latency | 2.79ms |
| p95 Latency | 4.66ms |
| p99 Latency | 6.11ms |
| Workers | 35 |
| Duration | 60 seconds |
| Success Rate | 100% |

Line Protocol

An InfluxDB-compatible text protocol, suitable for existing tooling.

| Metric | Value |
| --- | --- |
| Throughput | 1.92M records/sec |
| p50 Latency | 49.53ms |
| p99 Latency | 108.53ms |
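
Because the format is InfluxDB-compatible, existing clients can usually be pointed at Arc unchanged. The sketch below is illustrative only: the endpoint path and port are assumptions, so check Arc's API reference for the real write URL.

    import requests

    # Hypothetical endpoint and port; Arc accepts InfluxDB-style
    # line protocol, one record per text line.
    lines = (
        "cpu,host=web-1 usage=42.1 1700000000000000000\n"
        "cpu,host=web-2 usage=38.4 1700000000000000000\n"
    )
    resp = requests.post("http://localhost:8080/write", data=lines)
    resp.raise_for_status()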

Protocol Comparison

| Protocol | Throughput | Relative Speed |
| --- | --- | --- |
| MessagePack Columnar | 9.47M rec/s | 100% (baseline) |
| Line Protocol | 1.92M rec/s | 20% |

Why MessagePack is faster:

  • Binary format (no parsing overhead)
  • Columnar layout matches Parquet storage
  • Native gzip compression support
  • Batch-optimized for high throughput
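
To make the difference concrete, here is a minimal sketch contrasting the two encodings. The "m"/"columns" key names mirror the snippet under Performance Tips below; the exact payload schema is defined by Arc's write API, so treat this as an illustration.

    import msgpack  # pip install msgpack

    # Columnar MessagePack: one parallel array per field, packed once.
    payload = {
        "m": "cpu",
        "columns": {
            "time": [1700000000000, 1700000001000, 1700000002000],
            "host": ["web-1", "web-1", "web-1"],
            "usage": [42.1, 43.7, 41.9],
        },
    }
    packed = msgpack.packb(payload)

    # The same records as line protocol: one text line each, which the
    # server must tokenize, parse, and pivot back into columns.
    lines = "\n".join(
        f"cpu,host={h} usage={u} {t}"
        for t, h, u in zip(*payload["columns"].values())
    )

    print(len(packed), len(lines.encode()))  # binary columnar is denser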

Query Performance

Arrow IPC vs JSON

| Format | Throughput | Response Size (50K rows) |
| --- | --- | --- |
| Arrow IPC | 2.88M rows/s | 1.71 MB |
| JSON | 2.23M rows/s | 2.41 MB |

Arrow advantages:

  • Zero-copy conversion to Pandas/Polars
  • 29% smaller response payload
  • Native columnar format
  • Ideal for large result sets (10K+ rows)
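
For example, a client can hand the Arrow IPC response straight to pyarrow and on to Pandas without row-by-row decoding. The /api/v1/query/arrow path comes from the tips below; the request body shape and port are assumptions for illustration.

    import pyarrow as pa  # pip install pyarrow
    import requests

    # Body shape and port are illustrative; see Arc's query API docs.
    resp = requests.post(
        "http://localhost:8080/api/v1/query/arrow",
        json={"query": "SELECT * FROM cpu WHERE time > now() - INTERVAL '1 hour'"},
    )
    resp.raise_for_status()

    # Read the Arrow IPC stream directly from the response bytes.
    table = pa.ipc.open_stream(resp.content).read_all()
    df = table.to_pandas()  # zero-copy for most fixed-width columns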

Go vs Python Implementation

Arc was rewritten from Python to Go, delivering significant improvements:

| Metric | Go | Python | Improvement |
| --- | --- | --- | --- |
| Ingestion | 9.47M rec/s | 4.21M rec/s | +125% |
| Memory Stability | Stable | 372MB leak/500 queries | Fixed |
| Deployment | Single binary | Multi-worker processes | Simpler |
| Cold Start | <100ms | 2-3 seconds | 20x faster |

Why Go is Faster

  1. Stable Memory: Go's GC returns memory to OS. Python leaked memory under sustained load.
  2. Native Concurrency: Goroutines handle thousands of connections with minimal overhead.
  3. Single Binary: No Python interpreter or dependency management.
  4. Production GC: Sub-millisecond pause times at scale.

ClickBench Results

ClickBench is an industry-standard analytical query benchmark, run here against the hits dataset.

Test Environment:

  • Instance: AWS c6a.4xlarge (16 vCPU, 32GB RAM)
  • Dataset: 100M rows (14GB Parquet)
  • Queries: 43 analytical queries

| Run | Total Time | Queries |
| --- | --- | --- |
| Cold (cache flushed) | 120.25s | 43 |
| Warm | 35.70s | 43 |

Comparison with Other TSDBs

| Database | Warm Run | Relative Speed |
| --- | --- | --- |
| Arc | 35.70s | 1.00x (baseline) |
| QuestDB | 64.26s | 1.80x slower |
| TimescaleDB | 335.22s | 9.39x slower |

Arc is 1.80x faster than QuestDB and 9.39x faster than TimescaleDB in analytical workloads.

Detailed Query Performance

All 43 analytical queries completed successfully:

| Query | Run 1 (Cold) | Run 2 | Run 3 (Best) | Speedup |
| --- | --- | --- | --- | --- |
| Q0 | 0.0656s | 0.0493s | 0.0372s | 1.76x |
| Q1 | 0.0788s | 0.0593s | 0.0628s | 1.25x |
| Q2 | 0.1617s | 0.1006s | 0.0838s | 1.93x |
| Q3 | 0.3933s | 0.1135s | 0.0866s | 4.54x |
| Q4 | 1.0929s | 0.3696s | 0.3703s | 2.95x |
| ... | ... | ... | ... | ... |

Why Arc is Fast

1. DuckDB Query Engine

Arc leverages DuckDB's columnar execution engine:

  • Vectorized execution: Process thousands of values per CPU instruction
  • Parallel query execution: Utilize all CPU cores automatically
  • Advanced optimizations: Join reordering, predicate pushdown, filter pushdown
  • SIMD instructions: Use modern CPU features (AVX2, AVX-512)

2. Parquet Columnar Storage

  • Columnar format: Read only columns needed for queries
  • Compression: 80% smaller than raw data (Snappy/ZSTD)
  • Predicate pushdown: Skip entire row groups based on statistics
  • Efficient scans: DuckDB reads Parquet natively
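
The combination is easy to observe with DuckDB directly. The sketch below uses a hypothetical Parquet path (Arc's on-disk partition layout may differ); DuckDB scans only the referenced columns and skips row groups whose min/max statistics rule out the time filter.

    import duckdb  # pip install duckdb

    # Hypothetical path; illustrates column projection and
    # statistics-based row-group skipping on Parquet.
    result = duckdb.sql("""
        SELECT host, avg(usage) AS avg_usage
        FROM read_parquet('data/cpu/*.parquet')
        WHERE time >= TIMESTAMP '2024-01-01 00:00:00'  -- predicate pushdown
        GROUP BY host                                  -- parallel aggregation
    """).fetchall()
    print(result)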

3. Go Runtime Efficiency

  • Stable memory: Go's GC returns memory to OS
  • Native concurrency: Goroutines handle thousands of connections
  • Single binary: No interpreter overhead
  • Sub-ms GC pauses: Production-ready garbage collection

Performance Tips

Maximize Ingestion Throughput

  1. Use MessagePack columnar format

    data = {"m": "cpu", "columns": {...}}  # 9.47M rec/s
    # vs
    data = "cpu,host=x value=1" # 1.92M rec/s
  2. Batch your writes (10,000+ records per request)

  3. Enable gzip compression for network efficiency

  4. Use multiple concurrent workers (35 was optimal on the M3 Max test hardware)
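
Putting these tips together, a minimal write loop might look like the sketch below. The endpoint, port, and payload keys are assumptions for illustration; check Arc's write API for the exact contract.

    import gzip
    import msgpack
    import requests

    URL = "http://localhost:8080/api/v1/write/msgpack"  # hypothetical endpoint

    def write_batch(times, hosts, values):
        # Tip 1: columnar MessagePack payload.
        payload = {"m": "cpu", "columns": {"time": times, "host": hosts, "usage": values}}
        # Tip 3: gzip the body for network efficiency.
        body = gzip.compress(msgpack.packb(payload))
        resp = requests.post(
            URL,
            data=body,
            headers={"Content-Type": "application/msgpack", "Content-Encoding": "gzip"},
        )
        resp.raise_for_status()

    # Tip 2: batch 10,000+ records per request.
    n = 10_000
    write_batch(
        times=[1_700_000_000_000 + i for i in range(n)],
        hosts=["web-1"] * n,
        values=[42.0] * n,
    )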

Maximize Query Throughput

  1. Use Arrow format for large result sets (10K+ rows)

    response = requests.post(url + "/api/v1/query/arrow", ...)
  2. Enable compaction for query optimization

    [compaction]
    enabled = true
  3. Use time-range filters (partition pruning)

    WHERE time > now() - INTERVAL '1 hour'

Scaling Characteristics

Vertical Scaling

  • CPU: Near-linear scaling with core count
  • Memory: Auto-configured to ~50% system RAM

Storage Backend Impact

| Backend | Write Overhead | Query Overhead |
| --- | --- | --- |
| Local NVMe | Baseline | Baseline |
| MinIO (local) | +5-10% | +2-5% |
| AWS S3 | +20-30% | +10-20% |

Reproducibility

Run benchmarks locally:

git clone https://github.com/basekick-labs/arc.git
cd arc
make bench
