# Tuning Guide

This guide helps you optimize EntryTarget's performance for your specific workload.

## Batch Engine Tuning

The batch engine has two primary controls:

### LEDGER\_BATCH\_SIZE

Maximum transactions per batch. The batch flushes when this number of transactions has accumulated.

| Setting | Effect                                                     |
| ------- | ---------------------------------------------------------- |
| Lower   | Smaller batches, lower latency, lower throughput           |
| Higher  | Larger batches, higher throughput, slightly higher latency |

**Guidelines:**

* Start with **150** for most workloads
* Increase to **200-300** for high-throughput scenarios
* The optimal setting depends on your RDS instance — larger instances handle larger batches better

### LEDGER\_BATCH\_TIMEOUT\_MS

Maximum wait time (milliseconds) before flushing a batch, even if `BATCH_SIZE` hasn't been reached.

| Setting | Effect                                                       |
| ------- | ------------------------------------------------------------ |
| Lower   | Lower latency at low volume, more frequent (smaller) commits |
| Higher  | Better batching at low volume, higher latency floor          |

**Guidelines:**

* **15ms** is a good default — balances latency and throughput
* **20ms** for workloads that prioritize throughput over latency
* **10ms** for latency-sensitive workloads with consistently high volume

### How They Interact

The batch flushes when **either** condition is met:

```
if transactions_queued >= BATCH_SIZE or time_since_last_flush >= BATCH_TIMEOUT_MS:
    flush()
```

* **High volume:** batches fill to `BATCH_SIZE` before the timeout — timeout is irrelevant
* **Low volume:** timeout triggers the flush — smaller batches are committed more frequently
* **Burst traffic:** batches fill rapidly, achieving maximum throughput
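The flush rule and its effect at different volumes can be sketched in Python. This is an illustrative model rather than EntryTarget's implementation: the function names are hypothetical, and `expected_batch_size` assumes a steady, evenly spaced arrival rate.

```python
def should_flush(queued: int, elapsed_ms: float, batch_size: int, timeout_ms: float) -> bool:
    """Return True when either flush condition is met (size reached or timeout elapsed)."""
    return queued >= batch_size or elapsed_ms >= timeout_ms


def expected_batch_size(tps: float, batch_size: int, timeout_ms: float) -> float:
    """Approximate transactions per flush, assuming a steady arrival rate."""
    # Transactions that arrive during one timeout window.
    arrivals_per_window = tps * timeout_ms / 1000.0
    # A batch can never exceed the configured size.
    return min(float(batch_size), arrivals_per_window)
```

Under this model, 2,000 TPS with a 15 ms timeout yields about 30 transactions per window, so the timeout triggers the flush. With `BATCH_SIZE` at 150, the size limit only takes over once sustained volume approaches 10,000 TPS.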

## Connection Pool Tuning

### LEDGER\_DB\_POOL\_SIZE

Sets the maximum number of connections per pool. The master and replica pools are sized independently, so each can grow up to this value.

| Setting  | Effect                                                      |
| -------- | ----------------------------------------------------------- |
| Too low  | Connection contention, increased latency, possible timeouts |
| Too high | Wastes RDS connections, minimal benefit                     |

**Guidelines:**

* **10** for low-volume workloads (Tier 1)
* **20** for medium-volume workloads (Tier 2)
* **40** for high-volume workloads (Tier 3)

Ensure your RDS `max_connections` accommodates `2 * POOL_SIZE` plus overhead.
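As a rough budgeting aid, the rule above can be written as a small helper. The `app_instances` and `overhead` parameters and their defaults are assumptions for illustration; the source only specifies `2 * POOL_SIZE` plus overhead.

```python
def required_db_connections(pool_size: int, app_instances: int = 1, overhead: int = 10) -> int:
    """Connections to budget: two pools (master and replica) per instance, plus headroom.

    `app_instances` and `overhead` are hypothetical knobs, not EntryTarget settings.
    """
    return 2 * pool_size * app_instances + overhead
```

For example, a single instance with a pool size of 40 and 10 connections of headroom would need `max_connections` of at least 90 under these assumptions.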

## Idempotency Tuning

### LEDGER\_IDEMPOTENCY\_TTL\_HOURS

How long idempotency records are kept.

* **1 hour** is recommended for most workloads
* Increase if your system needs a longer reconciliation window
* Longer TTLs mean more records in the table — monitor `idempotency_record` size

### LEDGER\_IDEMPOTENCY\_CLEANUP\_INTERVAL\_MIN

How often expired records are cleaned up.

* **5 minutes** is recommended
* The first cleanup runs one full interval after startup
* More frequent cleanup keeps the table smaller but adds slight overhead
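To get a feel for how the TTL and cleanup interval affect table size, here is a back-of-the-envelope estimate. It assumes one idempotency record per transaction and a steady request rate, neither of which is confirmed by this guide, and the function name is hypothetical.

```python
def estimated_idempotency_records(tps: float, ttl_hours: float, cleanup_interval_min: float) -> int:
    """Upper-bound estimate of rows in the idempotency table.

    Assumes one record per transaction. A record can outlive its TTL by up to
    one cleanup interval before it is deleted.
    """
    retention_seconds = ttl_hours * 3600 + cleanup_interval_min * 60
    return int(tps * retention_seconds)
```

At 100 TPS with a 1 hour TTL and a 5 minute cleanup interval, this gives roughly 390,000 records. Treat the result as a sizing hint and confirm against the actual `idempotency_record` table.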

## Client-Side Tuning

### Concurrency

Match your client concurrency to `BATCH_SIZE`:

```
Optimal client concurrency ≈ BATCH_SIZE
```

Sending more concurrent requests than `BATCH_SIZE` only increases queue depth; it does not increase throughput.
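One way to cap client concurrency is a semaphore around each request. The sketch below uses `asyncio.Semaphore` from the Python standard library; `run_bounded` is a hypothetical helper, and each job is a zero-argument coroutine function that you would replace with your own request call.

```python
import asyncio


async def run_bounded(jobs, limit: int):
    """Run coroutine factories with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job):
        # Wait for a free slot before starting the request.
        async with semaphore:
            return await job()

    # Results are returned in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Setting `limit` to your `BATCH_SIZE` keeps the number of in-flight requests aligned with the guideline above.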

### Connection Keep-Alive

Use HTTP keep-alive connections to avoid TCP handshake overhead. Most HTTP clients do this by default.

### Retry Strategy

For timeout errors:

1. Check `GET /idempotency/<key>` before retrying, since the original request may have been committed despite the timeout
2. Use exponential backoff (100ms, 200ms, 400ms...)
3. Maximum 3 retries
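The steps above can be sketched as a small retry wrapper. The `send` and `check_idempotency` callables are hypothetical and supplied by the caller: `check_idempotency` stands in for `GET /idempotency/<key>` and is assumed to return the stored result, or `None` when the key is unknown. The actual response format is not described in this guide.

```python
import time


def submit_with_retry(send, check_idempotency, key,
                      max_retries=3, base_delay_s=0.1, sleep=time.sleep):
    """Send once, then retry timeouts with exponential backoff.

    Before each retry, look up the idempotency key so a request that already
    committed is not sent again.
    """
    try:
        return send(key)
    except TimeoutError:
        pass
    for attempt in range(max_retries):
        # The original request may have committed despite the timeout.
        existing = check_idempotency(key)
        if existing is not None:
            return existing
        sleep(base_delay_s * 2 ** attempt)  # 0.1 s, 0.2 s, 0.4 s
        try:
            return send(key)
        except TimeoutError:
            continue
    raise TimeoutError(f"request {key!r} still timing out after {max_retries} retries")
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real delays.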

## Monitoring Performance

Use these Prometheus queries to evaluate your tuning:

```promql
# Is your batch size well-utilized?
rate(ledger_batch_size_sum[5m]) / rate(ledger_batch_size_count[5m])
# Should be close to BATCH_SIZE during peak hours

# Is the commit time reasonable?
histogram_quantile(0.99, rate(ledger_batch_commit_ms_bucket[5m]))
# Should be < 20ms on well-sized RDS instances

# What's the actual throughput?
rate(ledger_transactions_success_total[1m])

# Is latency acceptable?
histogram_quantile(0.99, rate(ledger_http_duration_ms_bucket{path="/transaction"}[5m]))
```

## Tuning Checklist

1. Set `BATCH_SIZE` based on your expected peak TPS
2. Set `BATCH_TIMEOUT_MS` based on your latency tolerance
3. Set `DB_POOL_SIZE` based on your RDS instance tier
4. Match client concurrency to `BATCH_SIZE`
5. Deploy and monitor via Prometheus/Grafana
6. Adjust based on actual metrics


