Private · High Availability

HA topology · DB primary-replica · LLM multi-vendor failover

Eliminate single points of failure. High availability has to be designed across the application layer, the data layer, and the model layer at the same time; hardening only one of them leaves the others as the weak link.

Three-Layer HA Checklist

Application layer
☑ Multiple Evose Web/API replicas (≥ 3)
☑ Multiple Worker replicas (≥ 2)
☑ Active-active Ingress / load balancer

Data layer
☑ MySQL primary-replica (synchronous replication) + auto failover
☑ PostgreSQL primary-replica + pgvector replication
☑ Redis primary-backup / sentinel / cluster
☑ Object storage cross-region replication

Model layer
☑ Multi-LLM-vendor integration (Failover)
☑ Multi-instance self-hosted LLM (Round Robin)
☑ Multi-instance Embedding / Reranking

Application Layer

Recommended K8s configuration:

| Component | Replicas | PDB (PodDisruptionBudget) |
|---|---|---|
| evose-api | ≥ 3 | minAvailable: 2 |
| evose-worker | ≥ 2 | minAvailable: 1 |
| evose-web | ≥ 2 | minAvailable: 1 |
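
As an illustration of the table above, the following sketch generates one `policy/v1` PodDisruptionBudget manifest per component. The `app` label key, the `evose` namespace, and the `-pdb` name suffix are assumptions made for the example, not values documented by Evose; adjust them to match the labels on your actual Deployments.

```python
import json

# Replica counts and minAvailable values taken from the table above.
COMPONENTS = {
    "evose-api": {"replicas": 3, "min_available": 2},
    "evose-worker": {"replicas": 2, "min_available": 1},
    "evose-web": {"replicas": 2, "min_available": 1},
}


def build_pdb(name, min_available, namespace="evose"):
    """Return a policy/v1 PodDisruptionBudget manifest as a dict.

    Assumption: pods carry the label app=<component name>.
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{name}-pdb", "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": name}},
        },
    }


def build_all_pdbs():
    """Build PDBs for every component, rejecting budgets that block all evictions."""
    pdbs = []
    for name, cfg in COMPONENTS.items():
        # If minAvailable >= replicas, no pod can ever be evicted voluntarily,
        # which stalls node drains and rolling upgrades.
        if cfg["min_available"] >= cfg["replicas"]:
            raise ValueError(f"{name}: minAvailable must be below replicas")
        pdbs.append(build_pdb(name, cfg["min_available"]))
    return pdbs


if __name__ == "__main__":
    # Emit a v1 List as JSON, which kubectl can read from a file or stdin.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": build_all_pdbs()}, indent=2))
```

Keeping `minAvailable` one below the replica count lets a node drain evict one pod at a time while the remaining pods keep serving traffic.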

Data Layer

MySQL

| Option | Best for |
|---|---|
| Primary-replica + manual failover | Small-mid scale |
| MGR (Group Replication) | Recommended, production |
| MySQL InnoDB Cluster | Large scale, mature ops needed |
| Aliyun RDS / cloud DB | Managed, least overhead |

PostgreSQL + pgvector

| Option | Best for |
|---|---|
| Streaming replication + Patroni | Recommended |
| Citus sharding | Large-scale knowledge bases |
| Aliyun PolarDB / cloud DB | Managed |

pgvector primary-replica

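Vector data replicates to the replica through streaming replication. pgvector writes HNSW index changes to the WAL, so with physical streaming replication the index is replicated along with the data and does not need a separate rebuild; with logical replication, the index has to be created on the subscriber itself. Use replicas for read traffic; at large scale, consider Citus.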

Redis

| Option | Best for |
|---|---|
| Primary-backup + sentinel | Default |
| Redis Cluster | Large scale / cross-DC |

Model-Layer HA

Multi-Vendor Failover

The interface platform registers the same model under multiple vendors and assigns each a priority. Requests go to the highest-priority vendor that is available:

gpt-4-turbo
├─ OpenAI US-West       → priority 1
├─ Azure China          → priority 2 (used when primary is down)
└─ Self-hosted OpenAI-compatible → priority 3 (used when all SaaS down)
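
The priority order above can be sketched as a simple fallback loop. This is a minimal illustration, not Evose's actual routing code: the `Provider` type, the `call` signature, and the `AllProvidersFailed` exception are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Provider:
    name: str
    priority: int                 # lower number = tried first
    call: Callable[[str], str]    # hypothetical: prompt in, completion out


class AllProvidersFailed(Exception):
    """Raised when every registered provider has failed."""


def complete_with_failover(providers: List[Provider], prompt: str) -> Tuple[str, str]:
    """Try providers in priority order; return (provider name, completion)."""
    errors = []
    for provider in sorted(providers, key=lambda p: p.priority):
        try:
            return provider.name, provider.call(prompt)
        except Exception as exc:
            # Broad catch for brevity; real code would likely fail over only
            # on timeouts, connection errors, and 5xx/rate-limit responses.
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

A production router would usually add a per-vendor timeout and a short cooldown after a failure, so that a vendor that is down is not retried on every request.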

Self-Hosted Multi-Instance

Model deployment:

qwen-max:
  ├─ Instance 1: 10.0.0.5:8000  (replicas 2)
  ├─ Instance 2: 10.0.0.6:8000  (replicas 2)
  └─ Instance 3: 10.0.0.7:8000  (replicas 2)
Strategy: Round Robin + health checks
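
A minimal sketch of the "Round Robin + health checks" strategy, assuming a caller-supplied `is_healthy` probe. The `RoundRobinPool` class is a hypothetical name for illustration; how Evose performs the health check internally is not specified here.

```python
from typing import Callable, List


class RoundRobinPool:
    """Rotate through instances in order, skipping any that fail the health probe."""

    def __init__(self, instances: List[str], is_healthy: Callable[[str], bool]):
        self._instances = list(instances)
        self._is_healthy = is_healthy
        self._next = 0  # index where the next pick starts searching

    def pick(self) -> str:
        n = len(self._instances)
        # Check each instance at most once, starting from the rotation cursor.
        for offset in range(n):
            idx = (self._next + offset) % n
            candidate = self._instances[idx]
            if self._is_healthy(candidate):
                self._next = (idx + 1) % n
                return candidate
        raise RuntimeError("no healthy instance available")


if __name__ == "__main__":
    down = {"10.0.0.6:8000"}  # simulate Instance 2 failing its health check
    pool = RoundRobinPool(
        ["10.0.0.5:8000", "10.0.0.6:8000", "10.0.0.7:8000"],
        is_healthy=lambda addr: addr not in down,
    )
    print([pool.pick() for _ in range(4)])
```

In practice the probe result is usually cached and refreshed in the background, so that a health check does not add latency to every request.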

Cross-DC / Multi-Region

This is an advanced setup. Recommended topology:

Primary DC                       DR DC
─────                            ──────
Full Evose stack + DB primary    Full Evose stack + DB replica
        │                              ↑
        └─── async DB cross-region replication ───┘

A DNS switch or global load balancer (GLB) fails traffic over to the DR DC automatically, within the target RPO/RTO.

Cross-region DR RPO

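Async replication carries a data-loss risk: writes committed on the primary but not yet shipped to the DR replica are lost on failover, so the RPO is greater than zero. Strict-compliance scenarios (finance, healthcare) should evaluate synchronous replication and weigh its added write latency.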

Failure Drills

| Drill | Frequency | Target RTO |
|---|---|---|
| Single API Pod failure | Monthly | < 30s |
| Primary MySQL failure | Quarterly | < 5min |
| Single LLM vendor failure | Monthly | < 10s |
| Whole DC failure | Semi-annually | < 30min |
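
One way to score a drill against the targets above is to record timestamped health probes during the exercise and compute the observed recovery time. The sketch below assumes probes are collected as `(timestamp_seconds, ok)` pairs; the drill keys and function names are hypothetical and chosen for the example.

```python
from typing import Iterable, Optional, Tuple

# Target RTOs from the table above, in seconds.
TARGET_RTO_SECONDS = {
    "single_api_pod": 30,
    "primary_mysql": 5 * 60,
    "single_llm_vendor": 10,
    "whole_dc": 30 * 60,
}


def measure_rto(probes: Iterable[Tuple[float, bool]]) -> Optional[float]:
    """Return seconds from the first failed probe to the next successful one.

    Returns 0.0 if no probe failed, and None if the service never recovered
    within the recorded window. Probes must be in time order.
    """
    outage_start = None
    for timestamp, ok in probes:
        if not ok and outage_start is None:
            outage_start = timestamp
        elif ok and outage_start is not None:
            return timestamp - outage_start
    return 0.0 if outage_start is None else None


def drill_passed(drill: str, probes: Iterable[Tuple[float, bool]]) -> bool:
    """True if the observed recovery time is strictly below the drill's target."""
    rto = measure_rto(probes)
    return rto is not None and rto < TARGET_RTO_SECONDS[drill]
```

The measured value is only as precise as the probe interval, so the interval should be well below the target being tested (for example, one-second probes for the 10s LLM vendor target).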
