Mitra-availability state (online flag, deactivated flag, per-mitra session
count, heartbeat liveness) mirrored into Valkey so the customer beacon +
pairing blast + dashboard counts no longer hit Postgres on the hot path.
Postgres remains the durable source of truth; Valkey state is fully
derivable via seedFromPostgres on startup + reconnect.

Schema
- mitras:online SET - mirror of is_online
- mitras:deactivated SET - mirror of is_active=false
- mitra:capacity:<id> STRING - active+pending_payment session count
- mitra:heartbeat:<id> STRING - ISO timestamp of last ping
- availability:snapshot JSON - beacon cache, TTL 10s, cluster-shared

Write paths (Postgres first, best-effort Valkey)
- setOnline/setOffline mirror SADD/SREM + heartbeat SET/DEL
- updateMitraStatus mirrors mitras:deactivated AND revokes auth_sessions
  on deactivate (bounds the "ghost online" window to access-token TTL)
- heartbeat is Valkey-only on the hot path; the per-ping Postgres UPDATE
  on last_heartbeat_at is eliminated (was 1,200 ops/min at prod scale)
- chat_session lifecycle (accept/end/reroute/extension/expiry) calls
  recomputeCapacityForMitra after each UPDATE; derive-from-truth avoids
  the bookkeeping risk of per-transition INCR/DECR

Read paths (Valkey-first, Postgres fallback on Valkey error)
- isMitraReachable: SISMEMBER mitras:online + heartbeat freshness
- findAvailableMitras: SDIFF + pipelined GETs, filter by capacity +
  heartbeat
- countAvailableMitrasFromCache: Valkey-driven, cached cluster-wide 10s TTL
- dashboard online count: SCARD
- Each reader wraps Valkey ops in try/catch → Postgres fallback on outage

Heartbeat path on /api/mitra/status/heartbeat
- resolveMitra preHandler replaced with heartbeatGuard: SISMEMBER on
  mitras:deactivated (~0 DB hits per ping). Falls back to full DB
  resolveMitra if Valkey is unreachable so a Valkey outage doesn't
  silently accept heartbeats from deactivated mitras.

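The findAvailableMitras read path above (SDIFF, then filter by capacity and heartbeat) can be sketched as a pure function over values already fetched from Valkey. This is only an illustration, not the code from this commit: `selectAvailableMitras`, `maxSessions`, and `staleAfterMs` are hypothetical names, and treating a missing capacity key as zero sessions is an assumption.

```javascript
// Hypothetical sketch: selection logic over data already read from Valkey.
// `online` / `deactivated` stand in for the mitras:online and
// mitras:deactivated SETs; `capacity` and `heartbeats` stand in for the
// per-mitra STRING keys (session count and ISO timestamp of the last ping).
function selectAvailableMitras({ online, deactivated, capacity, heartbeats, maxSessions, staleAfterMs, now }) {
  const available = []
  for (const id of online) {
    // Equivalent of SDIFF mitras:online mitras:deactivated.
    if (deactivated.has(id)) continue

    // Assumption: a missing capacity key means zero open sessions.
    const sessions = Number(capacity.get(id) ?? 0)
    if (sessions >= maxSessions) continue

    // A missing or unparseable heartbeat counts as stale.
    const lastPing = Date.parse(heartbeats.get(id) ?? '')
    if (Number.isNaN(lastPing) || now - lastPing > staleAfterMs) continue

    available.push(id)
  }
  return available
}
```

Keeping the filter pure makes it testable without a Valkey instance. The real reader would still need the SDIFF and pipelined GETs, plus the Postgres fallback described above.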
Three sweeps, env-configurable cadences
- MITRA_AUTO_OFFLINE_SWEEP_SECONDS (30) - Valkey-driven stale detection
- HEARTBEAT_MIRROR_INTERVAL_SECONDS (60) - batched UPSERT writes Valkey
  timestamps to Postgres last_heartbeat_at via UNNEST (1 statement per
  cycle, idempotent across instances)
- VALKEY_ONLINE_MIRROR_SWEEP_SECONDS (300) - periodic reseed heals drift

Startup
- restoreActiveTimers → seedFromPostgres → bind listeners
- onValkeyReady re-runs the seed on every reconnect (cold start + reseed
  on Valkey restart, no manual intervention)

Failure semantics
- Read fallback: every Valkey read wrapped, falls back to existing
  Postgres JOIN query; system stays correct during Valkey outage,
  performance degrades not breaks
- Write best-effort: Postgres write commits before Valkey is touched;
  Valkey errors log + continue; reconciliation sweep heals drift
- Auto-offline sweep aborts entirely on Valkey error (does NOT
  mass-offline via Postgres scan during Valkey hiccup)

Tests
- New: 32 integration tests in mitra-status.valkey-mirror.test.js
  covering seed, write-through, fallbacks, capacity lifecycle,
  auto-offline sweep, heartbeat mirror, deactivation flow, beacon cache
- Updated: fixtures.js seeds Valkey alongside Postgres when isOnline=true
- Updated: helpers/db.js resetDb also flushes test Valkey
- Fixed 2 pre-existing session-timer flakes (string IDs failed uuid
  parse; vi.advanceTimersByTimeAsync raced real Postgres I/O)
- All 124/124 backend tests pass (was 90/92)

Docs
- requirement/valkey-online-mirror-plan.md - canonical plan
- requirement/valkey-online-mirror-testing.md - manual E2E checklist
- requirement/deployment.md - infra + Valkey persistence guidance for
  prod (Memorystore Standard tier recommended; migration from
  self-hosted Valkey is zero-downtime via reseed-from-Postgres)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

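The auto-offline sweep's failure rule (abort on Valkey error instead of mass-offlining) might look roughly like the sketch below. Every name here is hypothetical: `store` stands in for a Valkey wrapper, `markOffline` for the Postgres-first setOffline path, and `staleAfterMs` for whatever staleness threshold the service uses. None of them come from the commit.

```javascript
// Hypothetical sketch; names and signatures are illustrative only.

// Pure stale detection: returns the ids whose last ping is missing,
// unparseable, or older than staleAfterMs. heartbeats[i] is the ISO
// timestamp for ids[i], or null when the key does not exist.
function findStaleMitras(ids, heartbeats, staleAfterMs, now) {
  return ids.filter((id, i) => {
    const lastPing = Date.parse(heartbeats[i] ?? '')
    return Number.isNaN(lastPing) || now - lastPing > staleAfterMs
  })
}

// One sweep cycle over the online set.
async function runAutoOfflineSweep({ store, markOffline, staleAfterMs, now = Date.now() }) {
  let ids
  let heartbeats
  try {
    ids = await store.onlineIds()            // e.g. members of mitras:online
    heartbeats = await store.heartbeats(ids) // e.g. pipelined GETs, same order as ids
  } catch (err) {
    // Abort the whole cycle: without heartbeat data, stale and healthy
    // mitras are indistinguishable, so nobody is marked offline.
    return { aborted: true, offlined: [] }
  }

  const stale = findStaleMitras(ids, heartbeats, staleAfterMs, now)
  for (const id of stale) {
    await markOffline(id)
  }
  return { aborted: false, offlined: stale }
}
```

Returning an `aborted` flag rather than throwing lets a scheduler skip a cycle and retry at the next interval. That matches the "hiccup" framing above.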
63 lines
2.6 KiB
JavaScript
import { getDb } from '../db/client.js'
import * as valkey from '../plugins/valkey.js'
import { VK_MITRAS_ONLINE } from './mitra-status.service.js'
import { SessionStatus, TopicSensitivity } from '../constants.js'

const sql = getDb()

// Valkey-fast SCARD with Postgres fallback. The CC dashboard polls every few
// seconds; SCARD is sub-ms so this keeps the dashboard responsive at any scale.
const getOnlineMitrasCount = async () => {
  try {
    return await valkey.scard(VK_MITRAS_ONLINE)
  } catch (err) {
    console.warn('[dashboard] valkey unavailable, falling back to DB:', err.message)
    const [{ c }] = await sql`SELECT COUNT(*)::int AS c FROM mitra_online_status WHERE is_online = true`
    return c
  }
}

export const getDashboardStats = async () => {
  // The counts are independent of each other, so run them concurrently.
  const [
    [{ active_chats }],
    online_mitras,
    [{ pending_requests }],
    [{ sensitive_total }],
    [{ sensitive_last_30d_total }],
    [{ sensitive_last_30d_sensitive }],
  ] = await Promise.all([
    sql`SELECT COUNT(*) AS active_chats FROM chat_sessions WHERE status IN (${SessionStatus.ACTIVE}, ${SessionStatus.PENDING_PAYMENT})`,
    getOnlineMitrasCount(),
    sql`SELECT COUNT(*) AS pending_requests FROM chat_sessions WHERE status IN (${SessionStatus.SEARCHING}, ${SessionStatus.PENDING_ACCEPTANCE})`,
    sql`SELECT COUNT(*) AS sensitive_total FROM chat_sessions WHERE topic_sensitivity = ${TopicSensitivity.SENSITIVE}`,
    sql`SELECT COUNT(*) AS sensitive_last_30d_total FROM chat_sessions WHERE created_at >= NOW() - INTERVAL '30 days'`,
    sql`SELECT COUNT(*) AS sensitive_last_30d_sensitive FROM chat_sessions WHERE created_at >= NOW() - INTERVAL '30 days' AND topic_sensitivity = ${TopicSensitivity.SENSITIVE}`,
  ])

  // Open-session load per online mitra, busiest first.
  const customersPerMitra = await sql`
    SELECT m.id, m.display_name,
      (SELECT COUNT(*) FROM chat_sessions cs
        WHERE cs.mitra_id = m.id AND cs.status IN (${SessionStatus.ACTIVE}, ${SessionStatus.PENDING_PAYMENT})) AS active_session_count
    FROM mitras m
    INNER JOIN mitra_online_status s ON s.mitra_id = m.id
    WHERE s.is_online = true
    ORDER BY active_session_count DESC
  `

  // The counts are passed through Number() so the response carries plain numbers.
  const last30dTotal = Number(sensitive_last_30d_total)
  const last30dSensitive = Number(sensitive_last_30d_sensitive)

  return {
    active_chats: Number(active_chats),
    online_mitras: Number(online_mitras),
    pending_requests: Number(pending_requests),
    customers_per_mitra: customersPerMitra,
    sensitive: {
      total: Number(sensitive_total),
      last_30d_total: last30dTotal,
      last_30d_sensitive: last30dSensitive,
      // Share of sensitive sessions, rounded to one decimal; 0 when the window is empty.
      last_30d_percent: last30dTotal > 0 ? Math.round((last30dSensitive / last30dTotal) * 1000) / 10 : 0,
    },
  }
}
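The `last_30d_percent` expression rounds to one decimal place by scaling the ratio by 1000, rounding, and dividing by 10. A small standalone sketch of that arithmetic follows; `percentOneDecimal` is a hypothetical helper name, not something defined in this file.

```javascript
// Hypothetical helper mirroring the last_30d_percent expression.
function percentOneDecimal(part, total) {
  // Guard against dividing by zero when the window has no sessions.
  if (total <= 0) return 0
  // ratio * 1000 carries one extra digit; rounding then dividing by 10
  // leaves a percentage with a single decimal place.
  return Math.round((part / total) * 1000) / 10
}
```

For example, 37 sensitive sessions out of 212 gives 37 / 212 ≈ 0.17453, which scales to 174.53, rounds to 175, and yields 17.5 percent.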