Phase 6: Valkey availability mirror — move read path off Postgres

Mitra-availability state (online flag, deactivated flag, per-mitra session
count, heartbeat liveness) is mirrored into Valkey so the customer beacon,
pairing blast, and dashboard counts no longer hit Postgres on the hot path.
Postgres remains the durable source of truth; Valkey state is fully
derivable via seedFromPostgres on startup + reconnect.

Schema
- mitras:online           SET    — mirror of is_online
- mitras:deactivated      SET    — mirror of is_active=false
- mitra:capacity:<id>     STRING — active+pending_payment session count
- mitra:heartbeat:<id>    STRING — ISO timestamp of last ping
- availability:snapshot   JSON   — beacon cache, TTL 10s, cluster-shared
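
The key layout above can be sketched as a few builders plus a freshness check. The key names come from the Schema list; `isHeartbeatFresh` and its parameters are hypothetical names used only for illustration, and the staleness threshold is assumed to be supplied by the caller.

```javascript
// Key names taken from the Schema list above.
const VK_MITRAS_ONLINE = 'mitras:online'
const VK_MITRAS_DEACTIVATED = 'mitras:deactivated'
const vkCapacityKey = (mitraId) => `mitra:capacity:${mitraId}`
const vkHeartbeatKey = (mitraId) => `mitra:heartbeat:${mitraId}`

// Hypothetical helper: a heartbeat counts as fresh when its ISO timestamp
// is no older than `staleAfterSeconds` relative to `nowMs`.
const isHeartbeatFresh = (isoTimestamp, staleAfterSeconds, nowMs = Date.now()) => {
  if (!isoTimestamp) return false // missing key means no recent ping
  const ts = Date.parse(isoTimestamp)
  if (Number.isNaN(ts)) return false // unparseable value is treated as stale
  return ts >= nowMs - staleAfterSeconds * 1000
}
```

Storing the timestamp as a plain string keeps the freshness decision on the reader side, so the threshold can change without rewriting keys.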

Write paths (Postgres first, best-effort Valkey)
- setOnline/setOffline mirror SADD/SREM + heartbeat SET/DEL
- updateMitraStatus mirrors mitras:deactivated AND revokes auth_sessions
  on deactivate (bounds the "ghost online" window to access-token TTL)
- heartbeat is Valkey-only on the hot path; the per-ping Postgres UPDATE
  on last_heartbeat_at is eliminated (was 1,200 ops/min at prod scale)
- chat_session lifecycle (accept/end/reroute/extension/expiry) calls
  recomputeCapacityForMitra after each UPDATE — derive-from-truth avoids
  the bookkeeping risk of per-transition INCR/DECR
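
The derive-from-truth choice above can be illustrated with an in-memory sketch. Here an array stands in for the chat_sessions table and a `Map` stands in for Valkey; the status strings `'active'` and `'pending_payment'` are assumed values for the `SessionStatus` constants, and `countOccupiedSlots` is a hypothetical name.

```javascript
// Assumed status strings; the real values live in SessionStatus (constants.js).
const OCCUPYING_STATUSES = new Set(['active', 'pending_payment'])

// Stand-in for the COUNT(*) query against chat_sessions.
const countOccupiedSlots = (sessions, mitraId) =>
  sessions.filter((s) => s.mitra_id === mitraId && OCCUPYING_STATUSES.has(s.status)).length

// Overwrites the mirrored counter with the freshly derived value.
// `mirror` is a Map standing in for Valkey SET/GET.
const recomputeCapacity = (sessions, mirror, mitraId) => {
  if (!mitraId) return
  mirror.set(`mitra:capacity:${mitraId}`, countOccupiedSlots(sessions, mitraId))
}

const sessions = [
  { id: 's1', mitra_id: 'm1', status: 'active' },
  { id: 's2', mitra_id: 'm1', status: 'pending_payment' },
  { id: 's3', mitra_id: 'm1', status: 'completed' },
]
const mirror = new Map()
recomputeCapacity(sessions, mirror, 'm1')
recomputeCapacity(sessions, mirror, 'm1') // a repeated call cannot double-count
console.log(mirror.get('mitra:capacity:m1'))
```

Because each call overwrites the counter instead of adjusting it, a missed or duplicated call site leaves the value at worst stale until the next recompute, never permanently skewed.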

Read paths (Valkey-first, Postgres fallback on Valkey error)
- isMitraReachable: SISMEMBER mitras:online + heartbeat freshness
- findAvailableMitras: SDIFF + pipelined GETs, filter by capacity + heartbeat
- countAvailableMitrasFromCache: Valkey-driven, cached cluster-wide 10s TTL
- dashboard online count: SCARD
- Each reader wraps Valkey ops in try/catch → Postgres fallback on outage
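
A minimal sketch of the capacity and heartbeat filter applied after SDIFF and the pipelined GETs. It assumes `results` has the `[error, value]` tuple shape that ioredis `pipeline.exec()` resolves with, ordered as capacity then heartbeat per candidate; `countAvailable` and its option names are hypothetical.

```javascript
// candidates: ids returned by SDIFF mitras:online mitras:deactivated
// results:    [error, value] tuples, two per candidate (capacity, heartbeat)
const countAvailable = (candidates, results, { maxCustomersPerMitra, staleAfterSeconds, nowMs }) => {
  const cutoff = nowMs - staleAfterSeconds * 1000
  let count = 0
  for (let i = 0; i < candidates.length; i++) {
    const capacity = Number(results[i * 2][1] ?? 0) // missing key counts as 0 sessions
    const heartbeat = results[i * 2 + 1][1]
    if (capacity >= maxCustomersPerMitra) continue // mitra is full
    if (!heartbeat || Date.parse(heartbeat) < cutoff) continue // no ping or stale ping
    count++
  }
  return { available: count > 0, count }
}
```

The result has the same `{ available, count }` shape that the beacon snapshot stores, so it can be serialized directly into the cache key.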

Heartbeat path on /api/mitra/status/heartbeat
- resolveMitra preHandler replaced with heartbeatGuard: SISMEMBER on
  mitras:deactivated (~0 DB hits per ping). Falls back to full DB
  resolveMitra if Valkey is unreachable so a Valkey outage doesn't
  silently accept heartbeats from deactivated mitras.
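
The guard's decision logic can be sketched independently of the web framework. In this sketch `'mitra'` is an assumed value for `UserType.MITRA`, and `isDeactivated` and `fullDbCheck` are hypothetical injected functions standing in for the Valkey SISMEMBER call and the resolveMitra fallback.

```javascript
// Returns { allowed: true } or { allowed: false, code }.
// isDeactivated(userId): async, may throw when Valkey is unreachable.
// fullDbCheck(userId):   async, the slower Postgres-backed check.
const decideHeartbeat = async (auth, isDeactivated, fullDbCheck) => {
  // 'mitra' is an assumed value for UserType.MITRA.
  if (auth?.userType !== 'mitra') return { allowed: false, code: 'FORBIDDEN' }
  let deactivated
  try {
    deactivated = await isDeactivated(auth.userId)
  } catch (err) {
    // Fail closed to the durable source instead of accepting blindly.
    return fullDbCheck(auth.userId)
  }
  return deactivated ? { allowed: false, code: 'ACCOUNT_INACTIVE' } : { allowed: true }
}
```

Only the cache lookup sits inside the `try`, so an error thrown by the fallback itself still propagates to the caller.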

Three sweeps, env-configurable cadences
- MITRA_AUTO_OFFLINE_SWEEP_SECONDS (30) — Valkey-driven stale detection
- HEARTBEAT_MIRROR_INTERVAL_SECONDS (60) — batched UPDATE writes
  Valkey timestamps to Postgres last_heartbeat_at via UNNEST (1 statement
  per cycle, idempotent across instances)
- VALKEY_ONLINE_MIRROR_SWEEP_SECONDS (300) — periodic reseed heals drift
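
The three cadence getters in the diff share one parsing pattern, which can be sketched as a single helper. `readCadenceSeconds` and its option names are hypothetical; the defaults and floors in the usage lines (30 with floor 5, 60 with floor 10, 300 with floor 30 and 0 allowed) are the ones the getters in this commit use.

```javascript
// raw:       the env string, possibly undefined or blank
// fallback:  default returned for blank, unparseable, or too-small values
// floor:     smallest accepted value
// allowZero: when true, an explicit 0 is passed through (used as "disabled")
const readCadenceSeconds = (raw, { fallback, floor, allowZero = false }) => {
  if (!raw || raw.trim() === '') return fallback
  const parsed = Number.parseInt(raw, 10)
  if (allowZero && parsed === 0) return 0
  return Number.isFinite(parsed) && parsed >= floor ? parsed : fallback
}

const autoOffline = readCadenceSeconds(process.env.MITRA_AUTO_OFFLINE_SWEEP_SECONDS, { fallback: 30, floor: 5 })
const mirror = readCadenceSeconds(process.env.HEARTBEAT_MIRROR_INTERVAL_SECONDS, { fallback: 60, floor: 10 })
const reseed = readCadenceSeconds(process.env.VALKEY_ONLINE_MIRROR_SWEEP_SECONDS, { fallback: 300, floor: 30, allowZero: true })
console.log({ autoOffline, mirror, reseed })
```

As in the getters in this commit, a value below the floor falls back to the default instead of being raised to the floor.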

Startup
- restoreActiveTimers → seedFromPostgres → bind listeners
- onValkeyReady re-runs the seed on every reconnect (cold start + reseed
  on Valkey restart, no manual intervention)
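
The reseed-on-reconnect hook can be sketched as a small listener registry. `onReady` mirrors the shape of `onValkeyReady` in the diff; `fireReady` is a hypothetical stand-in for the client's `'ready'` event and returns a promise only so the sketch can be exercised without a live connection.

```javascript
const readyListeners = new Set()

// Registers a listener and returns an unsubscribe function.
const onReady = (fn) => {
  readyListeners.add(fn)
  return () => readyListeners.delete(fn)
}

// Runs every listener; one failing listener does not block the others.
const fireReady = () =>
  Promise.all(
    [...readyListeners].map((fn) =>
      Promise.resolve()
        .then(() => fn())
        .catch((err) => console.error('ready listener failed:', err.message)),
    ),
  )
```

Since the event fires on the initial connect as well as on each reconnect, a listener such as the seed should be idempotent.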

Failure semantics
- Read fallback: every Valkey read wrapped, falls back to existing
  Postgres JOIN query — system stays correct during Valkey outage,
  performance degrades not breaks
- Write best-effort: Postgres write commits before Valkey is touched;
  Valkey errors log + continue; reconciliation sweep heals drift
- Auto-offline sweep aborts entirely on Valkey error (does NOT mass-
  offline via Postgres scan during Valkey hiccup)
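
The abort-on-error rule for the auto-offline sweep can be sketched with injected I/O. `findStaleIds` and `sweepStale` are hypothetical names; `readOnlineIds`, `readHeartbeats`, and `markOffline` stand in for the Valkey reads and the Postgres UPDATE and are assumptions of this sketch.

```javascript
// A mitra is stale when its heartbeat is missing or older than the cutoff.
const findStaleIds = (onlineIds, heartbeatValues, staleAfterSeconds, nowMs) => {
  const cutoff = nowMs - staleAfterSeconds * 1000
  return onlineIds.filter((_, i) => {
    const ts = heartbeatValues[i]
    return !ts || Date.parse(ts) < cutoff
  })
}

const sweepStale = async ({ readOnlineIds, readHeartbeats, markOffline, staleAfterSeconds, nowMs = Date.now() }) => {
  let onlineIds, heartbeatValues
  try {
    onlineIds = await readOnlineIds()
    if (!onlineIds.length) return 0
    heartbeatValues = await readHeartbeats(onlineIds)
  } catch (err) {
    // Cache unreachable: skip this tick rather than guess from Postgres.
    console.warn('[auto-offline] valkey unavailable, skipping this tick:', err.message)
    return 0
  }
  const stale = findStaleIds(onlineIds, heartbeatValues, staleAfterSeconds, nowMs)
  if (stale.length) await markOffline(stale)
  return stale.length
}
```

Skipping a tick costs at most one sweep interval of delay, whereas a wrong mass-offline would remove live mitras from pairing.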

Tests
- New: 32 integration tests in mitra-status.valkey-mirror.test.js
  covering seed, write-through, fallbacks, capacity lifecycle,
  auto-offline sweep, heartbeat mirror, deactivation flow, beacon cache
- Updated: fixtures.js seeds Valkey alongside Postgres when isOnline=true
- Updated: helpers/db.js resetDb also flushes test Valkey
- Fixed 2 pre-existing session-timer flakes (string IDs failed uuid
  parse; vi.advanceTimersByTimeAsync raced real Postgres I/O)
- All 124/124 backend tests pass (was 90/92)

Docs
- requirement/valkey-online-mirror-plan.md — canonical plan
- requirement/valkey-online-mirror-testing.md — manual E2E checklist
- requirement/deployment.md — infra + Valkey persistence guidance for
  prod (Memorystore Standard tier recommended; migration from
  self-hosted Valkey is zero-downtime via reseed-from-Postgres)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:07:55 +08:00
parent 3fff4b1c6e
commit 553dbac52f
20 changed files with 1839 additions and 82 deletions

@@ -41,6 +41,25 @@ ADMIN_PASSWORD=ChangeMe123!
# Path to Firebase service-account JSON (falls back to backend/firebase-service-account.json)
FIREBASE_SERVICE_ACCOUNT_PATH=
# --- Valkey availability mirror cadences ---
#
# All env-driven per backend/CLAUDE.md Config-Source Convention. Defaults match
# requirement/valkey-online-mirror-plan.md. Values below each getter's floor fall back to the default.
# How often the auto-offline sweep checks heartbeat freshness (seconds).
# The staleness threshold itself (`stale_after_seconds`) is CC-tunable via app_config.
MITRA_AUTO_OFFLINE_SWEEP_SECONDS=30
# How often heartbeat timestamps are batched from Valkey → Postgres (seconds).
# Per-ping heartbeat writes go to Valkey only; this preserves forensic
# `last_heartbeat_at` in Postgres with up to <interval> seconds of lag.
HEARTBEAT_MIRROR_INTERVAL_SECONDS=60
# How often Valkey state is re-derived from Postgres to heal drift (seconds).
# Belt-and-braces against failed best-effort Valkey writes, out-of-band Postgres
# mutations, or evictions. Set to 0 to disable (not recommended).
VALKEY_ONLINE_MIRROR_SWEEP_SECONDS=300
# --- Phase 5: Xendit (dev-safe defaults: integration disabled) ---
#
# Flip XENDIT_ENABLED=true in staging/prod once secret + webhook token are populated.

@@ -4,10 +4,31 @@ let pub
let sub
let client
// 'ready' listeners (registered before connect; fire on initial connect AND each
// reconnect). Used by services that need to reseed Valkey state from Postgres.
const readyListeners = new Set()
const attachReadyHandler = (instance) => {
instance.on('ready', () => {
for (const fn of readyListeners) {
// Fire-and-forget; each listener owns its own error handling.
Promise.resolve()
.then(() => fn())
.catch((err) => console.error('[valkey] ready listener failed:', err))
}
})
}
export const onValkeyReady = (fn) => {
readyListeners.add(fn)
return () => readyListeners.delete(fn)
}
export const getValkeyClient = () => {
if (!client) {
const url = process.env.VALKEY_URL || 'redis://localhost:6379'
client = new Redis(url)
attachReadyHandler(client)
}
return client
}
@@ -51,3 +72,25 @@ export const subscribe = (channel, callback) => {
console.log(`[valkey] unsubscribed from ${channel}`)
}
}
// --- Thin wrappers used by the mitra-availability mirror ---
//
// Each wrapper uses the shared `client` (separate ioredis instance from pub/sub
// to keep subscribe state isolated). Callers in services/* wrap these in
// try/catch and fall back to Postgres on error — see the plan doc.
export const sadd = (key, ...members) => getValkeyClient().sadd(key, ...members)
export const srem = (key, ...members) => getValkeyClient().srem(key, ...members)
export const sismember = async (key, member) =>
(await getValkeyClient().sismember(key, member)) === 1
export const smembers = (key) => getValkeyClient().smembers(key)
export const sdiff = (...keys) => getValkeyClient().sdiff(...keys)
export const scard = (key) => getValkeyClient().scard(key)
export const set = (key, value) => getValkeyClient().set(key, value)
export const get = (key) => getValkeyClient().get(key)
export const del = (...keys) => getValkeyClient().del(...keys)
export const incr = (key) => getValkeyClient().incr(key)
export const decr = (key) => getValkeyClient().decr(key)
export const exists = (key) => getValkeyClient().exists(key)
export const pipeline = () => getValkeyClient().pipeline()
export const multi = () => getValkeyClient().multi()

@@ -1,6 +1,8 @@
import { authenticate } from '../../plugins/auth.js'
import { getMitraById } from '../../services/mitra.service.js'
import * as mitraStatusService from '../../services/mitra-status.service.js'
import * as valkey from '../../plugins/valkey.js'
import { VK_MITRAS_DEACTIVATED } from '../../services/mitra-status.service.js'
import { UserType } from '../../constants.js'
export const mitraStatusRoutes = async (app) => {
@@ -27,6 +29,32 @@ export const mitraStatusRoutes = async (app) => {
request.mitra = mitra
}
// Lightweight heartbeat guard: no Postgres SELECT in the hot path. Checks
// `mitras:deactivated` in Valkey (maintained on every updateMitraStatus) and
// falls back to the full resolveMitra/DB check if Valkey is unreachable so a
// Valkey outage doesn't accept heartbeats from deactivated mitras silently.
const heartbeatGuard = async (request, reply) => {
if (request.auth?.userType !== UserType.MITRA) {
return reply.code(403).send({
success: false,
error: { code: 'FORBIDDEN', message: 'Mitra account required' },
})
}
try {
const deactivated = await valkey.sismember(VK_MITRAS_DEACTIVATED, request.auth.userId)
if (deactivated) {
return reply.code(403).send({
success: false,
error: { code: 'ACCOUNT_INACTIVE', message: 'Account is inactive' },
})
}
return
} catch (err) {
console.warn('[heartbeat] valkey check failed, falling back to DB:', err.message)
return resolveMitra(request, reply)
}
}
app.post('/online', { preHandler: [authenticate, resolveMitra] }, async (request, reply) => {
await mitraStatusService.setOnline(request.mitra.id)
return reply.send({ success: true, data: { is_online: true } })
@@ -37,8 +65,8 @@ export const mitraStatusRoutes = async (app) => {
return reply.send({ success: true, data: { is_online: false } })
})
app.post('/heartbeat', { preHandler: [authenticate, resolveMitra] }, async (request, reply) => {
await mitraStatusService.heartbeat(request.mitra.id)
app.post('/heartbeat', { preHandler: [authenticate, heartbeatGuard] }, async (request, reply) => {
await mitraStatusService.heartbeat(request.auth.userId)
return reply.send({ success: true })
})

@@ -1,7 +1,13 @@
import 'dotenv/config'
import { buildPublicApp } from './app.public.js'
import { buildInternalApp } from './app.internal.js'
import { autoOfflineStaleMitras } from './services/mitra-status.service.js'
import { autoOfflineStaleMitras, seedFromPostgres, mirrorHeartbeatsToPostgres } from './services/mitra-status.service.js'
import {
getMitraAutoOfflineSweepSeconds,
getHeartbeatMirrorIntervalSeconds,
getValkeyOnlineMirrorSweepSeconds,
} from './services/config.service.js'
import { initFirebase } from './plugins/firebase.js'
import { restoreActiveTimers } from './services/session-timer.service.js'
import { expireStalePaymentRequests, registerPairingSubscriber } from './services/payment.service.js'
@@ -23,18 +29,22 @@ const start = async () => {
}
initFirebase()
const publicApp = await buildPublicApp()
const internalApp = await buildInternalApp()
// restoreActiveTimers runs bulk UPDATEs on chat_sessions to clean up stale
// ACTIVE/CLOSING rows from before the restart. Run it BEFORE seedFromPostgres
// so the seed sees the post-cleanup state and capacity counters are accurate.
await restoreActiveTimers()
await seedFromPostgres()
await publicApp.listen({ port: PUBLIC_PORT, host: '0.0.0.0' })
console.log(`Public API listening on port ${PUBLIC_PORT}`)
await internalApp.listen({ port: INTERNAL_PORT, host: INTERNAL_HOST })
console.log(`Internal API listening on ${INTERNAL_HOST}:${INTERNAL_PORT}`)
// Restore session timers for active sessions (on server restart)
await restoreActiveTimers()
// Phase 5: wire pairing service as a subscriber to payment_request.confirmed events.
// Must happen AFTER all services are loaded so the subscriber registration sees
// the EventEmitter set up by payment.service.js at module-load time.
@@ -53,7 +63,8 @@ const start = async () => {
console.error('Startup reconciliation failed:', err)
}
// Auto-offline mitras with stale heartbeat (every 30s)
// Auto-offline mitras with stale heartbeat (env-driven cadence, default 30s).
// Valkey-driven per requirement/valkey-online-mirror-plan.md.
setInterval(async () => {
try {
const count = await autoOfflineStaleMitras()
@@ -61,7 +72,32 @@ const start = async () => {
} catch (err) {
console.error('Auto-offline check failed:', err)
}
}, 30_000)
}, getMitraAutoOfflineSweepSeconds() * 1000)
// Batched heartbeat mirror: Valkey heartbeat timestamps → Postgres
// last_heartbeat_at (default 60s). Keeps forensic column current without
// per-ping DB writes. One UNNEST UPDATE per tick; idempotent across instances.
setInterval(async () => {
try {
await mirrorHeartbeatsToPostgres()
} catch (err) {
console.error('Heartbeat mirror failed:', err)
}
}, getHeartbeatMirrorIntervalSeconds() * 1000)
// Reconciliation sweep: heal Valkey/Postgres drift (default 300s; 0 disables).
// Belt-and-braces against failed best-effort Valkey writes, out-of-band
// Postgres mutations, evictions. Idempotent — just runs the seed.
const reconciliationSeconds = getValkeyOnlineMirrorSweepSeconds()
if (reconciliationSeconds > 0) {
setInterval(async () => {
try {
await seedFromPostgres()
} catch (err) {
console.error('Valkey reconciliation sweep failed:', err)
}
}, reconciliationSeconds * 1000)
}
// Expire stale payment_requests + reconcile lost subscriber work (every 60s).
// Pending past expires_at → expired (no failure row).

@@ -3,6 +3,7 @@ import { publish } from '../plugins/valkey.js'
import { clearSessionTimer, clearClosureGraceTimer, startClosureGraceTimer } from './session-timer.service.js'
import { sendToSessionParticipant } from '../plugins/websocket.js'
import { sendPushNotification } from './notification.service.js'
import { recomputeCapacityForMitra } from './mitra-status.service.js'
import { UserType, SessionStatus, EndedBy, WsMessage } from '../constants.js'
const sql = getDb()
@@ -59,6 +60,7 @@ export const completeSession = async (sessionId) => {
RETURNING id, customer_id, mitra_id, status, ended_at
`
if (!session) return null
await recomputeCapacityForMitra(session.mitra_id)
// Notify both parties, FCM fallback if WebSocket is down
const data = { type: WsMessage.SESSION_COMPLETED, session_id: sessionId }
@@ -109,6 +111,7 @@ export const initiateEarlyEnd = async (sessionId, userType) => {
code: 'SESSION_NOT_ACTIVE', statusCode: 409,
})
}
await recomputeCapacityForMitra(session.mitra_id)
clearSessionTimer(sessionId)
startClosureGraceTimer(sessionId)

@@ -149,6 +149,34 @@ export const getMitraHeartbeatCadenceSeconds = () => {
return Number.isFinite(parsed) && parsed >= 5 ? parsed : 30
}
// --- Valkey availability mirror — env-driven cadences ---
//
// Per requirement/valkey-online-mirror-plan.md. All three are operational
// knobs (env, per backend/CLAUDE.md Config-Source Convention), not
// operator-tunable. Defaults match the plan; values below the floor fall back to the default.
export const getMitraAutoOfflineSweepSeconds = () => {
const raw = process.env.MITRA_AUTO_OFFLINE_SWEEP_SECONDS
if (!raw || raw.trim() === '') return 30
const parsed = Number.parseInt(raw, 10)
return Number.isFinite(parsed) && parsed >= 5 ? parsed : 30
}
export const getHeartbeatMirrorIntervalSeconds = () => {
const raw = process.env.HEARTBEAT_MIRROR_INTERVAL_SECONDS
if (!raw || raw.trim() === '') return 60
const parsed = Number.parseInt(raw, 10)
return Number.isFinite(parsed) && parsed >= 10 ? parsed : 60
}
export const getValkeyOnlineMirrorSweepSeconds = () => {
const raw = process.env.VALKEY_ONLINE_MIRROR_SWEEP_SECONDS
if (!raw || raw.trim() === '') return 300
const parsed = Number.parseInt(raw, 10)
if (parsed === 0) return 0 // explicit disable
return Number.isFinite(parsed) && parsed >= 30 ? parsed : 300
}
// --- Phase 5: Xendit integration ---
//
// Env-driven (per backend/CLAUDE.md Config-Source Convention). All five values

@@ -1,19 +1,33 @@
import { getDb } from '../db/client.js'
import * as valkey from '../plugins/valkey.js'
import { VK_MITRAS_ONLINE } from './mitra-status.service.js'
import { SessionStatus, TopicSensitivity } from '../constants.js'
const sql = getDb()
// Valkey-fast SCARD with Postgres fallback. The CC dashboard polls every few
// seconds; SCARD is sub-ms so this keeps the dashboard responsive at any scale.
const getOnlineMitrasCount = async () => {
try {
return await valkey.scard(VK_MITRAS_ONLINE)
} catch (err) {
console.warn('[dashboard] valkey unavailable, falling back to DB:', err.message)
const [{ c }] = await sql`SELECT COUNT(*)::int AS c FROM mitra_online_status WHERE is_online = true`
return c
}
}
export const getDashboardStats = async () => {
const [
[{ active_chats }],
[{ online_mitras }],
online_mitras,
[{ pending_requests }],
[{ sensitive_total }],
[{ sensitive_last_30d_total }],
[{ sensitive_last_30d_sensitive }],
] = await Promise.all([
sql`SELECT COUNT(*) AS active_chats FROM chat_sessions WHERE status IN (${SessionStatus.ACTIVE}, ${SessionStatus.PENDING_PAYMENT})`,
sql`SELECT COUNT(*) AS online_mitras FROM mitra_online_status WHERE is_online = true`,
getOnlineMitrasCount(),
sql`SELECT COUNT(*) AS pending_requests FROM chat_sessions WHERE status IN (${SessionStatus.SEARCHING}, ${SessionStatus.PENDING_ACCEPTANCE})`,
sql`SELECT COUNT(*) AS sensitive_total FROM chat_sessions WHERE topic_sensitivity = ${TopicSensitivity.SENSITIVE}`,
sql`SELECT COUNT(*) AS sensitive_last_30d_total FROM chat_sessions WHERE created_at >= NOW() - INTERVAL '30 days'`,

@@ -1,7 +1,7 @@
import { getDb } from '../db/client.js'
import { sendToSessionParticipant, isUserOnlineWs } from '../plugins/websocket.js'
import { extendSessionTimer, clearClosureGraceTimer, startClosureGraceTimer } from './session-timer.service.js'
import { isMitraReachable } from './mitra-status.service.js'
import { isMitraReachable, recomputeCapacityBySession } from './mitra-status.service.js'
import { consumePaymentSession, failPaymentSession, getPaymentSession } from './payment.service.js'
import { sendPushNotification } from './notification.service.js'
import {
@@ -104,6 +104,7 @@ export const requestExtension = async (sessionId, customerId, { duration_minutes
// Pause the session
await sql`UPDATE chat_sessions SET status = ${SessionStatus.EXTENDING} WHERE id = ${sessionId}`
await recomputeCapacityBySession(sessionId)
// Resolve timeout once so we can both surface it in the WS payload and start the server-side timer.
const timeoutMs = await getExtensionTimeoutMs()
@@ -213,6 +214,7 @@ const finalizeExtension = async (extensionId, sessionId, accepted, viaTimeout) =
// Resume session
await sql`UPDATE chat_sessions SET status = ${SessionStatus.ACTIVE} WHERE id = ${extension.session_id}`
await recomputeCapacityBySession(extension.session_id)
// Record transaction
await sql`
@@ -249,6 +251,7 @@ const finalizeExtension = async (extensionId, sessionId, accepted, viaTimeout) =
}
await sql`UPDATE chat_sessions SET status = ${SessionStatus.CLOSING} WHERE id = ${extension.session_id}`
await recomputeCapacityBySession(extension.session_id)
sendToSessionParticipant(sessionId, UserType.CUSTOMER, {
type: WsMessage.EXTENSION_RESPONSE,
@@ -331,6 +334,7 @@ const timeoutExtension = async (extensionId, sessionId, mitraId) => {
// Move session to closing & notify both parties (matches the explicit-reject UX).
await sql`UPDATE chat_sessions SET status = ${SessionStatus.CLOSING} WHERE id = ${sessionId}`
await recomputeCapacityBySession(sessionId)
sendToSessionParticipant(sessionId, UserType.CUSTOMER, {
type: WsMessage.EXTENSION_RESPONSE,
accepted: false,

@@ -1,24 +1,80 @@
import { getDb } from '../db/client.js'
import { SessionStatus } from '../constants.js'
import { getMitraPingConfig, getMaxCustomersPerMitra } from './config.service.js'
import { subscribe } from '../plugins/valkey.js'
import * as valkey from '../plugins/valkey.js'
import { subscribe, onValkeyReady } from '../plugins/valkey.js'
const sql = getDb()
// --- Short-TTL availability cache for the 5s-poll endpoint ---
// In-memory snapshot { available, count, expiresAt }. The cache:
// - is recomputed at most once per AVAILABILITY_TTL_MS (10s backstop)
// - is invalidated explicitly when CC changes max_customers_per_mitra (call invalidateAvailabilityCache())
// This keeps customer polls off the DB hot path while staying close to real time.
const AVAILABILITY_TTL_MS = 10_000
let availabilityCache = null // { available, count, expiresAt }
// Per requirement/valkey-online-mirror-plan.md § Schema.
export const VK_MITRAS_ONLINE = 'mitras:online'
export const VK_MITRAS_DEACTIVATED = 'mitras:deactivated'
export const vkCapacityKey = (mitraId) => `mitra:capacity:${mitraId}`
export const vkHeartbeatKey = (mitraId) => `mitra:heartbeat:${mitraId}`
export const invalidateAvailabilityCache = () => {
availabilityCache = null
// Rebuilds Valkey availability state from Postgres. Called on backend startup,
// on Valkey reconnect (via onValkeyReady), and by the reconciliation sweep.
// Idempotent — DEL + bulk SADD/SET produces the same final state on every run.
export const seedFromPostgres = async () => {
try {
const [onlineRows, deactRows, capacityRows] = await Promise.all([
sql`SELECT mitra_id FROM mitra_online_status WHERE is_online = true`,
sql`SELECT id FROM mitras WHERE is_active = false`,
sql`
SELECT mitra_id, COUNT(*)::int AS c FROM chat_sessions
WHERE mitra_id IS NOT NULL
AND status IN (${SessionStatus.ACTIVE}, ${SessionStatus.PENDING_PAYMENT})
GROUP BY mitra_id
`,
])
const pipe = valkey.pipeline()
pipe.del(VK_MITRAS_ONLINE)
pipe.del(VK_MITRAS_DEACTIVATED)
const now = new Date().toISOString()
if (onlineRows.length) {
pipe.sadd(VK_MITRAS_ONLINE, ...onlineRows.map((r) => r.mitra_id))
// Seed heartbeats with NOW so the first sweep after restart doesn't
// mass-offline. Mitras refresh on their next ping anyway.
for (const r of onlineRows) pipe.set(vkHeartbeatKey(r.mitra_id), now)
// Reset capacity for currently-online mitras; overlay real counts below.
// Offline mitras' stale capacity keys don't affect reads (SDIFF excludes them).
for (const r of onlineRows) pipe.set(vkCapacityKey(r.mitra_id), 0)
}
if (deactRows.length) {
pipe.sadd(VK_MITRAS_DEACTIVATED, ...deactRows.map((r) => r.id))
}
for (const r of capacityRows) pipe.set(vkCapacityKey(r.mitra_id), r.c)
await pipe.exec()
console.log(
`[valkey-mirror] seed: ${onlineRows.length} online, ${deactRows.length} deactivated, ${capacityRows.length} with active sessions`,
)
} catch (err) {
console.error('[valkey-mirror] seed failed:', err)
}
}
// Subscribe once at module load so other-instance config updates also bust this cache.
// Single-instance: the local mutator already invalidates, so this is a no-op extra.
// Re-seed on every Valkey reconnect (fires on initial connect too).
onValkeyReady(seedFromPostgres)
// --- Beacon snapshot cache (Valkey-backed, cluster-shared) ---
// `availability:snapshot` JSON `{available, count}`, TTL 10s. All backend
// instances share the same cache: one Valkey-driven compute per 10s
// cluster-wide regardless of how many customers are polling.
const AVAILABILITY_CACHE_KEY = 'availability:snapshot'
const AVAILABILITY_TTL_SECONDS = 10
export const invalidateAvailabilityCache = async () => {
try {
await valkey.del(AVAILABILITY_CACHE_KEY)
} catch (err) {
console.error('[valkey-mirror] invalidateAvailabilityCache failed:', err)
}
}
// Bust the shared cache when CC changes max_customers_per_mitra (any instance).
let _subscribed = false
const ensureSubscribed = () => {
if (_subscribed) return
@@ -43,6 +99,39 @@ export const ensureStatusRow = async (mitraId) => {
`
}
// Best-effort Valkey writer. Postgres remains source of truth; a Valkey hiccup
// shouldn't fail the originating request — the reconciliation sweep heals drift.
const tryValkey = async (fn, label) => {
try { await fn() } catch (err) {
console.error(`[valkey-mirror] ${label} failed:`, err)
}
}
// Recompute `mitra:capacity:<id>` from chat_sessions truth. Called after every
// chat_session state change that could affect a mitra's occupied-slot count.
// Recompute-from-truth avoids the bookkeeping risks of per-transition INCR/DECR
// (double-counts, missed transitions across all the UPDATE sites in pairing,
// closure, extension, session-timer, session services).
export const recomputeCapacityForMitra = async (mitraId) => {
if (!mitraId) return
const [row] = await sql`
SELECT COUNT(*)::int AS c FROM chat_sessions
WHERE mitra_id = ${mitraId}
AND status IN (${SessionStatus.ACTIVE}, ${SessionStatus.PENDING_PAYMENT})
`
await tryValkey(
() => valkey.set(vkCapacityKey(mitraId), row.c),
`recomputeCapacity ${mitraId}`,
)
}
// Lookup mitra_id from the session, then recompute. Use this from UPDATE sites
// where the session's mitra_id may not be in local scope.
export const recomputeCapacityBySession = async (sessionId) => {
const [row] = await sql`SELECT mitra_id FROM chat_sessions WHERE id = ${sessionId}`
if (row?.mitra_id) await recomputeCapacityForMitra(row.mitra_id)
}
export const setOnline = async (mitraId) => {
await ensureStatusRow(mitraId)
const now = new Date()
@@ -54,6 +143,14 @@ export const setOnline = async (mitraId) => {
await sql`
INSERT INTO mitra_online_logs (mitra_id, status) VALUES (${mitraId}, 'online')
`
await tryValkey(async () => {
const pipe = valkey.pipeline()
pipe.sadd(VK_MITRAS_ONLINE, mitraId)
pipe.set(vkHeartbeatKey(mitraId), now.toISOString())
await pipe.exec()
}, `setOnline ${mitraId}`)
invalidateAvailabilityCache()
}
@@ -73,16 +170,32 @@ export const setOffline = async (mitraId) => {
await sql`
INSERT INTO mitra_online_logs (mitra_id, status) VALUES (${mitraId}, 'offline')
`
await tryValkey(async () => {
const pipe = valkey.pipeline()
pipe.srem(VK_MITRAS_ONLINE, mitraId)
pipe.del(vkHeartbeatKey(mitraId))
await pipe.exec()
}, `setOffline ${mitraId}`)
invalidateAvailabilityCache()
}
// Heartbeat hot path: Valkey-only. Per-ping Postgres UPDATE eliminated; the
// 60s batched heartbeat-mirror job (mirrorHeartbeatsToPostgres) writes
// `last_heartbeat_at` to Postgres for forensics/restart safety.
//
// NOTE: there is intentionally no `is_online = true` gate here (the old SQL
// UPDATE had one). The Valkey SET is global; if a mitra heartbeats while
// `is_online=false` in Postgres, their heartbeat key gets refreshed but they're
// still not in `mitras:online`, so blast eligibility is unchanged. The
// reconciliation sweep will clean up the orphan heartbeat key.
export const heartbeat = async (mitraId) => {
const now = new Date()
await sql`
UPDATE mitra_online_status
SET last_heartbeat_at = ${now}, updated_at = ${now}
WHERE mitra_id = ${mitraId} AND is_online = true
`
const now = new Date().toISOString()
await tryValkey(
() => valkey.set(vkHeartbeatKey(mitraId), now),
`heartbeat ${mitraId}`,
)
}
export const getStatus = async (mitraId) => {
@@ -130,39 +243,95 @@ export const getOnlineLogs = async (mitraId, { page = 1, limit = 50 } = {}) => {
return { items, total: Number(count), page, limit }
}
// Valkey-driven: enumerate mitras:online, read each heartbeat timestamp from
// Valkey, find stales, then bulk-flip Postgres + clean up Valkey.
//
// Failure semantics: if any Valkey op throws, the sweep aborts entirely. We
// never mass-offline mitras via a Postgres scan because Valkey is unreachable
// — that would risk false-offlining a fleet during a Valkey hiccup.
export const autoOfflineStaleMitras = async () => {
const pingConfig = await getMitraPingConfig()
// If ping is not required, skip the auto-offline sweep entirely
if (!pingConfig.require_ping) return 0
// stale_after_seconds is the operator-facing knob — what they set is what
// they get. No multiplier, no implicit "tolerate N missed heartbeats"
// contract baked in. The CC PATCH validates that the value is >= the env-
// driven heartbeat cadence so single missed pings can't flip a mitra
// offline.
const staleSeconds = pingConfig.stale_after_seconds
const stale = await sql`
UPDATE mitra_online_status
SET is_online = false, last_offline_at = NOW(), updated_at = NOW()
WHERE is_online = true
AND last_heartbeat_at < NOW() - ${staleSeconds + ' seconds'}::interval
RETURNING mitra_id
`
for (const row of stale) {
await sql`
INSERT INTO mitra_online_logs (mitra_id, status) VALUES (${row.mitra_id}, 'offline')
`
let onlineIds, heartbeatValues
try {
onlineIds = await valkey.smembers(VK_MITRAS_ONLINE)
if (!onlineIds.length) return 0
const pipe = valkey.pipeline()
for (const id of onlineIds) pipe.get(vkHeartbeatKey(id))
const results = await pipe.exec()
heartbeatValues = results.map((r) => r[1])
} catch (err) {
console.warn('[auto-offline] valkey unavailable, skipping this tick:', err.message)
return 0
}
// Capacity may have changed (mitra went offline) — invalidate the customer-facing
// availability cache so the next poll reflects reality.
if (stale.length > 0) invalidateAvailabilityCache()
const cutoff = Date.now() - pingConfig.stale_after_seconds * 1000
const stale = []
for (let i = 0; i < onlineIds.length; i++) {
const ts = heartbeatValues[i]
if (!ts || Date.parse(ts) < cutoff) stale.push(onlineIds[i])
}
if (!stale.length) return 0
await sql`
UPDATE mitra_online_status
SET is_online = false, last_offline_at = NOW(), updated_at = NOW()
WHERE mitra_id = ANY(${sql.array(stale)}::uuid[]) AND is_online = true
`
for (const id of stale) {
await sql`INSERT INTO mitra_online_logs (mitra_id, status) VALUES (${id}, 'offline')`
}
await tryValkey(async () => {
const cleanup = valkey.pipeline()
cleanup.srem(VK_MITRAS_ONLINE, ...stale)
for (const id of stale) cleanup.del(vkHeartbeatKey(id))
await cleanup.exec()
}, `auto-offline cleanup (${stale.length} stale)`)
invalidateAvailabilityCache()
return stale.length
}
// Batched mirror: Valkey heartbeat timestamps → Postgres `last_heartbeat_at`.
// Runs every HEARTBEAT_MIRROR_INTERVAL_SECONDS (default 60). One UNNEST UPDATE
// regardless of online count. Idempotent — latest timestamp wins; multiple
// instances running concurrently is fine (no leader election needed).
export const mirrorHeartbeatsToPostgres = async () => {
let onlineIds, heartbeatValues
try {
onlineIds = await valkey.smembers(VK_MITRAS_ONLINE)
if (!onlineIds.length) return 0
const pipe = valkey.pipeline()
for (const id of onlineIds) pipe.get(vkHeartbeatKey(id))
const results = await pipe.exec()
heartbeatValues = results.map((r) => r[1])
} catch (err) {
console.warn('[heartbeat-mirror] valkey unavailable, skipping:', err.message)
return 0
}
const ids = []
const ts = []
for (let i = 0; i < onlineIds.length; i++) {
if (heartbeatValues[i]) {
ids.push(onlineIds[i])
ts.push(heartbeatValues[i])
}
}
if (!ids.length) return 0
await sql`
UPDATE mitra_online_status m
SET last_heartbeat_at = u.ts::timestamptz, updated_at = NOW()
FROM (
SELECT * FROM UNNEST(${sql.array(ids)}::uuid[], ${sql.array(ts)}::text[]) AS t(mitra_id, ts)
) u
WHERE m.mitra_id = u.mitra_id
`
return ids.length
}
/**
* Customer-home availability check, cached in-memory for AVAILABILITY_TTL_MS.
*
@@ -178,12 +347,33 @@ export const autoOfflineStaleMitras = async () => {
* sets/hashes (matching the existing memory item "Session Timer Scaling"); the contract
* of this function — Valkey/cache reads only on the hot path — stays the same.
*/
export const countAvailableMitrasFromCache = async () => {
const now = Date.now()
if (availabilityCache && availabilityCache.expiresAt > now) {
return { available: availabilityCache.available, count: availabilityCache.count }
}
const computeAvailabilityFromValkey = async () => {
const { max_customers_per_mitra } = await getMaxCustomersPerMitra()
const { stale_after_seconds } = await getMitraPingConfig()
const candidates = await valkey.sdiff(VK_MITRAS_ONLINE, VK_MITRAS_DEACTIVATED)
if (!candidates.length) return { available: false, count: 0 }
const pipe = valkey.pipeline()
for (const id of candidates) {
pipe.get(vkCapacityKey(id))
pipe.get(vkHeartbeatKey(id))
}
const results = await pipe.exec()
const cutoff = Date.now() - stale_after_seconds * 1000
let count = 0
for (let i = 0; i < candidates.length; i++) {
const capacity = Number(results[i * 2][1] ?? 0)
const heartbeat = results[i * 2 + 1][1]
if (capacity >= max_customers_per_mitra) continue
if (!heartbeat || Date.parse(heartbeat) < cutoff) continue
count++
}
return { available: count > 0, count }
}
const computeAvailabilityFromPostgres = async () => {
const { max_customers_per_mitra } = await getMaxCustomersPerMitra()
const [{ count }] = await sql`
SELECT COUNT(*)::int AS count
@@ -197,26 +387,42 @@ export const countAvailableMitrasFromCache = async () => {
AND cs.status IN (${SessionStatus.ACTIVE}, ${SessionStatus.PENDING_PAYMENT})
) < ${max_customers_per_mitra}
`
return { available: count > 0, count }
}
const available = count > 0
availabilityCache = {
available,
count,
expiresAt: now + AVAILABILITY_TTL_MS,
export const countAvailableMitrasFromCache = async () => {
try {
const cached = await valkey.get(AVAILABILITY_CACHE_KEY)
if (cached) return JSON.parse(cached)
const snapshot = await computeAvailabilityFromValkey()
await valkey.getValkeyClient().setex(AVAILABILITY_CACHE_KEY, AVAILABILITY_TTL_SECONDS, JSON.stringify(snapshot))
return snapshot
} catch (err) {
console.warn('[countAvailableMitras] valkey unavailable, falling back to Postgres:', err.message)
return computeAvailabilityFromPostgres()
}
return { available, count }
}
/**
* Mitra-online check for use during pairing/extension safeguards.
* Combines the Valkey-mirrored online flag (Postgres mitra_online_status today) with
* the WebSocket-connected check. Never use "in-session" as a proxy for "online".
* Mitra-reachable check: the mitra is in the `mitras:online` SET AND its heartbeat is fresh.
* Falls back to a Postgres `is_online` read if Valkey is unreachable; the
* fallback skips the heartbeat-freshness check (the sweep takes care of stale rows
* within `stale_after_seconds + sweep_cadence`).
*/
export const isMitraReachable = async (mitraId) => {
const [row] = await sql`
SELECT is_online FROM mitra_online_status WHERE mitra_id = ${mitraId}
`
return Boolean(row?.is_online)
try {
const inSet = await valkey.sismember(VK_MITRAS_ONLINE, mitraId)
if (!inSet) return false
const heartbeat = await valkey.get(vkHeartbeatKey(mitraId))
if (!heartbeat) return false
const { stale_after_seconds } = await getMitraPingConfig()
return Date.parse(heartbeat) >= Date.now() - stale_after_seconds * 1000
} catch (err) {
console.warn('[isMitraReachable] valkey unavailable, falling back to DB:', err.message)
const [row] = await sql`SELECT is_online FROM mitra_online_status WHERE mitra_id = ${mitraId}`
return Boolean(row?.is_online)
}
}
/**

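The beacon reader in this file (countAvailableMitrasFromCache) follows a read-through shape: serve the shared snapshot if present, recompute and store it with a TTL on a miss, and route to the durable store if any cache-layer call throws. The sketch below isolates that control flow so it can be run without Valkey or Postgres. It is illustrative only and not part of this change: `FakeCache` and `readThroughWithFallback` are hypothetical names, and the in-memory map merely stands in for the Valkey client.

```javascript
// Illustrative sketch only; not part of this diff.
// FakeCache and readThroughWithFallback are hypothetical names.
class FakeCache {
  constructor() {
    this.store = new Map()
    this.down = false // flip to true to simulate a cache outage
  }

  async get(key) {
    if (this.down) throw new Error('cache down')
    const entry = this.store.get(key)
    if (!entry || entry.expiresAt <= Date.now()) return null
    return entry.value
  }

  async setex(key, ttlSeconds, value) {
    if (this.down) throw new Error('cache down')
    this.store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 })
    return 'OK'
  }
}

const readThroughWithFallback = async ({ cache, key, ttlSeconds, compute, fallback }) => {
  try {
    const cached = await cache.get(key)
    if (cached) return JSON.parse(cached)
    // Cache miss: recompute, then store the snapshot for other readers.
    const fresh = await compute()
    await cache.setex(key, ttlSeconds, JSON.stringify(fresh))
    return fresh
  } catch (err) {
    // Any cache-layer error sends the caller to the durable source instead.
    return fallback()
  }
}
```

One consequence of this shape is that the fallback result is never cached, so every call during an outage reaches the durable store; that appears to be the trade-off the diff accepts as well.
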
@@ -1,4 +1,8 @@
import { getDb } from '../db/client.js'
import * as valkey from '../plugins/valkey.js'
import { VK_MITRAS_DEACTIVATED } from './mitra-status.service.js'
import { revokeAllSessionsForUser } from './token.service.js'
import { UserType } from '../constants.js'
const sql = getDb()
@@ -36,6 +40,24 @@ export const updateMitraStatus = async (id, is_active) => {
RETURNING id, is_active
`
if (!mitra) throw Object.assign(new Error('Mitra not found'), { code: 'NOT_FOUND', statusCode: 404 })
// Deactivation also revokes all auth_sessions so the next token refresh fails
// (bounds the "ghost online" window to access-token TTL across all routes,
// not just heartbeat). See requirement/valkey-online-mirror-plan.md.
if (!is_active) {
await revokeAllSessionsForUser({ userType: UserType.MITRA, userId: id })
}
try {
if (is_active) {
await valkey.srem(VK_MITRAS_DEACTIVATED, id)
} else {
await valkey.sadd(VK_MITRAS_DEACTIVATED, id)
}
} catch (err) {
console.error(`[valkey-mirror] updateMitraStatus ${id} failed:`, err)
}
return mitra
}
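
The ordering in updateMitraStatus (durable write first, mirror write wrapped in try/catch) can be reduced to a small helper for illustration. This is a sketch under stated assumptions, not code from this change: `writeThenMirror` and its option names are hypothetical. The point it shows is that a failed durable write never touches the mirror, while a failed mirror write is reported but does not fail the caller.

```javascript
// Illustrative sketch only; not part of this diff.
// writeThenMirror and its option names are hypothetical.
const writeThenMirror = async ({ durableWrite, mirrorWrite, onMirrorError = () => {} }) => {
  // Durable write first: if it throws, the error propagates and the mirror is untouched.
  const result = await durableWrite()
  try {
    await mirrorWrite(result)
  } catch (err) {
    // Best-effort mirror: report the failure and carry on.
    // The durable store is already correct, so the mirror can be re-derived later.
    onMirrorError(err)
  }
  return result
}
```

A swallowed mirror error leaves the two stores temporarily out of step; per the commit message, a reseed from Postgres on startup or reconnect is what closes that gap.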

@@ -1,11 +1,13 @@
import { getDb } from '../db/client.js'
import { getMaxCustomersPerMitra, getPairingBlastTimeoutSeconds, getReturningChatConfirmationTimeoutSeconds } from './config.service.js'
import * as valkey from '../plugins/valkey.js'
import { VK_MITRAS_ONLINE, VK_MITRAS_DEACTIVATED, vkCapacityKey, vkHeartbeatKey } from './mitra-status.service.js'
import { getMaxCustomersPerMitra, getPairingBlastTimeoutSeconds, getReturningChatConfirmationTimeoutSeconds, getMitraPingConfig } from './config.service.js'
import { sendToUser } from '../plugins/websocket.js'
import { sendPushNotification } from './notification.service.js'
import { startSessionTimer } from './session-timer.service.js'
import { startSessionListener } from './chat-handler.service.js'
import { consumePaymentSession, failPaymentSession, getPaymentSession, recordIntermediateFailure } from './payment.service.js'
import { isMitraReachable, isMitraInActiveSessionWithCustomer, getMitraActiveSessionCount } from './mitra-status.service.js'
import { isMitraReachable, isMitraInActiveSessionWithCustomer, getMitraActiveSessionCount, recomputeCapacityForMitra, recomputeCapacityBySession } from './mitra-status.service.js'
import {
UserType,
SessionStatus,
@@ -76,10 +78,37 @@ const notifyCustomer = async (customerId, data) => {
}
}
export const findAvailableMitras = async () => {
// Valkey-driven: SDIFF(online, deactivated) → for each candidate, pipelined
// GET capacity + heartbeat, then filter by capacity gate + heartbeat freshness.
// Postgres fallback runs if any Valkey op throws (full JOIN as before).
const findAvailableMitrasFromValkey = async () => {
const { max_customers_per_mitra } = await getMaxCustomersPerMitra()
const { stale_after_seconds } = await getMitraPingConfig()
const candidates = await valkey.sdiff(VK_MITRAS_ONLINE, VK_MITRAS_DEACTIVATED)
if (!candidates.length) return []
const pipe = valkey.pipeline()
for (const id of candidates) {
pipe.get(vkCapacityKey(id))
pipe.get(vkHeartbeatKey(id))
}
const results = await pipe.exec()
const cutoff = Date.now() - stale_after_seconds * 1000
const eligible = []
for (let i = 0; i < candidates.length; i++) {
const capacity = Number(results[i * 2][1] ?? 0)
const heartbeat = results[i * 2 + 1][1]
if (capacity >= max_customers_per_mitra) continue
if (!heartbeat || Date.parse(heartbeat) < cutoff) continue
eligible.push({ id: candidates[i], active_session_count: capacity })
}
return eligible
}
const findAvailableMitrasFromPostgres = async () => {
const { max_customers_per_mitra } = await getMaxCustomersPerMitra()
// Project active_session_count alongside the mitra row so the blast loop doesn't
// need a per-mitra COUNT roundtrip later.
const mitras = await sql`
SELECT m.id, m.display_name, sub.active_session_count
FROM mitras m
@@ -96,6 +125,15 @@ export const findAvailableMitras = async () => {
return mitras
}
export const findAvailableMitras = async () => {
try {
return await findAvailableMitrasFromValkey()
} catch (err) {
console.warn('[findAvailableMitras] valkey unavailable, falling back to Postgres:', err.message)
return findAvailableMitrasFromPostgres()
}
}
/**
* Validate that a payment session is owned by the customer, confirmed, and not yet consumed.
* Throws on mismatch. Returns the loaded payment session row.
@@ -414,6 +452,10 @@ export const acceptPairingRequest = async (sessionId, mitraId) => {
})
}
// Mitra now occupies a capacity slot (PENDING_PAYMENT counts per
// findAvailableMitras predicate). Mirror to Valkey.
await recomputeCapacityForMitra(mitraId)
// Mark this mitra's notification as accepted
await sql`
UPDATE chat_request_notifications

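The Valkey branch of findAvailableMitras queues two GETs per candidate and then reads the replies back by position: index `i * 2` holds the capacity reply and `i * 2 + 1` the heartbeat reply. The sketch below replays that filter over hand-written data so the pairing is easy to check. It is illustrative only: `selectEligible` is a hypothetical name, and the `[error, value]` tuple shape of each reply is an assumption taken from the `[1]` indexing in the diff rather than from the wrapper's source.

```javascript
// Illustrative sketch only; not part of this diff.
// selectEligible is a hypothetical name; replies are assumed to be [error, value] tuples.
const selectEligible = ({ candidates, results, maxCustomers, staleAfterSeconds, now = Date.now() }) => {
  const cutoff = now - staleAfterSeconds * 1000
  const eligible = []
  for (let i = 0; i < candidates.length; i++) {
    // Two replies per candidate, in the order the GETs were queued.
    const capacity = Number(results[i * 2][1] ?? 0) // missing key counts as 0 sessions
    const heartbeat = results[i * 2 + 1][1]
    if (capacity >= maxCustomers) continue
    if (!heartbeat || Date.parse(heartbeat) < cutoff) continue
    eligible.push({ id: candidates[i], active_session_count: capacity })
  }
  return eligible
}

const sampleNow = Date.parse('2024-01-01T00:01:00.000Z')
const sampleResults = [
  [null, '0'], [null, '2024-01-01T00:00:50.000Z'],  // a: free, fresh heartbeat
  [null, '3'], [null, '2024-01-01T00:00:55.000Z'],  // b: at capacity
  [null, null], [null, '2024-01-01T00:00:00.000Z'], // c: heartbeat older than the cutoff
  [null, '1'], [null, null],                        // d: no heartbeat key
]
const sampleEligible = selectEligible({
  candidates: ['a', 'b', 'c', 'd'],
  results: sampleResults,
  maxCustomers: 3,
  staleAfterSeconds: 30,
  now: sampleNow,
})
console.log(sampleEligible) // only candidate 'a' passes both gates
```
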
@@ -2,6 +2,7 @@ import { getDb } from '../db/client.js'
import { publish } from '../plugins/valkey.js'
import { sendToSessionParticipant } from '../plugins/websocket.js'
import { sendPushNotification } from './notification.service.js'
import { recomputeCapacityForMitra } from './mitra-status.service.js'
import { UserType, SessionStatus, WsMessage, EndedBy } from '../constants.js'
const sql = getDb()
@@ -152,6 +153,7 @@ const onSessionExpired = async (sessionId) => {
RETURNING id, customer_id, mitra_id
`
if (!session) return
await recomputeCapacityForMitra(session.mitra_id)
// Notify customer — sees extend/close dialog; FCM fallback if WebSocket is down
const expiredData = { type: WsMessage.SESSION_EXPIRED, session_id: sessionId }
@@ -207,9 +209,10 @@ const autoCompleteIfStillClosing = async (sessionId) => {
ended_at = COALESCE(ended_at, NOW()),
ended_by = ${EndedBy.SYSTEM_AUTO_CLOSE}
WHERE id = ${sessionId} AND status = ${SessionStatus.CLOSING}
RETURNING id
RETURNING id, mitra_id
`
if (!updated) return
await recomputeCapacityForMitra(updated.mitra_id)
const data = { type: WsMessage.SESSION_COMPLETED, session_id: sessionId }
sendToSessionParticipant(sessionId, UserType.CUSTOMER, data)

@@ -1,5 +1,6 @@
import { getDb } from '../db/client.js'
import { publish } from '../plugins/valkey.js'
import { recomputeCapacityForMitra } from './mitra-status.service.js'
import { UserType, SessionStatus, MessageStatus, WsMessage } from '../constants.js'
const sql = getDb()
@@ -48,6 +49,7 @@ export const endSession = async (sessionId, endedBy, userId) => {
code: 'SESSION_NOT_ACTIVE', statusCode: 409,
})
}
await recomputeCapacityForMitra(session.mitra_id)
// Notify both parties
await publish(`session:${sessionId}:status`, {
@@ -91,6 +93,11 @@ export const rerouteSession = async (sessionId, newMitraId) => {
WHERE id = ${sessionId}
RETURNING id, customer_id, mitra_id, status
`
// Both mitras' capacity flipped — recompute both.
await Promise.all([
recomputeCapacityForMitra(oldMitraId),
recomputeCapacityForMitra(newMitraId),
])
const [newMitra] = await sql`
SELECT display_name FROM mitras WHERE id = ${newMitraId}

@@ -1,4 +1,5 @@
import { getDb } from '../../src/db/client.js'
import { flushTestDb } from './valkey.js'
/**
* Single shared sql client used by tests. Same singleton the services use, since
@@ -37,6 +38,9 @@ export const resetDb = async () => {
const sql = db()
// RESTART IDENTITY is a no-op for UUID PKs but cheap; CASCADE handles any future FK additions.
await sql.unsafe(`TRUNCATE TABLE ${TRUNCATE_TABLES.join(', ')} RESTART IDENTITY CASCADE`)
// Flush Valkey availability state so each test starts hermetic. Fixtures
// (createMitra etc.) re-seed Valkey alongside their Postgres writes.
await flushTestDb()
}
/**

@@ -1,5 +1,6 @@
import { randomUUID } from 'node:crypto'
import { db, resetAppConfig } from './db.js'
import { getTestValkey } from './valkey.js'
/**
* Insert a customer row. Defaults to the schema after the Phase 3.4 auth rewrite
@@ -47,6 +48,19 @@ export const createMitra = async ({
ON CONFLICT (mitra_id) DO UPDATE
SET is_online = true, last_online_at = ${now}, last_heartbeat_at = ${now}, updated_at = ${now}
`
// Mirror to Valkey so findAvailableMitras (Valkey-driven) sees this mitra.
// resetDb already FLUSHDBs Valkey, so seeding here per-mitra keeps tests
// hermetic without depending on production's startup seed.
const v = getTestValkey()
await v.multi()
.sadd('mitras:online', id)
.set(`mitra:heartbeat:${id}`, now.toISOString())
.set(`mitra:capacity:${id}`, 0)
.exec()
}
if (!isActive) {
const v = getTestValkey()
await v.sadd('mitras:deactivated', id)
}
return row
}

@@ -0,0 +1,494 @@
import { describe, it, expect, beforeAll, beforeEach, afterAll, vi } from 'vitest'
/**
* Integration tests for the Valkey availability mirror
* (requirement/valkey-online-mirror-plan.md).
*
* Real Postgres + real Valkey via test/setup.js — no mocks. We assert on both
* stores' state after each operation to catch missed mirrors or order bugs.
*/
vi.mock('../../src/plugins/websocket.js', () => ({
sendToUser: vi.fn(() => false),
sendToSessionParticipant: vi.fn(() => false),
registerWebSocketPlugin: vi.fn(),
registerWebSocketRoute: vi.fn(),
isUserOnlineWs: vi.fn(() => false),
getSessionConnections: vi.fn(() => ({})),
}))
vi.mock('../../src/services/notification.service.js', () => ({
sendPushNotification: vi.fn(async () => true),
registerDeviceToken: vi.fn(async () => {}),
}))
const valkey = await import('../../src/plugins/valkey.js')
const {
setOnline,
setOffline,
heartbeat,
isMitraReachable,
recomputeCapacityForMitra,
recomputeCapacityBySession,
seedFromPostgres,
autoOfflineStaleMitras,
mirrorHeartbeatsToPostgres,
countAvailableMitrasFromCache,
invalidateAvailabilityCache,
VK_MITRAS_ONLINE,
VK_MITRAS_DEACTIVATED,
vkCapacityKey,
vkHeartbeatKey,
} = await import('../../src/services/mitra-status.service.js')
const { updateMitraStatus } = await import('../../src/services/mitra.service.js')
const { findAvailableMitras, acceptPairingRequest, createPairingRequest } = await import('../../src/services/pairing.service.js')
const { createPaymentSession, confirmPaymentSession } = await import('../../src/services/payment.service.js')
const { db, resetDb, resetAppConfig } = await import('../helpers/db.js')
const { getTestValkey } = await import('../helpers/valkey.js')
const { createCustomer, createMitra } = await import('../helpers/fixtures.js')
const { SessionStatus, UserType } = await import('../../src/constants.js')
const v = () => getTestValkey()
describe('mitra-status valkey mirror', () => {
beforeAll(async () => {
await resetAppConfig()
})
beforeEach(async () => {
await resetDb()
})
// ---------- Seed ----------
describe('seedFromPostgres', () => {
it('populates mitras:online from is_online=true rows', async () => {
const m1 = await createMitra({ callName: 'M1', isOnline: true })
const m2 = await createMitra({ callName: 'M2', isOnline: true })
await createMitra({ callName: 'M3', isOnline: false })
await v().flushdb()
await seedFromPostgres()
const members = await v().smembers(VK_MITRAS_ONLINE)
expect(members.sort()).toEqual([m1.id, m2.id].sort())
})
it('seeds mitras:deactivated from is_active=false', async () => {
const m = await createMitra({ callName: 'Dead', isActive: false })
await createMitra({ callName: 'Alive', isActive: true })
await v().flushdb()
await seedFromPostgres()
const members = await v().smembers(VK_MITRAS_DEACTIVATED)
expect(members).toEqual([m.id])
})
it('seeds heartbeat keys for online mitras with current timestamp', async () => {
const m = await createMitra({ callName: 'Live', isOnline: true })
await v().flushdb()
const before = Date.now()
await seedFromPostgres()
const after = Date.now()
const ts = await v().get(vkHeartbeatKey(m.id))
expect(ts).toBeTruthy()
const seeded = Date.parse(ts)
expect(seeded).toBeGreaterThanOrEqual(before)
expect(seeded).toBeLessThanOrEqual(after)
})
it('seeds capacity counters from chat_sessions', async () => {
const c = await createCustomer({ callName: 'C' })
const m = await createMitra({ callName: 'M', isOnline: true })
const sql = db()
await sql`
INSERT INTO chat_sessions (customer_id, mitra_id, status)
VALUES (${c.id}, ${m.id}, ${SessionStatus.ACTIVE})
`
await v().flushdb()
await seedFromPostgres()
expect(await v().get(vkCapacityKey(m.id))).toBe('1')
})
it('is idempotent — running twice yields the same state', async () => {
const m = await createMitra({ callName: 'Idem', isOnline: true })
await seedFromPostgres()
const first = {
online: (await v().smembers(VK_MITRAS_ONLINE)).sort(),
heartbeat: await v().get(vkHeartbeatKey(m.id)),
}
await seedFromPostgres()
const second = {
online: (await v().smembers(VK_MITRAS_ONLINE)).sort(),
heartbeat: await v().get(vkHeartbeatKey(m.id)),
}
expect(second.online).toEqual(first.online)
// Heartbeat is reseeded with NOW each call — must be >= first
expect(Date.parse(second.heartbeat)).toBeGreaterThanOrEqual(Date.parse(first.heartbeat))
})
})
// ---------- setOnline / setOffline ----------
describe('setOnline / setOffline write-through', () => {
it('setOnline adds to mitras:online + writes heartbeat key', async () => {
const m = await createMitra({ callName: 'Toggle', isOnline: false })
await v().flushdb()
await setOnline(m.id)
expect(await v().sismember(VK_MITRAS_ONLINE, m.id)).toBe(1)
expect(await v().get(vkHeartbeatKey(m.id))).toBeTruthy()
})
it('setOffline removes from mitras:online + deletes heartbeat key', async () => {
const m = await createMitra({ callName: 'Toggle', isOnline: true })
await setOnline(m.id) // ensure heartbeat key exists
await setOffline(m.id)
expect(await v().sismember(VK_MITRAS_ONLINE, m.id)).toBe(0)
expect(await v().get(vkHeartbeatKey(m.id))).toBeNull()
})
it('setOffline is no-op when mitra was already offline', async () => {
const m = await createMitra({ callName: 'OffAlready', isOnline: false })
const sql = db()
const beforeLogs = await sql`SELECT COUNT(*)::int AS c FROM mitra_online_logs WHERE mitra_id=${m.id}`
await setOffline(m.id)
const afterLogs = await sql`SELECT COUNT(*)::int AS c FROM mitra_online_logs WHERE mitra_id=${m.id}`
expect(afterLogs[0].c).toBe(beforeLogs[0].c)
})
})
// ---------- heartbeat ----------
describe('heartbeat (Valkey-only)', () => {
it('writes Valkey timestamp without touching Postgres last_heartbeat_at', async () => {
const m = await createMitra({ callName: 'Pinger', isOnline: true })
const sql = db()
const [before] = await sql`SELECT last_heartbeat_at FROM mitra_online_status WHERE mitra_id=${m.id}`
const pgBefore = before.last_heartbeat_at
// Make sure subsequent NOW() would differ
await new Promise(r => setTimeout(r, 50))
await heartbeat(m.id)
const [after] = await sql`SELECT last_heartbeat_at FROM mitra_online_status WHERE mitra_id=${m.id}`
// Postgres untouched
expect(after.last_heartbeat_at).toEqual(pgBefore)
// Valkey updated
const ts = await v().get(vkHeartbeatKey(m.id))
expect(ts).toBeTruthy()
expect(Date.parse(ts)).toBeGreaterThan(pgBefore.getTime())
})
it('advances the heartbeat timestamp on each call', async () => {
const m = await createMitra({ callName: 'P', isOnline: true })
await heartbeat(m.id)
const t1 = await v().get(vkHeartbeatKey(m.id))
await new Promise(r => setTimeout(r, 20))
await heartbeat(m.id)
const t2 = await v().get(vkHeartbeatKey(m.id))
expect(Date.parse(t2)).toBeGreaterThan(Date.parse(t1))
})
})
// ---------- isMitraReachable ----------
describe('isMitraReachable', () => {
it('returns true for online mitra with fresh heartbeat', async () => {
const m = await createMitra({ callName: 'Reach', isOnline: true })
expect(await isMitraReachable(m.id)).toBe(true)
})
it('returns false when mitra is not in mitras:online', async () => {
const m = await createMitra({ callName: 'NoReach', isOnline: false })
expect(await isMitraReachable(m.id)).toBe(false)
})
it('returns false when heartbeat is stale', async () => {
const m = await createMitra({ callName: 'Stale', isOnline: true })
// Force stale heartbeat (one hour ago)
const ancient = new Date(Date.now() - 3_600_000).toISOString()
await v().set(vkHeartbeatKey(m.id), ancient)
expect(await isMitraReachable(m.id)).toBe(false)
})
})
// ---------- recomputeCapacity ----------
describe('recomputeCapacityForMitra', () => {
it('counts ACTIVE + PENDING_PAYMENT sessions', async () => {
const c = await createCustomer({ callName: 'C' })
const c2 = await createCustomer({ callName: 'C2' })
const m = await createMitra({ callName: 'Cap', isOnline: true })
const sql = db()
await sql`
INSERT INTO chat_sessions (customer_id, mitra_id, status)
VALUES (${c.id}, ${m.id}, ${SessionStatus.ACTIVE}),
(${c2.id}, ${m.id}, ${SessionStatus.PENDING_PAYMENT})
`
await recomputeCapacityForMitra(m.id)
expect(await v().get(vkCapacityKey(m.id))).toBe('2')
})
it('excludes COMPLETED sessions', async () => {
const c = await createCustomer({ callName: 'C' })
const m = await createMitra({ callName: 'Cap', isOnline: true })
const sql = db()
await sql`
INSERT INTO chat_sessions (customer_id, mitra_id, status)
VALUES (${c.id}, ${m.id}, ${SessionStatus.COMPLETED})
`
await recomputeCapacityForMitra(m.id)
expect(await v().get(vkCapacityKey(m.id))).toBe('0')
})
it('no-op when mitraId is null/undefined', async () => {
await recomputeCapacityForMitra(null) // should not throw
await recomputeCapacityForMitra(undefined)
})
})
// ---------- findAvailableMitras ----------
describe('findAvailableMitras (Valkey-driven)', () => {
it('returns online + not-deactivated + under-capacity + fresh-heartbeat mitras', async () => {
const ok = await createMitra({ callName: 'OK', isOnline: true })
const deact = await createMitra({ callName: 'Deact', isOnline: true, isActive: false })
const offline = await createMitra({ callName: 'Off', isOnline: false })
const stale = await createMitra({ callName: 'Stale', isOnline: true })
await v().set(vkHeartbeatKey(stale.id), new Date(Date.now() - 3_600_000).toISOString())
const result = await findAvailableMitras()
const ids = result.map(r => r.id).sort()
expect(ids).toEqual([ok.id].sort())
expect(result.find(r => r.id === ok.id).active_session_count).toBe(0)
})
it('excludes a mitra whose capacity is at max', async () => {
const m = await createMitra({ callName: 'AtCap', isOnline: true })
// max_customers_per_mitra default is 3
await v().set(vkCapacityKey(m.id), 3)
const result = await findAvailableMitras()
expect(result.find(r => r.id === m.id)).toBeUndefined()
})
it('returns capacity in the result for the blast loop', async () => {
const m = await createMitra({ callName: 'WithCap', isOnline: true })
await v().set(vkCapacityKey(m.id), 2)
const result = await findAvailableMitras()
expect(result.find(r => r.id === m.id).active_session_count).toBe(2)
})
})
// ---------- countAvailableMitrasFromCache ----------
describe('countAvailableMitrasFromCache (beacon)', () => {
it('caches the snapshot in Valkey with TTL', async () => {
await createMitra({ callName: 'On', isOnline: true })
await v().del('availability:snapshot')
const first = await countAvailableMitrasFromCache()
expect(first.available).toBe(true)
expect(first.count).toBe(1)
const cached = await v().get('availability:snapshot')
expect(cached).toBeTruthy()
expect(JSON.parse(cached)).toEqual(first)
const ttl = await v().ttl('availability:snapshot')
expect(ttl).toBeGreaterThan(0)
expect(ttl).toBeLessThanOrEqual(10)
})
it('returns cached snapshot on subsequent calls without recompute', async () => {
await createMitra({ callName: 'On', isOnline: true })
await countAvailableMitrasFromCache() // primes cache
// Wipe the mirror state and plant a sentinel snapshot to prove the next call reads the cache instead of recomputing
await v().flushdb()
await v().set('availability:snapshot', JSON.stringify({ available: true, count: 42 }), 'EX', 10)
const result = await countAvailableMitrasFromCache()
expect(result.count).toBe(42)
})
it('invalidateAvailabilityCache deletes the snapshot', async () => {
await v().set('availability:snapshot', JSON.stringify({ available: true, count: 1 }), 'EX', 10)
await invalidateAvailabilityCache()
expect(await v().get('availability:snapshot')).toBeNull()
})
})
// ---------- autoOfflineStaleMitras ----------
describe('autoOfflineStaleMitras', () => {
it('flips Postgres + cleans Valkey for mitras with stale heartbeat', async () => {
const m = await createMitra({ callName: 'WillStale', isOnline: true })
const sql = db()
// Force stale heartbeat
await v().set(vkHeartbeatKey(m.id), new Date(Date.now() - 3_600_000).toISOString())
const count = await autoOfflineStaleMitras()
expect(count).toBe(1)
const [row] = await sql`SELECT is_online FROM mitra_online_status WHERE mitra_id=${m.id}`
expect(row.is_online).toBe(false)
expect(await v().sismember(VK_MITRAS_ONLINE, m.id)).toBe(0)
expect(await v().get(vkHeartbeatKey(m.id))).toBeNull()
const [log] = await sql`
SELECT status FROM mitra_online_logs
WHERE mitra_id=${m.id} ORDER BY timestamp DESC LIMIT 1
`
expect(log.status).toBe('offline')
})
it('no-op when no mitras are stale', async () => {
await createMitra({ callName: 'Fresh', isOnline: true })
const count = await autoOfflineStaleMitras()
expect(count).toBe(0)
})
it('no-op when require_ping=false', async () => {
const sql = db()
await sql`
UPDATE app_config SET value=${sql.json({ value: false })}
WHERE key='require_mitra_ping'
`
const m = await createMitra({ callName: 'WouldBeStale', isOnline: true })
await v().set(vkHeartbeatKey(m.id), new Date(Date.now() - 3_600_000).toISOString())
const count = await autoOfflineStaleMitras()
expect(count).toBe(0)
// Restore for other tests
await sql`
UPDATE app_config SET value=${sql.json({ value: true })}
WHERE key='require_mitra_ping'
`
})
})
// ---------- mirrorHeartbeatsToPostgres ----------
describe('mirrorHeartbeatsToPostgres', () => {
it('writes Valkey heartbeat timestamps to Postgres last_heartbeat_at in one batch', async () => {
const m1 = await createMitra({ callName: 'P1', isOnline: true })
const m2 = await createMitra({ callName: 'P2', isOnline: true })
const sql = db()
const ts = new Date(Date.now() - 2_000).toISOString()
await v().set(vkHeartbeatKey(m1.id), ts)
await v().set(vkHeartbeatKey(m2.id), ts)
const count = await mirrorHeartbeatsToPostgres()
expect(count).toBe(2)
const rows = await sql`
SELECT mitra_id, last_heartbeat_at FROM mitra_online_status
WHERE mitra_id IN (${m1.id}, ${m2.id})
`
for (const row of rows) {
expect(row.last_heartbeat_at.toISOString()).toBe(ts)
}
})
it('no-op when no mitras are online', async () => {
await v().del(VK_MITRAS_ONLINE)
const count = await mirrorHeartbeatsToPostgres()
expect(count).toBe(0)
})
})
// ---------- updateMitraStatus / revokeAllSessions ----------
describe('updateMitraStatus + auth_session revocation', () => {
it('deactivation adds to mitras:deactivated AND revokes all auth_sessions', async () => {
const m = await createMitra({ callName: 'Banned', isActive: true })
const sql = db()
const tokenHash = '$2b$10$abcdefghijklmnopqrstuv'
await sql`
INSERT INTO auth_sessions (user_type, user_id, refresh_token_hash, expires_at)
VALUES (${UserType.MITRA}, ${m.id}, ${tokenHash}, NOW() + INTERVAL '30 days')
`
await updateMitraStatus(m.id, false)
expect(await v().sismember(VK_MITRAS_DEACTIVATED, m.id)).toBe(1)
const [auth] = await sql`SELECT revoked_at FROM auth_sessions WHERE user_id=${m.id}`
expect(auth.revoked_at).not.toBeNull()
})
it('reactivation removes from mitras:deactivated', async () => {
const m = await createMitra({ callName: 'Pardoned', isActive: false })
await v().sadd(VK_MITRAS_DEACTIVATED, m.id)
await updateMitraStatus(m.id, true)
expect(await v().sismember(VK_MITRAS_DEACTIVATED, m.id)).toBe(0)
})
})
// ---------- E2E: blast lifecycle ----------
describe('end-to-end: blast lifecycle drives capacity counter', () => {
it('mitra accept → capacity++; session end is covered separately', async () => {
const c = await createCustomer({ callName: 'BlastC' })
const m = await createMitra({ callName: 'BlastM', isOnline: true })
const pay = await createPaymentSession({
customerId: c.id,
durationMinutes: 15,
amount: 30000,
})
await confirmPaymentSession(pay.id, c.id)
const session = await createPairingRequest(c.id, { paymentRequestId: pay.id })
expect(await v().get(vkCapacityKey(m.id))).toBe('0') // no accept yet
await acceptPairingRequest(session.id, m.id)
expect(await v().get(vkCapacityKey(m.id))).toBe('1')
})
})
// ---------- Reader fallback when Valkey is unavailable ----------
describe('reader fallback', () => {
it('isMitraReachable falls back to Postgres on Valkey error', async () => {
const m = await createMitra({ callName: 'Fallback', isOnline: true })
// Stub sismember to throw
const spy = vi.spyOn(valkey, 'sismember').mockRejectedValue(new Error('valkey down'))
try {
// Postgres has is_online=true → fallback returns true
const result = await isMitraReachable(m.id)
expect(result).toBe(true)
} finally {
spy.mockRestore()
}
})
it('findAvailableMitras falls back to Postgres JOIN when Valkey sdiff throws', async () => {
const m = await createMitra({ callName: 'FallbackBlast', isOnline: true })
const spy = vi.spyOn(valkey, 'sdiff').mockRejectedValue(new Error('valkey down'))
try {
const result = await findAvailableMitras()
expect(result.find(r => r.id === m.id)).toBeDefined()
} finally {
spy.mockRestore()
}
})
})
})

@@ -1,4 +1,5 @@
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'
import { randomUUID } from 'node:crypto'
// Capture calls to sendToSessionParticipant so we can assert the 3-min warning event.
vi.mock('../../src/plugins/websocket.js', () => ({
@@ -15,10 +16,42 @@ vi.mock('../../src/services/notification.service.js', () => ({
registerDeviceToken: vi.fn(async () => {}),
}))
vi.mock('../../src/plugins/valkey.js', () => ({
publish: vi.fn(async () => {}),
subscribe: vi.fn(() => () => {}),
}))
// Real DB queries don't settle under fake timers (they're real socket I/O, not
// microtasks). Stub getDb() with a tagged-template-compatible mock so onThreeMinuteWarning's
// `SELECT expires_at FROM chat_sessions WHERE id = ${sessionId}` resolves as a microtask with no I/O.
vi.mock('../../src/db/client.js', () => {
const fakeSql = () => Promise.resolve([{ expires_at: null }])
fakeSql.unsafe = () => Promise.resolve([])
fakeSql.array = (arr) => arr
fakeSql.json = (v) => v
return { getDb: () => fakeSql }
})
vi.mock('../../src/plugins/valkey.js', () => {
const noopPipeline = { sadd: () => noopPipeline, srem: () => noopPipeline, set: () => noopPipeline, get: () => noopPipeline, del: () => noopPipeline, exec: async () => [] }
return {
publish: vi.fn(async () => {}),
subscribe: vi.fn(() => () => {}),
onValkeyReady: vi.fn(),
getValkeyClient: vi.fn(() => ({ setex: vi.fn(async () => 'OK') })),
getValkeyPub: vi.fn(),
getValkeySub: vi.fn(),
sadd: vi.fn(async () => 1),
srem: vi.fn(async () => 1),
sismember: vi.fn(async () => false),
smembers: vi.fn(async () => []),
sdiff: vi.fn(async () => []),
scard: vi.fn(async () => 0),
set: vi.fn(async () => 'OK'),
get: vi.fn(async () => null),
del: vi.fn(async () => 1),
incr: vi.fn(async () => 1),
decr: vi.fn(async () => 0),
exists: vi.fn(async () => 0),
pipeline: vi.fn(() => noopPipeline),
multi: vi.fn(() => noopPipeline),
}
})
const { sendToSessionParticipant } = await import('../../src/plugins/websocket.js')
const { startSessionTimer, clearSessionTimer } = await import('../../src/services/session-timer.service.js')
@@ -35,7 +68,9 @@ describe('session-timer 3-minute warning (Phase 4)', () => {
})
it('emits session_warning kind:three_minutes_left exactly once at the 3-min mark', async () => {
const sessionId = 'sess-3min-test'
// Real UUID — onThreeMinuteWarning runs a Postgres SELECT against chat_sessions.id
// which is uuid-typed; string ids throw a parse error before we hit the row check.
const sessionId = randomUUID()
const expiresAt = new Date(Date.now() + 5 * 60_000) // 5 minutes from now
startSessionTimer(sessionId, expiresAt)
@@ -65,7 +100,7 @@ describe('session-timer 3-minute warning (Phase 4)', () => {
})
it('does NOT re-fire the 3-min warning when the timer is rescheduled (e.g. extension)', async () => {
const sessionId = 'sess-rescheduled'
const sessionId = randomUUID()
const initial = new Date(Date.now() + 5 * 60_000)
startSessionTimer(sessionId, initial)