halobestie-clone/backend/src/services/config.service.js
Ramadhan Sjamsani 553dbac52f Phase 6: Valkey availability mirror — move read path off Postgres
Mitra-availability state (online flag, deactivated flag, per-mitra session
count, heartbeat liveness) mirrored into Valkey so the customer beacon
+ pairing blast + dashboard counts no longer hit Postgres on the hot path.
Postgres remains the durable source of truth; Valkey state is fully
derivable via seedFromPostgres on startup + reconnect.

Schema
- mitras:online           SET    — mirror of is_online
- mitras:deactivated      SET    — mirror of is_active=false
- mitra:capacity:<id>     STRING — active+pending_payment session count
- mitra:heartbeat:<id>    STRING — ISO timestamp of last ping
- availability:snapshot   JSON   — beacon cache, TTL 10s, cluster-shared
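
The key layout above can be captured in a small set of key-builder helpers. This is a minimal sketch, not the repo's actual module: the `KEYS` object and `SNAPSHOT_TTL_SECONDS` constant are hypothetical names, and the heartbeat key is assumed to follow the `mitra:heartbeat:<id>` pattern.

```javascript
// Hypothetical key helpers mirroring the schema listed above.
const KEYS = {
  online: 'mitras:online',           // SET of online mitra ids
  deactivated: 'mitras:deactivated', // SET of deactivated mitra ids
  capacity: (mitraId) => `mitra:capacity:${mitraId}`,   // STRING session count
  heartbeat: (mitraId) => `mitra:heartbeat:${mitraId}`, // STRING ISO timestamp
  snapshot: 'availability:snapshot', // JSON beacon cache
}

// TTL stated in the schema for the beacon cache.
const SNAPSHOT_TTL_SECONDS = 10
```

Centralising the key names in one place keeps readers, writers, and sweeps from drifting apart on spelling.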

Write paths (Postgres first, best-effort Valkey)
- setOnline/setOffline mirror SADD/SREM + heartbeat SET/DEL
- updateMitraStatus mirrors mitras:deactivated AND revokes auth_sessions
  on deactivate (bounds the "ghost online" window to access-token TTL)
- heartbeat is Valkey-only on the hot path; the per-ping Postgres UPDATE
  on last_heartbeat_at is eliminated (was 1,200 ops/min at prod scale)
- chat_session lifecycle (accept/end/reroute/extension/expiry) calls
  recomputeCapacityForMitra after each UPDATE — derive-from-truth avoids
  the bookkeeping risk of per-transition INCR/DECR
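
The "Postgres first, best-effort Valkey" ordering can be sketched roughly as follows. This is an illustration under assumptions: `db.markOnline`, the injected `valkey` client, and `logger` are hypothetical stand-ins, not the service's real dependencies.

```javascript
// Sketch only: all three dependencies are injected, hypothetical objects.
const setOnlineSketch = async ({ db, valkey, logger }, mitraId, nowIso) => {
  // 1. Durable write first. If this throws, nothing is mirrored.
  await db.markOnline(mitraId)

  // 2. Best-effort mirror. A Valkey failure is logged, never rethrown,
  //    so the caller still sees a successful status change.
  try {
    await valkey.sadd('mitras:online', mitraId)
    await valkey.set(`mitra:heartbeat:${mitraId}`, nowIso)
  } catch (err) {
    logger.warn(`valkey mirror failed for ${mitraId}: ${err.message}`)
  }
}
```

Any drift left behind by a swallowed Valkey error is what the periodic reseed sweep is meant to repair.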

Read paths (Valkey-first, Postgres fallback on Valkey error)
- isMitraReachable: SISMEMBER mitras:online + heartbeat freshness
- findAvailableMitras: SDIFF + pipelined GETs, filter by capacity + heartbeat
- countAvailableMitrasFromCache: Valkey-driven, cached cluster-wide 10s TTL
- dashboard online count: SCARD
- Each reader wraps Valkey ops in try/catch → Postgres fallback on outage
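
A Valkey-first reader with a Postgres fallback might look like the sketch below. The `valkey` client, the `pgFallback` callback, and both function names are hypothetical; only the key names and the SISMEMBER-plus-freshness rule come from the notes above.

```javascript
// Pure freshness check: true when the last ping is within the stale window.
const isHeartbeatFresh = (isoTimestamp, staleAfterSeconds, nowMs = Date.now()) => {
  if (!isoTimestamp) return false
  const lastMs = Date.parse(isoTimestamp)
  if (Number.isNaN(lastMs)) return false
  return nowMs - lastMs <= staleAfterSeconds * 1000
}

// Sketch only: `valkey` and `pgFallback` are injected, hypothetical dependencies.
const isMitraReachableSketch = async ({ valkey, pgFallback }, mitraId, staleAfterSeconds) => {
  try {
    const online = await valkey.sismember('mitras:online', mitraId)
    if (!online) return false
    const lastPing = await valkey.get(`mitra:heartbeat:${mitraId}`)
    return isHeartbeatFresh(lastPing, staleAfterSeconds)
  } catch {
    // Valkey outage: answer from Postgres instead of failing the request.
    return pgFallback(mitraId)
  }
}
```

Keeping the freshness rule as a pure function makes it easy to unit-test without a running Valkey.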

Heartbeat path on /api/mitra/status/heartbeat
- resolveMitra preHandler replaced with heartbeatGuard: SISMEMBER on
  mitras:deactivated (~0 DB hits per ping). Falls back to full DB
  resolveMitra if Valkey is unreachable so a Valkey outage doesn't
  silently accept heartbeats from deactivated mitras.
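
The guard's fast path and its fallback can be sketched as below. This is illustrative only: `heartbeatGuardSketch`, the injected `valkey` client, and `resolveMitraFromDb` are hypothetical names, and the return shape is invented for the example rather than taken from the real preHandler.

```javascript
// Sketch only: both dependencies are injected, hypothetical stand-ins.
const heartbeatGuardSketch = async ({ valkey, resolveMitraFromDb }, mitraId) => {
  try {
    // Fast path: one set-membership check, no Postgres query.
    const deactivated = await valkey.sismember('mitras:deactivated', mitraId)
    return { allowed: !deactivated, source: 'valkey' }
  } catch {
    // Valkey unreachable: consult the database rather than accept blindly.
    const mitra = await resolveMitraFromDb(mitraId)
    return { allowed: Boolean(mitra && mitra.is_active), source: 'postgres' }
  }
}
```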

Three sweeps, env-configurable cadences
- MITRA_AUTO_OFFLINE_SWEEP_SECONDS (30) — Valkey-driven stale detection
- HEARTBEAT_MIRROR_INTERVAL_SECONDS (60) — batched UPSERT writes
  Valkey timestamps to Postgres last_heartbeat_at via UNNEST (1 statement
  per cycle, idempotent across instances)
- VALKEY_ONLINE_MIRROR_SWEEP_SECONDS (300) — periodic reseed heals drift
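
The batched heartbeat mirror relies on passing two parallel arrays to UNNEST so that one statement covers every mitra. The sketch below shows one plausible shape. The table name `mitras`, the `uuid` id type, the use of a plain UPDATE, and the helper name `toUnnestParams` are all assumptions, not the repo's actual statement.

```javascript
// Assumed SQL shape: one statement per cycle, safe to re-run with the same data.
const MIRROR_SQL = `
  UPDATE mitras AS m
  SET last_heartbeat_at = v.ts
  FROM UNNEST($1::uuid[], $2::timestamptz[]) AS v(id, ts)
  WHERE m.id = v.id
`

// Turn a { mitraId: isoTimestamp } map into the two parallel arrays UNNEST expects.
const toUnnestParams = (heartbeatsById) => {
  const ids = []
  const timestamps = []
  for (const [id, iso] of Object.entries(heartbeatsById)) {
    // Skip missing or malformed timestamps instead of failing the whole batch.
    if (!iso || Number.isNaN(Date.parse(iso))) continue
    ids.push(id)
    timestamps.push(iso)
  }
  return [ids, timestamps]
}
```

Because the statement only assigns the latest known timestamp, two instances running the same cycle write the same values, which is what makes the mirror idempotent across instances.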

Startup
- restoreActiveTimers → seedFromPostgres → bind listeners
- onValkeyReady re-runs the seed on every reconnect (cold start + reseed
  on Valkey restart, no manual intervention)
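
Because Valkey state is derived entirely from Postgres, a seed step can be reduced to a pure transformation of rows into the sets and counters to write. The following is a sketch under assumptions: `buildSeedPlan` is a hypothetical helper, and the row field `open_sessions` is an invented name for the active plus pending_payment session count.

```javascript
// Sketch only: the row shape is assumed, not taken from the real query.
const buildSeedPlan = (rows) => {
  const plan = { online: [], deactivated: [], capacity: {} }
  for (const row of rows) {
    if (row.is_online) plan.online.push(row.id)                // -> mitras:online
    if (row.is_active === false) plan.deactivated.push(row.id) // -> mitras:deactivated
    plan.capacity[row.id] = row.open_sessions ?? 0             // -> mitra:capacity:<id>
  }
  return plan
}
```

Running the same transformation on startup and on every reconnect is what lets a restarted Valkey recover without manual intervention.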

Failure semantics
- Read fallback: every Valkey read wrapped, falls back to existing
  Postgres JOIN query — system stays correct during Valkey outage,
  performance degrades not breaks
- Write best-effort: Postgres write commits before Valkey is touched;
  Valkey errors log + continue; reconciliation sweep heals drift
- Auto-offline sweep aborts entirely on Valkey error (does NOT mass-
  offline via Postgres scan during Valkey hiccup)
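
The abort-on-error rule for the auto-offline sweep can be illustrated as follows. This is a sketch only: the function name, the injected `valkey`, `setOffline`, and `logger` dependencies, and the choice to treat a missing heartbeat as stale are all assumptions.

```javascript
// Sketch only: dependencies are injected, hypothetical stand-ins.
const autoOfflineSweepSketch = async (
  { valkey, setOffline, logger },
  staleAfterSeconds,
  nowMs = Date.now()
) => {
  let onlineIds
  let lastPings
  try {
    onlineIds = await valkey.smembers('mitras:online')
    lastPings = await Promise.all(onlineIds.map((id) => valkey.get(`mitra:heartbeat:${id}`)))
  } catch (err) {
    // Do not guess from Postgres during a Valkey hiccup: skip this cycle entirely.
    logger.warn(`auto-offline sweep aborted: ${err.message}`)
    return { aborted: true, offlined: [] }
  }

  const offlined = []
  for (let i = 0; i < onlineIds.length; i += 1) {
    const lastMs = lastPings[i] ? Date.parse(lastPings[i]) : NaN
    // Assumption: a missing or unparseable heartbeat counts as stale.
    const stale = Number.isNaN(lastMs) || nowMs - lastMs > staleAfterSeconds * 1000
    if (stale) {
      await setOffline(onlineIds[i])
      offlined.push(onlineIds[i])
    }
  }
  return { aborted: false, offlined }
}
```

Returning early keeps a transient outage from marking every mitra offline at once; the next successful cycle picks up any mitra that was stale in the meantime.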

Tests
- New: 32 integration tests in mitra-status.valkey-mirror.test.js
  covering seed, write-through, fallbacks, capacity lifecycle,
  auto-offline sweep, heartbeat mirror, deactivation flow, beacon cache
- Updated: fixtures.js seeds Valkey alongside Postgres when isOnline=true
- Updated: helpers/db.js resetDb also flushes test Valkey
- Fixed 2 pre-existing session-timer flakes (string IDs failed uuid
  parse; vi.advanceTimersByTimeAsync raced real Postgres I/O)
- All 124/124 backend tests pass (was 90/92)

Docs
- requirement/valkey-online-mirror-plan.md — canonical plan
- requirement/valkey-online-mirror-testing.md — manual E2E checklist
- requirement/deployment.md — infra + Valkey persistence guidance for
  prod (Memorystore Standard tier recommended; migration from
  self-hosted Valkey is zero-downtime via reseed-from-Postgres)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:07:55 +08:00


import { getDb } from '../db/client.js'
import { ExtensionTimeoutAction } from '../constants.js'
const sql = getDb()
export const getAnonymityConfig = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'anonymity'`
return { anonymity_enabled: row?.value?.enabled ?? false }
}
export const setAnonymityConfig = async (enabled) => {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('anonymity', ${sql.json({ enabled })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
return { anonymity_enabled: enabled }
}
export const getMaxCustomersPerMitra = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'max_customers_per_mitra'`
return { max_customers_per_mitra: row?.value?.value ?? 3 }
}
export const setMaxCustomersPerMitra = async (value) => {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('max_customers_per_mitra', ${sql.json({ value })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
// Capacity changed → drop cached availability snapshot.
// Imported lazily to avoid a circular import (mitra-status.service uses config).
const { invalidateAvailabilityCache } = await import('./mitra-status.service.js')
invalidateAvailabilityCache()
return { max_customers_per_mitra: value }
}
// --- Phase 4: First-session discount config (back-compat shim) ---
//
// The canonical source of truth for the first-session discount lives in the
// `pricing_promotions` table (eligibility = 'first_session'). The CC settings
// page still calls `/internal/config/free-trial`, which exposes a slim
// {enabled, duration_minutes} view — kept as a back-compat shim until the CC
// UI is migrated to the richer /internal/config/first-session-discount handler.
// Reads and writes go directly against `pricing_promotions` so operator edits
// stay in sync with the customer-facing pricing payload.
//
// The legacy `first_session_discount_*` keys in `app_config` were retired in
// Stage 5 (deleted by migrate.js) — do NOT reintroduce them.
export const getFreeTrialConfig = async () => {
const [row] = await sql`
SELECT enabled, duration_minutes FROM pricing_promotions
WHERE eligibility = 'first_session'
`
return {
enabled: row?.enabled ?? true,
duration_minutes: row?.duration_minutes ?? 12,
}
}
export const setFreeTrialConfig = async ({ enabled, duration_minutes }) => {
// Build a sparse UPDATE so undefined fields are left alone (matches the prior
// semantics where missing patch fields were no-ops). For each field the caller
// omitted, interpolate the column's own name as a SQL fragment so the column is
// assigned to itself; this avoids passing undefined as a query parameter.
if (enabled === undefined && duration_minutes === undefined) {
return getFreeTrialConfig()
}
await sql`
UPDATE pricing_promotions
SET enabled = ${enabled === undefined ? sql`enabled` : enabled},
duration_minutes = ${duration_minutes === undefined ? sql`duration_minutes` : duration_minutes},
updated_at = NOW()
WHERE eligibility = 'first_session'
`
return getFreeTrialConfig()
}
// --- Phase 4: Support handles ---
export const getSupportHandles = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'support_handles_json'`
// Stored shape: { wa: {...}, telegram: {...} }. Fall back to a safe empty payload
// so the client renders an empty Tanya Admin sheet rather than crashing.
return row?.value ?? {
wa: { label: 'WhatsApp', deeplink: '' },
telegram: { label: 'Telegram', deeplink: '' },
}
}
export const setSupportHandles = async ({ wa, telegram }) => {
const current = await getSupportHandles()
const next = {
wa: { ...current.wa, ...(wa || {}) },
telegram: { ...current.telegram, ...(telegram || {}) },
}
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('support_handles_json', ${sql.json(next)}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
return next
}
export const getExtensionTimeoutConfig = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'extension_timeout_seconds'`
// Default 10s pairs with the auto-approve-on-timeout flow; raise this if you change the policy to auto-reject.
return { extension_timeout_seconds: row?.value?.value ?? 10 }
}
export const setExtensionTimeoutConfig = async (seconds) => {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('extension_timeout_seconds', ${sql.json({ value: seconds })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
return { extension_timeout_seconds: seconds }
}
export const getEarlyEndConfig = async () => {
const [mitraRow] = await sql`SELECT value FROM app_config WHERE key = 'early_end_mitra_enabled'`
const [customerRow] = await sql`SELECT value FROM app_config WHERE key = 'early_end_customer_enabled'`
return {
mitra_enabled: mitraRow?.value?.value ?? false,
customer_enabled: customerRow?.value?.value ?? false,
}
}
// --- Mitra reachability config ---
//
// Two separate concerns, deliberately decoupled:
// - heartbeat_cadence_seconds: how often the mitra app sends a heartbeat.
// Fixed per backend deployment via the MITRA_HEARTBEAT_CADENCE_SECONDS
// env (default 30). The mitra app reads this from /api/mitra/status and
// uses it directly as its Timer.periodic interval.
// - stale_after_seconds: how long the backend tolerates silence before
// marking a mitra offline. DB-stored, CC-tunable. Must be >= the
// heartbeat cadence (CC PATCH validates this).
//
// `require_ping` stays as the master switch — when false, the auto-offline
// sweep is skipped entirely and mitras stay online forever once they toggle.
export const getMitraHeartbeatCadenceSeconds = () => {
const raw = process.env.MITRA_HEARTBEAT_CADENCE_SECONDS
if (!raw || raw.trim() === '') return 30
const parsed = Number.parseInt(raw, 10)
return Number.isFinite(parsed) && parsed >= 5 ? parsed : 30
}
// --- Valkey availability mirror — env-driven cadences ---
//
// Per requirement/valkey-online-mirror-plan.md. All three are operational
// knobs (env, per backend/CLAUDE.md Config-Source Convention), not
// operator-tunable. Defaults match the plan; a value that is unparseable or
// below the per-knob minimum falls back to the default rather than being clamped.
export const getMitraAutoOfflineSweepSeconds = () => {
const raw = process.env.MITRA_AUTO_OFFLINE_SWEEP_SECONDS
if (!raw || raw.trim() === '') return 30
const parsed = Number.parseInt(raw, 10)
return Number.isFinite(parsed) && parsed >= 5 ? parsed : 30
}
export const getHeartbeatMirrorIntervalSeconds = () => {
const raw = process.env.HEARTBEAT_MIRROR_INTERVAL_SECONDS
if (!raw || raw.trim() === '') return 60
const parsed = Number.parseInt(raw, 10)
return Number.isFinite(parsed) && parsed >= 10 ? parsed : 60
}
export const getValkeyOnlineMirrorSweepSeconds = () => {
const raw = process.env.VALKEY_ONLINE_MIRROR_SWEEP_SECONDS
if (!raw || raw.trim() === '') return 300
const parsed = Number.parseInt(raw, 10)
if (parsed === 0) return 0 // explicit disable
return Number.isFinite(parsed) && parsed >= 30 ? parsed : 300
}
// --- Phase 5: Xendit integration ---
//
// Env-driven (per backend/CLAUDE.md Config-Source Convention). All five values
// read from process.env at call time so test setups can inject via vi.stubEnv.
// When `enabled` is true, payment.service.js mints a real Xendit invoice on
// requestPayment(); when false, invoice creation is skipped and the dev/Maestro
// stub /internal/_test/force-confirm-payment plays the role of the webhook.
// See requirement/phase5-xendit-plan.md D6/D9.
export const getXenditConfig = () => ({
enabled: process.env.XENDIT_ENABLED === 'true',
secretKey: process.env.XENDIT_SECRET_KEY ?? '',
webhookToken: process.env.XENDIT_WEBHOOK_TOKEN ?? '',
successRedirectUrl: process.env.XENDIT_SUCCESS_REDIRECT_URL ?? '',
failureRedirectUrl: process.env.XENDIT_FAILURE_REDIRECT_URL ?? '',
})
export const getMitraPingConfig = async () => {
const [requireRow] = await sql`SELECT value FROM app_config WHERE key = 'require_mitra_ping'`
const [staleRow] = await sql`SELECT value FROM app_config WHERE key = 'mitra_stale_after_seconds'`
return {
require_ping: requireRow?.value?.value ?? true,
stale_after_seconds: staleRow?.value?.value ?? 45,
heartbeat_cadence_seconds: getMitraHeartbeatCadenceSeconds(),
}
}
export const setMitraPingConfig = async ({ require_ping, stale_after_seconds }) => {
if (require_ping !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('require_mitra_ping', ${sql.json({ value: require_ping })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
if (stale_after_seconds !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('mitra_stale_after_seconds', ${sql.json({ value: stale_after_seconds })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
return getMitraPingConfig()
}
export const setEarlyEndConfig = async ({ mitra_enabled, customer_enabled }) => {
if (mitra_enabled !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('early_end_mitra_enabled', ${sql.json({ value: mitra_enabled })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
if (customer_enabled !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('early_end_customer_enabled', ${sql.json({ value: customer_enabled })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
return getEarlyEndConfig()
}
// --- Phase 3.3: Session Topic Sensitivity ---
export const getSensitivityConfig = async () => {
const [confirmRow] = await sql`SELECT value FROM app_config WHERE key = 'sensitive_flip_confirmation_enabled'`
const [latchRow] = await sql`SELECT value FROM app_config WHERE key = 'sensitive_flag_one_way_latch'`
return {
flip_confirmation_enabled: confirmRow?.value?.value ?? true,
one_way_latch: latchRow?.value?.value ?? false,
}
}
export const setSensitivityConfig = async ({ flip_confirmation_enabled, one_way_latch }) => {
if (flip_confirmation_enabled !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('sensitive_flip_confirmation_enabled', ${sql.json({ value: flip_confirmation_enabled })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
if (one_way_latch !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('sensitive_flag_one_way_latch', ${sql.json({ value: one_way_latch })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
return getSensitivityConfig()
}
// --- Phase 3.4: Self-Managed Auth ---
export const getOtpRateLimits = async () => {
const [phoneRow] = await sql`SELECT value FROM app_config WHERE key = 'otp_max_per_phone_per_hour'`
const [ipRow] = await sql`SELECT value FROM app_config WHERE key = 'otp_max_per_ip_per_hour'`
const [resendRow] = await sql`SELECT value FROM app_config WHERE key = 'otp_resend_cooldown_seconds'`
const [attemptsRow] = await sql`SELECT value FROM app_config WHERE key = 'otp_verify_max_attempts'`
return {
max_per_phone_per_hour: phoneRow?.value?.value ?? 3,
max_per_ip_per_hour: ipRow?.value?.value ?? 10,
resend_cooldown_seconds: resendRow?.value?.value ?? 60,
verify_max_attempts: attemptsRow?.value?.value ?? 5,
}
}
export const setOtpRateLimits = async ({
max_per_phone_per_hour,
max_per_ip_per_hour,
resend_cooldown_seconds,
verify_max_attempts,
}) => {
const pairs = [
['otp_max_per_phone_per_hour', max_per_phone_per_hour],
['otp_max_per_ip_per_hour', max_per_ip_per_hour],
['otp_resend_cooldown_seconds', resend_cooldown_seconds],
['otp_verify_max_attempts', verify_max_attempts],
]
for (const [key, value] of pairs) {
if (value === undefined) continue
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES (${key}, ${sql.json({ value })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
return getOtpRateLimits()
}
export const getCcLoginLockoutConfig = async () => {
const [attemptsRow] = await sql`SELECT value FROM app_config WHERE key = 'cc_login_max_attempts'`
const [minutesRow] = await sql`SELECT value FROM app_config WHERE key = 'cc_login_lockout_minutes'`
return {
max_attempts: attemptsRow?.value?.value ?? 5,
lockout_minutes: minutesRow?.value?.value ?? 15,
}
}
export const setCcLoginLockoutConfig = async ({ max_attempts, lockout_minutes }) => {
if (max_attempts !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('cc_login_max_attempts', ${sql.json({ value: max_attempts })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
if (lockout_minutes !== undefined) {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('cc_login_lockout_minutes', ${sql.json({ value: lockout_minutes })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
}
return getCcLoginLockoutConfig()
}
// --- Paid Pairing Flow + Returning-Chat + Extension Flip ---
export const getPaymentRequestTimeoutMinutes = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'payment_request_timeout_minutes'`
return { payment_request_timeout_minutes: row?.value?.value ?? 20 }
}
export const setPaymentRequestTimeoutMinutes = async (value) => {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('payment_request_timeout_minutes', ${sql.json({ value })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
return { payment_request_timeout_minutes: value }
}
export const getReturningChatConfirmationTimeoutSeconds = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'returning_chat_confirmation_timeout_seconds'`
return { returning_chat_confirmation_timeout_seconds: row?.value?.value ?? 20 }
}
export const setReturningChatConfirmationTimeoutSeconds = async (value) => {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('returning_chat_confirmation_timeout_seconds', ${sql.json({ value })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
return { returning_chat_confirmation_timeout_seconds: value }
}
export const getExtensionDefaultActionOnTimeout = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'extension_default_action_on_timeout'`
return { extension_default_action_on_timeout: row?.value?.value ?? ExtensionTimeoutAction.AUTO_APPROVE }
}
export const setExtensionDefaultActionOnTimeout = async (value) => {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('extension_default_action_on_timeout', ${sql.json({ value })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
return { extension_default_action_on_timeout: value }
}
export const getPairingBlastTimeoutSeconds = async () => {
const [row] = await sql`SELECT value FROM app_config WHERE key = 'pairing_blast_timeout_seconds'`
return { pairing_blast_timeout_seconds: row?.value?.value ?? 60 }
}
export const setPairingBlastTimeoutSeconds = async (value) => {
await sql`
INSERT INTO app_config (key, value, updated_at)
VALUES ('pairing_blast_timeout_seconds', ${sql.json({ value })}, NOW())
ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value, updated_at = NOW()
`
return { pairing_blast_timeout_seconds: value }
}
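
The env-driven getters in this file (heartbeat cadence and the three Valkey sweep cadences) share one parse-and-validate shape. A minimal sketch of that shared shape is shown below. `readIntEnv` is a hypothetical helper that does not exist in the file, and the `env` parameter is added only to make the example testable.

```javascript
// Hypothetical helper: read an integer env var, falling back to the default
// when the value is missing, blank, unparseable, or below the minimum.
const readIntEnv = (name, fallback, min, env = process.env) => {
  const raw = env[name]
  if (!raw || raw.trim() === '') return fallback
  const parsed = Number.parseInt(raw, 10)
  return Number.isFinite(parsed) && parsed >= min ? parsed : fallback
}

// Usage, mirroring the auto-offline sweep getter (default 30, minimum 5):
const sweepSeconds = readIntEnv('MITRA_AUTO_OFFLINE_SWEEP_SECONDS', 30, 5)
```

Note that a value below the minimum yields the default, not the minimum itself, which matches how the existing getters behave. The mirror-sweep getter's special case of `0` meaning "disabled" would still need its own branch.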