553dbac52f
Phase 6: Valkey availability mirror — move read path off Postgres
Mitra-availability state (online flag, deactivated flag, per-mitra session
count, heartbeat liveness) mirrored into Valkey so the customer beacon
+ pairing blast + dashboard counts no longer hit Postgres on the hot path.
Postgres remains the durable source of truth; Valkey state is fully
derivable via seedFromPostgres on startup + reconnect.
Schema
- mitras:online SET — mirror of is_online
- mitras:deactivated SET — mirror of is_active=false
- mitra:capacity:<id> STRING — active+pending_payment session count
- mitra:heartbeat:<id> STRING — ISO timestamp of last ping
- availability:snapshot JSON — beacon cache, TTL 10s, cluster-shared
Write paths (Postgres first, best-effort Valkey)
- setOnline/setOffline mirror SADD/SREM + heartbeat SET/DEL
- updateMitraStatus mirrors mitras:deactivated AND revokes auth_sessions
on deactivate (bounds the "ghost online" window to access-token TTL)
- heartbeat is Valkey-only on the hot path; the per-ping Postgres UPDATE
on last_heartbeat_at is eliminated (was 1,200 ops/min at prod scale)
- chat_session lifecycle (accept/end/reroute/extension/expiry) calls
recomputeCapacityForMitra after each UPDATE — derive-from-truth avoids
the bookkeeping risk of per-transition INCR/DECR
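
The write ordering above can be sketched roughly as follows. This is a
minimal illustration, not the committed code: it assumes an ioredis-style
client (sadd, set) and uses hypothetical helpers pg.markOnline and
pg.countOpenSessions plus an injected deps object, none of which are named
in this commit. Only the key names come from the Schema section.

```javascript
// Sketch only: pg.markOnline, pg.countOpenSessions and the injected deps
// object are hypothetical; the key names come from the Schema section.

async function setOnline(mitraId, { pg, valkey, log, now = () => new Date() }) {
  // 1. Durable write first. If this throws, Valkey is never touched.
  await pg.markOnline(mitraId);

  // 2. Best-effort mirror. Failures are logged and the caller continues;
  //    the periodic reseed is what heals any resulting drift.
  try {
    await valkey.sadd('mitras:online', mitraId);
    await valkey.set(`mitra:heartbeat:${mitraId}`, now().toISOString());
  } catch (err) {
    log.warn({ err, mitraId }, 'valkey mirror failed, continuing');
  }
}

async function recomputeCapacityForMitra(mitraId, { pg, valkey, log }) {
  // Derive from truth: count active + pending_payment sessions in Postgres
  // and overwrite the mirror, instead of INCR/DECR on every transition.
  const count = await pg.countOpenSessions(mitraId);
  try {
    await valkey.set(`mitra:capacity:${mitraId}`, String(count));
  } catch (err) {
    log.warn({ err, mitraId }, 'capacity mirror failed, continuing');
  }
  return count;
}
```

Overwriting the counter with a freshly computed value means a missed or
duplicated lifecycle event cannot leave the mirror permanently off by one.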
Read paths (Valkey-first, Postgres fallback on Valkey error)
- isMitraReachable: SISMEMBER mitras:online + heartbeat freshness
- findAvailableMitras: SDIFF + pipelined GETs, filter by capacity + heartbeat
- countAvailableMitrasFromCache: Valkey-driven, cached cluster-wide 10s TTL
- dashboard online count: SCARD
- Each reader wraps Valkey ops in try/catch → Postgres fallback on outage
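
As an illustration of the Valkey-first read with Postgres fallback, here is
a hedged sketch of what isMitraReachable might look like. The injected
valkey client (ioredis-style sismember/get), the pgFallback callback and the
HEARTBEAT_STALE_MS value are assumptions made for the example; the commit
does not state the freshness window.

```javascript
// Sketch only: HEARTBEAT_STALE_MS and the injected deps are assumptions.
const HEARTBEAT_STALE_MS = 90_000; // hypothetical freshness window

async function isMitraReachable(mitraId, { valkey, pgFallback, now = Date.now }) {
  try {
    // SISMEMBER returns 1 or 0 on ioredis-style clients.
    const online = await valkey.sismember('mitras:online', mitraId);
    if (!online) return false;

    // The heartbeat key holds an ISO timestamp of the last ping.
    const lastPing = await valkey.get(`mitra:heartbeat:${mitraId}`);
    if (!lastPing) return false;

    return now() - Date.parse(lastPing) <= HEARTBEAT_STALE_MS;
  } catch (err) {
    // Valkey outage: answer from the existing Postgres query instead.
    return pgFallback(mitraId);
  }
}
```

Only a thrown Valkey error triggers the fallback; a clean "not a member"
answer is trusted as-is, which keeps Postgres off the hot path in the
normal case.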
Heartbeat path on /api/mitra/status/heartbeat
- resolveMitra preHandler replaced with heartbeatGuard: SISMEMBER on
mitras:deactivated (~0 DB hits per ping). Falls back to full DB
resolveMitra if Valkey is unreachable so a Valkey outage doesn't
silently accept heartbeats from deactivated mitras.
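
A possible shape for heartbeatGuard, written as a Fastify-style async
preHandler. This is a sketch under assumptions: request.mitraId, the 403
status, the MITRA_DEACTIVATED error code and the factory signature are
hypothetical, and resolveMitra stands in for the existing DB-backed handler.

```javascript
// Sketch only: request.mitraId, the 403 body and the factory signature are
// assumptions; resolveMitra is the existing DB-backed preHandler.
function makeHeartbeatGuard({ valkey, resolveMitra }) {
  return async function heartbeatGuard(request, reply) {
    let deactivated;
    try {
      // One SISMEMBER per ping, no Postgres round trip.
      deactivated = await valkey.sismember('mitras:deactivated', request.mitraId);
    } catch (err) {
      // Valkey unreachable: fall back to the full DB check so a deactivated
      // mitra cannot slip through during an outage.
      return resolveMitra(request, reply);
    }
    if (deactivated) {
      // Returning the reply tells Fastify the response was already sent.
      return reply.code(403).send({ error: 'MITRA_DEACTIVATED' });
    }
    // Falling through lets the heartbeat handler run.
  };
}
```

It would be registered on the route as something like
`{ preHandler: makeHeartbeatGuard({ valkey, resolveMitra }) }`.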
Three sweeps, env-configurable cadences
- MITRA_AUTO_OFFLINE_SWEEP_SECONDS (30) — Valkey-driven stale detection
- HEARTBEAT_MIRROR_INTERVAL_SECONDS (60) — batched UPSERT writes
Valkey timestamps to Postgres last_heartbeat_at via UNNEST (1 statement
per cycle, idempotent across instances)
- VALKEY_ONLINE_MIRROR_SWEEP_SECONDS (300) — periodic reseed heals drift
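
To illustrate the single-statement heartbeat mirror, the sketch below builds
one parameterised query from a batch of (id, ISO timestamp) pairs. It uses
UPDATE ... FROM UNNEST rather than a literal upsert, and the table name
mitras, the id column, the uuid type and the function name are assumptions;
only last_heartbeat_at and the use of UNNEST come from this commit.

```javascript
// Sketch only: table/column names other than last_heartbeat_at, the uuid
// cast and the function name are assumptions for illustration.
function buildHeartbeatMirrorQuery(heartbeats) {
  const ids = [];
  const timestamps = [];
  // heartbeats: iterable of [mitraId, isoTimestamp] pairs read from Valkey.
  for (const [id, iso] of heartbeats) {
    ids.push(id);
    timestamps.push(iso);
  }
  return {
    // Two parallel arrays are zipped row-by-row by UNNEST, so the whole
    // batch is one statement. The timestamp guard only moves the column
    // forward, so several instances running the same cycle converge on
    // the same result.
    text: `UPDATE mitras AS m
              SET last_heartbeat_at = v.ts
             FROM UNNEST($1::uuid[], $2::timestamptz[]) AS v(id, ts)
            WHERE m.id = v.id
              AND (m.last_heartbeat_at IS NULL OR m.last_heartbeat_at < v.ts)`,
    values: [ids, timestamps],
  };
}
```

The returned object has the { text, values } shape that node-postgres
accepts, e.g. `await pool.query(buildHeartbeatMirrorQuery(pairs))`.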
Startup
- restoreActiveTimers → seedFromPostgres → bind listeners
- onValkeyReady re-runs the seed on every reconnect (cold start + reseed
on Valkey restart, no manual intervention)
Failure semantics
- Read fallback: every Valkey read wrapped, falls back to existing
Postgres JOIN query — system stays correct during Valkey outage,
performance degrades not breaks
- Write best-effort: Postgres write commits before Valkey is touched;
Valkey errors log + continue; reconciliation sweep heals drift
- Auto-offline sweep aborts entirely on Valkey error (does NOT mass-
offline via Postgres scan during Valkey hiccup)
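
The abort-on-error rule for the auto-offline sweep can be sketched like
this. It is an illustration under assumptions: an ioredis-style client
(smembers, mget with an array argument), an injected setOffline callback,
a caller-supplied staleMs, and the choice to treat a missing heartbeat key
as stale are all hypothetical details not stated in the commit.

```javascript
// Sketch only: the deps object, staleMs and the "missing key counts as
// stale" rule are assumptions for illustration.
async function autoOfflineSweep({ valkey, setOffline, staleMs, now = Date.now }) {
  let ids;
  let pings;
  try {
    ids = await valkey.smembers('mitras:online');
    pings = ids.length
      ? await valkey.mget(ids.map((id) => `mitra:heartbeat:${id}`))
      : [];
  } catch (err) {
    // Valkey hiccup: skip this cycle entirely. No Postgres scan, so a
    // transient outage cannot mass-offline every mitra.
    return { aborted: true, offlined: [] };
  }

  const offlined = [];
  for (let i = 0; i < ids.length; i += 1) {
    const ts = pings[i] ? Date.parse(pings[i]) : NaN;
    if (Number.isNaN(ts) || now() - ts > staleMs) {
      await setOffline(ids[i]); // Postgres first, then best-effort Valkey
      offlined.push(ids[i]);
    }
  }
  return { aborted: false, offlined };
}
```

Reading everything before acting means a failure halfway through the reads
leaves state untouched; the next scheduled cycle simply retries.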
Tests
- New: 32 integration tests in mitra-status.valkey-mirror.test.js
covering seed, write-through, fallbacks, capacity lifecycle,
auto-offline sweep, heartbeat mirror, deactivation flow, beacon cache
- Updated: fixtures.js seeds Valkey alongside Postgres when isOnline=true
- Updated: helpers/db.js resetDb also flushes test Valkey
- Fixed 2 pre-existing session-timer flakes (string IDs failed uuid
parse; vi.advanceTimersByTimeAsync raced real Postgres I/O)
- All 124/124 backend tests pass (was 90/92)
Docs
- requirement/valkey-online-mirror-plan.md — canonical plan
- requirement/valkey-online-mirror-testing.md — manual E2E checklist
- requirement/deployment.md — infra + Valkey persistence guidance for
prod (Memorystore Standard tier recommended; migration from
self-hosted Valkey is zero-downtime via reseed-from-Postgres)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:07:55 +08:00
780cade3db
Phase 3.3: topic sensitivity + Phase 3.4: auth foundation
Phase 3.3 — Session Topic Sensitivity (complete):
- Backend: topic_sensitivity column + session_sensitivity_log, sensitivity service
(flip with one-way latch + audit), PATCH /api/shared/chat/sessions/:id/topic,
topic carried in pairing + extension WS payloads, CC filter + sensitive stats
+ per-mitra sensitive columns on activity page
- client_app: TopicSelectionBottomSheet before pricing, topic flows through
pairing request, silent WS handler for session_topic_updated
- mitra_app: SensitivityBadge + SensitivityTheme + sensitivityConfigProvider,
overlay badge + yellow accent, chat screen app-bar toggle with configurable
confirmation + latch, extension card shows current flag, history + transcript
yellow theme
- control_center: Sensitivitas Topik settings section, topic filter + column
with inline audit log, sensitive stats dashboard card, mitra activity
sensitive columns with QC flag
Phase 3.4 — Self-Managed Auth (foundation only):
- Migration: auth_sessions + otp_requests tables, social identity columns on
customers, password_hash + lockout on control_center_users, OTP + CC lockout
app_config keys
- New services: password (bcrypt + complexity), token (JWT HS256 + refresh
rotation, session_id claim pre-wires future Valkey revocation),
social-identity (Google + Apple JWKS), OTP (Fazpass stub — real API TBD)
- Constants: AuthProvider + OtpChannel
- Middleware, auth route rewrites, WS auth update, Firebase → FCM isolation
still pending (next chunk); Fazpass docs + Apple Developer setup still
required before E2E testing
Docs:
- requirement/phase3.3.md, phase3.3-plan.md, phase3.3-testing.md
- requirement/phase3.4.md, phase3.4-plan.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 10:15:12 +08:00
b0502ac92b
Phase 3 testing fixes: Fastify 5, SSE→WebSocket+FCM, enums, security, session lifecycle
- Upgrade Fastify 4→5 with all plugins (@fastify/websocket 11, cors 11, sensible 6)
- Migrate all SSE endpoints to WebSocket + FCM push (mitra chat requests, customer pairing status)
- Add flutter_local_notifications for foreground push notifications with sound
- Add splash screen to both apps (hide auth loading flash)
- Introduce constants/enums across entire codebase (no raw string literals)
- Move price tiers from hardcoded array to app_config DB (data-driven, includes 1-min test tier)
- Add session ownership validation on all shared chat routes
- Add ownership checks on endSession, respondToExtension, requestExtension
- Fix session timer: auto-complete expired/stale sessions on server restart
- Add 5-min grace period for abandoned closing sessions
- Fix extension flow: proper session_resumed handling, clearExtensionRequest, closure grace timer cleanup
- Fix chat screens: ConnectChat in initState, session status check on connect
- Fix customer expired view: 5-min countdown, closure state priority over expired state
- Fix mitra extension UI: loading spinner, disable buttons, handle EXTENSION_RESOLVED error
- Fix GoRouter navigation consistency (no more Navigator.pushNamed)
- Fix goodbye view keyboard overflow (SingleChildScrollView)
- Add active session card on customer home screen with refresh on navigate back
- Fix PricingBottomSheet extension mode (RequestExtension instead of new pairing)
- Send session_resumed to both parties on extension accept
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 00:17:25 +08:00
d668112edd
Phase 2 scaffold: mitra online status & pairing logic
Add mitra online/offline status with heartbeat-based auto-offline,
customer-mitra pairing via Valkey pub/sub blast, session management,
and control center dashboard with real-time stats.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:17:49 +08:00