Commit Graph

8 Commits

Author SHA1 Message Date
553dbac52f Phase 6: Valkey availability mirror — move read path off Postgres
Mitra-availability state (online flag, deactivated flag, per-mitra session
count, heartbeat liveness) mirrored into Valkey so the customer beacon
+ pairing blast + dashboard counts no longer hit Postgres on the hot path.
Postgres remains the durable source of truth; Valkey state is fully
derivable via seedFromPostgres on startup + reconnect.

Schema
- mitras:online           SET    — mirror of is_online
- mitras:deactivated      SET    — mirror of is_active=false
- mitra:capacity:<id>     STRING — active+pending_payment session count
- mitra💓<id>    STRING — ISO timestamp of last ping
- availability:snapshot   JSON   — beacon cache, TTL 10s, cluster-shared

Write paths (Postgres first, best-effort Valkey)
- setOnline/setOffline mirror SADD/SREM + heartbeat SET/DEL
- updateMitraStatus mirrors mitras:deactivated AND revokes auth_sessions
  on deactivate (bounds the "ghost online" window to access-token TTL)
- heartbeat is Valkey-only on the hot path; the per-ping Postgres UPDATE
  on last_heartbeat_at is eliminated (was 1,200 ops/min at prod scale)
- chat_session lifecycle (accept/end/reroute/extension/expiry) calls
  recomputeCapacityForMitra after each UPDATE — derive-from-truth avoids
  the bookkeeping risk of per-transition INCR/DECR

Read paths (Valkey-first, Postgres fallback on Valkey error)
- isMitraReachable: SISMEMBER mitras:online + heartbeat freshness
- findAvailableMitras: SDIFF + pipelined GETs, filter by capacity + heartbeat
- countAvailableMitrasFromCache: Valkey-driven, cached cluster-wide 10s TTL
- dashboard online count: SCARD
- Each reader wraps Valkey ops in try/catch → Postgres fallback on outage

Heartbeat path on /api/mitra/status/heartbeat
- resolveMitra preHandler replaced with heartbeatGuard: SISMEMBER on
  mitras:deactivated (~0 DB hits per ping). Falls back to full DB
  resolveMitra if Valkey is unreachable so a Valkey outage doesn't
  silently accept heartbeats from deactivated mitras.

Three sweeps, env-configurable cadences
- MITRA_AUTO_OFFLINE_SWEEP_SECONDS (30) — Valkey-driven stale detection
- HEARTBEAT_MIRROR_INTERVAL_SECONDS (60) — batched UPSERT writes
  Valkey timestamps to Postgres last_heartbeat_at via UNNEST (1 statement
  per cycle, idempotent across instances)
- VALKEY_ONLINE_MIRROR_SWEEP_SECONDS (300) — periodic reseed heals drift

Startup
- restoreActiveTimers → seedFromPostgres → bind listeners
- onValkeyReady re-runs the seed on every reconnect (cold start + reseed
  on Valkey restart, no manual intervention)

Failure semantics
- Read fallback: every Valkey read wrapped, falls back to existing
  Postgres JOIN query — system stays correct during Valkey outage,
  performance degrades not breaks
- Write best-effort: Postgres write commits before Valkey is touched;
  Valkey errors log + continue; reconciliation sweep heals drift
- Auto-offline sweep aborts entirely on Valkey error (does NOT mass-
  offline via Postgres scan during Valkey hiccup)

Tests
- New: 32 integration tests in mitra-status.valkey-mirror.test.js
  covering seed, write-through, fallbacks, capacity lifecycle,
  auto-offline sweep, heartbeat mirror, deactivation flow, beacon cache
- Updated: fixtures.js seeds Valkey alongside Postgres when isOnline=true
- Updated: helpers/db.js resetDb also flushes test Valkey
- Fixed 2 pre-existing session-timer flakes (string IDs failed uuid
  parse; vi.advanceTimersByTimeAsync raced real Postgres I/O)
- All 124/124 backend tests pass (was 90/92)

Docs
- requirement/valkey-online-mirror-plan.md — canonical plan
- requirement/valkey-online-mirror-testing.md — manual E2E checklist
- requirement/deployment.md — infra + Valkey persistence guidance for
  prod (Memorystore Standard tier recommended; migration from
  self-hosted Valkey is zero-downtime via reseed-from-Postgres)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 18:07:55 +08:00
3fff4b1c6e Phase 5 Xendit: Stages 1-7 (XENDIT_ENABLED=false; Stage 8 pending creds)
Backend
- payment_sessions → payment_requests rename across DB schema + 29 files
- payment.service.js becomes product-agnostic owner: EventEmitter +
  Xendit wrapper + requestPayment / confirmPayment public API; legacy
  aliases retained for existing chat callers
- Webhook handler at POST /api/shared/payment/webhooks/xendit, with
  constant-time token verification (8 vitest cases)
- Server-driven pairing: payment.service emits
  payment_request.confirmed → pairing subscriber starts the blast.
  Legacy POST /chat/request still works during the cutover.
- Reconciliation sweeper extended (re-emits events for confirmed rows
  with no chat session)
- SIGTERM drain + startup reconciliation pass in server.js

Customer app
- waiting_payment_screen opens xendit_invoice_url via
  LaunchMode.inAppBrowserView
- searching / no-bestie / targeted-waiting / pairing-notifier updated
  to consume the new payment_request_id contract
- pending_payments_provider + bestie-unavailable dialog migrated

Dev / testing
- XENDIT_ENABLED=false is the safe default; .env.example documents the
  four new vars
- backend/.dev/xendit-fake-webhook.sh exercises the handler without
  ngrok
- 90/92 backend tests pass (two pre-existing session-timer flakes,
  unrelated); client_app analyzer clean
- requirement/phase5-xendit-plan.md is the canonical reference

Stage 8 (live E2E) blocked on Xendit test-mode keys. The dashboard's
single-webhook-URL constraint will be worked around via a self-poll
script next session.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 12:52:33 +08:00
1c9d81d81d Pricing: migrate from app_config JSON to relational tables
Replaces the two `pricing_*_tiers_json` blobs and five `first_session_discount_*`
keys in app_config with dedicated `pricing_tiers` and `pricing_promotions`
tables plus matching `_history` audit tables. UUID PKs, UNIQUE(mode, minutes)
natural-key constraint, optimistic-lock via `updated_at` token returning 409
STALE_WRITE on conflicts. Every mutation writes a history row capturing the
operator (changed_by from request.auth.userId) and change_kind.

CC SettingsPage replaces the JSON-textarea editors with per-row tables —
add / edit / soft-delete / reactivate / reorder, plus a buffered first-session
discount form with the same optimistic-lock contract. `minutes` and `mode` are
read-only on edit since they form the natural key; operators soft-delete and
recreate to change duration.

Stage 5 fixes a latent leak: `client.payment.routes.js` had its own local
`readDiscountConfig` that still read from app_config — would have silently
fallen to hardcoded defaults once the legacy rows were deleted. Now reads from
pricing_promotions via the shared service helper, so CC edits to the first-
session discount affect actual payment pricing on the next request.

Customer-facing GET /api/client/chat/pricing shape unchanged (id values are
now UUIDs instead of "5"/"12"/"60" but lookups happen by (mode, minutes), so
no app changes needed). 27 new backend tests, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 00:12:11 +08:00
a09f37135c Phase 4 checkpoint: chat-screen perf refactor + retryable blast-failure + repo-wide dispose-ref guardrail
Chat-screen performance (customer + mitra):
- Parent screens have zero `ref.watch` — only `ref.listen` for side effects
- Body extracted into its own `ConsumerStatefulWidget`; AppBar parts split
  into narrow `.select` consumers (mode, sensitivity, timer)
- Per-second timer ticks routed to dedicated providers
  (`chatRemainingSecondsProvider` + new `mitraChatRemainingSecondsProvider`)
  so WS `session_tick` frames don't invalidate the rest of the chat state

Dispose-in-ref bug fix:
- `home_screen.dart`, `payment_screen.dart`, `mitra_chat_screen.dart` —
  ref-using cleanup moved from `dispose()` to `deactivate()`. Modern
  Riverpod invalidates `ref` the moment `dispose()` runs; the resulting
  silent error corrupts the widget-tree finalize and the next screen
  appears frozen
- `halo_lints` package added at repo root with `no_ref_in_dispose` rule
  to catch this pattern in CI / IDE analysis
- `custom_lint` activated in both apps' `analysis_options.yaml`
  (was installed but never wired in — also brings `riverpod_lint`'s
  `avoid_ref_inside_state_dispose` online)
- CLAUDE.md Pitfalls section added to client_app + mitra_app

Phase 4 §3 retryable blast-failure (Option A):
- Backend `expirePairingRequest` + all-rejected use
  `recordIntermediateFailure` instead of `failPaymentSession` so the
  payment session stays `confirmed` for re-blast
- WS `pairing_failed` payload carries `is_terminal: false` on the
  retryable paths; client parses the flag and exposes `retryBlast()`
- "Coba cari lagi" CTA on S7 Timeout now re-blasts on the same payment
- Pairing service test updated to reflect the new semantics

Customer waiting-payment screen navigation patch:
- `_navigateTerminal` uses `Future.microtask` + `addPostFrameCallback`
  redundancy after a release-mode bug where polling stopped but
  `context.go` never fired, leaving the screen visually stuck on
  "menunggu pembayaran"

See requirement/resume-2026-05-15.md for next-day pickup checklist
(mitra release rebuild + S21 Ultra install + retest is the gating item).

Bundles unrelated in-flight Phase 4 §2.x work that was already on disk
(ESP screen removal, USP one-time gate scaffolding, bestie-availability
public route, OTP service edits, Maestro flow tweaks) — kept together
to avoid a partial-rebase mess.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:12:34 +08:00
a48f108fc0 Phase 4 §2.1: anonymous → existing-user merge breadcrumb
Adds `customers.account_belongs_to UUID NULL` and refactors customer
sign-in (phone/Google/Apple) so an anon row that re-verifies into an
existing customer no longer 409s. Instead the anon row stays intact
with a breadcrumb pointing at the real customer; tokens are issued
for the existing user. Actual data reconciliation onto the existing
row (chat_sessions, customer_transactions, payment_sessions,
pairing_failures) is deferred.

Backend
- migrate.js: ADD COLUMN account_belongs_to UUID REFERENCES customers(id)
  ON DELETE SET NULL.
- customer.service.js: stampAccountBelongsTo helper; account_belongs_to
  exposed in CUSTOMER_SELECT.
- auth.service.js: new shared resolveCustomerForIdentity (4-case logic);
  normalizeIdentityConflict + IDENTITY_ALREADY_LINKED 409 deleted;
  completeCustomerPhoneSignIn / signInWithGoogle / signInWithApple all
  route through the shared helper.
- client.auth.routes.js: new resolveAnonymousCustomerId picks the anon
  prefix ONLY from a verified Bearer JWT — closes the UUID-leak attack
  where a tamper-able body field could mis-route someone else's
  transactions. /otp/verify, /google, /apple all use it; the body field
  `anonymous_customer_id` is no longer accepted on any of them.
- test/services/auth.service.test.js: 9 Vitest cases covering phone +
  Google + Apple, all 4 logic cases + multi-merge accumulation.

Customer app
- auth_notifier.dart::verifyOtp: drop `skipAuth: true` and the dead
  body field so ApiClient auto-attaches the anon's Bearer from
  AuthBridge. Survives the AuthOtpSentData state transition (the
  earlier `_currentAnonymousCustomerId()` state-drop bug is bypassed by
  sourcing the id from the bridge instead of state).
- Google + Apple client paths remain unchanged (gated on provider
  creds; mirror this fix when wiring lands).

Docs
- flow_customer.mermaid.md: new §2.1 sub-section with the merge
  diagram, schema note, replaces-current-behaviour paragraph, and
  Bearer-only security callout.
- phase3.4-testing.md: §1.5 line 76 simplified (no more per-path
  split); new §1.5.1 with the 5-step operator scenario + DB invariants
  + curl recipe + Vitest pointer; new §1.5.2 covering Google/Apple
  parity (deferred client work flagged).

Verification (against live dev backend, before this commit):
- Vitest: 9/9 in auth.service.test.js; 49/51 overall (2 unrelated
  pre-existing failures in session-timer.service.test.js).
- Operator Node smoke: 14/14 in the §1.5.1 scenario; 11/11 in the
  Bearer-precedence cases.
- Real-device UI walkthrough on SM-A530F still pending — see resume
  memory `project_phase4_2_1_resume_test`.

Sister WIP bundled in migrate.js + customer.service.js: `usp_seen`
column + `markCustomerUspSeen` helper (Phase 4 USP one-time gate, was
already uncommitted in the working tree).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 23:57:53 +08:00
350b92f1f3 Phase 4 Stage 10 backend: Chat-tab feeds (pending payments + cursor history)
Backend half of Stage 10 — the new Chat tab in the customer app that
replaces /chat/history with a 3-sub-tab list (Aktif / Pembayaran /
Selesai).

- New GET /api/client/payment-sessions/pending — returns the customer's
  pending initial + extension payment sessions. Filter is status='pending'
  AND expires_at > NOW(). Mitra info comes from session_extensions →
  chat_sessions for extension rows, payment_sessions.targeted_mitra_id
  for targeted-curhat-lagi initial rows. TTL reuses the existing
  payment_session_timeout_minutes app_config row (default 20m) — no new
  config row needed since payment is still mocked.

- getCustomerHistory migrated from offset (page/limit) to cursor
  pagination. Cursor is base64url(`<endedAtIso>|<id>`) with id-tiebreak
  in ORDER BY so rows with identical timestamps don't duplicate or skip
  across pages. SELECT now JOINs payment_sessions to surface `mode`
  (chat/call) for the Selesai-row voice-call pill.

- requirement/flow_customer.mermaid.md: new §7 Chat Tab subgraph + Figma
  cross-ref entry for SChatList.

- requirement/phase4-customer-flow-plan.md: Stage 10 plan section. Also
  carries forward earlier uncommitted "Post-Stage-8 corrections" notes
  from the Stage 9 sweep (boot path / SHome1st / onboarding fixes).

Tests: +7 for getCustomerPendingPayments (initial null mitra,
targeted-mitra fill, extension-via-session JOIN, mixed-newest-first,
expired excluded, non-pending excluded, customer scoping). +10 for
cursor history (empty, exact-fit, multi-page walk, same-timestamp
tiebreak, limit clamp, customer scoping, CLOSING+COMPLETED only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 20:04:58 +08:00
d33d4419ea Phase 4 Stage 1: backend foundation (additive endpoints + schema)
Schema (idempotent migration):
- payment_sessions.is_free_trial -> is_first_session_discount (data copied)
- payment_sessions.mode TEXT NOT NULL DEFAULT 'chat' CHECK (chat|call)
- chat_sessions.topics TEXT[] for ESP picks (info-only)

New endpoints:
- GET /api/client/onboarding-state (drives verif sheet + S6 paywall gate)
- GET /api/client/chat-pricing (rewrite: chat+call groups + first-session
  discount block, per-customer eligibility)
- GET /api/shared/auth-providers (env-probed; replaces ENABLE_SOCIAL_AUTH
  build flag — frontend cutover lands in stage 2)
- GET /api/client/support-handles (Tanya Admin handles, CC-config-driven)

session_warning WS event fires once at 180s remaining.

app_config seeds (mock pricing tiers, first-session discount, support
handles, payment method order, end-session 2-step toggle).

CC SettingsPage: 3 new sections (first-session discount, pricing tiers
JSON editors, support handles).

15/15 Vitest passing. chat_sessions.is_free_trial also renamed for
consistency (plan only specified payment_sessions; pairing.service.js
read both).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:56:28 +08:00
d09e50af55 Phase 3.7: paid pairing flow + returning chat + extension flip
- Backend: payment_sessions + pairing_failures tables; payment.service.js
  and pairing-failure.service.js (new); rewritten pairing.service.js
  (payment-gated blast + targeted "Curhat lagi" + cancel + fallback);
  rewritten extension.service.js (data-driven auto-approve with offline
  safeguard, charge-at-approval); pricing.service.js (extension tiers
  without free trial); mitra-status.service.js (countAvailableMitras
  cached path); 60s sweeper for stale payment sessions
- Backend routes: client.payment.routes, client.mitra-availability.routes,
  internal/failed-pairings.routes; client.chat.routes rewritten for
  payment-gated start + /returning + /cancel + /fallback-to-blast;
  internal/config.routes adds 4 new keys with Valkey invalidate publish
- client_app: mitra-availability poll, payment screen + notifier, pairing
  notifier rewrite (PairingTargetedWaiting + PairingFailed states),
  targeted-waiting overlay + bestie-unavailable dialog, "Curhat lagi"
  CTA, failed-pairing terminal, extension via payment-session
- mitra_app: PairingRequestType enum, returning-chat 20s countdown
  auto-dismiss, extension card "otomatis disetujui" copy
- control_center: 4 new config rows in Settings, Failed Pairings page
  (filter + paginate + action menu), sidebar + route registered
- Test infrastructure: Vitest backend (7/7 pass), Playwright CC (4/4
  pass), Maestro mobile scaffold (CLI install pending)
- Bugs found via Playwright + fixed: LoginPage labels not associated
  with inputs (a11y); backend internal CORS missing PATCH/PUT/DELETE
  in allow-methods (silent settings breakage in browsers since Stage 4)
- Docs: phase3.7.md PRD, phase3.7-plan.md, phase3.7-questions.md (Q&A),
  phase3.7-testing.md (E2E checklist), phase3.7-test-run-2026-05-03.md
  (today's run results)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:02:49 +08:00