LUMA · WHATSAPP DEMO

TALK TO LUMA.

In Lesotho, community health workers text a ministry-licensed WhatsApp number from their phones — voice memo or text, in Sesotho or English. luma grounds every answer in the official protocols and logs the conversation to the dashboards. This page exposes the same backend through the browser so you can try it without a phone.

WHAT YOU'RE LOOKING AT

This is the same backend that powers a real CHW's WhatsApp.

Every message you send hits the same pipeline as a real worker texting from a clinic in Maseru: safety filter → RAG retrieval over ministry protocols → Claude → post-LLM safety check → ministry log. The only thing missing is Twilio — you're going through plain HTTP; they go through the WhatsApp Business API.

Watch the Audit Log while you chat — your conversations show up there in real time, tagged as demo entries. Open the Ministry view to see how those interactions aggregate into operational dashboards. Or hit the API directly if you'd rather skip the browser.

CHANNEL
WhatsApp Business API, routed through Twilio. In production the number is registered to the partner ministry's account, not luma's.
CORPUS
11 ministry protocol documents indexed via OpenAI text-embedding-3-small, retrieved by cosine similarity. ART, TB, MNCH, PrEP, PMTCT, HIV testing, family planning, immunization, malnutrition, STI screening.
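Retrieval here is just vector math: embed the query, score every chunk by cosine similarity, keep the top 3. A minimal TypeScript sketch, with toy vectors standing in for real text-embedding-3-small output (function names are illustrative, not luma's code):

```typescript
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank corpus chunks against a query embedding and keep the top k.
function topK(
  query: number[],
  chunks: { id: string; embedding: number[] }[],
  k = 3,
) {
  return chunks
    .map(c => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Real embeddings from text-embedding-3-small are 1536-dimensional; the math is the same.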
MODEL
Claude Sonnet 4.6 with a system prompt that grounds responses in retrieved chunks and refuses dosing / diagnostic / prescription requests regardless of phrasing.
VOICE
Real CHWs send voice memos in Sesotho. OpenAI Whisper transcribes; the rest of the pipeline is identical. (This demo is text-only.)
SAFETY
Two-layer filter: a regex pre-check refuses anything that looks like a dosing or diagnostic request before the LLM is invoked; a post-LLM scrubber catches anything unsafe that may have slipped through.
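The pre-LLM layer can be sketched as a list of regexes checked before any model call. The patterns below are illustrative, not the production rule set:

```typescript
// Illustrative restricted patterns; the real rule set is broader.
const BLOCKED_PATTERNS: RegExp[] = [
  /\b(dos(e|age|ing)|how (much|many) (mg|ml|tablets?))\b/i, // dosing
  /\bdiagnos(e|is|tic)\b/i,                                  // diagnosis
  /\bprescri(be|ption)\b/i,                                  // prescription
];

// Returns a refusal decision before the LLM is ever invoked.
function preLlmFilter(message: string): { allowed: boolean; reason?: string } {
  for (const pattern of BLOCKED_PATTERNS) {
    if (pattern.test(message)) {
      return { allowed: false, reason: "matched restricted pattern" };
    }
  }
  return { allowed: true };
}
```

Running the filter before the model call means a blocked request costs no tokens at all.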
LOGGING
Every conversation lands in the conversations and case_tags tables. Severity, district, topic, and condition are extracted by a second Claude call (Haiku) and shown in the Ministry and Public Health views.
RATE LIMIT
Demo capped at 8 messages per minute per IP so the page stays cheap to run. Real CHWs aren't rate-limited.
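A cap like this is a few lines of fixed-window counting per IP. A minimal in-memory sketch, assuming the limits in the copy above (the real middleware may differ):

```typescript
const WINDOW_MS = 60_000;  // one minute
const MAX_MESSAGES = 8;    // demo cap per IP

const windows = new Map<string, { start: number; count: number }>();

// Returns true if this IP may send another message in the current window.
function allowMessage(ip: string, now: number = Date.now()): boolean {
  const w = windows.get(ip);
  if (!w || now - w.start >= WINDOW_MS) {
    windows.set(ip, { start: now, count: 1 }); // new window
    return true;
  }
  if (w.count >= MAX_MESSAGES) return false;   // cap hit
  w.count += 1;
  return true;
}
```

A fixed window is the cheapest variant; a sliding window or token bucket would smooth bursts at the window boundary, but for a demo cap this is enough.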
WHAT HAPPENS BETWEEN SEND AND REPLY
  1. POST /api/demo-chat — your message hits Express
  2. Pre-LLM safety filter checks for dosing/diagnostic patterns
  3. OpenAI embeds your query, cosine-similarity over the corpus
  4. Top-3 chunks bundled into Claude's context
  5. Claude answers, grounded — or routes to a "general knowledge" fallback if the corpus didn't cover it
  6. Post-LLM scrubber catches any unsafe output
  7. Response logged with sources; severity tags extracted asynchronously
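The seven steps above can be sketched as a single handler. Every name below (preFilterOk, embed, retrieveTopChunks, askClaude, postScrub, logConversation) is a hypothetical stand-in, not luma's actual code; the stubs exist only so the flow runs end to end:

```typescript
type Chunk = { id: string; text: string };

// Hypothetical stubs; the real versions call OpenAI, Claude, and the ministry log.
const preFilterOk = (m: string) => !/\bdos(e|age|ing)\b/i.test(m); // step 2
const embed = async (m: string): Promise<number[]> => [m.length];  // step 3, toy vector
const retrieveTopChunks = (_v: number[], k: number): Chunk[] =>
  [{ id: "art-protocol", text: "..." }].slice(0, k);               // steps 3-4
const askClaude = async (_m: string, chunks: Chunk[]) =>
  `Grounded answer citing ${chunks.length} chunk(s).`;             // step 5
const postScrub = (draft: string) => draft;                        // step 6
const logConversation = async (..._args: unknown[]) => {};         // step 7

async function handleDemoChat(message: string) {
  if (!preFilterOk(message)) {
    // Step 2: refuse before any model call.
    return { reply: "I can't answer dosing or diagnostic questions.", sources: [] as string[] };
  }
  const queryVec = await embed(message);            // step 3
  const chunks = retrieveTopChunks(queryVec, 3);    // steps 3-4
  const draft = await askClaude(message, chunks);   // step 5
  const reply = postScrub(draft);                   // step 6
  void logConversation(message, reply, chunks);     // step 7, fire-and-forget
  return { reply, sources: chunks.map(c => c.id) };
}
```

Logging is deliberately fire-and-forget in this sketch: the reply goes back to the CHW without waiting on the database or the async tag extraction.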