No AWS. No Vercel. No containers. Every component — inference, storage, sessions, queues — runs on Cloudflare's global edge across 300+ points of presence. Here's exactly how it works.
Every request enters through Cloudflare Workers and stays on the edge. No round-trips to a centralised cloud region.
Kimi K2.6 inference — MoE model with native function-calling. Runs on Cloudflare GPUs at the edge with zero cold starts.
SQLite at the edge. 7 tables with multi-tenant indexes, foreign keys, and CHECK constraints. Global replication.
Stateful agent sessions per (marina_id, conversation_id). Maintains context across multi-turn voice conversations.
Globally replicated key-value store for hot-path data: rate cards, agent configs. Sub-ms reads from 300+ PoPs.
S3-compatible object storage for contracts (PDF), call recordings, and email attachments. Zero egress fees.
Vector database for RAG over marina policies, FAQ docs, and historical interactions. Enables policy citation.
Sits in front of all LLM calls. Response caching, rate limiting, cost tracking, and automatic fallback routing.
Async task processing: DockMaster sync, email dispatch, Slack notifications, contract generation.
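The components above all surface in a Worker as bindings. A minimal sketch of what that wiring could look like — binding names and the `@cloudflare/workers-types` ambient types are illustrative, not the project's actual `wrangler.toml`:

```typescript
// Hypothetical Worker bindings for the stack described above.
// Types (D1Database, KVNamespace, …) come from @cloudflare/workers-types.
export interface Env {
  DB: D1Database;                    // D1: slips, bookings, audit events
  CONFIG_KV: KVNamespace;            // KV: rate cards, agent configs (hot path)
  CONTRACTS: R2Bucket;               // R2: PDFs, recordings, attachments
  POLICY_INDEX: VectorizeIndex;      // Vectorize: RAG over marina policies
  SESSIONS: DurableObjectNamespace;  // per-(marina_id, conversation_id) state
  TASKS: Queue;                      // async PMS sync, email, Slack
  AI: Ai;                            // Workers AI (Whisper STT, MeloTTS fallback)
}

// Sessions are addressed by a composite name, so the same guest
// conversation always routes to the same Durable Object instance.
export function sessionStub(env: Env, marinaId: string, conversationId: string) {
  const id = env.SESSIONS.idFromName(`${marinaId}:${conversationId}`);
  return env.SESSIONS.get(id);
}
```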
Three models working together: one thinks, one listens, one speaks.
Primary reasoning model. MoE architecture activates only relevant expert sub-networks per token, keeping inference cost low at high volume.
Speech-to-text for all voice interactions. Processes Twilio Media Stream audio chunks in near real-time with speaker diarisation.
Text-to-speech for voice responses. ElevenLabs for production-quality voices; Workers AI MeloTTS as a zero-latency fallback.
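The primary/fallback voice strategy can be sketched as a race against a latency budget. The function shape, timeout value, and injected synthesizers are assumptions for illustration — the real code would pass ElevenLabs and Workers AI MeloTTS calls as the two arguments:

```typescript
// Try the primary (production-quality) voice first; if it errors or
// blows the latency budget, degrade to the edge fallback voice.
type Synthesize = (text: string) => Promise<ArrayBuffer>;

export async function speak(
  text: string,
  primary: Synthesize,   // e.g. ElevenLabs streaming API
  fallback: Synthesize,  // e.g. Workers AI MeloTTS
  timeoutMs = 1000,      // illustrative budget
): Promise<ArrayBuffer> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("TTS timeout")), timeoutMs);
  });
  try {
    // Race the primary voice against the budget.
    return await Promise.race([primary(text), timeout]);
  } catch {
    // Budget blown or provider error: fall back to the edge voice.
    return fallback(text);
  } finally {
    clearTimeout(timer); // don't leave the timer pending after a win
  }
}
```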
Kimi K2.6 calls these tools via native function-calling. The model receives JSON-schema definitions and returns structured tool_call objects.
Queries D1 for matching slips with date-overlap exclusion and vessel dimension filtering.
Deterministic pricing engine: base × season × DOW × occupancy × events. Never LLM-generated.
Generates PDF rental agreement from template, stores in R2, returns DocuSign e-sign link.
Creates Stripe Checkout session with booking amount. Returns payment link to guest.
Writes confirmed booking to D1, syncs to DockMaster PMS via API, logs audit event.
A sixth tool the agent can call at any point to route the conversation to a human. Triggered automatically when confidence drops below threshold, the dollar cap is exceeded, or max turns is reached.
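The tool loop above can be sketched as: JSON-schema definitions go to the model, and the returned `tool_call` objects are dispatched to typed handlers. The exact field names and schemas are assumptions; only the tool names come from the list above:

```typescript
// Hypothetical tool_call shape and dispatcher. Unknown tool names are
// routed to escalate_to_human rather than letting the model improvise.
type ToolCall = { name: string; arguments: Record<string, unknown> };
type Handler = (args: Record<string, unknown>) => Promise<unknown>;

export const toolSchemas = [
  {
    name: "check_availability",
    description: "Find open slips for a date range and vessel size",
    parameters: {
      type: "object",
      properties: {
        marina_id: { type: "string" },
        arrival: { type: "string", format: "date" },
        departure: { type: "string", format: "date" },
        loa_ft: { type: "number" },  // vessel length overall
        beam_ft: { type: "number" },
      },
      required: ["marina_id", "arrival", "departure", "loa_ft"],
    },
  },
  // …quote_price, generate_contract, create_payment_link,
  //   confirm_booking, escalate_to_human defined the same way
];

export async function dispatch(
  call: ToolCall,
  handlers: Record<string, Handler>,
): Promise<unknown> {
  const handler = handlers[call.name];
  if (!handler) {
    // Unknown tool name from the model: hand off to a human.
    return handlers["escalate_to_human"]({ reason: `unknown tool ${call.name}` });
  }
  return handler(call.arguments);
}
```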
7 tables, all scoped by marina_id for strict multi-tenant isolation.
marinas
Tenant root table. One row per marina property.
slips
Physical slip inventory with dimensions and amenities.
rate_cards
Pricing configuration with JSON curve definitions.
agent_configs
Per-marina AI agent personality and guardrails.
inquiries
Every inbound interaction across all channels.
bookings
Confirmed reservations with PMS sync status.
events
Full audit trail — every action the agent takes.
Every D1 table, KV key prefix, Vectorize namespace, R2 bucket prefix, and Durable Object ID includes marina_id as a scoping dimension. Zero cross-tenant data leakage by design.
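One way to enforce that scoping rule is to build every D1 statement through a single helper that forces a `marina_id` predicate, so an unscoped query can't be written by accident. Table names come from the schema above; the helper itself is an illustrative sketch, not the project's code:

```typescript
// Tenant-scoped query builder: every SELECT against a tenant table
// carries WHERE marina_id = ? as its first predicate.
const TENANT_TABLES = new Set([
  "slips", "rate_cards", "agent_configs", "inquiries", "bookings", "events",
]);

export function scopedQuery(
  table: string,
  marinaId: string,
  where = "",           // optional extra predicate, e.g. "status = 'open'"
): { sql: string; binds: string[] } {
  if (!TENANT_TABLES.has(table)) {
    throw new Error(`unknown tenant table: ${table}`);
  }
  const extra = where ? ` AND ${where}` : "";
  return {
    sql: `SELECT * FROM ${table} WHERE marina_id = ?${extra}`,
    binds: [marinaId],
  };
}
```

In a Worker this would feed straight into D1's prepared statements, e.g. `env.DB.prepare(sql).bind(...binds).all()`.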
The LLM never generates prices. Every dollar amount comes from this formula, executed deterministically on the Worker.
LLMs are great at conversation but unreliable at arithmetic. A hallucinated price creates legal liability and erodes guest trust. By running pricing as a pure function on the Worker, the agent can confidently quote exact rates that match your published rate card.
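The formula above reduces to a pure function. Field names, units, and rounding are assumptions for illustration — the point is that no LLM output ever touches the arithmetic:

```typescript
// base × season × day-of-week × occupancy × events, in integer cents.
export interface RateFactors {
  baseNightly: number;  // from rate_cards, in cents
  season: number;       // e.g. 1.25 in high season
  dayOfWeek: number;    // e.g. 1.1 on Fri/Sat
  occupancy: number;    // e.g. 1.15 when the marina is nearly full
  events: number;       // e.g. 1.5 during a regatta weekend
}

export function quoteNightlyCents(f: RateFactors): number {
  const price = f.baseNightly * f.season * f.dayOfWeek * f.occupancy * f.events;
  return Math.round(price); // integer cents: no float drift in quotes
}
```

Because it is deterministic, the same inputs always produce the same quote, and the quote can be re-derived later from the audit trail.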
Production AI needs more than vibes. These are hard constraints, not suggestions.
From phone ring to spoken response in under 1.5 seconds.
Guest calls marina number. Twilio opens a Media Stream WebSocket. (~100ms)
Channel Router receives audio chunks via WebSocket on Cloudflare edge. (~5ms)
Audio chunks transcribed to text in near real-time. Partial results streamed. (~300ms)
Transcript → agent reasoning → tool calls → response text. (~600ms)
Response text → natural speech audio. First byte in < 200ms. (~200ms)
Audio streamed back to caller via Media Stream. (~100ms)
Everything that powers Harbourmaster AI, in one table.
| Layer | Technology | Purpose |
|---|---|---|
| Framework | Hono 4 | Lightweight, fast web framework for Workers |
| Build | Vite + @hono/vite-build | SSR bundle for Cloudflare Pages |
| Runtime | Cloudflare Workers | V8 isolates at 300+ global PoPs |
| LLM | Kimi K2.6 (MoE) | Reasoning + native function-calling |
| STT | Whisper Large v3 Turbo | Real-time speech transcription |
| TTS | ElevenLabs / MeloTTS | Natural voice synthesis |
| Database | Cloudflare D1 (SQLite) | Relational data, multi-tenant |
| KV Store | Cloudflare Workers KV | Config, rate cards, session cache |
| Object Storage | Cloudflare R2 | Contracts, recordings, attachments |
| Vector DB | Cloudflare Vectorize | RAG over marina policies |
| Sessions | Durable Objects | Stateful multi-turn agent sessions |
| Gateway | Cloudflare AI Gateway | LLM caching, rate limits, fallback |
| Queues | Cloudflare Queues | Async PMS sync, notifications |
| Auth | Google OAuth 2.0 + JWT | SSO for dashboard with session cookies |
| Voice | Twilio Media Streams | Telephony ingress/egress |
| Email | Resend + CF Email Routing | Inbound/outbound email |
| SMS | Twilio Messaging | Text message channel |
| Payments | Stripe Checkout | Guest payment collection |
| Contracts | DocuSign | E-signature for rental agreements |
| PMS | DockMaster API | Property management sync |
| Alerts | Slack API | Staff notifications & escalations |
| Frontend | Tailwind CSS + Space Grotesk | Utility-first styling, Abyssal Intelligence theme |
| Language | TypeScript (ES2022 target) | Type-safe Workers code |
Open the dashboard, try the live chat, explore the API. Everything's running.