# Architecture
## System Overview
```mermaid
flowchart TD
    A[Frontend React App] --> B[Supabase Edge Functions]
    A --> C[AWS Transcribe]
    B --> D[(Supabase PostgreSQL)]
    B --> E[OpenAI GPT]

    subgraph Frontend
        A
    end
    subgraph Backend Services
        B
        C
        E
    end
    subgraph Database
        D
    end
```
Here's a high-level breakdown of the main components:
### Data Flow
```mermaid
flowchart LR
    A[Audio/Text Input] --> B[Preprocessing]
    B --> C[Vector Processing]
    C --> D[Storage]
    E[User Query] --> F[Search]
    F --> G[Results]
    D --> F

    subgraph Input Processing
        A
        B
    end
    subgraph Vector Operations
        C
        F
    end
    subgraph Data Layer
        D
    end
    subgraph Output
        G
    end
```
### 1. Frontend (`/src`)

```mermaid
flowchart TD
    A[Pages] --> B[Components]
    B --> C[Utilities]
    C --> D[External Services]

    subgraph Frontend Components
        direction LR
        B1[TranscriptDisplay]
        B2[SearchResults]
        B3[RecordingControls]
    end
    subgraph Utilities
        direction LR
        C1[TranscriptionService]
        C2[OpenAIClient]
        C3[SupabaseClient]
    end

    B --> B1
    B --> B2
    B --> B3
    C --> C1
    C --> C2
    C --> C3
```
- Pages: `Transcription.tsx`, `SearchResults.tsx`
- Components: `TranscriptDisplay.tsx`, `SearchResultCard.tsx`
- Zustand Store: `useMeetingStore.ts` handles global state (a minimal sketch follows this list):
  - Recording status
  - Current transcript text
  - Notification system
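
For reference, here is a minimal sketch of what `useMeetingStore.ts` could look like with Zustand. The exact state fields and action names are assumptions derived from the list above, not the actual store definition.

```typescript
// Hypothetical shape of useMeetingStore.ts -- field and action names are
// illustrative assumptions, not the real store.
import { create } from 'zustand';

interface MeetingState {
  isRecording: boolean;
  transcript: string;
  notification: string | null;
  startRecording: () => void;
  stopRecording: () => void;
  appendTranscript: (text: string) => void;
  notify: (message: string) => void;
}

export const useMeetingStore = create<MeetingState>()((set) => ({
  isRecording: false,
  transcript: '',
  notification: null,
  startRecording: () => set({ isRecording: true }),
  stopRecording: () => set({ isRecording: false }),
  // Append newly transcribed text to the current transcript
  appendTranscript: (text) =>
    set((state) => ({ transcript: state.transcript + text })),
  notify: (message) => set({ notification: message }),
}));
```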
### 2. Supabase Backend (`/supabase/functions`)
- `insert_transcript`: Takes a raw transcript → cleans → splits → detects chunks → calls `process_chunk`
- `process_chunk`: Embeds the chunk text, extracts GPT-based metadata, stores it in the `chunks` table (sketched just below)
- `search_chunks`: Handles the user's query → keywords + topics → fetches relevant chunks → final GPT summary
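
To make the pipeline concrete, here is a rough sketch of how `process_chunk` might be laid out as a Deno Edge Function. The request body shape, table and column names, and the embedding model are assumptions; only the overall embed-then-insert flow follows the description above.

```typescript
// Hypothetical skeleton for supabase/functions/process_chunk/index.ts.
// Request fields, table/column names, and the model are assumptions.
import { createClient } from 'https://esm.sh/@supabase/supabase-js@2';

Deno.serve(async (req) => {
  const { meetingId, chunkText } = await req.json();

  // 1. Embed the chunk text (embedding model is an assumption)
  const embedRes = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${Deno.env.get('OPENAI_API_KEY')}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'text-embedding-3-small', input: chunkText }),
  });
  const embedding = (await embedRes.json()).data[0].embedding;

  // 2. Store the chunk and its embedding (pgvector column) in the chunks table
  const supabase = createClient(
    Deno.env.get('SUPABASE_URL')!,
    Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!,
  );
  const { error } = await supabase
    .from('chunks')
    .insert({ meeting_id: meetingId, content: chunkText, embedding });

  return new Response(JSON.stringify({ ok: !error }), {
    headers: { 'Content-Type': 'application/json' },
  });
});
```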
### 3. AWS Transcribe
- `src/utils/audio/` + `src/utils/aws/` handle real-time audio streaming
- `TranscribeClient` calls `startTranscription`, streams text back (see the streaming sketch below)
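
A simplified sketch of what the streaming call can look like with the `@aws-sdk/client-transcribe-streaming` package. The region, sample rate, and the shape of the audio source are assumptions rather than the project's actual configuration.

```typescript
// Simplified sketch of real-time streaming with AWS Transcribe.
// The audio source (pcmChunks) and callback are placeholders.
import {
  TranscribeStreamingClient,
  StartStreamTranscriptionCommand,
} from '@aws-sdk/client-transcribe-streaming';

export async function startTranscription(
  pcmChunks: AsyncIterable<Uint8Array>, // e.g. 16 kHz PCM from the mic
  onText: (text: string, isPartial: boolean) => void,
) {
  const client = new TranscribeStreamingClient({ region: 'us-east-1' });

  const command = new StartStreamTranscriptionCommand({
    LanguageCode: 'en-US',
    MediaSampleRateHertz: 16000,
    MediaEncoding: 'pcm',
    // Wrap each PCM chunk in the AudioEvent envelope Transcribe expects
    AudioStream: (async function* () {
      for await (const chunk of pcmChunks) {
        yield { AudioEvent: { AudioChunk: chunk } };
      }
    })(),
  });

  const response = await client.send(command);

  // Stream transcript results back as they arrive
  for await (const event of response.TranscriptResultStream ?? []) {
    for (const result of event.TranscriptEvent?.Transcript?.Results ?? []) {
      const text = result.Alternatives?.[0]?.Transcript ?? '';
      onText(text, result.IsPartial ?? true);
    }
  }
}
```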
### 4. OpenAI
- `supabase/functions/_shared/openAIClient.ts` for chunk analysis
- `src/utils/openai/summaryService.ts` for generating quick titles (a sketch follows this list)
- Used in both Edge Functions and minimal client-side operations
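
For illustration, a hedged sketch of a quick-title helper in the style of `summaryService.ts`. The function name, model, and prompt wording are assumptions, not the project's actual code.

```typescript
// Illustrative sketch of a summaryService.ts-style helper.
// Model name and prompt are assumptions.
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function generateQuickTitle(transcript: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: 'Write a short, descriptive meeting title.' },
      // Keep token usage low by truncating the transcript
      { role: 'user', content: transcript.slice(0, 4000) },
    ],
  });
  return completion.choices[0]?.message?.content?.trim() ?? 'Untitled meeting';
}
```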
## Data Flow
### 1. Real-time or Pasted Transcript
- Live: `TranscriptionWorker.ts` captures mic data → AWS Transcribe → text appended to state (wired up in the sketch after this list)
- Pasted: Text stored in state directly
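
The wiring between the transcription stream and the store could look roughly like the following. It reuses the `startTranscription` and `useMeetingStore` sketches above; both names and the import paths are assumptions, not the real worker code.

```typescript
// Illustrative wiring of live transcription into global state.
// Import paths and helper names are assumptions from the sketches above.
import { useMeetingStore } from '../store/useMeetingStore';
import { startTranscription } from '../aws/startTranscription';

export async function runLiveTranscription(pcmChunks: AsyncIterable<Uint8Array>) {
  const { appendTranscript } = useMeetingStore.getState();
  await startTranscription(pcmChunks, (text, isPartial) => {
    // Only append finalized segments so partial previews don't accumulate
    if (!isPartial) appendTranscript(text + ' ');
  });
}
```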
### 2. Insert Transcript
- User triggers `saveMeeting` in `useMeetingStore`
- `saveMeeting` calls the `insert_transcript` Edge Function (client-side sketch after this list)
- `insert_transcript` calls `process_chunk` for each chunk
- Chunk data & embeddings are stored in the DB
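
A minimal client-side sketch of how `saveMeeting` could invoke the Edge Function via `supabase.functions.invoke`. The Vite-style environment variable names and the payload fields are assumptions.

```typescript
// Hypothetical saveMeeting call path -- env vars and payload are assumptions.
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  import.meta.env.VITE_SUPABASE_URL,
  import.meta.env.VITE_SUPABASE_ANON_KEY,
);

export async function saveMeeting(title: string, transcript: string) {
  // Invokes the insert_transcript Edge Function, which chunks the transcript
  // and fans out to process_chunk for embedding + metadata extraction.
  const { data, error } = await supabase.functions.invoke('insert_transcript', {
    body: { title, transcript },
  });
  if (error) throw error;
  return data;
}
```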
### 3. Search
- User queries in `SearchHeader` → triggers `search_chunks`
- The query is embedded and matched against stored chunk embeddings (see the RPC sketch after this list)
- GPT produces an "AI Generated Summary"
- UI displays the summary plus expandable chunks
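
Inside `search_chunks`, the similarity lookup is typically exposed as a Postgres function called over RPC. The `match_chunks` name and its parameters below are hypothetical, but the pattern matches the pgvector approach noted under Key Design Decisions.

```typescript
// Hypothetical similarity lookup inside search_chunks -- the match_chunks
// RPC and its parameters are illustrative, not the project's actual schema.
import { createClient } from 'https://esm.sh/@supabase/supabase-js@2';

export async function findRelevantChunks(queryEmbedding: number[]) {
  const supabase = createClient(
    Deno.env.get('SUPABASE_URL')!,
    Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!,
  );

  // match_chunks would be a SQL function ordering by pgvector distance
  // (e.g. embedding <=> query_embedding) and returning the top matches.
  const { data, error } = await supabase.rpc('match_chunks', {
    query_embedding: queryEmbedding,
    match_count: 8,
  });
  if (error) throw error;
  return data;
}
```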
## Key Design Decisions
- Chunking Strategy: Balance between context preservation and search precision (a toy chunker is sketched after this list)
- Edge Functions: Serverless approach for scalability and parallel processing
- Vector Search: pgvector for efficient similarity matching
- Real-time Processing: Stream-based approach for live transcription
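
To make the chunking trade-off concrete, here is a toy overlapping chunker: larger chunks preserve more context for GPT, smaller chunks make vector search more precise. The sizes are illustrative, not the values `insert_transcript` actually uses.

```typescript
// Toy overlapping chunker illustrating the context-vs-precision trade-off.
// Chunk size and overlap are illustrative values, not the real settings.
export function chunkTranscript(
  text: string,
  chunkSize = 1000, // characters of context kept together
  overlap = 200,    // characters shared between neighbouring chunks
): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```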
Note: The architecture prioritizes both performance and cost-effectiveness, with careful consideration given to token usage and processing overhead.