Implementation Details
Transcript Upload Flow
```mermaid
flowchart TD
    A[Raw Transcript] --> B[Clean Text]
    B --> C[Split into Sentences]
    C --> D[Generate Embeddings]
    D --> E[Group into Chunks]
    E --> F[Distribute Chunks]
    F --> |Chunk 1| W1[Worker 1]
    F --> |Chunk 2| W2[Worker 2]
    F --> |Chunk 3| W3[Worker 3]
    F --> |Chunk N| W4[Worker N]
    W1 --> DB[(Database)]
    W2 --> DB
    W3 --> DB
    W4 --> DB
    W1 --> F2[Combine Results]
    W2 --> F2
    W3 --> F2
    W4 --> F2
    F2 --> G[Store Transcript]
    G --> DB
    subgraph Preprocessing
        B
        C
        D
    end
    subgraph Chunk Processing
        E
        F
        W1
        W2
        W3
        W4
        F2
    end
```
Frontend Components
`TranscriptDisplay.tsx` + `RecordingControls.tsx` handle user actions (start, stop, upload). On stop, `saveMeeting` is called.
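A minimal sketch of that wiring, assuming the store exposes the methods listed under Frontend Architecture below; the import path and button markup are illustrative, not the actual component code:

```tsx
// RecordingControls.tsx — illustrative sketch (import path and markup are assumptions)
import { useMeetingStore } from "../store/useMeetingStore";

export function RecordingControls() {
  const { isRecording, startRecording, stopRecording, saveMeeting } = useMeetingStore();

  // Stop first ends the recording, then persists the transcript.
  const handleStop = async () => {
    stopRecording();
    await saveMeeting();
  };

  return (
    <div>
      <button onClick={startRecording} disabled={isRecording}>Start</button>
      <button onClick={handleStop} disabled={!isRecording}>Stop</button>
    </div>
  );
}
```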
Backend Processing
`insert_transcript/index.ts` is the main insertion pipeline:
- Cleans transcript (removes filler words)
- Splits sentences → embeddings
- Groups sentences into chunks (by topic similarity)
- For each chunk, calls `process_chunk`
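A simplified sketch of what that pipeline can look like as a Supabase Edge Function. The filler-word list, embedding model, similarity threshold, and the `groupBySimilarity` helper are assumptions for illustration, not the actual implementation:

```ts
// insert_transcript/index.ts — simplified sketch of the pipeline shape
import { createClient } from "npm:@supabase/supabase-js@2";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
);

const FILLERS = /\b(um|uh|you know|like)\b/gi; // assumed filler-word list

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Start a new chunk whenever the next sentence drifts away from the previous one.
function groupBySimilarity(sentences: string[], embeddings: number[][], threshold = 0.75): string[] {
  const groups: string[][] = [];
  sentences.forEach((s, i) => {
    if (i === 0 || cosine(embeddings[i - 1], embeddings[i]) < threshold) groups.push([s]);
    else groups[groups.length - 1].push(s);
  });
  return groups.map((g) => g.join(" "));
}

Deno.serve(async (req) => {
  const { transcript, meetingId } = await req.json();

  // 1. Clean: strip filler words and collapse whitespace.
  const cleaned = transcript.replace(FILLERS, "").replace(/\s+/g, " ").trim();

  // 2. Split into sentences and embed them in one batch.
  const sentences: string[] = cleaned.split(/(?<=[.!?])\s+/);
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: sentences }),
  });
  const embeddings: number[][] = (await res.json()).data.map(
    (d: { embedding: number[] }) => d.embedding,
  );

  // 3. Group sentences into topic-coherent chunks.
  const chunks = groupBySimilarity(sentences, embeddings);

  // 4. Fan out to process_chunk for each chunk in parallel.
  await Promise.all(
    chunks.map((text) =>
      supabase.functions.invoke("process_chunk", { body: { meetingId, text } }),
    ),
  );

  return new Response(JSON.stringify({ chunks: chunks.length }), {
    headers: { "Content-Type": "application/json" },
  });
});
```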
Chunk Metadata
`process_chunk/index.ts` uses GPT to generate:
- `top_3_keywords`
- `top_3_topics`
- Short summary
- Sentiment
- To-dos
Each chunk is inserted into the `chunks` table with these fields.
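A sketch of how that generation step might look; the prompt, model name, and exact column names are assumptions here, not the real schema:

```ts
// process_chunk/index.ts — illustrative sketch of the metadata step
import { createClient } from "npm:@supabase/supabase-js@2";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
);

Deno.serve(async (req) => {
  const { meetingId, text } = await req.json();

  // Ask GPT for structured metadata about this chunk.
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // assumed model
      response_format: { type: "json_object" },
      messages: [{
        role: "user",
        content:
          `Return JSON with keys top_3_keywords, top_3_topics, summary, sentiment, todos ` +
          `for this meeting excerpt:\n\n${text}`,
      }],
    }),
  });
  const meta = JSON.parse((await res.json()).choices[0].message.content);

  // Persist the chunk together with its generated metadata (column names assumed).
  const { error } = await supabase.from("chunks").insert({
    meeting_id: meetingId,
    content: text,
    top_3_keywords: meta.top_3_keywords,
    top_3_topics: meta.top_3_topics,
    summary: meta.summary,
    sentiment: meta.sentiment,
    todos: meta.todos,
  });

  return new Response(JSON.stringify({ ok: !error }), {
    headers: { "Content-Type": "application/json" },
  });
});
```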
Search & Retrieval Logic
```mermaid
flowchart TD
    A[User Query] --> B[Extract Keywords & Topics]
    B --> C[Generate Query Embeddings]
    C --> D[Find Similar Keywords/Topics]
    D --> E[Get Associated Chunks]
    E --> F[Generate Answer with Context]
    F --> G[Return Results]
    A --> F
    subgraph OpenAI
        B
        F
    end
    subgraph Vector DB
        C[Generate Query Embeddings]
        D[Find Similar Keywords/Topics<br/>via Cosine Similarity]
        E[Get Associated Chunks<br/>from Keyword/Topic Mappings]
    end
    subgraph Database
        H[(Chunk ID to<br/>Keyword/Topic Mappings)]
    end
    E <--> H
```
Backend (`search_chunks/index.ts`)
- Extract keywords/topics from the user query
- Embed them
- Use the `find_similar_keywords` RPC to find chunk IDs above a similarity threshold
- Sort chunk results by similarity
- Generate final summary with GPT
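A condensed sketch of those steps; the keyword-extraction GPT call and the final summary call are omitted, and the RPC argument names, threshold, and returned column names are assumptions:

```ts
// search_chunks/index.ts — condensed sketch of the retrieval path
import { createClient } from "npm:@supabase/supabase-js@2";

const supabase = createClient(
  Deno.env.get("SUPABASE_URL")!,
  Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
);

async function embed(text: string): Promise<number[]> {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: text }),
  });
  return (await res.json()).data[0].embedding;
}

Deno.serve(async (req) => {
  const { query } = await req.json();

  // 1–2. Extract keywords/topics from the query (GPT call omitted here) and embed them.
  const queryEmbedding = await embed(query);

  // 3. Ask Postgres (pgvector) for chunk IDs above the similarity threshold.
  const { data: matches } = await supabase.rpc("find_similar_keywords", {
    query_embedding: queryEmbedding,
    match_threshold: 0.8, // assumed argument names and threshold
  });

  // 4. Sort by similarity and load the associated chunks.
  const chunkIds = (matches ?? [])
    .sort((a: { similarity: number }, b: { similarity: number }) => b.similarity - a.similarity)
    .map((m: { chunk_id: string }) => m.chunk_id);
  const { data: chunks } = await supabase.from("chunks").select("*").in("id", chunkIds);

  // 5. A final GPT call (omitted) summarizes these chunks as the answer context.
  return new Response(JSON.stringify({ chunks }), {
    headers: { "Content-Type": "application/json" },
  });
});
```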
Frontend Integration
`SearchResults.tsx` receives the server response (`SearchResult` type) and renders:
- Top "AI Generated Summary"
- List of chunk cards (expandable sections)
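A minimal sketch of that rendering, assuming a `SearchResult` shape with a top-level summary plus a list of chunks (the real type may differ):

```tsx
// SearchResults.tsx — illustrative sketch; the SearchResult shape shown here is assumed
import { useState } from "react";

interface SearchResult {
  summary: string; // the "AI Generated Summary"
  chunks: { id: string; summary: string; content: string }[];
}

export function SearchResults({ result }: { result: SearchResult }) {
  const [expanded, setExpanded] = useState<string | null>(null);

  return (
    <div>
      <section>
        <h2>AI Generated Summary</h2>
        <p>{result.summary}</p>
      </section>
      {result.chunks.map((chunk) => (
        // Each chunk card expands to show its full content when clicked.
        <article
          key={chunk.id}
          onClick={() => setExpanded(expanded === chunk.id ? null : chunk.id)}
        >
          <h3>{chunk.summary}</h3>
          {expanded === chunk.id && <p>{chunk.content}</p>}
        </article>
      ))}
    </div>
  );
}
```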
Audio Processing
```mermaid
flowchart TD
    A[Audio Input] --> B[Downsample to 16kHz]
    B --> C[Stream to AWS]
    C --> D[Real-time Transcription]
    D --> E[Update UI]
    E --> F[Save Transcript]
    subgraph Audio Processing
        B
    end
    subgraph AWS Transcribe
        C
        D
    end
    subgraph Frontend
        E
    end
```
Frontend Architecture
1. Zustand Store (`useMeetingStore.ts`)
- Central source for app state:
  - Recording state
  - User ID
  - Transcript text
- Provides methods:
  - `startRecording()`
  - `stopRecording()`
  - `saveMeeting()`
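A minimal sketch of the store shape, assuming the state fields listed above; the `appendTranscript` helper and the plain `fetch` call (standing in for invoking the `insert_transcript` Edge Function) are illustrative:

```ts
// useMeetingStore.ts — minimal sketch of the store (field names are assumptions)
import { create } from "zustand";

interface MeetingState {
  isRecording: boolean;
  userId: string | null;
  transcript: string;
  startRecording: () => void;
  stopRecording: () => void;
  appendTranscript: (text: string) => void;
  saveMeeting: () => Promise<void>;
}

export const useMeetingStore = create<MeetingState>((set, get) => ({
  isRecording: false,
  userId: null,
  transcript: "",
  startRecording: () => set({ isRecording: true, transcript: "" }),
  stopRecording: () => set({ isRecording: false }),
  appendTranscript: (text) => set((s) => ({ transcript: `${s.transcript} ${text}`.trim() })),
  // Persist the finished transcript; the endpoint path is simplified — in practice
  // this goes through the Supabase functions URL for insert_transcript.
  saveMeeting: async () => {
    const { transcript, userId } = get();
    await fetch("/functions/v1/insert_transcript", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ transcript, userId }),
    });
  },
}));
```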
2. Transcription Service
`src/utils/transcription/TranscriptionService.ts` orchestrates:
- Audio capturing
- Chunk buffering
- Sending to AWS Transcribe
- Storing transcripts in Supabase
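A condensed sketch of that orchestration using the AWS SDK's streaming client; the region, credential setup, buffer timing, and callback wiring are assumptions, and the Supabase save is delegated to the caller:

```ts
// TranscriptionService.ts — condensed sketch of the orchestration
import {
  TranscribeStreamingClient,
  StartStreamTranscriptionCommand,
} from "@aws-sdk/client-transcribe-streaming";

export class TranscriptionService {
  private client = new TranscribeStreamingClient({ region: "us-east-1" }); // region assumed
  private audioQueue: Uint8Array[] = [];
  private running = false;

  // Fed by the audio pipeline with 16kHz PCM frames (see Audio Processing above).
  pushAudio(frame: Uint8Array) {
    this.audioQueue.push(frame);
  }

  // Yields buffered PCM frames to the streaming API as they arrive.
  private async *audioStream() {
    while (this.running || this.audioQueue.length > 0) {
      const frame = this.audioQueue.shift();
      if (frame) yield { AudioEvent: { AudioChunk: frame } };
      else await new Promise((r) => setTimeout(r, 50));
    }
  }

  async start(onTranscript: (text: string) => void) {
    this.running = true;
    const response = await this.client.send(
      new StartStreamTranscriptionCommand({
        LanguageCode: "en-US",
        MediaEncoding: "pcm",
        MediaSampleRateHertz: 16000,
        AudioStream: this.audioStream(),
      }),
    );

    // Forward each finalized result to the caller (e.g. the Zustand store),
    // which saves the transcript to Supabase when the meeting ends.
    for await (const event of response.TranscriptResultStream ?? []) {
      for (const result of event.TranscriptEvent?.Transcript?.Results ?? []) {
        const text = result.Alternatives?.[0]?.Transcript;
        if (text && !result.IsPartial) onTranscript(text);
      }
    }
  }

  stop() {
    this.running = false;
  }
}
```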
3. Key Components
- `TranscriptDisplay.tsx`: Shows current transcript
- `SummaryDisplay.tsx`: Shows meeting score & GPT-based summary
- `SearchHeader.tsx`: For entering queries
- `SearchResults.tsx`: Renders final results
Challenges & Solutions
1. Token Costs
Implemented chunk-based retrieval so GPT only receives the relevant chunks instead of whole transcripts, cutting token usage per call
2. Rate Limits / Parallel Processing
Chose Supabase's pgvector over Upstash Redis because of Upstash's daily command limits
3. Local vs. Production
- Use `supabase start` to test Edge Functions locally
- Environment variables in `.env` for dev, Supabase project settings for production
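A minimal sketch of how that split plays out in the functions themselves; the secret name is just an example:

```ts
// The same Deno.env.get call resolves against the local .env in dev
// and against the Supabase project settings once deployed.
const openAiKey = Deno.env.get("OPENAI_API_KEY");
if (!openAiKey) throw new Error("OPENAI_API_KEY is not configured");
```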
4. Audio Processing
AWS Transcribe expects 16kHz PCM, so an AudioWorklet was implemented to downsample audio from the user's default device sample rate
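A simplified sketch of that downsampling step as an AudioWorklet processor; it uses naive decimation for clarity (a production resampler would interpolate and low-pass filter first), and the processor name is illustrative:

```ts
// downsample-processor.ts — simplified sketch of the downsampling worklet
// AudioWorkletProcessor, registerProcessor, and sampleRate are globals of the
// AudioWorklet scope (they are not in the default DOM type library).
class DownsampleProcessor extends AudioWorkletProcessor {
  private readonly targetRate = 16000;

  process(inputs: Float32Array[][]): boolean {
    const input = inputs[0]?.[0];
    if (!input) return true;

    // Pick every Nth sample to go from the device rate (e.g. 48kHz) down to 16kHz.
    const ratio = sampleRate / this.targetRate;
    const output = new Int16Array(Math.floor(input.length / ratio));
    for (let i = 0; i < output.length; i++) {
      // Convert float [-1, 1] samples to 16-bit PCM.
      const sample = Math.max(-1, Math.min(1, input[Math.floor(i * ratio)]));
      output[i] = sample < 0 ? sample * 0x8000 : sample * 0x7fff;
    }

    // Hand the PCM frame back to the main thread, which forwards it to AWS Transcribe.
    this.port.postMessage(output.buffer, [output.buffer]);
    return true;
  }
}

registerProcessor("downsample-processor", DownsampleProcessor);
```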
5. Team Collaboration
- JIRA/Trello for pipeline tasks
- Separate environment for user interviews