Features
Vector Search (Semantic Retrieval)
File References
supabase/functions/search_chunks/index.ts
src/utils/transcription/TranscriptionService.ts (for calling the search function)
What It Does
The vector search system:
- Takes user input
- Uses GPT to parse the input into keywords and topics
- Retrieves relevant chunk IDs by comparing the query embeddings with the stored chunk embeddings
- Sorts the chunk results by similarity
- Optionally passes the results to GPT to produce a final summary
flowchart TD
    A[User Input] --> B[GPT Processing]
    B --> C[Keywords]
    B --> D[Topics]
    C --> E[Vector Embeddings]
    D --> E
    E --> F[Compare with Chunk Embeddings]
    F --> G[Sort by Similarity]
    G --> H[Relevant Chunks]
    H --> I{Merge with GPT?}
    I -->|Yes| J[Final Summary]
    I -->|No| K[Raw Results]
Key Points
- Query is broken down into "top_3_keywords" and "top_3_topics"
- Chunk embeddings are stored in pgvector columns for fast KNN lookups
- The similarity threshold can be tuned in the Edge Function
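The ranking step can be illustrated in memory. This is a minimal sketch, not the code in search_chunks: in the real system the comparison is done against pgvector columns, and the `Chunk` shape, the `rankChunks` name, the default threshold of 0.75, and the result limit are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real search_chunks function may differ.
interface Chunk {
  id: string;
  embedding: number[];
}

interface RankedChunk {
  id: string;
  similarity: number;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep chunks at or above the threshold, most similar first.
// The default threshold and limit are placeholder values, not the project's settings.
function rankChunks(
  query: number[],
  chunks: Chunk[],
  threshold = 0.75,
  limit = 10,
): RankedChunk[] {
  return chunks
    .map((c) => ({ id: c.id, similarity: cosineSimilarity(query, c.embedding) }))
    .filter((r) => r.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

Raising `threshold` narrows the result set to closer matches, while lowering it admits more loosely related chunks.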
Document Processing / Chunking
File References
supabase/functions/insert_transcript/index.ts
supabase/functions/process_chunk/index.ts
supabase/functions/_shared/chunks.ts
src/utils/audio/ (for capturing audio, if using live mode)
Process Flow
- Transcript Ingestion: User either pastes a transcript or records one live
- Clean & Split: Removes filler words, splits into sentences
- DetectTopicChunks: Uses a sliding window and a cosine similarity threshold to find chunk boundaries
- Process & Store:
  - Each chunk is sent to the process_chunk Edge Function
  - GPT-based metadata extraction (keywords, topics, sentiment, summary)
  - Storage in PostgreSQL with embeddings
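The Clean & Split step might look roughly like the sketch below. The filler-word list, the function names, and the punctuation-based sentence splitting are all assumptions for illustration; the actual rules in supabase/functions/_shared/chunks.ts are not shown in this document.

```typescript
// Illustrative filler list only; the project's real list is not documented here.
const FILLER_WORDS = ["um", "uh", "erm", "you know"];

// Remove filler words (plus an optional trailing comma) and collapse whitespace.
function cleanTranscript(text: string): string {
  const pattern = new RegExp(`\\b(?:${FILLER_WORDS.join("|")})\\b,?\\s*`, "gi");
  return text.replace(pattern, "").replace(/\s+/g, " ").trim();
}

// Split on sentence-ending punctuation, keeping the punctuation with its sentence.
function splitSentences(text: string): string[] {
  const matches = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

const cleaned = cleanTranscript("Um, the launch is uh delayed.  We need more time!");
const sentences = splitSentences(cleaned);
console.log(sentences);
```

A punctuation-based splitter is a simplification: abbreviations such as "e.g." would be split incorrectly, so a production pipeline may need a more careful tokenizer.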
flowchart TD
    A[Transcript Input] -->|Paste or Record| B[Transcript Ingestion]
    B --> C[Clean Text]
    C --> D[Split into Sentences]
    D --> E[Detect Topic Chunks]
    E -->|Sliding Window| F[Chunk Boundaries]
    F --> G[Process Each Chunk]
    G --> H[GPT Metadata Extraction]
    G --> I[Generate Embeddings]
    H --> J[Store in PostgreSQL]
    I --> J
    subgraph Metadata
        H --> K[Keywords]
        H --> L[Topics]
        H --> M[Sentiment]
        H --> N[Summary]
    end
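One plausible reading of the sliding-window boundary detection is sketched below: each sentence embedding is compared with the mean embedding of the last few sentences in the current chunk, and a new chunk starts when the similarity falls below a threshold. The function name `detectTopicChunksSketch`, the window size, the threshold value, and the choice of a mean vector are assumptions; the project's DetectTopicChunks may work differently.

```typescript
// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Element-wise mean of a non-empty list of vectors.
function meanVector(vectors: number[][]): number[] {
  const dim = vectors[0].length;
  const mean = new Array<number>(dim).fill(0);
  for (const v of vectors) {
    for (let i = 0; i < dim; i++) mean[i] += v[i] / vectors.length;
  }
  return mean;
}

// Groups sentence indices into chunks. A boundary is placed before sentence i
// when it is too dissimilar from the trailing window of the current chunk.
// windowSize and threshold are placeholder defaults, not the project's values.
function detectTopicChunksSketch(
  embeddings: number[][],
  windowSize = 2,
  threshold = 0.5,
): number[][] {
  const chunks: number[][] = [];
  let current: number[] = [];
  for (let i = 0; i < embeddings.length; i++) {
    if (current.length > 0) {
      const window = current.slice(-windowSize).map((idx) => embeddings[idx]);
      if (cosine(meanVector(window), embeddings[i]) < threshold) {
        chunks.push(current);
        current = [];
      }
    }
    current.push(i);
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

The function takes precomputed sentence embeddings so that it stays independent of whichever embedding API the Edge Functions call.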
Query Understanding
File References
src/components/Search/SearchHeader.tsx
src/pages/SearchResults.tsx
supabase/functions/search_chunks/index.ts
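As described under Vector Search, the query is broken down into "top_3_keywords" and "top_3_topics". A hedged sketch of validating that GPT output follows; it assumes the model returns a JSON object with those two fields as string arrays, which this document does not confirm, and `parseQueryBreakdown` is a hypothetical helper name.

```typescript
// Assumed response shape; only the two field names come from this document.
interface QueryBreakdown {
  top_3_keywords: string[];
  top_3_topics: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

// Parse and validate the raw model output, keeping at most three entries per field.
function parseQueryBreakdown(raw: string): QueryBreakdown {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const keywords = record["top_3_keywords"];
  const topics = record["top_3_topics"];
  if (!isStringArray(keywords) || !isStringArray(topics)) {
    throw new Error("Expected string arrays for top_3_keywords and top_3_topics");
  }
  return {
    top_3_keywords: keywords.slice(0, 3),
    top_3_topics: topics.slice(0, 3),
  };
}
```

Validating the response before embedding it guards against malformed model output reaching the similarity search.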
Result Ranking & Summaries
File References
src/pages/SearchResults.tsx
supabase/functions/search_chunks/index.ts
Implementation
- Chunks are sorted by vector similarity
- GPT receives a final "ragPrompt" and produces a user-friendly summary
- Expandable chunks with highlighted relevant lines
Note: Threshold tuning is crucial. A similarity threshold set too high can drop relevant context, while one set too low lets noisy chunks through.
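To make the "ragPrompt" step concrete, here is a small sketch of how ranked chunks could be assembled into a prompt. The wording of the prompt, the `buildRagPrompt` name, the `RankedChunkText` shape, and the `maxChunks` default are all hypothetical; the actual prompt in search_chunks is not reproduced in this document.

```typescript
// Hypothetical shape: chunk text paired with its similarity score.
interface RankedChunkText {
  text: string;
  similarity: number;
}

// Assemble a prompt from the most similar chunks, numbered so that the
// summary can refer back to individual excerpts.
function buildRagPrompt(
  question: string,
  chunks: RankedChunkText[],
  maxChunks = 5,
): string {
  const context = chunks
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, maxChunks)
    .map((c, i) => `[${i + 1}] ${c.text}`)
    .join("\n");
  return [
    "Answer the question using only the excerpts below.",
    "",
    "Excerpts:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Capping the number of chunks keeps the prompt within the model's context budget and limits how much low-similarity text reaches the summary.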