Feature: Vocabulary System
Status: ⏳ Planned
Priority: High
Complexity: High
Estimate: 9-13 hours
Assignee: -
Created: May 31, 2025
Target Completion: -
PR: -
Related Features: Infrastructure Setup, Lesson Management, AI Services (TTS)
📌 Overview
Purpose
Implement a comprehensive vocabulary management system that includes word storage, retrieval, audio generation, and integration with lessons.
User Story
As a learner, I want to study German vocabulary with translations, articles (der/die/das), and audio pronunciations so that I can build my word knowledge effectively.
Acceptance Criteria
📋 Requirements
Functional Requirements
| ID | Requirement | Priority |
|----|-------------|----------|
| FR-001 | CRUD operations for vocabulary words | High |
| FR-002 | Associate words with lessons | High |
| FR-003 | Store article information (der/die/das) | High |
| FR-004 | Generate audio for each word | High |
| FR-005 | Import vocabulary from Goethe Institut | High |
| FR-006 | Import vocabulary from DW Learn German | High |
| FR-007 | Filter vocabulary by lesson/level/source | Medium |
| FR-008 | Vocabulary exercises (flashcards, matching) | Medium |
| FR-009 | Admin bulk import functionality | Medium |
Non-Functional Requirements
- Performance: Vocabulary listing < 100 ms
- Storage: Audio files ~50-100 KB per word
- Security: Only admins can create/edit/delete words
- Data Integrity: Word-article combinations must be valid
🏗️ Technical Design
Components Involved
- Backend: VocabularyController, VocabularyService, VocabularyImportService
- Database: Vocabulary table
- Models: Vocabulary, VocabularyDto
- External: Coqui TTS for audio generation
- Frontend: VocabularyTab, VocabularyCard, VocabularyExercise components
Data Flow
1. Admin imports vocabulary from Goethe/DW
2. System scrapes word list from source
3. For each word:
   a. Store in Vocabulary table
   b. Generate audio using Coqui TTS
   c. Save audio file and update AudioUrl
4. User views lesson vocabulary
5. Frontend displays words with audio playback
6. User can click to hear pronunciation
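Steps 2 and 3 of the flow can be sketched as a single loop. This is a minimal illustration, not the planned VocabularyImportService: `ScrapedWord`, `ImportResult`, `VocabularyImport.RunAsync`, and the three delegates are hypothetical names standing in for the scraper, the database, and the Coqui TTS call. It assumes that a failed audio generation should not abort the import.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// All names below are hypothetical; the plan does not define them.
public sealed record ScrapedWord(string Word, string Translation, string? Article);

public sealed record ImportResult(int Imported, IReadOnlyList<string> AudioFailures);

public static class VocabularyImport
{
    public static async Task<ImportResult> RunAsync(
        IEnumerable<ScrapedWord> scrapedWords,      // step 2: the scraped word list
        Func<ScrapedWord, Task<int>> storeWord,     // step 3a: returns the new row Id
        Func<string, string, Task> generateAudio,   // step 3b: (text, outputPath)
        Func<int, string, Task> setAudioUrl)        // step 3c: (id, audioUrl)
    {
        var imported = 0;
        var failures = new List<string>();

        foreach (var word in scrapedWords)
        {
            var id = await storeWord(word);
            imported++;

            var audioPath = $"/audio/vocabulary/{id}.wav";
            try
            {
                await generateAudio(word.Word, audioPath);
                await setAudioUrl(id, audioPath);
            }
            catch (Exception ex)
            {
                // Assumption: keep the word and leave AudioUrl unset so audio can be retried later.
                failures.Add($"{word.Word}: {ex.Message}");
            }
        }

        return new ImportResult(imported, failures);
    }
}
```

Passing the three operations as delegates keeps the loop testable without a database or a TTS process; a real service would more likely inject interfaces.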
API Endpoints
| Endpoint | Method | Description | Auth Required |
|----------|--------|-------------|---------------|
| /api/vocabulary | GET | List all vocabulary (with filters) | Yes |
| /api/vocabulary/{id} | GET | Get specific vocabulary word | Yes |
| /api/vocabulary | POST | Create new vocabulary word (Admin) | Yes |
| /api/vocabulary/{id} | PUT | Update vocabulary word (Admin) | Yes |
| /api/vocabulary/{id} | DELETE | Delete vocabulary word (Admin) | Yes |
| /api/vocabulary/lesson/{lessonId} | GET | Get vocabulary for a lesson | Yes |
| /api/vocabulary/import | POST | Import from Goethe/DW (Admin) | Yes |
| /api/vocabulary/audio/{id} | GET | Get audio file for word | Yes |
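The filters on GET /api/vocabulary could be applied as optional LINQ predicates. The sketch below is an assumption about how that might look: `VocabularyRow`, `VocabularyFilter`, and `ApplyFilter` are hypothetical names, and only lesson and source are covered because the Vocabulary table has no level column (level would presumably come from the lesson).

```csharp
using System;
using System.Linq;

// Hypothetical, trimmed-down row type so the example is self-contained.
public sealed record VocabularyRow(int Id, int LessonId, string Word, string Source);

// Hypothetical filter object; a null value means "do not filter on this field".
public sealed record VocabularyFilter(int? LessonId = null, string? Source = null);

public static class VocabularyFilters
{
    public static IQueryable<VocabularyRow> ApplyFilter(
        this IQueryable<VocabularyRow> query, VocabularyFilter filter)
    {
        if (filter.LessonId is int lessonId)
        {
            query = query.Where(v => v.LessonId == lessonId);
        }

        if (!string.IsNullOrEmpty(filter.Source))
        {
            var source = filter.Source;
            query = query.Where(v => v.Source == source);
        }

        // Stable ordering so repeated or paged requests return rows in the same order.
        return query.OrderBy(v => v.Id);
    }
}
```

Because each predicate is added only when its value is present, a request with no query parameters returns the full list.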
Database Schema (from application-plan.md)
CREATE TABLE Vocabulary (
    Id SERIAL PRIMARY KEY,
    LessonId INT REFERENCES Lessons(Id) ON DELETE CASCADE,
    Word VARCHAR(50) NOT NULL,
    Translation VARCHAR(100) NOT NULL,
    Article VARCHAR(10) CHECK (Article IN ('der', 'die', 'das', '')),
    AudioUrl VARCHAR(255),
    ImageUrl VARCHAR(255),
    Source VARCHAR(50) CHECK (Source IN ('Goethe', 'DW'))
);
Vocabulary Word Structure
public class Vocabulary
{
    public int Id { get; set; }
    public int LessonId { get; set; }
    public string Word { get; set; } = string.Empty;        // e.g., "Buch"
    public string Translation { get; set; } = string.Empty; // e.g., "book"
    public string? Article { get; set; }                    // "der", "die", "das"; null or empty for non-nouns
    public string? AudioUrl { get; set; }                   // URL to audio file
    public string? ImageUrl { get; set; }                   // Optional image URL
    public string Source { get; set; } = string.Empty;      // "Goethe" or "DW"
}
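The plan names a VocabularyDto but does not define it. One possible shape, shown purely as an assumption, adds a `DisplayWord` field that joins article and word for the frontend ("das Buch") and treats an empty article the same as a missing one. The `Vocabulary` class is repeated here only so the example compiles on its own.

```csharp
using System;

// Copy of the entity from the plan, included for self-containment.
public class Vocabulary
{
    public int Id { get; set; }
    public int LessonId { get; set; }
    public string Word { get; set; } = string.Empty;
    public string Translation { get; set; } = string.Empty;
    public string? Article { get; set; }
    public string? AudioUrl { get; set; }
    public string? ImageUrl { get; set; }
    public string Source { get; set; } = string.Empty;
}

// Hypothetical DTO shape; the field list is an assumption, not taken from the plan.
public sealed record VocabularyDto(
    int Id, int LessonId, string Word, string Translation,
    string? Article, string DisplayWord, string? AudioUrl, string? ImageUrl, string Source);

public static class VocabularyMapping
{
    public static VocabularyDto ToDto(this Vocabulary v)
    {
        // The table stores '' for words without an article; expose that as null.
        var article = string.IsNullOrWhiteSpace(v.Article) ? null : v.Article.Trim();
        var displayWord = article is null ? v.Word : $"{article} {v.Word}";

        return new VocabularyDto(
            v.Id, v.LessonId, v.Word, v.Translation,
            article, displayWord, v.AudioUrl, v.ImageUrl, v.Source);
    }
}
```

A hand-written mapping like this would make AutoMapper unnecessary for a single entity, which fits its "optional" status in the dependency list.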
🚀 Implementation Plan
Phase 1: Database & Models (2 hours)
Phase 2: Core CRUD Operations (2-3 hours)
Phase 3: Audio Generation (2-3 hours)
Phase 4: Vocabulary Import (2-3 hours)
Phase 5: Frontend Integration (1-2 hours)
Milestones
| Milestone | Date | Status |
|-----------|------|--------|
| Database & Models | - | ⏳ |
| Core CRUD | - | ⏳ |
| Audio Generation | - | ⏳ |
| Vocabulary Import | - | ⏳ |
| Frontend Integration | - | ⏳ |
✅ Tasks
Backend
Database
Audio Generation
Vocabulary Import
Frontend
🔗 Dependencies
Feature Dependencies
Technical Dependencies
- Coqui TTS Python library
- HTML Agility Pack or similar for web scraping
- AutoMapper (optional)
Blockers
✅ Definition of Done
General Criteria (All Features)
Vocabulary-Specific Criteria
🧪 Testing Strategy
Testing Approach
| Test Type | Coverage | Tools | Responsibility |
|-----------|----------|-------|----------------|
| Unit Tests | 80%+ code coverage | MSTest, Moq | Backend Dev |
| Integration Tests | All service interactions | MSTest, Testcontainers | Backend Dev |
| API Tests | All endpoints | MSTest, HttpClient | Backend Dev |
| Frontend Unit Tests | Component logic | Vitest | Frontend Dev |
| Frontend Integration | Service integration | Vitest | Frontend Dev |
| E2E Tests | Critical user journeys | Playwright | QA/Dev |
| Manual Testing | Exploratory, edge cases | BrowserStack | QA |
Vocabulary-Specific Tests
Backend Tests
Audio Tests
Integration Tests
📝 Notes & Decisions
Decisions Made
| Date | Decision | Rationale |
|------|----------|-----------|
| May 31, 2025 | Generate audio for all vocabulary | Essential for pronunciation practice |
| May 31, 2025 | Store audio files on filesystem | Simple for MVP, can migrate to CDN later |
| May 31, 2025 | Import from Goethe and DW | High-quality, trusted sources |
| May 31, 2025 | Include article information | Critical for German language learning |
Technical Notes
- Audio files should be stored with consistent naming: /audio/vocabulary/{id}.wav
- Vocabulary table has CHECK constraint for Article (only der/die/das or empty)
- Vocabulary table has CHECK constraint for Source (only Goethe or DW)
- Consider adding phonetic transcription (IPA) in the future
- Audio generation can be resource-intensive; consider queueing it for bulk operations
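The naming convention and the queueing idea from the notes above can be combined in a small bounded queue built on System.Threading.Channels. This is a sketch under assumptions: `AudioPaths`, `AudioGenerationQueue`, and the `generate` delegate (standing in for the Coqui TTS call) are hypothetical names, and a single worker processing jobs in order is assumed to be sufficient.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class AudioPaths
{
    // Naming convention from the technical notes: /audio/vocabulary/{id}.wav
    public static string ForWord(int id) => $"/audio/vocabulary/{id}.wav";
}

public sealed class AudioGenerationQueue
{
    private readonly Channel<(int Id, string Word)> _jobs;
    private readonly Func<string, string, Task> _generate; // (text, outputPath); hypothetical TTS wrapper

    public AudioGenerationQueue(Func<string, string, Task> generate, int capacity = 100)
    {
        _generate = generate;
        // Bounded: once the queue is full, producers wait instead of piling up work in memory.
        _jobs = Channel.CreateBounded<(int Id, string Word)>(capacity);
    }

    public ValueTask EnqueueAsync(int id, string word) => _jobs.Writer.WriteAsync((id, word));

    // Signals that no more jobs will arrive, which lets RunAsync finish.
    public void Complete() => _jobs.Writer.Complete();

    // Single worker loop: processes one word at a time, in arrival order.
    public async Task RunAsync()
    {
        await foreach (var job in _jobs.Reader.ReadAllAsync())
        {
            await _generate(job.Word, AudioPaths.ForWord(job.Id));
        }
    }
}
```

A bulk import would start `RunAsync` once, enqueue every imported word, and call `Complete` at the end; the bounded capacity limits how far the import can run ahead of audio generation.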
Gotchas
- ⚠️ Coqui TTS may have issues with some German words; a fallback mechanism is needed
- ⚠️ Web scraping may break if Goethe/DW change their HTML structure
- ⚠️ Audio files can be large - consider compression
- ⚠️ Need to handle duplicate words across different lessons
- ⚠️ Article may be empty for verbs, adjectives, etc.
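One way to approach the duplicate-word gotcha is to key each entry by article plus word and reuse existing audio when the same key appears in another lesson. The sketch below assumes that approach; `WordEntry` and `DuplicateWords` are hypothetical names. Matching is case-sensitive on purpose, since German capitalization distinguishes nouns from verbs (das Essen vs. essen).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical projection of a Vocabulary row.
public sealed record WordEntry(int Id, int LessonId, string Word, string? Article, string? AudioUrl);

public static class DuplicateWords
{
    // Same article + same word = same vocabulary item, regardless of lesson.
    public static (string Article, string Word) Key(string word, string? article) =>
        ((article ?? string.Empty).Trim(), word.Trim());

    // Groups of rows that represent the same word in more than one row.
    public static List<List<WordEntry>> FindDuplicates(IEnumerable<WordEntry> entries) =>
        entries.GroupBy(e => Key(e.Word, e.Article))
               .Where(g => g.Count() > 1)
               .Select(g => g.OrderBy(e => e.Id).ToList())
               .ToList();

    // Returns an existing audio URL for the same word, or null if audio must be generated.
    public static string? FindReusableAudio(IEnumerable<WordEntry> existing, string word, string? article)
    {
        var key = Key(word, article);
        return existing
            .Where(e => e.AudioUrl is not null && Key(e.Word, e.Article) == key)
            .Select(e => e.AudioUrl)
            .FirstOrDefault();
    }
}
```

Reusing audio this way would also reduce TTS load and storage, at the cost of two rows pointing at one file.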
Article Rules Reference
- der: Masculine nouns (e.g., der Mann, der Tag)
- die: Feminine nouns (e.g., die Frau, die Stadt)
- das: Neuter nouns (e.g., das Kind, das Haus)
- empty: Verbs (e.g., gehen, sein), adjectives, adverbs, prepositions
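The rules above, together with the CHECK constraint on Article, suggest a small validation helper that runs before a word is saved. This is a minimal sketch: `ArticleRules` is a hypothetical name, and normalizing to lowercase is an assumption made so that imported values such as "Die" still satisfy the case-sensitive constraint.

```csharp
using System;
using System.Collections.Generic;

public static class ArticleRules
{
    // Mirrors the database CHECK constraint: der, die, das, or empty.
    private static readonly HashSet<string> Allowed =
        new(StringComparer.Ordinal) { "der", "die", "das", "" };

    // Null becomes empty; surrounding whitespace and casing are removed.
    public static string Normalize(string? article) =>
        (article ?? string.Empty).Trim().ToLowerInvariant();

    public static bool IsValid(string? article) => Allowed.Contains(Normalize(article));
}
```

This checks only that the value is one of the allowed articles; it cannot tell whether the article is the correct gender for a given noun.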
📊 Progress History
| Date | Status Change | Notes |
|------|---------------|-------|
| May 31, 2025 | Created | Initial plan based on application-plan.md |
📎 Related Files & Links
Feature created from application-plan.md