# Feature: Vocabulary System

> **Status**: ⏳ Planned
> **Priority**: High
> **Complexity**: High
> **Estimate**: 8-12 hours
> **Assignee**: -
> **Created**: May 31, 2025
> **Target Completion**: -
> **PR**: -
> **Related Features**: Infrastructure Setup, Lesson Management, AI Services (TTS)

---

## 📌 Overview

### Purpose
Implement a comprehensive vocabulary management system that includes word storage, retrieval, audio generation, and integration with lessons.

### User Story
As a learner, I want to study German vocabulary with translations, articles (der/die/das), and audio pronunciations so that I can build my word knowledge effectively.

### Acceptance Criteria
- [ ] Vocabulary words can be created, read, updated, and deleted (Admin)
- [ ] Each word has: German text, English translation, article (optional), audio URL
- [ ] Vocabulary is associated with specific lessons
- [ ] Words can be filtered by lesson, level, or source (Goethe/DW)
- [ ] Audio is generated for each word using Coqui TTS
- [ ] Users can practice vocabulary through various exercises

---

## 📋 Requirements

### Functional Requirements
| ID | Requirement | Priority |
|----|-------------|----------|
| FR-001 | CRUD operations for vocabulary words | High |
| FR-002 | Associate words with lessons | High |
| FR-003 | Store article information (der/die/das) | High |
| FR-004 | Generate audio for each word | High |
| FR-005 | Import vocabulary from Goethe Institut | High |
| FR-006 | Import vocabulary from DW Learn German | High |
| FR-007 | Filter vocabulary by lesson/level/source | Medium |
| FR-008 | Vocabulary exercises (flashcards, matching) | Medium |
| FR-009 | Admin bulk import functionality | Medium |

### Non-Functional Requirements
- Performance: Vocabulary listing responds in under 100 ms
- Storage: Audio files take roughly 50-100 KB per word
- Security: Only admins can create, edit, or delete words
- Data Integrity: Word-article combinations must be valid

---

## 🏗️ Technical Design

### Components Involved
- **Backend**: VocabularyController, VocabularyService, VocabularyImportService
- **Database**: Vocabulary table
- **Models**: Vocabulary, VocabularyDto
- **External**: Coqui TTS for audio generation
- **Frontend**: VocabularyTab, VocabularyCard, VocabularyExercise components

### Data Flow
```
1. Admin imports vocabulary from Goethe/DW
2. System scrapes word list from source
3. For each word:
   a. Store in Vocabulary table
   b. Generate audio using Coqui TTS
   c. Save audio file and update AudioUrl
4. User views lesson vocabulary
5. Frontend displays words with audio playback
6. User can click to hear pronunciation
```
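
Step 3 of the flow above can be sketched in TypeScript as a loop over injected dependencies. This is only an illustration under stated assumptions: `ImportDeps`, its four method names, and the `ImportResult` shape are hypothetical stand-ins for the repository, the TTS service, and file storage, and recording audio failures instead of aborting the import is an assumed policy, not a decided one.

```typescript
interface ImportedWord {
  word: string;
  translation: string;
  article: string | null;
  source: "Goethe" | "DW";
}

// Hypothetical ports; none of these names come from the actual codebase.
interface ImportDeps {
  saveWord(word: ImportedWord): Promise<number>; // returns the new row id
  synthesize(text: string): Promise<Uint8Array>; // returns audio bytes
  saveAudio(id: number, audio: Uint8Array): Promise<string>; // returns the audio URL
  setAudioUrl(id: number, url: string): Promise<void>;
}

interface ImportResult {
  imported: number;
  audioFailures: number[]; // ids of words stored without audio
}

async function importWords(
  words: readonly ImportedWord[],
  deps: ImportDeps
): Promise<ImportResult> {
  const result: ImportResult = { imported: 0, audioFailures: [] };
  for (const word of words) {
    // 3a. Store the word first so it exists even if audio generation fails.
    const id = await deps.saveWord(word);
    result.imported += 1;
    try {
      // 3b. Generate audio, then 3c. save the file and update AudioUrl.
      const audio = await deps.synthesize(word.word);
      const url = await deps.saveAudio(id, audio);
      await deps.setAudioUrl(id, url);
    } catch {
      // Assumed policy: remember the failure and continue with the next word.
      result.audioFailures.push(id);
    }
  }
  return result;
}
```

Processing words one at a time keeps the sketch simple; a real import would probably hand step 3b to a background job, since audio generation is the slow part.
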

### API Endpoints
| Endpoint | Method | Description | Auth Required |
|----------|--------|-------------|----------------|
| `/api/vocabulary` | GET | List all vocabulary (with filters) | Yes |
| `/api/vocabulary/{id}` | GET | Get specific vocabulary word | Yes |
| `/api/vocabulary` | POST | Create new vocabulary word (Admin) | Yes |
| `/api/vocabulary/{id}` | PUT | Update vocabulary word (Admin) | Yes |
| `/api/vocabulary/{id}` | DELETE | Delete vocabulary word (Admin) | Yes |
| `/api/vocabulary/lesson/{lessonId}` | GET | Get vocabulary for a lesson | Yes |
| `/api/vocabulary/import` | POST | Import from Goethe/DW (Admin) | Yes |
| `/api/vocabulary/audio/{id}` | GET | Get audio file for word | Yes |

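
The table does not say how filters are passed to the list endpoint. A minimal client-side sketch, assuming they travel as query parameters: the parameter names `lessonId`, `level`, and `source` and the `VocabularyFilter` type are assumptions, and because the Vocabulary table has no level column, `level` is assumed to be resolved on the server through the lesson.

```typescript
// Hypothetical filter shape; the query parameter names are assumptions.
interface VocabularyFilter {
  lessonId?: number;
  level?: string; // e.g. "A1"
  source?: "Goethe" | "DW";
}

// Builds the URL for GET /api/vocabulary, adding only the filters that are set.
function buildVocabularyListUrl(filter: VocabularyFilter = {}): string {
  const params: string[] = [];
  const add = (name: string, value: string | number | undefined): void => {
    if (value !== undefined) {
      params.push(`${name}=${encodeURIComponent(String(value))}`);
    }
  };
  add("lessonId", filter.lessonId);
  add("level", filter.level);
  add("source", filter.source);

  const base = "/api/vocabulary";
  return params.length === 0 ? base : `${base}?${params.join("&")}`;
}

// Path helpers for the id-based routes listed in the table.
function lessonVocabularyUrl(lessonId: number): string {
  return `/api/vocabulary/lesson/${lessonId}`;
}

function vocabularyAudioUrl(id: number): string {
  return `/api/vocabulary/audio/${id}`;
}
```

For example, `buildVocabularyListUrl({ lessonId: 3, source: "DW" })` yields `/api/vocabulary?lessonId=3&source=DW`.
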

### Database Schema (from application-plan.md)
```sql
CREATE TABLE Vocabulary (
    Id SERIAL PRIMARY KEY,
    LessonId INT REFERENCES Lessons(Id) ON DELETE CASCADE,
    Word VARCHAR(50) NOT NULL,
    Translation VARCHAR(100) NOT NULL,
    Article VARCHAR(10) CHECK (Article IN ('der', 'die', 'das', '')),
    AudioUrl VARCHAR(255),
    ImageUrl VARCHAR(255),
    Source VARCHAR(50) CHECK (Source IN ('Goethe', 'DW'))
);
```

### Vocabulary Word Structure
```csharp
public class Vocabulary
{
    public int Id { get; set; }
    public int LessonId { get; set; }
    public string Word { get; set; } = string.Empty;         // e.g., "Buch"
    public string Translation { get; set; } = string.Empty;  // e.g., "book"
    public string? Article { get; set; }                     // "der", "die", "das", or null/empty
    public string? AudioUrl { get; set; }                    // URL to audio file
    public string? ImageUrl { get; set; }                    // Optional image URL
    public string Source { get; set; } = string.Empty;       // "Goethe" or "DW"
}
```
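
On the frontend, the same constraints could be mirrored in TypeScript so that invalid input is caught before it reaches the API. The sketch below is illustrative only: the camelCase `VocabularyDto` shape and the `validateVocabulary` helper are assumptions, while the allowed articles, allowed sources, and length limits are taken from the schema above.

```typescript
// Allowed values taken from the CHECK constraints in the schema above.
const ARTICLES: readonly string[] = ["der", "die", "das"];
const SOURCES: readonly string[] = ["Goethe", "DW"];

// Hypothetical client-side shape; camelCase field names are an assumption.
interface VocabularyDto {
  id: number;
  lessonId: number;
  word: string;
  translation: string;
  article: string | null; // null or "" for verbs, adjectives, etc.
  audioUrl: string | null;
  imageUrl: string | null;
  source: string;
}

// Returns a list of problems; an empty list means the input is valid.
function validateVocabulary(input: {
  word: string;
  translation: string;
  article?: string | null;
  source: string;
}): string[] {
  const errors: string[] = [];
  const word = input.word.trim();
  const translation = input.translation.trim();

  if (word.length === 0) errors.push("word is required");
  if (word.length > 50) errors.push("word exceeds 50 characters"); // VARCHAR(50)
  if (translation.length === 0) errors.push("translation is required");
  if (translation.length > 100) errors.push("translation exceeds 100 characters"); // VARCHAR(100)

  const article = input.article ?? "";
  if (article !== "" && ARTICLES.indexOf(article) === -1) {
    errors.push(`invalid article: ${article}`);
  }
  if (SOURCES.indexOf(input.source) === -1) {
    errors.push(`invalid source: ${input.source}`);
  }
  return errors;
}
```
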

---

## 🚀 Implementation Plan

### Phase 1: Database & Models (2 hours)
- [ ] Create Vocabulary entity
- [ ] Create VocabularyDto for API responses
- [ ] Create VocabularyRepository interface
- [ ] Create VocabularyRepository implementation
- [ ] Create migration for Vocabulary table
- [ ] Add vocabulary to Lesson entity (one-to-many relationship)

### Phase 2: Core CRUD Operations (2-3 hours)
- [ ] Create VocabularyService with basic CRUD
- [ ] Create VocabularyController
- [ ] Implement filtering (by lesson, level, source)
- [ ] Add validation for word data
- [ ] Add authorization (Admin for write operations)
- [ ] Write unit tests for VocabularyService

### Phase 3: Audio Generation (2-3 hours)
- [ ] Integrate with Coqui TTS service
- [ ] Create audio generation queue/background job
- [ ] Configure audio storage location
- [ ] Implement audio file serving endpoint
- [ ] Add audio URL to vocabulary DTOs
- [ ] Create migration to add AudioUrl column (only needed if the Phase 1 migration omits it; the schema above already defines AudioUrl)

### Phase 4: Vocabulary Import (2-3 hours)
- [ ] Create VocabularyImportService
- [ ] Implement Goethe Institut scraper
- [ ] Implement DW Learn German scraper
- [ ] Create bulk import endpoint
- [ ] Add validation for imported words
- [ ] Generate audio for imported words
- [ ] Create admin UI for import (optional)

### Phase 5: Frontend Integration (1-2 hours)
- [ ] Create VocabularyTab component
- [ ] Create VocabularyCard component with audio playback
- [ ] Create VocabularyExercise component
- [ ] Add vocabulary to LessonPage

### Milestones
| Milestone | Date | Status |
|-----------|------|--------|
| Database & Models | - | ⏳ |
| Core CRUD | - | ⏳ |
| Audio Generation | - | ⏳ |
| Vocabulary Import | - | ⏳ |
| Frontend Integration | - | ⏳ |

---

## ✅ Tasks

### Backend
- [ ] Create Domain/Entities/Vocabulary.cs
- [ ] Create Application/DTOs/VocabularyDto.cs
- [ ] Create Domain/Interfaces/IVocabularyRepository.cs
- [ ] Create Infrastructure/Data/Repositories/VocabularyRepository.cs
- [ ] Update Lesson entity to include Vocabulary collection
- [ ] Create Application/Services/VocabularyService.cs
- [ ] Create Presentation/Controllers/VocabularyController.cs
- [ ] Create VocabularyImportService
- [ ] Create endpoints for audio serving
- [ ] Integrate with Coqui TTS service
- [ ] Register services in Program.cs
- [ ] Write unit tests
- [ ] Write integration tests

### Database
- [ ] Create migration for Vocabulary table
- [ ] Add foreign key to Lessons table
- [ ] Add indexes for LessonId, Source
- [ ] Apply migration

### Audio Generation
- [ ] Set up Coqui TTS configuration
- [ ] Create audio file storage directory
- [ ] Implement audio generation for new words
- [ ] Implement background job for bulk audio generation
- [ ] Create audio file cleanup mechanism

### Vocabulary Import
- [ ] Research Goethe Institut vocabulary structure
- [ ] Research DW Learn German vocabulary structure
- [ ] Implement web scraping for Goethe
- [ ] Implement web scraping for DW
- [ ] Create bulk import API endpoint
- [ ] Add rate limiting to scrapers

### Frontend
- [ ] Create components/VocabularyTab.tsx
- [ ] Create components/VocabularyCard.tsx
- [ ] Create components/VocabularyExercise.tsx
- [ ] Create hooks/useVocabulary.ts
- [ ] Integrate with LessonPage
- [ ] Add audio playback functionality

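
The matching exercise planned for the VocabularyExercise component could rest on a small piece of framework-free logic like the sketch below. All names here (`Card`, `shuffle`, `buildMatchingRound`, `isMatch`) are hypothetical, and the round layout (words in lesson order, translations shuffled) is an assumption about the exercise design.

```typescript
interface Card {
  id: number;
  word: string;
  translation: string;
}

// Fisher-Yates shuffle on a copy; `random` is injectable so tests can be deterministic.
function shuffle<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy;
}

// Builds one matching round: words keep their order, translations are shuffled.
function buildMatchingRound(cards: readonly Card[], random: () => number = Math.random) {
  return {
    words: cards.map((card) => ({ id: card.id, text: card.word })),
    translations: shuffle(cards, random).map((card) => ({
      id: card.id,
      text: card.translation,
    })),
  };
}

// A pair is correct when both sides come from the same vocabulary entry.
function isMatch(wordId: number, translationId: number): boolean {
  return wordId === translationId;
}
```
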

---

## 🔗 Dependencies

### Feature Dependencies
- [Infrastructure Setup](infrastructure-setup.md) - Required
- [Lesson Management](lesson-management.md) - Required (vocabulary associated with lessons)
- [AI Services - TTS](ai-services.md) - Required (for audio generation)

### Technical Dependencies
- Coqui TTS Python library
- HTML Agility Pack or similar for web scraping
- AutoMapper (optional)

### Blockers
- [ ] Infrastructure Setup must be complete
- [ ] Lesson Management must be complete for vocabulary-lesson association
- [ ] Coqui TTS service must be configured

---

## ✅ Definition of Done

### General Criteria (All Features)
- [ ] All acceptance criteria met and verified
- [ ] All tasks in this document completed
- [ ] Code follows Clean Architecture principles
- [ ] Code reviewed and approved by at least 1 team member
- [ ] All tests passing (unit, integration)
- [ ] Documentation updated (README, AGENTS.md if applicable)
- [ ] Feature works in development environment
- [ ] Feature deployed to staging environment
- [ ] Performance meets defined targets
- [ ] Security review completed
- [ ] No critical bugs or blockers

### Vocabulary-Specific Criteria
- [ ] Vocabulary words can be created, read, updated, and deleted
- [ ] Each word has: German text, translation, article, audio URL
- [ ] Words are correctly associated with lessons
- [ ] Words can be filtered by lesson, level, source
- [ ] Audio is generated for all vocabulary words
- [ ] Audio files are accessible and playable
- [ ] Goethe Institut vocabulary import works
- [ ] DW Learn German vocabulary import works
- [ ] Vocabulary exercises (flashcards, matching) are functional

---

## 🧪 Testing Strategy

### Testing Approach

| Test Type | Coverage | Tools | Responsibility |
|-----------|----------|-------|----------------|
| Unit Tests | 80%+ code coverage | MSTest, Moq | Backend Dev |
| Integration Tests | All service interactions | MSTest, Testcontainers | Backend Dev |
| API Tests | All endpoints | MSTest, HttpClient | Backend Dev |
| Frontend Unit Tests | Component logic | Vitest | Frontend Dev |
| Frontend Integration | Service integration | Vitest | Frontend Dev |
| E2E Tests | Critical user journeys | Playwright | QA/Dev |
| Manual Testing | Exploratory, edge cases | BrowserStack | QA |

### Vocabulary-Specific Tests

#### Backend Tests
- [ ] Create word with valid data → success
- [ ] Create word with missing required fields → error
- [ ] Create word with invalid article → error
- [ ] Create word with invalid source → error
- [ ] Get word by ID → returns correct word
- [ ] Get words by lesson → returns correct list
- [ ] Get words by level → returns correct list
- [ ] Get words by source → returns correct list
- [ ] Update word → success
- [ ] Update word with invalid data → error
- [ ] Delete word → success
- [ ] Bulk import from Goethe → creates words correctly
- [ ] Bulk import from DW → creates words correctly
- [ ] Audio generation for word → creates audio file

#### Audio Tests
- [ ] Audio file generated for new word
- [ ] Audio file accessible via endpoint
- [ ] Audio file is valid WAV format
- [ ] Audio file quality is acceptable
- [ ] Audio file size is within limits

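
Two of the audio checks above (valid WAV format, size within limits) can be approximated with a header inspection. The sketch below only verifies the RIFF/WAVE container signature, i.e. the ASCII markers `RIFF` at byte 0 and `WAVE` at byte 8; it does not parse the format or data chunks and says nothing about audio quality. The function names and the 100 KB default limit (borrowed from the storage estimate in the requirements) are assumptions.

```typescript
// Reads four bytes starting at `offset` as an ASCII string.
function asciiAt(bytes: Uint8Array, offset: number): string {
  let out = "";
  for (let i = offset; i < offset + 4; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// True when the buffer starts with a RIFF header whose form type is WAVE.
function looksLikeWav(bytes: Uint8Array): boolean {
  if (bytes.length < 12) return false; // too short to hold the 12-byte RIFF header
  return asciiAt(bytes, 0) === "RIFF" && asciiAt(bytes, 8) === "WAVE";
}

// Hypothetical size check; the default limit mirrors the 50-100 KB estimate.
function withinSizeLimit(bytes: Uint8Array, maxBytes: number = 100 * 1024): boolean {
  return bytes.length > 0 && bytes.length <= maxBytes;
}
```

A backend test would apply the same idea to the bytes returned by the audio endpoint.
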

#### Integration Tests
- [ ] Create lesson with vocabulary → both created
- [ ] Get lesson → includes vocabulary
- [ ] Delete lesson → cascades to vocabulary
- [ ] Import vocabulary → audio generated for all words

---

## 📝 Notes & Decisions

### Decisions Made
| Date | Decision | Rationale |
|------|----------|-----------|
| May 31, 2025 | Generate audio for all vocabulary | Essential for pronunciation practice |
| May 31, 2025 | Store audio files on filesystem | Simple for MVP, can migrate to CDN later |
| May 31, 2025 | Import from Goethe and DW | High-quality, trusted sources |
| May 31, 2025 | Include article information | Critical for German language learning |

### Technical Notes
- Audio files should be stored with consistent naming: `/audio/vocabulary/{id}.wav`
- Vocabulary table has CHECK constraint for Article (only der/die/das or empty)
- Vocabulary table has CHECK constraint for Source (only Goethe or DW)
- Consider adding phonetic transcription (IPA) in the future
- Audio generation can be resource-intensive - consider queueing for bulk operations

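
The naming convention and the queueing note can be illustrated together. In this sketch, `audioPathFor` applies the `/audio/vocabulary/{id}.wav` pattern from the notes, and `AudioJobQueue` is a hypothetical in-memory FIFO queue that skips ids already waiting; a production setup would more likely use a persistent background-job mechanism on the backend.

```typescript
// Builds the storage path using the naming convention from the notes above.
function audioPathFor(id: number): string {
  if (!Number.isInteger(id) || id <= 0) {
    throw new Error(`invalid vocabulary id: ${id}`);
  }
  return `/audio/vocabulary/${id}.wav`;
}

// Hypothetical in-memory queue of vocabulary ids waiting for audio generation.
class AudioJobQueue {
  private pending: number[] = [];
  private queued = new Set<number>();

  // Adds an id unless it is already waiting; returns whether it was added.
  enqueue(id: number): boolean {
    if (this.queued.has(id)) return false;
    this.queued.add(id);
    this.pending.push(id);
    return true;
  }

  // Removes and returns up to `size` ids in FIFO order.
  takeBatch(size: number): number[] {
    const batch = this.pending.splice(0, size);
    for (const id of batch) {
      this.queued.delete(id);
    }
    return batch;
  }

  get length(): number {
    return this.pending.length;
  }
}
```

A worker could call `takeBatch` with a small size on each tick so that a bulk import does not saturate the TTS service.
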

### Gotchas
- ⚠️ Coqui TTS may have issues with some German words - need fallback mechanism
- ⚠️ Web scraping may break if Goethe/DW change their HTML structure
- ⚠️ Audio files can be large - consider compression
- ⚠️ Need to handle duplicate words across different lessons
- ⚠️ Article may be empty for verbs, adjectives, etc.

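
For the duplicate-word gotcha, one possible first step is simply detecting which words occur in more than one lesson. The sketch below is an assumption about how that could look: `LessonWord`, `wordKey`, and `findCrossLessonDuplicates` are hypothetical names, and the key deliberately keeps capitalization because German distinguishes, for example, "das Essen" (noun) from "essen" (verb).

```typescript
interface LessonWord {
  lessonId: number;
  word: string;
  article: string | null;
}

// Identity key for a word: article plus the trimmed, NFC-normalized, case-preserved word.
function wordKey(entry: LessonWord): string {
  return `${entry.article ?? ""}|${entry.word.trim().normalize("NFC")}`;
}

// Maps each key that occurs in more than one lesson to the lessons containing it.
function findCrossLessonDuplicates(entries: readonly LessonWord[]): Map<string, number[]> {
  const lessonsByKey = new Map<string, number[]>();
  for (const entry of entries) {
    const key = wordKey(entry);
    const lessons = lessonsByKey.get(key) ?? [];
    if (lessons.indexOf(entry.lessonId) === -1) lessons.push(entry.lessonId);
    lessonsByKey.set(key, lessons);
  }

  const duplicates = new Map<string, number[]>();
  lessonsByKey.forEach((lessons, key) => {
    if (lessons.length > 1) duplicates.set(key, lessons);
  });
  return duplicates;
}
```

Whether duplicates should then share one audio file or stay fully independent per lesson is left open here.
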

### Article Rules Reference
- **der**: Masculine nouns (e.g., der Mann, der Tag)
- **die**: Feminine nouns (e.g., die Frau, die Stadt)
- **das**: Neuter nouns (e.g., das Kind, das Haus)
- **empty**: Verbs (e.g., gehen, sein), adjectives, adverbs, prepositions

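
These rules translate into a small display helper on the frontend. The following is a sketch under assumptions: `formatWithArticle` and `genderOf` are hypothetical names, and the gender mapping follows the reference list above, which covers singular nouns only.

```typescript
type Gender = "masculine" | "feminine" | "neuter";

// Joins article and word for display; entries without an article are returned unchanged.
function formatWithArticle(word: string, article: string | null | undefined): string {
  const trimmed = (article ?? "").trim();
  return trimmed === "" ? word : `${trimmed} ${word}`;
}

// Maps an article to its grammatical gender, or null when there is no (known) article.
function genderOf(article: string | null | undefined): Gender | null {
  switch ((article ?? "").trim()) {
    case "der":
      return "masculine";
    case "die":
      return "feminine";
    case "das":
      return "neuter";
    default:
      return null; // verbs, adjectives, adverbs, prepositions
  }
}
```
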

---

## 📊 Progress History

| Date | Status Change | Notes |
|------|---------------|-------|
| May 31, 2025 | Created | Initial plan based on application-plan.md |

---

## 📎 Related Files & Links

- Architecture: [Backend Structure](../architecture/backend-structure.md)
- Architecture: [Application Plan](../architecture/application-plan.md)
- Database Schema: [Initial Database Schema](../database/initial-database-schema.sql)
- Feature: [Lesson Management](lesson-management.md)
- Feature: [AI Services](ai-services.md)
- Reference: [Goethe Institut Vocabulary](https://www.goethe.de/en/spr/ueb.html)
- Reference: [DW Learn German Vocabulary](https://learngerman.dw.com/en/learn-german/s-9528)
- Reference: [Coqui TTS](https://github.com/coqui-ai/TTS)

---

*Feature created from application-plan.md*