todo updated

2025-12-06 22:27:49 +01:00
parent eb5ec75626
commit 13b34857ea


@@ -1,615 +1,264 @@
# Audio Classifier - Technical Implementation TODO
# Audio Classifier - Updated TODO (December 6, 2024)
## Phase 1: Project Structure & Dependencies
## ✅ What is DONE (Current project state)
### 1.1 Root structure
- [ ] Create root `.gitignore`
- [ ] Create root `README.md` with setup instructions
- [ ] Create `docker-compose.yml` (PostgreSQL + pgvector)
- [ ] Create `.env.example`
### Infrastructure
- ✅ Complete backend + frontend structure
- ✅ Docker Compose with PostgreSQL + pgvector
- ✅ Backend Dockerfile (Python 3.9, x86_64 emulation for Essentia)
- ✅ Frontend Dockerfile
- ✅ Containers in production (currently running)
- ✅ .env and .env.example configured
- ✅ Essentia models downloaded (genre, mood, instrument)
### 1.2 Backend structure (Python/FastAPI)
- [ ] Create `backend/` directory
- [ ] Create `backend/requirements.txt`:
- fastapi==0.109.0
- uvicorn[standard]==0.27.0
- sqlalchemy==2.0.25
- psycopg2-binary==2.9.9
- pgvector==0.2.4
- librosa==0.10.1
- essentia-tensorflow==2.1b6.dev1110
- pydantic==2.5.3
- pydantic-settings==2.1.0
- python-multipart==0.0.6
- mutagen==1.47.0
- numpy==1.24.3
- scipy==1.11.4
- [ ] Create `backend/pyproject.toml` (optional, for poetry users)
- [ ] Create `backend/.env.example`
- [ ] Create `backend/Dockerfile`
- [ ] Create `backend/src/__init__.py`
### Backend (Python/FastAPI)
- ✅ Complete src/ structure
- ✅ SQLAlchemy models (schema.py) with AudioTrack
- ✅ Working Alembic migrations
- ✅ Complete CRUD (crud.py)
- ✅ FastAPI API (main.py)
- ✅ Implemented routes:
- ✅ /api/tracks (GET, DELETE)
- ✅ /api/search
- ✅ /api/audio (stream, download, waveform)
- ✅ /api/analyze
- ✅ /api/similar
- ✅ /api/stats
- ✅ Core modules :
- ✅ audio_processor.py (Librosa)
- ✅ essentia_classifier.py (genre/mood/instrument models)
- ✅ analyzer.py (orchestrator)
- ✅ file_scanner.py
- ✅ waveform_generator.py
- ✅ Utils (config, logging, validators)
- ✅ Working CLI scanner
### 1.3 Backend core modules structure
- [ ] `backend/src/core/__init__.py`
- [ ] `backend/src/core/audio_processor.py` - librosa feature extraction
- [ ] `backend/src/core/essentia_classifier.py` - Essentia models (genre/mood/instruments)
- [ ] `backend/src/core/analyzer.py` - Main orchestrator
- [ ] `backend/src/core/file_scanner.py` - Recursive folder scanning
- [ ] `backend/src/core/waveform_generator.py` - Peaks extraction for visualization
### Frontend (Next.js 14)
- ✅ Next.js 14 structure with TypeScript
- ✅ TailwindCSS + shadcn/ui setup
- ✅ API client (lib/api.ts)
- ✅ TypeScript types (lib/types.ts)
- ✅ QueryProvider configured
- ✅ Main layout
- ✅ Main page (app/page.tsx)
### 1.4 Backend database modules
- [ ] `backend/src/models/__init__.py`
- [ ] `backend/src/models/database.py` - SQLAlchemy engine + session
- [ ] `backend/src/models/schema.py` - SQLAlchemy models (AudioTrack)
- [ ] `backend/src/models/crud.py` - CRUD operations
- [ ] `backend/src/alembic/` - Migration setup
- [ ] `backend/src/alembic/versions/001_initial_schema.py` - CREATE TABLE + pgvector extension
### 1.5 Backend API structure
- [ ] `backend/src/api/__init__.py`
- [ ] `backend/src/api/main.py` - FastAPI app + CORS + startup/shutdown events
- [ ] `backend/src/api/routes/__init__.py`
- [ ] `backend/src/api/routes/tracks.py` - GET /tracks, GET /tracks/{id}, DELETE /tracks/{id}
- [ ] `backend/src/api/routes/search.py` - GET /search?q=...&genre=...&mood=...
- [ ] `backend/src/api/routes/analyze.py` - POST /analyze/folder, GET /analyze/status/{job_id}
- [ ] `backend/src/api/routes/audio.py` - GET /audio/stream/{id}, GET /audio/download/{id}, GET /audio/waveform/{id}
- [ ] `backend/src/api/routes/similar.py` - GET /tracks/{id}/similar
- [ ] `backend/src/api/routes/stats.py` - GET /stats (total tracks, genres distribution)
### 1.6 Backend utils
- [ ] `backend/src/utils/__init__.py`
- [ ] `backend/src/utils/config.py` - Pydantic Settings for env vars
- [ ] `backend/src/utils/logging.py` - Logging setup
- [ ] `backend/src/utils/validators.py` - Audio file validation
### 1.7 Frontend structure (Next.js 14)
- [ ] `npx create-next-app@latest frontend --typescript --tailwind --app --no-src-dir`
- [ ] `cd frontend && npm install`
- [ ] Install deps: `shadcn-ui`, `@tanstack/react-query`, `zustand`, `axios`, `lucide-react`, `recharts`
- [ ] `npx shadcn-ui@latest init`
- [ ] Add shadcn components: button, input, slider, select, card, dialog, progress, toast
### 1.8 Frontend structure details
- [ ] `frontend/app/layout.tsx` - Root layout with QueryClientProvider
- [ ] `frontend/app/page.tsx` - Main library view
- [ ] `frontend/app/tracks/[id]/page.tsx` - Track detail page
- [ ] `frontend/components/SearchBar.tsx`
- [ ] `frontend/components/FilterPanel.tsx`
- [ ] `frontend/components/TrackCard.tsx`
- [ ] `frontend/components/TrackDetails.tsx`
- [ ] `frontend/components/AudioPlayer.tsx`
- [ ] `frontend/components/WaveformDisplay.tsx`
- [ ] `frontend/components/BatchScanner.tsx`
- [ ] `frontend/components/SimilarTracks.tsx`
- [ ] `frontend/lib/api.ts` - Axios client with base URL
- [ ] `frontend/lib/types.ts` - TypeScript interfaces
- [ ] `frontend/hooks/useSearch.ts`
- [ ] `frontend/hooks/useTracks.ts`
- [ ] `frontend/hooks/useAudioPlayer.ts`
- [ ] `frontend/.env.local.example`
### Documentation
- ✅ Complete README.md
- ✅ QUICKSTART.md
- ✅ SETUP.md
- ✅ STATUS.md
- ✅ COMMANDES.md
- ✅ DOCKER.md
- ✅ ESSENTIA.md
- ✅ CORRECTIONS.md
- ✅ RESUME.md
---
## Phase 2: Database Schema & Migrations
## 🔧 What remains TO DO
### 2.1 PostgreSQL setup
- [ ] `docker-compose.yml`: service postgres with pgvector image `pgvector/pgvector:pg16`
- [ ] Expose port 5432
- [ ] Volume for persistence: `postgres_data:/var/lib/postgresql/data`
- [ ] Init script: `backend/init-db.sql` with CREATE EXTENSION vector
### Phase 1: Finalize Docker for Mac ARM
### 2.2 SQLAlchemy models
- [ ] Define `AudioTrack` model in `schema.py`:
- id: UUID (PK)
- filepath: String (unique, indexed)
- filename: String
- duration_seconds: Float
- file_size_bytes: Integer
- format: String (mp3/wav)
- analyzed_at: DateTime
- tempo_bpm: Float
- key: String
- time_signature: String
- energy: Float
- danceability: Float
- valence: Float
- loudness_lufs: Float
- spectral_centroid: Float
- zero_crossing_rate: Float
- genre_primary: String (indexed)
- genre_secondary: ARRAY[String]
- genre_confidence: Float
- mood_primary: String (indexed)
- mood_secondary: ARRAY[String]
- mood_arousal: Float
- mood_valence: Float
- instruments: ARRAY[String]
- has_vocals: Boolean
- vocal_gender: String (nullable)
- embedding: Vector(512) (nullable, for future CLAP)
- embedding_model: String (nullable)
- metadata: JSON
- [ ] Create indexes: filepath, genre_primary, mood_primary, tempo_bpm
#### 1.1 Docker Build Optimization
- [ ] **Finish the backend Docker build** (currently times out at 10 min)
- Build is in progress but very slow (x86_64 emulation)
- Options:
- [ ] Option A: Increase the timeout and let it finish (15-20 min estimated)
- [ ] Option B: Native ARM64 build, compiling Essentia from source
- [ ] Option C: Use an existing multi-arch image (mgoltzsche/essentia-container)
- [ ] Test the backend container once it is built
- [ ] Verify that Essentia works correctly in the container
- [ ] Document build time and performance
### 2.3 Alembic migrations
- [ ] `alembic init backend/src/alembic`
- [ ] Configure `alembic.ini` with DB URL
- [ ] Create initial migration with schema above
- [ ] Add pgvector extension in migration
#### 1.2 Docker Compose Validation
- [ ] Test a full docker-compose up
- [ ] Verify DB ↔ Backend connectivity
- [ ] Verify Frontend ↔ Backend connectivity
- [ ] Test the 3 services together
---
## Phase 3: Core Audio Processing
### Phase 2: Frontend Components (PRIORITY)
### 3.1 audio_processor.py - Librosa feature extraction
- [ ] Function `load_audio(filepath: str) -> Tuple[np.ndarray, int]`
- [ ] Function `extract_tempo(y, sr) -> float` - librosa.beat.tempo
- [ ] Function `extract_key(y, sr) -> str` - librosa.feature.chroma_cqt + key detection
- [ ] Function `extract_spectral_features(y, sr) -> dict`:
- spectral_centroid
- zero_crossing_rate
- spectral_rolloff
- spectral_bandwidth
- [ ] Function `extract_mfcc(y, sr) -> np.ndarray`
- [ ] Function `extract_chroma(y, sr) -> np.ndarray`
- [ ] Function `extract_energy(y, sr) -> float` - RMS energy
- [ ] Function `extract_all_features(filepath: str) -> dict` - orchestrator
The frontend has its structure in place but lacks the UI components. **This is priority #1.**
### 3.2 essentia_classifier.py - Essentia TensorFlow models
- [ ] Download Essentia models (mtg-jamendo):
- genre: https://essentia.upf.edu/models/classification-heads/mtg_jamendo_genre/mtg_jamendo_genre-discogs-effnet-1.pb
- mood: https://essentia.upf.edu/models/classification-heads/mtg_jamendo_moodtheme/mtg_jamendo_moodtheme-discogs-effnet-1.pb
- instrument: https://essentia.upf.edu/models/classification-heads/mtg_jamendo_instrument/mtg_jamendo_instrument-discogs-effnet-1.pb
- [ ] Store models in `backend/models/` directory
- [ ] Class `EssentiaClassifier`:
- `__init__()`: load models
- `predict_genre(audio_path: str) -> dict`: returns {primary, secondary[], confidence}
- `predict_mood(audio_path: str) -> dict`: returns {primary, secondary[], arousal, valence}
- `predict_instruments(audio_path: str) -> List[dict]`: returns [{name, confidence}, ...]
- [ ] Add model metadata files (class labels) in JSON
#### 2.1 Missing base components
- [ ] `components/SearchBar.tsx`
- [ ] `components/FilterPanel.tsx`
- [ ] `components/TrackCard.tsx`
- [ ] `components/TrackDetails.tsx` (Modal)
- [ ] `components/AudioPlayer.tsx`
- [ ] `components/WaveformDisplay.tsx`
- [ ] `components/BatchScanner.tsx`
- [ ] `components/SimilarTracks.tsx`
### 3.3 waveform_generator.py
- [ ] Function `generate_peaks(filepath: str, num_peaks: int = 800) -> List[float]`
- Load audio with librosa
- Downsample to num_peaks points
- Return normalized amplitude values
- [ ] Cache peaks in JSON file next to audio (optional)
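The downsampling step planned for `generate_peaks` can be sketched without any audio library. This is only an illustration under assumptions: it takes already-decoded samples instead of a file path (the plan loads the file with librosa first), and it uses the maximum absolute amplitude per bucket, which is one common choice rather than the project's confirmed method.

```python
from typing import List, Sequence


def generate_peaks(samples: Sequence[float], num_peaks: int = 800) -> List[float]:
    """Reduce decoded samples to at most num_peaks values normalized to [0, 1]."""
    if num_peaks <= 0:
        raise ValueError("num_peaks must be positive")
    n = len(samples)
    if n == 0:
        return []
    # Never ask for more buckets than there are samples.
    num_peaks = min(num_peaks, n)
    peaks = []
    for i in range(num_peaks):
        # Integer bucket boundaries that cover every sample exactly once.
        start = i * n // num_peaks
        end = (i + 1) * n // num_peaks
        peaks.append(max(abs(s) for s in samples[start:end]))
    loudest = max(peaks)
    if loudest == 0:
        # Pure silence: avoid dividing by zero.
        return [0.0] * num_peaks
    return [p / loudest for p in peaks]
```

In the planned module, the samples would come from the librosa loader, and the returned list is what the waveform endpoint would serialize as `peaks`.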
#### 2.2 Missing hooks
- [ ] `hooks/useSearch.ts` (debounced search)
- [ ] `hooks/useTracks.ts` (fetch + pagination)
- [ ] `hooks/useAudioPlayer.ts` (audio player state)
### 3.4 file_scanner.py
- [ ] Function `scan_folder(path: str, recursive: bool = True) -> List[str]`
- Walk directory tree
- Filter by extensions: .mp3, .wav, .flac, .m4a, .ogg
- Return list of absolute paths
- [ ] Function `get_file_metadata(filepath: str) -> dict`
- Use mutagen for ID3 tags
- Return: filename, size, format
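A minimal sketch of `scan_folder` as described above, using only the standard library. The extension set is the one listed in the plan; the sorted output and the `NotADirectoryError` on a bad path are assumptions added for predictability, not documented behavior of the project.

```python
import os
from typing import List

# Extensions listed in the plan; compared case-insensitively.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".ogg"}


def scan_folder(path: str, recursive: bool = True) -> List[str]:
    """Return sorted absolute paths of audio files found under path."""
    root = os.path.abspath(path)
    if not os.path.isdir(root):
        raise NotADirectoryError(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        if not recursive:
            # Emptying dirnames in place stops os.walk from descending.
            dirnames.clear()
        for name in filenames:
            if os.path.splitext(name)[1].lower() in AUDIO_EXTENSIONS:
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Tag extraction with mutagen (`get_file_metadata`) is left out here on purpose, since it depends on a third-party package.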
#### 2.3 Missing pages
- [ ] `app/tracks/[id]/page.tsx` (track detail page)
### 3.5 analyzer.py - Main orchestrator
- [ ] Class `AudioAnalyzer`:
- `__init__()`
- `analyze_file(filepath: str) -> AudioAnalysis`:
1. Validate file exists and is audio
2. Extract features (audio_processor)
3. Classify genre/mood/instruments (essentia_classifier)
4. Get file metadata (file_scanner)
5. Return structured AudioAnalysis object
- `analyze_folder(path: str, recursive: bool, progress_callback) -> List[AudioAnalysis]`:
- Scan folder
- Parallel processing with ThreadPoolExecutor (num_workers=4)
- Progress updates
- [ ] Pydantic model `AudioAnalysis` matching JSON schema from architecture
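The parallel part of `analyze_folder` could look roughly like the sketch below. The name `analyze_many`, the `(done, total)` callback signature, and the returned error dictionary are assumptions made for illustration; the per-file analysis function is passed in so the example stays independent of librosa and Essentia.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Optional, Tuple


def analyze_many(
    filepaths: List[str],
    analyze_file: Callable[[str], dict],
    num_workers: int = 4,
    progress_callback: Optional[Callable[[int, int], None]] = None,
) -> Tuple[List[dict], Dict[str, str]]:
    """Analyze files in parallel; return (results, errors keyed by file path)."""
    results: List[dict] = []
    errors: Dict[str, str] = {}
    total = len(filepaths)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Map each future back to its path so failures can be attributed.
        futures = {pool.submit(analyze_file, p): p for p in filepaths}
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results.append(future.result())
            except Exception as exc:  # one bad file must not abort the batch
                errors[path] = str(exc)
            if progress_callback is not None:
                # Called from the submitting thread, once per finished file.
                progress_callback(done, total)
    return results, errors
```

Results arrive in completion order, not input order. Whether threads actually speed up analysis depends on how much time is spent in native code that releases the GIL; a process pool is the usual alternative if they do not.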
#### 2.4 Installing shadcn components
- [ ] Install the missing shadcn components:
```bash
npx shadcn@latest add button input slider select card dialog badge progress toast dropdown-menu tabs
```
---
## Phase 4: Database CRUD Operations
### Phase 3: Tests & Validation
### 4.1 crud.py - CRUD functions
- [ ] `create_track(session, analysis: AudioAnalysis) -> AudioTrack`
- [ ] `get_track_by_id(session, track_id: UUID) -> Optional[AudioTrack]`
- [ ] `get_track_by_filepath(session, filepath: str) -> Optional[AudioTrack]`
- [ ] `get_tracks(session, skip: int, limit: int, filters: dict) -> List[AudioTrack]`
- Support filters: genre, mood, bpm_min, bpm_max, energy_min, energy_max, has_vocals
- [ ] `search_tracks(session, query: str, filters: dict, limit: int) -> List[AudioTrack]`
- Full-text search on: genre_primary, mood_primary, instruments, filename
- Combined with filters
- [ ] `get_similar_tracks(session, track_id: UUID, limit: int) -> List[AudioTrack]`
- If embeddings exist: vector similarity with pgvector
- Fallback: similar genre + mood + BPM range
- [ ] `delete_track(session, track_id: UUID) -> bool`
- [ ] `get_stats(session) -> dict`
- Total tracks
- Genres distribution
- Moods distribution
- Average BPM
- Total duration
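The non-embedding fallback for `get_similar_tracks` (same genre, same mood, nearby BPM) can be sketched in memory as follows. `TrackInfo` is a hypothetical stand-in for the `AudioTrack` row, and the weights and the 10 BPM tolerance are arbitrary assumptions; a real implementation would more likely express this as a SQL query.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrackInfo:
    """Hypothetical stand-in for the fields of AudioTrack used here."""
    id: str
    genre_primary: Optional[str]
    mood_primary: Optional[str]
    tempo_bpm: Optional[float]


def similar_by_metadata(
    target: TrackInfo,
    candidates: List[TrackInfo],
    limit: int = 10,
    bpm_tolerance: float = 10.0,
) -> List[TrackInfo]:
    """Rank candidates by shared genre (2 points), mood (1) and close BPM (1)."""
    scored = []
    for track in candidates:
        if track.id == target.id:
            continue  # never return the track itself
        score = 0
        if target.genre_primary and track.genre_primary == target.genre_primary:
            score += 2
        if target.mood_primary and track.mood_primary == target.mood_primary:
            score += 1
        if (
            target.tempo_bpm is not None
            and track.tempo_bpm is not None
            and abs(track.tempo_bpm - target.tempo_bpm) <= bpm_tolerance
        ):
            score += 1
        if score > 0:
            scored.append((score, track))
    # list.sort is stable, so ties keep their original candidate order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [track for _, track in scored[:limit]]
```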
#### 3.1 Backend tests
- [ ] Test analysis of a real audio file
- [ ] Test the CLI scanner on a folder
- [ ] Verify Essentia classifications (genre/mood)
- [ ] Test API endpoints with curl/Postman
- [ ] Verify waveform generation
#### 3.2 Frontend tests
- [ ] Test track list display
- [ ] Test search and filters
- [ ] Test audio playback
- [ ] Test waveform display
- [ ] Test folder scanner
- [ ] Test navigation
#### 3.3 End-to-end tests
- [ ] Full flow: Scan folder → View results → Play track → Find similar tracks
- [ ] Test with a real library (>100 files)
- [ ] Verify performance
---
## Phase 5: FastAPI Backend Implementation
### Phase 4: Optimizations & Polish
### 5.1 config.py - Settings
- [ ] `class Settings(BaseSettings)`:
- DATABASE_URL: str
- CORS_ORIGINS: List[str]
- ANALYSIS_USE_CLAP: bool = False
- ANALYSIS_NUM_WORKERS: int = 4
- ESSENTIA_MODELS_PATH: str
- AUDIO_LIBRARY_PATH: str (optional default scan path)
- [ ] Load from `.env`
### 5.2 main.py - FastAPI app
- [ ] Create FastAPI app with metadata (title, version, description)
- [ ] Add CORS middleware (allow frontend origin)
- [ ] Add startup event: init DB engine, load Essentia models
- [ ] Add shutdown event: cleanup
- [ ] Include routers from routes/
- [ ] Health check endpoint: GET /health
### 5.3 routes/tracks.py
- [ ] `GET /api/tracks`:
- Query params: skip, limit, genre, mood, bpm_min, bpm_max, energy_min, energy_max, has_vocals, sort_by
- Return paginated list of tracks
- Include total count
- [ ] `GET /api/tracks/{track_id}`:
- Return full track details
- 404 if not found
- [ ] `DELETE /api/tracks/{track_id}`:
- Soft delete or hard delete (remove from DB only, keep file)
- Return success
### 5.4 routes/search.py
- [ ] `GET /api/search`:
- Query params: q (search query), genre, mood, bpm_min, bpm_max, limit
- Full-text search + filters
- Return matching tracks
### 5.5 routes/audio.py
- [ ] `GET /api/audio/stream/{track_id}`:
- Get track from DB
- Return FileResponse with media_type audio/mpeg
- Support Range requests for seeking (Accept-Ranges: bytes)
- headers: Content-Disposition: inline
- [ ] `GET /api/audio/download/{track_id}`:
- Same as stream but Content-Disposition: attachment
- [ ] `GET /api/audio/waveform/{track_id}`:
- Get track from DB
- Generate or load cached peaks (waveform_generator)
- Return JSON: {peaks: [], duration: float}
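Seeking in the stream endpoint depends on honoring the `Range` request header. The sketch below parses the single-range `bytes=` forms into an inclusive `(start, end)` pair. The function name and the choice to return `None` for malformed or unsatisfiable headers are assumptions; multi-range requests are deliberately not handled.

```python
import re
from typing import Optional, Tuple

# Matches "bytes=START-END", "bytes=START-" and "bytes=-SUFFIX".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_byte_range(header: str, file_size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) offsets, or None if the header is unusable."""
    match = _RANGE_RE.match(header.strip())
    if match is None or file_size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the final N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        return max(file_size - length, 0), file_size - 1
    start = int(first)
    end = int(last) if last else file_size - 1
    if start >= file_size or end < start:
        return None
    # Clamp an end offset that runs past the end of the file.
    return start, min(end, file_size - 1)
```

A handler would then answer with status 206 and a `Content-Range: bytes {start}-{end}/{file_size}` header; on `None` it can either send the full file or reply 416, depending on the case.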
### 5.6 routes/analyze.py
- [ ] `POST /api/analyze/folder`:
- Body: {path: str, recursive: bool}
- Validate path exists
- Start background job (asyncio Task or Celery)
- Return job_id
- [ ] `GET /api/analyze/status/{job_id}`:
- Return job status: {status: "pending|running|completed|failed", progress: int, total: int, errors: []}
- [ ] Background worker implementation:
- Scan folder
- For each file: analyze, save to DB (skip if already exists by filepath)
- Update job status
- Store job state in-memory dict or Redis
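For the in-memory option, job state can be kept in a small lock-protected registry. `JobRegistry` and its method names are hypothetical; the status payload mirrors the `{status, progress, total, errors}` shape described for `GET /api/analyze/status/{job_id}`. State held this way is lost on restart and is not shared between worker processes, which is where Redis would come in.

```python
import threading
import uuid
from typing import Dict, Optional


class JobRegistry:
    """Thread-safe in-memory store for analysis job state (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._jobs: Dict[str, dict] = {}

    def create(self, total: int) -> str:
        """Register a job covering `total` files and return its id."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {
                # An empty folder has nothing to wait for.
                "status": "pending" if total > 0 else "completed",
                "progress": 0,
                "total": total,
                "errors": [],
            }
        return job_id

    def record(self, job_id: str, error: Optional[str] = None) -> None:
        """Mark one file as processed, optionally with an error message."""
        with self._lock:
            job = self._jobs[job_id]
            job["status"] = "running"
            job["progress"] += 1
            if error is not None:
                job["errors"].append(error)
            if job["progress"] >= job["total"]:
                job["status"] = "completed"

    def fail(self, job_id: str, reason: str) -> None:
        """Abort the whole job, for example when the folder disappears."""
        with self._lock:
            job = self._jobs[job_id]
            job["status"] = "failed"
            job["errors"].append(reason)

    def status(self, job_id: str) -> Optional[dict]:
        """Return a copy of the job state, or None for an unknown id."""
        with self._lock:
            job = self._jobs.get(job_id)
            if job is None:
                return None
            return {**job, "errors": list(job["errors"])}
```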
### 5.7 routes/similar.py
- [ ] `GET /api/tracks/{track_id}/similar`:
- Query params: limit (default 10)
- Get similar tracks (CRUD function)
- Return list of tracks
### 5.8 routes/stats.py
- [ ] `GET /api/stats`:
- Get stats (CRUD function)
- Return JSON with counts, distributions
---
## Phase 6: Frontend Implementation
### 6.1 API client (lib/api.ts)
- [ ] Create axios instance with baseURL from env var (NEXT_PUBLIC_API_URL)
- [ ] API functions:
- `getTracks(params: FilterParams): Promise<{tracks: Track[], total: number}>`
- `getTrack(id: string): Promise<Track>`
- `deleteTrack(id: string): Promise<void>`
- `searchTracks(query: string, filters: FilterParams): Promise<Track[]>`
- `getSimilarTracks(id: string, limit: number): Promise<Track[]>`
- `analyzeFolder(path: string, recursive: boolean): Promise<{jobId: string}>`
- `getAnalyzeStatus(jobId: string): Promise<JobStatus>`
- `getStats(): Promise<Stats>`
### 6.2 TypeScript types (lib/types.ts)
- [ ] `interface Track` matching AudioTrack model
- [ ] `interface FilterParams`
- [ ] `interface JobStatus`
- [ ] `interface Stats`
### 6.3 Hooks
- [ ] `hooks/useTracks.ts`:
- useQuery for fetching tracks with filters
- Pagination state
- Mutation for delete
- [ ] `hooks/useSearch.ts`:
- Debounced search query
- Combined filters state
- [ ] `hooks/useAudioPlayer.ts`:
- Current track state
- Play/pause/seek controls
- Volume control
- Queue management (optional)
### 6.4 Components - UI primitives (shadcn)
- [ ] Install shadcn components: button, input, slider, select, card, dialog, badge, progress, toast, dropdown-menu, tabs
### 6.5 SearchBar.tsx
- [ ] Input with search icon
- [ ] Debounced onChange (300ms)
- [ ] Clear button
- [ ] Optional: suggestions dropdown
### 6.6 FilterPanel.tsx
- [ ] Genre multi-select (fetch available genres from API or hardcode)
- [ ] Mood multi-select
- [ ] BPM range slider (min/max)
- [ ] Energy range slider
- [ ] Has vocals checkbox
- [ ] Sort by dropdown (Latest, BPM, Duration, Name)
- [ ] Clear all filters button
### 6.7 TrackCard.tsx
- [ ] Props: track: Track, onPlay, onDelete
- [ ] Display: filename, duration, BPM, genre, mood, instruments (badges)
- [ ] Inline AudioPlayer component
- [ ] Buttons: Play, Download, Similar, Details
- [ ] Hover effects
### 6.8 AudioPlayer.tsx
- [ ] Props: trackId, filename, duration
- [ ] HTML5 audio element with ref
- [ ] WaveformDisplay child component
- [ ] Progress slider (seek support)
- [ ] Play/Pause button
- [ ] Volume slider with icon
- [ ] Time display (current / total)
- [ ] Download button (calls /api/audio/download/{id})
### 6.9 WaveformDisplay.tsx
- [ ] Props: trackId, currentTime, duration
- [ ] Fetch peaks from /api/audio/waveform/{id}
- [ ] Canvas rendering:
- Draw bars for each peak
- Color played portion differently (blue vs gray)
- Click to seek
- [ ] Loading state while fetching peaks
### 6.10 TrackDetails.tsx (Modal/Dialog)
- [ ] Props: trackId, open, onClose
- [ ] Fetch full track details
- [ ] Display all metadata in organized sections:
- Audio info: duration, format, file size
- Musical features: tempo, key, time signature, energy, danceability, valence
- Classification: genre (primary + secondary), mood (primary + secondary + arousal/valence), instruments
- Spectral features: spectral centroid, zero crossing rate, loudness
- [ ] Similar tracks section (preview)
- [ ] Download button
### 6.11 SimilarTracks.tsx
- [ ] Props: trackId, limit
- [ ] Fetch similar tracks
- [ ] Display as list of mini TrackCards
- [ ] Click to navigate or play
### 6.12 BatchScanner.tsx
- [ ] Input for folder path
- [ ] Recursive checkbox
- [ ] Scan button
- [ ] Progress bar (poll /api/analyze/status/{jobId})
- [ ] Status messages (pending, running X/Y, completed, errors)
- [ ] Error list if any
### 6.13 Main page (app/page.tsx)
- [ ] SearchBar at top
- [ ] FilterPanel in sidebar or collapsible
- [ ] BatchScanner in header or dedicated section
- [ ] TrackCard grid/list
- [ ] Pagination controls (Load More or page numbers)
- [ ] Total tracks count
- [ ] Loading states
- [ ] Empty state if no tracks
### 6.14 Track detail page (app/tracks/[id]/page.tsx)
- [ ] Fetch track by ID
- [ ] Large AudioPlayer
- [ ] Full metadata display (similar to TrackDetails modal)
- [ ] SimilarTracks section
- [ ] Back to library button
### 6.15 Layout (app/layout.tsx)
- [ ] QueryClientProvider setup
- [ ] Toast provider (for notifications)
- [ ] Global styles
- [ ] Header with app title and nav
---
## Phase 7: Docker & Deployment
### 7.1 docker-compose.yml
- [ ] Service: postgres
- image: pgvector/pgvector:pg16
- environment: POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB
- ports: 5432:5432
- volumes: postgres_data, init-db.sql
- [ ] Service: backend
- build: ./backend
- depends_on: postgres
- environment: DATABASE_URL
- ports: 8000:8000
- volumes: audio files mount (read-only)
- [ ] Service: frontend (optional, or dev mode only)
- build: ./frontend
- ports: 3000:3000
- environment: NEXT_PUBLIC_API_URL=http://localhost:8000
### 7.2 Backend Dockerfile
- [ ] FROM python:3.11-slim
- [ ] Install system deps: ffmpeg, libsndfile1
- [ ] COPY requirements.txt
- [ ] RUN pip install -r requirements.txt
- [ ] COPY src/
- [ ] Download Essentia models during build or on startup
- [ ] CMD: uvicorn src.api.main:app --host 0.0.0.0 --port 8000
### 7.3 Frontend Dockerfile (production build)
- [ ] FROM node:20-alpine
- [ ] COPY package.json, package-lock.json
- [ ] RUN npm ci
- [ ] COPY app/, components/, lib/, hooks/, public/
- [ ] RUN npm run build
- [ ] CMD: npm start
---
## Phase 8: Documentation & Scripts
### 8.1 Root README.md
- [ ] Project description
- [ ] Features list
- [ ] Tech stack
- [ ] Prerequisites (Docker, Node, Python)
- [ ] Quick start:
- Clone repo
- Copy .env.example to .env
- docker-compose up
- Access frontend at localhost:3000
- [ ] Development setup
- [ ] API documentation link (FastAPI /docs)
- [ ] Architecture diagram (optional)
### 8.2 Backend README.md
- [ ] Setup instructions
- [ ] Environment variables documentation
- [ ] Essentia models download instructions
- [ ] API endpoints list
- [ ] Database schema
- [ ] Running migrations
### 8.3 Frontend README.md
- [ ] Setup instructions
- [ ] Environment variables
- [ ] Available scripts (dev, build, start)
- [ ] Component structure
### 8.4 Scripts
- [ ] `scripts/download-essentia-models.sh` - Download Essentia models
- [ ] `scripts/init-db.sh` - Run migrations
- [ ] `backend/src/cli.py` - CLI for manual analysis (optional)
---
## Phase 9: Testing & Validation
### 9.1 Backend tests (optional but recommended)
- [ ] Test audio_processor.extract_all_features with sample file
- [ ] Test essentia_classifier with sample file
- [ ] Test CRUD operations
- [ ] Test API endpoints with pytest + httpx
### 9.2 Frontend tests (optional)
- [ ] Test API client functions
- [ ] Test hooks
- [ ] Component tests with React Testing Library
### 9.3 Integration test
- [ ] Full flow: analyze folder -> save to DB -> search -> play -> download
---
## Phase 10: Optimizations & Polish
### 10.1 Performance
- [ ] Add database indexes
#### 4.1 Performance
- [ ] Optimize Docker build time (if needed)
- [ ] Cache waveform peaks
- [ ] Optimize audio loading (lazy loading for large libraries)
- [ ] Add compression for API responses
- [ ] Optimize DB queries (indexes)
- [ ] Lazy loading of tracks (infinite pagination)
### 10.2 UX improvements
#### 4.2 UX
- [ ] Loading skeletons
- [ ] Error boundaries
- [ ] Toast notifications for actions
- [ ] Keyboard shortcuts (space to play/pause, arrows to seek)
- [ ] Toast notifications
- [ ] Keyboard shortcuts (space = play/pause)
- [ ] Dark mode support
### 10.3 Backend improvements
- [ ] Rate limiting
- [ ] Request validation with Pydantic
- [ ] Logging (structured logs)
#### 4.3 Backend improvements
- [ ] Rate limiting API
- [ ] Structured logging
- [ ] Error handling middleware
- [ ] Detailed health checks
---
## Implementation order priority
### Phase 5: Additional features (Nice-to-have)
1. **Phase 2** (Database) - Foundation
2. **Phase 3** (Audio processing) - Core logic
3. **Phase 4** (CRUD) - Data layer
4. **Phase 5.1-5.2** (FastAPI setup) - API foundation
5. **Phase 5.3-5.8** (API routes) - Complete backend
6. **Phase 6.1-6.3** (Frontend setup + API client + hooks) - Frontend foundation
7. **Phase 6.4-6.12** (Components) - UI implementation
8. **Phase 6.13-6.15** (Pages) - Complete frontend
9. **Phase 7** (Docker) - Deployment
10. **Phase 8** (Documentation) - Final polish
#### 5.1 Features missing from the original plan
- [ ] Batch export (CSV/JSON)
- [ ] Playlists
- [ ] Duplicate detection
- [ ] Tag editing
- [ ] Advanced visualizations (spectrogram)
#### 5.2 Embeddings CLAP (Future)
- [ ] CLAP integration for semantic search
- [ ] Use pgvector for similarity search
- [ ] API endpoint for semantic search
#### 5.3 Multi-user (Future)
- [ ] Authentication JWT
- [ ] User management
- [ ] Permissions
---
## Notes for implementation
## 🎯 RECOMMENDED ROADMAP
- Use type hints everywhere in Python
- Use TypeScript strict mode in frontend
- Handle errors gracefully (try/catch, proper HTTP status codes)
- Add logging at key points (file analysis start/end, DB operations)
- Validate file paths (security: prevent path traversal)
- Consider file locking for concurrent analysis
- Add progress updates for long operations
- Use environment variables for all config
- Keep audio files outside Docker volumes for performance
- Consider caching Essentia predictions (expensive)
- Add retry logic for failed analyses
- Support cancellation for long-running jobs
### Sprint 1 (This week) - MINIMUM VIABLE PRODUCT
1. ✅ ~~Finalize Docker setup~~
2. **Create base frontend components** (SearchBar, TrackCard, AudioPlayer)
3. **Create frontend hooks** (useTracks, useAudioPlayer)
4. **Working main page** with list + playback
5. **Test the full flow** with real audio files
## Files to download/prepare before starting
### Sprint 2 (Next week) - COMPLETE FEATURES
1. Advanced components (FilterPanel, BatchScanner, SimilarTracks)
2. Track detail page
3. Performance optimizations
4. UX polish (loading states, errors, toasts)
1. Essentia models (3 files):
- mtg_jamendo_genre-discogs-effnet-1.pb
- mtg_jamendo_moodtheme-discogs-effnet-1.pb
- mtg_jamendo_instrument-discogs-effnet-1.pb
2. Class labels JSON for each model
3. Sample audio files for testing
### Sprint 3 (Later) - POLISH & EXTRAS
1. Dark mode
2. Keyboard shortcuts
3. Export data
4. Final documentation
## External dependencies verification
---
- librosa: check version compatibility with numpy
- essentia-tensorflow: verify CPU-only build works
- pgvector: verify PostgreSQL extension installation
- FFmpeg: required by librosa for audio decoding
## 📝 Important Notes
## Security considerations
### Docker build on Mac ARM
- **Current problem**: Very slow build (10+ min) because Essentia requires x86_64 emulation
- **Current solution**: `FROM --platform=linux/amd64 python:3.9-slim` in the Dockerfile
- **Performance**: The runtime will also be emulated (slower but functional)
- **Alternative**: Compile Essentia for ARM64 (complex, time-consuming)
- Validate all file paths (no ../ traversal)
- Sanitize user input in search queries
- Rate limit API endpoints
- CORS: whitelist frontend origin only
- Don't expose full filesystem paths in API responses
- Consider adding authentication later (JWT)
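The path traversal check in the list above can be sketched with `pathlib`. The function name `resolve_inside` is hypothetical, and `base_dir` is assumed to be the configured library root (for example the `AUDIO_LIBRARY_PATH` setting). Paths are resolved first so that `..` segments and symlinks are taken into account before the containment check.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir; raise ValueError if it escapes."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, which the check below catches.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes library root: {user_path!r}")
    return candidate
```

A route such as `POST /api/analyze/folder` could call this on the submitted path and turn the `ValueError` into a 400 response.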
### Priorities
1. **Frontend components** → Make the app usable
2. **Tests with real data** → Validate that everything works
3. **UX polish** → Make the app pleasant to use
## Future enhancements (not in current scope)
### Current state
- ✅ Backend 95% complete and functional
- ⚠️ Frontend 30% complete (structure OK, UI missing)
- ⚠️ Docker 90% (backend build in progress)
- ✅ Excellent documentation
- CLAP embeddings for semantic search
- Batch export to CSV/JSON
- Playlist creation
- Audio trimming/preview segments
- Duplicate detection (audio fingerprinting)
- Tag editing (write back to files)
- Multi-user support with authentication
- WebSocket for real-time analysis progress
- Audio visualization (spectrogram, chromagram)
---
## 🚀 Useful Commands
### Docker
```bash
# Build (can take 15-20 min on Mac ARM)
docker-compose build
# Start
docker-compose up
# Logs
docker-compose logs -f backend
# Scan a folder
docker exec audio_classifier_api python -m src.cli.scanner /music --recursive
```
### Local Dev
```bash
# Backend
cd backend
pip install -r requirements.txt
uvicorn src.api.main:app --reload
# Frontend
cd frontend
npm install
npm run dev
```
---
## ✨ Immediate next step
**CREATE THE FRONTEND COMPONENTS** to get a usable interface.
Suggested order:
1. TrackCard (display tracks)
2. AudioPlayer (play tracks)
3. SearchBar + FilterPanel (search)
4. BatchScanner (scan folders)
5. TrackDetails + SimilarTracks (advanced features)