Initial commit - SoundWave v1.0
- Full PWA support with offline capabilities
- Comprehensive search across songs, playlists, and channels
- Offline playlist manager with download tracking
- Pre-built frontend for zero-build deployment
- Docker-based deployment with docker compose
- Material-UI dark theme interface
- YouTube audio download and management
- Multi-user authentication support
Commit 51679d1943: 254 changed files with 37281 additions and 0 deletions
301
docs/LYRICS_FEATURE.md
Normal file
# Lyrics Feature - SoundWave

## Overview

SoundWave now includes automatic lyrics fetching and synchronized display powered by the [LRCLIB API](https://lrclib.net/). This feature provides:

- **Automatic lyrics fetching** for newly downloaded audio
- **Synchronized lyrics** display with real-time highlighting
- **Caching system** to minimize API requests
- **Background polling** to gradually build the lyrics library
- **Manual controls** for fetching, updating, and managing lyrics

## How It Works

### 1. Automatic Fetching

When you download an audio file, SoundWave automatically:

1. Extracts metadata (title, artist, duration)
2. Queries the LRCLIB API for lyrics
3. Stores synchronized (.lrc) or plain text lyrics
4. Caches results to avoid duplicate requests

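The lookup in steps 1 to 3 can be sketched as follows. This is a minimal illustration, not SoundWave's actual `LyricsService`: the helper names are hypothetical, and the `/api/get` path, the query parameter names, and the response keys (`syncedLyrics`, `plainLyrics`, `instrumental`) are assumptions based on LRCLIB's public API that should be checked against its documentation.

```python
from urllib.parse import urlencode

LRCLIB_BASE = "https://lrclib.net"  # default instance named in this document


def build_lrclib_url(title, artist, duration, base=LRCLIB_BASE):
    """Build a lookup URL for one track (path and parameter names are assumptions)."""
    params = {
        "track_name": title,
        "artist_name": artist,
        "duration": int(round(duration)),  # whole seconds
    }
    return f"{base.rstrip('/')}/api/get?{urlencode(params)}"


def pick_lyrics(payload):
    """Prefer synced lyrics, fall back to plain text, else report nothing found."""
    if payload.get("instrumental"):
        return ("instrumental", "")
    if payload.get("syncedLyrics"):
        return ("synced", payload["syncedLyrics"])
    if payload.get("plainLyrics"):
        return ("plain", payload["plainLyrics"])
    return ("not_found", "")
```

Keeping URL construction and response handling as pure functions makes them easy to test without touching the network; the HTTP call itself would sit between the two.
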
### 2. Background Polling

A Celery beat schedule runs periodic tasks:

- **Every hour**: Auto-fetch lyrics for up to 50 tracks without lyrics
- **Weekly (Sunday 3 AM)**: Clean up old cache entries (30+ days)
- **Weekly (Sunday 4 AM)**: Retry failed lyrics fetches (7+ days old)

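The hourly job's selection step might look like the sketch below. It assumes that "tracks without lyrics" means tracks with no stored lyrics and fewer than `max_attempts` fetch attempts; the `Track` dataclass and `select_for_auto_fetch` are hypothetical stand-ins for the real Django query.

```python
from dataclasses import dataclass


@dataclass
class Track:
    # Hypothetical, simplified view of an audio record.
    youtube_id: str
    has_lyrics: bool = False
    fetch_attempts: int = 0


def select_for_auto_fetch(tracks, limit=50, max_attempts=3):
    """Return IDs of tracks still worth trying, capped at `limit`."""
    candidates = [
        t.youtube_id
        for t in tracks
        if not t.has_lyrics and t.fetch_attempts < max_attempts
    ]
    return candidates[:limit]
```

Capping each run at a fixed limit is what lets the library fill in gradually instead of sending one large burst of requests.
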
### 3. Smart Caching

SoundWave uses a two-level caching system:

1. **Django Cache**: In-memory cache for API responses (7 days)
2. **Database Cache**: `LyricsCache` table stores lyrics by title/artist/duration

This ensures:
- Minimal API requests (respecting rate limits)
- Fast lyrics retrieval
- Shared cache across tracks with the same metadata

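A two-level lookup of this kind can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the project's code: `TwoLevelLyricsCache` and `cache_key` are hypothetical names, a dict stands in for the `LyricsCache` table, and the key normalization (lowercased title and artist, duration rounded to whole seconds) is a guess at how entries are shared between tracks.

```python
import time


def cache_key(title, artist, duration):
    """Normalize metadata so tracks with the same metadata share an entry."""
    return (title.strip().lower(), artist.strip().lower(), int(round(duration)))


class TwoLevelLyricsCache:
    """In-memory dict with a TTL in front of a persistent mapping (stand-in for the DB)."""

    def __init__(self, ttl_seconds=7 * 24 * 3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self.memory = {}    # key -> (expires_at, lyrics)
        self.database = {}  # key -> lyrics

    def get(self, title, artist, duration):
        key = cache_key(title, artist, duration)
        entry = self.memory.get(key)
        if entry and entry[0] > self.clock():
            return entry[1]  # level 1 hit
        lyrics = self.database.get(key)
        if lyrics is not None:
            # Level 2 hit: promote back into the in-memory level.
            self.memory[key] = (self.clock() + self.ttl, lyrics)
        return lyrics  # None means the caller should query the API

    def put(self, title, artist, duration, lyrics):
        key = cache_key(title, artist, duration)
        self.database[key] = lyrics
        self.memory[key] = (self.clock() + self.ttl, lyrics)
```

Only a miss at both levels results in a request to LRCLIB, which is how the design keeps API traffic low.
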
## API Endpoints

### Get Lyrics for Audio
```http
GET /api/audio/{youtube_id}/lyrics/
```

Returns the lyrics data, or triggers an asynchronous fetch if no fetch has been attempted yet.

### Manually Fetch Lyrics
```http
POST /api/audio/{youtube_id}/lyrics/fetch/
Body: { "force": true }
```

Forces an immediate lyrics fetch from the LRCLIB API.

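As a rough client-side illustration, the manual fetch call could be prepared with the standard library as shown below. The base URL is a placeholder, the helper name is hypothetical, and authentication headers are omitted because they depend on the deployment.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical SoundWave address


def build_fetch_request(youtube_id, force=True, base_url=BASE_URL):
    """Prepare (but do not send) the manual-fetch POST for one track."""
    url = f"{base_url}/api/audio/{youtube_id}/lyrics/fetch/"
    body = json.dumps({"force": force}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the same pattern applies to the other endpoints in this section with a different path, method, and body.
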
### Update Lyrics Manually
```http
PUT /api/audio/{youtube_id}/lyrics/
Body: {
  "synced_lyrics": "[00:12.00]Lyrics text...",
  "plain_lyrics": "Plain text lyrics...",
  "is_instrumental": false,
  "language": "en"
}
```

### Delete Lyrics
```http
DELETE /api/audio/{youtube_id}/lyrics/
```

### Batch Fetch
```http
POST /api/audio/lyrics/fetch_batch/
Body: { "youtube_ids": ["abc123", "def456"] }
```

### Fetch All Missing
```http
POST /api/audio/lyrics/fetch_all_missing/
Body: { "limit": 50 }
```

### Statistics
```http
GET /api/audio/lyrics/stats/
```

Returns:
```json
{
  "total_audio": 1250,
  "total_lyrics_attempted": 980,
  "with_synced_lyrics": 720,
  "with_plain_lyrics": 150,
  "instrumental": 30,
  "failed": 80,
  "coverage_percentage": 72.0
}
```

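The sample numbers are consistent with coverage being counted as synced, plain, and instrumental tracks over all audio: (720 + 150 + 30) / 1250 = 72.0%. That formula is inferred from the example response, not taken from the implementation, so treat the sketch below as one plausible reading.

```python
def coverage_percentage(stats):
    """Share of all audio with synced, plain, or instrumental lyrics (inferred formula)."""
    covered = (
        stats["with_synced_lyrics"]
        + stats["with_plain_lyrics"]
        + stats["instrumental"]
    )
    total = stats["total_audio"]
    return round(100 * covered / total, 1) if total else 0.0
```
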
## Frontend Components

### LyricsPlayer Component

```tsx
import LyricsPlayer from '@/components/LyricsPlayer';

<LyricsPlayer
  youtubeId="abc123"
  currentTime={45.2}
  onClose={() => setShowLyrics(false)}
  embedded={false}
/>
```

**Features:**
- Real-time synchronized highlighting
- Auto-scroll with toggle
- Synced/Plain text tabs
- Retry fetch button
- Instrumental detection

### Props
- `youtubeId`: YouTube video ID
- `currentTime`: Current playback time in seconds
- `onClose`: Callback when closed (optional)
- `embedded`: Compact mode flag (optional)

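Synchronized highlighting comes down to finding the last lyric line whose timestamp is at or before `currentTime`. The sketch below shows that lookup in Python for illustration only; the component itself is TypeScript and may do this differently, and `active_line_index` is a hypothetical name.

```python
from bisect import bisect_right


def active_line_index(timestamps, current_time):
    """Index of the last line whose timestamp is <= current_time, or -1 before the first line."""
    return bisect_right(timestamps, current_time) - 1
```

With sorted timestamps a binary search keeps the lookup cheap even when `currentTime` updates several times per second.
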
## Database Models

### Lyrics Model
```python
# Abbreviated field listing; field options (max_length, on_delete, defaults) are omitted.
class Lyrics(models.Model):
    audio = OneToOneField(Audio)
    synced_lyrics = TextField()
    plain_lyrics = TextField()
    is_instrumental = BooleanField()
    source = CharField()  # 'lrclib', 'genius', 'manual'
    language = CharField()
    fetched_date = DateTimeField()
    fetch_attempted = BooleanField()
    fetch_attempts = IntegerField()
    last_error = TextField()
```

### LyricsCache Model
```python
# Abbreviated field listing; field options are omitted.
class LyricsCache(models.Model):
    title = CharField()
    artist_name = CharField()
    album_name = CharField()
    duration = IntegerField()
    synced_lyrics = TextField()
    plain_lyrics = TextField()
    is_instrumental = BooleanField()
    language = CharField()
    source = CharField()
    cached_date = DateTimeField()
    last_accessed = DateTimeField()
    access_count = IntegerField()
    not_found = BooleanField()
```

## Celery Tasks

### fetch_lyrics_for_audio
```python
from audio.tasks_lyrics import fetch_lyrics_for_audio

fetch_lyrics_for_audio.delay('youtube_id', force=False)
```

### fetch_lyrics_batch
```python
from audio.tasks_lyrics import fetch_lyrics_batch

fetch_lyrics_batch.delay(['id1', 'id2', 'id3'], delay_seconds=2)
```

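The effect of `delay_seconds` can be pictured with the sketch below, which assumes the task pauses between consecutive requests. `fetch_in_batch` and its `fetch_one` callback are hypothetical, not the body of `fetch_lyrics_batch`.

```python
import time


def fetch_in_batch(youtube_ids, fetch_one, delay_seconds=2, sleep=time.sleep):
    """Call fetch_one for each ID, pausing between (not after) requests."""
    results = {}
    for index, youtube_id in enumerate(youtube_ids):
        if index > 0:
            sleep(delay_seconds)  # spread requests out to respect rate limits
        results[youtube_id] = fetch_one(youtube_id)
    return results
```

Injecting `sleep` as a parameter lets a test verify the pacing without actually waiting.
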
### auto_fetch_lyrics
```python
from audio.tasks_lyrics import auto_fetch_lyrics

auto_fetch_lyrics.delay(limit=50, max_attempts=3)
```

### cleanup_lyrics_cache
```python
from audio.tasks_lyrics import cleanup_lyrics_cache

cleanup_lyrics_cache.delay(days_old=30)
```

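What counts as "old" for the cleanup is not spelled out here. The sketch below assumes entries are judged by their `last_accessed` time, which is only one possible reading (the task could equally use `cached_date`); `stale_cache_keys` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def stale_cache_keys(entries, days_old=30, now=None):
    """Return keys whose last access is more than `days_old` days before `now`."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=days_old)
    return [key for key, last_accessed in entries.items() if last_accessed < cutoff]
```
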
### refetch_failed_lyrics
```python
from audio.tasks_lyrics import refetch_failed_lyrics

refetch_failed_lyrics.delay(days_old=7, limit=20)
```

## Configuration

### Celery Beat Schedule
Located in `backend/config/celery.py`:

```python
from celery.schedules import crontab

app.conf.beat_schedule = {
    'auto-fetch-lyrics': {
        'task': 'audio.auto_fetch_lyrics',
        'schedule': crontab(minute=0),  # Every hour
        'kwargs': {'limit': 50, 'max_attempts': 3},
    },
    # ... more tasks
}
```

### LRCLIB Instance
Default: `https://lrclib.net`

To use a custom instance:
```python
from audio.lyrics_service import LyricsService

service = LyricsService(lrclib_instance='https://custom.lrclib.net')
```

## LRC Format

Synchronized lyrics use the LRC format:

```
[ar: Artist Name]
[ti: Song Title]
[al: Album Name]
[00:12.00]First line of lyrics
[00:15.50]Second line of lyrics
[00:18.20]Third line of lyrics
```

Timestamp format: `[mm:ss.xx]`
- `mm`: Minutes (2 digits)
- `ss`: Seconds (2 digits)
- `xx`: Centiseconds (2 digits)

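A minimal parser for the timestamp format described above might look like this. It handles only the `[mm:ss.xx]` form with one timestamp per line and skips metadata tags such as `[ar: ...]`; LRC files found elsewhere can use other variants, and `parse_lrc` is a hypothetical helper rather than SoundWave's parser.

```python
import re

# One timestamp in [mm:ss.xx] form followed by the lyric text.
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2})\.(\d{2})\](.*)$")


def parse_lrc(text):
    """Return a sorted list of (seconds, lyric) tuples; metadata tags are skipped."""
    lines = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # e.g. [ar: ...], [ti: ...], or blank lines
        minutes, seconds, centis, lyric = match.groups()
        timestamp = int(minutes) * 60 + int(seconds) + int(centis) / 100
        lines.append((timestamp, lyric.strip()))
    return sorted(lines)
```
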
## Admin Interface

Django Admin provides:

### Lyrics Admin
- List view with filters (source, language, fetch status)
- Search by audio title/channel/youtube_id
- Edit synced/plain lyrics
- View fetch attempts and errors

### LyricsCache Admin
- List view with filters (source, not_found, date)
- Search by title/artist
- View access count statistics
- Bulk action: Clear not_found entries

## Rate Limiting

To avoid overwhelming the LRCLIB API:

1. **Request delays**: 1-2 second delays between batch requests
2. **Caching**: 7-day cache for successful fetches, 1-day for not_found
3. **Max attempts**: Stop after 3-5 failed attempts
4. **Retry backoff**: Wait 7+ days before retrying failed fetches

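Rules 3 and 4 can be combined into a single eligibility check, sketched below. How SoundWave actually combines the attempt limit with the retry wait is not described here, so this is one possible interpretation with a hypothetical `may_fetch` helper and default values taken from the task examples above.

```python
from datetime import datetime, timedelta


def may_fetch(fetch_attempts, last_attempt, now, max_attempts=3, retry_after_days=7):
    """Decide whether another LRCLIB request is allowed for a track."""
    if last_attempt is None:
        return True  # never tried before
    if fetch_attempts >= max_attempts:
        return False  # attempt budget used up
    # Otherwise wait out the retry window before trying again.
    return now - last_attempt >= timedelta(days=retry_after_days)
```
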
## Troubleshooting

### No lyrics found
- Check that the track metadata (title, artist) is accurate
- Try a manual fetch with `"force": true`
- Check that the LRCLIB database has lyrics for this track
- Verify the track isn't instrumental

### Sync issues
- Ensure the audio duration matches the lyrics timing
- Check that the LRC format is valid (use a validator)
- Verify the `currentTime` prop is updated correctly

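If no external validator is at hand, a basic format check can be sketched in a few lines. The rules below (two-digit fields, seconds under 60, timestamps that never go backwards) are assumptions drawn from the LRC description in this document, and `validate_lrc` is a hypothetical helper.

```python
import re

TIMED = re.compile(r"^\[(\d{2}):([0-5]\d)\.(\d{2})\]")  # [mm:ss.xx] prefix
TAG = re.compile(r"^\[[a-z]+:.*\]$")                     # metadata such as [ar: ...]


def validate_lrc(text):
    """Return a list of (line_number, problem) pairs; an empty list means no issues found."""
    problems = []
    previous = -1.0
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or TAG.match(line):
            continue
        match = TIMED.match(line)
        if not match:
            problems.append((number, "missing or malformed timestamp"))
            continue
        minutes, seconds, centis = (int(g) for g in match.groups())
        stamp = minutes * 60 + seconds + centis / 100
        if stamp < previous:
            problems.append((number, "timestamp goes backwards"))
        previous = stamp
    return problems
```
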
### Performance
- Monitor cache hit rate: `/api/audio/lyrics-cache/stats/`
- Clear old not_found entries regularly
- Adjust Celery beat schedule if needed

## Credits

- **LRCLIB API**: https://lrclib.net/
- **LRC Format**: https://en.wikipedia.org/wiki/LRC_(file_format)
- **Inspiration**: lrcget project by tranxuanthang

## License

This feature is part of SoundWave and follows the same MIT license.