Lyrics Feature - SoundWave
Overview
SoundWave now includes automatic lyrics fetching and synchronized display powered by the LRCLIB API. This feature provides:
- Automatic lyrics fetching for newly downloaded audio
- Synchronized lyrics display with real-time highlighting
- Caching system to minimize API requests
- Background polling to gradually build lyrics library
- Manual controls for fetching, updating, and managing lyrics
How It Works
1. Automatic Fetching
When you download an audio file, SoundWave automatically:
- Extracts metadata (title, artist, duration)
- Queries LRCLIB API for lyrics
- Stores synchronized (.lrc) or plain text lyrics
- Caches results to avoid duplicate requests
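The lookup step above can be sketched as follows. This is a minimal illustration, not SoundWave's actual service code: the helper names `build_lrclib_get_url` and `pick_lyrics` are hypothetical, and the `/api/get` query parameters and response field names reflect LRCLIB's public API as commonly documented, so verify them against the instance you use.

```python
from urllib.parse import urlencode

LRCLIB_BASE = "https://lrclib.net"  # default instance named in this document


def build_lrclib_get_url(title, artist, duration, album=None, base=LRCLIB_BASE):
    """Build a lookup URL for LRCLIB's /api/get endpoint (assumed parameter names)."""
    params = {
        "track_name": title,
        "artist_name": artist,
        "duration": int(round(duration)),  # whole seconds
    }
    if album:
        params["album_name"] = album
    return f"{base.rstrip('/')}/api/get?{urlencode(params)}"


def pick_lyrics(payload):
    """Map an LRCLIB-style JSON payload onto the field names used in this document."""
    return {
        "synced_lyrics": payload.get("syncedLyrics") or "",
        "plain_lyrics": payload.get("plainLyrics") or "",
        "is_instrumental": bool(payload.get("instrumental", False)),
    }
```

Keeping URL construction separate from the HTTP call makes the metadata-to-query mapping easy to test without network access.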
2. Background Polling
A Celery beat schedule runs periodic tasks:
- Every hour: Auto-fetch lyrics for up to 50 tracks without lyrics
- Weekly (Sunday 3 AM): Clean up old cache entries (30+ days)
- Weekly (Sunday 4 AM): Retry failed lyrics fetches (7+ days old)
3. Smart Caching
Two-level caching system:
- Django Cache: In-memory cache for API responses (7 days)
- Database Cache: LyricsCache table stores lyrics by title/artist/duration
This ensures:
- Minimal API requests (respecting rate limits)
- Fast lyrics retrieval
- Shared cache across tracks with same metadata
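The two-level lookup can be illustrated with a small, framework-free sketch. Plain dictionaries stand in for the Django cache and the LyricsCache table, and the normalization in `cache_key` (lowercasing, rounding the duration) is an assumption about how tracks with the same metadata end up sharing an entry; the real implementation may differ.

```python
def cache_key(title, artist, duration):
    """Normalize metadata so equivalent tracks map to the same key (assumed rule)."""
    return (title.strip().lower(), artist.strip().lower(), int(round(duration)))


class TwoLevelLyricsCache:
    """Illustrative stand-in: memory cache first, then database, then remote API."""

    def __init__(self, fetch_remote):
        self.memory = {}    # stands in for the Django cache
        self.database = {}  # stands in for the LyricsCache table
        self.fetch_remote = fetch_remote
        self.remote_calls = 0

    def get(self, title, artist, duration):
        key = cache_key(title, artist, duration)
        if key in self.memory:
            return self.memory[key]
        if key in self.database:
            # Promote the database hit into the faster cache.
            self.memory[key] = self.database[key]
            return self.database[key]
        # Miss on both levels: call the remote API once and store the result.
        self.remote_calls += 1
        lyrics = self.fetch_remote(title, artist, duration)
        self.database[key] = lyrics
        self.memory[key] = lyrics
        return lyrics
```

With this shape, a second track carrying the same title, artist, and duration is served from cache and triggers no additional API request.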
API Endpoints
Get Lyrics for Audio
GET /api/audio/{youtube_id}/lyrics/
Returns the lyrics data, or triggers an asynchronous fetch if no fetch has been attempted yet.
Manually Fetch Lyrics
POST /api/audio/{youtube_id}/lyrics/fetch/
Body: { "force": true }
Forces immediate lyrics fetch from LRCLIB API.
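As a rough client-side sketch, the request above could be built with the Python standard library. The base URL is a placeholder and `build_fetch_request` is a hypothetical helper; authentication is omitted here, and a real deployment will likely require credentials.

```python
import json
from urllib.request import Request


def build_fetch_request(base_url, youtube_id, force=True):
    """Build a POST request for the manual lyrics fetch endpoint."""
    url = f"{base_url.rstrip('/')}/api/audio/{youtube_id}/lyrics/fetch/"
    body = json.dumps({"force": force}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

The request can then be sent with `urllib.request.urlopen(build_fetch_request("http://localhost:8000", "abc123"))`, where the host and port are assumptions about a local setup.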
Update Lyrics Manually
PUT /api/audio/{youtube_id}/lyrics/
Body: {
  "synced_lyrics": "[00:12.00]Lyrics text...",
  "plain_lyrics": "Plain text lyrics...",
  "is_instrumental": false,
  "language": "en"
}
Delete Lyrics
DELETE /api/audio/{youtube_id}/lyrics/
Batch Fetch
POST /api/audio/lyrics/fetch_batch/
Body: { "youtube_ids": ["abc123", "def456"] }
Fetch All Missing
POST /api/audio/lyrics/fetch_all_missing/
Body: { "limit": 50 }
Statistics
GET /api/audio/lyrics/stats/
Returns:
{
  "total_audio": 1250,
  "total_lyrics_attempted": 980,
  "with_synced_lyrics": 720,
  "with_plain_lyrics": 150,
  "instrumental": 30,
  "failed": 80,
  "coverage_percentage": 72.0
}
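The sample values are consistent with coverage counting synced, plain, and instrumental tracks against the total: (720 + 150 + 30) / 1250 = 72.0%. The source does not state the formula, so the sketch below is an inference from the example response rather than the actual implementation.

```python
def coverage_percentage(stats):
    """Share of tracks with synced, plain, or instrumental results (inferred formula)."""
    total = stats["total_audio"]
    if total == 0:
        return 0.0
    covered = (
        stats["with_synced_lyrics"]
        + stats["with_plain_lyrics"]
        + stats["instrumental"]
    )
    return round(100 * covered / total, 1)
```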
Frontend Components
LyricsPlayer Component
import LyricsPlayer from '@/components/LyricsPlayer';
<LyricsPlayer
  youtubeId="abc123"
  currentTime={45.2}
  onClose={() => setShowLyrics(false)}
  embedded={false}
/>
Features:
- Real-time synchronized highlighting
- Auto-scroll with toggle
- Synced/Plain text tabs
- Retry fetch button
- Instrumental detection
Props
- youtubeId: YouTube video ID
- currentTime: Current playback time in seconds
- onClose: Callback when closed (optional)
- embedded: Compact mode flag (optional)
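Real-time highlighting comes down to finding the last lyric line whose timestamp is at or before currentTime. The component itself is written for the frontend; the logic is shown here in Python for consistency with the backend examples, and `active_line_index` is a hypothetical name, not part of the component's API.

```python
from bisect import bisect_right


def active_line_index(timestamps, current_time):
    """Index of the last line starting at or before current_time, or -1 if none.

    `timestamps` must be sorted ascending, in seconds.
    """
    return bisect_right(timestamps, current_time) - 1
```

A binary search keeps the lookup cheap even when it runs on every playback time update.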
Database Models
Lyrics Model
class Lyrics(models.Model):
    audio = OneToOneField(Audio)
    synced_lyrics = TextField()
    plain_lyrics = TextField()
    is_instrumental = BooleanField()
    source = CharField()  # 'lrclib', 'genius', 'manual'
    language = CharField()
    fetched_date = DateTimeField()
    fetch_attempted = BooleanField()
    fetch_attempts = IntegerField()
    last_error = TextField()
LyricsCache Model
class LyricsCache(models.Model):
    title = CharField()
    artist_name = CharField()
    album_name = CharField()
    duration = IntegerField()
    synced_lyrics = TextField()
    plain_lyrics = TextField()
    is_instrumental = BooleanField()
    language = CharField()
    source = CharField()
    cached_date = DateTimeField()
    last_accessed = DateTimeField()
    access_count = IntegerField()
    not_found = BooleanField()
Celery Tasks
fetch_lyrics_for_audio
from audio.tasks_lyrics import fetch_lyrics_for_audio
fetch_lyrics_for_audio.delay('youtube_id', force=False)
fetch_lyrics_batch
from audio.tasks_lyrics import fetch_lyrics_batch
fetch_lyrics_batch.delay(['id1', 'id2', 'id3'], delay_seconds=2)
auto_fetch_lyrics
from audio.tasks_lyrics import auto_fetch_lyrics
auto_fetch_lyrics.delay(limit=50, max_attempts=3)
cleanup_lyrics_cache
from audio.tasks_lyrics import cleanup_lyrics_cache
cleanup_lyrics_cache.delay(days_old=30)
refetch_failed_lyrics
from audio.tasks_lyrics import refetch_failed_lyrics
refetch_failed_lyrics.delay(days_old=7, limit=20)
Configuration
Celery Beat Schedule
Located in backend/config/celery.py:
from celery.schedules import crontab

app.conf.beat_schedule = {
    'auto-fetch-lyrics': {
        'task': 'audio.auto_fetch_lyrics',
        'schedule': crontab(minute=0),  # At minute 0 of every hour
        'kwargs': {'limit': 50, 'max_attempts': 3},
    },
    # ... more tasks
}
LRCLIB Instance
Default: https://lrclib.net
To use custom instance:
from audio.lyrics_service import LyricsService
service = LyricsService(lrclib_instance='https://custom.lrclib.net')
LRC Format
Synchronized lyrics use the LRC format:
[ar: Artist Name]
[ti: Song Title]
[al: Album Name]
[00:12.00]First line of lyrics
[00:15.50]Second line of lyrics
[00:18.20]Third line of lyrics
Timestamps format: [mm:ss.xx]
- mm: Minutes (2 digits)
- ss: Seconds (2 digits)
- xx: Centiseconds (2 digits)
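A minimal parser for the format described above might look like this. It is a sketch that handles only the [mm:ss.xx] timestamps and the simple [key: value] tags shown here; `parse_lrc` is a hypothetical helper, and LRC files in the wild can include variants (multiple timestamps per line, millisecond precision) that it ignores.

```python
import re

TIMESTAMP = re.compile(r"\[(\d{2}):(\d{2})\.(\d{2})\]")
TAG = re.compile(r"^\[([a-z]+):\s*(.*)\]$")


def parse_lrc(text):
    """Return (metadata, lines); lines are (seconds, text) tuples sorted by time."""
    metadata, lines = {}, []
    for raw in text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        match = TIMESTAMP.match(raw)
        if match:
            minutes, seconds, centis = (int(g) for g in match.groups())
            start = minutes * 60 + seconds + centis / 100
            lines.append((start, raw[match.end():].strip()))
            continue
        tag = TAG.match(raw)
        if tag:
            metadata[tag.group(1)] = tag.group(2).strip()
    lines.sort(key=lambda item: item[0])
    return metadata, lines
```

Lines that match neither pattern are skipped, so the same function can double as a loose validity check: a non-empty file that yields no timed lines is probably not valid synced lyrics.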
Admin Interface
Django Admin provides:
Lyrics Admin
- List view with filters (source, language, fetch status)
- Search by audio title/channel/youtube_id
- Edit synced/plain lyrics
- View fetch attempts and errors
LyricsCache Admin
- List view with filters (source, not_found, date)
- Search by title/artist
- View access count statistics
- Bulk action: Clear not_found entries
Rate Limiting
To avoid overwhelming the LRCLIB API:
- Request delays: 1-2 second delays between batch requests
- Caching: 7-day cache for successful fetches, 1-day for not_found
- Max attempts: Stop after 3-5 failed attempts
- Retry backoff: Wait 7+ days before retrying failed fetches
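The attempt limit and retry backoff can be combined into a single eligibility check. The sketch below assumes a limit of 3 attempts and a 7-day wait, matching the values used elsewhere in this document; `should_attempt_fetch` is a hypothetical helper, and how the real tasks combine these rules is not specified here.

```python
from datetime import datetime, timedelta


def should_attempt_fetch(fetch_attempts, last_attempt, now,
                         max_attempts=3, retry_after_days=7):
    """Decide whether a track is eligible for another lyrics fetch."""
    if fetch_attempts == 0:
        return True   # never tried: fetch right away
    if fetch_attempts >= max_attempts:
        return False  # give up after repeated failures
    # Otherwise wait out the backoff window before retrying.
    return now - last_attempt >= timedelta(days=retry_after_days)
```

For example, a track that failed once three days ago is skipped, while one that failed once two weeks ago is retried.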
Troubleshooting
No lyrics found
- Check if track metadata (title, artist) is accurate
- Try manual fetch with force=true
- Check that the LRCLIB database has lyrics for this track
- Verify track isn't instrumental
Sync issues
- Ensure audio duration matches lyrics timing
- Check LRC format is valid (use validator)
- Verify the currentTime prop is updated correctly
Performance
- Monitor cache hit rate: /api/audio/lyrics-cache/stats/
- Clear old not_found entries regularly
- Adjust Celery beat schedule if needed
Credits
- LRCLIB API: https://lrclib.net/
- LRC Format: https://en.wikipedia.org/wiki/LRC_(file_format)
- Inspiration: lrcget project by tranxuanthang
License
This feature is part of SoundWave and follows the same MIT license.