🎤
Scribe v2 (STT)
90+ languages, word-level timestamps, speaker diarization (up to 32 speakers), entity detection. Batch and real-time WebSocket modes.
🎵
Music Generation
Text-to-music with genre, style, and structure control. Vocals in multiple languages. Section-level editing. Up to 5 minutes.
🔊
Sound Effects
Text-to-sound-effects generation: describe what you need in natural language. Royalty-free MP3 or WAV output.
🌍
Dubbing
Automatic video/audio dubbing in 29 languages. Preserves the original speaker's voice. Supports MP4, MOV, WAV, and MP3.
🔇
Voice Isolator
Removes background noise and reverb. Accepts files up to 500 MB and 1 hour long. WAV, MP3, FLAC, OGG, and AAC inputs.
🔄
Voice Changer
Speech-to-speech voice transformation. Apply any voice to existing audio while preserving content and timing.
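
The Voice Isolator limits listed above (file size, duration, input formats) are easy to check locally before an upload is attempted. The following is a minimal sketch of such a pre-upload check, not part of any official SDK. The function name `check_isolator_input` and the constants are hypothetical, and two assumptions are baked in: that "500MB" means decimal megabytes, and that the size and duration limits both apply.

```python
from pathlib import Path

# Limits copied from the Voice Isolator card; all names here are hypothetical.
MAX_BYTES = 500 * 1000 * 1000   # assumption: "500MB" means decimal megabytes
MAX_SECONDS = 60 * 60           # 1 hour
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".aac"}


def check_isolator_input(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file fits the stated limits."""
    problems = []
    # Compare extensions case-insensitively so "take.WAV" is accepted.
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 500 MB")
    if duration_seconds > MAX_SECONDS:
        problems.append("audio longer than 1 hour")
    return problems
```

The caller supplies the size and duration (for example from `os.path.getsize` and an audio metadata reader), so the check itself stays free of third-party dependencies. Returning every problem at once, rather than failing on the first, lets a UI report them all in one pass.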