cjwbw/seamless_communication

119.4K runs · Sep 2023 · Cog 0.8.3

speech-to-text speech-translation text-to-speech

Performance

Typical run time: 5.1s

Cold start (first call): ~85s

Total runs: 119.4K

About

SeamlessM4T—Massively Multilingual & Multimodal Machine Translation

Example Output

{"text_output":"Le modèle M4T sans faille de MetaAI démocratise la communication parlée à travers les barrières linguistiques.","audio_output":"https://pbxt.replicate.delivery/HSuWXcQA0PbWIBQ3renLWSxbFBKcsCQojUtAQK6eys5038jRA/out.wav"}
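The example output is a JSON object with a translated text field and a link to the generated audio. A minimal sketch of reading it in Python follows, using only the standard library. The helper names parse_output and download_audio are hypothetical and not part of the model or of Replicate's tooling. Downloading needs network access and is not run here.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve

# Example output copied verbatim from this page.
EXAMPLE_OUTPUT = (
    '{"text_output":"Le modèle M4T sans faille de MetaAI démocratise la '
    'communication parlée à travers les barrières linguistiques.",'
    '"audio_output":"https://pbxt.replicate.delivery/'
    'HSuWXcQA0PbWIBQ3renLWSxbFBKcsCQojUtAQK6eys5038jRA/out.wav"}'
)


def parse_output(raw):
    """Return (text, audio_url, audio_filename) from a JSON output string."""
    data = json.loads(raw)
    audio_url = data.get("audio_output")
    # Take the last path segment of the URL as a local file name.
    filename = PurePosixPath(urlparse(audio_url).path).name if audio_url else None
    return data.get("text_output"), audio_url, filename


def download_audio(audio_url, dest):
    """Save the generated audio to dest (hypothetical helper, needs network)."""
    urlretrieve(audio_url, dest)
    return dest


text, url, name = parse_output(EXAMPLE_OUTPUT)
print(text)
print(name)
```

The sketch treats audio_output as optional, on the assumption that text-only tasks may not return audio; this page only shows an S2ST result.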

Performance Metrics

5.10s Prediction Time

84.66s Total Time
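The two figures above can be compared directly. The sketch below assumes that Total Time covers everything from request to result (queueing and cold start included), while Prediction Time covers only the model run. That reading is consistent with the ~85s cold start listed under Performance, but it is an assumption rather than something this page states.

```python
# Figures copied from the Performance Metrics above.
PREDICTION_TIME_S = 5.10
TOTAL_TIME_S = 84.66

# Time spent outside the model run itself (assumed: queueing plus cold start).
overhead_s = round(TOTAL_TIME_S - PREDICTION_TIME_S, 2)
overhead_share = round(overhead_s / TOTAL_TIME_S * 100, 1)

print(f"overhead: {overhead_s:.2f}s ({overhead_share:.1f}% of total)")
```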

All Input Parameters

{
  "task_name": "S2ST (Speech to Speech translation)",
  "input_audio": "https://replicate.delivery/pbxt/JWSAJpKxUszI0scNYatExIXZX2rJ78UBilGXCTq4Ct9BDwTA/sample_input_2.mp3",
  "input_text_language": "None",
  "max_input_audio_length": 60,
  "target_language_text_only": "Norwegian Nynorsk",
  "target_language_with_speech": "French"
}

Input Parameters

task_name (default: S2ST (Speech to Speech translation)): The task to run.
input_text (type: string): Input text for tasks with text input: T2ST and T2TT.
input_audio (type: string): Input audio file for tasks with speech input: S2ST, S2TT and ASR.
input_text_language (default: None): Language of input_text, for T2ST and T2TT.
max_input_audio_length (type: number, default: 60): Maximum input audio length.
target_language_text_only (default: Norwegian Nynorsk): Target language for tasks with text output only: S2TT, T2TT and ASR.
target_language_with_speech (default: French): Target language for tasks with speech output: S2ST and T2ST. Fewer languages are available for speech output than for text output.
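The parameters above depend on the task: speech-input tasks take input_audio, text-input tasks take input_text and input_text_language, and the target language goes into one of two fields depending on whether the task produces speech. A minimal sketch of assembling the input and calling the model follows. build_input and run_prediction are hypothetical helpers. The call uses replicate.run from the Replicate Python client (a third-party package that expects a REPLICATE_API_TOKEN environment variable) and the version ID listed under Version Details. Only the S2ST task label appears in full on this page, so the sketch passes task_name through unchanged and reads just its leading abbreviation.

```python
# Model reference built from the owner/name and version ID on this page.
MODEL_REF = (
    "cjwbw/seamless_communication:"
    "668a4fec05a887143e5fe8d45df25ec4c794dd43169b9a11562309b2d45873b0"
)

SPEECH_INPUT_TASKS = {"S2ST", "S2TT", "ASR"}
TEXT_INPUT_TASKS = {"T2ST", "T2TT"}
SPEECH_OUTPUT_TASKS = {"S2ST", "T2ST"}


def build_input(task_name, target_language, *, input_audio=None,
                input_text=None, input_text_language=None,
                max_input_audio_length=60):
    """Assemble the input dict for one task (hypothetical helper)."""
    # "S2ST (Speech to Speech translation)" -> "S2ST"
    task = task_name.split()[0]
    if task not in SPEECH_INPUT_TASKS | TEXT_INPUT_TASKS:
        raise ValueError(f"unknown task: {task_name!r}")

    payload = {"task_name": task_name}
    if task in SPEECH_INPUT_TASKS:
        if not input_audio:
            raise ValueError(f"{task} needs input_audio")
        payload["input_audio"] = input_audio
        payload["max_input_audio_length"] = max_input_audio_length
    else:
        if not input_text or not input_text_language:
            raise ValueError(f"{task} needs input_text and input_text_language")
        payload["input_text"] = input_text
        payload["input_text_language"] = input_text_language

    # Speech-output tasks and text-only tasks use different target fields.
    if task in SPEECH_OUTPUT_TASKS:
        payload["target_language_with_speech"] = target_language
    else:
        payload["target_language_text_only"] = target_language
    return payload


def run_prediction(payload):
    """Send the payload to Replicate (needs network access and an API token)."""
    import replicate  # third-party: pip install replicate
    return replicate.run(MODEL_REF, input=payload)


payload = build_input(
    "S2ST (Speech to Speech translation)",
    "French",
    input_audio="https://replicate.delivery/pbxt/JWSAJpKxUszI0scNYatExIXZX2rJ78UBilGXCTq4Ct9BDwTA/sample_input_2.mp3",
)
print(payload)
```

Validating locally before calling run_prediction(payload) avoids paying a cold start for a request that is missing its input.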

Output Schema

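An object with two fields, as in the example output above: text_output (string) and audio_output (URL of a WAV file).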

Version Details

Version ID: 668a4fec05a887143e5fe8d45df25ec4c794dd43169b9a11562309b2d45873b0
Version Created: September 13, 2023
