lucataco/pheme
About
Pheme generates a variety of conversational voices at 16 kHz for phone-call applications.

Example Output
Prompt:
"I gotta say, I would never expect that to happen!"
Output
Performance Metrics
- Prediction Time: 5.90s
- Total Time: 5.92s
All Input Parameters
{ "top_k": 210, "voice": "POD0000004393_S0000029", "prompt": "I gotta say, I would never expect that to happen!", "temperature": 0.7 }
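The parameters above can be sent to the model programmatically. The following is a minimal sketch, assuming the `replicate` Python client is installed and a `REPLICATE_API_TOKEN` environment variable is set. The helper names `build_input` and `synthesize` are hypothetical, and the keyword defaults are copied from the example request on this page, not from the model's documented defaults. The version hash is the Version ID listed under Version Details.

```python
import os

# Model reference built from the owner/name and the Version ID on this page.
MODEL_REF = (
    "lucataco/pheme:"
    "f307b9d2b9966608aec791d4a741dc3806f95d9eb92300fdcefeb9aecd4594cd"
)


def build_input(prompt, voice="POD0000004393_S0000029", top_k=210, temperature=0.7):
    """Return an input payload shaped like the example request.

    The defaults mirror the example values on this page (an assumption,
    not the model's documented defaults).
    """
    return {
        "top_k": top_k,
        "voice": voice,
        "prompt": prompt,
        "temperature": temperature,
    }


def synthesize(prompt, **overrides):
    """Run the model on Replicate. This makes a network call."""
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(MODEL_REF, input=build_input(prompt, **overrides))


if __name__ == "__main__" and os.environ.get("REPLICATE_API_TOKEN"):
    # Only runs when a token is configured.
    print(synthesize("I gotta say, I would never expect that to happen!"))
```

The shape of the returned value is not documented on this page, so the sketch only prints whatever the client returns.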
Input Parameters
- top_k: Top k
- voice: Voice to use
- prompt: Input text
- temperature: Temperature
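The page does not describe how `top_k` and `temperature` are applied internally. As a generic illustration of what parameters with these names conventionally control in token sampling, here is a small sketch; it is not taken from the Pheme code, and the function name `sample_top_k` is hypothetical.

```python
import math
import random


def sample_top_k(logits, top_k, temperature, rng=random):
    """Sample one index from `logits` with a top-k cutoff and temperature scaling.

    Generic illustration only; not the model's actual sampling code.
    """
    if top_k <= 0 or temperature <= 0:
        raise ValueError("top_k and temperature must be positive")
    # Rank indices by logit, highest first, and keep the top_k candidates.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k]
    # Temperature-scaled softmax weights (subtract the max for numerical stability).
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 0.5, 1.5, -1.0]
    print(sample_top_k(toy_logits, top_k=2, temperature=0.7))
```

In this sketch a smaller `top_k` restricts sampling to fewer high-scoring candidates, and a lower `temperature` concentrates probability on the highest-scoring ones.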
Output Schema
Output
Example Execution Logs
Lightning automatically upgraded your loaded checkpoint from v1.2.7 to v2.1.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint checkpoints/models--pyannote--embedding/snapshots/c6335d8f1cd77b30084387468a6cf26fea90009b/pytorch_model.bin`
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.8.1+cu102, yours is 2.1.1+cu121. Bad things might happen unless you revert torch to 1.x.
Lightning automatically upgraded your loaded checkpoint from v1.2.7 to v2.1.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint checkpoints/models--pyannote--embedding/snapshots/c6335d8f1cd77b30084387468a6cf26fea90009b/pytorch_model.bin`
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.8.1+cu102, yours is 2.1.1+cu121. Bad things might happen unless you revert torch to 1.x.
decoder_input_ids: tensor([[ 1, 3, 387, 387, 301, 492, 376, 158, 127, 660, 629, 261, 27, 751, 682, 930, 1005, 1005, 563, 957, 601, 530, 530, 484, 484, 779, 210, 210, 927, 816, 79, 794, 109, 316, 316, 316, 115, 21, 836, 603, 603, 283, 826, 253, 107, 276, 276, 276, 276, 276, 276, 195, 394, 350, 82, 82, 82, 82, 570, 570, 570, 996, 996, 996, 996, 996, 820, 820, 313, 228, 228, 79, 79, 794, 994, 19, 64, 100, 100, 169, 669, 883, 883, 102, 102, 722, 942, 942, 930, 235, 235, 563, 889, 889, 785, 288, 288, 288, 288, 288, 957, 601, 740, 740, 172, 172, 497, 318, 318, 1032, 1032, 567, 567, 614, 200, 629, 229, 229, 985, 932, 232, 187, 209, 209, 488, 488, 488, 488, 512, 512, 512, 512, 790, 790, 800, 630, 836, 722, 1000, 40, 40, 55, 78, 356, 105, 105, 105, 105, 105, 342, 139, 248, 832, 832, 321, 929, 275, 275, 185, 185, 333, 333, 246, 242, 494, 126, 231, 231, 434, 646, 832, 832, 565, 565, 721, 146, 146, 512, 512, 685, 800, 630, 289, 289, 937, 792, 950, 950, 125, 125, 61, 61, 811, 235, 235, 621, 278, 452, 598, 320, 110, 840, 64, 579, 803, 696, 503, 603, 603, 283, 283, 283, 283, 283, 857, 87, 87, 43, 229, 859, 931, 931, 356, 356, 761, 761, 761, 244, 325, 463, 463, 463, 463, 223, 223, 596, 596, 502, 502, 256, 256, 834, 616, 223, 596, 596, 550, 333, 772, 996, 820, 531, 678, 147, 546, 653, 626, 721, 1024, 1024, 925, 623, 437, 302, 1032, 605, 144, 323, 323, 323, 682, 930, 235, 726, 200, 1006, 392, 96, 96, 120, 534, 564, 564, 282, 282, 626, 764, 764, 364, 624, 111, 805, 396, 396, 850, 237, 436, 432, 432, 676, 676, 676, 415, 415, 388, 388, 764, 764, 764, 412, 412, 412, 233, 233, 499, 499, 61, 61, 759, 841, 441, 836, 269, 603, 1000, 253, 253, 877, 877, 877, 482, 482, 315, 895, 731, 731, 153, 457, 937, 962, 462, 779, 210, 887, 887, 179, 229, 850, 494, 126, 875, 231, 605, 749, 1033, 969, 969, 409, 409, 648, 175, 296, 296, 869, 692, 692, 455, 168, 997, 354, 354, 585, 241, 241, 492, 492]], device='cuda:0')
semantic_tokens: 1.3835923671722412
acoustic_tokens: 0.520693302154541
vocoder time: 0.03781533241271973
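The three stage timings at the end of the logs can be compared with the reported prediction time. The sketch below parses them and sums them, assuming the logged values are in seconds (the logs do not state a unit). The names `LOG_TAIL` and `parse_stage_times` are hypothetical.

```python
import re

# Stage timings copied from the end of the example logs.
LOG_TAIL = (
    "semantic_tokens: 1.3835923671722412 "
    "acoustic_tokens: 0.520693302154541 "
    "vocoder time: 0.03781533241271973"
)


def parse_stage_times(log_text):
    """Extract the three logged stage timings as a name -> float mapping."""
    pattern = r"(semantic_tokens|acoustic_tokens|vocoder time):\s*([0-9]*\.?[0-9]+)"
    return {name: float(value) for name, value in re.findall(pattern, log_text)}


stages = parse_stage_times(LOG_TAIL)
logged_total = sum(stages.values())

# Prediction time reported under Performance Metrics.
prediction_time = 5.90
unaccounted = prediction_time - logged_total

if __name__ == "__main__":
    print(f"logged stages: {logged_total:.2f}s")
    print(f"not covered by logged stages: {unaccounted:.2f}s")
```

Under that assumption the logged stages add up to about 1.94 s of the 5.90 s prediction time. The logs do not say what the remaining time covers.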
Version Details
- Version ID: f307b9d2b9966608aec791d4a741dc3806f95d9eb92300fdcefeb9aecd4594cd
- Version Created: January 12, 2024