adymaharana/story-dalle
About
A model trained for the task of story visualization: generating images to pair with the captions in a story.

Example Output
Output

Performance Metrics
- Prediction Time: 17.65s
- Total Time: 17.69s
All Input Parameters
{
  "top_k": 32,
  "top_p": 0.2,
  "source": "Pororo",
  "caption_1": "Pororo is in a party.",
  "caption_2": "Pororo is singing a song on the stage in the party",
  "caption_3": "Poby is cheering in the audience",
  "caption_4": "Crong is dancing in the party",
  "n_candidates": 4
}
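As a rough illustration of how such an input could be assembled programmatically, here is a minimal sketch. The parameter names are the ones shown in the example input above; the helper name `build_story_input` and its validation rules (exactly four captions, `top_p` in (0, 1]) are assumptions made for this sketch, not something the model page specifies.

```python
def build_story_input(source, captions, top_k=32, top_p=0.2, n_candidates=4):
    """Assemble an input dict using the parameter names listed on this page.

    Hypothetical helper: the checks below are assumptions, not documented rules.
    """
    # Assumption: exactly four captions, since the page lists caption_1..caption_4.
    if len(captions) != 4:
        raise ValueError("expected exactly 4 captions")
    # Assumption: top_p is a probability mass in (0, 1].
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p should be in (0, 1]")

    payload = {"top_k": top_k, "top_p": top_p, "source": source}
    # Number the captions caption_1 .. caption_4 in story order.
    for i, caption in enumerate(captions, start=1):
        payload[f"caption_{i}"] = caption
    payload["n_candidates"] = n_candidates
    return payload


example = build_story_input(
    "Pororo",
    [
        "Pororo is in a party.",
        "Pororo is singing a song on the stage in the party",
        "Poby is cheering in the audience",
        "Crong is dancing in the party",
    ],
)
```

The resulting `example` dict has the same keys and values as the example input shown above.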
Input Parameters
- top_k: The number of highest-probability vocabulary tokens to keep for top-k filtering.
- top_p: Only the smallest set of most probable tokens whose probabilities add up to `top_p` or higher is kept for generation.
- source: The main character of your story.
- caption_1: First scene in your story.
- caption_2: Second scene in your story.
- caption_3: Third scene in your story.
- caption_4: Final scene in your story.
- n_candidates: Number of candidates to generate for each story panel.
- supercondition: Set `supercondition` to True to enable generation using a null hypothesis.
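To make the `top_k` and `top_p` descriptions concrete, here is a minimal pure-Python sketch of top-k filtering followed by top-p (nucleus) filtering. It illustrates the general technique only; the function name `top_k_top_p_filter` is hypothetical, and whether this model applies the two filters in exactly this order is an assumption.

```python
import math


def top_k_top_p_filter(logits, top_k=0, top_p=1.0):
    """Return the indices of tokens that survive top-k, then top-p, filtering."""
    # Rank token indices from highest to lowest logit.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)

    # Top-k: keep only the k highest-scoring tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Softmax over the surviving logits (subtract the max for numerical stability).
    m = logits[ranked[0]]
    exps = [math.exp(logits[i] - m) for i in ranked]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in zip(ranked, probs):
        kept.append(i)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept


# Four-token toy vocabulary; a low top_p keeps very few candidates.
narrow = top_k_top_p_filter([2.0, 1.0, 0.0, -1.0], top_k=3, top_p=0.2)
wide = top_k_top_p_filter([2.0, 1.0, 0.0, -1.0], top_k=3, top_p=0.9)
```

In this toy case the most probable token alone already exceeds a `top_p` of 0.2, so `narrow` holds a single index, while `wide` holds two. Lower `top_p` values therefore make sampling more conservative.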
Output Schema
Output
Example Execution Logs
['Pororo is in a party.', 'Pororo is singing a song on the stage in the party.', 'Poby is cheering in the audience.', 'Crong is dancing in the party.']
[1, 1, 1, 1]
Pororo
4
Pororo is in a party.
Pororo is singing a song on the stage in the party.
Poby is cheering in the audience.
Crong is dancing in the party.
['Pororo is in a party.', 'Pororo is singing a song on the stage in a party.', 'Poby is cheering in the audience.', 'Crong is dancing in a party.']
torch.Size([16, 64])
torch.Size([16, 256])
torch.Size([16, 1, 1536])
torch.Size([16, 1])
0%| | 0/256 [00:00<?, ?it/s]
100%|██████████| 256/256 [00:15<00:00, 16.13it/s]
torch.Size([16, 16, 16])
torch.Size([16, 3, 256, 256])
tensor([[[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]],
        [[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]],
        [[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]]])
torch.Size([3, 1124, 1064])
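The numbers in these logs can be cross-checked with a little arithmetic. The sketch below is one reading of the logs, not something the page states: it assumes the leading dimension of 16 is 4 captions times 4 candidates, and that the 256 progress-bar steps correspond to the 16 by 16 token grid reported just before the 256 by 256 images.

```python
# Values copied from the example input and logs on this page.
num_captions = 4
n_candidates = 4
grid_side = 16            # from torch.Size([16, 16, 16])
sampling_steps = 256      # from the progress bar total
steps_per_second = 16.13  # final rate reported by the progress bar

# Assumption: one generated image per (caption, candidate) pair.
batch_size = num_captions * n_candidates

# Assumption: one sampling step per token in the grid.
tokens_per_image = grid_side * grid_side

# Approximate time spent in the sampling loop.
sampling_seconds = sampling_steps / steps_per_second

print(batch_size, tokens_per_image, round(sampling_seconds, 2))
```

Under these assumptions the sampling loop accounts for roughly 15.9 of the 17.65 seconds of prediction time reported above.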
Version Details
- Version ID: f74d40125d71ddd5885020201e638b6b270347bb606a6afbc61edc7b077bfb7b
- Version Created: November 23, 2022