dribnet/clipit

▶️ 6.7K runs 📅 Sep 2021 ⚙️ Cog 0.1.3+shimmed 🔗 GitHub ⚖️ License
clip customizable-quality image-generation iterative-display text-to-image vqgan

About

Image generation with CLIP + VQGAN / PixelDraw

All Input Parameters
{
  "aspect": "widescreen",
  "prompts": "sunset river snow mountain",
  "quality": "draft"
}
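The input above could be assembled and checked before submission with a small helper. This is only a sketch: `build_input` and `ALLOWED` are hypothetical names, the allowed values are inferred from the parameter descriptions on this page, and the commented-out call assumes the `replicate` Python client is installed with an API token configured.

```python
import json

# Allowed values inferred from the parameter descriptions on this page (assumption).
ALLOWED = {
    "aspect": {"widescreen", "square"},
    "quality": {"draft", "normal", "better", "best"},
}


def build_input(prompts, aspect="widescreen", quality="normal", display_every=20):
    """Return an input dict for the model, validating the enumerated fields."""
    if not isinstance(prompts, str) or not prompts.strip():
        raise ValueError("prompts must be a non-empty string")
    if aspect not in ALLOWED["aspect"]:
        raise ValueError(f"aspect must be one of {sorted(ALLOWED['aspect'])}")
    if quality not in ALLOWED["quality"]:
        raise ValueError(f"quality must be one of {sorted(ALLOWED['quality'])}")
    if not isinstance(display_every, int) or display_every < 1:
        raise ValueError("display_every must be a positive integer")
    return {
        "prompts": prompts,
        "aspect": aspect,
        "quality": quality,
        "display_every": display_every,
    }


payload = build_input("sunset river snow mountain", quality="draft")
print(json.dumps(payload, indent=2))

# Submitting the payload (assumes `pip install replicate` and REPLICATE_API_TOKEN set):
#   import replicate
#   output = replicate.run(
#       "dribnet/clipit:fce706432e4003efc9e6a62c60631a90fb87a1aa121f8396f2f602a1c46e3676",
#       input=payload,
#   )
```

Validating locally catches a mistyped `quality` or `aspect` before a run is started.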
Input Parameters
aspect (Type: string, Default: widescreen)
Aspect of the output image: widescreen or square.
prompts (Type: string, Default: sunset in the city)
Text prompts describing the image to generate.
quality (Type: string, Default: normal)
Quality preset: draft, normal, better, or best.
display_every (Type: integer, Default: 20)
Number of iterations between displayed images. For reference, the total number of iterations is determined by the quality chosen above: draft=200, normal=300, better=500, best=500.
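The relationship between quality and display_every can be worked through numerically. This is a minimal sketch, assuming one image is shown every `display_every` iterations (the page does not say whether iteration 0 or the final iteration is also shown). `ITERATIONS` and `display_count` are illustrative names.

```python
# Total iterations per quality preset, as listed in the display_every description.
ITERATIONS = {"draft": 200, "normal": 300, "better": 500, "best": 500}


def display_count(quality, display_every=20):
    """Approximate number of intermediate images shown during a run.

    Assumes one image per `display_every` iterations; this is an
    assumption and is not confirmed by the page.
    """
    if display_every < 1:
        raise ValueError("display_every must be a positive integer")
    return ITERATIONS[quality] // display_every


for quality, total in ITERATIONS.items():
    print(f"{quality}: {total} iterations, about {display_count(quality)} displayed images")
```

Under that assumption, a draft run with the default display_every of 20 would show about 10 intermediate images.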
Output Schema

Type: array, Items Type: object

Example Execution Logs
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /root/.cache/torch/hub/checkpoints/vgg16-397923af.pth

100%|##########| 528M/528M [00:02<00:00, 205MB/s]
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips/vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from models/vqgan_imagenet_f16_16384.ckpt

100%|########################################| 338M/338M [00:01<00:00, 203MiB/s]
Using device:
cuda:0
Optimising using:
Adam
Using text prompts:
['sunset river snow mountain']
Using seed:
14723532543356836858

/root/.pyenv/versions/3.8.12/lib/python3.8/site-packages/torch/nn/functional.py:3451: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  warnings.warn(
iter: 0, loss: 0.952009, losses: 0.952009

iter: 10, loss: 0.872961, losses: 0.872961

iter: 20, loss: 0.816933, losses: 0.816933

iter: 30, loss: 0.793794, losses: 0.793794

iter: 40, loss: 0.788617, losses: 0.788617

iter: 50, loss: 0.800087, losses: 0.800087

iter: 60, loss: 0.798107, losses: 0.798107

iter: 70, loss: 0.797794, losses: 0.797794

iter: 80, loss: 0.784545, losses: 0.784545

iter: 90, loss: 0.784368, losses: 0.784368

iter: 100, loss: 0.778032, losses: 0.778032

iter: 110, loss: 0.766626, losses: 0.766626

iter: 120, loss: 0.776433, losses: 0.776433

iter: 130, loss: 0.783198, losses: 0.783198

iter: 140, loss: 0.770599, losses: 0.770599

iter: 150, loss: 0.75432, losses: 0.75432

iter: 160, loss: 0.766017, losses: 0.766017

iter: 170, loss: 0.769404, losses: 0.769404

iter: 180, loss: 0.772921, losses: 0.772921

iter: 190, loss: 0.749382, losses: 0.749382

iter: 200, loss: 0.770398, losses: 0.770398

200it [01:00,  3.32it/s]
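The `iter: N, loss: X` lines in the log above can be extracted to see how the loss moves over a run. This is a minimal sketch using only the standard library. `parse_losses` is an illustrative helper, and the sample lines are copied from the log above.

```python
import re

# Matches lines such as "iter: 10, loss: 0.872961, losses: 0.872961".
LOSS_LINE = re.compile(r"^iter: (\d+), loss: ([0-9.]+)")


def parse_losses(log_text):
    """Return a list of (iteration, loss) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = LOSS_LINE.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs


sample = """\
iter: 0, loss: 0.952009, losses: 0.952009
10it [00:02,  4.78it/s]
iter: 10, loss: 0.872961, losses: 0.872961
iter: 190, loss: 0.749382, losses: 0.749382
iter: 200, loss: 0.770398, losses: 0.770398
"""

losses = parse_losses(sample)
best_iter, best_loss = min(losses, key=lambda pair: pair[1])
print(f"parsed {len(losses)} entries; lowest loss {best_loss} at iter {best_iter}")
```

In this run the lowest logged loss (0.749382) occurs at iteration 190 rather than at the final iteration, so the logged loss fluctuates instead of decreasing monotonically.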
Version Details
Version ID
fce706432e4003efc9e6a62c60631a90fb87a1aa121f8396f2f602a1c46e3676
Version Created
September 28, 2021