yoadtew/arithmetic 🖼️🔢📝 → ❓

▶️ 94 runs 📅 Feb 2022 ⚙️ Cog 0.1.3+shimmed
image-captioning image-to-text visual-understanding

About

Example Output

Performance Metrics

26.75s Prediction Time
26.96s Total Time
All Input Parameters
{
  "image1": "https://replicate.delivery/mgxm/ae9ba19a-3ef0-4724-b6c5-715a0be39692/eiffle.jpg",
  "image2": "https://replicate.delivery/mgxm/9e0c882e-3c1a-40b6-b231-d032a10b5ebd/usa.jpg",
  "image3": "https://replicate.delivery/mgxm/2c259b12-8940-4502-8206-9af24e663ee2/capitol.jpg",
  "beam_size": "5",
  "cond_text": "Image of",
  "end_factors": 1.06,
  "ce_loss_scale": 0.2,
  "max_seq_lengths": "4"
}
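The example inputs above pass `beam_size` and `max_seq_lengths` as strings ("5", "4"), while the Input Parameters section lists both as integers with fixed ranges. A minimal sketch of normalizing such a payload before submitting it is shown here. The `PARAM_SPECS` table and the `normalize_inputs` helper are hypothetical names introduced for illustration; only the types and ranges are taken from this page.

```python
# Types and ranges transcribed from the Input Parameters section of this page.
# The checking logic is an illustrative assumption, not part of the model.
PARAM_SPECS = {
    "beam_size": (int, 1, 10),
    "end_factors": (float, 1.0, 1.1),
    "ce_loss_scale": (float, 0.0, 0.6),
    "max_seq_lengths": (int, 1, 20),
}


def normalize_inputs(raw):
    """Return a copy of raw with numeric fields coerced and range-checked."""
    cleaned = dict(raw)
    for name, (cast, low, high) in PARAM_SPECS.items():
        if name not in cleaned:
            continue  # leave unset parameters to the model's defaults
        value = cast(cleaned[name])  # e.g. "5" becomes 5
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} - {high}")
        cleaned[name] = value
    return cleaned


example = {
    "image1": "https://replicate.delivery/mgxm/ae9ba19a-3ef0-4724-b6c5-715a0be39692/eiffle.jpg",
    "image2": "https://replicate.delivery/mgxm/9e0c882e-3c1a-40b6-b231-d032a10b5ebd/usa.jpg",
    "image3": "https://replicate.delivery/mgxm/2c259b12-8940-4502-8206-9af24e663ee2/capitol.jpg",
    "beam_size": "5",
    "cond_text": "Image of",
    "end_factors": 1.06,
    "ce_loss_scale": 0.2,
    "max_seq_lengths": "4",
}

normalized = normalize_inputs(example)
print(normalized["beam_size"], normalized["max_seq_lengths"])
```

String fields such as the image URLs and `cond_text` pass through unchanged; only the four numeric parameters are touched.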
Input Parameters
image1 Type: string
Final result will be: image1 + (image2 - image3)
image2 Type: string
Final result will be: image1 + (image2 - image3)
image3 Type: string
Final result will be: image1 + (image2 - image3)
beam_size Type: integer, Default: 3, Range: 1 - 10
Number of beams to use
cond_text Type: string, Default: Image of a
Conditional text that the generated caption continues
end_factors Type: number, Default: 1.06, Range: 1 - 1.1
Higher value for shorter captions
ce_loss_scale Type: number, Default: 0.2, Range: 0 - 0.6
Scale of cross-entropy loss with un-shifted language model
max_seq_lengths Type: integer, Default: 3, Range: 1 - 20
Maximum number of tokens to generate
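The expression image1 + (image2 - image3) can be illustrated with a small sketch. It assumes the arithmetic is applied element-wise to one embedding vector per image rather than to raw pixels; the page does not state which, and the three-dimensional vectors below are made-up placeholders, not real model embeddings. The `combine` helper is a hypothetical name.

```python
def combine(e1, e2, e3):
    """Element-wise e1 + (e2 - e3) for three equal-length vectors."""
    if not (len(e1) == len(e2) == len(e3)):
        raise ValueError("all three vectors must have the same length")
    return [a + (b - c) for a, b, c in zip(e1, e2, e3)]


# Hypothetical embeddings standing in for image1, image2 and image3.
image1 = [1.0, 0.0, 2.0]
image2 = [0.5, 1.0, 0.0]
image3 = [0.5, 0.0, 1.0]

result = combine(image1, image2, image3)
print(result)  # [1.0, 1.0, 1.0]
```

Read this way, image2 - image3 is a direction (what image2 has that image3 lacks), and adding it to image1 shifts image1 along that direction before a caption is generated.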
Output Schema

Type: array, Items Type: object

Example Execution Logs
07/02/2022 20:40:53 | [' French %% -1.5718586', ' the %% -1.7321693', ' France %% -2.646708', ' World %% -3.665781', ' world %% -4.00222']
07/02/2022 20:41:05 | [' France. %% -2.2273226', ' French. %% -2.5530052', ' the world %% -2.7489004', ' world. %% -2.9033055', ' the French %% -2.97674']
07/02/2022 20:41:16 | [' France.! %% -2.2273226', ' the world. %% -2.4138563', ' French.! %% -2.5530052', ' world.! %% -2.9033055', ' the French. %% -2.954996']
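Each log line above has the shape `timestamp | [list of 'candidate %% score' strings]`. The sketch below parses one such line into a timestamp and (text, score) pairs. Two points are assumptions: the date is read as day/month/year, which matches the February 7, 2022 version date on this page, and a higher (less negative) score is treated as the better candidate, which matches the descending order in the logs. `parse_log_line` is a hypothetical helper name.

```python
import ast
from datetime import datetime


def parse_log_line(line):
    """Split a log line into its timestamp and a list of (text, score) pairs."""
    stamp_text, _, payload = line.partition(" | ")
    # Assumed day/month/year ordering, consistent with the version date.
    stamp = datetime.strptime(stamp_text, "%d/%m/%Y %H:%M:%S")
    candidates = []
    # The payload is a Python-style list literal of strings.
    for entry in ast.literal_eval(payload):
        text, _, score = entry.rpartition(" %% ")
        candidates.append((text.strip(), float(score)))
    return stamp, candidates


line = (
    "07/02/2022 20:40:53 | [' French %% -1.5718586', ' the %% -1.7321693', "
    "' France %% -2.646708', ' World %% -3.665781', ' world %% -4.00222']"
)

stamp, candidates = parse_log_line(line)
best = max(candidates, key=lambda pair: pair[1])
print(stamp.isoformat(), best)
```

Using `rpartition` keeps the split correct even if a candidate caption were to contain the `%%` separator itself.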
Version Details
Version ID
737d4915f4d8303b5f38cd3ecbbc310bda34b76044473de002f1e23494d86218
Version Created
February 7, 2022