j-min/clip-caption-reward 🖼️❓ → 📝

▶️ 296.1K runs 📅 May 2022 ⚙️ Cog 0.3.4
clip-reward fine-grained-captioning image-analysis image-captioning

About

Fine-grained Image Captioning with CLIP Reward

Example Output

a group of people riding their bikes on the busy street with a blue sign

Performance Metrics

5.85s Prediction Time
6.00s Total Time
All Input Parameters
{
  "image": "https://replicate.delivery/mgxm/d452ef45-ce7e-4f8d-a63f-350d624fc95a/COCO_val2014_000000462565.jpeg",
  "reward": "clips_grammar"
}
Input Parameters
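image (required) Type: string
Input image to caption, passed as a URL string as in the example input.
reward Default: clips_grammar
Reward criterion that selects which trained captioning configuration and checkpoint are loaded (the logged run loads clipRN50_clips_grammar).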
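
As a rough sketch of how these inputs might be sent programmatically, the snippet below only assembles a request body and does not send anything. The helper name `build_prediction_request` is hypothetical, and the endpoint URL and `Bearer` header mentioned in the comments are assumptions about Replicate's HTTP API rather than something stated on this page. The version ID and input values are copied from this page.

```python
import json

# Version ID copied from the "Version Details" section of this page.
MODEL_VERSION = "de37751f75135f7ebbe62548e27d6740d5155dfefdf6447db35c9865253d7e06"


def build_prediction_request(image_url, reward="clips_grammar"):
    """Hypothetical helper: assemble a JSON-serializable prediction request."""
    if not image_url:
        # "image" is the only required input parameter.
        raise ValueError("image is required")
    return {
        "version": MODEL_VERSION,
        "input": {"image": image_url, "reward": reward},
    }


request_body = build_prediction_request(
    "https://replicate.delivery/mgxm/d452ef45-ce7e-4f8d-a63f-350d624fc95a/COCO_val2014_000000462565.jpeg"
)
print(json.dumps(request_body, indent=2))

# Assumption: this body would be POSTed to https://api.replicate.com/v1/predictions
# with an "Authorization: Bearer <token>" header. Per the output schema on this
# page, the model's output is a single caption string.
```

Omitting `reward` falls back to the documented default, `clips_grammar`, so only the image URL has to be supplied.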
Output Schema

Output

Type: string

Example Execution Logs
Loading cfg from configs/phase2/clipRN50_clips_grammar.yml
Warning: key input_clipscore_vis_dir not in args
Warning: key use_multi_rewards not in args
Warning: key use_grammar not in args
Warning: key use_grammar_baseline not in args
Warning: key clip_load_path not in args
Warning: key N_enc not in args
Warning: key N_dec not in args
Warning: key d_model not in args
Warning: key d_ff not in args
Warning: key num_att_heads not in args
Warning: key dropout not in args
Warning: key REFORWARD not in args
Warning: key precision not in args
vocab size: 9487
transformer
Loading checkpoint from save/clipRN50_clips_grammar/clipRN50_clips_grammar-last.ckpt
<All keys matched successfully>
Version Details
Version ID
de37751f75135f7ebbe62548e27d6740d5155dfefdf6447db35c9865253d7e06
Version Created
May 31, 2022