j-min/clip-caption-reward 🖼️❓ → 📝
About
Fine-grained Image Captioning with CLIP Reward

Example Output
Output
a group of people riding their bikes on the busy street with a blue sign
Performance Metrics
- Prediction Time: 5.85s
- Total Time: 6.00s
All Input Parameters
{ "image": "https://replicate.delivery/mgxm/d452ef45-ce7e-4f8d-a63f-350d624fc95a/COCO_val2014_000000462565.jpeg", "reward": "clips_grammar" }
Input Parameters
- image (required): Input image.
- reward: Choose a reward criterion.
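As a rough illustration of how these two parameters might be passed to the model, here is a minimal Python sketch. It assumes the official `replicate` Python client is installed and that `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `caption` are hypothetical and not part of this model. The sketch also treats `reward` as optional, which is inferred only from the fact that `image` alone is marked required. The version ID is the one listed on this page.

```python
# Version ID as listed under "Version Details" on this page.
MODEL_VERSION = "de37751f75135f7ebbe62548e27d6740d5155dfefdf6447db35c9865253d7e06"
MODEL_REF = f"j-min/clip-caption-reward:{MODEL_VERSION}"


def build_input(image, reward=None):
    """Assemble the input dict (hypothetical helper).

    `image` is required; `reward` is assumed to be optional and is
    omitted from the payload when not given.
    """
    if not image:
        raise ValueError("'image' is required")
    payload = {"image": image}
    if reward is not None:
        payload["reward"] = reward
    return payload


def caption(image, reward=None):
    """Run the model through the Replicate client (hypothetical helper).

    Needs network access and a REPLICATE_API_TOKEN environment variable.
    """
    import replicate  # imported lazily so build_input works without the package

    return replicate.run(MODEL_REF, input=build_input(image, reward))
```

Calling `caption(image_url, reward="clips_grammar")` with the image URL from the example above would send the same input shown under "All Input Parameters"; the returned caption is not guaranteed to match the example output exactly.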
Output Schema
Output
Example Execution Logs
Loading cfg from configs/phase2/clipRN50_clips_grammar.yml
Warning: key input_clipscore_vis_dir not in args
Warning: key use_multi_rewards not in args
Warning: key use_grammar not in args
Warning: key use_grammar_baseline not in args
Warning: key clip_load_path not in args
Warning: key N_enc not in args
Warning: key N_dec not in args
Warning: key d_model not in args
Warning: key d_ff not in args
Warning: key num_att_heads not in args
Warning: key dropout not in args
Warning: key REFORWARD not in args
Warning: key precision not in args
vocab size: 9487
transformer
Loading checkpoint from save/clipRN50_clips_grammar/clipRN50_clips_grammar-last.ckpt
<All keys matched successfully>
Version Details
- Version ID: de37751f75135f7ebbe62548e27d6740d5155dfefdf6447db35c9865253d7e06
- Version Created: May 31, 2022