bytedance/bagel

โ–ถ๏ธ 149.4K runs ๐Ÿ“… May 2025 โš™๏ธ Cog 0.14.9 ๐Ÿ”— GitHub ๐Ÿ“„ Paper โš–๏ธ License
image-editing image-to-text text-to-image

About

ByteDance Seed's Bagel: a unified multimodal AI that generates, edits, and understands images in one 7B-parameter model.

Example Output

Prompt:

"She boards a modern London Tube, quietly reading a folded newspaper, wearing the same clothes"

Output

{"text":null,"image":"https://replicate.delivery/xezq/KWWpk9oeorSXJKisnvRRdSgxif9d06XNC1zAi2sreWeJ6NelC/output.webp"}
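The output above is a JSON object with a `text` field and an `image` field. A minimal sketch of reading it follows. The helper name `parse_bagel_output` is hypothetical, and the idea that image tasks fill `image` while understanding tasks fill `text` is an assumption drawn from this single editing example.

```python
import json
from typing import Optional, Tuple


def parse_bagel_output(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (text, image_url) from a JSON output string; either may be None."""
    data = json.loads(raw)
    # JSON null becomes Python None, so callers can test each field directly.
    return data.get("text"), data.get("image")


raw = (
    '{"text":null,"image":"https://replicate.delivery/xezq/'
    'KWWpk9oeorSXJKisnvRRdSgxif9d06XNC1zAi2sreWeJ6NelC/output.webp"}'
)
text, image_url = parse_bagel_output(raw)

if image_url is not None:
    print("image result:", image_url)
if text is not None:
    print("text result:", text)
```

Checking both fields independently avoids assuming which one a given task populates.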

Performance Metrics

73.93s Prediction Time
83.02s Total Time
All Input Parameters
{
  "task": "image-editing",
  "image": "https://replicate.delivery/pbxt/N3j6lENyFDxARhorZ8yY86qhIF1uMuvEMO1KytosXUxaz0EO/image.png",
  "prompt": "She boards a modern London Tube, quietly reading a folded newspaper, wearing the same clothes",
  "cfg_img_scale": 2,
  "output_format": "webp",
  "cfg_renorm_min": 1,
  "cfg_text_scale": 4,
  "output_quality": 90,
  "timestep_shift": 3,
  "cfg_renorm_type": "text_channel",
  "enable_thinking": false,
  "num_inference_steps": 50
}
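As a rough illustration, the parameters above could be submitted through Replicate's HTTP predictions endpoint. This sketch only builds the request and does not send it. It assumes an API token is available in the `REPLICATE_API_TOKEN` environment variable, and it uses the version ID listed under Version Details on this page. The helper name `build_prediction_request` is hypothetical.

```python
import json
import os
import urllib.request

API_URL = "https://api.replicate.com/v1/predictions"
# Version ID copied from the "Version Details" section of this page.
VERSION = "7dd8def79e503990740db4704fa81af995d440fefe714958531d7044d2757c9c"


def build_prediction_request(inputs: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a prediction."""
    body = json.dumps({"version": VERSION, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


inputs = {
    "task": "image-editing",
    "image": "https://replicate.delivery/pbxt/N3j6lENyFDxARhorZ8yY86qhIF1uMuvEMO1KytosXUxaz0EO/image.png",
    "prompt": "She boards a modern London Tube, quietly reading a folded newspaper, wearing the same clothes",
    "cfg_img_scale": 2,
    "cfg_text_scale": 4,
    "num_inference_steps": 50,
}

request = build_prediction_request(inputs, os.environ.get("REPLICATE_API_TOKEN", ""))
# To submit for real (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     prediction = json.load(response)
```

Parameters left out of `inputs` fall back to the defaults documented under Input Parameters.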
Input Parameters
seed Type: integer
Random seed for reproducible results
task Default: text-to-image
Task to perform
image Type: string
Input image for editing or understanding tasks
prompt (required) Type: string
Text prompt for generation, editing, or understanding
cfg_img_scale Type: number; Default: 1.5; Range: 1 - 10
Image guidance scale for preserving input image details
output_format Default: webp
Output image format
cfg_renorm_min Type: number; Default: 1; Range: 0 - 1
Minimum CFG renorm value
cfg_text_scale Type: number; Default: 4; Range: 1 - 20
Text guidance scale for how closely to follow the prompt
output_quality Type: integer; Default: 90; Range: 1 - 100
Image compression quality for lossy formats
timestep_shift Type: number; Default: 3; Range: 1 - 10
Distribution of denoising steps between composition and details
cfg_renorm_type Default: global
CFG renormalization method
enable_thinking Type: boolean; Default: false
Enable chain-of-thought reasoning for better results
num_inference_steps Type: integer; Default: 50; Range: 1 - 100
Number of denoising steps
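The defaults and ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model or of any Replicate library: the helper `prepare_inputs` is hypothetical, and treating the ranges as inclusive at both ends is an assumption.

```python
# Defaults and ranges transcribed from the parameter list above.
DEFAULTS = {
    "task": "text-to-image",
    "cfg_img_scale": 1.5,
    "output_format": "webp",
    "cfg_renorm_min": 1,
    "cfg_text_scale": 4,
    "output_quality": 90,
    "timestep_shift": 3,
    "cfg_renorm_type": "global",
    "enable_thinking": False,
    "num_inference_steps": 50,
}
RANGES = {
    "cfg_img_scale": (1, 10),
    "cfg_renorm_min": (0, 1),
    "cfg_text_scale": (1, 20),
    "output_quality": (1, 100),
    "timestep_shift": (1, 10),
    "num_inference_steps": (1, 100),
}


def prepare_inputs(**overrides):
    """Merge overrides onto the documented defaults and validate the ranges."""
    if not overrides.get("prompt"):
        raise ValueError("prompt is required")
    inputs = {**DEFAULTS, **overrides}
    for name, (low, high) in RANGES.items():
        # Assumption: both bounds are inclusive.
        if not low <= inputs[name] <= high:
            raise ValueError(f"{name}={inputs[name]} is outside {low} - {high}")
    return inputs


inputs = prepare_inputs(prompt="A bagel on a plate", cfg_text_scale=6)
print(inputs["cfg_text_scale"], inputs["num_inference_steps"])
```

`seed` and `image` have no documented default, so they are only included when the caller passes them.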
Output Schema
Example Execution Logs
[+] Using seed: 60309
[+] Loaded input image: (1534, 1968)
[+] Running image editing
[+] Processing prompt: She boards a modern London Tube, quietly reading a folded newspaper, wearing the same clothes
[+] Generated 800x1024 image saved as WEBP
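The logs show a (1534, 1968) input producing an 800x1024 output. The page does not document the resizing rule, so the following is only a guess that happens to reproduce the logged numbers: scale the longer side to 1024, then snap each side to the nearest multiple of 16. The function name, `max_side`, and `stride` are all assumptions, and the model's actual preprocessing may differ.

```python
def guess_output_size(width: int, height: int, max_side: int = 1024, stride: int = 16):
    """Hypothetical resize rule: fit the longer side to max_side, snap to stride."""
    scale = max_side / max(width, height)

    def snap(value: int) -> int:
        # Round the scaled side to the nearest multiple of stride (at least one stride).
        return max(stride, round(value * scale / stride) * stride)

    return snap(width), snap(height)


size = guess_output_size(1534, 1968)
print(size)  # (800, 1024)
```

Under this guess the aspect ratio shifts slightly, from about 0.779 to 0.781, because of the snapping.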
Version Details
Version ID
7dd8def79e503990740db4704fa81af995d440fefe714958531d7044d2757c9c
Version Created
May 23, 2025