usamaehsan/qwen-image-edit-fastest-2 🔢🖼️📝 → 🖼️

▶️ 24.2K runs 📅 Nov 2025 ⚙️ Cog 0.16.9 🔗 GitHub
image-editing image-to-image

About

400+ images per $1 at 1024 resolution; about 2 s per image on an L40S GPU. Check the description and examples.
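The throughput claim above can be turned into per-image and per-second figures with a little arithmetic. This is only a sketch: it assumes "400+ images" means at least 400 images per US dollar and takes the 2 s per image figure at face value. The resulting rates are implied by those two numbers, not quoted pricing.

```python
# Back-of-the-envelope figures implied by the About line.
# Assumption: "400+ images in 1$" is read as a lower bound of 400 images per USD.
IMAGES_PER_DOLLAR = 400
SECONDS_PER_IMAGE = 2.0  # "2s /img on L40S GPU"

# Upper bound on cost per image, since 400 is a lower bound on images per dollar.
cost_per_image = 1.0 / IMAGES_PER_DOLLAR

# Compute rate implied by the two figures above (not an official price).
implied_cost_per_second = cost_per_image / SECONDS_PER_IMAGE
implied_cost_per_hour = implied_cost_per_second * 3600

print(f"cost per image: at most ${cost_per_image:.4f}")
print(f"implied rate: ${implied_cost_per_second:.5f}/s, ${implied_cost_per_hour:.2f}/h")
```

Actual cost per image will vary with resolution, step count, and cold starts, so treat these numbers as a rough planning aid.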

Example Output

Prompt:

"make him A hyper-realistic, close-up portrait of a tribal elder from the Omo Valley, painted with intricate white chalk patterns and adorned with a headdress made of dried flowers, seed pods, and rusted bottle caps. The focus is razor-sharp on the texture of the skin, showing every pore, wrinkle, and scar that tells a story of survival. The background is a blurred, smoky hut interior, with the warm glow of a cooking fire reflecting in the subject's dark, soulful eyes. Shot on a Leica M6 with Kodak Portra 400 film grain aesthetic."

Output


Performance Metrics

2.95s Prediction Time
2.97s Total Time
All Input Parameters
{
  "image": [
    "https://replicate.delivery/pbxt/O8wtmSTmW2Sxxi0S0m80qnCXIhoUHqpsvXbvoDEwUYMaa7Ja/WhatsApp%20Image%202025-11-26%20at%205.30.39%20AM.jpeg"
  ],
  "width": 1024,
  "height": 1024,
  "prompt": "make him A hyper-realistic, close-up portrait of a tribal elder from the Omo Valley, painted with intricate white chalk patterns and adorned with a headdress made of dried flowers, seed pods, and rusted bottle caps. The focus is razor-sharp on the texture of the skin, showing every pore, wrinkle, and scar that tells a story of survival. The background is a blurred, smoky hut interior, with the warm glow of a cooking fire reflecting in the subject's dark, soulful eyes. Shot on a Leica M6 with Kodak Portra 400 film grain aesthetic.",
  "guidance_scale": 1,
  "true_cfg_scale": 1,
  "negative_prompt": " ",
  "num_inference_steps": 2,
  "num_images_per_prompt": 1
}
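A payload like the one above could be sent with the `replicate` Python client, roughly as sketched below. The model reference combines the model name and the version ID listed on this page. The image URL and the shortened prompt are placeholders, and `run_example` is a hypothetical helper name. The call needs the third-party `replicate` package and a `REPLICATE_API_TOKEN` environment variable, so the import is kept inside the function.

```python
# Model name and version ID as listed on this page.
MODEL_REF = (
    "usamaehsan/qwen-image-edit-fastest-2:"
    "972e2beef7079fb6d2f9ea53131c4aa15938d72b1f78dbcd429d8dcab826f01e"
)

# Same parameter values as the example; the URL and prompt are placeholders.
EXAMPLE_INPUT = {
    "image": ["https://example.com/portrait.jpg"],  # placeholder, not a real asset
    "width": 1024,
    "height": 1024,
    "prompt": "make him a hyper-realistic, close-up portrait of a tribal elder",
    "guidance_scale": 1,
    "true_cfg_scale": 1,
    "negative_prompt": " ",
    "num_inference_steps": 2,
    "num_images_per_prompt": 1,
}


def run_example(model_ref=MODEL_REF, payload=EXAMPLE_INPUT):
    """Hypothetical helper: submit the payload and return the client's result."""
    import replicate  # third-party: pip install replicate

    # replicate.run blocks until the prediction finishes.
    return replicate.run(model_ref, input=payload)


# Usage (requires network access and an API token):
#   output = run_example()
#   print(output)
```

The output schema on this page is a URI string; depending on the client version, the returned object may be a URL string or a file-like wrapper around it.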
Input Parameters
seed Type: integer
Random seed
image (required) Type: array
Input image(s). Single image for editing, or 2-4 images for composition
width Type: integer, Default: 0, Range: 0 - 2048
Output width (0 = use input size). Recommended: 512 or 768 for speed
height Type: integer, Default: 0, Range: 0 - 2048
Output height (0 = use input size). Recommended: 512 or 768 for speed
prompt Type: string, Default: Remove the background
Text prompt describing the edit or composition
guidance_scale Type: number, Default: 0, Range: 0 - 20
Guidance scale (for composition)
true_cfg_scale Type: number, Default: 1, Range: 1 - 20
True CFG scale. Recommended: 1.0-2.0
negative_prompt Type: string, Default: (blank)
Negative prompt (things to avoid)
num_inference_steps Type: integer, Range: 1 - 50
Number of inference steps. Use 2 for maximum speed; 2-4 is the recommended range
num_images_per_prompt Type: integer, Default: 1, Range: 1 - 4
Number of images to generate
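The ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model: `build_input` is a hypothetical helper, the default of 2 for `num_inference_steps` is an assumption (the page lists no default), and the empty default for `negative_prompt` is likewise assumed.

```python
def build_input(image, prompt="Remove the background", width=0, height=0,
                guidance_scale=0.0, true_cfg_scale=1.0, negative_prompt="",
                num_inference_steps=2, num_images_per_prompt=1, seed=None):
    """Hypothetical helper: validate values against the documented ranges."""
    # One image for editing, or 2-4 images for composition.
    if not isinstance(image, (list, tuple)) or not 1 <= len(image) <= 4:
        raise ValueError("image must be a list of 1 to 4 items")

    # (name, value, low, high) taken from the parameter listing.
    ranges = [
        ("width", width, 0, 2048),
        ("height", height, 0, 2048),
        ("guidance_scale", guidance_scale, 0, 20),
        ("true_cfg_scale", true_cfg_scale, 1, 20),
        ("num_inference_steps", num_inference_steps, 1, 50),
        ("num_images_per_prompt", num_images_per_prompt, 1, 4),
    ]
    for name, value, low, high in ranges:
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} - {high}")

    payload = {
        "image": list(image),
        "prompt": prompt,
        "width": width,
        "height": height,
        "guidance_scale": guidance_scale,
        "true_cfg_scale": true_cfg_scale,
        "negative_prompt": negative_prompt,
        "num_inference_steps": num_inference_steps,
        "num_images_per_prompt": num_images_per_prompt,
    }
    # seed is optional; omit it to let the model pick one.
    if seed is not None:
        payload["seed"] = seed
    return payload
```

For example, `build_input(["photo.jpg"], width=768, height=768)` returns a dictionary ready to pass as the prediction input, while `build_input(["photo.jpg"], width=4096)` raises `ValueError`.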
Output Schema

Output

Type: string, Format: uri

Example Execution Logs
============================================================
PREDICT: Starting prediction
============================================================
✓ Received 1 image(s)
Loading images...
Image 1: 150x150 (0.001s)
✓ Loaded in 0.00s
------------------------------------------------------------
PARAMETERS:
------------------------------------------------------------
Mode: EDIT
Prompt: 'make him A hyper-realistic, close-up portrait of a tribal elder from the Omo Valley, painted with intricate white chalk patterns and adorned with a headdress made of dried flowers, seed pods, and rusted bottle caps. The focus is razor-sharp on the texture of the skin, showing every pore, wrinkle, and scar that tells a story of survival. The background is a blurred, smoky hut interior, with the warm glow of a cooking fire reflecting in the subject's dark, soulful eyes. Shot on a Leica M6 with Kodak Portra 400 film grain aesthetic.'
Seed: 61344
Steps: 2 (model optimized for 4)
Resolution: 1024x1024
True CFG: 1.0
Guidance: 1.0
------------------------------------------------------------
INFERENCE: Running Nunchaku-optimized model...
------------------------------------------------------------
negative_prompt is passed but classifier-free guidance is not enabled since true_cfg_scale <= 1
guidance_scale is passed as 1.0, but ignored since the model is not guidance-distilled.
✓ Inference completed in 2.10s
(1.050s per step)
Saving output...
✓ Saved: 1024x1024
============================================================
PREDICT COMPLETE: 2.42s
- Image loading: 0.00s
- Inference: 2.10s (1.050s/step)
- Saving: 0.32s
============================================================
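The timing summary at the end of the log can be parsed to see where the time goes. The sketch below assumes the log keeps the line shapes shown above (`PREDICT COMPLETE: Ns` followed by `- Stage: Ns` lines); `parse_timings` is a hypothetical helper name.

```python
import re

# Summary lines copied from the example execution log.
LOG = """\
PREDICT COMPLETE: 2.42s
- Image loading: 0.00s
- Inference: 2.10s (1.050s/step)
- Saving: 0.32s
"""


def parse_timings(log):
    """Return (total_seconds, {stage_name: seconds}) from the summary lines."""
    total = float(re.search(r"PREDICT COMPLETE:\s*([\d.]+)s", log).group(1))
    stages = {
        name.strip(): float(secs)
        for name, secs in re.findall(r"^- ([^:]+):\s*([\d.]+)s", log, re.M)
    }
    return total, stages


total, stages = parse_timings(LOG)
inference_share = stages["Inference"] / total

print(stages)
print(f"stage sum: {sum(stages.values()):.2f}s of {total:.2f}s total")
print(f"inference share: {inference_share:.1%}")
```

In this run the three stages add up to the logged 2.42 s, with inference taking most of it. The Prediction Time reported by the platform is 2.95 s, so roughly half a second is not itemized in the log.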
Version Details
Version ID
972e2beef7079fb6d2f9ea53131c4aa15938d72b1f78dbcd429d8dcab826f01e
Version Created
December 1, 2025