zsxkib/yolo-world 🔢📝🖼️ → 🖼️

▶️ 12.8K runs 📅 Feb 2024 ⚙️ Cog 0.8.6 🔗 GitHub 📄 Paper ⚖️ License

image-object-detection open-vocabulary

About

Real-Time Open-Vocabulary Object Detection

Example Output

Output

{"media_path":"https://replicate.delivery/pbxt/LGmIGeYF1kRQFSw4EpqMUYUhrMsAUk849GQPYNpgSdFSNzNJA/output.png"}

Performance Metrics

4.34s Prediction Time

98.93s Total Time

All Input Parameters

{
  "nms_thr": 0.5,
  "score_thr": 0.05,
  "class_names": "dog, eye, tongue, ear, leash, backpack, person, nose",
  "input_media": "https://replicate.delivery/pbxt/KOJpWfZmaP6tUv8fqR2n0z3FdBhtytoP5llaecrvvez0p4LE/dog.jpeg",
  "return_json": false,
  "max_num_boxes": 100
}
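The parameters above can be assembled and submitted programmatically. The following is a minimal sketch, assuming the third-party `replicate` Python client is installed and a `REPLICATE_API_TOKEN` is set in the environment. The helper names `build_input` and `run_prediction` are hypothetical and introduced only for illustration. The range checks mirror the ranges listed on this page. `return_json` is passed through unchanged because it appears in the example input but its behavior is not documented here.

```python
# Model reference built from the owner/name and version ID shown on this page.
MODEL_REF = (
    "zsxkib/yolo-world:"
    "07aee09fc38bc4459409caa872ea416717712f4e6e875f8751a0d0d5bbea902f"
)


def build_input(input_media, class_names, nms_thr=0.5, score_thr=0.05,
                max_num_boxes=100, return_json=False):
    """Build an input payload (hypothetical helper, not part of any library).

    class_names is a list of labels; the model expects one comma-separated
    string, so the list is joined here.
    """
    if not input_media:
        raise ValueError("input_media is required")
    if not 0 <= nms_thr <= 1:
        raise ValueError("nms_thr must be between 0 and 1")
    if not 0 <= score_thr <= 1:
        raise ValueError("score_thr must be between 0 and 1")
    if not 1 <= max_num_boxes <= 300:
        raise ValueError("max_num_boxes must be between 1 and 300")
    return {
        "nms_thr": nms_thr,
        "score_thr": score_thr,
        "class_names": ", ".join(class_names),
        "input_media": input_media,
        "return_json": return_json,
        "max_num_boxes": int(max_num_boxes),
    }


def run_prediction(payload):
    """Submit the payload; needs network access and an API token."""
    import replicate  # imported lazily so the helpers above work without it
    return replicate.run(MODEL_REF, input=payload)
```

Calling `run_prediction(build_input(url, ["dog", "leash"]))` would then submit a request with the defaults listed on this page; the call is kept in a separate function so the payload can be inspected or tested offline first.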

Input Parameters

nms_thr (Type: number, Default: 0.5, Range: 0 - 1): NMS threshold
score_thr (Type: number, Default: 0.05, Range: 0 - 1): Score threshold for displaying bounding boxes
class_names (Type: string, Default: dog, eye, tongue, ear, leash, backpack, person, nose): Enter the classes to be detected, separated by commas
input_media (required, Type: string): Path to the input image or video
max_num_boxes (Type: integer, Default: 100, Range: 1 - 300): Maximum number of bounding boxes to display
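To make the roles of score_thr, nms_thr, and max_num_boxes concrete, here is a sketch of standard greedy non-maximum suppression in plain Python. It is an illustration of the general technique, not the model's actual post-processing code, and the order of operations inside the model may differ. The function names `iou` and `filter_detections` are hypothetical, and boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, score_thr=0.05, nms_thr=0.5,
                      max_num_boxes=100):
    """Return indices of the boxes that survive filtering.

    1. Drop boxes scoring below score_thr.
    2. Visit the rest from highest to lowest score, keeping a box only if
       its IoU with every already-kept box is at most nms_thr.
    3. Stop once max_num_boxes boxes have been kept.
    """
    candidates = sorted(
        (i for i, s in enumerate(scores) if s >= score_thr),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in candidates:
        if len(kept) >= max_num_boxes:
            break
        if all(iou(boxes[i], boxes[j]) <= nms_thr for j in kept):
            kept.append(i)
    return kept
```

Under this scheme, lowering nms_thr removes more overlapping boxes, while raising score_thr removes more low-confidence boxes before suppression even starts.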

Output Schema

Output

Type: string • Format: uri
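Since the output is a URI, a client usually needs to download the referenced file. The sketch below uses only the Python standard library. It accepts either a bare URI string, as the schema describes, or a dictionary with a `media_path` key, as in the example output on this page. The helper names are hypothetical, and whether a given client returns a string, a dictionary, or some other object is an assumption to verify against the client in use.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def output_url(output):
    """Extract the result URL from a string or a {"media_path": ...} dict."""
    if isinstance(output, dict):
        return output["media_path"]
    return str(output)


def filename_from_url(url):
    """Use the last path segment of the URL as a local file name."""
    return PurePosixPath(urlparse(url).path).name


def download_output(output, dest=None):
    """Download the result to dest (defaults to the URL's file name)."""
    url = output_url(output)
    dest = dest or filename_from_url(url)
    urlretrieve(url, dest)  # performs a network request
    return dest
```

For the example output above, `filename_from_url` yields `output.png`, so `download_output` would save the annotated image under that name in the current directory.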

Example Execution Logs

/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

Version Details

Version ID: 07aee09fc38bc4459409caa872ea416717712f4e6e875f8751a0d0d5bbea902f
Version Created: February 12, 2024
