zsxkib/yolo-world 🔢📝🖼️ → 🖼️

▶️ 12.3K runs 📅 Feb 2024 ⚙️ Cog 0.8.6 🔗 GitHub 📄 Paper ⚖️ License
image-object-detection open-vocabulary

About

Real-Time Open-Vocabulary Object Detection

Example Output

Output

Performance Metrics

4.34s Prediction Time
98.93s Total Time
All Input Parameters
{
  "nms_thr": 0.5,
  "score_thr": 0.05,
  "class_names": "dog, eye, tongue, ear, leash, backpack, person, nose",
  "input_media": "https://replicate.delivery/pbxt/KOJpWfZmaP6tUv8fqR2n0z3FdBhtytoP5llaecrvvez0p4LE/dog.jpeg",
  "return_json": false,
  "max_num_boxes": 100
}
Input Parameters
nms_thr Type: numberDefault: 0.5Range: 0 - 1
NMS threshold
score_thr Type: numberDefault: 0.05Range: 0 - 1
Score threshold for displaying bounding boxes
class_names Type: stringDefault: dog, eye, tongue, ear, leash, backpack, person, nose
Enter the classes to be detected, separated by comma
input_media (required) Type: string
Path to the input image or video
max_num_boxes Type: integerDefault: 100Range: 1 - 300
Maximum number of bounding boxes to display
Output Schema

Output

Type: stringFormat: uri

Example Execution Logs
/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Version Details
Version ID
07aee09fc38bc4459409caa872ea416717712f4e6e875f8751a0d0d5bbea902f
Version Created
February 12, 2024
Run on Replicate →