zsxkib/yolo-world 🔢📝🖼️ → 🖼️
About
Real-Time Open-Vocabulary Object Detection

Example Output
[Output image]
Performance Metrics
- Prediction Time: 4.34s
- Total Time: 98.93s
All Input Parameters
{
  "nms_thr": 0.5,
  "score_thr": 0.05,
  "class_names": "dog, eye, tongue, ear, leash, backpack, person, nose",
  "input_media": "https://replicate.delivery/pbxt/KOJpWfZmaP6tUv8fqR2n0z3FdBhtytoP5llaecrvvez0p4LE/dog.jpeg",
  "return_json": false,
  "max_num_boxes": 100
}
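As a rough illustration, the same input could be assembled and submitted from Python. This is a minimal sketch, not an official client example: `build_input` and `run_detection` are hypothetical helper names, the default values simply mirror the example above (they are not confirmed to be the model's own defaults), and the actual call assumes the `replicate` Python package is installed and the `REPLICATE_API_TOKEN` environment variable is set.

```python
import json

# Version ID copied from the "Version Details" section of this page.
MODEL_VERSION = (
    "zsxkib/yolo-world:"
    "07aee09fc38bc4459409caa872ea416717712f4e6e875f8751a0d0d5bbea902f"
)


def build_input(input_media, class_names, nms_thr=0.5, score_thr=0.05,
                max_num_boxes=100, return_json=False):
    """Assemble the input dict (hypothetical helper, not part of any library).

    Defaults mirror the example values on this page; they are an assumption,
    not the model's documented defaults.
    """
    if not input_media:
        raise ValueError("input_media is required")
    return {
        "nms_thr": nms_thr,
        "score_thr": score_thr,
        # The example passes classes as one comma-separated string, not a list.
        "class_names": ", ".join(name.strip() for name in class_names),
        "input_media": input_media,
        "return_json": return_json,
        "max_num_boxes": max_num_boxes,
    }


def run_detection(payload):
    """Send the payload to Replicate. Needs network access and an API token."""
    import replicate  # imported lazily so build_input works without the package
    return replicate.run(MODEL_VERSION, input=payload)


payload = build_input(
    "https://replicate.delivery/pbxt/KOJpWfZmaP6tUv8fqR2n0z3FdBhtytoP5llaecrvvez0p4LE/dog.jpeg",
    ["dog", "eye", "tongue", "ear", "leash", "backpack", "person", "nose"],
)
print(json.dumps(payload, indent=2))
# To actually run the model: output = run_detection(payload)
```

Keeping payload construction separate from the network call makes it possible to inspect or validate the input before spending prediction time.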
Input Parameters
- nms_thr: Non-maximum suppression (NMS) threshold
- score_thr: Score threshold for displaying bounding boxes
- class_names: Classes to detect, separated by commas
- input_media (required): Path to the input image or video
- max_num_boxes: Maximum number of bounding boxes to display
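To make the roles of `score_thr`, `nms_thr`, and `max_num_boxes` concrete, the sketch below shows one common way these three settings interact in detection post-processing. It is an illustration under stated assumptions, not the model's actual code: the function names are hypothetical, the NMS is a simple greedy, class-agnostic variant, and the sample boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, score_thr=0.05, nms_thr=0.5, max_num_boxes=100):
    """Filter a list of (box, score, label) tuples (hypothetical helper)."""
    # 1. Drop boxes whose confidence is below score_thr.
    kept = [d for d in detections if d[1] >= score_thr]
    # 2. Greedy NMS: visit boxes from highest score down and keep a box only
    #    if it does not overlap an already selected box by more than nms_thr.
    #    (Class-agnostic here; the real model may apply NMS differently.)
    kept.sort(key=lambda d: d[1], reverse=True)
    selected = []
    for det in kept:
        if all(iou(det[0], s[0]) <= nms_thr for s in selected):
            selected.append(det)
    # 3. Cap the number of returned boxes.
    return selected[:max_num_boxes]


# Made-up sample detections for illustration only.
detections = [
    ((10, 10, 110, 110), 0.90, "dog"),
    ((12, 12, 112, 112), 0.60, "dog"),      # near-duplicate of the first box
    ((200, 50, 260, 120), 0.40, "leash"),
    ((300, 300, 320, 320), 0.02, "nose"),   # below score_thr
]
result = filter_detections(detections)
print([(label, score) for _, score, label in result])
```

In this sketch, a lower `nms_thr` suppresses overlapping boxes more aggressively, while a higher `score_thr` removes more low-confidence detections before NMS runs.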
Output Schema
Output
Example Execution Logs
/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Version Details
- Version ID: 07aee09fc38bc4459409caa872ea416717712f4e6e875f8751a0d0d5bbea902f
- Version Created: February 12, 2024