zsxkib/yolo-world 🔢📝🖼️ → 🖼️
About
Real-Time Open-Vocabulary Object Detection
Example Output
Output
Performance Metrics
- Prediction Time: 4.34s
- Total Time: 98.93s
All Input Parameters
{
"nms_thr": 0.5,
"score_thr": 0.05,
"class_names": "dog, eye, tongue, ear, leash, backpack, person, nose",
"input_media": "https://replicate.delivery/pbxt/KOJpWfZmaP6tUv8fqR2n0z3FdBhtytoP5llaecrvvez0p4LE/dog.jpeg",
"return_json": false,
"max_num_boxes": 100
}
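As a rough sketch of how the parameters above could be sent to the model, the snippet below assembles the same payload and passes it to the `replicate` Python client. The model reference is pieced together from the model name and version ID listed on this page. The helper names `build_input` and `run_prediction` are hypothetical, and the default values simply mirror the example above; they are not confirmed to be the model's own defaults.

```python
import json

# Model name plus the version ID listed on this page.
MODEL_REF = "zsxkib/yolo-world:07aee09fc38bc4459409caa872ea416717712f4e6e875f8751a0d0d5bbea902f"


def build_input(input_media, class_names, nms_thr=0.5, score_thr=0.05,
                max_num_boxes=100, return_json=False):
    """Assemble the input payload (hypothetical helper).

    class_names may be a comma-separated string or a list of names;
    a list is joined into the comma-separated form the page describes.
    Defaults copy the example values above (an assumption, not the model's spec).
    """
    if not isinstance(class_names, str):
        class_names = ", ".join(name.strip() for name in class_names)
    return {
        "nms_thr": nms_thr,
        "score_thr": score_thr,
        "class_names": class_names,
        "input_media": input_media,
        "return_json": return_json,
        "max_num_boxes": max_num_boxes,
    }


def run_prediction(payload):
    """Send the payload to Replicate (hypothetical helper).

    Assumes `pip install replicate` and a REPLICATE_API_TOKEN environment variable.
    """
    import replicate  # imported lazily so the payload helper works without it
    return replicate.run(MODEL_REF, input=payload)


if __name__ == "__main__":
    payload = build_input(
        "https://replicate.delivery/pbxt/KOJpWfZmaP6tUv8fqR2n0z3FdBhtytoP5llaecrvvez0p4LE/dog.jpeg",
        ["dog", "eye", "tongue", "ear", "leash", "backpack", "person", "nose"],
    )
    print(json.dumps(payload, indent=2))
```

Calling `run_prediction(payload)` performs the actual request. It is left out of the main block so that the payload can be inspected without network access or an API token.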
Input Parameters
- nms_thr: Non-maximum suppression (NMS) threshold
- score_thr: Score threshold for displaying bounding boxes
- class_names: Classes to detect, separated by commas
- input_media (required): Path to the input image or video
- max_num_boxes: Maximum number of bounding boxes to display
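To make the roles of `score_thr`, `nms_thr`, and `max_num_boxes` more concrete, here is a minimal pure-Python sketch of score filtering followed by greedy non-maximum suppression. It illustrates the general technique only. The page does not show the model's own post-processing, so the order of the steps, the class-agnostic suppression, and the function names `iou` and `filter_detections` are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, score_thr=0.05, nms_thr=0.5, max_num_boxes=100):
    """Filter detections given as (box, score, label) tuples.

    Assumed pipeline (not confirmed by the page):
      1. drop detections scoring below score_thr,
      2. greedy class-agnostic NMS: keep the highest-scoring box, discard
         any remaining box whose IoU with a kept box exceeds nms_thr,
      3. return at most max_num_boxes detections.
    """
    candidates = sorted(
        (d for d in dets if d[1] >= score_thr),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score, label in candidates:
        if all(iou(box, k[0]) <= nms_thr for k in kept):
            kept.append((box, score, label))
    return kept[:max_num_boxes]


if __name__ == "__main__":
    detections = [
        ((0, 0, 10, 10), 0.90, "dog"),
        ((1, 1, 10, 10), 0.80, "dog"),      # overlaps heavily with the first box
        ((20, 20, 30, 30), 0.70, "person"),
        ((40, 40, 50, 50), 0.01, "nose"),   # below score_thr
    ]
    for box, score, label in filter_detections(detections):
        print(label, score, box)
```

In this sketch a lower `nms_thr` suppresses more overlapping boxes, while a lower `score_thr` lets more low-confidence detections through to the suppression step.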
Output Schema
Output
Example Execution Logs
/root/.pyenv/versions/3.10.12/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Version Details
- Version ID: 07aee09fc38bc4459409caa872ea416717712f4e6e875f8751a0d0d5bbea902f
- Version Created: February 12, 2024