light770/qwen3-reranker-0.6b

7 runs · Feb 2026 · Cog 0.16.11
Tags: information-retrieval, rag, reranking

About

Compact Powerhouse for Vector Reranking

Example Output

Output

{
  "count": 2,
  "scores": [0.9990234375, 0.0013799667358398438],
  "ranked_indices": [0, 1]
}

Performance Metrics

Prediction Time: 0.32s
Total Time: 227.64s
All Input Parameters
{
  "query": "What is the capital of China?",
  "documents": [
    "The capital of China is Beijing.",
    "Paris is the capital of France."
  ],
  "batch_size": 8,
  "max_length": 8192
}
Input Parameters
query (required) Type: string
documents (required) Type: array
batch_size Type: integer, Default: 8
max_length Type: integer, Default: 8192
instruction Type: string
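The parameters above can be assembled into a request payload before sending it to the model. The sketch below is a minimal, hypothetical helper (the `build_rerank_input` name is my own); it only mirrors the parameter names, types, and defaults listed on this page. The commented-out call shows how the payload would typically be passed to the standard `replicate` Python client, using the version ID from the Version Details section.

```python
def build_rerank_input(query, documents, batch_size=8, max_length=8192, instruction=None):
    """Assemble the input payload for the reranker.

    Required: query (string), documents (array of strings).
    Optional: batch_size (default 8), max_length (default 8192),
    and a free-form instruction string, omitted when not set.
    """
    payload = {
        "query": query,
        "documents": documents,
        "batch_size": batch_size,
        "max_length": max_length,
    }
    if instruction is not None:
        payload["instruction"] = instruction
    return payload


# Assumed usage with the official `replicate` client (requires a
# REPLICATE_API_TOKEN in the environment, so it is left commented out):
#
# import replicate
# output = replicate.run(
#     "light770/qwen3-reranker-0.6b:"
#     "7a8945d590f05c389afe3e8c024cc10632d89ea88cd3b62ebef30e7763a1b7bb",
#     input=build_rerank_input(
#         "What is the capital of China?",
#         ["The capital of China is Beijing.",
#          "Paris is the capital of France."],
#     ),
# )
```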
Output Schema

Output

Type: object
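The schema only says "object", but the example output above suggests its shape: `count` is the number of input documents, `scores` holds one relevance score per document in input order, and `ranked_indices` lists document indices sorted by descending score. That relationship (an assumption inferred from the single example, not a documented guarantee) can be reproduced locally:

```python
def ranked_indices(scores):
    """Return document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Scores taken from the example output on this page.
scores = [0.9990234375, 0.0013799667358398438]
output = {
    "count": len(scores),
    "scores": scores,
    "ranked_indices": ranked_indices(scores),
}
# Reproduces the example: ranked_indices == [0, 1]
```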

Example Execution Logs
You're using a Qwen2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
/root/.pyenv/versions/3.11.14/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:2919: UserWarning: `max_length` is ignored when `padding`=`True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.
warnings.warn(
Version Details
Version ID
7a8945d590f05c389afe3e8c024cc10632d89ea88cd3b62ebef30e7763a1b7bb
Version Created
February 17, 2026