lucataco/bge-m3 🔢📝❓ → 📝

▶️ 279 runs 📅 Feb 2024 ⚙️ Cog 0.9.3 🔗 GitHub 📄 Paper ⚖️ License
multilingual retrieval text-embedding

About

BGE-M3 is an embedding model that supports multiple retrieval modes (dense retrieval, lexical matching, and multi-vector interaction) as well as multilingual and multi-granularity retrieval.

Example Output

Output

[[0.626 0.3477]
[0.3499 0.678 ]]
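
The matrix above is the pairwise similarity of the dense embeddings: rows correspond to sentences_1 and columns to sentences_2. A minimal sketch of how such a matrix can be reproduced locally, assuming the upstream FlagEmbedding package's BGEM3FlagModel interface (this is an illustration, not necessarily the exact code the Cog wrapper runs):

# pip install FlagEmbedding
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

sentences_1 = ["What is BGE M3?", "Defination of BM25"]
sentences_2 = [
    "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.",
    "BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
]

# Dense embeddings; max_length mirrors the model's input parameter.
emb_1 = model.encode(sentences_1, max_length=4096)["dense_vecs"]
emb_2 = model.encode(sentences_2, max_length=4096)["dense_vecs"]

# Row i, column j = similarity of sentences_1[i] with sentences_2[j].
similarity = emb_1 @ emb_2.T
print(similarity)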

Performance Metrics

53.45s Prediction Time
172.98s Total Time
All Input Parameters
{
  "max_length": 4096,
  "sentences_1": "What is BGE M3?\nDefination of BM25",
  "sentences_2": "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.\nBM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
  "embedding_type": "dense"
}
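
A minimal sketch of running this prediction with the Replicate Python client, assuming the client is installed (pip install replicate) and REPLICATE_API_TOKEN is set in the environment. The version ID comes from the Version Details section below; sentence lists are passed as single newline-separated strings:

import replicate

output = replicate.run(
    "lucataco/bge-m3:3af6c861256a2a8e07a54a478813e6632f339f05235b59374f292f4759555bfb",
    input={
        "max_length": 4096,
        "sentences_1": "What is BGE M3?\nDefination of BM25",
        "sentences_2": "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.\nBM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
        "embedding_type": "dense",
    },
)
print(output)  # stringified similarity matrix, as shown in Example Output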
Input Parameters
max_length Type: integer Default: 8192
Maximum input length for dense embeddings; use a smaller value to speed up encoding.
sentences_1 (required) Type: string
Input sentence list 1 - one sentence per line (newline-separated)
sentences_2 (required) Type: string
Input sentence list 2 - one sentence per line (newline-separated)
embedding_type Default: dense
Type of embedding to use
Output Schema

Output

Type: string
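
Since the output is returned as a string (the bracketed matrix shown in Example Output), callers may need to parse it. One hypothetical way to do so, assuming the output keeps the NumPy-style, comma-free layout shown above:

import numpy as np

def parse_output(output: str) -> np.ndarray:
    # Strip brackets from each line and split on whitespace to recover the rows.
    rows = [line.strip().strip("[]").split() for line in output.splitlines() if line.strip()]
    return np.array([[float(value) for value in row] for row in rows])

matrix = parse_output("[[0.626  0.3477]\n [0.3499 0.678 ]]")
print(matrix.shape)  # (2, 2): sentences_1 rows x sentences_2 columns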

Example Execution Logs
Sentences_1 split out:
['What is BGE M3?', 'Defination of BM25']
Sentences_2 split out:
['BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.', 'BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document']
encoding:   0%|          | 0/1 [00:00<?, ?it/s]
encoding: 100%|██████████| 1/1 [00:23<00:00, 23.31s/it]
encoding: 100%|██████████| 1/1 [00:28<00:00, 28.09s/it]
encoding:   0%|          | 0/1 [00:00<?, ?it/s]
encoding: 100%|██████████| 1/1 [00:22<00:00, 22.75s/it]
encoding: 100%|██████████| 1/1 [00:25<00:00, 25.20s/it]
Version Details
Version ID
3af6c861256a2a8e07a54a478813e6632f339f05235b59374f292f4759555bfb
Version Created
February 7, 2024
Run on Replicate →