lucataco/bge-m3 🔢📝❓ → 📝

▶️ 279 runs 📅 Feb 2024 ⚙️ Cog 0.9.3 🔗 GitHub 📄 Paper ⚖️ License
multilingual retrieval text-embedding

About

BGE-M3 is an embedding model that supports multiple retrieval modes (dense retrieval, lexical matching, and multi-vector interaction) as well as multilingual and multi-granularity retrieval.

Example Output

Output

[[0.626 0.3477]
[0.3499 0.678 ]]
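
The matrix above is the pairwise similarity of the dense embeddings: rows correspond to sentences_1 and columns to sentences_2. A minimal sketch of how such a matrix can be reproduced locally, assuming the upstream FlagEmbedding package's BGEM3FlagModel interface (this is an illustration, not necessarily the exact code the Cog wrapper runs):

# pip install FlagEmbedding
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

sentences_1 = ["What is BGE M3?", "Defination of BM25"]
sentences_2 = [
    "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.",
    "BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
]

# Dense embeddings; max_length mirrors the model's input parameter.
emb_1 = model.encode(sentences_1, max_length=4096)["dense_vecs"]
emb_2 = model.encode(sentences_2, max_length=4096)["dense_vecs"]

# Row i, column j = similarity of sentences_1[i] with sentences_2[j].
similarity = emb_1 @ emb_2.T
print(similarity)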

Performance Metrics

53.45s Prediction Time
172.98s Total Time
All Input Parameters
{
  "max_length": 4096,
  "sentences_1": "What is BGE M3?\nDefination of BM25",
  "sentences_2": "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.\nBM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
  "embedding_type": "dense"
}
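
A minimal sketch of running this prediction with the Replicate Python client, assuming the client is installed (pip install replicate) and REPLICATE_API_TOKEN is set in the environment. The version ID comes from the Version Details section below; sentence lists are passed as single newline-separated strings:

import replicate

output = replicate.run(
    "lucataco/bge-m3:3af6c861256a2a8e07a54a478813e6632f339f05235b59374f292f4759555bfb",
    input={
        "max_length": 4096,
        "sentences_1": "What is BGE M3?\nDefination of BM25",
        "sentences_2": "BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.\nBM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document",
        "embedding_type": "dense",
    },
)
print(output)  # stringified similarity matrix, as shown in Example Output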
Input Parameters
max_length Type: integer Default: 8192
Maximum input length for dense embeddings; use a smaller value to speed up encoding.
sentences_1 (required) Type: string
Input sentence list 1 - one sentence per line (newline-separated)
sentences_2 (required) Type: string
Input sentence list 2 - one sentence per line (newline-separated)
embedding_type Default: dense
Type of embedding to use
Output Schema

Output

Type: string
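
Since the output is returned as a string (the bracketed matrix shown in Example Output), callers may need to parse it. One hypothetical way to do so, assuming the output keeps the NumPy-style, comma-free layout shown above:

import numpy as np

def parse_output(output: str) -> np.ndarray:
    # Strip brackets from each line and split on whitespace to recover the rows.
    rows = [line.strip().strip("[]").split() for line in output.splitlines() if line.strip()]
    return np.array([[float(value) for value in row] for row in rows])

matrix = parse_output("[[0.626  0.3477]\n [0.3499 0.678 ]]")
print(matrix.shape)  # (2, 2): sentences_1 rows x sentences_2 columns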

Example Execution Logs
Sentences_1 split out:
['What is BGE M3?', 'Defination of BM25']
Sentences_2 split out:
['BGE M3 is an embedding model supporting dense retrieval, lexical matching and multi-vector interaction.', 'BM25 is a bag-of-words retrieval function that ranks a set of documents based on the query terms appearing in each document']
encoding:   0%|          | 0/1 [00:00<?, ?it/s]
encoding: 100%|██████████| 1/1 [00:23<00:00, 23.31s/it]
encoding: 100%|██████████| 1/1 [00:28<00:00, 28.09s/it]
encoding:   0%|          | 0/1 [00:00<?, ?it/s]
encoding: 100%|██████████| 1/1 [00:22<00:00, 22.75s/it]
encoding: 100%|██████████| 1/1 [00:25<00:00, 25.20s/it]
Version Details
Version ID
3af6c861256a2a8e07a54a478813e6632f339f05235b59374f292f4759555bfb
Version Created
February 7, 2024
Run on Replicate →