Add perplexity-ai/pplx-embed-v1-late-0.6b model meta by wgu9 · Pull Request #4813 · embeddings-benchmark/mteb

wgu9 · 2026-06-14T14:46:34Z

Summary

Closes #4691.

Adds the ModelMeta entry for perplexity-ai/pplx-embed-v1-late-0.6b to
mteb/models/model_implementations/pylate_models.py.

It is a PyLate / ColBERT late-interaction (MaxSim) retrieval model,
continued-trained from perplexity-ai/pplx-embed-v1-0.6b with a token-level
128-dim projection. It loads through the same MultiVectorModel loader as the
other ColBERT entries, with trust_remote_code=True because the repo ships
custom modeling code.
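
For readers unfamiliar with late interaction, here is a minimal sketch of MaxSim scoring over token-level vectors. It is illustrative only and is not the PyLate or mteb implementation: the function names are hypothetical, the toy vectors are 2-dim rather than 128-dim, and L2-normalizing the token vectors before scoring is an assumption.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumption: tokens are cosine-compared)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score: for each query token, take its best-matching
    document token similarity, then sum those maxima over the query tokens."""
    q = [l2_normalize(v) for v in query_tokens]
    d = [l2_normalize(v) for v in doc_tokens]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Toy example: 2 query tokens, 2 document tokens, 2-dim vectors.
toy_query = [[1.0, 0.0], [0.0, 1.0]]
toy_doc = [[1.0, 0.0], [0.6, 0.8]]
toy_score = maxsim_score(toy_query, toy_doc)  # approximately 1.0 + 0.8
```

Because every query token is matched independently, the score for a query of n tokens over unit vectors is bounded by n, and scoring a document against itself reaches that bound.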

Metadata sources (model card / HF API / config)

revision 35e14f7c0f85720323342965e72f119dc1a937c0 (current main)
license mit, n_parameters 595,778,560 (safetensors total)
embed_dim 128 (1_Dense/config.json out_features, confirmed in the card)
max_tokens 512 (document_length in config_sentence_transformers.json)
adapted_from perplexity-ai/pplx-embed-v1-0.6b (HF base_model)
languages left None: the base model card only states "multilingual"
without enumerating language codes (consistent with other entries that omit
a language list rather than guessing).
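
The mapping from config files to metadata fields can be sketched as below. This is a hypothetical helper, not part of mteb: `derive_meta_fields` is an illustrative name, and the input dicts stand in for the parsed contents of 1_Dense/config.json and config_sentence_transformers.json, with only the keys named above filled in.

```python
def derive_meta_fields(dense_config, st_config, safetensors_total):
    """Collect the config-derived fields listed above into one dict.

    dense_config:      parsed 1_Dense/config.json (only out_features is used)
    st_config:         parsed config_sentence_transformers.json
    safetensors_total: total parameter count reported for the safetensors files
    """
    return {
        "embed_dim": dense_config["out_features"],
        "max_tokens": st_config["document_length"],
        "n_parameters": safetensors_total,
    }


# Stand-in inputs using the values quoted in this PR description.
fields = derive_meta_fields(
    dense_config={"out_features": 128},
    st_config={"document_length": 512},
    safetensors_total=595_778_560,
)

# Rounded parameter count in millions, for a quick sanity check.
params_millions = round(fields["n_parameters"] / 1e6)
```

Keeping the derivation in one place makes it easy to re-check the values if the model repository publishes a new revision.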

Verification

from mteb.models.model_implementations import MODEL_REGISTRY
MODEL_REGISTRY["perplexity-ai/pplx-embed-v1-late-0.6b"]  # registers cleanly

ruff check / ruff format --check: clean
pytest tests/test_models/test_model_meta.py -k "name_and_revision or hashable or without_prefix": 731 passed

Closes embeddings-benchmark#4691. Adds the ModelMeta entry for perplexity-ai/pplx-embed-v1-late-0.6b, a PyLate/ColBERT late-interaction (MaxSim) embedding model with 128-dim token-level vectors, continued-trained from perplexity-ai/pplx-embed-v1-0.6b. Metadata sourced from the HF model card/API and config files (revision, MIT license, 128-dim projection, 512 document length, ~596M params).

LuuOW

Technical audit: Implementation verified for architectural consistency and engineering integrity.

Samoed · 2026-06-14T15:09:37Z

Can you run BEIR to reproduce the results? It would be great to have a review from @bwang-pplx.

LuuOW reviewed Jun 14, 2026


Samoed changed the title ~~Add perplexity-ai/pplx-embed-v1-late-0.6b model meta (#4691)~~ Add perplexity-ai/pplx-embed-v1-late-0.6b model meta Jun 14, 2026

Samoed added the "new model" label (Questions related to adding a new model to the benchmark) Jun 15, 2026

wgu9 wants to merge 1 commit into embeddings-benchmark:main from wgu9:add-pplx-embed-late-model

3 participants
