vllm.model_executor.models.interfaces_base ¶
VllmModel ¶
Bases: Protocol[T_co]
The interface required for all models in vLLM.
Source code in vllm/model_executor/models/interfaces_base.py
VllmModelForPooling ¶
Bases: VllmModel[T_co], Protocol[T_co]
The interface required for all pooling models in vLLM.
Source code in vllm/model_executor/models/interfaces_base.py
attn_type class-attribute ¶
Indicates the vllm.config.model.ModelConfig.attn_type to use by default.
You can use the vllm.model_executor.models.interfaces_base.attn_type decorator to conveniently set this field.
default_seq_pooling_type class-attribute ¶
Indicates the vllm.config.pooler.PoolerConfig.seq_pooling_type to use by default.
You can use the vllm.model_executor.models.interfaces_base.default_pooling_type decorator to conveniently set this field.
default_tok_pooling_type class-attribute ¶
Indicates the vllm.config.pooler.PoolerConfig.tok_pooling_type to use by default.
You can use the vllm.model_executor.models.interfaces_base.default_pooling_type decorator to conveniently set this field.
is_pooling_model class-attribute ¶
is_pooling_model: Literal[True] = True
A flag that indicates this model supports pooling.
Note
There is no need to redefine this flag if this class is in the MRO of your model class.
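The note above follows from ordinary Python attribute lookup: a class attribute defined on a base class is found through the MRO of every subclass. A minimal sketch, using hypothetical stand-in classes (`PoolingBase`, `MyEmbeddingModel`, `PlainModel`) and a hypothetical helper (`supports_pooling`) rather than the real vLLM interface:

```python
from typing import ClassVar, Literal


class PoolingBase:
    """Hypothetical stand-in for a pooling interface; not the real vLLM class."""

    is_pooling_model: ClassVar[Literal[True]] = True


class MyEmbeddingModel(PoolingBase):
    """Inherits is_pooling_model through the MRO, so it is not redefined here."""


class PlainModel:
    """A class that does not have the pooling base in its MRO."""


def supports_pooling(model_cls: type) -> bool:
    # Hypothetical helper: reads the flag, treating a missing attribute as False.
    return getattr(model_cls, "is_pooling_model", False) is True
```

Here `supports_pooling(MyEmbeddingModel)` is true without the subclass mentioning the flag, while `supports_pooling(PlainModel)` is false.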
score_type class-attribute ¶
Indicates the vllm.config.model.ModelConfig.score_type to use by default.
Score API handles score/rerank for:
- "score" task (score_type: cross-encoder models)
- "embed" task (score_type: bi-encoder models)
- "token_embed" task (score_type: late interaction models)
score_type defaults to bi-encoder, in which case the Score API uses the "embed" task. If you set score_type to cross-encoder via vllm.model_executor.models.interfaces.SupportsCrossEncoding, the Score API uses the "score" task. If you set score_type to late-interaction via vllm.model_executor.models.interfaces.SupportsLateInteraction, the Score API uses the "token_embed" task.
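The relationship between score_type and the task used by the Score API can be summarized as a lookup table. The sketch below only restates the mapping described in this section; the names `SCORE_TYPE_TO_TASK` and `task_for_score_type` are hypothetical and are not part of vLLM.

```python
from typing import Literal

ScoreType = Literal["bi-encoder", "cross-encoder", "late-interaction"]

# Mapping as described in the text; this dict itself is illustrative only.
SCORE_TYPE_TO_TASK: dict[str, str] = {
    "bi-encoder": "embed",
    "cross-encoder": "score",
    "late-interaction": "token_embed",
}


def task_for_score_type(score_type: ScoreType = "bi-encoder") -> str:
    """Return the task name for a score_type, defaulting to bi-encoder."""
    return SCORE_TYPE_TO_TASK[score_type]
```

Calling `task_for_score_type()` with no argument reflects the documented default and yields the "embed" task.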
VllmModelForTextGeneration ¶
attn_type ¶
Decorator to set VllmModelForPooling.attn_type.
default_pooling_type ¶
default_pooling_type(
*,
seq_pooling_type: SequencePoolingType = "LAST",
tok_pooling_type: TokenPoolingType = "ALL",
)
Decorator to set VllmModelForPooling.default_*_pooling_type.
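A class decorator of this shape usually just assigns class attributes and returns the class unchanged. The sketch below shows one way that could look; it is not the vLLM implementation. The names `default_pooling_type_sketch`, `HypotheticalEncoder`, and `HypotheticalDecoder` are made up for illustration, the parameters are typed as plain `str` rather than the real `SequencePoolingType` and `TokenPoolingType` aliases, and `"CLS"` is assumed to be an acceptable value (only `"LAST"` and `"ALL"` appear in the signature above).

```python
from typing import Callable, TypeVar

_T = TypeVar("_T", bound=type)


def default_pooling_type_sketch(
    *,
    seq_pooling_type: str = "LAST",
    tok_pooling_type: str = "ALL",
) -> Callable[[_T], _T]:
    """Hypothetical decorator factory that sets both default pooling attributes."""

    def wrapper(model_cls: _T) -> _T:
        # Store the defaults on the class itself and return it unmodified otherwise.
        model_cls.default_seq_pooling_type = seq_pooling_type
        model_cls.default_tok_pooling_type = tok_pooling_type
        return model_cls

    return wrapper


@default_pooling_type_sketch(seq_pooling_type="CLS")  # "CLS" is an assumed value
class HypotheticalEncoder:
    pass


@default_pooling_type_sketch()  # falls back to the defaults from the signature
class HypotheticalDecoder:
    pass
```

Because both arguments are keyword-only, a call site always states which of the two defaults it overrides.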