vllm.v1.worker.cpu_model_runner
CPUModelRunner
Bases: GPUModelRunner
Source code in vllm/v1/worker/cpu_model_runner.py
__init__
__init__(vllm_config: VllmConfig, device: device)
_init_device_properties
_may_reorder_batch
_may_reorder_batch(scheduler_output: SchedulerOutput) -> None
_postprocess_tensors
_sync_device
_to_list
get_dp_padding
load_model
load_model(eep_scale_up: bool = False) -> None
warming_up_model
_set_global_compilation_settings
_set_global_compilation_settings(config: VllmConfig)