# Complete Guide to Building vLLM 0.16 from Source on Windows 11 (CUDA 12.6 / PyTorch 2.7.1+cu126) [Field-Tested Again]

This article is a follow-up review of the previous vLLM Windows cu128 build guide. The previous article built against CUDA 12.8; this one rebuilds against CUDA 12.6 so that the result matches PyTorch 2.7.1+cu126 exactly. It also corrects the previous article's description of what the `subst` mapping is for, and provides a clearer one-shot rebuild script.

Related articles:

- Complete Guide to Configuring Multiple CUDA + cuDNN Versions on Windows
- Complete Guide to Building CUDA Extension Wheels Locally on Windows
- Complete Guide to Building vLLM 0.16 from Source on Windows 11 (RTX 3090 / CUDA 12.8 / PyTorch 2.7.1)

## Environment

| Item | Version |
| --- | --- |
| Operating system | Windows 11 |
| GPU | NVIDIA GeForce RTX 3090 (sm_86) |
| Driver | 595.02 |
| CUDA Toolkit | 12.6 (used for the build) |
| Python | 3.12.11 |
| PyTorch | 2.7.1+cu126 |
| Visual Studio | 2022 Professional v17.12.17 |
| vLLM fork | SystemPanic/vllm-windows (vllm-for-windows branch) |
| Built package version | 0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126 |

## 1. Why Rebuild for cu126

The previous article built with the system-wide CUDA 12.8, but the PyTorch actually installed in the virtual environment is 2.7.1+cu126, meaning torch itself is bound to the CUDA 12.6 runtime. A cu128 wheel runs in most scenarios thanks to forward compatibility, but rebuilding with exactly the CUDA version torch uses is the safer choice because it rules out version-mismatch problems.

The toolchain switch below uses the script from the multi-version CUDA + cuDNN guide listed above:

```powershell
# Quickly switch the CUDA + cuDNN build toolchain
# Load the script
. D:\Program\switch-cuda.ps1
# Switch to CUDA 12.6
Switch-CUDA 12.6
```

Verify the CUDA version of the current torch:

```powershell
python -c "import torch; print(torch.__version__, '| CUDA:', torch.version.cuda)"
# Output: 2.7.1+cu126 | CUDA: 12.6
```

## 2. Key Concept: What the subst Mapping Is Actually For

This is the easiest place to go wrong in this build, so it must be understood first.

CUDA installs by default to `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6`, a path that contains spaces. When MSVC's `cl.exe` receives this path through a `-I` argument, the path is not quoted automatically, so the space truncates it and compilation fails.

The fix is to use `subst` to map the CUDA directory to a drive letter without spaces:

```
Z: → C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6
```

⚠️ `Z:` maps the CUDA directory, not the vLLM source directory. Do not confuse the two. The vLLM source always stays at `J:\PythonProjects4\vllm-windows`, and the build command points at the source using the full `J:` path.

Mapping commands:

```powershell
# Map Z: to the CUDA 12.6 directory so that torch finds the right nvcc
subst Z: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6"
# Verify the mapping
subst | findstr Z
```

To remove the mapping:

```powershell
# Remove the current Z: mapping
subst Z: /D
```

## 3. Preparation

### 3.1 Confirm the Environment

Work in a VS 2022 Developer shell for x64 (the commands below use PowerShell syntax) and confirm that the following tools are available:

```powershell
cl
# Output: Microsoft (R) C/C++ Optimizing Compiler Version 19.42.xxxxx for x64
python --version
# Python 3.12.11
```

Activate the dedicated vLLM venv:

```powershell
cd J:\PythonProjects4\vllm-windows
.\.venv\Scripts\Activate.ps1
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Expected output: 2.7.1+cu126 12.6
```

### 3.2 Clean the CMake Cache (Important)

If an earlier build attempt failed, clean the caches first; otherwise stale CMake variables will interfere with the new build:

```powershell
Remove-Item -Recurse -Force J:\PythonProjects4\vllm-windows\build -ErrorAction SilentlyContinue
Remove-Item -Recurse -Force J:\PythonProjects4\vllm-windows\.deps -ErrorAction SilentlyContinue
```

The `.deps` directory holds external dependencies that CMake downloads automatically, such as CUTLASS, triton-windows, and FlashMLA. After cleaning, they are downloaded again during the build, so a network connection is required.

## 4. Set Up the Build Environment

Run all of the following commands in order, in the same terminal session:

```powershell
# 1. Map CUDA 12.6 to Z: (works around the spaces in the path)
subst Z: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6"
# Verify the mapping
dir Z:\bin\nvcc.exe   # must be found

# 2. Point every CUDA path variable at Z:
$env:CUDA_HOME = "Z:"
$env:CUDA_PATH = "Z:"
$env:CUDA_ROOT = "Z:"
$env:CudaToolkitDir = "Z:\"
$env:PATH = "Z:\bin;" + $env:PATH
# Verify: the first line must be Z:\bin\nvcc.exe
where.exe nvcc

# 3. Set build parameters
$env:DISTUTILS_USE_SDK = "1"
$env:VLLM_TARGET_DEVICE = "cuda"
$env:MAX_JOBS = "10"               # adjust to the number of CPU cores
$env:TORCH_CUDA_ARCH_LIST = "8.6"  # RTX 3090 is sm_86
$env:USE_LIBUV = "0"
```

⚠️ The first line printed by `where.exe nvcc` must be `Z:\bin\nvcc.exe`, not `C:\Program Files\...`. If it is not, PATH is set incorrectly; check that `$env:PATH` starts with `Z:\bin`.

## 5. Run the Build

```powershell
pip wheel J:\PythonProjects4\vllm-windows --no-build-isolation --no-deps -w J:\PythonProjects4\vllm-windows\wheels\ 2>&1 | Tee-Object -FilePath J:\PythonProjects4\vllm-windows\wheels\build-cu126.log
```

What happens during the build:

- CMake configure stage, about 5-10 minutes: automatically downloads external dependencies such as CUTLASS, FlashMLA, and triton-windows.
- ninja compile stage, about 60-90 minutes: 145 build targets in total.

The following skip messages are normal and are not errors:

```
-- FlashMLA will not compile: unsupported CUDA architecture 8.6   <- requires sm_90
-- [QUTLASS] Skipping build: CUDA 12.8 or newer is required       <- not supported on cu126
-- Not building scaled_mm_c3x_sm90                                <- requires sm_90
-- Not building NVFP4                                             <- requires sm_100
```

The following warning can also be ignored:

```
CMake Warning: Pytorch version 2.10.0 expected for CUDA build, saw 2.7.1 instead.
```

The build has succeeded when you see:

```
[145/145] Linking CXX shared module vllm-flash-attn\_vllm_fa2_C.pyd
Successfully built vllm
```

## 6. Verify the Wheel

```powershell
cd J:\PythonProjects4\vllm-windows
Get-ChildItem J:\PythonProjects4\vllm-windows\wheels\ | Select-Object Name, LastWriteTime, Length
```

You should see something like:

```
vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl   269MB
```

Structure of the file name:

- `cu126`: matches PyTorch 2.7.1+cu126 ✅
- `cp312`: Python 3.12
- `d20260309`: build date

## 7. Install and Verify

### 7.1 Verify in the vLLM venv

```powershell
python -c "
import os
os.environ['USE_LIBUV'] = '0'
import vllm._C as _C
print('✅ _C extension loaded')
from vllm import LLM, SamplingParams
import vllm
print('✅ vllm imported, version:', vllm.__version__)
"
```

### 7.2 Install into Another Environment (e.g. ComfyUI)

```powershell
pip install vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl --no-deps
pip install llguidance xgrammar
```

⚠️ `USE_LIBUV=0` must be set before `import vllm`; otherwise PyTorch 2.7.1 stable raises an error:

```python
import os
os.environ["USE_LIBUV"] = "0"
import vllm
```

## 8. One-Shot Rebuild Script

The next time a rebuild is needed, run the following in a VS 2022 Developer Shell:

```powershell
cd J:\PythonProjects4\vllm-windows
.\.venv\Scripts\Activate.ps1

# Clean old caches (if needed)
# Remove-Item -Recurse -Force build, .deps

# Map the CUDA path, Z: → CUDA 12.6 (must be repeated after every reboot)
subst Z: "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6"

# Set environment variables
$env:CUDA_HOME = "Z:"
$env:CUDA_PATH = "Z:"
$env:CUDA_ROOT = "Z:"
$env:CudaToolkitDir = "Z:\"
$env:PATH = "Z:\bin;" + $env:PATH
$env:DISTUTILS_USE_SDK = "1"
$env:VLLM_TARGET_DEVICE = "cuda"
$env:MAX_JOBS = "10"
$env:TORCH_CUDA_ARCH_LIST = "8.6"
$env:USE_LIBUV = "0"

# Confirm that the first nvcc line is Z:\bin\nvcc.exe
where.exe nvcc

# Build
pip wheel J:\PythonProjects4\vllm-windows --no-build-isolation --no-deps -w J:\PythonProjects4\vllm-windows\wheels\
```

To also save wheels for every dependency into a directory under `J:\PythonProjects4\vllm-windows` (`vllmwhl_cu126` in this example), run the same command without `--no-deps`:

```powershell
# Package into wheel files
pip wheel . --no-build-isolation -w J:\PythonProjects4\vllm-windows\vllmwhl_cu126\
# List the generated wheels
Get-ChildItem J:\PythonProjects4\vllm-windows\vllmwhl_cu126\
```

Output:

```
    Directory: J:\PythonProjects4\vllm-windows\vllmwhl_cu126

Mode   LastWriteTime      Length  Name
----   -------------      ------  ----
-a---  2026/3/9 17:47      15265  aiohappyeyeballs-2.6.1-py3-none-any.whl
-a---  2026/3/9 17:47     455407  aiohttp-3.13.3-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47       7490  aiosignal-1.4.0-py3-none-any.whl
-a---  2026/3/9 17:47       5303  annotated_doc-0.0.4-py3-none-any.whl
-a---  2026/3/9 17:47      13643  annotated_types-0.7.0-py3-none-any.whl
-a---  2026/3/9 17:47     455156  anthropic-0.84.0-py3-none-any.whl
-a---  2026/3/9 17:47     113592  anyio-4.12.1-py3-none-any.whl
-a---  2026/3/9 17:47    1973200  apache_tvm_ffi-0.1.9-cp312-abi3-win_amd64.whl
-a---  2026/3/9 17:47      27488  astor-0.8.1-py2.py3-none-any.whl
-a---  2026/3/9 17:47      67615  attrs-25.4.0-py3-none-any.whl
-a---  2026/3/9 17:47     215704  blake3-1.0.8-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      13900  cachetools-7.0.4-py3-none-any.whl
-a---  2026/3/9 17:47      69817  cbor2-5.8.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     153684  certifi-2026.2.25-py3-none-any.whl
-a---  2026/3/9 17:47     183557  cffi-2.0.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     142856  charset_normalizer-3.4.5-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     108274  click-8.3.1-py3-none-any.whl
-a---  2026/3/9 17:47      22228  cloudpickle-3.1.2-py3-none-any.whl
-a---  2026/3/9 17:47      25335  colorama-0.4.6-py2.py3-none-any.whl
-a---  2026/3/9 17:47     192620  compressed_tensors-0.13.0-py3-none-any.whl
-a---  2026/3/9 17:47    3480909  cryptography-46.0.5-cp311-abi3-win_amd64.whl
-a---  2026/3/9 17:47      43903  cuda_pathfinder-1.4.1-py3-none-any.whl
-a---  2026/3/9 17:47   96267167  cupy_cuda12x-14.0.1-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      39381  depyf-0.20.0-py3-none-any.whl
-a---  2026/3/9 17:47     120019  dill-0.4.1-py3-none-any.whl
-a---  2026/3/9 17:47      45550  diskcache-5.6.3-py3-none-any.whl
-a---  2026/3/9 17:47      20277  distro-1.9.0-py3-none-any.whl
-a---  2026/3/9 17:47     331094  dnspython-2.8.0-py3-none-any.whl
-a---  2026/3/9 17:47      36896  docstring_parser-0.17.0-py3-none-any.whl
-a---  2026/3/9 17:47      65638  einops-0.8.2-py3-none-any.whl
-a---  2026/3/9 17:47      35604  email_validator-2.3.0-py3-none-any.whl
-a---  2026/3/9 17:47      12304  fastapi_cli-0.0.24-py3-none-any.whl
-a---  2026/3/9 17:47      28359  fastapi_cloud_cli-0.14.1-py3-none-any.whl
-a---  2026/3/9 17:47     116999  fastapi-0.135.1-py3-none-any.whl
-a---  2026/3/9 17:47     490429  fastar-0.8.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      26427  filelock-3.25.0-py3-none-any.whl
-a---  2026/3/9 17:47  209703185  flashinfer_jit_cache-0.6.3-cp39-abi3-win_amd64.whl
-a---  2026/3/9 17:47    7651605  flashinfer_python-0.6.3-py3-none-any.whl
-a---  2026/3/9 17:47      44591  frozenlist-1.8.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     202505  fsspec-2026.2.0-py3-none-any.whl
-a---  2026/3/9 17:47     114244  gguf-0.18.0-py3-none-any.whl
-a---  2026/3/9 17:47      22800  grpcio_reflection-1.78.0-py3-none-any.whl
-a---  2026/3/9 17:47    4797657  grpcio-1.78.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      37515  h11-0.16.0-py3-none-any.whl
-a---  2026/3/9 17:47      78784  httpcore-1.0.9-py3-none-any.whl
-a---  2026/3/9 17:47      86694  httptools-0.7.1-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47       8960  httpx_sse-0.4.3-py3-none-any.whl
-a---  2026/3/9 17:47      73517  httpx-0.28.1-py3-none-any.whl
-a---  2026/3/9 17:47     566395  huggingface_hub-0.36.2-py3-none-any.whl
-a---  2026/3/9 17:47      71008  idna-3.11-py3-none-any.whl
-a---  2026/3/9 17:47      55500  ijson-3.5.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      23635  interegular-0.3.3-py37-none-any.whl
-a---  2026/3/9 17:47     134899  jinja2-3.1.6-py3-none-any.whl
-a---  2026/3/9 17:47     205424  jiter-0.13.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      20419  jmespath-1.1.0-py3-none-any.whl
-a---  2026/3/9 17:47      18437  jsonschema_specifications-2025.9.1-py3-none-any.whl
-a---  2026/3/9 17:47      90630  jsonschema-4.26.0-py3-none-any.whl
-a---  2026/3/9 17:47     111036  lark-1.2.2-py3-none-any.whl
-a---  2026/3/9 17:47   30332380  llvmlite-0.44.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      45418  lm_format_enforcer-0.11.3-py3-none-any.whl
-a---  2026/3/9 17:47      61595  loguru-0.7.3-py3-none-any.whl
-a---  2026/3/9 17:47      87321  markdown_it_py-4.0.0-py3-none-any.whl
-a---  2026/3/9 17:47      15105  markupsafe-3.0.3-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     233615  mcp-1.26.0-py3-none-any.whl
-a---  2026/3/9 17:47       9979  mdurl-0.1.2-py3-none-any.whl
-a---  2026/3/9 17:47    6518623  mistral_common-1.9.1-py3-none-any.whl
-a---  2026/3/9 17:47     105738  model_hosting_container_standards-0.1.13-py3-none-any.whl
-a---  2026/3/9 17:47     536198  mpmath-1.3.0-py3-none-any.whl
-a---  2026/3/9 17:47      72708  msgpack-1.1.2-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     190024  msgspec-0.20.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      46053  multidict-6.7.1-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47    2068504  networkx-3.6.1-py3-none-any.whl
-a---  2026/3/9 17:47     309975  ninja-1.13.0-py3-none-win_amd64.whl
-a---  2026/3/9 17:47    2831929  numba-0.61.2-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47   12614190  numpy-2.2.6-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47    1591041  nvidia_cudnn_frontend-1.18.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      50680  nvidia_ml_py-13.590.48-py3-none-any.whl
-a---  2026/3/9 17:47    2438369  openai_harmony-0.0.8-cp38-abi3-win_amd64.whl
-a---  2026/3/9 17:47    1136409  openai-2.26.0-py3-none-any.whl
-a---  2026/3/9 17:47   40070414  opencv_python_headless-4.13.0.92-cp37-abi3-win_amd64.whl
-a---  2026/3/9 17:47    2060945  outlines_core-0.2.11-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      74366  packaging-26.0-py3-none-any.whl
-a---  2026/3/9 17:47      10877  partial_json_parser-0.2.1.1.post7-py3-none-any.whl
-a---  2026/3/9 17:47    7033367  pillow-12.1.1-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      22424  portalocker-3.2.0-py3-none-any.whl
-a---  2026/3/9 17:47      64057  prometheus_client-0.24.1-py3-none-any.whl
-a---  2026/3/9 17:47      19296  prometheus_fastapi_instrumentator-7.1.0-py3-none-any.whl
-a---  2026/3/9 17:47      41655  propcache-0.4.1-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     437118  protobuf-6.33.5-cp310-abi3-win_amd64.whl
-a---  2026/3/9 17:47     137737  psutil-7.2.2-cp37-abi3-win_amd64.whl
-a---  2026/3/9 17:47      22335  py_cpuinfo-9.0.0-py3-none-any.whl
-a---  2026/3/9 17:47      35833  pybase64-1.4.3-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47    8044600  pycountry-26.2.16-py3-none-any.whl
-a---  2026/3/9 17:47      48172  pycparser-3.0-py3-none-any.whl
-a---  2026/3/9 17:47    2020145  pydantic_core-2.41.5-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      74296  pydantic_extra_types-2.11.0-py3-none-any.whl
-a---  2026/3/9 17:47      58929  pydantic_settings-2.13.1-py3-none-any.whl
-a---  2026/3/9 17:47     463580  pydantic-2.12.5-py3-none-any.whl
-a---  2026/3/9 17:47    1225217  pygments-2.19.2-py3-none-any.whl
-a---  2026/3/9 17:47      28224  pyjwt-2.11.0-py3-none-any.whl
-a---  2026/3/9 17:47      22101  python_dotenv-1.2.2-py3-none-any.whl
-a---  2026/3/9 17:47      15548  python_json_logger-4.0.0-py3-none-any.whl
-a---  2026/3/9 17:47      24579  python_multipart-0.0.22-py3-none-any.whl
-a---  2026/3/9 17:47    9495040  pywin32-311-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     154003  pyyaml-6.0.3-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     619480  pyzmq-27.1.0-cp312-abi3-win_amd64.whl
-a---  2026/3/9 17:47   27427353  ray-2.54.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      26766  referencing-0.37.0-py3-none-any.whl
-a---  2026/3/9 17:47     277297  regex-2026.2.28-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      64738  requests-2.32.5-py3-none-any.whl
-a---  2026/3/9 17:47      32963  rich_toolkit-0.19.7-py3-none-any.whl
-a---  2026/3/9 17:47     310458  rich-14.3.3-py3-none-any.whl
-a---  2026/3/9 17:47     726090  rignore-0.7.6-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     240463  rpds_py-0.30.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     341380  safetensors-0.7.0-cp38-abi3-win_amd64.whl
-a---  2026/3/9 17:47    1054671  sentencepiece-0.2.1-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     439198  sentry_sdk-2.54.0-py2.py3-none-any.whl
-a---  2026/3/9 17:47      13247  setproctitle-1.3.7-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47    1064234  setuptools-80.10.2-py3-none-any.whl
-a---  2026/3/9 17:47       9755  shellingham-1.5.4-py2.py3-none-any.whl
-a---  2026/3/9 17:47      11050  six-1.17.0-py2.py3-none-any.whl
-a---  2026/3/9 17:47      10235  sniffio-1.3.1-py3-none-any.whl
-a---  2026/3/9 17:47      14270  sse_starlette-3.3.2-py3-none-any.whl
-a---  2026/3/9 17:47      74272  starlette-0.52.1-py3-none-any.whl
-a---  2026/3/9 17:47     320736  supervisor-4.3.0-py2.py3-none-any.whl
-a---  2026/3/9 17:47    6299353  sympy-1.14.0-py3-none-any.whl
-a---  2026/3/9 17:47      39814  tabulate-0.10.0-py3-none-any.whl
-a---  2026/3/9 17:47     878694  tiktoken-0.12.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47    2747786  tokenizers-0.22.2-cp39-abi3-win_amd64.whl
-a---  2026/3/9 17:47  113757972  torch-2.10.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      78374  tqdm-4.67.3-py3-none-any.whl
-a---  2026/3/9 17:47   11993498  transformers-4.57.6-py3-none-any.whl
-a---  2026/3/9 17:47   47382693  triton_windows-3.6.0.post25-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47      56085  typer-0.24.1-py3-none-any.whl
-a---  2026/3/9 17:47      44614  typing_extensions-4.15.0-py3-none-any.whl
-a---  2026/3/9 17:47      14611  typing_inspection-0.4.2-py3-none-any.whl
-a---  2026/3/9 17:47     131584  urllib3-2.6.3-py3-none-any.whl
-a---  2026/3/9 17:47      68783  uvicorn-0.41.0-py3-none-any.whl
-a---  2026/3/9 17:49  269063485  vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     288410  watchfiles-1.1.1-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47     178693  websockets-16.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47       4083  win32_setctime-1.2.0-py3-none-any.whl
-a---  2026/3/9 17:47     671496  winloop-0.5.0-cp312-cp312-win_amd64.whl
-a---  2026/3/9 17:47    2639032  xformers-0.0.35.dev1121-cp39-abi3-win_amd64.whl
-a---  2026/3/9 17:47      87674  yarl-1.23.0-cp312-cp312-win_amd64.whl
```

Note that the resolved dependency set includes `torch-2.10.0`, not the 2.7.1+cu126 build used for compilation, so installing everything from this directory would likely replace the existing torch. The `--no-deps` install in section 7.2 avoids that.

## 9. Comparison with the Previous Article

| | Previous article (cu128) | This article (cu126) |
| --- | --- | --- |
| CUDA version used for the build | 12.8 | 12.6 |
| Match with torch | Partial | Exact ✅ |
| `subst Z:` points to | CUDA 12.8 directory | CUDA 12.6 directory |
| Wheel size | 269 MB | 269 MB |
| Build date | 20260308 | 20260309 |

Both wheels can be kept side by side; install whichever matches the torch+cuda version of the target environment.

## 10. References

- SystemPanic/vllm-windows
- Previous article: Complete Guide to Building vLLM 0.16 from Source on Windows 11 (cu128)
- Complete Guide to Configuring Multiple CUDA/cuDNN Versions on Windows
- Complete Guide to Building CUDA Extensions on Windows
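
As a closing illustration of the advice in section 9 (pick the wheel that matches the target environment's torch+cuda version), here is a minimal sketch that filters wheel file names by their CUDA tag. The helper names `wheel_cuda_tag`, `cuda_version_to_tag`, and `matching_wheels` are hypothetical, the cu128 file name is a placeholder rather than the real artifact from the previous article, and the sketch assumes the tag appears as `.cuXYZ-` in the file name, as it does for the wheel built here.

```python
import re


def wheel_cuda_tag(wheel_name):
    """Return the 'cuXYZ' tag embedded in a wheel file name, or None if absent."""
    match = re.search(r"\.(cu\d+)-", wheel_name)
    return match.group(1) if match else None


def cuda_version_to_tag(cuda_version):
    """Turn a CUDA version string such as '12.6' into a tag such as 'cu126'."""
    return "cu" + cuda_version.replace(".", "")


def matching_wheels(wheel_names, cuda_version):
    """Keep only the wheels whose CUDA tag equals the tag for cuda_version."""
    wanted = cuda_version_to_tag(cuda_version)
    return [name for name in wheel_names if wheel_cuda_tag(name) == wanted]


if __name__ == "__main__":
    wheels = [
        "vllm-0.16.0rc2.dev243+gc8e1f5abe.d20260309.cu126-cp312-cp312-win_amd64.whl",
        # Placeholder name for a cu128 build; not the real file name.
        "vllm-0.0.0.cu128-cp312-cp312-win_amd64.whl",
    ]
    # In a real environment the version would come from torch.version.cuda.
    print(matching_wheels(wheels, "12.6"))
```

In the target environment, the second argument would typically be the value of `torch.version.cuda`, which is the same string checked in section 1.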