Xinyu Dong 董新宇

AI Infra Engineer at Baidu. Core contributor to the vLLM-Kunlun community, building high-performance LLM inference engines for Kunlun XPU.

vLLM · vLLM-Kunlun · XGrammar · LLM Inference · Kunlun XPU
OPEN SOURCE

Contributions & Work

Based on actual GitHub pull request records.

vLLM-Kunlun: a community-maintained vLLM hardware plugin for Kunlun XPU. Supports 15+ mainstream LLMs with quantization, LoRA, and multi-modal capabilities. 390 stars on GitHub.

🧠 Model Support (7)
#233 Support qwen3-next model in v0.15.1 (Merged)
#84 DeepSeek Support MTP (Multi-Token Prediction) (Merged)
#62 Support XiaoMi MIMO Flash V2 (Merged)
#71 Support gpt-oss and update model list (Merged)
#19 Support llama3 on v0.11.0 (Merged)
#261 Recover use reshape and cache kernel to update mamba cache (Merged)
#195 DeepSeekV2 Add indexer_rotary_emb to control MLA rope style (Open)

Kernel & Performance (5)
#277 Use kernel to fast GemmaRMSNorm (Merged)
#265 Reduce Host and device sync in Qwen3.5 (Merged)
#224 Register custom_op for kunlun graph (torch compile) (Merged)
#177 Migrate XTorch operations to Kunlun operations (accelerating iteration) (Merged)
#3 Enable fast random sample on Kunlun3 Platform (Merged)

🔄 v0.15.1 Major Upgrade (6)
#209 Implement and register Fused MoE Kunlun kernels using OOT method (Merged)
#227 Partially supports torch compile (Merged)
#212 Remove V0 code and fix circular reference (Merged)
#203 Unified the registration of custom operators to torch.ops (Merged)
#201 Fixed Kunlun plugin initialization failure due to circular references (Merged)
#202 Optimize utils, remove VLLM_USE_V1 check, correct dependency sources (Merged)

🔧 Bugfix (6)
#288 Fixed MiniMax-M2 parser failed to validate function names (Merged)
#285 enable_thinking: False, Qwen3.5 model returns error in content stream (Merged)
#262 Fix function call calling xgrammar failed (Merged)
#252 Fix cache indices problem for Qwen3.5-Moe (Merged)
#231 Fixed the distributed environment initialization issue (Merged)
#193 Fixed Kunlun Graph Failed (Merged)

CONNECT

Get In Touch

Interested in AI inference, vLLM contributions, or hardware-software co-design? Feel free to reach out.

© 2026 Xinyu Dong · Built with React + Vite