Sen Yang

TL;DR (Overview)

# **Sen Yang's Personal Website**
## About Me
- **Computer Vision Researcher**
- Research Interests:
  - Computer Vision
  - Multimodal Large Language Models
  - Autonomous Driving
## Education
- **Ph.D.**: Southeast University (2019.5-2023.3)
- Master: Southeast University (2017.9-2019.1)
- Bachelor: Jilin University (2013.9-2017.7)
## Work Experience
- **Baidu VIS Senior R&D Engineer** (2023.7-Present)
- Tencent PCG Intern (2021.12-2022.8)
- Megvii Intern (2021.1-2021.10)
## Research Publications
- **Autonomous Driving**
  - TopoSD: Topology-Enhanced Lane Segment Perception
  - MGMapNet: Multi-Granularity Representation Learning
- **Multimodal Large Models**
  - Vision Remember: Alleviating Visual Forgetting in Efficient MLLM
- **Pose Estimation**
  - Detecting and grouping keypoints
  - Capturing the motion of every joint
  - Searching part-specific neural fabrics
  - SimCC: A Simple Coordinate Classification
  - TokenPose: Learning Keypoint Tokens
  - TransPose: Keypoint Localization via Transformer
## Technical Skills
- **Multimodal Large Models**
  - MLLM Architectures: LLaVA, Qwen2.5-VL, LISA
  - Training Techniques: SFT, Autoregressive Models, RL
  - Visual Token Compression, Large-scale Distributed Training
- **Autonomous Driving Perception**
  - BEV Visual Mapping, Temporal Modeling
  - Multimodal Fusion: Vision + Map Structured Data
  - Navigation Map Integration, Probabilistic Planning
- **Deep Learning Frameworks**
  - PyTorch, Python, C++
  - Transformer Models, GPU/Ascend NPU Development
## Contact Information
- Email: yangsenius@gmail.com
- Blog: senyang-ml.github.io
- Google Scholar Profile


About Me

I am a research engineer at Baidu, working primarily on computer vision, multimodal large language models, and autonomous driving. I received my Ph.D. from Southeast University in 2023. My research focuses on computer vision and deep learning, with particular attention to 2D/3D human pose estimation, autonomous driving perception, and visual multimodal foundation models. I am passionate about developing solutions that combine cutting-edge research with practical applications.

My research interests include:

Computer Vision
Deep Learning
Human Pose Estimation
Autonomous Driving Perception
Multimodal Foundation Models

Work and Internship Experience

Baidu VIS

Senior R&D Engineer

2023.7 - Present

Responsible for research and applied work on multimodal large models, computer vision perception, and decision-making algorithms. My work covers the full process from algorithm design to product deployment, with a focus on turning research results into practical business value across several core areas.

Tencent PCG

Intern

2021.12 - 2022.8

Responsible for a 3D human reconstruction and motion generation project. Proposed an independent token representation method based on the parametric SMPL model, achieving high-precision 3D human reconstruction and joint motion capture and improving 3DPW metrics by 8%. The paper was published at ICLR 2023 (spotlight, top 25%).

Megvii Technology

Intern

2021.1 - 2021.10

Participated in human pose estimation projects. Designed a Transformer-based pose estimation model using token representations (ICCV 2021). Studied attention patterns in Transformers (Pattern Recognition). Pioneered SimCC, a new coordinate classification paradigm that breaks through the precision bottleneck of traditional regression and heatmap methods (ECCV 2022 oral, adopted as a core solution by mainstream pose estimation frameworks).

Research Publications

HisTrackMap: Global Vectorized High-Definition Map Construction via History Map Tracking

Jing Yang*, Sen Yang*, Xiao Tan, Hanli Wang.

arXiv preprint arXiv:2503.07168, 2025

Paper Project

TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior

Sen Yang, Minyue Jiang, Ziwei Fan, Xiaolu Xie, Xiao Tan, Yingying Li, Errui Ding, Liang Wang, Jingdong Wang.

2024 Preprint

Paper

MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction

Jing Yang*, Minyue Jiang*, Sen Yang*, Xiao Tan, Yingying Li, Errui Ding, Hanli Wang, Jingdong Wang.

ICLR 2025

PDF

Detecting and grouping keypoints for multi-person pose estimation using instance-aware attention

Sen Yang, Ze Feng, Zhicheng Wang, Yanjie Li, Shoukui Zhang, Zhibin Quan, Shu-Tao Xia, Wankou Yang.

Pattern Recognition

Journal Paper

Capturing the motion of every joint: 3D human pose and mesh recovery with independent tokens

Sen Yang, Wen Heng, Gang Liu, Guozhong Luo, Wankou Yang, Gang Yu.

ICLR 2023 (spotlight, top 25%)

Paper Code Project

Searching part-specific neural fabrics for human pose estimation

Sen Yang, Wankou Yang, Zhen Cui.

Pattern Recognition

Journal Paper Code

SimCC: A Simple Coordinate Classification perspective for human pose estimation

Yanjie Li, Sen Yang, Peidong Liu, Shoukui Zhang, Yunxiao Wang, Zhicheng Wang, Wankou Yang, Shu-Tao Xia.

ECCV 2022 (oral, top 5%) (cited 200+ times)

Paper Code Zhihu

TokenPose: Learning Keypoint Tokens for Human Pose Estimation

Yanjie Li, Shoukui Zhang, Zhicheng Wang, Sen Yang, Wankou Yang, Shu-Tao Xia, Erjin Zhou.

ICCV 2021 (cited 400+ times)

Paper Code

TransPose: Keypoint Localization via Transformer

Sen Yang, Zhibin Quan, Mu Nie, Wankou Yang.

ICCV 2021 (cited 500+ times)