CV

Click the PDF icon on the right to download my CV.

Basics

Name: Yun Zhang
Label: First-year PhD student at UCLA's Mobility Lab
Email: yun666@g.ucla.edu
Phone: (310) 694-6791
Url: https://HandsomeYun.github.io/
Summary: Researcher specializing in physical intelligence, autonomous driving, and computer vision.

Education

  • 2025.09 - Present

    United States

    PhD
    University of California, Los Angeles (UCLA)
    Mobility Lab
  • 2021.09 - 2025.06

    United States

    Undergraduate
    University of California, Los Angeles (UCLA)
    B.S. in Mathematics in Computer Science, B.S. in Statistics and Data Science
    • Cumulative GPA: 3.823/4.0
    • Dean's Honors List (Fall 2021, Winter/Spring/Fall 2022, Winter/Spring 2023)
  • 2015.09 - 2021.06

    Athens, Greece

    High School
    American Community School of Athens (ACS Athens)
    • Weighted Cumulative GPA: 4.886/4.0
    • Final IB Score: 44/45

Publications

  • 2025.09.01
    MIC-BEV: Multi-Infrastructure Camera Bird's-Eye-View Transformer with Relation-Aware Fusion for 3D Object Detection
    Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI); Best Paper Award at the DriveX Workshop
    First Author. MIC-BEV is a Transformer-based framework for multi-camera infrastructure perception. It performs 3D object detection and BEV segmentation by fusing features from multiple cameras through a geometry-aware graph module. Designed for diverse camera setups and harsh conditions, MIC-BEV remains robust under sensor degradation. To support this, we introduce M2I, a synthetic dataset covering varied layouts, weather, and viewpoints. Experiments on both M2I and the real-world RoScenes dataset show that MIC-BEV achieves state-of-the-art performance and reliability for real-world deployment. (The workshop version is currently being released.)
  • 2025.03.09
    AutoVLA: Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
    Accepted to the Conference on Neural Information Processing Systems (NeurIPS)
    Fourth Author. AutoVLA is a vision-language-action model for end-to-end autonomous driving with adaptive reasoning and reinforcement fine-tuning capabilities.
  • 2025.03.09
    InSPE: Rapid Evaluation of Heterogeneous Multi-Modal Infrastructure Sensor Placement
    Submitted to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)
    Co-first Author. This paper introduces InSPE, a framework for evaluating heterogeneous multi-modal infrastructure sensor placement. It integrates metrics such as sensor coverage, occlusion, and information gain, and is supported by a new dataset and benchmarking experiments to optimize perception at intelligent intersections.
  • 2025.03.09
    AgentAlign: Misalignment-Adapted Multi-Agent Perception for Resilient Inter-Agent Sensor Correlations
    Submitted to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)
    Second Author. This work presents AgentAlign, a real-world multi-agent perception framework that mitigates multi-modality misalignment in cooperative autonomous systems using cross-modality feature alignment and introduces the V2XSet-Noise dataset for robust evaluation.
  • 2025.03.09
    RelMap: Enhancing Online Map Construction with Class-Aware Spatial Relation and Semantic Priors
    Submitted to the 40th Annual AAAI Conference on Artificial Intelligence (AAAI)
    Second Author. RelMap is an online HD map construction framework that enhances vectorized map generation using class-aware spatial relations and semantic priors, significantly improving accuracy and data efficiency on nuScenes and Argoverse 2 datasets.
  • 2025.03.09
    V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction
    Accepted to the IEEE/CVF International Conference on Computer Vision (ICCV)
    Sixth Author. V2XPnP proposes a spatio-temporal fusion framework for multi-agent perception and prediction, leveraging a Transformer-based architecture and a novel sequential dataset to benchmark when, what, and how to fuse information in V2X scenarios.

Research

  • 2024.05 - 2024.08
    Researcher
    HKU Summer Research Program
    Leveraged large language models (MiniGPT-4) for multi-modality brain tumor segmentation, integrating four distinct MRI modalities (T1c, T1w, T2w, and FLAIR) into a common space to enhance segmentation accuracy.
    • Awarded Best Presenter and received a PhD offer with a Presidential Scholarship.
  • 2023.02 - 2025.09
    Research Assistant
    Mobility Lab, UCLA
    Contributed to research on multi-agent perception, sensor fusion, and infrastructure-aware autonomous driving, co-authoring five papers on multi-modal sensor placement (InSPE), misalignment adaptation in cooperative perception (AgentAlign), class-aware map construction (RelMap), spatio-temporal fusion for V2X perception (V2XPnP), and real-world cooperative perception datasets (V2X-ReaLO).
    • Participated in the U.S. DOT Intersection Safety Challenge and won a $750,000 award.
  • 2023.01 - 2024.12
    Research Assistant
    Vwani Roychowdhury's Lab, UCLA
    Contributed to the implementation and deep learning models of the Hilbert (HIL) detector in PyHFO, a multi-window desktop application providing time-efficient HFO detection algorithms with artifact and HFO-with-spike classification.
    • Reduced detection run-time 50-fold compared with the state of the art, with a comparative study to verify correctness.

Work

  • 2023.06 - 2023.12
    AI/Data Analyst Intern
    Office of Palo Alto Councilmember Greg Tanaka
    Analyzed voter data from public social media, HubSpot, and voter profiles within a California congressional district, identifying trends and developing predictive models to anticipate voting behavior.
    • Utilized LLMs to generate personalized campaign emails and to support campaign services, increasing efficiency.
  • 2022.12 - 2023.03
    Data Analysis Intern
    Uber, Hong Kong
    Participated in the facial mask recognition project during the COVID-19 pandemic, contributing to backend utilities.
    • Performed in-depth analysis and demand forecasting for Uber's regional operations, evaluating the influence of factors such as humidity, wind, time of day, and origin and destination.

Skills

Autonomous Systems & Simulation: CARLA, OpenCDA, OpenSCENARIO Documentation, Scenario Runner
Multi-Agent & Cooperative Perception: V2X, Cooperative Perception, Sensor Fusion, Multi-Agent Perception, Intermediate Fusion, Multi-Sensor Misalignment
Programming Languages: Python, C++, JavaScript, C#, R, LaTeX, Bash/Shell Scripting
Machine Learning & Data Science: PyTorch, TensorFlow, Scikit-learn, Pandas, NumPy, MATLAB, Jupyter Notebooks
Medical Imaging & Biomedical Analysis: Segment Anything Model (SAM), nnUNet, BraTS, Image Segmentation
DevOps & Cloud Computing: Docker, AWS, Git, GitKraken
Web Development & Frontend Technologies: React, Node.js, HTML, CSS, JavaScript, Tableau

Languages

Chinese: Native speaker
English: Fluent

Interests

Cooking: Chinese Cuisine, Japanese Cuisine, Western Cuisine, Desserts, Fusion Cooking