Yunkai (James) Zhan

+1 (213)-716-2016 | zhanyunkai603@gmail.com | linkedin.com/in/yunkai-zhan

Education

University of Southern California, Viterbi School of Engineering

Aug 2023 — May 2027

Bachelor of Science, Computer Science | Applied Math (double major)

Experience

Currents AI | Machine Learning Engineer (Internship) — Social World Model, Post Training

Oct 2025 — Present

Building an LLM-based Celebrity Simulator that models the tone, knowledge, belief systems, and decision-making of the top 100 political, business, and finance figures, using a post-training pipeline and an agentic framework.
Establishing a post-training pipeline with multi-source data collection and processing, Supervised Fine-Tuning (SFT) on Qwen, and reinforcement learning with a generator-discriminator framework.
Participating in the company's web agent extension launch, including test script writing and marketing planning.

Alibaba Cloud | Software Engineer (Internship) — Knowledge Graph, RAG

Jun 2024 — Aug 2024

Pioneered the industry's first RAG multi-hop question set auto-generator for the Chinese finance sector; evaluated and improved the model's retrieval accuracy by 20%.
Developed a Python pipeline based on Knowledge Graph (Neo4j), vector embeddings, and prompt engineering to auto-generate more challenging questions for the RAG system.
Applied the pipeline to the internal RAG evaluation of Qwen Dianjin, Alibaba's finance AI assistant, improving product satisfaction among 30+ leading Chinese financial companies.

Projects

Multimodal LLM Adaptive Thinking | Academic Research — MLLM Post-Training

Feb 2026 — Present

Post-trained multimodal LLMs (Qwen3-4B-VL) across diverse vision-language tasks under both thinking and non-thinking modes; identified a bidirectional performance gap where thinking mode outperforms by ~15% on geometry and math but underperforms by 5–10% on VQA and counting tasks.
Designing GRPO-based training with mode-conditioned reward shaping to unify reasoning capabilities, enabling a single model to adaptively leverage extended or compressed thinking without sacrificing performance in either regime.

Research Assistant — USC HUMANS Lab | Academic Research — Social Simulation, Agent-Based Modeling, RL

May 2025 — Present

Contributed to a WWW 2026 submission on simulating LLM-driven influence operations (IO) using multi-agent generative modeling in social media environments.
Helped implement a Generative Agent-Based Model (GABM) of 40 organic and 10 IO agents using AutoGen and Llama 3.3-70B, exploring coordination under varying operational awareness levels.
Researching the application of reinforcement learning to multi-agent social media simulation to improve simulated online marketing engagement rate and purchasing intent uplift.

Research Assistant — USC Media Communication Lab | Academic Research — RL, Green Learning

May 2025 — Present

Conducting research with Prof. Jay Kuo on the intersection of Reinforcement Learning and Green Learning — an explainable, energy-efficient alternative to conventional deep learning pipelines.
Surveyed and reproduced 10+ papers on the main Green Learning approaches on standard vision benchmarks, focusing on feature engineering (variants of PCA, Sparse Coding) and XGBoost.
Researching the application of the Green Learning pipeline to Policy Gradient methods as an alternative to Deep Policy Networks, with high interpretability, efficiency, and superior performance on small datasets.

LLM-Powered Learning Assistant | Project — Next.js, TypeScript, MongoDB, RAG, LLM

May 2025 — Aug 2025

Engineered a production-ready Retrieval-Augmented Generation (RAG) pipeline using FastAPI, embedding course materials with Voyage-3 into an L2-normalized FAISS vector index for high-speed semantic search.
Orchestrated a multi-model LLM strategy, leveraging Claude 3.5 Haiku for efficient document summarization and Claude 3.7 Sonnet for context-aware answers, achieving 99% concept accuracy.
Built a full-stack interface featuring an interactive document viewer and persistent chat backed by MongoDB for scalable storage of large PDFs.

Skills

Coding:

Python (NumPy, Pandas, Scikit-Learn, PyTorch), C++, Java, MATLAB

Web Development:

React, Node.js, HTML, CSS, TypeScript, JavaScript, Spring Boot, MySQL, Git, Docker

ML / AI:

LLM Post-Training (SFT, RLHF, GRPO), RAG, Knowledge Graphs (Neo4j), Vector Embeddings (FAISS), Multi-Agent Systems (AutoGen), Green Learning

Relevant Coursework:

Algorithms, Data Structures, Software Engineering, Machine Learning, Linear Algebra, Probability and Statistics, Data Science
