Zhen Wang / 王震
Hi! I'm Zhen, currently a postdoctoral researcher at UC San Diego, working with Prof. Zhiting Hu and Prof. Eric P. Xing, focusing on advancing foundation agentic systems and scientific discovery. I obtained my PhD from The Ohio State University, advised by Prof. Huan Sun, where I developed foundational frameworks for knowledge-centric NLP systems.
I've been fortunate to work with exceptional researchers such as Rameswar Panda, Yoon Kim, Nebojsa Jojic, Nikolay Malkin, Leonid Karlinsky, and Bo Zong across premier industrial labs (MIT-IBM Lab, Microsoft Research, NEC Labs America) and academic institutions (UCSD, CMU, MBZUAI). I've been honored with the OpenAI Agentic AI Research Grant, SoCal NLP 2023 Best Paper Award, Alexa Prize TaskBot Challenge 2022, and Rising Star in Data Science 2021.
Outside of research, you'll find me exploring hiking trails, playing pickleball, or planning my next adventure in national parks. I'm also a passionate sports fan, cheering for the Buckeyes, Dodgers, Lakers, Inter Miami, and Chiefs (for reasons unrelated to tight ends).
Email /
GitHub /
Twitter /
Google Scholar
|
On a rooftop in Anchorage, Alaska, 2019
|
Research Overview
Building Trustworthy Systems that Perceive, Think, and Act: Today's most advanced AI systems, despite their impressive capabilities, remain fundamentally reactive — unable to actively explore possibilities, strategically plan actions, or safely adapt their behavior in the real world. This limitation becomes increasingly critical as AI systems are deployed in scenarios requiring sustained interaction, complex reasoning, and reliable real-world engagement.
My research establishes Foundation Agentic Systems that transform how AI engages with complex worlds across the perception-cognition-action loop while ensuring reliable governance.
-
World Model-based Simulation and Planning: At its core, active intelligence requires the ability to simulate and reason about potential futures. My research introduces a principled world-model formulation that enables AI systems to simulate actively and plan strategically. [LLM-reasoners, COLM'24; PromptAgent, ICLR'24]
-
Structured Reasoning and Inference-compute Scaling: Structure, whether explicit in graphs or implicit in language spaces, is the key to achieving reliable and interpretable reasoning. [RAP, EMNLP'23; ThinkSum, ACL'23; SurfCon, KDD'19]
-
Efficient Adaptation and Real-world Interaction: Real-world deployment demands extreme computational efficiency in behavioral adaptation and real-world interaction. My research achieves this through minimal-overhead architectures and parameter-efficient techniques. [ToolkenGPT, NeurIPS'23 Oral; MPT, ICLR'23]
-
Scalable Methods for Safety, Alignment, and Oversight: Governance must scale with AI capabilities without unsustainable scaling of computational or human resources. Our approaches pioneer algorithmic solutions that grow more effective as systems become more capable. [DRPO, EMNLP'24; Decentralized Arena]
Research Opportunities: I am always looking for highly motivated students, particularly from underrepresented groups, to join my research projects during the school year and over the summer. If you are interested in LLM augmentation (reasoning, tool use, planning, etc.), LLM agents, or AI4Science research, please email me to express your interest.
|
News
-
10/2024: Excited to release Decentralized Arena, a decentralized and democratic LLM benchmarking system where all LLMs participate in judging each other. It's an automated, fully transparent, faster, more scalable, and less biased version of Chatbot Arena. Check our HF leaderboard.
-
10/2024: Happy to help release TxT360: A Top-Quality LLM Pre-training Dataset Requires the Perfect Blend. It is the first dataset to globally deduplicate 99 CommonCrawl snapshots and 14 high-quality data sources. Download the dataset for your LLM pre-training here.
-
09/2024: DRPO (Dynamic Rewarding with Prompt Optimization) was accepted to the main conference of EMNLP 2024. It is the first tuning-free method to self-align LLMs with human preferences, without any model tuning or human preference annotations.
-
07/2024: Excited to be selected to attend the 2024 AI+Science Summer School at the University of Chicago!
-
07/2024: LLM Reasoners has been accepted to COLM 2024. Congrats to the LLM Reasoners team. Check out our GitHub package (1k+ stars) for advanced reasoning with LLMs!
-
02/2024: Received a research grant from OpenAI to support our research in agentic systems!
-
01/2024: Wrote a blog post, Reflecting on ChatGPT’s First Year: Evolutions, Twists, and Smooth Directions, discussing recent NLP research directions heavily impacted by ChatGPT-related techniques. Comments are welcome!
-
01/2024: Invited to serve as a Reviewer for KDD 2024, ICML 2024, and COLM 2024 (The first conference on language modeling! Happy to contribute to making this a success!)
-
01/2024: PromptAgent was accepted to ICLR 2024! It is the first principled framework to formalize the problem of API-based discrete prompt optimization and to benchmark exploration efficiency and prompt transferability!
-
12/2023: Attended NeurIPS 2023 in NOLA! Happy to meet many old and new friends!
-
11/2023: Honored to receive the Top Reviewer Award of NeurIPS 2023. Thanks for the complimentary registration!
-
11/2023: Honored to receive the Best Paper Award at SoCal NLP 2023.
-
11/2023: Three papers, RAP, ToolkenGPT, and PromptAgent, will be presented at SoCal NLP 2023 at UCLA on Nov. 17th, 2023.
-
10/2023: Reasoning via Planning has been featured in the recently published State of AI Report 2023. Check the great report here!
-
10/2023: The Reasoning via Planning paper, which augments LLM reasoning with external world models and principled planning, was accepted to the main conference of EMNLP 2023!
-
09/2023: The ToolkenGPT paper, which augments LLMs with efficient and plug-and-play tool learning, was accepted as an oral presentation at NeurIPS 2023!
-
08/2023: Invited to serve as a Reviewer for ICLR 2024.
-
07/2023: Invited to serve as a PC member for AAAI 2024.
-
06/2023: Invited to give a talk at George Mason University on June 13th, 2023.
-
06/2023: Invited to give a talk at North Carolina State University on June 3rd, 2023.
-
06/2023: Invited to serve as a Reviewer for EMNLP 2023.
-
05/2023: One paper about prompting LLMs with probabilistic reasoning was accepted to ACL 2023!
-
03/2023: Invited to serve as a Reviewer for NeurIPS 2023.
-
03/2023: Invited to serve as an Area Chair for NLPCC 2023.
-
02/2023: Kicked off my postdoc adventure! 😎 Super stoked to work with the brilliant minds from UCSD, CMU, MBZUAI, and beyond to push the boundaries of those incredible large language models and harness their power for human society and other scientific domains! 🚀🔬💡
-
01/2023: One paper, Multitask Prompt Tuning, was accepted to ICLR 2023!
-
01/2023: One paper, Entity Tracking via Effective Use of Multi-Task Learning Models, was accepted to EACL 2023!
-
01/2023: Invited to serve as a Reviewer for ICML 2023.
-
12/2022: Invited to serve as a PC Member for KDD 2023 Research Track.
-
12/2022: Happy to finish my first in-person NLP course teaching, and I'm impressed by what students have achieved after learning the very advanced NLP techniques. Check out this quick summary of their amazing final projects!
-
11/2022: Passed the Ph.D. dissertation defense, Toward Knowledge-centric Natural Language Processing: Acquisition, Representation, Transfer, and Reasoning. Thanks to all my committee members, Prof. Huan Sun, Srinivasan Parthasarathy, Yu Su, and Wei-Lun Chao.
-
08/2022: Will be teaching CSE 5525: Foundations of Speech and Language Processing (Undergrad & Graduate) as an instructor at OSU this fall.
-
08/2022: Invited to serve as a PC member for AAAI 2023.
-
07/2022: Attended NAACL 2022 in Seattle. Presented our CQA Knowledge Transfer paper. Glad to meet all my old and new friends!
-
06/2022: Invited to serve as a PC member for EMNLP 2022.
-
06/2022: Happy to give a talk about Efficient Adaptation of Large Language Models at the MIT Summer Working Group on Large LMs.
-
06/2022: Our TacoBot earned the third-place honor in the inaugural Alexa Prize TaskBot Challenge!
-
06/2022: Happy to give a tutorial about natural language processing and large language models at the TDAI Deep Learning Summer School along with Prof. Huan Sun. Slides can be found here.
-
05/2022: Excited to join the MIT-IBM Watson AI Lab in Cambridge, MA as a research intern working with Rameswar Panda and Yoon Kim on efficient adaptation/pruning for large language models.
-
03/2022: Our team TacoBot advanced to the finals of the Alexa Prize TaskBot Challenge!
-
03/2022: Honored to receive the 2022 Graduate Research Award of the CSE department.
-
02/2022: One paper, Coherence boosting: When your pretrained language model is not paying enough attention, was accepted to ACL 2022!
-
02/2022: Our team TacoBot advanced to the semifinals of the Alexa Prize TaskBot Challenge! Try "Alexa, let's work together" or "Alexa, assist me" on your Alexa devices or in the app when you want to do a DIY or cooking task!
-
12/2021: Passed the Ph.D. candidacy exam with the proposal, "Knowledge-centric Natural Language Processing: Acquisition, Representation, and Reasoning." Thanks to all my committee members, Prof. Huan Sun, Srinivasan Parthasarathy, Yu Su, and Wei-Lun Chao.
-
05/2021: Our team was selected to participate in the Alexa Prize TaskBot Challenge as one of 10 teams out of 125 applications from 15 countries! Looking forward to building a smart dialogue system to guide users through complex, multi-step plans (e.g., cooking and DIY tasks) via multimodal interactions.
-
05/2021: Started as a Research Intern at Microsoft Research! Excited to work with Nebojsa Jojic and Kolya Malkin on Conditional Text Generation!
-
03/2021: Invited to serve as a PC member of EMNLP 2021.
-
03/2021: Attended WSDM 2021 virtually and presented our work on modeling context pair interaction and learning graph pair embeddings.
-
03/2021: Honored to win the Graduate Student Research Poster Award (Top 5) at the 2021 Annual Student Research Poster Exhibition in our CSE department.
-
02/2021: Served as a panelist in the Department of Astronomy's panel discussion, "2001: A Space Odyssey - Science Fiction vs. Science Fact", discussing Science Fiction, Life in the Universe, Time & Relativity, and Anthropology.
-
01/2021: Received SIGIR Student Travel Grant for WSDM 2021.
-
01/2021: Attended CDAC Rising Stars in Data Science Workshop with the agenda here.
-
12/2020: Honored to be selected for the Rising Stars in Data Science workshop hosted by the Center for Data and Computing (CDAC) at the University of Chicago.
-
12/2020: Invited to serve as a PC member of ACL 2021.
-
10/2020: Invited to serve as a PC member of NAACL 2021.
-
10/2020: One paper, Modeling Context Pair Interaction for Pairwise Tasks on Graphs, was accepted to WSDM 2021 (Acceptance Rate: 18.6%).
-
07/2020: Attended ACL 2020 virtually and presented our paper via QA sessions and pre-recorded video.
-
05/2020: Started as a Research Intern at NEC Labs America! Excited to work with Dr. Bo Zong on Commonsense Reasoning for NLU.
-
04/2020: One paper, Rationalizing Medical Relation Prediction from Corpus-level Statistics, about building a self-interpretable deep learning model for relation prediction, was accepted by ACL 2020 (Acceptance Rate: 22.7%).
-
04/2020: Invited to serve as a PC member of NLPCC 2020.
-
09/2019: One paper, Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations, was accepted by Bioinformatics.
-
08/2019: Attended KDD 2019 in Anchorage, Alaska, and presented our work in an oral talk.
-
06/2019: Received SIGKDD Student Travel Award for KDD 2019.
-
04/2019: One paper, SurfCon: Synonym Discovery on Privacy-Aware Clinical Data, about knowledge extraction from text, was accepted by KDD 2019 (Research Track, Acceptance Rate: 14.2%, Oral).
-
08/2018: One paper about Code Summarization was accepted by KDD 2018, Deep Learning Day.
-
06/2018: Attended NAACL 2018 in New Orleans, LA.
|
|
→ Decentralized Arena introduces an automated, scalable system for evaluating LLMs across specialized domains, where the models themselves participate in assessing each other. This democratic evaluation approach achieves 95% correlation with Chatbot Arena rankings while maintaining full transparency and reproducibility.
Blog / Leaderboard
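The idea of models assessing each other can be illustrated with a small sketch. This is not the Decentralized Arena implementation: the `judge` callback, the toy skill scores, the rule that a model sits out comparisons it takes part in, and the win-rate aggregation are all assumptions made for illustration.

```python
from itertools import combinations

def rank_models(models, judge):
    """Rank models by win rate, letting every other model judge each pair.

    `judge(j, a, b)` is a hypothetical callback that returns whichever of
    a or b the judge model j prefers.
    """
    wins = {m: 0 for m in models}
    games = {m: 0 for m in models}
    for a, b in combinations(models, 2):
        for j in models:
            if j in (a, b):
                continue  # assumption: a model does not judge its own comparison
            winner = judge(j, a, b)
            wins[winner] += 1
            games[a] += 1
            games[b] += 1
    # Assumption: plain win rate stands in for the real aggregation method.
    def win_rate(m):
        return wins[m] / games[m] if games[m] else 0.0
    return sorted(models, key=win_rate, reverse=True)

# Toy judge: every judge prefers the model with the higher made-up skill score.
toy_skill = {"model_a": 3, "model_b": 2, "model_c": 1}

def toy_judge(j, a, b):
    return a if toy_skill[a] >= toy_skill[b] else b

ranking = rank_models(list(toy_skill), toy_judge)
```

In a real setting, `judge` would prompt the judge model with both candidates' answers to the same question and parse its preference.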
|
|
→ This paper introduces DRPO, a novel tuning-free approach that enables LLMs to achieve self-alignment through dynamic rewarding and search-based optimization, eliminating the need for costly human annotations and model training. This paves the way toward cost-effective and scalable AI alignment.
[EMNLP 2024] PDF / Code / Poster
|
|
→ Tired of manual prompt engineering? PromptAgent offers the first principled framework to formalize the problem of API-based prompt optimization (state, action, reward, etc.); it is also the first to benchmark exploration efficiency and show the transferability of optimized prompts. Targeting expert-level prompting, there are many exciting directions ahead of PromptAgent!
[ICLR 2024] PDF / Code / Slides / Poster
|
|
→ LLMs lack internal world models for effective reasoning. Reasoning via Planning (RAP) reformulates LLM reasoning as a planning problem, thus incorporating an external world model and principled planning seamlessly. This is a new framework applicable across varying tasks and an exciting direction for LLM augmentation research.
[EMNLP 2023] (Oral, Main) PDF / Code / Slides / Poster / Featured in State of AI Report 2023
|
|
→ ToolkenGPT augments LLMs with massive tools/APIs by representing tools as tokens (“toolken”) and enabling tool calls in the same way as generating regular words. ToolkenGPT is super efficient for learning massive tools, as plugging in new tools is as easy as learning embeddings.
[NeurIPS 2023] (Oral) PDF / Code / Slides / Poster / SoCal NLP 2023 Best Paper Award
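A toy sketch of the "tools as tokens" idea described above, not the released ToolkenGPT code. The class name, the 2-dimensional vectors, and the `<add>` and `<sqrt>` tool names are hypothetical. It scores regular words and toolkens with the same dot product and treats adding a tool as appending one embedding row.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

class ToolkenHead:
    """Output layer that scores word tokens and tool tokens together."""

    def __init__(self, words, word_embeddings):
        self.words = list(words)
        self.word_embeddings = [list(e) for e in word_embeddings]  # frozen LM rows (toy values)
        self.tools = []
        self.tool_embeddings = []  # in this sketch, the only rows that would be trained

    def add_tool(self, name, embedding):
        # Plugging in a new tool is just one more embedding row.
        self.tools.append(name)
        self.tool_embeddings.append(list(embedding))

    def next_token(self, hidden):
        """Return (token, is_tool) for the highest-scoring entry."""
        vocab = self.words + self.tools
        rows = self.word_embeddings + self.tool_embeddings
        logits = [dot(hidden, row) for row in rows]
        best = max(range(len(logits)), key=logits.__getitem__)
        return vocab[best], best >= len(self.words)

head = ToolkenHead(["the", "sum", "is"], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
head.add_tool("<add>", [-1.0, -1.0])

word_choice = head.next_token([1.0, 0.2])    # a word token wins
tool_choice = head.next_token([-1.0, -0.5])  # the toolken wins, signalling a tool call
```

When `is_tool` is true, a full system would switch to producing the tool's arguments and then execute the call; that step is left out here.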
|
|
→ We propose Multitask Prompt Tuning (MPT) to exploit rich cross-task knowledge for more efficient and generalizable transfer learning. MPT learns a single transferable soft prompt through a novel combination of prompt decomposition and prompt distillation.
[ICLR 2023] PDF / Code / Slides / Poster / Huggingface PEFT PR
|
|
→ We propose a two-stage probabilistic inference paradigm, ThinkSum, to improve LLMs' ability to reason over multiple objects in two steps, Think (e.g., retrieval of associations) and Sum (e.g., aggregation of results), which beats chain-of-thought prompting on hard BIG-bench tasks.
[ACL 2023] PDF / Code / Slides / Poster
|
|
→ We demonstrate that large language models have insufficiently learned the effect of distant words on next-token prediction. We present Coherence Boosting, an inference procedure that increases an LM’s focus on a long context and greatly improves performance on NLG and NLU tasks.
[ACL 2022] PDF / Code / Slides / Poster (Long Paper, Oral Presentation)
|
|
→ We propose a self-interpretable framework to rationalize neural relation prediction based on corpus-level statistics. Inspired by cognitive theory about human recall and recognition, the framework provides structured knowledge triplets as rationales.
[ACL 2020] PDF / Code / Slides / Poster / Video
|
|
→ We propose to discover structured knowledge--synonyms--from privacy-aware text corpora and present a novel framework that leverages both surface form and context information to discover out-of-distribution synonyms.
[KDD 2019] PDF / Code / Slides / Poster (Research Track, Long Paper, Oral Presentation)
|
|