Zhen Wang

Blogs

I write blogs from time to time to introduce some of my new research topics or reflect on past research. (Opinions expressed are my own.)

ChatGPT's Second Year: 10 Aha Moments of 2024 That Rewired 2025

Jan, 2025

This is the second consecutive blog in the ChatGPT/AI "twist" Reflection Series. The scaling of compute slowed down in 2024, while inference-time scaling emerged. Once again, the year delivered a series of unexpected "plot twists," ranging from rethinking how LLMs handle reasoning, benchmarks, and LLM agents to the seamless integration of multimodal I/O, small language models, persona vs. personalization, world models, and AI scientists. We shall see more vibrant "twists" in 2025.

Decentralized Arena via Collective LLM Intelligence

Building Automated, Robust, and Transparent LLM Evaluation for Numerous Dimensions

Oct, 2024

This blog introduces Decentralized Arena (DeArena), the first scalable, automated LLM evaluation system, expanding and refining the "Chatbot Arena" concept across a wide range of dimensions. By enabling LLMs to evaluate each other, DeArena fosters a transparent, autonomous, and reproducible approach to AI benchmarking, minimizing bias and computational demands through an efficient sorting-based algorithm that outperforms traditional methods. Its potential reaches into the oversight of future superintelligence, offering a democratic, collective intelligence-driven alternative when human judgment becomes insufficient or unreliable.

TxT360: A Top-Quality LLM Pre-training Dataset Requires the Perfect Blend

Oct, 2024

This blog introduces TxT360 (Trillion eXtracted Text), a large-scale pre-training dataset from LLM360. It is the first dataset to globally deduplicate 99 CommonCrawl snapshots and 14 high-quality data sources from diverse domains (e.g., FreeLaw and PG-19). We released this blog to explain every detail of how TxT360 was produced. Participating in this great project and leading several of its initiatives gave me better insight into how data scaling works when balancing noisy web data against highly curated data.

Reflecting on ChatGPT’s First Year: Evolutions, Twists, and Smooth Directions

Jan, 2024

About one year after ChatGPT's release, this blog reflects on its impact on LLM research, examining which new topics have emerged and which older ones have become less relevant. What surprised me most is how rapidly the field has evolved in just one year. I highlight several "twisted" directions, meaning areas where there has been an unexpected surge or decline in academic interest. In contrast, some "smooth" directions have consistently received significant research attention. By better understanding these shifts in research focus, we can more accurately measure the influence of ChatGPT-related techniques and more effectively prioritize the impactful areas we want to explore next.