The Sequence Chat: Lianmin Zheng, UC Berkeley About Vicuna, Chatbot Arena and the Open Source LLM Revolution
The co-creator of one of the most important open source LLMs shares his insights about research and development in foundation models.
Lianmin Zheng is a Ph.D. student in the EECS department at UC Berkeley, advised by Ion Stoica and Joseph E. Gonzalez. His research interests include LLMs, compilers, and distributed systems. He was awarded the Meta PhD Fellowship. Currently, he is leading the LMSYS efforts and open-source projects including Vicuna and Chatbot Arena.
Quick bio
I am a Ph.D. student working at the intersection of AI and systems. I am committed to open-source AI research through developing better models (e.g., Vicuna), evaluations (e.g., Chatbot Arena and MT-bench), and systems (e.g., FastChat, Alpa). I got started in AI through my undergraduate research projects.
🛠 AI Work
The vision of the Vicuna project is to build powerful models similar to OpenAI's ChatGPT but with an open recipe. The rapid advancement of large language models (LLMs) has revolutionized AI systems, resulting in unprecedented levels of intelligence as seen in OpenAI's ChatGPT. However, despite its impressive performance, the training and architecture details of ChatGPT remain unclear, which hinders research and open-source innovation in this field. So we started the Vicuna project to replicate ChatGPT-like capability with an open recipe. The project is inspired by Llama and Alpaca. We emphasize the importance of data quality, so we looked for the best data source we could find: user-shared conversations on ShareGPT.
We use standard instruction fine-tuning and additionally handle multi-turn conversations. We carefully cleaned the collected conversations and compute the loss only on the assistant outputs. This makes Vicuna better at multi-turn conversations. In the latest versions of Vicuna, we also extend the context length to 16k with RoPE interpolation. All our code and hyperparameters are available at https://github.com/lm-sys/FastChat.
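To make the two techniques in this answer concrete, here is a minimal sketch in plain Python. It is not the FastChat implementation. The function names, the toy token ids, and the scale factor are assumptions chosen for illustration. The first function masks every non-assistant token out of the loss; the second shows linear RoPE position interpolation, where positions are divided by the ratio of the extended to the original context length.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_non_assistant_labels(turns):
    """Flatten (role, token_ids) turns into input ids and loss labels.

    Only tokens from assistant turns keep their ids as labels; every
    other position is set to IGNORE_INDEX so it adds nothing to the loss.
    """
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


def rope_angles(position, head_dim, scale=1.0, base=10000.0):
    """Rotation angles for one position under linear RoPE interpolation.

    scale = extended_context / original_context (an assumed parameter).
    Dividing the position by it maps a longer sequence back into the
    position range seen during pretraining; scale=1.0 is plain RoPE.
    """
    interpolated = position / scale
    return [interpolated * base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


# Hypothetical toy conversation with made-up token ids.
conversation = [
    ("user", [11, 12, 13]),
    ("assistant", [21, 22]),
    ("user", [14]),
    ("assistant", [23, 24, 25]),
]
input_ids, labels = mask_non_assistant_labels(conversation)
print(labels)
```

With a scale of 4, position 8192 in the extended context receives the same rotation angles that position 2048 had in the original model, which is why a short fine-tuning run can adapt the model to the longer window.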
To scale the training to larger models, you need more GPUs and better parallelism strategies. Fine-tuning a 33B model is actually not that challenging on the latest GPUs such as the H100 (80 GB) or A100 (80 GB), so we just use our existing code in FastChat, which relies on PyTorch FSDP for parallelism. If you want to scale further with more advanced parallelism strategies, you can check out Megatron-LM, DeepSpeed, or our research project Alpa.
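A rough back-of-the-envelope calculation shows why a 33B model needs several 80 GB GPUs once its training state is sharded. The sketch below assumes the commonly cited figure of about 16 bytes per parameter for mixed-precision Adam (half-precision weights and gradients plus full-precision master weights and two optimizer moments) and ignores activations entirely. These numbers are assumptions for illustration, not the actual FastChat configuration.

```python
import math


def training_state_gb(num_params, bytes_per_param=16):
    """Approximate memory (in GB) for weights, gradients, and optimizer state.

    bytes_per_param=16 is an assumed rule of thumb for mixed-precision
    Adam; activation memory is not included.
    """
    return num_params * bytes_per_param / 1e9


def min_gpus(num_params, gpu_mem_gb=80, bytes_per_param=16):
    """Lower bound on GPUs needed if the training state is fully sharded."""
    return math.ceil(training_state_gb(num_params, bytes_per_param) / gpu_mem_gb)


print(training_state_gb(33e9))  # training state of a 33B model, in GB
print(min_gpus(33e9))           # lower bound on 80 GB GPUs, before activations
```

Because this is only a lower bound, a real run needs headroom for activations and communication buffers on top of it.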
Possible topics:
Limitations:
We think we should evaluate LLMs on more open-ended and fresh questions, instead of multiple-choice questions like those in MMLU, so we started Chatbot Arena and MT-bench.
Chatbot Arena is a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner. So far, we have collected around 70K votes and used these votes to compute Elo ratings of models. You can check out the latest leaderboard. It is based on human preferences.
MT-bench is a small set of challenging multi-turn questions that you can use in a more controlled and automated manner. It is based on GPT-4 grading. The details can be found in our paper "Judging LLM-as-a-judge with MT-Bench and Chatbot Arena".
I think it is very challenging to build a robust evaluation. Here are some suggestions:
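As an aside on the Elo ratings mentioned for Chatbot Arena, the sketch below shows how pairwise votes can be turned into ratings with the standard online Elo update. The K-factor, the initial rating, the battle tuple format, and the model names are assumptions for illustration; the actual leaderboard may be computed differently.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b, scale=400):
    """Probability that A beats B under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / scale))


def compute_elo(battles, k=4, initial=1000):
    """Apply the online Elo update to a sequence of battles.

    Each battle is (model_a, model_b, winner), where winner is
    "model_a", "model_b", or "tie" (a hypothetical format).
    """
    ratings = defaultdict(lambda: initial)
    outcome = {"model_a": 1.0, "model_b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in battles:
        ra, rb = ratings[model_a], ratings[model_b]
        ea = expected_score(ra, rb)
        sa = outcome[winner]
        # The winner gains exactly what the loser gives up.
        ratings[model_a] = ra + k * (sa - ea)
        ratings[model_b] = rb + k * ((1 - sa) - (1 - ea))
    return dict(ratings)


# Hypothetical votes between two placeholder models.
votes = [
    ("model-x", "model-y", "model_a"),
    ("model-x", "model-y", "tie"),
    ("model-y", "model-x", "model_b"),
]
print(compute_elo(votes))
```

Online Elo depends on the order of the battles, so in practice one might shuffle the votes and average over several runs to get more stable ratings.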
Yes. We are working on enhancing the reasoning and coding abilities of Vicuna. Stay tuned!
💥 Miscellaneous – a set of rapid-fire questions
Compilers. Besides generative AI, I have worked on several compiler projects such as Alpa (based on JAX/XLA) and Ansor (based on TVM).
Achieving competitive performance in coding and algorithm competitions such as the International Olympiad in Informatics (IOI) and the International Collegiate Programming Contest (ICPC), without the models having seen the problems in their training data.
The latest Vicuna is fine-tuned from Llama 2. It focuses on chat ability and helpfulness. Compared to base models (e.g., Llama 2, Falcon), it has instruction-following ability. Compared to other fine-tunes, Vicuna's training data (ShareGPT) enables it to handle chat on a diverse range of topics.
I think that they will continue to coexist, similar to how we currently distribute software.