The doctor is in—and all eyes are on him as he takes the stage at San Francisco’s largest artificial intelligence (AI) expo, the AI Engineer World Fair. Dressed in blue scrubs and a face mask, the pretend surgeon speaks calmly to the audience as he prepares to slice and dice the patient, a large language model named GPT-2. The operating table—“a table of numbers,” he tells the crowd—is an Excel spreadsheet.
That presentation took place in June, and the delivery was characteristic of AI entrepreneur Ishan Anand’s creative, accessible approach to explaining his implementation of GPT-2, a type of AI called a large language model (LLM), entirely in Excel. GPT-2 was a precursor, “the grandaddy of ChatGPT,” Anand says.
He built the model during the summer of 2023, and started developing teaching materials and making accompanying YouTube videos by that fall. Anand’s name recognition has exploded in the past year; he’s been touted on tech sites like Hacker News and Ars Technica, and the online course he teaches on the platform Maven has received rave reviews. One advantage of the course is that while it’s designed for those with a STEM background, no coding or AI experience is necessary.
“My goal is to make the technology of these large language models as simple and easy to understand as possible … because it’s really important to democratize the understanding of how LLMs work,” Anand says, “especially as it becomes a key part of being in the workforce but also as we as a society wrestle with these technologies.”
Anand shared his in-demand expertise and unique approach with Wisconsin School of Business students recently at the invitation of Enno Siemsen. Siemsen, the Patrick A. Thiele Distinguished Chair in Business and professor of operations and information management, teaches a Machine Learning class for the Master of Science-Business: Analytics (MSBA) program and reached out to Anand about coming to teach in his class.
“To most people, chatbots like ChatGPT appear like magic,” says Siemsen. “One essential mission of science is to dispel magic—and Ishan’s approach of teaching LLMs through Excel is your best bet at dispelling the magic and understanding how LLMs work under the hood.”
Experiencing the ‘view source of AI’
Anand says one of the things that helps with understanding LLMs is what he calls the “view source of AI” that his model—built with a spreadsheet—allows. View source is a reference to the browser feature of the same name that reveals the underlying HTML of a web page.
“‘View source’ can be as powerful a force for democratizing technology as ‘open source,’” says Anand. “Part of the reason the web took off is because it had this ‘view source’ capability that allowed anyone to understand how it worked. Open-source AI is more powerful when it’s made accessible and approachable.”
“AI is going to usher in more tools and terms we’ll all need to know,” says Anand. “You’re not going to use this spreadsheet that implements an LLM in production, but it’s a great learning tool to help people understand how they work. I think LLMs are actually way more accessible than people realize.”
Another advantage of Anand’s model is the ability to manipulate and explore the contents without a programming background. “What’s nice about the Excel spreadsheet is you download this file, you can play with it, you can see the inside. [Excel] lets anybody who can read a spreadsheet actually step through and see, ‘Oh, okay, this is how it really works.’ And it lets you debug your thinking.”
Hands-on GPT-2 immersion
The intensive workshop split five hours across two class periods, one each week.
“The job of an LLM is simply to predict the next word, like a sentence that ends with a fill-in-the-blank. The first day covers the process of getting a computer to turn this word problem into a number problem. The second day is about, once we’ve mapped words into numbers, then what are the calculations we do to predict the right word?”
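The first day’s idea—turning the word problem into a number problem—can be sketched with a toy tokenizer. This is an illustrative stand-in, not Anand’s spreadsheet: the vocabulary here is made up, and real GPT-2 uses byte-pair encoding over subword pieces rather than whole words.

```python
# A toy word-level tokenizer: map each word to an integer ID
# and back. (GPT-2 itself uses byte-pair encoding on subwords;
# this hypothetical five-word vocabulary is for illustration.)
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}

def encode(text):
    """Turn a sentence into a list of token IDs."""
    return [vocab[word] for word in text.split()]

def decode(ids):
    """Turn token IDs back into words."""
    inverse = {i: w for w, i in vocab.items()}
    return " ".join(inverse[i] for i in ids)

tokens = encode("the cat sat on the mat")
print(tokens)          # [0, 1, 2, 3, 0, 4]
print(decode(tokens))  # the cat sat on the mat
```

Once every word is a number, the model’s arithmetic—the second day’s material—can operate on those IDs.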
“Big picture: This is just a giant word prediction engine,” Anand told the class.
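That “word prediction engine” framing can be sketched in a few lines. Assume the model has already produced one score (a logit) per vocabulary word; a softmax turns the scores into probabilities, and the highest-probability word is the prediction. The words and scores below are invented for illustration.

```python
import math

# Hypothetical scores for completing "the cat sat on the ___"
vocab = ["mat", "moon", "sofa"]
logits = [3.2, 0.5, 1.1]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
prediction = vocab[probs.index(max(probs))]
print(prediction)  # mat
```

Everything upstream of this step—embeddings, attention, and the rest—exists to produce good scores for this final prediction.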
Students got familiar with core concepts such as tokenization, embeddings, and attention, and worked through practice problems in short breakout sessions.
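Of those concepts, attention is the one most often called the heart of the transformer. A minimal sketch, assuming tiny hand-picked vectors rather than real GPT-2 weights: each position’s query is scored against every key, a softmax turns scores into weights, and the output is the weighted average of the value vectors.

```python
import math

def softmax(scores):
    """Convert scores to weights that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention over plain Python lists."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]  # query·key, scaled
        weights = softmax(scores)                          # attention weights
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])       # weighted avg of values
    return out

# Toy example: one query attending over two key/value pairs.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
print(attention(q, k, v))
```

Because the query lines up with the first key, the output leans toward the first value vector—the same mechanic that lets an LLM weigh which earlier words matter for predicting the next one.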
“As a machine learning student, the guest lecture by Ishan Anand was invaluable,” says Marcela Salazar. “His approach to demystifying large language models through a clear demonstration of transformer architecture was enriching. His step-by-step breakdown of technical aspects like embeddings and tokenization has laid a strong foundation for understanding models such as Llama, Claude, and GPT.”
Student Brady France says the workshop “provided a comprehensive look at the components behind models like GPT-2, explaining the core concepts that enable these models to predict text.”
“I particularly appreciated the hands-on examples, where we used both Excel and Python to learn how tokenization, embeddings, and attention mechanisms function in practice,” he says. “We also explored how these elements differ in newer models, which provided valuable insight into the evolution of LLMs.”
The first session coincided with the surprise announcement that the Nobel Prize in Physics had been awarded to two pioneers in machine learning. Anand highlighted the significance to the class: “By the end of these two weeks, you’ll understand how these world-changing models actually work.”