Shuyuan

This course contains a curriculum unit to be integrated into the Shuyuan MWC curriculum. The unit's main objective is to give students a conceptual understanding of how modern LLMs work, grounded in familiarity with their architecture and core algorithms. This is broadly the same approach taken by Stanford's CS336: Language Modeling from Scratch and by Andrej Karpathy's online course, Neural Networks: Zero to Hero, though these labs are designed for introductory high school CS students.

The unit is organized around building successive models that aim to reproduce the generative AI students are already familiar with. Along the way, students build up the conceptual framework needed to understand Transformer-based LLMs and gain as much hands-on experience with the components as possible. The ultimate goal is conceptual understanding with minimal mathematical prerequisites.

The sequence of labs is:

As with all MWC units, this unit concludes with a project,