Smol and Ultra-Scale Training

The ML Systems & Theory interest group at the Cohere Labs Community, which I co-lead, has decided to take a deep dive into Hugging Face’s smol and ultra-scale playbooks. We chose these resources because they balance systems and theory in machine learning, a combination of great interest to our attendees.

We use the playbooks as a guide to introduce key concepts, from which we can dive deeper into interesting research and key applications. Below are the recordings of the sessions that I have delivered:

Session One: An Introduction to Training and Memory Overheads

Session Three: An Overview of Attention in the Transformer