The Random Transformer | Understand how transformers work by demystifying all the math behind them

osanseviero.github.io

ericjmorey@programming.dev to Machine Learning@programming.dev · 10 months ago

hackerllama - The Random Transformer

Understand how transformers work by demystifying all the math behind them

January 1, 2024 - Omar Sanseviero writes:

In this blog post, we’ll do an end-to-end example of the math within a transformer model. The goal is to get a good understanding of how the model works. To make this manageable, we’ll do lots of simplification. As we’ll be doing quite a bit of the math by hand, we’ll reduce the dimensions of the model. For example, rather than using embeddings of 512 values, we’ll use embeddings of 4 values. This will make the math easier to follow! We’ll use random vectors and matrices, but you can use your own values if you want to follow along.
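To make the scaled-down setup concrete, here is a minimal sketch of what "embeddings of 4 values" built from random numbers could look like. It is not taken from the blog post: the tokens, the helper names (`random_embeddings`, `random_matrix`, `project`), the seeds, and the value range are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import random

EMBEDDING_DIM = 4  # reduced size used in the post, instead of 512


def random_embeddings(tokens, dim=EMBEDDING_DIM, seed=0):
    """Assign each distinct token a random vector of `dim` values (hypothetical helper)."""
    rng = random.Random(seed)
    return {
        tok: [round(rng.uniform(-1.0, 1.0), 2) for _ in range(dim)]
        for tok in dict.fromkeys(tokens)  # de-duplicates while keeping order
    }


def random_matrix(rows, cols, seed=1):
    """Build a rows x cols matrix of random values (hypothetical helper)."""
    rng = random.Random(seed)
    return [[round(rng.uniform(-1.0, 1.0), 2) for _ in range(cols)] for _ in range(rows)]


def project(vector, matrix):
    """Multiply a row vector by a matrix: result[j] = sum_i vector[i] * matrix[i][j]."""
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(len(matrix[0]))
    ]


# Hypothetical two-token input; any tokens work.
tokens = ["hello", "world"]
embeddings = random_embeddings(tokens)
weights = random_matrix(EMBEDDING_DIM, EMBEDDING_DIM)

# Each 4-value embedding multiplied by a 4x4 matrix gives another 4-value vector.
projected = {tok: project(vec, weights) for tok, vec in embeddings.items()}

for tok in tokens:
    print(tok, embeddings[tok], "->", [round(x, 3) for x in projected[tok]])
```

With only 4 values per vector, every multiplication in a step like `project` can be checked by hand, which is the point of shrinking the dimensions.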

Read The Random Transformer | Understand how transformers work by demystifying all the math behind them
