The goal of this assignment is to develop a decoder-only transformer language model from scratch. You will begin by implementing a minimal version of the transformer, train it on a text dataset, and then progressively incorporate more advanced techniques into your implementation. Please note that you must write all of the code yourself. To ensure that you gain a deep understanding of how transformer-based language models work, you are restricted to the most basic PyTorch functionality (such as nn.Linear, nn.Embedding, nn.Parameter, simple non-linearities, and nn.Dropout), along with other common libraries (NumPy, pandas, etc.).
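To illustrate what building from only basic PyTorch functionality can look like, here is a minimal sketch of a single causal self-attention head. It is not the required design: the class name CausalSelfAttentionHead and the parameters d_model, d_head, and max_len are hypothetical, and the sketch assumes that plain tensor operations such as torch.softmax, torch.tril, and matrix multiplication count as permitted basic functionality.

```python
import math

import torch
import torch.nn as nn


class CausalSelfAttentionHead(nn.Module):
    """Hypothetical single attention head built from nn.Linear, nn.Dropout, and tensor ops."""

    def __init__(self, d_model: int, d_head: int, max_len: int, dropout: float = 0.0):
        super().__init__()
        # Separate learned projections for queries, keys, and values.
        self.query = nn.Linear(d_model, d_head, bias=False)
        self.key = nn.Linear(d_model, d_head, bias=False)
        self.value = nn.Linear(d_model, d_head, bias=False)
        self.dropout = nn.Dropout(dropout)
        # Lower-triangular mask: position t may attend only to positions <= t.
        mask = torch.tril(torch.ones(max_len, max_len, dtype=torch.bool))
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, d_model); seq_len must not exceed max_len.
        _, seq_len, _ = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)
        # Scaled dot-product scores, shape (batch, seq_len, seq_len).
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        # Block attention to future positions before normalizing.
        scores = scores.masked_fill(~self.mask[:seq_len, :seq_len], float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return self.dropout(weights) @ v


if __name__ == "__main__":
    head = CausalSelfAttentionHead(d_model=16, d_head=8, max_len=10)
    out = head(torch.randn(2, 5, 16))
    print(out.shape)
```

The mask is what makes the model decoder-only: because each position sees only itself and earlier positions, the output at position t cannot depend on later tokens. A full model would combine several such heads with feed-forward layers, residual connections, and normalization, all of which are left out of this sketch.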