This repository contains the code for developing, pretraining, and finetuning a GPT-like LLM and is the official code repository for the book Build a Large Language Model (From Scratch).
(If you downloaded the code bundle from the Manning website, please consider visiting the official code repository on GitHub at https://github.com/rasbt/LLMs-from-scratch.)
In Build a Large Language Model (From Scratch), you'll discover how LLMs work from the inside out. In this book, I'll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples.
The method described in this book for training and developing your own small-but-functional model for educational purposes mirrors the approach used in creating large-scale foundational models such as those behind ChatGPT.
Please note that this `README.md` file is a Markdown (`.md`) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, MarkText is a good free option.
Alternatively, you can view this and other files on GitHub at https://github.com/rasbt/LLMs-from-scratch.
> [!TIP]
> If you're seeking guidance on installing Python and Python packages and setting up your code environment, I suggest reading the README.md file located in the setup directory.
Chapter Title | Main Code (for quick access) | All Code + Supplementary |
---|---|---|
Setup recommendations | - | - |
Ch 1: Understanding Large Language Models | No code | - |
Ch 2: Working with Text Data | ch02.ipynb, dataloader.ipynb (summary), exercise-solutions.ipynb | ./ch02 |
Ch 3: Coding Attention Mechanisms | ch03.ipynb, multihead-attention.ipynb (summary), exercise-solutions.ipynb | ./ch03 |
Ch 4: Implementing a GPT Model from Scratch | ch04.ipynb, gpt.py (summary), exercise-solutions.ipynb | ./ch04 |
Ch 5: Pretraining on Unlabeled Data | ch05.ipynb, gpt_train.py (summary), gpt_generate.py (summary), exercise-solutions.ipynb | ./ch05 |
Ch 6: Finetuning for Text Classification | ch06.ipynb | ./ch06 |
Ch 7: Finetuning with Human Feedback | Q2 2024 | ... |
Appendix A: Introduction to PyTorch | code-part1.ipynb, code-part2.ipynb, DDP-script.py, exercise-solutions.ipynb | ./appendix-A |
Appendix B: References and Further Reading | No code | - |
Appendix C: Exercise Solutions | No code | - |
Appendix D: Adding Bells and Whistles to the Training Loop | appendix-D.ipynb | ./appendix-D |
Appendix E: Parameter-efficient Finetuning with LoRA | appendix-E.ipynb | ./appendix-E |
Shown below is a mental model summarizing the contents covered in this book.
Several folders contain optional materials as a bonus for interested readers:
If you find this book or code useful for your research, please consider citing it:
@book{build-llms-from-scratch-book,
  author    = {Sebastian Raschka},
  title     = {Build A Large Language Model (From Scratch)},
  publisher = {Manning},
  year      = {2023},
  isbn      = {978-1633437166},
  url       = {https://www.manning.com/books/build-a-large-language-model-from-scratch},
  note      = {Work in progress},
  github    = {https://github.com/rasbt/LLMs-from-scratch}
}