4 Comments
Mandy Liu

Thank you for sharing Cornellius! One to bookmark for myself!

Cornellius Yudha Wijaya

You are welcome Mandy. I am glad you find it useful.

Meng Li

The training process for large models mainly includes the following stages:

1. Pretraining Stage

2. Tokenizer Training

3. Language Model Pretraining

4. Dataset Cleaning

5. Model Performance Evaluation

6. Instruction Tuning Stage

7. Open Source Dataset Organization

8. Model Evaluation Methods

Cornellius Yudha Wijaya

Thank you for the input! It's certainly true that the LLM training cycle follows the steps you pointed out.
