What's pretraining?
the training that comes before the training
KIDDING! it’s the biggest phase of LLM training. like when GPT-4 famously cost $100M, most of that was on pretraining. these days a lot more post-training is mixed in, but pretraining is still very large
in my mind (prob oversimplified), making an LLM is:
- Get training data
- Clean it
- Make the model try to predict the next token based on preceding tokens. Reward when right. Repeat
- Tune the model to human preferences
What part of that would pretraining be?
that’s actually pretty good. the “predict next token” step is pretraining. preferences is post-training. data is kind of an ever-present problem throughout the process
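if it helps, here’s a minimal sketch of that “predict next token, reward when right” loop in PyTorch. the tiny model and random tokens are just placeholders for illustration, not how the big labs actually do it:

```python
import torch
import torch.nn as nn

vocab_size, d_model, seq_len = 1000, 64, 32

# toy "language model": embedding -> LSTM -> logits over the vocab
# (real pretraining uses a transformer, but the loop looks the same)
class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.LSTM(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # logits for the next token at each position

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    # pretend this batch came from the cleaned training data
    batch = torch.randint(0, vocab_size, (8, seq_len))
    inputs, targets = batch[:, :-1], batch[:, 1:]  # predict token t+1 from tokens up to t
    logits = model(inputs)
    # "reward when right" is really: penalize (cross-entropy) when wrong, then repeat
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

pretraining is basically that loop run over trillions of tokens; post-training then tunes the result toward preferences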
I knew what post-training meant but not pretraining 😅 on data, I guess model makers mostly reuse what they already collected?
on data, I asked because even models with a cutoff date supposedly in 2025 (like Gemini 2.5 in January) will often default to 2024 or even earlier knowledge. so maybe this is because most of the training data is from before 2023?