that’s actually pretty good. the “predict next token” part is pretraining; preferences are post-training. data is kind of an ever-present problem throughout the process
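(rough sketch of what “predict next token” means as a training objective — toy PyTorch, all names and shapes here are just illustrative stand-ins, not any real model)

```python
import torch
import torch.nn.functional as F

vocab_size, seq_len, d_model = 1000, 16, 64

# stand-in "language model": embed tokens, project back to vocab logits
embed = torch.nn.Embedding(vocab_size, d_model)
lm_head = torch.nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (1, seq_len))  # one toy sequence
hidden = embed(tokens)                               # (1, seq_len, d_model)
logits = lm_head(hidden)                             # (1, seq_len, vocab_size)

# pretraining loss: each position tries to predict the NEXT token
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0..n-2
    tokens[:, 1:].reshape(-1),               # targets are the tokens shifted by one
)
print(loss.item())
```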
I knew what post-training meant but not pretraining 😅. on data, I guess model makers mostly reuse what they already collected?
for data, I asked because even models with a cutoff date supposedly in 2025 (like Gemini 2.5 in January) will often default to 2024 or even earlier knowledge. so maybe that’s because most training data is pre-2023?