I am trying to learn more about inference-time compute: broadly, the same model doing more work where it matters. Any pointers?
I hope this is helpful.
Blog, "Scaling RL Compute": gr.inc/blog/scaling...
Repo (not maintained): github.com/xjdr-alt/ent...