That boils down to you not understanding what you're using. It didn't used to be better, it quite literally used to be horrible. Newer models are significantly better in general. As for why it generated different results, LLMs are not deterministic in nature.