openai took the microservices approach — a model router lets teams work in parallel on different models with little to no coordination. it’s more flexible bc you can tackle wildly different ideas this is the monolith approach. it’s really only good for scaling down compute