You can read our test from June, in which we put LLMs through 500 geolocation puzzles, here: www.bellingcat.com/resources/ho... This time we re-ran the trial, adding Google AI Mode, GPT-5, GPT-5 Thinking, and Grok 4 to the mix.
Still, no AI model has achieved absolute accuracy: every one of them returned a hallucination at some point. Even the models with the best results pointed confidently to at least one wrong location. The tools should therefore not be used in isolation when trying to locate an image.
Out of them all, Google AI Mode performed best, beating the scores of our last winner, ChatGPT o4-mini-high. It also provided better results than Gemini 2.5 Pro Deep Research, despite being powered by a version of Gemini 2.5.
Google’s AI Mode was the first, and so far the only, model to correctly identify Noordwijk as the location in this photograph, which was used in both this trial and the one in June. Many models struggled to locate the town, with GPT-5 Pro and Thinking wrongly identifying it as a beach in France.
Probably could have killed Bin Laden sooner if we had this capability back then. For years they tried to figure out how to geolocate based on the few pictures they obtained.
I wonder if this success is consistently repeatable - LLMs always have a bit of randomness sprinkled in.
See the full results by clicking into our investigation here: www.bellingcat.com/resources/20... If you have a suggestion for what model we should test next or what other OSINT skillsets we should be testing them on, let us know in the comments…
And then there's this lovely map, produced by ChatGPT 5 (see thread):
Very sobering report, calling for caution in the use of AI - thank you. Could it be that Google has an edge because of its longer history of data collection (and thus advantage in training)?
Google reviews with user-uploaded photos have been useful for reverse image searches for years, well before all the big-money hype about AI.