At this point, I was feeling pretty good about it all ("Ach probably good enough for a tweet") - so I didn't do some pretty basic things. For instance, check those outliers. Why not wait an hour to read the Strategy in full before tweeting about it?
So yeah, all very silly. Mea culpa. Thankfully, I doubt very many people saw it but that's not really the point. If it had been for a story, I would've been a lot more thorough (and with others checking), but that's not the point either.
Point is... 1. Don't trust AI to run even the most basic statistical things (yes, yes I knew that already, but then you get told you must use this stuff, and then...). 2. Don't rush to post things to social media, it creates dumb and embarrassing situations like this one.
Still better to be honest about one's shortcomings and use it as a learning opportunity, rather than just hope people forget about the error and move on. Yours, Today's AI schmuck.
Not that I'm warning people not to use AI. It's just good to be (more) aware of what it's good for, and what it's not good for.
Writing an Excel macro or Python script? Great! Grammar, spell-checking? Cool. Exploring how a story idea might link thematically to other fields and disciplines? Useful. Counting? No.
Okay but like if it's not good for something as simple as COUNTING, then ... how smart and how good is it as a tool
And just to stress - for folks saying this shows the decline of journalistic standards: this wasn't used for a Reuters article, it didn't go anywhere near one. It was just something I tweeted hastily.
I appreciate there is some overlap between an employer and their employee's social media feed - and how my conduct reflects on them. But it's also important to draw a distinction that my ramblings here are not Reuters.
Having reverse-Streisanded what is really a pretty minor mistake in the scheme of things, concerning an ephemeral tweet that got virtually no traction - I'd just like to reassure everyone that I've made waaaay more serious errors that have not attracted any attention.
Bravo Andy. A lot of integrity on display here. (Didn't see the chart).
You're a fine journo imo. I never saw the tweet and your transparency over this is admirable. Mistakes happen.
That's not really better, though.
Most everyone does. This was thoughtfully put. These things, of course, are sucker machines, built to keep you talking. It happens, it's all without malice - just that none of them care about you at all, other than that you prompt it again
Thank you SO much for this Andy. There have been some infamous million-dollar mistakes that often get rolled out for demonstrating these issues, but they're always so unrelatable and get lost in the noise. I feel this is a far more effective teachable moment. I'll be referring to it for some time 🙏
Great thread. Recently, when sifting through hundreds of pages of pdfs for counting purposes, I compared the no-effort AI version to the high-effort human version. AI answers were close enough to sound credible, but were also in almost every case wrong. bsky.app/profile/toby...
That’s the danger though isn’t it. If it was obviously wrong, it would be easier to spot. But it’s here and there correct and then made up and at some point we no longer know what’s real and not.
And it’s that little bit of non-reality that can slip through, and then gets amplified each time until we have created a monster not grounded in reality and it will take a long time to unpick where the errors came in
Like when it counts the Rs in strawberry, straw and berry
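For contrast, actual counting is deterministic and trivial in ordinary code - a quick Python check, no AI involved:

```python
# Count the letter "r" the boring, reliable way.
for word in ("strawberry", "straw", "berry"):
    print(word, word.count("r"))
# strawberry 3
# straw 1
# berry 2
```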
hey andy, appreciate your transparency. earlier in the thread you said work has been encouraging you to use this AI. so even though this didn't impact work you did for your employer, do you think this error was partly a result of the pressure to use copilot?
And also, have you or will you share this experience with your employer's decision makers?
I work at a major tech company and we have been inundated with requests to use AI in our daily work. Some of it is good, but when I'm told to use it for peer reviews and generating self reviews it feels awful.
That may be the most thorough and honest mea culpa I have ever seen. Two Our Fathers, one Hail Mary and file it under "shit happens" sub folder "nobody died".
😂 I'm starting to regret it - I've made many more serious mistakes than this one that have received a lot less attention!
AI is pretty good at writing Visual Basic; I had a thing I needed to do in Access that didn't seem to be possible through the menus (changing a linked table's data source from a file to a table on a SQL server), and AI gave me a working subroutine on the first try. But I did extensively test it!
One way to (maybe?) get a better outcome in your case: ask it to 1/ process the text and add markers to the region/places 2/ write a python script to collect these markers and do the stats you want. you get a usable doc to check its marking work & the code is testable (and probably reliable)
Thank you, that's a great idea.
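A rough sketch of that two-step idea in Python (the `[[REGION:...]]` marker format and the sample text here are made up for illustration): the AI only inserts markers, which you can eyeball in the document, and the counting itself is done by plain, testable code.

```python
import re

def count_region_markers(text):
    """Tally [[REGION:...]] markers that the AI was asked to insert."""
    counts = {}
    for region in re.findall(r"\[\[REGION:([^\]]+)\]\]", text):
        counts[region] = counts.get(region, 0) + 1
    return counts

# Hypothetical marked-up excerpt:
doc = ("Funding for [[REGION:Yorkshire]] and [[REGION:Wales]], "
       "plus further investment in [[REGION:Yorkshire]].")
print(count_region_markers(doc))
# {'Yorkshire': 2, 'Wales': 1}
```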
I would add… Using an LLM to generate text, music or images? Absolutely not…at a minimum, not until we have addressed the rot/exploitation at the heart of current AI models. No more strip mining of our culture & humanity to build commercial products. That shouldn’t be controversial.
💯
oh god don't trust ai code (as a programmer) This is one of my biggest things I am afraid of. Code that looks correct on the surface and produces valid results for the test cases at hand, but doesn't take edge cases into account in the way that a human programmer can learn to.
I hear you - and I wouldn't want to overstate what I said. I've used it to write some Excel macros and very simple Python scripts - to perform stuff I've done manually for a long time, know well (in terms of whether the output works), and can test. I would seek expert help to do more!
Software support engineer here, and that is my experience as well - it's great for "write me a script/query for blah" or "parse this well-defined file format" things that I COULD do but would require some effort on my part, and I would never even consider letting it write production code.
I have gotten myself into such a rat hole with it trying to do something moderately complicated.
Also if you start using it to try and fix code it already produced the situation often becomes a tangle of crap where you are always shifting one more thing to try and fix the messes the last iteration produced.
For simple tasks, highly specific functions and scripts and interview questions it’s amazing, which makes sense as it’s basically condensed example code, tutorials and stack overflow answers.
Since going over to Windows 11 I cannot see the cursor on Excel spreadsheets.
There are approximately 8,464,265 ways to improve cursor visibility both in Office options and Windows Settings. (AI may have overestimated the true count there, but... there's a lot of them.)
My firm works within something called Citrix, and their external IT contractor has had no joy with it. They achieved a temporary fix by rolling back to an earlier graphics driver, but the system soon updated it to one that restored the problem. Outside Citrix, on my laptop, Excel has no problem
It's bad at grammar and spelling. It *might* produce a decent macro/script... but you have to check. It's *only* potentially useful for things that are much easier to verify than to create/locate. And even then it's a climate-apocalypse plagiarism machine and should not be used unless truly desperate
I’m not clear exactly what happened or the extent of the wrongness. (Your original post is deleted) Can you say how different the AI analysis was from the actual document content? (For Wales, for example.)
Sorry. Basically the AI response was right for a lot of places, badly wrong for others - eg it massively undercounted references to Yorkshire. And while I should have spotted that, some other similarly-sized regions had genuinely a small number of references, so it wasn't implausible.
Out of curiosity, have you tried similar exercises with other AI platforms, or was this a one-off with Copilot?
Thanks for the details. A key danger with AI seems to me to be just that: plausibility. It’s programmed, in effect, to give plausible answers. It ruins all the short cuts and rules of thumb we have to detecting dodgy work.
📌 Pinning this because "ruins all the shortcuts and rules of thumb for detecting dodgy work" is an excellent explanation of what I've been trying to explain to my boss.
I’m still confused as to why LLMs are bad at counting. I’ve run into the same problem, even though it seems like something that they should be good at. Counting is a basic enough computing function that it makes me doubt everything else
Because AIs don't think, they just generate plausible text. If prompted "2+2", it answers "=4" - not because it learned to add, but because statistically "=4" was the most probable next sequence of characters based on its training. LLMs *do not* reason or think...
You can still do cool stuff, mind. If your AI tool recognises it's being asked to do maths, instead of actually trying to guess the result it can (behind the scenes) say "show me a Python script that would calculate this sum" (essentially a text task), and then run the script...
This is how things like ChatGPT give the impression of being smarter than the LLMs they are based on. LLMs *are* super cool, for some things. The problem is that they have been bullshitted to infinity by the Altmans/Musks of this world, and people are going to be very disappointed.
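A minimal sketch of that pattern (the names and the host/model split here are simplified assumptions, not any particular product's internals): the model emits code as text, and the surrounding application executes it.

```python
# Step 1: asked "what is 5.2% of 1850?", the model emits code rather than guessing.
generated_code = "result = 0.052 * 1850"

# Step 2: the host application runs the generated code and reads back the value.
namespace = {}
exec(generated_code, namespace)
print(round(namespace["result"], 2))
# 96.2
```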
I'd like to understand this as well. I asked Google AI how far along my DIL's pregnancy is, and even while spitting out all the correct dates it confidently said 20 weeks and three days. My first, strong reaction was: why did she want to wait so long before letting us know? Then I figured it can't be.
She was eight weeks and two days along. I just can't rely on AI even as a starting-off point if it can't count.
It can't count *at all*. It doesn't even know what a number is. What it's actually giving you, essentially, is the most probable text that follows that question. Simplistically, it's seen more training data where the answer to that kind of question was "20 weeks" than "8 weeks", so that's the answer it gives.
Wow, I did not understand that.
Because counting is not something they do. They don't answer questions but produce answer-like objects by predicting what you want to read. At no point do they engage with the meaning of the question.
LLMs are "bad at counting" because all they do is produce the statistically most likely next thing to come in a sentence. They don't actually count anything.
i feel for you 🫡 never ask them to count or do simple math!
Yes. And god knows how many times I've seen people post stuff like: "I asked this AI if 5 is less than 4 and look what it told me..."
I often use the chrome browser bar to calculate a quick percentage for non important stuff while on a page. (Eg, 5.2% of a number). Once I forgot I was using a different browser (Brave) which gave me an AI generated answer when I typed in x percent of y, which I knew instantly was (massively) wrong.
Don't be so blindly accepting of the macro and script. If you know the languages and the expected outcomes, you can check them. But you still need to check them - imitative AI makes weird, hard-to-find mistakes even in coding.
You should be
And that's the problem with the term AI. There is useful AI - supervised learning for medical imaging, or AlphaFold. Then there are LLMs, which are basically bullshit generators. And we have enough human ones of those already eprints.gla.ac.uk/327588/1/327...
Admitting a mistake, on the internet? What the hell man!
Refreshing perspective. Genuinely, thank you.
Hey, thanks for the introspection and willingness to go "my bad", and for not doubling down. I appreciate the examination of what happened.
Yeah, it's innumerate. You can have it write a Python script to do things like count items in a document, but if you ask it to do anything requiring understanding numbers, it'll fail.
Andy your retraction and explanation does you great credit, but I'm left with some questions: You say you're "encouraged to use Copilot at work"— Do you think that the policy should be questioned? Is it allowed to be questioned? Is this damaging work product? Are others having the same experiences?
Whether he is able to challenge or ignore it I have no idea but it's definitely a horrible policy that's damaging a lot of businesses.
There’s also the question of the environmental damage AI does.
As someone who works with AI on a daily basis, I would say don't trust AI to get anything right. You have to read every word it spits out and know the subject matter inside out, so you can catch all the errors.
Bummer! Would love to see the model and prompt you used if you’re up for screenshotting.