Chris Carollo
@crazytalk.bsky.social
Developer on L4D2, Dota 2, and Steam at Valve. Thief and DX:IW before that. Gameplay and machine learning. Comments my own. BLM Accountability in Policing
created August 12, 2023
64 followers 25 following 620 posts
Posts
Chris Carollo (@crazytalk.bsky.social) reply parent
I mean, "AI" is incredibly ill-defined. Does it include anything that uses ML? That uses an LLM? That uses any sort of deep neural net? Generating an embedding from an image isn't "AI" as most people understand it (a chatbot-ish LLM). Either way, it's not very relevant re: this Apple feature.
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm sorry that you said something that was objectively false and I called you on it. All the rest of this Covid policy relitigating isn't something that I'm interested in, nor did I assert anything about. "Distraction", indeed, I suppose.
Chris Carollo (@crazytalk.bsky.social) reply parent
The same 25% and 50% capacity rules applied for bars as restaurants. But again, you claimed both were not open. Which was wrong! They were, in fact, open at 50% capacity ~9 months before schools were. That's all I'm saying, and it's objectively true.
Chris Carollo (@crazytalk.bsky.social) reply parent
Restaurants 25% June 5 2020: www.capitolhillseattle.com/2020/06/king... Restaurants 50% June 20 2020: seattle.eater.com/2020/6/16/21... You're the one that claimed "Bars and restaurants were not open", but they were! That there were reasons for that does not make you less wrong.
Chris Carollo (@crazytalk.bsky.social) reply parent
You definitely can do partial-capacity schooling: Seattle's original reopening required 30% in-person with the rest allowed remote. But this is just moving goalposts from your original incorrect claim: in point of fact, bars and restaurants were open way earlier than schools.
Chris Carollo (@crazytalk.bsky.social) reply parent
In Seattle, restaurants were allowed some amount of indoor dining (25% capacity) on June 5 2020, and schools didn't offer partial in-person schooling until April 5 2021.
Chris Carollo (@crazytalk.bsky.social) reply parent
Why? It's useful and this seems entirely reasonable: support.apple.com/en-us/122033
Chris Carollo (@crazytalk.bsky.social) reply parent
It was pretty foreseeable though? Lots of people were worried about it at the time, and were advocating for low-cost effective things like HEPA filters + box fans. Bars/restaurants being open but schools being closed was pretty galling and felt backward, at the time, to a lot of people.
Chris Carollo (@crazytalk.bsky.social) reply parent
Broader studies than just these two states show a different and meaningful pattern: www.nytimes.com/2025/02/11/b...
Chris Carollo (@crazytalk.bsky.social) reply parent
This seems entirely legitimately positive though? Initial results look good, and it needs more clinical time to see real-world results, but it's exactly the kind of life-improving thing that we should be working on. I don't get the need to crap on any advance in this space.
Chris Carollo (@crazytalk.bsky.social) reply parent
This was a lot less true in 2024 than 2020 though -- it was only a 37%/24% split D/R, a 60% decrease for Dems and 32% decrease for Reps. Increased Dem engagement likely means more previously-mail-voters would actually show up at the polls too. Feels like it's plausibly a wash?
Chris Carollo (@crazytalk.bsky.social) reply parent
I think this point is vastly more influential than any of your 1/2/3/4. Musk's turn began when Covid precautions started messing with Tesla factories, etc. It was simply easier to end it through denial than through vaccination.
Chris Carollo (@crazytalk.bsky.social) reply parent
The Cracker Barrel logo is a distraction. Militarizing DC and taking over the Fed are both not. This doesn't seem that hard.
Chris Carollo (@crazytalk.bsky.social) reply parent
I follow a small number of people and only use the Following feed, and I almost entirely avoid it. If I couldn't, I wouldn't still be there, but it is possible to have a well-curated list of follows.
Chris Carollo (@crazytalk.bsky.social) reply parent
Or you just use the Following feed with a small number of follows and mostly avoid the algorithm.
Chris Carollo (@crazytalk.bsky.social) reply parent
The thing is, outdoor protests *were* much safer than going to (indoor) work or church. This advice made people mad, but it wasn’t wrong!
Chris Carollo (@crazytalk.bsky.social) reply parent
They didn’t buy the policy shift, and saw Harris as more extreme than Trump. But that doesn’t mean the valence of the change, or trying to focus on it, was politically wrong. Or that not making the policy change would have made that narrative change easier.
Chris Carollo (@crazytalk.bsky.social) reply parent
My sense is that this was necessary-but-not-sufficient, in that it’s harder to get people to buy your messaging when it’s fully manufactured, but you need to do the work to generate the narrative too. And Dems failed more on the latter than the former.
Chris Carollo (@crazytalk.bsky.social) reply parent
It was to a large extent about the *narrative* of inflation (which real inflation enabled), which is a maybe subtle but significant difference.
Chris Carollo (@crazytalk.bsky.social) reply parent
That’s mostly a computational expense issue though; there’s nothing inherent about LLMs that make them static. You could have some RL score function that continuously fine-tuned based on interactions if you had the compute for it.
Chris Carollo (@crazytalk.bsky.social) reply parent
They might be worse at *arithmetic* because they see the world as tokens, but they’re way way better at math as a whole than anything we’ve made before.
Chris Carollo (@crazytalk.bsky.social) reply parent
The Regional
Chris Carollo (@crazytalk.bsky.social) reply parent
The Novemberists
Chris Carollo (@crazytalk.bsky.social) reply parent
ChatGPT isn't great at image generation, so everyone wanting to dunk is making it do that. Just ask it the question and I think it does fine? chatgpt.com/share/689a7b...
Chris Carollo (@crazytalk.bsky.social) reply parent
Maybe, or some sentiment analysis of follow up replies? For sure hard/unclear. Mostly my point is just that it’s not necessarily immutable.
Chris Carollo (@crazytalk.bsky.social) reply parent
There’s no reason that an LLM couldn’t continually train and update weights based on interactions other than computational expense. Its immutability isn’t an inherent feature.
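Handwaving wildly, the shape of the idea looks something like this -- a toy "model" that picks between two canned replies and nudges its weights from a reward signal on each interaction. This is a two-armed bandit, not an LLM, and every name and number here is made up purely for illustration:

```python
# Toy stand-in for continual fine-tuning: pick a reply, get feedback,
# nudge the weights, repeat. Not a real LLM -- just the update loop.
weights = {"reply_a": 0.0, "reply_b": 0.0}
lr = 0.1  # tiny updates, applied continuously

def pick():
    # Serve whichever reply currently has the higher weight.
    return max(weights, key=weights.get)

def update(reply, reward):
    weights[reply] += lr * reward  # the "RL score function" update

for _ in range(20):
    reply = pick()
    reward = 1.0 if reply == "reply_b" else -1.0  # users prefer reply_b
    update(reply, reward)

print(pick())  # after feedback, the model has learned to prefer reply_b
```

The point is just that nothing structural prevents the weights from moving after deployment; the barrier is the cost of doing this at LLM scale.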
Chris Carollo (@crazytalk.bsky.social) reply parent
This is a weird screed of you making up someone to be mad at in response to me just disagreeing that LLMs are wrong more often than they're right.
Chris Carollo (@crazytalk.bsky.social) reply parent
Cool because I didn't say the latter either.
Chris Carollo (@crazytalk.bsky.social) reply parent
This is dumb bluesky tribalism stuff. It's possible to think there's a bunch of annoying stuff and overpromising happening in the AI space, and also that the tech is fundamentally useful. Just judge the stuff on its own merits.
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm not doing anything of the sort though?
Chris Carollo (@crazytalk.bsky.social) reply parent
You certainly will not find me supporting the injection of AI into everything.
Chris Carollo (@crazytalk.bsky.social) reply parent
I mean, again, my job for sure requires correctness. You can't be "close" with code and expect it to work correctly. Most of what I do is very technical. Which means I can't accept results blindly, but does not mean that those results are useless. You don't need to be so snarky about this.
Chris Carollo (@crazytalk.bsky.social) reply parent
I _definitely_ would not rank the Google results high on the "should be trusted" meter. Though thankfully they generally link to their sources.
Chris Carollo (@crazytalk.bsky.social) reply parent
I've for sure had LLMs hallucinate things too, but in my experience they've gotten a lot better about that sort of thing, and the "thinking" models especially are much better. Mostly I was curious about a literal actual example that modern models get wrong.
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm hardly an evangelist! I just find it personally useful, think too many people make statements about it that aren't true (like that it's wrong more often than it's right in general), and think people should try it for themselves and see if there are places where it can be useful for them.
Chris Carollo (@crazytalk.bsky.social) reply parent
Can you give an example? Genuinely curious. (It's not like "close enough" typically works for code, either)
Chris Carollo (@crazytalk.bsky.social) reply parent
I don't blindly use LLM-generated code, but having them generate code for something (or in a language) that I'm not super familiar with, then looking it over and going "yep, makes sense", is a godsend. Also great for working through technical issues at a slightly higher level than code.
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm pretty unconvinced by vibe coding, but I do use LLMs regularly and find them to be incredibly useful. They're not always right, but for getting an answer to a specific thing that I want to do (as well as being able to follow up with questions/refinements) they're pretty revolutionary IMO.
Chris Carollo (@crazytalk.bsky.social) reply parent
"it will fall over more often than not" -- I feel like that's not really true anymore? Especially with models that have a "thinking" phase? Or at least WAY less true than it once was. It's easy to find cases online where they fail, but in general use I find it very rare.
Chris Carollo (@crazytalk.bsky.social) reply parent
Neural net transformers are not markov chains dude, and it's not just scale that makes that true.
Chris Carollo (@crazytalk.bsky.social) reply parent
I do know what overfitting is, and the corpus is so big in this case that LLMs are absolutely not overfitting and memorizing things in the general case. It's just not possible, they are orders of magnitude smaller than their training data.
Chris Carollo (@crazytalk.bsky.social) reply parent
Yeah a real issue with LLMs is that their intelligence is "jagged" as Karpathy calls it -- our normal cues that we use to gauge reliability aren't particularly well-suited to LLM outputs. Once you sort of wrap your head around the contours of this jagged edge, they become a lot more useful.
Chris Carollo (@crazytalk.bsky.social) reply parent
Not even remotely? I don't get why people need to be this performatively jaded and negative. It's far from perfect! But it's crazy that this sort of training results in me getting good answers to questions about both JavaScript and Middlemarch.
Chris Carollo (@crazytalk.bsky.social) reply parent
I think the answer is going to have to be "last-mile" training on vastly smaller, higher-quality datasets. Basically "LLM college".
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm not super bullish on "vibe coding" but I ask dumb little things like that all the time. "I want to do X in php, how?", or "why does this code execute this way?". The ability to ask follow-up questions, in particular, really feels revolutionary.
Chris Carollo (@crazytalk.bsky.social) reply parent
I had a great discussion today that cleared up a misconception I had about JavaScript promises and how they work with the await keyword, and it was clear and accurate and included some back-and-forth follow-up questions. It was super useful and also kind of crazy!
Chris Carollo (@crazytalk.bsky.social) reply parent
Like the fact that you can hand trillions of text tokens to billions of floating point weights and have them adjust themselves by tiny amounts over and over to try to predict the next token....and then have a fully coherent conversation with the result is WILD.
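Scaled down absurdly, that loop can be sketched like this: a softmax bigram model trained on a six-word corpus with plain SGD. The data and sizes are invented for illustration, but it's the same "nudge the weights by tiny amounts toward predicting the next token" idea, minus a few billion parameters:

```python
import math, random

text = "hello world hello there hello world"
words = text.split()
vocab = sorted(set(words))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

random.seed(0)
# One weight per (previous word, next word) pair -- the "billions of
# floating point weights", scaled down to a 3x3 table.
W = [[random.gauss(0, 0.1) for _ in range(V)] for _ in range(V)]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

pairs = [(idx[a], idx[b]) for a, b in zip(words, words[1:])]

lr = 0.5
for _ in range(200):  # "over and over"
    for prev, nxt in pairs:
        probs = softmax(W[prev])
        for j in range(V):
            # Cross-entropy gradient: adjust each weight by a tiny
            # amount toward the token that actually came next.
            W[prev][j] -= lr * (probs[j] - (1.0 if j == nxt else 0.0))

probs = softmax(W[idx["hello"]])
print(vocab[probs.index(max(probs))])  # "world" follows "hello" most often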
Chris Carollo (@crazytalk.bsky.social) reply parent
I don't see what's so hard about thinking that some of the claims are outlandish and have bad monetary incentives, and also that the tech is genuinely useful and a shockingly impressive achievement that I didn't think would have been possible as little as five years ago.
Chris Carollo (@crazytalk.bsky.social) reply parent
LLMs have weird sharp edges around tokenization, yeah. Doesn't really extrapolate to their skill in other areas though. Also, the "thinking" models will pretty much always catch tokenization issues and give correct responses.
Chris Carollo (@crazytalk.bsky.social) reply parent
Yeah, makes sense, though I don't know how to reconcile that with this, which indicates an 8-year swing in age over the last 15 years. www.axios.com/2023/11/20/a...
Chris Carollo (@crazytalk.bsky.social) reply parent
Overall, yes, I agree, but housing specifically has problems and is worth a bunch of investment to try to fix. First-time homebuyer median age rising to 38 (!) seems like it indicates a real issue. Doesn't mean an *overall* CoL crisis, but does point to a housing one.
Chris Carollo (@crazytalk.bsky.social) reply parent
Well, you can see in the chain-of-thought part of the o3 answer how it's handling the ambiguity of the question. But yeah, you generally need enough knowledge to have a sense if the answer is good or not (also true for a lot of things you get answers to, especially online).
Chris Carollo (@crazytalk.bsky.social) reply parent
This is again pushing against tokenization which is a known weak point but...that answer seems like a good start to me? This is o3's answer: chatgpt.com/share/6891a4...
Chris Carollo (@crazytalk.bsky.social) reply parent
That's my point -- I don't care *how* it's keeping my food cold, whether it's a compressor or blocks of ice or space lasers. I take it for what it can deliver: cold food! I don't care that much that LLMs are doing word prediction, because I just care about what utility they can deliver.
Chris Carollo (@crazytalk.bsky.social) reply parent
Ack, *their*, not they're. (a LLM would have caught that, lol)
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm just explaining how I think they can be useful, how I approach them, and why I think they're not being infallible isn't a deal-breaker. How they work under-the-hood isn't super relevant.
Chris Carollo (@crazytalk.bsky.social) reply parent
Frontier models are still improving pretty quickly, it's hard to keep up with what the latest models are. (And OpenAI's naming scheme isn't doing anyone any favors there lol)
Chris Carollo (@crazytalk.bsky.social) reply parent
FWIW ChatGPT o3 did pretty well at both your examples: chatgpt.com/share/68919e...
Chris Carollo (@crazytalk.bsky.social) reply parent
Sure, though I think that we don't have a particularly quantifiable description of "intelligence" -- it's more of a "can a human do this" test. Which LLMs for sure can fail sometimes! (but so do humans)
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm not a "vibe coding" person and I wouldn't use LLM-generated results without reading and understanding it myself. I just think the "they're useless" camp is missing the boat, and should really explore when and where they can be useful. Because I think they really can be!
Chris Carollo (@crazytalk.bsky.social) reply parent
It's for sure weird that it can do that sort of thing, with deep domain knowledge and a great ability to explore the issue with me and eventually explain what the problem was...and still maybe get state palindromes wrong! But that just means it's a tool people should use carefully, IMO.
Chris Carollo (@crazytalk.bsky.social) reply parent
For example, I used an LLM to help me debug a complicated, subtle issue with an ML model I'd set up and how I was using its results (my cosine similarity was flipping its sign randomly train-to-train). Through a bunch of back-and-forth description, it finally helped me figure out the issue.
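For what it's worth, one plausible mechanism for that kind of flip (sketched here with made-up two-dimensional vectors): many embedding objectives are only defined up to symmetries like a global sign, so two training runs can converge to mirror-image solutions. Similarities *within* a run stay stable; similarities *across* runs don't:

```python
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: sum(x * x for x in v) ** 0.5
    return dot / (norm(a) * norm(b))

# Two "training runs" of the same embedding; run B happened to converge
# to a mirrored (negated) solution. Both are equally good minima.
run_a = {"cat": [0.9, 0.1], "dog": [0.8, 0.3]}
run_b = {w: [-x for x in v] for w, v in run_a.items()}

print(cosine(run_a["cat"], run_a["dog"]))  # within run A: positive
print(cosine(run_b["cat"], run_b["dog"]))  # within run B: identical value
print(cosine(run_a["cat"], run_b["cat"]))  # across runs: sign flipped
```

Whether that was the actual bug here is anyone's guess; it's just one way a cosine similarity can flip sign from train to train without anything being "wrong" in either run.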
Chris Carollo (@crazytalk.bsky.social) reply parent
Their intelligence is for sure "jagged" as Karpathy describes it, which means you can't really do that sort of interpolation on their capabilities. Sometimes they can do very complicated things well and still get easy-for-us things wrong.
Chris Carollo (@crazytalk.bsky.social) reply parent
That second one is an artifact of tokenization not "meaning of words" -- that's a sharp corner you need to be aware of. More complicated models like o3 are much better at dealing with words/letters/numbers because they check the answer rather than trying to one-shot it.
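A toy illustration of that sharp corner (the vocabulary and token IDs here are invented): by the time the model sees its input, the letters are already gone, so character-level questions have to be reconstructed rather than read off.

```python
# Invented tokenizer vocabulary -- real tokenizers and IDs differ,
# but the shape of the problem is the same.
vocab = {"straw": 1012, "berry": 2077}
pieces = ["straw", "berry"]
ids = [vocab[p] for p in pieces]

print(ids)  # [1012, 2077] -- this is all the model actually receives
# The three r's in "strawberry" live in the strings, not the IDs;
# a model has to recall each token's spelling in order to count them,
# which is why a "check the answer" pass helps so much here.
assert sum(p.count("r") for p in pieces) == 3
```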
Chris Carollo (@crazytalk.bsky.social) reply parent
That's not my experience at all. Honestly curious about what kinds of problems people are throwing at them, and how they're describing them, to have these sorts of results.
Chris Carollo (@crazytalk.bsky.social) reply parent
/shrug I use LLMs all the time, on fairly technical stuff, and find them really useful. Understand the sharp edges, check the results, etc. They're not a magic bullet, but rather a tool like lots of other things, and a very useful one IME.
Chris Carollo (@crazytalk.bsky.social) reply parent
I dunno, treat it like a very knowledgeable but not infallible coworker? Humans aren't consistently dependable either, but they're still pretty useful!
Chris Carollo (@crazytalk.bsky.social) reply parent
Too many people do not remember what pre-Uber taxi service was like. "I can hit a button on my phone and they know where I am and they have my CC info already and I can track their location and they show up in minutes" was a huge deal!
Chris Carollo (@crazytalk.bsky.social) reply parent
Curious your thoughts on Beshear, who seems to me to be very successful with a pretty practical, concrete, kitchen-table agenda in a pretty red state.
Chris Carollo (@crazytalk.bsky.social) reply parent
I don't get why people feel the need to trash Manchin. I didn't agree with him on most things, but he was an absolute gift to the Dem party. Think how much better things would be if we had four more of him right now! We need to stop shooting ourselves in the foot and focus on winning elections.
Chris Carollo (@crazytalk.bsky.social) reply parent
Who are you talking about here?
Chris Carollo (@crazytalk.bsky.social) reply parent
Do what you want, but my No Soliciting sign means No Soliciting, and that includes soliciting for my vote. I'm less likely to vote for any candidate that comes to my door and ignores it. I absolutely do not want to be interrupted by politicians who want to talk, and I like talking politics!
Chris Carollo (@crazytalk.bsky.social) reply parent
Yeah, I definitely do agree that analogy is kinda specious and ultimately not helpful. If nothing else, it sets up expectations that aren't likely to be met, since the failure conditions of humans and LLMs are very different right now.
Chris Carollo (@crazytalk.bsky.social) reply parent
I'm not going to assert that it's emulating real brains, but at a certain point (which we're headed towards but have not yet reached) I'm not sure that it matters. It's not like we've got a solid conception of "consciousness" in humans, either.
Chris Carollo (@crazytalk.bsky.social) reply parent
Why is their being undocumented asserted as part of the question? The whole point of a court hearing is to establish that.
Chris Carollo (@crazytalk.bsky.social) reply parent
I mean, voters did seem to punish Carter for bad material conditions! I'm mostly with Will and you on this, but there are differences re: inflation between Reagan and Biden that seem pretty significant.
Chris Carollo (@crazytalk.bsky.social) reply parent
Whether that inflation started on your predecessor's watch or not seems very relevant to this comparison.
Chris Carollo (@crazytalk.bsky.social) reply parent
I think this is the correct take but also that inflation, specifically, is an unusually potent seed to build a narrative on.
Chris Carollo (@crazytalk.bsky.social) reply parent
I see the Abundance stuff way more as governance advice than as explicit electioneering.
Chris Carollo (@crazytalk.bsky.social) reply parent
I don't think we should be begging him for support, but if we can firm up his opposition to Trump, that seems like a win.
Chris Carollo (@crazytalk.bsky.social) reply parent
The stuff he's done is truly evil and awful. I'd rather have him stop doing that stuff and focus on rockets and EVs. Having him in clear opposition to Trump helps with that.
Chris Carollo (@crazytalk.bsky.social) reply parent
He's pretty damn unpopular so I'm not convinced of the upside either, but I don't think rejecting him outright when there are things where he and Dems have some overlap is necessarily the right move. I'd obviously rather have him opposing Trump than not, and the dude is persuadable (to a fault).
Chris Carollo (@crazytalk.bsky.social) reply parent
At the point where Musk is getting concessions that Democrats are upset with, sure, but there's a lot of potential overlap before you get to that point. Why reject it out of hand?
Chris Carollo (@crazytalk.bsky.social) reply parent
Sure, but rather than him doing that stuff, let's get him to focus on opposing this awful tax bill, and on getting Democrats elected in '28 because we support EVs and energy transition and (more) responsible budgeting. The guy is obviously susceptible to persuasion!
Chris Carollo (@crazytalk.bsky.social) reply parent
How is this materially different than the Cheney stuff? Is Khanna actually making concessions, or just welcoming the enemy of my enemy?
Chris Carollo (@crazytalk.bsky.social) reply parent
It's interesting that ALL of Trump's metrics went down in that period -- seems odd that Garcia would drive down his numbers on inflation too.
Chris Carollo (@crazytalk.bsky.social) reply parent
The question to me is whether *repeated* trips matter, or if they stop being news.
Chris Carollo (@crazytalk.bsky.social) reply parent
Yeah I don't think it's that different from clicks driving what media writes about. In this case, people like being flattered, and that shows up in the A/B response testing they do, so it tends towards flattery. I'd be surprised if execs had much of a thumb on the scale.
Chris Carollo (@crazytalk.bsky.social) reply parent
I agree, but I think he's dismissing inflation too much; whether it's material-based anger or the seed of the narrative, it played a large role! It's much harder to build a narrative out of whole cloth.
Chris Carollo (@crazytalk.bsky.social) reply parent
Voters don't have to actually be responding to prices going up for "inflation" -- which they do in fact not like -- to be an incredibly powerful and effective thing to build a holistic narrative out of.
Chris Carollo (@crazytalk.bsky.social) reply parent
You keep making this point but: 1) Less time for narrative to take hold 2) Midterm which favors high-info voters (Democrats) 3) Mere months after Dobbs. I'm with you on the econ actually being good and it being mostly media environment, but inflation was the anchor that the narrative hung on.
Chris Carollo (@crazytalk.bsky.social) reply parent
Eh, Perfect 10 v. Amazon makes me think it's not at all clear-cut. And there are lots of cases where fair use includes "material parts of the product".
Chris Carollo (@crazytalk.bsky.social) reply parent
My sense is that at some point there are going to be semi-autonomic agents that inherently "learn" by existing in the world, and it's nonsensical for them to be blind and deaf, so we're going to have to sort this out one way or another, and I think the rules that apply to humans work pretty well.
Chris Carollo (@crazytalk.bsky.social) reply parent
I think that's an interesting case that the courts would probably have to sort out. But it also hinges on what the program *does* more than anything else.
Chris Carollo (@crazytalk.bsky.social) reply parent
Why? My looking at copyrighted work and using it to generate derivative, transformative works isn't a violation. A computer looking at copyrighted work to generate search indices isn't a violation. We have mechanisms to judge whether output is violating. Why is that insufficient?
Chris Carollo (@crazytalk.bsky.social) reply parent
It's very rare for generative AI to reproduce any of its training materials in their "complete and unaltered form" -- that's just not how training works; they fundamentally don't have enough weights to encode all of that.
Chris Carollo (@crazytalk.bsky.social) reply parent
I don't see how any of that is relevant? (though I'm sure there are generative AI models that can do a pointillist piece based off a single example)
Chris Carollo (@crazytalk.bsky.social) reply parent
Of course, but what generative art AIs do is pretty much entirely the former. If they start doing the latter, or in rare cases where they do something similar enough, that's pretty clear infringement.
Chris Carollo (@crazytalk.bsky.social) reply parent
I don't really see it as stealing any more than a person going to a museum is stealing. In either case, if the work generated is infringing, then the courts are ready and capable of handling that. And they should! If it's generating new content, then that seems fine with me.
Chris Carollo (@crazytalk.bsky.social) reply parent
If the images are too similar to some original source material (many are not), go ahead and sue, just like with humans. Either way, the issue is with the output, not that they saw copyrighted material.