carl24k (@carl24k.bsky.social) reply parent
business owners and customers are humans and they shouldn’t have to bear the cost. I live in Berkeley so trust me I know what I’m talking about. You probably live in a gated community You Bastard 😉
#DataScience & #MachineLearning - #Neuroscience & #NeuroAI - Author of Fighting Churn With Data - https://linktr.ee/carl24k
146 followers 273 following 275 posts
carl24k (@carl24k.bsky.social) reply parent
I’m going to stick my neck out here: the camps do real economic harm to nearby businesses. The reality is people don’t want to walk past camps to shop or rent an apartment. If you want to, invite some unhoused to camp at your place. But it needs to be banned and enforced anywhere near businesses.
carl24k (@carl24k.bsky.social)
Need some last minute reading for your Labor Day Weekend? Just in time, it's the @manning.com #labordaysale! Labor can meet the beach as you #learntocode or #learndatascience this weekend! www.manning.com/books/fighti...
carl24k (@carl24k.bsky.social)
Are #AI and #churn your two favorite subjects? Then you are my BFF! But seriously, these two subjects combine in one great article... AI companies have set unbelievable records for rapid growth in ARR, but these companies have equally astonishing churn rates. #AIHype sifted.eu/articles/ai-...
carl24k (@carl24k.bsky.social) reply parent
I ❤️ #streamlit
carl24k (@carl24k.bsky.social) reply parent
I say brace for my AI induced market volatility 😬
carl24k (@carl24k.bsky.social)
This #AI report caused stocks to drop this week. Learnings & takeaways: - Buying tools is better than in-house apps - Back office automation most successful; most companies focus on sales and marketing. - managers should drive adoption, not an AI group #aihype finance.yahoo.com/news/mit-rep...
carl24k (@carl24k.bsky.social) reply parent
And they might have to make some changes in the way productivity is measured, just as they adapted some GDP measurements to account for quality improvements.
carl24k (@carl24k.bsky.social) reply parent
Nor should we be surprised that a lot of projects are cancelled. How many IT projects get cancelled? Lots, I assume. And it was probably even worse back in the 90's.
carl24k (@carl24k.bsky.social)
If the "productivity paradox" of the 90's is any guide, we should start to see AI impacting productivity numbers in about 10 years. So not seeing AI "pay off" yet should not come as a surprise. www.nytimes.com/2025/08/13/b... #AI #LLM #AIHYPE
carl24k (@carl24k.bsky.social) reply parent
Yeah what’s a blue sky vibe anyway? But I do think it would be nice if there were more posts that expressed Manning’s character as an organization that supports learning, not just promoting specific products
carl24k (@carl24k.bsky.social)
A Warm welcome to my publisher @manning.com who has joined us on Bluesky 🎉. #learntocode #learndatascience
carl24k (@carl24k.bsky.social) reply parent
There are now academic theories of consciousness, most notably the global workspace theory (GWT) and the integrated information theory (IIT). Maybe Claude can tell us which theory of consciousness would explain its own?
carl24k (@carl24k.bsky.social) reply parent
This was a massive task, consuming a combined 3,000 hours of labor and a small army of 21 researchers! TheAgentCompany is meant to resemble a small software development company. The researchers made a number of tasks for the benchmarking — some easy, some hard, and some hard but with easy parts.
carl24k (@carl24k.bsky.social) reply parent
The researchers in Carnegie Mellon University's School of Computer Science made a simulation environment to benchmark the performance of AI agents and test their abilities on real-world tasks.
carl24k (@carl24k.bsky.social) reply parent
IMHO this was a sincere effort by a highly qualified group of researchers (this is Carnegie Mellon!) and we should take it seriously. It's great progress for this field that someone has made a benchmark that will actually accelerate progress by making different systems objectively comparable.
carl24k (@carl24k.bsky.social) reply parent
I expect true believers and those who profit from AI Hype will find reasons to doubt this result by claiming the tasks or environment were not realistic enough, there was something wrong with the testing methodology, or they didn't use the best LLM agents out there.
carl24k (@carl24k.bsky.social) reply parent
The results are comforting for people worried about AI replacing them: - Best: Claude 3.5 Sonnet from Anthropic, completed 24% of the tasks. - 2nd Place: Google's Gemini 2.0 Flash, completed 11.4% of the tasks. - 3rd Place: OpenAI's GPT-4o, completed 8.6% of the tasks.
carl24k (@carl24k.bsky.social)
Do you want to know if "AI Agents" are ready for prime time or just hype? So did researchers at Carnegie Mellon University -- the results may surprise you! www.cs.cmu.edu/news/2025/ag... #AI #AIAgents #LLM #AIHype #AIResearch #CMU #Claude #Gemini #GPT4o
carl24k (@carl24k.bsky.social) reply parent
TLDR: a little knowledge is dangerous
carl24k (@carl24k.bsky.social) reply parent
In fact it goes back a long time - Check out this 2012 book "The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty" - as long as people have been collecting averages for business, people have been trying to interpret them without confidence bounds. It's kind of the same, right?
carl24k (@carl24k.bsky.social) reply parent
Personally I think whoever said "#XGBoost killed data science" had it right (I can't take credit for that quip) - as the tools for making models got easier and easier, people stopped asking whether their models make sense, or whether they even performed a cross validation.
carl24k (@carl24k.bsky.social) reply parent
You've heard of #vibecoding? Well now you get #vibedatascience - a non-data scientist asks an AI to write code to solve a data science problem. But not knowing how to evaluate a data science solution, they end up with garbage.
carl24k (@carl24k.bsky.social)
Has #DataScience Become a Pseudo Science? I came across this pretty scary/sad Reddit, Inc. thread - has "democratization" of data science with #AI tools led to a collapse in standards? Here's my rant on the topic... www.reddit.com/r/datascienc...
carl24k (@carl24k.bsky.social) reply parent
I wonder if this is related to the nature of the task - these were experienced developers doing something they already knew well how to do. Personally I find that LLM's speed me up the most when I'm trying to do something on a new platform that I'm not familiar with. WDYT?
carl24k (@carl24k.bsky.social) reply parent
The gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%. In fact it slowed them down by 19% ⁉️
carl24k (@carl24k.bsky.social)
How much does AI REALLY speed up programmers? The answer from the latest study may surprise you: For experienced open source developers, it actually slowed them down! According to a new study from metr.org (Model Evaluation and Threat Research). #LLM #AI #codegen
carl24k (@carl24k.bsky.social) reply parent
So I added a function where you can calculate a score for both the first stage and final stage models with any sklearn metric, or even any function that you pass. This will help enable more use of DML in industry where the data quality is low and the model performance is hard to interpret.
carl24k (@carl24k.bsky.social) reply parent
Before the PR only the mean square error (MSE) was implemented. But MSE is hard to interpret and compare across models. So if you are fitting a regression model with the MSE it is helpful to look at r-squared; or if you are fitting a classification model with log-loss it is helpful to look at AUC.
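To illustrate the point about interpretability, here is a minimal sketch in plain Python. It is not the EconML API and not the code from the PR; the helper names and the toy data are made up for illustration. It shows that MSE changes with the scale of the target, while r-squared stays the same for the same relative fit.

```python
# Hypothetical helpers for illustration only; not part of EconML or sklearn.

def mse(y_true, y_pred):
    """Mean square error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data (assumed): the same fit expressed in two different units.
y_small = [1.0, 2.0, 3.0, 4.0]
pred_small = [1.1, 1.9, 3.2, 3.8]
y_large = [v * 100 for v in y_small]
pred_large = [v * 100 for v in pred_small]

mse_small = mse(y_small, pred_small)
mse_large = mse(y_large, pred_large)
r2_small = r_squared(y_small, pred_small)
r2_large = r_squared(y_large, pred_large)

# MSE differs by a factor of 10,000 between the two scales;
# r-squared is the same for both.
print(mse_small, mse_large)
print(r2_small, r2_large)
```

The MSE values cannot be compared across the two targets without knowing the units, while r-squared can be read directly as the share of variance explained.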
carl24k (@carl24k.bsky.social)
My PR to the #EconML #PyWhy #opensource #causalai project was merged! 🎉 I made a small contribution by allowing a flexible choice of evaluation metric for scoring both the first stage and final stage models in Double Machine Learning (#DML). #CausalInference #machinelearning #datascience
carl24k (@carl24k.bsky.social)
It's time for everyone's favorite annual American tradition, the Manning Publications Co. 4th of July Sale! 🎆 Now through Sunday - Great time to fight #churn with #datascience and #machinelearning. www.manning.com/books/fighti...
carl24k (@carl24k.bsky.social) reply parent
5. Series finales and cancellations generate measurable subscriber losses. Services have to invest big in content rights to offset the standing average churn in the streaming era.
carl24k (@carl24k.bsky.social) reply parent
3. Netflix's cancellation rate remained stable at 1-2% despite implementing a pricing increase in January 2025 across all tiers. 4. Every major price increase in 2024 corresponded with churn spikes: Peacock's $2 increase in July, Paramount+'s $1-2 increase in August, and Hulu's in October.
carl24k (@carl24k.bsky.social) reply parent
1. Max maintained remarkably stable cancellation rates of 6-7% throughout 2024, even amidst content changes, price increases, and "the attempted burial of the greatest TV brand ever". 2. Hulu's cancellation rate doubled from 3% to 6% following its password-sharing restrictions in late 2024.
carl24k (@carl24k.bsky.social)
Do you want to know what's driving streaming churn? StreamTV Insider released a nice breakdown of the Antenna streaming report with some good insights: www.streamtvinsider.com/video/ring-c... #svod #churn #streaming #subscription 🧵
carl24k (@carl24k.bsky.social) reply parent
There is a website where you can track the latest stats: livecodebenchpro.com. And the research methodology paper is on Arxiv: arxiv.org/pdf/2506.11928
carl24k (@carl24k.bsky.social) reply parent
❗ All the coding assistants completely fail (0% pass rate) at the "hard" level problems (there are NO red bars in the accuracy chart). So don't hold your breath if you're expecting these systems to totally replace human coders anytime soon. Expert level coders, your jobs are safe for now. 😅
carl24k (@carl24k.bsky.social) reply parent
✅ The best freely available coding assistant at this time is Google Gemini 2.5 flash. The GPT-4o based assistant from ChatGPT which is now freely available is trash. (This is AFAIK - if any of the better models are freely available please correct me!)
carl24k (@carl24k.bsky.social)
Are you someone who works with code? Do you want to tell the hype from reality in #LLM coding assistants? Apple created a new coding benchmark #livecodebench with help from human Olympiad medalists, preventing contamination with continuously updated problems. Top findings 🧵
carl24k (@carl24k.bsky.social) reply parent
I think these findings are specific to Stripe's customer base of smaller businesses, but still useful! The report also gives useful details on the breakdown of churn between voluntary cancels and "involuntary" or "passive" churns due to card issues. Link: go.stripe.global/rs/072-MDK-2...
carl24k (@carl24k.bsky.social) reply parent
These findings are a bit surprising since I have seen wider spreads in other churn benchmark datasets: - Churn is somewhat lower for SaaS and Retail (36%) and higher for education and personal services (40%). - Churn is lower on B2B (36%) than B2C (38%)
carl24k (@carl24k.bsky.social) reply parent
These findings put useful numbers on common sense: - Churn declines with average order value, from 38% for orders of $10 or less down to 17% for orders of $10,000 or more. - Churn is lowest for credit cards (34%), higher for debit (37%) and highest for prepaid cards (42%)
carl24k (@carl24k.bsky.social)
Want a new reference on subscription churn rates to benchmark against? Look no further! Stripe has entered the subscription benchmarks games with "The Stripe Guide to Churn" (Personally I can't get enough churn benchmarks 🙂 ) Highlights... #churn #churnrate #subscription
carl24k (@carl24k.bsky.social) reply parent
And that buffed out llama thing is creepy
carl24k (@carl24k.bsky.social) reply parent
#dumpsterfire? 11 of the 14 researchers on that original model have left the company. Senior executives are blaming the remaining research team. How long do you think they'll stay on for and how will they find a new, better team?
carl24k (@carl24k.bsky.social)
Meta will spend $72 BILLION this year to realize Zuck’s ambitions for AI. But they are delaying rollout, and it was caught cheating using a model hacked to do well on the benchmark test, according to WSJ. #LLM #AIHype #Genai www.wsj.com/tech/ai/meta...
carl24k (@carl24k.bsky.social) reply parent
Clear analogy to those who are saying that LLM's are going to be the end of programmers - LLM's are making programmers more efficient, and perhaps that may lead to less need - Or it may just lead to more ambitious goals for technology organizations that leads to MORE need for programmers!
carl24k (@carl24k.bsky.social)
I don't think Hinton is ever going to live down saying “stop training radiologists”... But at least in this article he shows some contrition and acknowledges that he underestimated what radiologists do with their time. #AIhype #LLM #DataScience #machinelearning www.nytimes.com/2025/05/14/t...
carl24k (@carl24k.bsky.social)
I commissioned an abstract painting of visual cortex from Anjiolina, so I can vouch for the fact that she is for real and does amazing work!
carl24k (@carl24k.bsky.social) reply parent
Still, it's nice that someone published about churn in a journal!
carl24k (@carl24k.bsky.social) reply parent
TBH I'm taking the conclusions with a grain of salt because (1) They only looked at a single Kaggle dataset; (2) The dataset was not really very imbalanced, with a 20% churn rate; (3) they don't address the calibration consequences of using an oversampling technique like SMOTE.
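On point (3), a minimal sketch of why calibration matters: a model trained on oversampled data sees a higher churn rate than the real one, so its predicted probabilities come out too high. The standard prior-shift correction below is exact for simple random oversampling and only an approximation for SMOTE, since SMOTE creates synthetic points. The function name and example rates are assumptions for illustration and are not taken from the paper.

```python
# Hypothetical helper for illustration; not from the paper or any library.

def correct_for_oversampling(p_sampled, true_rate, sampled_rate):
    """Map a probability predicted by a model trained on resampled data
    back to the original class balance (prior-shift correction).

    p_sampled    : predicted churn probability from the resampled model
    true_rate    : churn rate in the original data (e.g. 0.20)
    sampled_rate : churn rate in the resampled training data (e.g. 0.50)
    """
    pos = p_sampled * (true_rate / sampled_rate)
    neg = (1.0 - p_sampled) * ((1.0 - true_rate) / (1.0 - sampled_rate))
    return pos / (pos + neg)


# A model trained on a 50/50 resampled set predicts 0.5 for a customer;
# with a real 20% churn rate the corrected probability is the base rate.
corrected = correct_for_oversampling(0.5, true_rate=0.20, sampled_rate=0.50)
print(corrected)
```

Without a correction like this (or a separate calibration step), the raw scores can still rank customers but should not be read as churn probabilities.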
carl24k (@carl24k.bsky.social)
#Churn is in Nature's Scientific Reports! It's an interesting study looking at #SMOTE and a variety of different classifiers. www.nature.com/articles/s41...
carl24k (@carl24k.bsky.social) reply parent
What are Augustinian values? For those of us who know nothing
carl24k (@carl24k.bsky.social)
#AI #Hallucinations are getting worse, just when we were told they were going to get better 🤯 Apparently hallucinations snowball when the new gen of AI tries to "reason" #AIhype #GenAI #LLM Gift link to the full article: www.nytimes.com/2025/05/05/t...
carl24k (@carl24k.bsky.social) reply parent
Thanks I’m going to check that out!
carl24k (@carl24k.bsky.social) reply parent
There is the Brain Inspired podcast discord. It’s not a super active server, but there’s usually some discussion or another going on. Note that you need to support the podcast on Patreon to join. Lol I just saw the new neuro ai tag line when I pasted this link braininspired.co
carl24k (@carl24k.bsky.social) reply parent
It’s a challenge as a data scientist and machine learning engineer in industry too. Explaining things well takes a long time.
carl24k (@carl24k.bsky.social)
Big news in the small world of Fighting #Churn With #DataScience The #opensource #python code for my book has been updated to support Python up to version 3.12! pypi.org/project/figh... Shoutout to Shaolang Ai for doing all the work on this! I only helped with testing. I appreciate you!
carl24k (@carl24k.bsky.social) reply parent
Agreed. I think the reason coding is the one use case is because you can get instant feedback when the LLM answer is BS by copy-pasting the code and running it. I don’t think there’s any other domain where you can get that instant feedback. So unthinking programmers overestimate how useful it is
carl24k (@carl24k.bsky.social) reply parent
I’m an LLM skeptic, but even I use them for coding. Constantly. It is based on plagiarism from stack overflow. I feel a bit bad about that, but not enough to not do it.
carl24k (@carl24k.bsky.social) reply parent
Yeah they’re useful. Just not as useful as the startups hype them up to be. I estimate they increase my productivity 10%. But not 100%.
carl24k (@carl24k.bsky.social)
Check out this new compilation of #churn rates from Exploding Topics Compiled from a number of different sources, it includes #socialmedia apps, transportation, #finance, #retail and international comparisons. explodingtopics.com/blog/custome...
carl24k (@carl24k.bsky.social) reply parent
Sounds like a typical 80-20 situation: The AI completes the 80% of the task items that are actually easy, but can't handle the 20% of the items that are actually hard and consume 80% of the effort.
carl24k (@carl24k.bsky.social)
Don't worry software engineers! Your jobs are safe a little while longer - Researchers say #AI can't replace humans because it cannot reliably debug. 🐛 #AIHype much? #llm #genai #chatgpt #programming #softwareengineering arstechnica.com/ai/2025/04/r...
carl24k (@carl24k.bsky.social) reply parent
My fave because I created it in 2016. Wow, time flies
carl24k (@carl24k.bsky.social)
My Favorite #Subscription report is back! The Zuora Subscription Economy index - "#Churn leveled off (in 2024) and declined slightly across industries in the SEI after spiking in 2023 due mainly to rising interest rates" www.zuora.com/resource/sub... #churnrate #analytics #subscriptioneconomy
carl24k (@carl24k.bsky.social) reply parent
I think people are waiting for clear proof that #Causal ML will do better on real world problems. Right now everyone uses supervised ML and ignores confounding. I’m working on a project at my company to try #CausalML. High memory usage for large datasets is a problem for CML
carl24k (@carl24k.bsky.social) reply parent
This misunderstanding plagues most self-trained (non-academic) AI researchers too - successful research is more "art" than "craft" - you can't brute force it by trying lots of hypotheses - there are too many. Deep understanding and good intuition are needed.
carl24k (@carl24k.bsky.social)
Heard about the AI 2027 forecast? I don't believe it. The least plausible part is a leap from AI coding to AI research - totally underestimates the vast spaces of plausible hypotheses that good researchers use their knowledge and understanding to prune down. #AI #aihype #GenAI #LLM #AI2027 #neuroai
carl24k (@carl24k.bsky.social) reply parent
Definitely
carl24k (@carl24k.bsky.social) reply parent
They also say: "Artificial intelligence (AI) has become indispensable. AI enables businesses to analyze subscriber behaviors, predict churn risks, and automate recovery efforts." I always said, Fight #Churn With Data!
carl24k (@carl24k.bsky.social)
Recurly published their 2025 State of #Subscriptions (I missed this back in Feb) - features industry #churnrates and data on re-signups after churn: "Return acquisition percentage" for % of new signups that were previously subscribed. go.recurly.com/2025-state-o...
carl24k (@carl24k.bsky.social) reply parent
Weird. Is this an academic politics thing? As an outsider I would imagine a lot of people would want to go to both. But they forced people to choose.
carl24k (@carl24k.bsky.social)
As a #neuroscience watcher on Bluesky who is no longer in the biz, I’ve become a bit confused these last few days: are #Cosyne2025 and #CNS2025 both happening right now? Or are some people getting hashtags mixed up in their posts? Or what?
carl24k (@carl24k.bsky.social) reply parent
😆
carl24k (@carl24k.bsky.social) reply parent
Is there a paper / poster specifically on this? IIRC @tonyzador.bsky.social speaking on @braininspired.bsky.social podcast specifically said the brain is rather un-transformer like, and the power of transformers seems to stem from their alignment with GPU architectures, not brains.
carl24k (@carl24k.bsky.social)
Apple TV+ feels the #Churn, as reported by Yahoo Finance: #churnrates from the Antenna report, which I shared previously: Apple TV has 7% monthly churn, compared to Netflix's 2% finance.yahoo.com/news/apple-i....
carl24k (@carl24k.bsky.social) reply parent
Which LLM will give the correct answer "You cannot do X with Y" from the start? I mean this as a serious question - I'm not just trying to shade LLM's here.
carl24k (@carl24k.bsky.social)
Ask an #LLM "How can I do X with platform Y" and the answer is "Absolutely, here is how to do X with Y..." But there are lots of hallucinations from other platforms. After several errors I ask "Are you sure you can do X on Y?" - then it says, "You are correct, you cannot do X on Y. Here's why..." 🤯
carl24k (@carl24k.bsky.social)
🍀 SALE from Manning Publications Co. !!! Now thru Monday. #DataScience #machinelearning #learntocode www.manning.com/books/fighti...
carl24k (@carl24k.bsky.social) reply parent
I posted this on LI today and it’s not a joke: I would really like an LLM that would be less confident and think more.
carl24k (@carl24k.bsky.social) reply parent
No way. I use them every day and it’s definitely not thinking
carl24k (@carl24k.bsky.social)
Do you think current #LLM already empower ANYONE to code? Time to put down the kool aid... #aihype #learntocode #DataScience #MachineLearning www.theguardian.com/games/2025/m...
carl24k (@carl24k.bsky.social) reply parent
Is there really that much more funding if a neuro lab starts doing more AI inspired stuff? Because they can’t just drop biology and create new transformer models - they haven’t the background and likely not the temperament for that. what kind of convergence do you see?
carl24k (@carl24k.bsky.social) reply parent
Do you think many people would take them up on this offer? I don’t know PH. I have been to Taiwan and my sense is there is still anger towards JP in TW. My guess is that an offer like this may be met with indifference or even anger.
carl24k (@carl24k.bsky.social)
Seriously, I am willing to admit significant gains in my own productivity from LLM's but significant means like 10%, maybe 15%; not 1000%. So when I hear claims of 10X productivity improvement I think, what were those people doing beforehand? #genai #chatgpt #llm #aihype #datascience
carl24k (@carl24k.bsky.social)
#Churn comes to #podcasts! "#Podcast #subscription businesses are maturing... they face a challenge that publishers have grappled with for years: churn." Fortunately, fighting churn for Podcasters is the same as for everybody else - the techniques are well known. digiday.com/media/how-po...
Stand Up for Science! (@standupforscience.bsky.social) reposted
Welcome to the Bluesky account for Stand Up for Science 2025! Keep an eye on this space for updates, event information, and ways to get involved. We can't wait to see everyone #standupforscience2025 on March 7th, both in DC and locations nationwide! #scienceforall #sciencenotsilence
carl24k (@carl24k.bsky.social)
Antenna Web TV has released a new "State of #Subscription Report". #Churn highlights: Churn has "settled" down after climbing since 2022 - 2024 Weighted average #SVOD monthly churn is 5%. BUT churn is less than 3% when including re-subscribers! Churn & Return lives!
carl24k (@carl24k.bsky.social)
Gary Marcus has more straight talk on #GenAI & #AGI in Fortune magazine. Spoiler alert - don't hold your breath for AGI. fortune.com/2025/02/19/g.... #AI #datascience #llm
carl24k (@carl24k.bsky.social)
#App #Churn is so HIGH! 80% churn on day 1, 98% churn by day 30 🤯 #SaaS and #Streaming are lucky. As David Curry from Business of Apps observes, "the majority of apps being free to download... users have invested very little into the app." www.businessofapps.com/data/app-ret...
carl24k (@carl24k.bsky.social)
#AI meets #Churn - who will win? Personally, I would never bet against churn. www.theinformation.com/articles/why.... I don't subscribe - can anyone share the article? DM me.
carl24k (@carl24k.bsky.social)
New #churn report by Subbly, covering a variety of #subscription products (mostly physical product subscriptions.) One of the key insights: churn did not depend significantly on price! Overall 2024 churn rate of 7.4% monthly for their client companies, ~60% per year. www.subbly.co/blog/subscri...
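For anyone checking the arithmetic, a minimal sketch of the monthly-to-annual conversion. It assumes the monthly churn rate is constant and compounds on the surviving customers each month; the function name is made up for illustration.

```python
# Hypothetical helper for illustration only.

def annual_churn(monthly_churn, months=12):
    """Convert a constant monthly churn rate to the churn over `months`.

    Each month a fraction (1 - monthly_churn) of customers survives,
    so survival compounds and churn is one minus the surviving share.
    """
    return 1.0 - (1.0 - monthly_churn) ** months


yearly = annual_churn(0.074)
print(round(yearly, 3))  # roughly 0.60, i.e. about 60% per year
```

Note that simply multiplying 7.4% by 12 would give about 89%, which overstates annual churn because it ignores that each month's churn applies only to the customers who are still left.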
carl24k (@carl24k.bsky.social)
Are the #magnificent7 going to eat themselves with #AI? WDYT? This is kind of a 🤯 www.nytimes.com/2025/01/28/o...
carl24k (@carl24k.bsky.social) reply parent
Yes I remember that one - he thought his robot was going to save the day! It’s not just that he thinks tech will save the day, but he thinks he is an expert in everything - even when he knows nothing
carl24k (@carl24k.bsky.social)
#bayarea #weather is TOO nice - tshirt weather in January means it’s going to be a drought year 😢 where did all the great rain from December go? I guess this is linked to the socal fires, and we may be in for a fiery summer as well
carl24k (@carl24k.bsky.social) reply parent
Christof was my PhD advisor!