Yoav Goldberg (@yoavgo.bsky.social) reply parent
this last one is exactly the "hide keys" scenario right? because otherwise the client could also read/write from s3.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
got it. i guess it can be done with views / stored procedures, but maybe people will prefer using a non-db language.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
but in the early days the PHP/CGI also did the rendering. now the rendering is done on the client, so why not call the SQL from the client as well?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i guess i also don't understand the majority of the "backend" thing, assuming it is stateless and only orchestrates calls to other services.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
how do you ensure it happens in a serverless function? always call some authentication server before calling the other server(s)? if so, why can't a token-based solution work just as well? (the auth server gives the client a token, which is then used with the other servers)
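A minimal sketch of the token-based flow asked about here, assuming an HMAC-signed token and a secret shared between the auth server and the other servers. The names `SHARED_SECRET`, `issue_token` and `verify_token` are hypothetical, and real deployments would more likely use an established format such as JWT. Note that the signing secret still lives only on servers, which is the "hide keys" point again.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical secret known to the auth server and the resource servers only;
# it is never shipped to the browser.
SHARED_SECRET = b"demo-secret-not-for-production"


def _sign(payload: str) -> str:
    return hmac.new(SHARED_SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Auth server: sign "user_id:expiry" and hand the result to the client."""
    issued_at = time.time() if now is None else now
    payload = f"{user_id}:{int(issued_at) + ttl_seconds}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Resource server: return the user id if the token is valid, else None."""
    current = time.time() if now is None else now
    try:
        user_id, expiry, signature = token.rsplit(":", 2)
    except ValueError:
        return None  # not in "user:expiry:signature" form
    if not hmac.compare_digest(signature, _sign(f"{user_id}:{expiry}")):
        return None  # tampered with, or signed with a different secret
    if not expiry.isdigit() or int(expiry) < current:
        return None  # expired
    return user_id
```

The client only stores and forwards the token string; it cannot mint or alter one without the secret.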
Yoav Goldberg (@yoavgo.bsky.social) reply parent
wdym by "maintain code on the client"? its a browser and it runs whatever code you put in your static html/js files.
Yoav Goldberg (@yoavgo.bsky.social)
can someone explain "serverless backends" to me? it seems that they run functions on demand. but if these functions cannot access any persistent state, why not run them on the client? the only reason I see is to hide DBs/APIs tokens/secrets from the client, but is that really all there is to it?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
its perhaps THE most fascinating thing about LLMs in my view.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
how do you know the ground truth for human steps?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
(or we can see it as a metaphor maybe)
Yoav Goldberg (@yoavgo.bsky.social) reply parent
yup, happened to me too.. thats a bug. will attempt to fix tonight.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
not early - but i did get both balls on the same side after a day+, a bug on my part which should be fixed at some point...
Yoav Goldberg (@yoavgo.bsky.social) reply parent
cest ne une rosh shel dag
Yoav Goldberg (@yoavgo.bsky.social) reply parent
maybe. should be easy enough to check.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
the suspense is like 80% of the fun!!
Yoav Goldberg (@yoavgo.bsky.social) reply parent
it kinda looked like a pokeball at some point.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
wdym by "similar"? the outer circle cannot change. but it is quite diverse within that constraint.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
interesting. i'd say "take the code and try it out", but i suspect it is too brittle and will just trash the hacky collision mechanism and get you some very weird results... (so, "yes", i guess, but for the wrong reasons)
Yoav Goldberg (@yoavgo.bsky.social) reply parent
it is currently fully deterministic. it should be easy to introduce randomness in the starting directions of each "ball".
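A rough sketch of what randomized starting directions could look like, assuming each "ball" moves along a 2D unit vector. This is not the page's actual code (which is not shown here); `random_direction` is a hypothetical helper.

```python
import math
import random
from typing import Tuple


def random_direction(rng: random.Random) -> Tuple[float, float]:
    """Return a unit vector pointing at a uniformly random angle."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return (math.cos(angle), math.sin(angle))


# A fixed seed keeps any single run replayable even though directions vary.
rng = random.Random(42)
directions = [random_direction(rng) for _ in range(2)]
```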
Yoav Goldberg (@yoavgo.bsky.social)
i created this thingy yesterday and now I cannot stop watching it. yoavg.github.io/eternal/
Yoav Goldberg (@yoavgo.bsky.social)
a trivia fact about this paper is that we submitted it to arxiv weeks ago, and it was hanging there in limbo for quite a while. apparently because we submitted to "AI" while they moved it to "HCI".
Yoav Goldberg (@yoavgo.bsky.social) reply parent
the appealing thing for me about this particular test is that we managed to somewhat-robustly measure something meaningful about the model's "actual" process, *and* sort-of measure how humans perceive it.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i mean, i think about these topics a lot and i think the levels in which we don't understand these things are very much multi-faceted ;) this is just one of them.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
I was not familiar, and it looks very interesting. thanks!
Yoav Goldberg (@yoavgo.bsky.social) reply parent
what this means is that we don't have a good mental model of how different steps in the LLM generation process depend on each other, and what are the causal relations that make it tick. which in turn means that LLM's explanations might be "transparent", but also that we fail to interpret them.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
the test is very simple: given an AI reasoning text broken into steps, we highlight one step as the target, and then ask you to identify which one of four preceding steps that we chose would, if removed, change the target step. turns out this is really hard.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
how well do *you* understand how AI reasoning works? test yourself here: do-you-understand-ai.com
Yoav Goldberg (@yoavgo.bsky.social)
When reading AI reasoning text (aka CoT), we (humans) form a narrative about the underlying computation process, which we take as a transparent explanation of model behavior. But what if our narratives are wrong? We measure that and find they usually are. Now on arXiv: arxiv.org/abs/2508.16599
Yoav Goldberg (@yoavgo.bsky.social) reply parent
"attention" duh. also "zoneout", though it was admittedly short-lived.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
to be more precise, I did not say they were not "based on" linear algebra, only that linear algebra is for the most part not important to understand how they work, and hardly any improvements came about because someone "knew linear algebra". but yes, it's very similar.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
It's live and kicking! paperfinder.allen.ai/chat/ If it was down, it was a temporary technical issue.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
legit, though i'd be surprised if using the nakdan (diacritizer) gives them ownership of the diacritized material
Yoav Goldberg (@yoavgo.bsky.social) reply parent
the donors are also the same donors, as far as i know. but the organizations are separate from each other.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i stopped listening to Roger Waters, but mostly because he's honestly pretty boring.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
hi! i haven't been there for a while now, but i talk to them from time to time. good team.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
wdym by "enable manipulation"? the metaphors i think are only "nice to have" are the "projections between spaces" kind of things.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
also, there ARE benefits to taking algebra in undergrad. you learn stuff, and the notions of proofs and abstraction are important to acquire. it's just not central to DL.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
and there is also this: bsky.app/profile/yoav...
Yoav Goldberg (@yoavgo.bsky.social) reply parent
currently it is a pre-req here as well. but ideally it will be removed and replaced with a more suitable class.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
the algebraic terminology is here to stay unfortunately, and we should use it. it doesn't mean DL is "built on" linear algebra, nor that a linear algebra class should be a pre-req.
Yoav Goldberg (@yoavgo.bsky.social)
if you REALLY want to understand DL, you should start by honing your Category Theory skills, as almost everything in DL at its core can be mapped to a functor or an endofunctor.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
OR you could do what people actually do in ML these days and associate each symbolic token with a random list of numbers, and let the optimization take care of it.
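A toy sketch of that recipe in plain Python: every token starts as a random list of numbers, and gradient descent adjusts the lists. The vocabulary, the pairwise similarity targets and the learning rate are all made up for illustration.

```python
import random

rng = random.Random(0)
DIM = 4
vocab = ["cat", "dog", "car"]

# Each symbolic token starts as a random list of numbers.
emb = {tok: [rng.uniform(-0.5, 0.5) for _ in range(DIM)] for tok in vocab}

# Hypothetical training signal: a desired similarity score for some token pairs.
targets = [("cat", "dog", 1.0), ("cat", "car", 0.0), ("dog", "car", 0.0)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def loss():
    return sum((dot(emb[a], emb[b]) - t) ** 2 for a, b, t in targets)


initial_loss = loss()
LEARNING_RATE = 0.1
for _ in range(500):
    for a, b, t in targets:
        err = dot(emb[a], emb[b]) - t
        # d(err**2)/d(emb[a]) = 2 * err * emb[b], and symmetrically for emb[b].
        grad_a = [2 * err * x for x in emb[b]]
        grad_b = [2 * err * x for x in emb[a]]
        emb[a] = [x - LEARNING_RATE * g for x, g in zip(emb[a], grad_a)]
        emb[b] = [x - LEARNING_RATE * g for x, g in zip(emb[b], grad_b)]
final_loss = loss()
```

Nothing about the tokens themselves is encoded up front; whatever structure ends up in the numbers comes from the optimization.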
Yoav Goldberg (@yoavgo.bsky.social) reply parent
ah, likely. idk
Yoav Goldberg (@yoavgo.bsky.social) reply parent
("any" is a bit extreme because the purists will come and say "ohh but you use commutativity and associativity of addition! thats group theory!!". but i agree with you of course)
Yoav Goldberg (@yoavgo.bsky.social) reply parent
it seems to me that a large chunk of ML can actually be characterized as doing dim reduction numerically
Yoav Goldberg (@yoavgo.bsky.social) reply parent
ok, but where is PCA important as a building block in ML?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
idk the sociology of it. tensors are popular because hardware/software support them, and its convenient to implement batches this way i guess.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
where is it used that is central / important?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
btw, re the hardware example: the hardware is not built to perform "norm". it is built to approximate norm over a floating point representation of real numbers. this is highly specific and doesnt enjoy the generality of algebra at all. i think it may not even be commutative and associative.
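A quick check of that last hunch for plain floating-point addition, using Python floats (IEEE 754 doubles). It covers only addition on three hand-picked values, not a norm routine on real hardware: adding two numbers gives the same result in either order, but regrouping three numbers changes the result.

```python
a, b, c = 0.1, 0.2, 0.3

# Commutativity: swapping the two operands gives the same rounded result.
commutative = (a + b) == (b + a)

# Associativity: regrouping changes where the rounding happens.
left = (a + b) + c
right = a + (b + c)
associative = left == right
```

Here `commutative` is true and `associative` is false, so at least associativity is lost in this case.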
Yoav Goldberg (@yoavgo.bsky.social) reply parent
bsky.app/profile/yoav...
Yoav Goldberg (@yoavgo.bsky.social)
taking it a step further, I'd say in many cases using the algebra jargon is harmful to understanding, and its better to just describe whats really going on. ie, "we add an L2 penalty term" --> want the sum of squares to be small. "project to vocab space" --> compute similarity to each vocab item.
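The two restatements in this post, written out as a small sketch. The weights, hidden vector and three-word vocabulary are invented for illustration, and similarity is taken to be a dot product.

```python
def l2_penalty(weights, strength):
    # "add an L2 penalty term": a scaled sum of squares that we want to be small.
    return strength * sum(w * w for w in weights)


def vocab_scores(hidden, vocab_vectors):
    # "project to vocab space": one dot-product similarity per vocab item.
    return {
        token: sum(h * v for h, v in zip(hidden, vector))
        for token, vector in vocab_vectors.items()
    }


weights = [0.5, -1.0, 2.0]
penalty = l2_penalty(weights, strength=0.01)

hidden = [1.0, 0.0, -1.0]
vocab_vectors = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}
scores = vocab_scores(hidden, vocab_vectors)
best_token = max(scores, key=scores.get)
```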
Yoav Goldberg (@yoavgo.bsky.social) reply parent
(and the useful part of the metaphor is also much more geometric than algebraic, imo)
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i think these are nice metaphors, and i use them daily. i am not at all convinced that they are essential, except for very few select places.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
because when we use a term such as "a norm" we get the definition, which is nice, but also a bunch of properties that hold for items of this kind. and if we dont actually rely on these properties later on, then its a waste in terms of what we communicated.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
so i would argue that efficiency of computation is nice but also somewhat accidental and not that important. but lets focus on efficiency of communication: my point is kinda exactly around this. in a field that is actually built around algebra, the communication efficiency would be MUCH higher.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
well, the claim "machine learning is built on set theory" is also absurd in my view. we probably don't agree on the definition of "built on".
Yoav Goldberg (@yoavgo.bsky.social) reply parent
if the claim is that there is nothing in math without algebra then okay, but that's a somewhat empty argument in my view
Yoav Goldberg (@yoavgo.bsky.social) reply parent
where do we use this concept? (and i think we are stretching the definition of algebra a bit here, because this way i could also say that all of functional analysis is really algebra, but let's set that aside for a moment)
Yoav Goldberg (@yoavgo.bsky.social) reply parent
can you elaborate on formalizing representation and approximation? why does it interest us, and what are the algebraic aspects there?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
what efficiency are we discussing here? efficiency of communication, or efficiency of computation?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
sorry, i meant '' "machine learning on optimization" is a claim i gladly ''
Yoav Goldberg (@yoavgo.bsky.social) reply parent
"machine learning is built on optimization" is a claim i gladly accept
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i agree that the language is being used. but the language is pretty much the only part that is being meaningfully used.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
linear algebra is also concerned with solving systems of linear equations, the representations of linear equations as matrices/vectors, and related objects like spans and bases.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
for me, the "algebra" part is the realization that the real numbers and addition/multiplication over them are just a special case of a "group" or a "field", and that many other kinds of groups and fields exist, and can be manipulated similarly, and share many properties.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
LoRA can be described without discussing ranks at all, and it would be just as effective. it doesn't rely on any property of rank to work.
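One way to spell that out, as a sketch with made-up sizes and no mention of rank. The frozen table `W` gets a trainable side path that first squeezes the input down to a few numbers (`A`) and then expands it back (`B`). Starting `B` at zero, as in the LoRA paper, means the adapted model initially behaves exactly like the original one.

```python
import random

rng = random.Random(0)
D_IN, D_OUT, SMALL = 6, 6, 2  # illustrative sizes; SMALL is the squeezed width


def random_table(rows, cols, scale):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


W = random_table(D_OUT, D_IN, 1.0)        # frozen pretrained weights
A = random_table(SMALL, D_IN, 0.1)        # trainable: input -> SMALL numbers
B = [[0.0] * SMALL for _ in range(D_OUT)]  # trainable: SMALL numbers -> output


def weighted_sums(table, x):
    """One output number per row: multiply element-wise, then add up."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in table]


def adapted_forward(x):
    base = weighted_sums(W, x)
    update = weighted_sums(B, weighted_sums(A, x))
    return [b + u for b, u in zip(base, update)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
full_params = D_OUT * D_IN
adapter_params = SMALL * D_IN + D_OUT * SMALL
```

The saving comes purely from `SMALL` being much smaller than the input and output sizes, which is a counting argument and needs no rank properties.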
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i would say maybe geometric intuitions and how matrix operators relate to geometry? distance functions? JL lemma? not sure. this sounds like a fun class to have, i must say! happy to learn what you *do* teach.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
what do you teach?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i think you need to learn the terminology. you do not need a traditional linear algebra course, which will be a waste of time from the perspective of ML (of course, its a beautiful topic on its own).
Yoav Goldberg (@yoavgo.bsky.social) reply parent
well if you consider the commutativity and associativity of addition to be linear algebra, then sure, its in. what about convolutions? what mathematical/algebraic properties of them are important?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i do agree that some amount of LA creeps in through optimization, yes. but i don't see how the perceived motivations for representation learning are relevant here.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
what intuitions would these be, for the case of linalg use in ML? matmul is defined the way it is in linalg for a reason, but i dont see how these reasons matter for the ML use.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
why is this statement useful / needed for ML?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
there are many results and concepts in linalg beyond determinants. they are not used in ML. even norms arent really used as norms (what properties of norms are needed, beyond the definitions?)
Yoav Goldberg (@yoavgo.bsky.social) reply parent
good thing that we have GPUs now, then.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
writing matmul by hand in C wouldnt be that bad in terms of performance, and neither would adding SIMD support to this code. it is nice that we had BLAS routines available, but i dont think it was a deal breaker.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
we could probably invent DL without any linalg concepts if these didnt exist. all we need is scalar addition and multiplication (and associated calculus rules). yes, this includes regularization and things like LoRA. figures like the one below kinda prove the point:
Yoav Goldberg (@yoavgo.bsky.social)
i'll elaborate: a common computation pattern in DL happens to coincide with a known operator in linear algebra (matmul), and so we conveniently borrow linalg notation and terminology (matrices, vectors, ranks, norms). but this is just jargon. the algebraic properties arent needed.
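The computation pattern in question, as a sketch that uses nothing beyond scalar addition and multiplication in nested loops. The input and weight values are arbitrary.

```python
def matmul(a, b):
    """Combine an n-by-k table with a k-by-m table using only scalar + and *."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            total = 0.0
            for p in range(k):
                total += a[i][p] * b[p][j]
            out[i][j] = total
    return out


inputs = [[1.0, 2.0], [3.0, 4.0]]               # two examples, two features each
weights = [[0.5, -1.0, 0.0], [1.0, 0.0, 2.0]]   # two features in, three units out
activations = matmul(inputs, weights)
```

Each output number is just a weighted sum of one example's features, which is all a dense layer needs before its nonlinearity.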
Yoav Goldberg (@yoavgo.bsky.social) reply parent
it is wrong.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
prove me wrong by listing linear algebra topics / results that are central (or even just important) for ML, modern or otherwise.
Yoav Goldberg (@yoavgo.bsky.social)
"Modern ML is built on Linear Algebra". lol no its not.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
yes, but this can be achieved also if i provide a json schema for describing tools and point to files containing such descriptions, with the implementations being done externally to the protocol. what's the benefit of coupling in the implementation?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
can you elaborate on these "more than API calls"? what else does it provide?
Yoav Goldberg (@yoavgo.bsky.social) reply parent
the issue is that the MCP couples two things: tool discovery (an LLM friendly description of what tools are available and how to invoke them), and tool implementation. but "implementation" both takes effort, and doesn't need new standards. so why not focus only on discovery?
Yoav Goldberg (@yoavgo.bsky.social)
why is "MCP" implemented as server exposing a set of endpoints, rather than as some JSON schema for defining tool descriptions and allowing these JSON files to be accessed over http? what is the purpose/benefit of the middleman server?
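A sketch of the alternative being proposed: a static JSON file that only describes tools, with the implementation living behind an ordinary HTTP endpoint. The descriptor layout, the field names and the `api.example.com` URL are hypothetical and are not taken from the MCP specification.

```python
import json

# Hypothetical descriptor format for tool discovery only.
weather_tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "endpoint": {"method": "GET", "url": "https://api.example.com/weather"},
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}


def missing_arguments(tool, arguments):
    """List required parameters that a proposed call leaves out."""
    return [name for name in tool["parameters"]["required"] if name not in arguments]


# What would be served as a plain file over HTTP, then read back by a client.
descriptor_file = json.dumps({"tools": [weather_tool]}, indent=2)
loaded = json.loads(descriptor_file)["tools"][0]
```

In this setup the client (or the LLM) reads `descriptor_file`, checks a proposed call with `missing_arguments`, and then calls the listed endpoint directly, with no middleman server in between.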
Yoav Goldberg (@yoavgo.bsky.social)
you know what, nah, we don't want to close it. it will be just 80% closed.
Yoav Goldberg (@yoavgo.bsky.social)
and now, we will proceed to peacefully close the strait of Hormuz. you know, for the environment.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
for some reason, iran saw this mostly peaceful operation as an act of aggression. they will retaliate in a peaceful manner by causing chaos.
Yoav Goldberg (@yoavgo.bsky.social)
today, during a peaceful flight over an iranian mountain, a US airplane dropped a mostly peaceful bunker buster bomb, which flew peacefully until it hit the mountain and the mostly peaceful facility underneath it. there was a brief period of violent detonation on impact, then peace again.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
maybe it depends on who you read first
Yoav Goldberg (@yoavgo.bsky.social) reply parent
after Kavalier and Clay, everything pales in comparison
Yoav Goldberg (@yoavgo.bsky.social)
one of the lessons from last night's night in the shelters is that i have no patience for reading professional literature, but reading light fiction is pretty okay. on the other hand, the book i had on my phone is one i started reading and stopped, and there was a reason i stopped: it's tedious and lousy. in short, send book recommendations. hebrew or english, but english will probably be easier for me to arrange.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
i don't know if it's different (i'm not strong on the taxonomy of demagoguery). i'm only saying that the claim that the left "doesn't make things up like the right" may be true, but only because it has other techniques for the same goal
Yoav Goldberg (@yoavgo.bsky.social) reply parent
well, these aren't really extreme cases, because the cases differ from one another, but there is a variety of techniques here that don't invent false facts and still produce effective demagoguery
Yoav Goldberg (@yoavgo.bsky.social) reply parent
among the more extreme cases there is Max Blumenthal with 'the Israelis killed themselves on October 7th' and (not to equate the two, still..) Yuval Abraham's 'artificial intelligence in the IDF' articles in Bama Mekomit and 972.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
(and specifically here, it's not that they invented the story. they are echoing a narrative that someone else is pushing, as far as i can see)
Yoav Goldberg (@yoavgo.bsky.social) reply parent
yes, that's exactly what i'm saying. the technique is different.
Yoav Goldberg (@yoavgo.bsky.social) reply parent
shunra ("cat")