Isn’t the answer here in 2025 to figure out what the embedding vector of your specific topic is and use an LLM to search for that embedding vector in your corpus?
IIUC this is the whole genius of word embedding models:
they're good at catching associations with a topic that a straight classifier approach would miss.
Is there a good out-of-the-box way to do that? I'm pretty handy with things like structural topic models and supervised learning at this point, but that might be a little outside my toolset at the moment.
OpenAI offers the mapping to embeddings as a service; then it's just a matter of finding the centroid of the ones with your word and then (probably) taking everything within some distance of that. The big problem is that I think you'd need a networked API call for each document. Could also do something like a tiny llama model?
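A minimal sketch of the centroid-and-distance idea, assuming you've already obtained embedding vectors for your seed documents and corpus (from OpenAI's embeddings endpoint, a local model, or anything else); the threshold value and the toy 3-d vectors are placeholders for illustration:

```python
import numpy as np

def cosine_sim(a, b):
    # standard cosine similarity between two vectors
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def find_near_centroid(seed_vecs, corpus_vecs, threshold=0.8):
    # centroid of the seed documents' embeddings
    centroid = np.mean(seed_vecs, axis=0)
    # keep indices of corpus docs close enough to that centroid
    return [i for i, v in enumerate(corpus_vecs)
            if cosine_sim(centroid, v) >= threshold]

# toy 3-d "embeddings" standing in for real model output
seeds = np.array([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]])
corpus = np.array([[1.0, 0.0, 0.0],   # near the topic
                   [0.0, 1.0, 0.0]])  # far from it
print(find_near_centroid(seeds, corpus))  # -> [0]
```

Cosine similarity (rather than Euclidean distance) is the usual choice here because embedding magnitudes are mostly noise; only the direction carries topical meaning.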
idk what the size of your data or this class is, but I bet you can be more efficient than this, though.
This is, sadly, way outside my current expertise.