the beastly fido @theophite.bsky.social

(it does this, i think, because of amplification of signal between the text encoder and the UNet's x-attention domain, which then saturates at least one VAE channel and produces incorrect colors.)

aug 28, 2025, 10:18 pm • 3 0

Replies

the beastly fido @theophite.bsky.social

er, -0.99, but you get my point.

aug 28, 2025, 10:18 pm • 5 0
the beastly fido @theophite.bsky.social

like, right? if you have just an absolutely enormous error on a parameter (because you initialized a token to a random value), then v_t is going to be something like 1e-12, making the effective LR = eps2, which is 0.01, even if your actual LR is 1e-7.

aug 28, 2025, 10:58 pm • 4 0
The Flaky Wanderer @flakywanderer.bsky.social

Line search time? If the new point is worse than the old one, back up until it isn't worse

aug 29, 2025, 12:04 am • 0 0
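
A minimal sketch of the "back up until it isn't worse" idea from the post above, assuming the objective can be evaluated directly at any candidate point (which, as the follow-up question notes, may not hold in practice). The names `backtracking_step`, `shrink`, and `max_halvings` are hypothetical and chosen only for illustration; this is not anyone's actual training code.

```python
def backtracking_step(f, x, direction, step=1.0, shrink=0.5, max_halvings=30):
    """Try x + step * direction; shrink the step until the point is no worse.

    f         -- objective to minimize, called on a list of floats
    x         -- current point (list of floats)
    direction -- proposed update direction (e.g. the negative gradient)
    Returns (new_point, step_used). If no tried step avoids making things
    worse, the original point is returned with a step of 0.0.
    """
    fx = f(x)
    for _ in range(max_halvings):
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        # Accept the first candidate that is not worse than the old point.
        if f(candidate) <= fx:
            return candidate, step
        # Otherwise back up: shrink the step and try again.
        step *= shrink
    return list(x), 0.0


# Usage: minimize f(v) = v^2 from v = 1 with an overly long step of -3.
new_x, used = backtracking_step(lambda v: v[0] ** 2, [1.0], [-3.0])
print(new_x, used)
```

This only requires "not worse", so it accepts a step that leaves the objective unchanged; stricter variants demand a sufficient decrease before accepting.
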
The Flaky Wanderer @flakywanderer.bsky.social

Hmm... is there a way to apply line search when you don't have the actual function (or a good approximation) close at hand?

aug 29, 2025, 12:10 am • 0 0
the beastly fido @theophite.bsky.social

this is a @nsaphra.bsky.social paper

aug 29, 2025, 12:21 am • 0 0
Naomi Saphra @nsaphra.bsky.social

lmao yes

aug 29, 2025, 2:13 am • 2 0