image description (contains clarifications on background elements)

Lots of different seemingly random images in the background, including some fries, Mr. Krabs, a girl in overalls hugging a stuffed tiger, a Mark Zuckerberg “big brother is watching” poster, two images of Fluttershy (a pony from My Little Pony), one of them reading “u only kno my swag, not my lore”, a picture of Parkzer from the streamer “DougDoug”, and a slider gameplay element from the rhythm game “osu!”. The background is made light so that the text can be easily read. The text reads:

i wanna know if we are on the same page about ai.
if u disagree with any of this or want to add something,
please leave a comment!
smol info:
- LM = Language Model (ChatGPT, Llama, Gemini, Mistral, ...)
- VLM = Vision Language Model (Qwen VL, GPT4o mini, Claude 3.5, ...)
- larger model = more expensive to train and run
smol info end
- the training processes of current AI systems are often
clearly unethical and very bad for the environment :(
- companies are really bad at selling AI to us and
giving it a good purpose for average-joe usage
- medical ai (e.g. protein folding) is almost only positive
- ai for disabled people is also almost only positive
- the idea of some AI machine taking our jobs is scary
- "AI agents" are scary. large companies are training
them specifically to replace human workers
- LMs > image generation and music generation
- using small LMs for repetitive, boring tasks like
classification feels okay
- using the largest, most environmentally taxing models
for everything is bad. Using a mixture of smaller models
can often be enough
- people with bad intentions using AI systems results
in bad outcomes
- ai companies train their models however they see fit.
if an LM "disagrees" with you, that's the training's fault
- running LMs locally feels more okay, since they need
less energy and you can control their behaviour
I personally think more positively about LMs, but almost
only negatively about image and audio models.
Are we on the same page? Or am I an evil AI tech sis?

IMAGE DESCRIPTION END


i hope this doesn’t cause too much hate. i just wanna know what u people and creatures think <3

  • lime!
    1 day ago

    i’m personally not too fond of llms, because they are being pushed everywhere, even when they don’t make sense and they need to be absolutely massive to be of any use, meaning you need a data center.

    i’m also hesitant to use the term “ai” at all since it says nothing and encompasses way too much.

    i like using image generators for my own amusement and to “fix” the stuff i make in image editors. i never run any online models for this, i bought extra hardware specifically to experiment. and i live in a city powered basically entirely by hydro power so i’m pretty sure i’m personally carbon neutral. otherwise i wouldn’t do it.

    the main things that bother me are partly the scale of operations, partly the philosophy of the people driving this. i’ve said it before but open ai seem to want to become e/acc tech priests. they release nothing about their models, they hide them away and insinuate that we normal hoomans are unworthy of the information and that we wouldn’t understand it anyway. which is why deepseek caused such a market shake, it cracked the pedestal underneath open ai.

    as for the training process, i’m torn. on the one hand it’s shitty to scrape people’s work without consent, and i hope open ai gets their shit smacked out of them by copyright law. on the other hand i did the math on the final models, specifically on stable diffusion 1.0: it was trained on the LAION-5B scientific dataset of tagged images, which has five-billion-ish data points, as the name suggests. stable diffusion 1.0 is something like 4GB. that means there’s on average less than eight bits in the model per image-and-description combination. given that the images it trained on were 512x512 on average, that gives a shocking ~0.00003 bits per pixel. and stable diffusion 1.5 has more than double the training data but is the same size. at that scale there is nothing of the original image in there.
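    (the arithmetic above can be sanity-checked in a few lines. this is just a back-of-the-envelope sketch using the comment’s assumed figures — ~5 billion LAION-5B pairs, ~4 GB of weights, 512x512 images — the real model size depends on precision and exact parameter count.)

    ```python
    # back-of-the-envelope: how much model capacity exists per training image?
    # assumed figures from the comment, not exact values:
    dataset_pairs = 5_000_000_000      # ~5B image/description pairs in LAION-5B
    model_bytes = 4 * 1024**3          # ~4 GB of model weights
    model_bits = model_bytes * 8

    bits_per_pair = model_bits / dataset_pairs
    print(f"bits per image+description pair: {bits_per_pair:.2f}")

    pixels = 512 * 512                 # average training image resolution
    bits_per_pixel = bits_per_pair / pixels
    print(f"bits per pixel: {bits_per_pixel:.7f}")
    ```

    with those inputs it comes out to roughly 7 bits per pair and about 0.000026 bits per pixel, which matches the “less than eight bits” and “~0.00003 bits per pixel” figures.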

    the environmental effect is obviously bad, but the copying argument? i’m less certain. that doesn’t invalidate the people who are worried it will take jobs, because it will. mostly through managers not understanding how their businesses work and firing talented artists to replace them with what are basically noise machines.