image description (contains clarifications on background elements)

Lots of different seemingly random images in the background, including some fries, Mr. Krabs, a girl in overalls hugging a stuffed tiger, a Mark Zuckerberg “big brother is watching” poster, two images of Fluttershy (a pony from My Little Pony), one of them reading “u only kno my swag, not my lore”, a picture of Parkzer from the streamer “DougDoug”, and a slider gameplay element from the rhythm game “osu!”. The background is made light so that the text can be easily read. The text reads:

i wanna know if we are on the same page about ai.
if u disagree with any of this or want to add something,
please leave a comment!
smol info:
- LM = Language Model (ChatGPT, Llama, Gemini, Mistral, ...)
- VLM = Vision Language Model (Qwen VL, GPT4o mini, Claude 3.5, ...)
- larger model = more expensive to train and run
smol info end
- the training processes of current AI systems are often
clearly unethical and very bad for the environment :(
- companies are really bad at selling AI to us and
giving it a good purpose for average-joe usage
- medical ai (e.g. protein folding) is almost only positive
- ai for disabled people is also almost only positive
- the idea of some AI machine taking our jobs is scary
- "AI agents" are scary. large companies are training
them specifically to replace human workers
- LMs > image generation and music generation
- using small LMs for repetitive, boring tasks like
classification feels okay
- using the largest, most environmentally taxing models
for everything is bad. Using a mixture of smaller models
can often be enough
- people with bad intentions using AI systems results
in bad outcomes
- ai companies train their models however they see fit.
if an LM "disagrees" with you, that's the training's fault
- running LMs locally feels more okay, since they need
less energy and you can control their behaviour
I personally think more positively about LMs, but almost
only negatively about image and audio models.
Are we on the same page? Or am I an evil AI tech sis?

IMAGE DESCRIPTION END


i hope this doesn’t cause too much hate. i just wanna know what u people and creatures think <3

  • lime!
    1 day ago

    i’m personally not too fond of llms, because they are being pushed everywhere, even when they don’t make sense and they need to be absolutely massive to be of any use, meaning you need a data center.

    i’m also hesitant to use the term “ai” at all since it says nothing and encompasses way too much.

    i like using image generators for my own amusement and to “fix” the stuff i make in image editors. i never run any online models for this, i bought extra hardware specifically to experiment. and i live in a city powered basically entirely by hydro power so i’m pretty sure i’m personally carbon neutral. otherwise i wouldn’t do it.

    the main things that bother me are partly the scale of operations, partly the philosophy of the people driving this. i’ve said it before but open ai seem to want to become e/acc tech priests. they release nothing about their models, they hide them away and insinuate that we normal hoomans are unworthy of the information and that we wouldn’t understand it anyway. which is why deepseek caused such a market shake, it cracked the pedestal underneath open ai.

    as for the training process, i’m torn. on the one hand it’s shitty to scrape people’s work without consent, and i hope open ai gets their shit smacked out of them by copyright law. on the other hand i did the math on the final models, specifically on stable diffusion 1.0: it was trained on the LAION-5B scientific dataset of tagged images, which has five-billion-ish data points, as the name suggests. stable diffusion 1.0 is something like 4GB. that means there’s on average less than eight bits in the model per image-and-description combination. given that the images it trained on were 512x512 on average, that gives a shocking ~0.00003 bits per pixel. and stable diffusion 1.5 has more than double the training data but is the same size. at that scale there is nothing of the original image in there.
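    (the arithmetic above can be sanity-checked in a few lines. this is just a back-of-the-envelope sketch using the comment’s assumed figures — ~5 billion LAION-5B pairs, ~4 GB of weights, 512x512 images — the real model size depends on precision and exact parameter count.)

    ```python
    # back-of-the-envelope: how much model capacity exists per training image?
    # assumed figures from the comment, not exact values:
    dataset_pairs = 5_000_000_000      # ~5B image/description pairs in LAION-5B
    model_bytes = 4 * 1024**3          # ~4 GB of model weights
    model_bits = model_bytes * 8

    bits_per_pair = model_bits / dataset_pairs
    print(f"bits per image+description pair: {bits_per_pair:.2f}")

    pixels = 512 * 512                 # average training image resolution
    bits_per_pixel = bits_per_pair / pixels
    print(f"bits per pixel: {bits_per_pixel:.7f}")
    ```

    with those inputs it comes out to roughly 7 bits per pair and about 0.000026 bits per pixel, which matches the “less than eight bits” and “~0.00003 bits per pixel” figures.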

    the environmental effect is obviously bad, but the copying argument? i’m less certain. that doesn’t invalidate the people who are worried it will take jobs, because it will. mostly through managers not understanding how their businesses work and firing talented artists to replace them with what are basically noise machines.