- cross-posted to:
- technology@lemmy.zip
- The Rabbit R1 AI box is actually an Android app in a limited $200 device, running on AOSP without Google Play.
- Rabbit Inc. is unhappy about details of its tech stack being public, threatening action against unauthorized emulators.
- AOSP is a logical choice for mobile hardware as it provides essential functionalities without the need for Google Play.
“Fairly high” is still useless (and doesn’t actually quantify anything: depending on context, both 1% and 99% could be ‘fairly high’). As long as these models just hallucinate things, I need to double-check. Which is what I would have done without one of these things anyway.
Hallucinations are largely dealt with if you use agents. It won’t be long until it gets packaged well enough that anyone can just use it. For now, it takes a little bit of effort to get a decent setup.
1% correct is never “fairly high” wtf
Also if you want a computer that you don’t have to double check, you literally are expecting software to embody the concept of God. This is fucking stupid.
It’s all about context. Asking a bunch of 4 year olds questions about trigonometry, 1% of answers being correct would be fairly high. ‘Fairly high’ basically only means ‘as high as expected’ or ‘higher than expected’.
Hence, it is useless. If I cannot expect it to be more or less always correct, I can skip using it and just look stuff up myself.
Obviously the only contexts that would apply here are ones where you expect a correct answer. Why would we be evaluating software that claims to be helpful against a 4 year old asked to do calculus? I have to question your ability to reason for insinuating this.
So confirmed. God or nothing. Why don’t you go back to quills? Computers cannot read your mind and write this message automatically, hence they are useless
That’s the whole point, I don’t expect correct answers. Neither from a 4 year old nor from a probabilistic language model.
And you don’t expect a correct answer because it isn’t correct 100% of the time. Some lemmings are basically just clones of Sheldon Cooper.
I don’t expect a correct answer because I’ve used these models quite a lot last year. At least half the answers were hallucinated. And it’s still a common complaint about this product as well if you look at actual reviews (e.g., pretty sure Marques Brownlee mentions it).
Something seems to be going over your head: quality is not optional, and it’s good engineering practice to seek reliable methods of doing our work. As a mature software person, you look for tools that leave less room for failure and want to leave as little as possible for humans to fuck up, because you know they’re not reliable, despite being unavoidable. That’s the logic behind automated testing, Rust’s borrow checker, static typing…
If you’ve done code review, you know it’s not very efficient at catching bugs. It’s not efficient because you don’t pay as much attention to details when you’re not actually writing the code. With LLMs, you have to do code review to ensure you meet quality standards, because of the hallucinations, just like you’ve got to test your work before committing it.
I understand the actual software engineers who care about delivering working code and would rather write it themselves in order to be more confident in the quality of the output.
Like most people, I have no interest in engaging in conversation with someone who gives me zero reason to.
Not that it’s any of your business, but quality matters to me more than anything else, which is why I like tools that help me deliver it
Truth is, your complete misunderstanding of the person you replied to seems to suggest otherwise, and the arrogant delivery doesn’t help.
Seems like that one hit a nerve, huh?