Smokeydope@lemmy.world · 3 months ago

What is this? (It's OC!)

brucethemoose@lemmy.world · 3 months ago

Oh I got you mixed up with the other commenter, apologies.

I’m not sure when Llama 8B starts to degrade at long context, but I wanna say it’s well before 128K, which is where other “long context” models start to look much more attractive, depending on the task. Right now I am testing Amazon’s Mistral finetune, and it seems to be much better than Nemo or Llama 3.1 out at long context.