  • This can be correct if they’re talking about training smaller models.

    Imagine this case: you’re an automotive manufacturer that uses ML to detect pedestrians, vehicles, etc. with cameras, like what Tesla does, for example. This has to be done with a small, relatively low-power model that can run in a car, not a datacentre. To improve its performance you need to fine-tune it on labelled data of traffic situations with pedestrians, vehicles and so on, and that labelling would be done manually…

    … except when we get to a point where the latest Gemini/LLAMA/GPT/whatever, which is so beefy that it could never run in that low-power application, is also beefy enough to accurately classify and label the data that the smaller model needs to be trained on.

    It’s like an older sibling teaching a small kid how to do sums: not an actual maths teacher, but it does the job and is a lot cheaper, or semi-free. A rough sketch of the idea is below.
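
    A minimal sketch of that auto-labelling loop, in Python. The query_large_model() call is a hypothetical stand-in for whichever big hosted model you’d use, and the class list and YOLO-style label format are assumptions for illustration, not anyone’s actual pipeline:

```python
# Sketch of the "older sibling" idea: a big hosted model auto-labels driving
# frames so a small on-device detector can be fine-tuned on them.
# query_large_model() is a hypothetical stand-in for the provider call;
# the class list and label format are assumptions.

from dataclasses import dataclass
from pathlib import Path
from typing import List

CLASSES = ["pedestrian", "vehicle", "cyclist"]  # assumed label set


@dataclass
class Box:
    cls: str            # one of CLASSES
    x: float            # normalised centre x
    y: float            # normalised centre y
    w: float            # normalised width
    h: float            # normalised height
    confidence: float


def query_large_model(image_path: Path) -> List[Box]:
    """Stand-in for a call to the big model's vision API returning boxes."""
    raise NotImplementedError("plug in your provider's API call here")


def build_training_set(frames_dir: Path, labels_dir: Path, min_conf: float = 0.8) -> None:
    """Auto-label every frame, keep only confident boxes, and write one
    YOLO-style .txt file per image for the small model to train on."""
    labels_dir.mkdir(parents=True, exist_ok=True)
    for frame in sorted(frames_dir.glob("*.jpg")):
        boxes = [b for b in query_large_model(frame) if b.confidence >= min_conf]
        with (labels_dir / f"{frame.stem}.txt").open("w") as f:
            for b in boxes:
                f.write(f"{CLASSES.index(b.cls)} {b.x} {b.y} {b.w} {b.h}\n")
```

    The confidence threshold is the knob that trades label noise against dataset size; a human then only has to spot-check a sample instead of labelling everything by hand.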


  • I’m talking about running them on the GPU, which favours the GPU even when the comparison is between an AMD Epyc and a mediocre GPU.

    If you want to run a large version of DeepSeek R1 locally, with many quantized variants weighing in at over 50 GB, I think the cheapest Nvidia GPU that fits the bill is an A100, which you might find used for around 6K (rough numbers sketched below).

    For well under that price you can get a whole Mac Studio with the 192 GB of unified memory the first poster in this thread mentioned.

    I’m not saying this is for everyone (it’s certainly not for me), but I don’t think we can dismiss that there’s a real niche where Apple has a genuine value proposition.

    My old flatmate has a PhD in NLP and used to work in research, and he’d have gotten soooo much use out of >100 GB of RAM accessible to the GPU.
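
    A back-of-the-envelope way to see why memory, not raw compute, is the constraint here; the parameter count, bit widths and overhead factor are rough assumptions for illustration, not exact figures for any particular release:

```python
# Rough memory estimate for holding a quantized model's weights in GPU or
# unified memory. The ~20% overhead for KV cache and runtime buffers is an
# assumption, as are the example parameter count and quantisation widths.

def model_memory_gb(params_billion: float, bits_per_weight: float,
                    overhead: float = 1.2) -> float:
    """Approximate GB needed: parameters * bits / 8, plus ~20% headroom."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9 * overhead


# e.g. a ~70B-parameter distilled R1 variant:
print(f"{model_memory_gb(70, 6):.0f} GB")  # ~63 GB at 6-bit: beyond any 24 GB consumer card
print(f"{model_memory_gb(70, 4):.0f} GB")  # ~42 GB at 4-bit: still A100-class VRAM or a big unified pool
```

    Once the whole model fits in a single memory pool you avoid spilling layers to system RAM or streaming from disk, which is where a big unified-memory machine earns its keep.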