I’ve tried coding with them, and every one I’ve tried fails at anything beyond the really basic small functions you write as a newbie. Compare that to, say, 4o-mini, which can spit out more sensible stuff that actually works.

I’ve tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.

So, what can I actually use a small LLM for? And which ones? I ask because I have an old laptop whose GPU can’t really handle anything above 4B in a timely manner. 8B runs at about 1 t/s!

  • ragingHungryPanda@lemmy.zip · 6 months ago

    I’ve run a few of the models that fit on my GPU. I don’t think the smaller models are really good enough. They can do stuff, sure, but to get anything worthwhile out of them, I think you need the larger models.

    They can be used for basic things, though. There are coder-specific models you can look at; DeepSeek Coder and Qwen Coder are popular ones.

  • swelter_spark@reddthat.com · 6 months ago

    7B is the smallest I’ve found useful. If I had very little VRAM, I’d try a smaller quant before going below that.
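    To make the quant trade-off concrete, here’s some back-of-envelope arithmetic. This is a sketch: the effective bits per weight for Q4_K_M and Q2_K are approximate, and real usage adds KV cache and runtime overhead on top of the weights, so treat these as lower bounds.

```python
# Rough VRAM needed for the model weights alone:
# params * bits_per_weight / 8 bytes.
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight for common llama.cpp quants.
for bits, name in [(16, "FP16"), (8, "Q8_0"), (4.5, "Q4_K_M"), (2.6, "Q2_K")]:
    print(f"7B @ {name:6s} ~ {weight_gb(7, bits):.1f} GB")
```

    So a 7B model that needs ~14 GB at FP16 drops under 4 GB at a 4-bit quant, which is why a smaller quant of a bigger model can beat a tiny model at full precision.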

  • herseycokguzelolacak@lemmy.ml · 6 months ago

    For coding tasks you need web search and RAG. It’s not the size of the model that matters so much, since even the largest models find their solutions online.

      • wise_pancake@lemmy.ca · 6 months ago

        Open WebUI lets you install a ton of different search providers out of the box, but you need an API key for most of them, and I haven’t vetted them.

        I’m trying to get Kagi to work with Phi4 and not having success.

  • Mordikan@kbin.earth · 6 months ago

    I’ve used smollm2:135m for projects in DBeaver building larger queries. The box it runs on is Intel HD graphics with an old Ryzen processor. Doesn’t seem to really stress the CPU.

      • Mordikan@kbin.earth · 6 months ago

        Sorry, I was trying to find parts for my daughter’s machine while doing this (cheap Minecraft build). I corrected my comment.

  • surph_ninja@lemmy.world · 6 months ago

    Learning/practice, and any use that feeds in sensitive data you want to keep on-prem.

    Unless you’re set to retire within the next 5 years, the best reason is to keep your resume up to date with some hands-on experience. With the way they’re trying to shove AI into every possible application, there will be few (if any) industries untouched. If you don’t start now, you’re going to be playing catch up in a few years.

  • irmadlad@lemmy.world · 6 months ago

    As cool and neato as I find AI to be, I haven’t really found a good use case for it in the selfhosting/homelabbing arena. Most of my equipment is ancient and lacking the GPU necessary to drive that bus.

  • MTK@lemmy.world · 6 months ago

    Have you tried RAG? I believe they’re actually pretty good at searching and compiling content via RAG.

    So in theory you could connect one to all of your local documents and use it for quick questions. Or maybe connect it to your Signal/WhatsApp/SMS chat history and ask questions about past conversations.

      • MTK@lemmy.world · 6 months ago

        RAG is basically like telling an LLM “look here for more info before you answer” so it can check out local documents to give an answer that is more relevant to you.

        Just search for “open webui rag” and you’ll find plenty of explanations and tutorials.
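        To make the idea concrete, here’s a toy sketch of the retrieval step in plain Python. The word-overlap scoring and the document strings are made up for illustration; real RAG setups (like Open WebUI’s) use an embedding model and a vector store instead, but the shape is the same: find the relevant document, paste it into the prompt, then ask.

```python
# Toy RAG sketch: naive keyword-overlap retrieval, no external deps.
def retrieve(query: str, docs: list[str]) -> str:
    q = set(query.lower().split())
    # Pick the document sharing the most words with the query.
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str, docs: list[str]) -> str:
    context = retrieve(query, docs)
    return f"Use this context to answer:\n{context}\n\nQuestion: {query}"

# Hypothetical "local documents" for illustration.
docs = [
    "The router admin password is stored in the homelab wiki.",
    "Backups run nightly at 02:00 via restic to the NAS.",
]
print(build_prompt("when do backups run?", docs))
```

        The model never has to “know” your data; it only has to read the context you retrieved, which is exactly why small models punch above their weight with RAG.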

        • iii@mander.xyz · 6 months ago (edited)

          I think RAG will be surpassed by LLMs in a loop with tool calling (aka agents), with search being one of the tools.
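          The loop itself is simple to sketch. Everything below is a stand-in: `fake_llm` fakes the model’s decisions and `search` fakes a search API, but the dispatch structure is the one a real agent runner uses — call the model, execute any tool it asks for, feed the result back, repeat until it answers.

```python
# Minimal sketch of an LLM-in-a-loop with tool calling (an "agent").
def fake_llm(messages: list[dict]) -> dict:
    # Stub: request a search first, then answer from the tool result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search", "arguments": {"query": messages[0]["content"]}}
    result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"answer": f"Based on search: {result}"}

def search(query: str) -> str:
    return f"top hit for {query!r}"  # stand-in for a real search API

TOOLS = {"search": search}

def run_agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    while True:
        reply = fake_llm(messages)
        if "answer" in reply:
            return reply["answer"]
        # Execute the requested tool and feed the result back to the model.
        tool_result = TOOLS[reply["tool"]](**reply["arguments"])
        messages.append({"role": "tool", "content": tool_result})

print(run_agent("how do I fix a CUDA OOM?"))
```

          RAG then just becomes one more entry in the tool table, which the model invokes only when it decides it needs it.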

  • some_guy@lemmy.sdf.org · 6 months ago

    I installed Llama. I’ve not found any use for it. I mean, I’ve asked it for a recipe because recipe websites suck, but that’s about it.

  • HelloRoot@lemy.lol · 6 months ago (edited)

    Sorry, I’m just gonna dump some links from my bookmarks that are related and interesting to read, because I’m traveling and have to get up in a minute, but I’ve been interested in this topic for a while. All of the links discuss at least some use cases. For some reason Microsoft is really into tiny models and has made big breakthroughs there.

    https://reddit.com/r/LocalLLaMA/comments/1cdrw7p/what_are_the_potential_uses_of_small_less_than_3b/

    https://github.com/microsoft/BitNet

    https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/

    https://news.microsoft.com/source/features/ai/the-phi-3-small-language-models-with-big-potential/

    https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft’s-newest-small-language-model-specializing-in-comple/4357090

  • entwine413@lemm.ee · 6 months ago (edited)

    I’ve integrated mine into Home Assistant, which makes it easier to use its voice commands.

    I haven’t done a ton with it yet besides set it up, though, since I’m still getting proxmox configured on my gaming rig.

    • Passerby6497@lemmy.world · 6 months ago

      What are you using for voice integration? I really don’t want to buy and assemble their solution if I don’t have to

      • entwine413@lemm.ee · 6 months ago

        I just use the companion app for now. But I am designing a HAL9000 system for my home.

        • shnizmuffin@lemmy.inbutts.lol · 6 months ago

          [ A DIM SCREEN WITH ORANGE TEXT ]

          Objective: optimize electrical bill during off hours.
          
          ... USER STATUS: UNCONSCIOUS 
          ... LIGHTING SYSTEM: DISABLED
          ... AUDIO/VISUAL SYSTEM: DISABLED 
          ... CLIMATE SYSTEM: ECO MODE ENABLED
          ... SURVEILLANCE SYSTEM: ENABLED 
          ... DOOR LOCKS: ENGAGED
          ... CELLULAR DATA: DISABLED
          ... WIRELESS ACCESS POINTS: DISABLED
          ... SMOKE ALARMS: DISABLED
          ... CO2 ALARMS: DISABLED
          ... FURNACE: SET TO DIAGNOSTIC MODE
          ... FURNACE_PILOT: DISABLED
          ... FURNACE_GAS: ENABLED
          
          WARN: Furnace gas has been enabled without a Furnace pilot. Please consult the user manual to ensure proper installation procedure.
          
          ... FURNACE: POWERED OFF
          
          Objective realized. Entering low power mode.
          

          [ Cut to OP, motionless in bed ]

          • entwine413@lemm.ee · 6 months ago

            Luckily my entire neighborhood doesn’t have gas and I have a heat pump.

            But rest assured, I’m designing the system with 20% less mental illness.

  • hendrik@palaver.p3x.de · 6 months ago (edited)

    I think that’s a size where it’s a bit more than a good autocomplete. Could be part of a chain for retrieval augmented generation. Maybe some specific tasks. And there are small machine learning models that can do translation or sentiment analysis, though I don’t think those are your regular LLM chatbots… And well, you can ask basic questions and write dialogue. Something like “What is an Alpaca?” will work. But they don’t have much knowledge under 8B parameters and they regularly struggle to apply their knowledge to a given task at smaller sizes. At least that’s my experience. They’ve become way better at smaller sizes during the last year or so. But they’re very limited.

    I’m not sure what you intend to do. If you have some specific thing you’d like an LLM to do, you need to pick the correct one. If you don’t have any use-case… just run an arbitrary one and tinker around?