The AI Horde is a project I started to provide access to generative AI to everyone in the world, regardless of wealth or resources. The objective is a truly open REST API that anyone is free to integrate into their own software and games, letting people experiment without requiring online payment, which is not possible for everyone.

It is fully FOSS and relies on people volunteering the idle compute of their PCs. In exchange, volunteers receive higher priority for their own generations. We already have close to 100 workers, providing generations from Stable Diffusion to 70B LLM models!

Also, the Lemmy community is at [email protected]

If you are interested in democratizing access to Generative AI, consider joining us!
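For the curious, the request flow follows the usual async submit-then-poll pattern; here is a minimal client sketch. The endpoint paths, field names, and base URL are illustrative assumptions, not the documented schema, so check the official API reference before integrating.

```python
# Hypothetical sketch of an async submit-then-poll client for a Horde-style
# REST API. Endpoint paths and payload fields below are assumptions for
# illustration; consult the real AI Horde API docs for the actual schema.
import json
import urllib.request

API_BASE = "https://aihorde.net/api/v2"  # assumed base URL


def build_submit_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST that submits a prompt; the API replies with a job id."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/generate/text/async",  # assumed endpoint path
        data=body,
        headers={"apikey": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def status_url(job_id: str) -> str:
    """URL a client would poll until the assigned worker delivers the result."""
    return f"{API_BASE}/generate/text/status/{job_id}"  # assumed endpoint path
```

A client would send the request, read the job id from the response, then poll the status URL until the generation is done.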

  • Blaed@lemmy.world · 1 year ago

    Hey! Appreciate your post. The AI Horde has been one of my favorite projects to see evolve over the course of this year. Consider me subbed.

    For myself (and others not as knowledgeable on the project), do you think you could briefly describe the main differences between how The AI Horde approaches crowd compute / inference compared to something like Petals? I know you mentioned here that the horde doesn’t do training. Is that the biggest difference to note?

    Thanks again for your contribution to democratizing AI. Excited to see what The AI Horde can do with more supporters. I’ll be dedicating a few more nodes when I have a chance to spin them up.

    • db0@lemmy.dbzer0.comOP · 1 year ago

      Petals does distributed inference, which, even with their enhanced system, is still pretty slow. The AI Horde sends the whole prompt to one individual worker and expects the reply back from that same worker, so the speed depends only on the GPU that worker has. This limits how big a model we can run, but then again I’ve seen people onboarding some crazy GPUs on the horde.
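      The whole-prompt dispatch described above can be sketched roughly as follows. This is not Horde source code; the worker fields and selection rule are made up to illustrate the idea that a job goes to exactly one worker hosting the full model, so latency depends only on that worker's GPU.

      ```python
      # Illustrative sketch (not AI Horde source) of whole-prompt dispatch:
      # each job is assigned to one worker that serves the entire requested
      # model, in contrast to layer-sharded distributed inference.
      from dataclasses import dataclass


      @dataclass
      class Worker:
          name: str
          models: set  # full models this worker can serve on its own GPU
          queued: int = 0  # jobs currently assigned (hypothetical load metric)


      def pick_worker(workers, model: str):
          """Pick the least-loaded worker hosting the whole model, or None."""
          eligible = [w for w in workers if model in w.models]
          if not eligible:
              return None  # no single worker can fit this model
          return min(eligible, key=lambda w: w.queued)


      workers = [
          Worker("gpu-box-a", {"llama-70b"}, queued=3),
          Worker("gpu-box-b", {"llama-70b", "sd-xl"}, queued=1),
      ]
      chosen = pick_worker(workers, "llama-70b")
      ```

      If no volunteer's hardware can hold the model in full, the job simply cannot be served, which is the size limit mentioned above.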

      • Blaed@lemmy.world · 1 year ago

        Appreciate the insight. I like that approach. I just learned you can become an Alchemist too. That’s a nice touch.

        > Alternatively you can become an Alchemist worker, which is much more lightweight and can even run on CPU (i.e. without a GPU).

        If you’re reading this, consider turning your PC into a Horde Worker! It’s a great way to contribute to AI (text- and image-based) if you have the power to spare.

  • INeedMana@lemmy.world · 1 year ago

    Out of pure curiosity: isn’t communicating over the internet a bottleneck for a distributed neural network implementation? I mean the distribution of the work, not the API part.

    • db0@lemmy.dbzer0.comOP · 1 year ago

      We’re not doing distributed inference; we’re doing distributed clustering. The inference runs on individual PCs.

        • db0@lemmy.dbzer0.comOP · 1 year ago

          Yes, each prompt goes to a single worker. The AI Horde also doesn’t do training. We only do inference.

          • INeedMana@lemmy.world · 1 year ago

            Does that mean that, with a big (parameter-wise) model and a not-that-powerful worker, it can take a long time to respond?

            Or does the distributed clustering prevent a worker from choking?

            • db0@lemmy.dbzer0.comOP · 1 year ago

              Yes, it could. But we have a timeout: if a worker is unreasonably slow to respond, it gets put into maintenance, so we expect people to only serve models they can actually run.
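              The timeout rule described above amounts to something like the following sketch. The threshold value and field names are invented for illustration and are not the Horde's real settings.

              ```python
              # Illustrative sketch of the timeout rule: a worker that exceeds
              # the allowed window for a job is flagged for maintenance so no
              # further jobs are routed to it. Values here are assumptions.
              TIMEOUT_SECONDS = 120.0  # assumed window, not the real setting


              class WorkerState:
                  def __init__(self, name: str):
                      self.name = name
                      self.maintenance = False  # True => stop routing jobs here


              def check_job(worker: WorkerState, started_at: float, now: float) -> bool:
                  """Return True if the job is still within its window;
                  otherwise put the worker into maintenance mode."""
                  if now - started_at > TIMEOUT_SECONDS:
                      worker.maintenance = True
                      return False
                  return True
              ```

              In effect, volunteers self-select: hardware that cannot finish a job within the window is quietly sidelined rather than degrading everyone's queue.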