• TheKingBee@lemmy.world
    5 months ago

    “can you draw a room with absolutely no elephants in it? not a picture not in the background, none, no elephants at all. seriously, no elephants anywhere in the room. Just a room any at all, with no elephants even hinted at.”

    • Fishbone@lemmy.world
      5 months ago

      “Can you a room as aboluteyy no eleephant it all?”

      Dunno what’s giving more “clone of a clone” vibes, the dialogue or the 3 small standing “elephants” in that image.

    • Magnetar@feddit.de
      5 months ago

      I’m getting the impression that the “Elephant Test” will become famous in AI image generation.

      • barsoap@lemm.ee
        5 months ago

        It’s not a test of image generation but of text comprehension. You could rip CLIP out of Stable Diffusion and replace it with something that understands negation, but that’s pointless: the pipeline already takes two prompts for exactly that reason. One is for “this is what I want to see”, the other for “this is what I don’t want to see”. Each gets passed through CLIP individually, so CLIP on its own doesn’t need to understand negation; the rest of the pipeline just has to have a spot to plug in both positive and negative conditioning.

        Mostly it’s just KISS in action, but occasionally it’s actually useful, since you can feed it conditioning that’s not derived from text and tell it “generate a picture which doesn’t match this colour scheme here” or something. Say the positive conditioning is the text “a landscape” and the negative conditioning is an image, the archetypal “top blue, bottom green”. Now it’ll have to come up with something more creative, as the conditioning pushes it away from the things it considers normal for “a landscape” and would generally settle on.
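
To make the "spot to plug in both" part concrete, here is a minimal sketch of how positive and negative conditioning are commonly combined at each denoising step (classifier-free guidance). It is a toy, not the actual Stable Diffusion code: `guided_noise`, `eps_pos` and `eps_neg` are made-up names, and the short lists stand in for the noise predictions the denoiser would produce under each conditioning.

```python
from typing import List


def guided_noise(
    eps_pos: List[float],
    eps_neg: List[float],
    guidance_scale: float = 7.5,
) -> List[float]:
    """Combine two noise predictions, classifier-free-guidance style.

    eps_pos: prediction under the positive conditioning ("what I want").
    eps_neg: prediction under the negative conditioning ("what I don't want").
    The result starts at the negative prediction and moves along the
    direction (eps_pos - eps_neg), scaled by guidance_scale.
    """
    if len(eps_pos) != len(eps_neg):
        raise ValueError("predictions must have the same length")
    return [n + guidance_scale * (p - n) for p, n in zip(eps_pos, eps_neg)]


# Hypothetical toy values standing in for real denoiser outputs.
positive = [1.0, 2.0]
negative = [0.5, 3.0]

# A scale of 1.0 reproduces the positive prediction; larger scales push
# further away from whatever the negative conditioning describes.
print(guided_noise(positive, negative, guidance_scale=1.0))
print(guided_noise(positive, negative, guidance_scale=2.0))
```

Nothing in that arithmetic cares where the negative conditioning came from, which is why neither encoder has to understand "no elephants": the negation lives in the subtraction, not in the text model.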