A great, slightly more in-depth (without being mathy) explanation of transformer models. It mostly talks about AlexNet, an image classifier from 2012, goes over some history, and has some very interesting looks under the hood.

He does use some personifying language for these models, but that’s unfortunately the case for most information on the topic.

  • KnilAdlez [none/use name]@hexbear.net
    1 month ago

    This is mostly about convolutional neural networks, which don’t really work the same way as transformers. Transformers weren’t invented until 2017, and they are most like a more complex version of a recurrent neural network (even that is simplifying it).

    • BountifulEggnog [she/her]@hexbear.netOP
      1 month ago

      ohnoes Of course I messed it up. I thought the transformer paper was newer than 2012, but I remembered them being mentioned at the beginning of the video. I should have rewatched to make sure I understood.