GPT-4's Secret Has Been Revealed

Ayhem@lemmy.world · 1 year ago

trafficnab@kbin.social · 1 year ago

7B q4 (quantized, basically compressed down to only 4-bit precision) takes about 4 GB of RAM, 13B q4 about 8 GB, and 30B q4 is the one that's about 25 GB.

30B generates slowly but more or less usably on a CPU; the rest generate on a CPU just fine.
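For anyone who wants to sanity-check those numbers, here is a rough back-of-the-envelope sketch. It only counts the raw weights at a given bit width; the function name and the flat 4 bits per weight are assumptions for illustration, not how any particular tool reports memory.

```python
def approx_weight_gb(params_billions: float, bits_per_weight: float = 4.0) -> float:
    """Rough size of the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits_per_weight` bits.
    Real quantized files also store scale factors, and a running model needs
    extra memory for context and buffers, so actual usage is higher.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    for size in (7, 13, 30):
        print(f"{size}B at 4-bit: ~{approx_weight_gb(size):.1f} GB of weights")
```

This gives a lower bound only, which is why the real-world figures quoted above come out larger than the bare arithmetic.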

gbuttersnaps@lemmy.world · 1 year ago

Okay awesome, that's even better than I thought. I had a friend showing it to me last night, and I was thinking about trying it out today. I run a 12th-gen i9 and a 2080 Ti, so I assume I'd probably get better performance on my GPU, right?

trafficnab@kbin.social · 1 year ago

It should, yeah. It used to be that if you wanted to run the model on your GPU, it needed to fit entirely within its VRAM (which really limited what models people could use on consumer GPUs). I think they've recently added the ability to run part of the model on your GPU+VRAM and part of it on your CPU+RAM, although I don't know the specifics as I've only briefly played around with it.
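To give a feel for how a GPU/CPU split like that might be budgeted, here is a minimal sketch. It assumes the model is made of equally sized layers and that you simply put as many layers in VRAM as will fit, leaving the rest in system RAM. The function name, the 1 GB reserve, and the layer count in the example are all hypothetical; this is not the logic of any specific tool.

```python
def split_layers(n_layers: int, model_gb: float, vram_gb: float,
                 reserve_gb: float = 1.0) -> tuple[int, int]:
    """Return (gpu_layers, cpu_layers) for a simple size-based split.

    Assumptions: all layers are the same size, and `reserve_gb` of VRAM
    is kept free for buffers and whatever else is using the GPU.
    """
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # Number of whole layers that fit in the usable VRAM.
    fit = int(usable_gb * n_layers / model_gb)
    gpu_layers = min(n_layers, fit)
    return gpu_layers, n_layers - gpu_layers


if __name__ == "__main__":
    # The ~25 GB 30B q4 figure from this thread, an 11 GB card like the
    # 2080 Ti, and an assumed 60 layers.
    gpu, cpu = split_layers(n_layers=60, model_gb=25.0, vram_gb=11.0)
    print(f"{gpu} layers on GPU, {cpu} layers on CPU")
```

The more layers that land on the GPU, the faster generation should be, so a model that only partly fits can still benefit from the card.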
