Reversal knowledge here means: if the LLM knows that A is B, does it also know that B is A? Apparently the answer is a pretty resounding no! I’d be curious to see whether adding some chain-of-thought (CoT) prompting would affect the results at all.
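For anyone who wants to poke at this themselves, here is a rough sketch of how the "A is B" vs. "B is A" check could be scored. Everything in it is an assumption for illustration, not the paper's method: `ask` is a hypothetical stand-in for whatever function sends a prompt to your model, the names in `FACTS` are made up, and the "model" is just a dict that only memorized the forward direction.

```python
from typing import Callable, List, Tuple

# Each fact: (forward question, forward answer, reverse question, reverse answer).
# The names below are invented placeholders, not data from the paper.
Fact = Tuple[str, str, str, str]

def accuracy(ask: Callable[[str], str], pairs: List[Tuple[str, str]]) -> float:
    """Fraction of questions whose reply contains the expected answer."""
    if not pairs:
        return 0.0
    hits = sum(1 for question, answer in pairs
               if answer.lower() in ask(question).lower())
    return hits / len(pairs)

def reversal_gap(ask: Callable[[str], str], facts: List[Fact]) -> Tuple[float, float]:
    """Return (forward accuracy, reverse accuracy) over the same facts."""
    forward = accuracy(ask, [(f[0], f[1]) for f in facts])
    reverse = accuracy(ask, [(f[2], f[3]) for f in facts])
    return forward, reverse

FACTS: List[Fact] = [
    ("Who is Alice Example's mentor?", "Bob Sample",
     "Who is Bob Sample the mentor of?", "Alice Example"),
]

# Toy stand-in for a model: it has only "seen" the forward direction.
memory = {FACTS[0][0]: "Bob Sample"}

def toy_model(question: str) -> str:
    return memory.get(question, "I don't know")

forward, reverse = reversal_gap(toy_model, FACTS)
print(forward, reverse)  # prints: 1.0 0.0
```

To try the CoT idea, you could wrap `ask` in a function that prepends a "think step by step" instruction and compare the reverse score with and without it. The substring match is a crude scorer, so treat the numbers as a rough signal only.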
Meh. Either I’m doing something wrong, or we should stop linking (only) Twitter posts. I can only see the original 42 words and a picture, with no mention of a paper or thread that clarifies what this means.
For other people with the same problem, here’s the website of the person: https://owainevans.github.io/
And here’s the mentioned paper: https://owainevans.github.io/reversal_curse.pdf
thank you
Yeah, fair point, I’ll make sure to include better links in the future :) I typically post from mobile, so it’s annoying but doable.