all major LLMs in the west are programmed to have bias. i think if you actually let it train on all data and didn’t censor it it would call out propaganda
I don’t think it would help much. Let’s look at CNN’s coverage of the pager attack to see what I mean.
Starting with the headline, the article is already biased in a way that an LLM couldn’t detect: the headline claims the attack was targeting Hezbollah, which already contradicts both the facts shown immediately below it and contextual information from the real world. As humans, we can think about what it means for pagers to be rigged with explosives and detonated simultaneously at scale. We can understand that there’s no way to do that while “targeting” any specific group, because you can’t control where rigged consumer goods go. But an LLM would never know that, because an LLM doesn’t reason about the world and doesn’t actually know what the words it reads and writes represent. The best an LLM could do is notice a mismatch between the headline saying the attack targeted Hezbollah and the body showing that most injured people were civilians. But how could an LLM ever know that, even trained on better data? The article has this statement,
NNA reported that “hacked” pager devices exploded in the towns of Ali Al-Nahri and Riyaq in the central Beqaa valley, resulting in a significant number of injuries. The locations are Hezbollah strongholds.
that an AI would simply have to know to mistrust in order to identify the problem. As far as the AI knows, when the article says those towns are Hezbollah strongholds, that could mean those towns are… Hezbollah strongholds, in the literal military sense, rather than just places where Hezbollah has a strong presence. How would an LLM know any better?
A similar argument can be made about the broader context the article gives. It mentions the tit-for-tat battles since Oct 7 but makes no mention of Israel’s history of aggression against Lebanon. Could an LLM identify that omission as a source of bias? It’d need to be very sure of its own understanding of history to do so, and if you’ve ever tried to have an LLM fact-check something for you, you already know they’re not designed to hold a consistent grip on reality. So we can reasonably say that LLMs are not the right tools to identify omission as a source of bias. They are even less suited to identifying fabrication, because to identify that something is a lie you’d need a way to find a contradicting source and then verify that source, which an LLM can never do.