A Blog by Jonathan Low

 

Jun 27, 2019

How Machine Learning Is Weaponizing Fact Checking To Identify More Specific Claims

The more accurate and the more specific it is, the better. JL

Kalev Leetaru comments in Forbes:

An article with a single false or misleading claim will be flagged in its entirety at the URL level, while a Website with a history of falsehoods will be flagged at the domain level, blacklisting even accurate content. Could natural language processing help transform the way in which fact checking results are applied, flagging the individual sentences within articles that are false or misleading, rather than blacklisting entire Websites? Algorithms could identify the individual claims, separating them from the rest of the verbiage unrelated to the dispute. External links to disputed content could be dynamically rewritten to highlight problematic passages. Information science describes this as “mobilizing” information.
Fact checking today operates at the level of claims but yields blacklists that are enacted primarily at the level of entire articles and outlets. An article with a single false or misleading claim will be flagged in its entirety at the URL level, while a Website with a history of falsehoods will be flagged at the domain level, blacklisting even accurate content. Such coarse blacklisting creates a significant amount of collateral damage as social media platforms block or deemphasize large amounts of accurate statements to reduce the distribution of a few false ones. Could natural language processing help transform the way in which fact checking results are applied, flagging the individual sentences within articles that are false or misleading, rather than blacklisting entire Websites?
The fact checking process involves evaluating specific claims and passing verdicts on their veracity. A fact check will typically confirm some elements of an article or post and refute others. However, information on the Web does not exist in a vacuum: it is published in the form of an object, typically a Web page or social media post, that may group together multiple claims under a single URL.
Today’s Web operates at the resolution of a URL, meaning a Web page containing a false claim is typically blocked or deemphasized in its entirety, since there is no way for a social media site to block only the false claims in a Web page while allowing the accurate claims to survive.
While this is largely a technological issue with how the Web functions, it is also a result of the current human-centered nature of fact checking. It would simply be intractable to ask fact checkers to review every page and post that repeated a claim and identify just the text related to the false claim.
What if machine learning could offer such a capability?
Imagine using natural language algorithms to scan every post and link being shared on social platforms to identify whether any of the statements within match claims identified by fact checkers. Such algorithms could identify the individual sentences containing the claim, separating them from the rest of the verbiage unrelated to the disputed claim.
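As a rough illustration of the sentence-level matching described above, here is a minimal Python sketch. It is not the method of any real platform or fact checker: the function names, the naive sentence splitter, the word-overlap (Jaccard) score and the 0.5 threshold are all assumptions chosen for illustration.

```python
import re

def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokens(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard(a, b):
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def flag_sentences(post, checked_claims, threshold=0.5):
    """Return (sentence, matched_claim, score) for each sentence in the post
    that overlaps strongly with a claim already reviewed by fact checkers.
    The threshold is a hypothetical tuning knob, not an established value."""
    flagged = []
    for sentence in split_sentences(post):
        for claim in checked_claims:
            score = jaccard(sentence, claim)
            if score >= threshold:
                flagged.append((sentence, claim, score))
                break  # one matching claim is enough to flag the sentence
    return flagged

post = ("The city opened a new park on Monday. "
        "The moon is made of green cheese. "
        "Officials expect more visitors this summer.")
claims = ["The moon is made of green cheese"]

for sentence, claim, score in flag_sentences(post, claims):
    print(f"{score:.2f}  {sentence}")
```

Only the middle sentence is flagged; the two unrelated sentences pass through untouched. A scorer this simple only catches near-verbatim repetitions of a claim, so a paraphrase such as "Our moon consists mostly of cheese" falls below the threshold.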
In turn, rather than blocking a post in its entirety, social platforms could highlight the problematic passages within that post, connecting them back to the fact check and evidence calling them into question. External links to pages containing disputed content could similarly be dynamically rewritten to highlight the problematic passages.
Information science describes this as “mobilizing” information. Rather than being treated as immutable objects, documents are recognized as containers holding many discrete arguments and details, any one of which can be extracted from that container to be shared or cited on its own. In the context of fact checking, one can think of disputed claims as being distinct from the other claims in a document and thus isolable.
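To make the idea of highlighting individual passages concrete, the following Python sketch wraps each disputed sentence in an HTML highlight and links it to the relevant fact check, leaving the rest of the post intact. The function name, the markup and the example URL are hypothetical; a real platform would presumably work on its own rendering pipeline rather than on raw strings.

```python
import html

def highlight_disputed(post, disputed):
    """Return the post as HTML with each disputed sentence highlighted.

    `disputed` maps an exact sentence from the post to the URL of the
    fact check that questions it (both are illustrative assumptions).
    """
    out = html.escape(post)  # escape first so the post text cannot inject markup
    for sentence, url in disputed.items():
        escaped = html.escape(sentence)
        marked = '<mark>{}</mark> <a href="{}">[disputed]</a>'.format(
            escaped, html.escape(url, quote=True)
        )
        out = out.replace(escaped, marked)
    return out

post = "The city opened a new park on Monday. The moon is made of green cheese."
disputed = {
    "The moon is made of green cheese.": "https://example.org/fact-check/moon",
}
print(highlight_disputed(post, disputed))
```

The accurate sentence is left as it was, while the disputed one carries a visible marker and a pointer back to the evidence, which is the container-versus-claim distinction in miniature.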
While current natural language algorithms are extremely brittle, such a system would certainly be achievable today, even if in its current form it would require significant linguistic overlap between claim and fact check and would be less able to read between the lines of ambiguous language.
Putting this all together, perhaps as natural language processing improves we will someday be able to move beyond blacklisting entire posts and pages towards highlighting only the individual disputed sentences.
Perhaps someday autonomous fact checking algorithms could even work alongside spell and grammar checkers as a transparent part of our daily communicative lives.
