Causal language plays a critical role in scientific communication, as it shapes public understanding, informs policy, and impacts healthcare decisions. When causal statements are ambiguous or misleading, they can lead to confusion and misinterpretation of research findings. This study explores how large language models (LLMs) can improve causal language in academic writing. It employs a two-step approach: first, distinguishing non-causal from causal statements, and second, classifying causal sentences as correlational, conditional causal, or direct causal. The models were fine-tuned on a blended dataset of general-purpose (news, web) and scientific (social science, biomedical) human-labeled sentences. The BERT-based classifier achieved a macro F1-score of 0.94 for detecting causal versus non-causal sentences, while SciBERT attained 0.83 in distinguishing correlational, conditional causal, and direct causal statements. To explore how these classifiers can be applied in practice, a tool was developed to analyze scientific papers and texts, offering personalized warnings and highlighting potential inconsistencies in causal reasoning. By providing researchers with a (visual) overview of causal strength and alignment with study design, the tool supports clearer, more precise communication of research findings. This study demonstrates how LLMs can enhance the clarity and precision of causal language in academic writing, offering a scalable approach to improving scientific communication.
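As a rough illustration of the two-step approach, the sketch below chains two Hugging Face text-classification pipelines. The checkpoint paths and label strings are placeholder assumptions, not the fine-tuned models produced in this study.

```python
# Minimal sketch of the two-step classification, assuming fine-tuned BERT and
# SciBERT checkpoints are available locally (paths and label names are
# placeholders, not the actual released models).
from transformers import pipeline

# Step 1: causal vs. non-causal detection (fine-tuned BERT)
causal_detector = pipeline("text-classification", model="models/bert-causal-detection")
# Step 2: correlational / conditional causal / direct causal (fine-tuned SciBERT)
strength_classifier = pipeline("text-classification", model="models/scibert-causal-strength")

def classify_sentence(sentence: str) -> str:
    """Return 'non-causal' or one of the three causal-strength labels."""
    if causal_detector(sentence)[0]["label"] == "non-causal":
        return "non-causal"
    return strength_classifier(sentence)[0]["label"]

print(classify_sentence("Higher coffee intake was associated with lower mortality."))
```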
Final tool functionalities
Summary
Look at the classifications per section and get a summary of the study design along with writing tips
Align claims
Toggle sections in or out of your view to check whether the claims made in the abstract still match the strength of the claims in the conclusion
Lenient vs. strict classifications
You can decide whether to see all of the model's classifications or only the ones it is highly confident about (see the sketch after this feature list)
Explanations
Let the model explain its decision and ask a follow-up question
Warnings
Actionable tips based on what the model found
Try the demo version here
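As a rough illustration of the lenient vs. strict toggle, the sketch below filters predictions by the classifier's confidence score. The 0.9 threshold is an assumed value, not the tool's actual setting.

```python
# Sketch of a lenient/strict toggle: keep every prediction in lenient mode,
# only high-confidence ones in strict mode. The threshold is an assumption.
STRICT_THRESHOLD = 0.9

def filter_predictions(predictions: list[dict], strict: bool = False) -> list[dict]:
    """Each prediction is a dict with 'sentence', 'label', and 'score' keys."""
    if not strict:
        return predictions
    return [p for p in predictions if p["score"] >= STRICT_THRESHOLD]
```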
Context
Causal language is often misused in scientific papers, leading to confusion about the implications of findings
An example of how quickly the media can misinterpret unclear causal language in scientific papers
An example of an unclear causal statement
Three levels of causality
As humans, we can clearly distinguish only three categories of causal relationships, so these are the labels used for fine-tuning: correlational, conditional causal, and direct causal.
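For illustration, the three labels could be represented as follows; the example sentences are invented to show the distinction and are not taken from the training data.

```python
# Illustrative mapping of the three fine-tuning labels to invented example
# sentences (assumptions for clarity, not items from the training data).
CAUSAL_LABELS = {
    "correlational": "Coffee intake is associated with lower mortality.",
    "conditional causal": "Coffee intake may reduce mortality in older adults.",
    "direct causal": "Coffee intake reduces mortality.",
}
```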
Training data
The human-labeled training data was compiled from existing datasets
Dataset usage
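A minimal sketch of how such a blend could be assembled with the Hugging Face `datasets` library; the file names, column layout, and split ratio are assumptions, not the actual sources.

```python
# Sketch: blending general-purpose and scientific sentence datasets into one
# training set. Each CSV is assumed to have 'sentence' and 'label' columns.
from datasets import load_dataset, concatenate_datasets

sources = [
    "news_sentences.csv",            # general-purpose
    "web_sentences.csv",             # general-purpose
    "social_science_sentences.csv",  # scientific
    "biomedical_sentences.csv",      # scientific
]
parts = [load_dataset("csv", data_files=path)["train"] for path in sources]
blended = concatenate_datasets(parts).shuffle(seed=42)
splits = blended.train_test_split(test_size=0.2, seed=42)
```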
Model selection: is bigger always better?
BERT vs. GPT architecture
Evaluation (2 labels)
Confusion matrices (2 labels)
Evaluation metrics comparison, with BERT as the best-performing model
Evaluation (3 labels)
Confusion matrices (3 labels)
Evaluation metrics comparison, with SciBERT as the best-performing model
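A minimal sketch of how the reported confusion matrices and macro F1-scores can be computed with scikit-learn; the label lists below are illustrative placeholders, not the actual held-out test set.

```python
# Sketch: confusion matrix and macro F1-score for the three-label task.
# The y_true / y_pred lists are placeholders, not real evaluation data.
from sklearn.metrics import confusion_matrix, f1_score

labels = ["correlational", "conditional causal", "direct causal"]
y_true = ["direct causal", "correlational", "conditional causal", "correlational"]
y_pred = ["direct causal", "correlational", "direct causal", "correlational"]

print(confusion_matrix(y_true, y_pred, labels=labels))
print("macro F1:", f1_score(y_true, y_pred, labels=labels, average="macro"))
```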
Evaluation of the final models
Learning curves showing the training of the final models
Misclassifications by the final model that are also very hard for a human to classify, showing the complexity of this task
Integration into a Tool
Ten best practices for writing causal language, derived from the literature, are used to provide actionable tips
First, the scientific paper in PDF format is processed by a service called GROBID to create a structured XML file. All headers, figures, and other elements are then recognized and organized by a Python script; the references and the introduction are removed, as they do not need to be classified by the model. Every remaining sentence is first classified as causal or non-causal, and all causal sentences are then classified as correlational, conditional causal, or direct causal. Finally, the methods section is passed to a Llama model to provide a summary of the paper and its study design.
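A minimal sketch of this pipeline, assuming a local GROBID server and the two fine-tuned classifiers from the earlier sketch; the GROBID URL and model paths are placeholders, and the Llama summary step is only indicated by a comment.

```python
# Sketch of the processing pipeline: PDF -> GROBID TEI XML -> sentences ->
# two-step classification. URLs and model paths are placeholder assumptions.
import requests
from xml.etree import ElementTree as ET
from transformers import pipeline

GROBID_URL = "http://localhost:8070/api/processFulltextDocument"
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

causal_detector = pipeline("text-classification", model="models/bert-causal-detection")
strength_classifier = pipeline("text-classification", model="models/scibert-causal-strength")

def pdf_to_sentences(pdf_path: str) -> list[str]:
    """Convert a PDF to TEI XML via GROBID and return a naive sentence split of the body text."""
    with open(pdf_path, "rb") as f:
        tei_xml = requests.post(GROBID_URL, files={"input": f}).text
    body_paragraphs = ET.fromstring(tei_xml).findall(".//tei:body//tei:p", TEI_NS)
    text = " ".join("".join(p.itertext()) for p in body_paragraphs)
    # The real script also drops the introduction and references and keeps
    # track of section headers; this naive split is for illustration only.
    return [s.strip() for s in text.split(". ") if s.strip()]

def classify_paper(pdf_path: str) -> list[dict]:
    """Two-step classification of every body sentence; non-causal sentences are skipped."""
    results = []
    for sentence in pdf_to_sentences(pdf_path):
        if causal_detector(sentence)[0]["label"] == "non-causal":
            continue
        prediction = strength_classifier(sentence)[0]
        results.append({"sentence": sentence,
                        "label": prediction["label"],
                        "score": prediction["score"]})
    # The methods section would additionally be passed to a Llama model to
    # summarize the paper and its study design.
    return results
```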
Tool interface screenshot
The tool was tested by five researchers, who reviewed their own papers for causal language usage