Authors
Mohammad Rashedul Hasan, University of Nebraska-Lincoln, USA
Abstract
We investigate the generalizability of deep contextual models along two dimensions: (i) when the data includes unreliable or noisy categories and (ii) when the data is out-of-distribution (OOD). Specifically, we focus on the Transformer-based BERT (Bidirectional Encoder Representations from Transformers) model for recognizing COVID-19 misinformation in online social media. We design a set of studies to examine the generalizability of a diverse array of BERT-based transfer learning techniques; the investigation also includes shallow non-contextual models. Results obtained from extensive, systematic experimentation show that the BERT-based models generalize poorly on OOD data as well as when the domain contains unverified samples. Notably, these deep contextual models are no more effective than, and at times worse than, shallow non-contextual models. We discuss possible reasons for the poor generalizability of deep contextual models.
Keywords
Deep Contextual Models, Generalizability, Natural Language Processing, COVID-19, Noisy Data