Authors
Mohammad Rashedul Hasan, University of Nebraska-Lincoln, USA
Abstract
We investigate the generalizability of deep contextual models along two dimensions: (i) when the data includes unreliable or noisy categories and (ii) when the data is out-of-distribution (OOD). Specifically, we focus on the Transformer-based BERT (Bidirectional Encoder Representations from Transformers) model for recognizing COVID-19 misinformation in online social media. We design a set of studies to examine the generalizability of a diverse array of BERT-based transfer learning techniques; the investigation also includes shallow non-contextual models. Results obtained from extensive, systematic experimentation show that the BERT-based models generalize poorly on OOD data as well as when the domain contains unverified samples. Notably, these deep contextual models are no more effective than, and at times worse than, shallow non-contextual models. We discuss possible reasons for the poor generalizability of deep contextual models.
Keywords
Deep Contextual Models, Generalizability, Natural Language Processing, COVID-19, Noisy Data