Abstract.
There is growing evidence that the classical notion of adversarial robustness, originally introduced for images, has been adopted as a de facto standard by a large part of the NLP research community. We show that this notion is problematic in the context of NLP, as it considers only a narrow spectrum of linguistic phenomena. In this paper, we argue for semantic robustness, which is better aligned with the human concept of linguistic fidelity. We characterize semantic robustness in terms of the biases it is expected to induce in a model. We study the semantic robustness of a range of vanilla and robustly trained architectures using a template-based generative test bed. We complement the analysis with empirical evidence that, despite being harder to implement, semantic robustness can improve performance on complex linguistic phenomena where models robust in the classical sense fail.