DOI: 10.1145/3442188.3445922 (FAccT conference proceedings)
Article
Open Access

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜

Published: 01 March 2021

ABSTRACT

The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.

References

  1. Hussein M Adam, Robert D Bullard, and Elizabeth Bell. 2001. Faces of environmental racism: Confronting issues of global justice. Rowman & Littlefield.Google ScholarGoogle Scholar
  2. Chris Alberti, Kenton Lee, and Michael Collins. 2019. A BERT Baseline for the Natural Questions. arXiv:1901.08634 [cs.CL]Google ScholarGoogle Scholar
  3. Larry Alexander. 1992. What makes wrongful discrimination wrong? Biases, preferences, stereotypes, and proxies. University of Pennsylvania Law Review 141, 1 (1992), 149--219.Google ScholarGoogle ScholarCross RefCross Ref
  4. American Psychological Association. 2019. Discrimination: What it is, and how to cope. https://www.apa.org/topics/discrimination (2019).Google ScholarGoogle Scholar
  5. Dario Amodei and Daniel Hernandez. 2018. AI and Compute. https://openai. com/blog/ai-and-compute/Google ScholarGoogle Scholar
  6. David Anthoff, Robert J Nicholls, and Richard SJ Tol. 2010. The economic impact of substantial sea-level rise. Mitigation and Adaptation Strategies for Global Change 15, 4 (2010), 321--335.Google ScholarGoogle ScholarCross RefCross Ref
  7. Mikhail J Atallah, Victor Raskin, Christian F Hempelmann, Mercan Karahan, Radu Sion, Umut Topkara, and Katrina E Triezenberg. 2002. Natural Language Watermarking and Tamperproofing. In International Workshop on Information Hiding. Springer, 196--212.Google ScholarGoogle Scholar
  8. Alexei Baevski and Abdelrahman Mohamed. 2020. Effectiveness of Self-Supervised Pre-Training for ASR. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 7694--7698.Google ScholarGoogle Scholar
  9. Michael Barera. 2020. Mind the Gap: Addressing Structural Equity and Inclusion on Wikipedia. (2020). Accessible at http://hdl.handle.net/10106/29572.Google ScholarGoogle Scholar
  10. Russel Barsh. 1990. Indigenous peoples, racism and the environment. Meanjin 49, 4 (1990), 723.Google ScholarGoogle Scholar
  11. Christine Basta, Marta R Costa-jussà, and Noe Casas. 2019. Evaluating the Underlying Gender Bias in Contextualized Word Embeddings. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing. 33--39.Google ScholarGoogle ScholarCross RefCross Ref
  12. Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A Pretrained Language Model for Scientific Text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 3615--3620. https://doi.org/10.18653/v1/D19-1371Google ScholarGoogle Scholar
  13. Emily M. Bender and Batya Friedman. 2018. Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics 6 (2018), 587--604.Google ScholarGoogle ScholarCross RefCross Ref
  14. Emily M. Bender and Alexander Koller. 2020. Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 5185--5198. https://doi.org/10.18653/v1/2020.acl-main.463Google ScholarGoogle Scholar
  15. Ruha Benjamin. 2019. Race After Technology: Abolitionist Tools for the New Jim Code. Polity Press, Cambridge, UK.Google ScholarGoogle Scholar
  16. Elettra Bietti and Roxana Vatanparast. 2020. Data Waste. Harvard International Law Journal 61 (2020).Google ScholarGoogle Scholar
  17. Steven Bird. 2016. Social Mobile Technologies for Reconnecting Indigenous and Immigrant Communities.. In People.Policy.Place Seminar. Northern Institute, Charles Darwin University, Darwin, Australia. https://www.cdu.edu.au/sites/default/files/the-northern-institute/ppp-bird-20160128-4up.pdfGoogle ScholarGoogle Scholar
  18. Abeba Birhane and Vinay Uday Prabhu. 2021. Large Image Datasets: A Pyrrhic Win for Computer Vision?. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 1537--1547.Google ScholarGoogle ScholarCross RefCross Ref
  19. Su Lin Blodgett, Solon Barocas, Hal Daumé III, and Hanna Wallach. 2020. Language (Technology) is Power: A Critical Survey of "Bias" in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 5454--5476. https://doi.org/10.18653/v1/2020.acl-main.485Google ScholarGoogle ScholarCross RefCross Ref
  20. Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, and Jeffrey Dean. 2007. Large Language Models in Machine Translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL). Association for Computational Linguistics, Prague, Czech Republic, 858--867. https://www.aclweb.org/anthology/D07-1090Google ScholarGoogle Scholar
  21. Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E Peters, Ashish Sabharwal, and Yejin Choi. 2020. Adversarial Filters of Dataset Biases. In Proceedings of the 37th International Conference on Machine Learning.Google ScholarGoogle Scholar
  22. Luke Breitfeller, Emily Ahn, David Jurgens, and Yulia Tsvetkov. 2019. Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 1664--1674. https://doi.org/10.18653/v1/D19-1176Google ScholarGoogle Scholar
  23. Susan E Brennan and Herbert H Clark. 1996. Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology: Learning, Memory, and Cognition 22, 6 (1996), 1482.Google ScholarGoogle ScholarCross RefCross Ref
  24. Robin Brewer and Anne Marie Piper. 2016. "Tell It Like It Really Is" A Case of Online Content Creation and Sharing Among Older Adult Bloggers. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. 5529--5542.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.htmlGoogle ScholarGoogle Scholar
  26. Cristian Buciluă, Rich Caruana, and Alexandru Niculescu-Mizil. 2006. Model Compression. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Philadelphia, PA, USA) (KDD '06). Association for Computing Machinery, New York, NY, USA, 535--541. https://doi.org/10.1145/1150402.1150464Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Robert D Bullard. 1993. Confronting environmental racism: Voices from the grassroots. South End Press.Google ScholarGoogle Scholar
  28. Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. 2020. Extracting Training Data from Large Language Models. arXiv:2012.07805 [cs.CR]Google ScholarGoogle Scholar
  29. Herbert H. Clark. 1996. Using Language. Cambridge University Press, Cambridge.Google ScholarGoogle Scholar
  30. Herbert H. Clark and Adrian Bangerter. 2004. Changing ideas about reference. In Experimental Pragmatics. Springer, 25--49.Google ScholarGoogle Scholar
  31. Herbert H. Clark and Meredyth A Krych. 2004. Speaking while monitoring addressees for understanding. Journal of Memory and Language 50, 1 (2004), 62--81.Google ScholarGoogle ScholarCross RefCross Ref
  32. Herbert H. Clark, Robert Schreuder, and Samuel Buttrick. 1983. Common ground at the understanding of demonstrative reference. Journal of Verbal Learning and Verbal Behavior 22, 2 (1983), 245--258. https://doi.org/10.1016/S0022-5371(83)90189-5Google ScholarGoogle ScholarCross RefCross Ref
  33. Herbert H. Clark and Deanna Wilkes-Gibbs. 1986. Referring as a collaborative process. Cognition 22, 1 (1986), 1--39. https://doi.org/10.1016/0010-0277(86) 90010-7Google ScholarGoogle ScholarCross RefCross Ref
  34. Kimberlé Crenshaw. 1989. Demarginalizing the intersection of race and sex: A Black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. The University of Chicago Legal Forum (1989), 139.Google ScholarGoogle Scholar
  35. Benjamin Dangl. 2019. The Five Hundred Year Rebellion: Indigenous Movements and the Decolonization of History in Bolivia. AK Press.Google ScholarGoogle Scholar
  36. Christian Davenport. 2009. Media bias, perspective, and state repression: The Black Panther Party. Cambridge University Press.Google ScholarGoogle Scholar
  37. Ferdinand de Saussure. 1959. Course in General Linguistics. The Philosophical Society, New York. Translated by Wade Baskin.Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Terrance de Vries, Ishan Misra, Changhan Wang, and Laurens van der Maaten. 2019. Does object recognition work for everyone?. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 52--59.Google ScholarGoogle Scholar
  39. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171--4186. https://doi.org/10.18653/v1/N19-1423Google ScholarGoogle Scholar
  40. Maeve Duggan. 2017. Online Harassment 2017. Pew Research Center.Google ScholarGoogle Scholar
  41. Jennifer Earl, Andrew Martin, John D. McCarthy, and Sarah A. Soule. 2004. The use of newspaper data in the study of collective action. Annual Review of Sociology 30 (2004), 65--80.Google ScholarGoogle ScholarCross RefCross Ref
  42. Ethan Fast, Tina Vachovsky, and Michael Bernstein. 2016. Shirtless and Dangerous: Quantifying Linguistic Signals of Gender Bias in an Online Fiction Writing Community. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 10.Google ScholarGoogle Scholar
  43. William Fedus, Barret Zoph, and Noam Shazeer. 2021. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. arXiv:2101.03961 [cs.LG]Google ScholarGoogle Scholar
  44. Anjalie Field, Doron Kliger, Shuly Wintner, Jennifer Pan, Dan Jurafsky, and Yulia Tsvetkov. 2018. Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, 3570--3580. https://doi.org/10.18653/v1/D18-1393Google ScholarGoogle ScholarCross RefCross Ref
  45. Darja Fišer, Ruihong Huang, Vinodkumar Prabhakaran, Rob Voigt, Zeerak Waseem, and Jacqueline Wernimont (Eds.). 2018. Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Association for Computational Linguistics, Brussels, Belgium. https://www.aclweb.org/anthology/W18-5100Google ScholarGoogle Scholar
  46. Susan T Fiske. 2017. Prejudices in cultural contexts: shared stereotypes (gender, age) versus variable stereotypes (race, ethnicity, religion). Perspectives on psychological science 12, 5 (2017), 791--799.Google ScholarGoogle Scholar
  47. Antigoni Founta, Constantinos Djouvas, Despoina Chatzakou, Ilias Leontiadis, Jeremy Blackburn, Gianluca Stringhini, Athena Vakali, Michael Sirivianos, and Nicolas Kourtellis. 2018. Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior. In Proceedings of the International AAAI Conference on Web and Social Media, Vol. 12.Google ScholarGoogle ScholarCross RefCross Ref
  48. Batya Friedman and David Hendry. 2012. The Envisioning Cards: A Toolkit for Catalyzing Humanistic and Technical Imaginations. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas, USA) (CHI '12). Association for Computing Machinery, New York, NY, USA, 1145--1148. https://doi.org/10.1145/2207676.2208562Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Batya Friedman and David G. Hendry. 2019. Value Sensitive Design: Shaping Technology with Moral Imagination. MIT Press.Google ScholarGoogle Scholar
  50. Batya Friedman, Peter H. Kahn, Jr., and Alan Borning. 2006. Value sensitive design and information systems. In Human-Computer Interaction in Management Information Systems: Foundations, P Zhang and D Galletta (Eds.). M. E. Sharpe, Armonk NY, 348--372.Google ScholarGoogle Scholar
  51. Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, and Connor Leahy. 2020. The Pile: An 800GB Dataset of Diverse Text for Language Modeling. arXiv:2101.00027 [cs.CL]Google ScholarGoogle Scholar
  52. Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. 2020. Datasheets for Datasets. arXiv:1803.09010 [cs.DB]Google ScholarGoogle Scholar
  53. Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A. Smith. 2020. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, 3356--3369. https://doi.org/10.18653/v1/2020.findings-emnlp.301Google ScholarGoogle Scholar
  54. Wei Guo and Aylin Caliskan. 2020. Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases. arXiv preprint arXiv:2006.03955 (2020).Google ScholarGoogle Scholar
  55. Melissa Hart. 2004. Subjective decisionmaking and unconscious discrimination. Alabama Law Review 56 (2004), 741.Google ScholarGoogle Scholar
  56. Deborah Hellman. 2008. When is Discrimination Wrong? Harvard University Press.Google ScholarGoogle Scholar
  57. Peter Henderson, Jieru Hu, Joshua Romoff, Emma Brunskill, Dan Jurafsky, and Joelle Pineau. 2020. Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning. Journal of Machine Learning Research 21, 248 (2020), 1--43. http://jmlr.org/papers/v21/20-312.htmlGoogle ScholarGoogle Scholar
  58. Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).Google ScholarGoogle Scholar
  59. Chao-Wei Huang and Yun-Nung Chen. 2019. Adapting Pretrained Transformer to Lattices for Spoken Language Understanding. In Proceedings of 2019 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2019). Sentosa, Singapore, 845--852.Google ScholarGoogle ScholarCross RefCross Ref
  60. Hongzhao Huang and Fuchun Peng. 2019. An Empirical Study of Efficient ASR Rescoring with Transformers. arXiv:1910.11450 [cs.CL]Google ScholarGoogle Scholar
  61. Ben Hutchinson, Vinodkumar Prabhakaran, Emily Denton, Kellie Webster, Yu Zhong, and Stephen Denuyl. 2020. Social Biases in NLP Models as Barriers for Persons with Disabilities. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 5491--5501. https://doi.org/10.18653/v1/2020.acl-main.487Google ScholarGoogle ScholarCross RefCross Ref
  62. Eun Seo Jo and Timnit Gebru. 2020. Lessons from archives: strategies for collecting sociocultural data in machine learning. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 306--316.Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. Leslie Kay Jones. 2020. #BlackLivesMatter: An Analysis of the Movement as Social Drama. Humanity & Society 44, 1 (2020), 92--110.Google ScholarGoogle ScholarCross RefCross Ref
  64. Leslie Kay Jones. 2020. Twitter wants you to know that you're still SOL if you get a death threat --- unless you're President Donald Trump. (2020). https://medium.com/@agua.carbonica/twitter-wants-you-to-know-that-youre-still-sol-if-you-get-a-death-threat-unless-you-re-a5cce316b706.Google ScholarGoogle Scholar
  65. Pratik Joshi, Sebastin Santy, Amar Budhiraja, Kalika Bali, and Monojit Choudhury. 2020. The State and Fate of Linguistic Diversity and Inclusion in the NLP World. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 6282--6293. https://doi.org/10.18653/v1/2020.acl-main.560Google ScholarGoogle ScholarCross RefCross Ref
  66. Nurul Shamimi Kamaruddin, Amirrudin Kamsin, Lip Yee Por, and Hameedur Rahman. 2018. A Review of Text Watermarking: Theory, Methods, and Applications. IEEE Access 6 (2018), 8011--8028. https://doi.org/10.1109/ACCESS.2018. 2796585Google ScholarGoogle ScholarCross RefCross Ref
  67. Brendan Kennedy, Drew Kogon, Kris Coombs, Joseph Hoover, Christina Park, Gwenyth Portillo-Wightman, Aida Mostafazadeh Davani, Mohammad Atari, and Morteza Dehghani. 2018. A typology and coding manual for the study of hate-based rhetoric. PsyArXiv. July 18 (2018).Google ScholarGoogle Scholar
  68. Gary Klein. 2007. Performing a project premortem. Harvard business review 85, 9 (2007), 18--19.Google ScholarGoogle Scholar
  69. Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, and Yulia Tsvetkov. 2019. Measuring Bias in Contextualized Word Representations. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing. 166--172.Google ScholarGoogle ScholarCross RefCross Ref
  70. Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arXiv preprint arXiv:1909.11942 (2019).Google ScholarGoogle Scholar
  71. Amanda Lazar, Mark Diaz, Robin Brewer, Chelsea Kim, and Anne Marie Piper. 2017. Going gray, failure to hire, and the ick factor: Analyzing how older bloggers talk about ageism. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 655--668.Google ScholarGoogle ScholarDigital LibraryDigital Library
  72. Christopher A Le Dantec, Erika Shehan Poole, and Susan P Wyche. 2009. Values as lived experience: evolving value sensitive design in support of value discovery. In Proceedings of the SIGCHI conference on human factors in computing systems. 1141--1150.Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. 2020. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. arXiv:2006.16668 [cs.CL]Google ScholarGoogle Scholar
  74. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019).Google ScholarGoogle Scholar
  75. Kadan Lottick, Silvia Susai, Sorelle A. Friedler, and Jonathan P. Wilson. 2019. Energy Usage Reports: Environmental awareness as part of algorithmic accountability. arXiv:1911.08354 [cs.LG]Google ScholarGoogle Scholar
  76. Mette Edith Lundsfryd. 2017. Speaking Back to a World of Checkpoints: Oral History as a Decolonizing Tool in the Study of Palestinian Refugees from Syria in Lebanon. Middle East Journal of Refugee Studies 2, 1 (2017), 73--95.Google ScholarGoogle ScholarCross RefCross Ref
  77. Marianna Martindale and Marine Carpuat. 2018. Fluency Over Adequacy: A Pilot Study in Measuring User Trust in Imperfect MT. In Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track). Association for Machine Translation in the Americas, Boston, MA, 13--25. https://www.aclweb.org/anthology/W18-1803Google ScholarGoogle Scholar
  78. Sally McConnell-Ginet. 1984. The Origins of Sexist Language in Discourse. Annals of the New York Academy of Sciences 433, 1 (1984), 123--135.Google ScholarGoogle ScholarCross RefCross Ref
  79. Sally McConnell-Ginet. 2020. Words Matter: Meaning and Power. Cambridge University Press.Google ScholarGoogle ScholarCross RefCross Ref
  80. Kris McGuffie and Alex Newhouse. 2020. The Radicalization Risks of GPT-3 and Advanced Neural Language Models. Technical Report. Center on Terrorism, Extremism, and Counterterrorism, Middlebury Institute of International Studies at Monterrey. https://www.middlebury.edu/institute/sites/www.middlebury.edu.institute/files/2020-09/gpt3-article.pdf.Google ScholarGoogle Scholar
  81. Douglas M McLeod. 2007. News coverage and social protest: How the media's protect paradigm exacerbates social conflict. Journal of Dispute Resolution (2007), 185.Google ScholarGoogle Scholar
  82. Oren Melamud, Jacob Goldberger, and Ido Dagan. 2016. context2vec: Learning Generic Context Embedding with Bidirectional LSTM. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning. Association for Computational Linguistics, Berlin, Germany, 51--61. https://doi.org/10. 18653/v1/K16-1006Google ScholarGoogle ScholarCross RefCross Ref
  83. Julia Mendelsohn, Yulia Tsvetkov, and Dan Jurafsky. 2020. A Framework for the Computational Linguistic Analysis of Dehumanization. Frontiers in Artificial Intelligence 3 (2020), 55. https://doi.org/10.3389/frai.2020.00055Google ScholarGoogle ScholarCross RefCross Ref
  84. Kaitlynn Mendes, Jessica Ringrose, and Jessalynn Keller. 2018. # MeToo and the promise and pitfalls of challenging rape culture through digital feminist activism. European Journal of Women's Studies 25, 2 (2018), 236--246.Google ScholarGoogle ScholarCross RefCross Ref
  85. Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Distributed Representations of Words and Phrases and Their Compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2 (Lake Tahoe, Nevada) (NIPS'13). Curran Associates Inc., Red Hook, NY, USA, 3111--3119.Google ScholarGoogle ScholarDigital LibraryDigital Library
  86. Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model cards for model reporting. In Proceedings of the conference on fairness, accountability, and transparency. 220--229.Google ScholarGoogle ScholarDigital LibraryDigital Library
  87. Robert C. Moore and William Lewis. 2010. Intelligent Selection of Language Model Training Data. In Proceedings of the ACL 2010 Conference Short Papers. Association for Computational Linguistics, Uppsala, Sweden, 220--224. https://www.aclweb.org/anthology/P10-2041Google ScholarGoogle Scholar
  88. Kevin L. Nadal. 2018. Microaggressions and Traumatic Stress: Theory, Research, and Clinical Treatment. American Psychological Association. https://books.google.com/books?id=ogzhswEACAAJGoogle ScholarGoogle Scholar
  89. Clifford Nass, Jonathan Steuer, and Ellen R Tauber. 1994. Computers are social actors. In Proceedings of the SIGCHI conference on Human factors in computing systems. 72--78.Google ScholarGoogle ScholarDigital LibraryDigital Library
  90. Lisa P. Nathan, Predrag V. Klasnja, and Batya Friedman. 2007. Value Scenarios: A Technique for Envisioning Systemic Effects of New Technologies. In CHI'07 Extended Abstracts on Human Factors in Computing Systems. ACM, 2585--2590.Google ScholarGoogle ScholarDigital LibraryDigital Library
  91. Wilhelmina Nekoto, Vukosi Marivate, Tshinondiwa Matsila, Timi Fasubaa, Taiwo Fagbohungbe, Solomon Oluwole Akinola, Shamsuddeen Muhammad, Salomon Kabongo Kabenamualu, Salomey Osei, Freshia Sackey, Rubungo Andre Niyongabo, Ricky Macharm, Perez Ogayo, Orevaoghene Ahia, Musie Meressa Berhe, Mofetoluwa Adeyemi, Masabata Mokgesi-Selinga, Lawrence Okegbemi, Laura Martinus, Kolawole Tajudeen, Kevin Degila, Kelechi Ogueji, Kathleen Siminyu, Julia Kreutzer, Jason Webster, Jamiil Toure Ali, Jade Abbott, Iroro Orife, Ignatius Ezeani, Idris Abdulkadir Dangana, Herman Kamper, Hady Elsahar, Goodness Duru, Ghollah Kioko, Murhabazi Espoir, Elan van Biljon, Daniel Whitenack, Christopher Onyefuluchi, Chris Chinenye Emezue, Bonaventure F. P. Dossou, Blessing Sibanda, Blessing Bassey, Ayodele Olabiyi, Arshath Ramkilowan, Alp Öktem, Adewale Akinfaderin, and Abdallah Bashir. 2020. Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, 2144--2160. https://doi.org/10.18653/v1/2020.findings-emnlp.195Google ScholarGoogle Scholar
  92. Maggie Nelson. 2015. The Argonauts. Graywolf Press, Minneapolis.Google ScholarGoogle Scholar
  93. Timothy Niven and Hung-Yu Kao. 2019. Probing Neural Network Comprehension of Natural Language Arguments. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, 4658--4664. https://doi.org/10.18653/v1/P19-1459Google ScholarGoogle ScholarCross RefCross Ref
  94. Safiya Umoja Noble. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.Google ScholarGoogle Scholar
  95. Debora Nozza, Federico Bianchi, and Dirk Hovy. 2020. What the [MASK]? Making Sense of Language-Specific BERT Models. arXiv:2003.02912 [cs.CL]Google ScholarGoogle Scholar
  96. David Ortiz, Daniel Myers, Eugene Walls, and Maria-Elena Diaz. 2005. Where do we stand with newspaper data? Mobilization: An International Quarterly 10, 3 (2005), 397--419.Google ScholarGoogle Scholar
  97. Charlotte Pennington, Derek Heim, Andrew Levy, and Derek Larkin. 2016. Twenty Years of Stereotype Threat Research: A Review of Psychological Mediators. PloS one 11 (01 2016), e0146487. https://doi.org/10.1371/journal.pone.0146487Google ScholarGoogle Scholar
  98. Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, 1532--1543. https://doi.org/10.3115/v1/D14-1162Google ScholarGoogle ScholarCross RefCross Ref
  99. Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics, New Orleans, Louisiana, 2227--2237. https://doi.org/10.18653/v1/N18-1202Google ScholarGoogle ScholarCross RefCross Ref
  100. Pew. 2018. Internet/Broadband Fact Sheet. (2 2018). https://www.pewinternet. org/fact-sheet/internet-broadband/Google ScholarGoogle Scholar
  101. Aidan Pine and Mark Turin. 2017. Language Revitalization. Oxford Research Encyclopedia of Linguistics.Google ScholarGoogle Scholar
  102. Francesca Polletta. 1998. Contending stories: Narrative in social movements. Qualitative sociology 21, 4 (1998), 419--446.Google ScholarGoogle Scholar
  103. Vinodkumar Prabhakaran, Ben Hutchinson, and Margaret Mitchell. 2019. Perturbation Sensitivity Analysis to Detect Unintended Model Biases. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 5740--5745. https://doi.org/10.18653/v1/D19-1578Google ScholarGoogle Scholar
  104. Laura Pulido. 2016. Flint, environmental racism, and racial capitalism. Capitalism Nature Socialism 27, 3 (2016), 1--16.Google ScholarGoogle ScholarCross RefCross Ref
  105. Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, and Xuanjing Huang. 2020. Pre-trained Models for Natural Language Processing: A Survey. arXiv:2003.08271 [cs.CL]Google ScholarGoogle Scholar
  106. Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.Google ScholarGoogle Scholar
  107. Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research 21, 140 (2020), 1--67. http://jmlr.org/papers/v21/20-074.htmlGoogle ScholarGoogle Scholar
  108. Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Austin, Texas, 2383--2392. https://doi.org/10.18653/v1/D16-1264Google ScholarGoogle ScholarCross RefCross Ref
  109. Sarah T. Roberts, Joel Tetreault, Vinodkumar Prabhakaran, and Zeerak Waseem (Eds.). 2019. Proceedings of the Third Workshop on Abusive Language Online. Association for Computational Linguistics, Florence, Italy. https://www.aclweb.org/anthology/W19-3500Google ScholarGoogle Scholar
  110. Anna Rogers, Olga Kovaleva, and Anna Rumshisky. 2021. A Primer in BERTology: What We Know About How BERT Works. Transactions of the Association for Computational Linguistics 8 (2021), 842--866.
  111. Ronald Rosenfeld. 2000. Two decades of statistical language modeling: Where do we go from here? Proc. IEEE 88, 8 (2000), 1270--1278.
  112. Corby Rosset. 2020. Turing-NLG: A 17-billion-parameter language model by Microsoft. Microsoft Blog (2020).
  113. Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019).
  114. Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, and Yejin Choi. 2020. Social Bias Frames: Reasoning about Social and Power Implications of Language. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 5477--5490. https://doi.org/10.18653/v1/2020.acl-main.486
  115. Roy Schwartz, Jesse Dodge, Noah A. Smith, and Oren Etzioni. 2020. Green AI. Commun. ACM 63, 12 (Nov. 2020), 54--63. https://doi.org/10.1145/3381831
  116. Sabine Sczesny, Janine Bosak, Daniel Neff, and Birgit Schyns. 2004. Gender stereotypes and the attribution of leadership traits: A cross-cultural comparison. Sex Roles 51, 11-12 (2004), 631--645.
  117. Claude Elwood Shannon. 1949. The Mathematical Theory of Communication. University of Illinois Press, Urbana.
  118. Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Zhewei Yao, Amir Gholami, Michael W. Mahoney, and Kurt Keutzer. 2019. Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT. arXiv:1909.05840 [cs.CL]
  119. Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng. 2019. The Woman Worked as a Babysitter: On Biases in Language Generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 3407--3412. https://doi.org/10.18653/v1/D19-1339
  120. Katie Shilton, Jes A Koepfler, and Kenneth R Fleischmann. 2014. How to see values in social computing: methods for studying values dimensions. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing. 426--435.
  121. Joonbo Shin, Yoonhyung Lee, and Kyomin Jung. 2019. Effective Sentence Scoring Method Using BERT for Speech Recognition. In Asian Conference on Machine Learning. 1081--1093.
  122. Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. 2019. Megatron-LM: Training multi-billion parameter language models using GPU model parallelism. arXiv preprint arXiv:1909.08053 (2019).
  123. Irene Solaiman, Miles Brundage, Jack Clark, Amanda Askell, Ariel Herbert-Voss, Jeff Wu, Alec Radford, Gretchen Krueger, Jong Wook Kim, Sarah Kreps, et al. 2019. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203 (2019).
  124. Karen Spärck Jones. 2004. Language modelling's generative model: Is it rational? Technical Report. Computer Laboratory, University of Cambridge.
  125. Robyn Speer. 2017. ConceptNet Numberbatch 17.04: better, less-stereotyped word vectors. (2017). Blog post, https://blog.conceptnet.io/2017/04/24/conceptnet-numberbatch-17-04-better-less-stereotyped-word-vectors/.
  126. Steven J. Spencer, Christine Logel, and Paul G. Davies. 2016. Stereotype Threat. Annual Review of Psychology 67, 1 (2016), 415--437. https://doi.org/10.1146/annurev-psych-073115-103235 PMID: 26361054.
  127. Katrina Srigley and Lorraine Sutherland. 2019. Decolonizing, Indigenizing, and Learning Biskaaybiiyang in the Field: Our Oral History Journey. The Oral History Review (2019).
  128. Greg J. Stephens, Lauren J. Silbert, and Uri Hasson. 2010. Speaker-listener neural coupling underlies successful communication. Proceedings of the National Academy of Sciences 107, 32 (2010), 14425--14430. https://doi.org/10.1073/pnas.1008662107
  129. Emma Strubell, Ananya Ganesh, and Andrew McCallum. 2019. Energy and Policy Considerations for Deep Learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 3645--3650.
  130. Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, and Hua Wu. 2019. ERNIE: Enhanced Representation through Knowledge Integration. arXiv:1904.09223 [cs.CL]
  131. Yu Sun, Shuohuan Wang, Yu-Kun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang. 2020. ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding. In The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020. AAAI Press, 8968--8975. https://aaai.org/ojs/index.php/AAAI/article/view/6428
  132. Yi Chern Tan and L Elisa Celis. 2019. Assessing social and intersectional biases in contextualized word representations. In Advances in Neural Information Processing Systems. 13230--13241.
  133. Ian Tenney, Dipanjan Das, and Ellie Pavlick. 2019. BERT Rediscovers the Classical NLP Pipeline. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, 4593--4601. https://doi.org/10.18653/v1/P19-1452
  134. Trieu H. Trinh and Quoc V. Le. 2019. A Simple Method for Commonsense Reasoning. arXiv:1806.02847 [cs.AI]
  135. Marlon Twyman, Brian C Keegan, and Aaron Shaw. 2017. Black Lives Matter in Wikipedia: Collective memory and collaboration around online social movements. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 1400--1412.
  136. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems. 5998--6008.
  137. Rob Voigt, David Jurgens, Vinodkumar Prabhakaran, Dan Jurafsky, and Yulia Tsvetkov. 2018. RtGender: A Corpus for Studying Differential Responses to Gender. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan. https://www.aclweb.org/anthology/L18-1445
  138. Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. 2018. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. Association for Computational Linguistics, Brussels, Belgium, 353--355. https://doi.org/10.18653/v1/W18-5446
  139. Zeerak Waseem, Thomas Davidson, Dana Warmsley, and Ingmar Weber. 2017. Understanding Abuse: A Typology of Abusive Language Detection Subtasks. In Proceedings of the First Workshop on Abusive Language Online. Association for Computational Linguistics, Vancouver, BC, Canada, 78--84. https://doi.org/10.18653/v1/W17-3012
  140. Joseph Weizenbaum. 1976. Computer Power and Human Reason: From Judgment to Calculation. WH Freeman & Co.
  141. Monnica T Williams. 2019. Psychology Cannot Afford to Ignore the Many Harms Caused by Microaggressions. Perspectives on Psychological Science 15 (2019), 38--43.
  142. Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, and Alexander Rush. 2020. Transformers: State-of-the-Art Natural Language Processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Online, 38--45. https://doi.org/10.18653/v1/2020.emnlp-demos.6
  143. World Bank. 2018. Individuals Using the Internet. (2018). https://data.worldbank.org/indicator/IT.NET.USER.ZS?end=2017&locations=US&start=2015
  144. Shijie Wu and Mark Dredze. 2020. Are All Languages Created Equal in Multilingual BERT? In Proceedings of the 5th Workshop on Representation Learning for NLP. Association for Computational Linguistics, Online, 120--130. https://doi.org/10.18653/v1/2020.repl4nlp-1.16
  145. Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, and Haifeng Wang. 2020. ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation. arXiv preprint arXiv:2001.11314 (2020).
  146. Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, and Ming Zhou. 2020. BERT-of-Theseus: Compressing BERT by Progressive Module Replacing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Online, 7859--7869. https://doi.org/10.18653/v1/2020.emnlp-main.633
  147. Peng Xu, Chien-Sheng Wu, Andrea Madotto, and Pascale Fung. 2019. Clickbait? Sensational Headline Generation with Auto-tuned Reinforcement Learning. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 3065--3075. https://doi.org/10.18653/v1/D19-1303
  148. Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel. 2020. mT5: A massively multilingual pre-trained text-to-text transformer. arXiv:2010.11934 [cs.CL]
  149. Wei Yang, Yuqing Xie, Aileen Lin, Xingyu Li, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin. 2019. End-to-End Open-Domain Question Answering with BERTserini. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations). Association for Computational Linguistics, Minneapolis, Minnesota, 72--77. https://doi.org/10.18653/v1/N19-4013
  150. Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R Salakhutdinov, and Quoc V Le. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems. 5753--5763.
  151. Ze Yang, Can Xu, Wei Wu, and Zhoujun Li. 2019. Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 5077--5089. https://doi.org/10.18653/v1/D19-1512
  152. Meg Young, Lassana Magassa, and Batya Friedman. 2019. Toward Inclusive Tech Policy Design: A Method for Underrepresented Voices to Strengthen Tech Policy Documents. Ethics and Information Technology (2019), 1--15.
  153. Ofir Zafrir, Guy Boudoukh, Peter Izsak, and Moshe Wasserblat. 2019. Q8BERT: Quantized 8Bit BERT. arXiv:1910.06188 [cs.CL]
  154. Nico Zazworka, Rodrigo O. Spínola, Antonio Vetrò, Forrest Shull, and Carolyn Seaman. 2013. A Case Study on Effectively Identifying Technical Debt. In Proceedings of the 17th International Conference on Evaluation and Assessment in Software Engineering (Porto de Galinhas, Brazil) (EASE '13). Association for Computing Machinery, New York, NY, USA, 42--47. https://doi.org/10.1145/2460999.2461005
  155. Rowan Zellers, Yonatan Bisk, Roy Schwartz, and Yejin Choi. 2018. SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, 93--104. https://doi.org/10.18653/v1/D18-1009
  156. Haoran Zhang, Amy X Lu, Mohamed Abdalla, Matthew McDermott, and Marzyeh Ghassemi. 2020. Hurtful words: quantifying biases in clinical contextual word embeddings. In Proceedings of the ACM Conference on Health, Inference, and Learning. 110--120.
  157. Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, and Kai-Wei Chang. 2019. Gender Bias in Contextualized Word Embeddings. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 629--634. https://doi.org/10.18653/v1/N19-1064
  158. Li Zhou, Jianfeng Gao, Di Li, and Heung-Yeung Shum. 2020. The Design and Implementation of XiaoIce, an Empathetic Social Chatbot. Computational Linguistics 46, 1 (March 2020), 53--93. https://doi.org/10.1162/coli_a_00368

    • Published in
      FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency
      March 2021
      899 pages
      ISBN:9781450383097
      DOI:10.1145/3442188

      Copyright © 2021 Owner/Author

      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 March 2021

      Qualifiers

      • Article
      • Research
      • Refereed limited