Language evolution 'in silico'
From large-scale data to artificial agents creating languages from scratch
DOI:
https://doi.org/10.7203/metode.15.27692
Keywords:
language, evolution, artificial intelligence, typology, universals
Abstract
We all speak a language and have intuitions about it, from its vocabulary to the way its grammar puts words together. However, much remains to be understood about the processes that make language possible in the first place and those that shape its evolution. Recent computational advances let us address these questions from new angles. This article highlights methods and findings that the age of computation has given rise to, from learning from large-scale data on thousands of languages to studying the evolution of languages created by artificial intelligence.
License
Copyright (c) 2024 CC BY-NC-ND 4.0
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.