The Languages AI (LLMs) Cannot Be Trained to Keep Up With
2 months ago
All languages evolve. Middle English sounds phonetically different to Modern English, and Old French almost sounds like Latin and Italian mixed together.
LLMs trained today cannot predict how languages will evolve and change over time; that's a given. At most, LLMs can preserve and fossilise today's data, and it's possible to preserve that data for another thousand years. But by the year 3000, the preserved language might as well look like Old English to future us.
I mean, take objects like the Filofax, Sony Walkman, floppy disk, Bakelite or aerogramme. Pretty sure some of you reading this might not even know what some of these are. Then we have English pronouns like thou/thee/thy/thine, which actually used to be the informal singular forms of you/you/your/yours.
And now we have new words (or revived/repurposed words) like Skibidi, 6-7, Gooner, Tung Tung Tung Sahur, etc.
But there are languages today that I feel are impossible for LLMs to perfect, no matter how much one tries. I'm talking about indigenous languages that practise avoidance speech.
In layperson terms, some languages replace words with alterations and/or sound changes in response to events, seasons, coming of age, marriage and kinship, but especially upon the death of individuals. Let's say (hypothetically) the Prime Minister of Australia passes away suddenly. For many Indigenous Australian communities, there will be a certain period of time (usually a couple of years) in which the deceased Prime Minister's name cannot be uttered. Speakers of Australian Aboriginal languages like Warlpiri, Yolŋu or Dyirbal (and hundreds more) alter words, or use substitute words, if they want to refer to said deceased PM during the mourning period.
Basically, what I'm trying to say is that AI (LLMs) cannot be trained on languages that evolve so rapidly that each new generation speaks a practically new language. In many small communities in Vanuatu, Solomon Islands and Papua New Guinea, whole villages of as few as 100-200 people usually end up speaking entirely different languages within 2-3 generations, because so many nouns/pronouns can no longer be said due to changing events and the deaths of loved ones.
Yes, for the super conservative folk who hate English pronouns: remember there are languages in Oceania that change their pronouns once every 2-3 generations, or when an individual marries/dies/hits a certain age, etc.
LLMs preserve, but they do not have the social capacity to know what is taboo; a taboo which changes by the generation, if not by the year.
Oh, and don't get me started on creoles and pidgin languages. I'll be impressed if AI data collection can keep up with rapidly evolving creoles and vernaculars like Jamaican Patois or African American Vernacular English.
Sudoku,
~JAF1320

Have you tried asking ChatGPT to translate a sentence into Gen Z slang? 😭
But LLMs cannot predict what Gen Beta slang will sound like; it will be up to the human race to decide that. And even IF LLMs dictated a whole new generation of slang, it would still come down to human volition whether to accept AI's hallucinated slang.