Experts warn we could run out of data to train AI by 2026
Featured Image Credit: Getty Stock Images
There's no doubt that artificial intelligence is reaching the peak of its popularity with social media users around the world.
I mean, is there anything more fun than sussing out what a celebrity couple's future baby will look like? Or hearing how the late Freddie Mercury might have performed Doja Cat's 'Paint the Town Red'?
But according to some scientists, we could run out of the type of data needed to fully train artificial intelligence as soon as 2026.
A shortage of this data - which fuels powerful AI systems across the globe - could slow the growth of AI models, particularly large language models.
It may even alter the trajectory of the AI revolution.
This data is needed to train accurate, high-quality AI algorithms - ChatGPT, for example, was trained on 570 gigabytes of text data (around 300 billion words).
If there isn't enough data to train such programs (including DALL-E, Lensa and Midjourney), they could produce inaccurate or low-quality outputs.
The quality of the data is also hugely important: though low-quality data (e.g. blurry pictures and social media posts) is easy to source, it isn't good enough to train high-performing AI models.
Also, text taken from social media platforms might be biased or prejudiced, or may include disinformation or illegal content, which an AI model could, in turn, replicate.
This explains why high-quality content such as text from books, online articles, scientific papers, Wikipedia, and certain filtered web content is being sought out.
A group of researchers predicted in a paper published last year that we could be set to run out of this important data by the year 2026 if current AI training trends continue.
They also suspected that low-quality language data could run out between 2030 and 2050.
This has left a multitude of computer and data scientists around the world feeling concerned, given that AI is predicted to contribute up to $15.7 trillion to the world's economy by 2030.
Other experts are reassuring tech users that the situation may not be as bad as it seems, given that there are still hundreds of unknowns about how AI models will develop in the future.
They also say there are ways of addressing potential data shortages, including AI developers improving algorithms so they use data more efficiently.
These scientists say that, in the coming years, it is likely that less data and less computational power will be needed to train high-performance models, which will in turn reduce AI's carbon footprint.
A move towards using AI to create synthetic data for training systems is also being suggested.