What statements like this are hiding is that basically every LLM from the major players has already reached the end of the training data universe. They’ve absorbed everything there is to scrape. They desperately need new data to build larger models, and that data isn’t being created fast enough.
I called Zoom for support after my login emails didn't arrive. They have a new disclaimer: 'All calls are being recorded and will be used to train our AI. By continuing with this call, you consent.' No, I don't consent. AND there is no opt-out. Yeah, they are absolutely trying to mine real voices.
What’s far more likely is that, as they’re forced to train models on LLM-generated data, the quality of these models drops dramatically. Garbage in, garbage out. They’re in a desperate race to squeeze all the value out of this bubble before it bursts.
Not to mention they're sucking up so much energy that companies are starting to invest in nuclear fusion startups just to train bigger models. This is insanity, all for the purpose of keeping the VC money flowing.