Yeah, I have a friend who was a stat major. He talks about how transformers are new and have novel ideas and implementations, but much of the work was held back by limited compute power; much of the math was worked out decades ago. Before AI or ML it was once called Statistical Learning, and there were 2 or so other names as well that were used to rebrand the discipline (I believe for funding, don’t take my word for it).
It’s refreshing to see others talk about its history beyond the last few years. Sometimes I feel like history started yesterday.
Yeah, when I studied computer science 10 years ago, most of the theory implemented in LLMs was already widely known, and the academic literature goes back to at least the early ’90s. Specific techniques may improve the performance of the algorithms, but they won’t fundamentally change their nature.
Obviously most people have none of this context, so they kind of fall for the narrative pushed by the media and the tech companies. They pretend this is totally different from anything seen before, and they deliberately give a wink and a nudge toward sci-fi, blurring the lines between what they created and fictional AGIs. Of course, the two have only the most superficial similarity.
the first implementations go back to the 60s - the neural net approach was abandoned in the 80s because building a large network was impractical and it was unclear how to train anything beyond a simple perceptron. there hadn’t been much progress in decades. that changed in the early oughts, especially when combined with statistical methods. this bore fruit in the teens and gave rise to recent LLMs.