I’d much rather the slop generator wastes its time doing these repetitive and boring tasks so I can spend my time doing something more interesting.
To me, this is sort of a code smell. I’m not going to say that every single bit of work that I have done is unique and engaging, but I think that if a lot of code being written is boring and repetitive, it’s probably not engineered correctly.
It’s easy for me to be flippant and say this, and you’d be totally right to point that out. I just felt like getting it out of my head.
If most of the code you write is meaningful code that’s novel and interesting, then you are incredibly privileged. The majority of code I’ve seen in the industry is boring, and a lot of it is just boilerplate.
This is possible but I doubt it. It’s your usual CRUD web application with some business logic and some async workers.
So then you do write a bunch of boilerplate such as HTTP endpoints, database queries, and so on.
Not really. It’s Django and Django Rest Framework, so there really isn’t a lot of boilerplate. That’s all hidden behind the framework.
I’d argue that most of the code is conceptually boilerplate, even when you have a framework to paper over it. There’s really nothing exciting about declaring an HTTP endpoint that’s going to slurp some JSON, massage it a bit, and shove it in your db. It’s a boring, repetitive task, and I’m happy to let a tool do it for me.
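To make it concrete, here’s roughly the kind of endpoint I mean, sketched with Flask and sqlite3 purely for illustration (the `/widgets` endpoint and its one-column schema are made up):

```python
import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)
conn = sqlite3.connect("widgets.db", check_same_thread=False)
conn.execute("CREATE TABLE IF NOT EXISTS widgets (name TEXT)")

@app.route("/widgets", methods=["POST"])
def create_widget():
    payload = request.get_json()            # slurp some JSON
    name = payload.get("name", "").strip()  # massage it a bit
    if not name:
        return jsonify({"error": "name is required"}), 400
    # shove it in your db
    conn.execute("INSERT INTO widgets (name) VALUES (?)", (name,))
    conn.commit()
    return jsonify({"status": "created"}), 201

if __name__ == "__main__":
    app.run()
```

Every app has a dozen of these, each differing only in the table name and the massaging step.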
What I’m trying to say is that for Django, especially Django Rest Framework, you don’t even declare endpoints.
DRF has a `ModelViewSet` where you just create a class, inherit from MVS, and set the `model` to point to your Django ORM model, and that’s it. `ModelViewSet` already has all the implementation code for handling `POST`, `PUT`, `PATCH`, and `DELETE`.
There is no boilerplate.
There isn’t anything that an LLM would add to this process.
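For anyone who hasn’t seen it, here’s a minimal sketch of that pattern (the `Article` model is made up, and note that current DRF versions want a `queryset` and `serializer_class` rather than a bare `model` attribute):

```python
from rest_framework import serializers, viewsets
from rest_framework.routers import DefaultRouter
from myapp.models import Article  # hypothetical Django ORM model

class ArticleSerializer(serializers.ModelSerializer):
    class Meta:
        model = Article     # point the serializer at the ORM model
        fields = "__all__"

class ArticleViewSet(viewsets.ModelViewSet):
    # ModelViewSet supplies list/retrieve/create/update/partial_update/destroy,
    # i.e. GET, POST, PUT, PATCH, and DELETE, with no handler code written here.
    queryset = Article.objects.all()
    serializer_class = ArticleSerializer

# The router generates the URL patterns too, so no endpoints are declared by hand.
router = DefaultRouter()
router.register(r"articles", ArticleViewSet)
urlpatterns = router.urls
```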
I’ve used Django before and I disagree. 🤷
Absolutely, coders should be spending time developing new and faster algorithms, things that AI cannot do, not figuring out the boilerplate of a dropdown menu on whatever framework. Heck, we don’t even need frameworks with AI.
It’s more that the iterative slop generation is pretty energy intensive when you scale it up like this. Tons of tokens in memory, multiple iterations of producing slop, running tests to tell it’s slop, and starting it over again automatically. I’d love the time savings as well. I’m just saying we should keep the waste aspect in mind, as it’s bound to catch up with us.
I don’t really find the waste argument terribly convincing myself. The amount of waste depends on how many tries it needs to get the answer, and how much previous work it can reuse. The quality of output has already improved dramatically, and there’s no reason to expect that it won’t keep improving over time. Meanwhile, there’s every reason to expect that the iterative loop will continue to be optimized as well.
In a broader sense, we waste power all the time on all kinds of things. Think of all the ads, crypto, or consumerism in general. There’s nothing uniquely wasteful about LLMs, and at least they can be put towards producing something of value, unlike many things our society wastes energy on.
I do think there’s something uniquely wasteful about floating point arithmetic, which is why we need specialized processors for it, and there is something uniquely wasteful about crypto and LLMs, both in terms of electricity and in terms of waste heat. I agree that generative AI for solving problems is definitely better than crypto, and it’s better than using generative AI to produce creative works, do advertising and marketing, etc.
But it’s not without its externalities, and putting that in an unmonitored iterative loop at scale requires us to at least consider the costs.
Eventually we most likely will see specialized chips for this, and there are already analog chips being produced for neural networks which are a far better fit. There are selection pressures to improve this tech even under capitalism, since companies running models end up paying for the power usage. And then we have open source models with people optimizing them to run locally. Personally, I find it mind blowing that we can already run local models on a laptop that perform roughly as well as models that required a whole data centre just a year ago. It’s hard to say whether improvements will start to plateau once all the low-hanging fruit is picked, but so far it’s been really impressive to watch.
Yeah, there is something to be said for changing the hardware. Producing the models is still expensive even if running the models is becoming more efficient. But DeepSeek shows us even production is becoming more efficient.
What’s impressive to me is how useful the concept of the stochastic parrot is turning out to be. It doesn’t seem to make a lot of sense, at first or even second glance, that choosing the most probable next word in a sentence based on the statistical distribution of word usages across a training set would actually be all that useful.
I’ve used it for coding before and it’s obvious that these things are most useful at reproducing code tutorials or code examples and not at all for reasoning, but there’s a lot of code examples and tutorials out there that I haven’t read yet and never will read. The ability of a stochastic parrot to reproduce that code using human language as its control input is impressive.
I’ve been amazed by this idea ever since I learned about Markov chains, and arguably LLMs aren’t fundamentally different in nature. It’s simply a huge token space encoded in a multidimensional matrix, but the fundamental idea is the same. It’s really interesting how you start getting emergent properties when you scale something conceptually simple up. It might say something about the nature of our own cognition as well.
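For anyone who wants to see how conceptually simple the core idea is, here’s a toy word-level Markov chain. The analogy is loose: raw follower counts stand in for the learned, context-sensitive distribution an LLM uses, but the “sample the next token from a distribution” loop is the same shape.

```python
import random
from collections import defaultdict

def build_chain(tokens, order=1):
    """Map each state (a tuple of `order` tokens) to every token observed after it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        chain[state].append(tokens[i + order])
    return chain

def generate(chain, length=15):
    """Walk the chain, sampling each next token in proportion to observed counts."""
    state = random.choice(list(chain))
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # dead end: this state only appeared at the corpus's end
            break
        out.append(random.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the cat".split()
print(generate(build_chain(corpus, order=1)))
```

Bumping `order` up trades variety for coherence; an LLM’s context window plays a distantly related role, except the “table” is compressed into learned weights instead of being stored explicitly.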
You mentioned Markov chains; as a layman with regard to mathematics (one would need to brush up on basic calculus), would you know any good books (I was thinking textbooks?) or resources to better understand the maths, with a view to gaining a better understanding of LLMs/GenAI later down the line?
A few books that are fairly accessible depending on your math level.
Basic Math for AI is written for people with no prior AI or advanced math knowledge. It aims to demystify the essential mathematics needed for AI, and gives a broad beginner-friendly introduction.
https://www.goodreads.com/book/show/214340546-basic-math-for-ai
Mathematics for Machine Learning is a bit more academic than Hinton’s book, and it covers linear algebra, vector calculus, probability, and optimization, which are the pillars of LLM math.
https://www.goodreads.com/book/show/50419441-mathematics-for-machine-learning
Naked Statistics: Stripping the Dread from the Data is phenomenal for building an intuitive understanding of probability and statistics, which are often the most intimidating subjects for beginners.
https://www.goodreads.com/book/show/17986418-naked-statistics