Model Evaluation and Threat Research is an AI research charity that looks into the threat of AI agents! That sounds a bit like an AI doomsday cult, and they take funding from the AI doomsday cult organisat…
Reading the paper, AI did a lot better than I would have expected. It showed experienced devs working on a familiar code base got 19% slower. It’s telling that they thought they had been more productive, but the result was not that bad tbh.
I wish we had similar research for experienced devs on unfamiliar code bases, or for inexperienced devs, but those would probably be much harder to measure.
Using a tool that lowers your productivity by roughly a fifth, instead of not using it at all, is “not that bad” to you? A tool that costs an awful lot to run and requires heavy security compromises, too? (Rough arithmetic below.)
On the other points… an experienced dev on an unfamiliar code base is a common situation. They just get familiar with the code base over a few weeks, and they can base that familiarity on actual code, not on hallucinated summaries, which would be meaningless.
Inexperienced devs are not going to improve with something else doing the job for them, and an LLM without guidance is not able to do any substantial work without deviating into garbage first and broken garbage second. We already have data on that too.
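
An aside on the numbers, since “1/5” and “19%” are getting conflated: the study’s figure is that tasks took about 19% longer with AI, which is not quite the same thing as a 19% (or 20%) drop in output. A quick back-of-envelope sketch, assuming the 19% means extra wall-clock time per task:

# Back-of-envelope: convert "tasks take 19% longer" into a throughput
# loss. Assumes the 19% figure is extra wall-clock time per task,
# which is how the study's headline result is usually quoted.
slowdown = 0.19                    # tasks take 1.19x as long
throughput = 1 / (1 + slowdown)    # output per unit time vs. baseline
print(f"throughput: {throughput:.1%} of baseline")  # ~84.0%
print(f"productivity loss: {1 - throughput:.1%}")   # ~16.0%, closer to 1/6 than 1/5

Either way the direction is the same: you would have shipped more by not using the tool.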
Even a 1% slowdown would be pretty bad, since you’d still do better just not using it. 19% is huge!
I don’t understand your point. How is it good that the developers thought they were faster? Does that imply anything at all in LLMs’ favour? IMO that makes the situation worse because we’re not only fighting inefficiency, but delusion.
20% slower is substantial. Imagine the effect on the economy if 20% of all output were discarded (or, more accurately, spent burning electricity).
Yes, it suggests lower cognitive load.
I’m not saying it’s good; I’m saying I expected it to be even worse.