In case you don't know: already knowing what to look for is a very big help in finding vulnerabilities. It's like handing an architect the plans for a house and saying "There may be a problem, but there may be none" versus "There is a problem on the third floor, because last time it collapsed after having to hold over 200 kg." With the first, you won't dig into it very deeply; with the second, you already know where to look and what to look for.
What I understood from the article is that the developer was testing it against a vulnerability they had already found, and the AI only detected it occasionally. It also flagged other, unrelated problems, which, yes, are often false positives, but I gathered from the article that one of them turned out to be a previously undiscovered vulnerability. That's where the developer verifies instead of taking ChatGPT at its word.
Of course I still wouldn't trust it to code the fix, but for pointing out problem areas in code? Even if its effectiveness in practice is marginal, having it sweep a large codebase and come up with a few candidates for a human to verify strikes me as a legitimate use case for AI, roughly along the lines of the sketch below.
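To be concrete about what I mean by "a few candidates for a human to verify", here's a minimal sketch of that triage loop. It assumes the OpenAI Python client with an OPENAI_API_KEY in the environment; the model name, the `src` directory, and the `*.c` glob are all placeholders, not anything from the article:

```python
# Hedged sketch: ask an LLM to flag candidate problem spots in a codebase,
# then leave all verification to a human reviewer.
import pathlib

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are a security reviewer. List up to 3 spots in this file that "
    "deserve a closer manual look for memory-safety or injection bugs, "
    "with line numbers and a one-line reason each. Say 'none' if nothing "
    "stands out."
)

def candidates(path: pathlib.Path) -> str:
    # Truncate naively to stay within context limits; a real tool would chunk.
    source = path.read_text(errors="ignore")[:8000]
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": source},
        ],
    )
    return resp.choices[0].message.content

# Human triage loop: the model only narrows the search space; every hit
# still has to be verified by hand before it counts as a finding.
for f in pathlib.Path("src").rglob("*.c"):
    print(f"--- {f} ---")
    print(candidates(f))
```

The point of the design is that the model's output is never trusted directly; it only decides where a person looks first.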
I mean, if you run it 100 times and spend like 2 months chasing down the false positives, maybe.