Introduction
Constantly hunting for new ways to disrupt IT infrastructure, cybercriminals are increasingly shifting their attention from penetrating networks to exploiting software vulnerabilities.
While this trend has been evident for some time, it is becoming significantly more pronounced as software developers' use of artificial intelligence (AI) tools grows, because AI is introducing a growing number of security flaws into software that cybercriminals can readily exploit.
According to a recent survey by Stack Overflow[1], three-quarters of developers are either using or planning to use AI coding tools in their software development lifecycle (SDLC), up from 70% a year ago.
Developers are adopting these tools because of the significant benefits they deliver, including heightened productivity (cited by 81% of respondents), faster learning (62%) and improved efficiency (58%).
Yet, despite these clear advantages, only 42% of developers trust the accuracy of AI output in their workflows. This is all the more concerning given that many developers nonetheless routinely copy and paste insecure code from large language models (LLMs) directly into production codebases.
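To make the risk concrete, here is a minimal, hypothetical illustration of the kind of snippet an assistant might hand back for a user lookup, alongside a safer variant; the table and function names are assumptions made purely for the example.

```python
import sqlite3

def find_user_insecure(conn: sqlite3.Connection, username: str):
    # Typical of naive generated code: the username is interpolated directly
    # into the SQL string, leaving the query open to SQL injection.
    query = f"SELECT id, role FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchone()

def find_user_secure(conn: sqlite3.Connection, username: str):
    # A parameterised query lets the database driver handle escaping,
    # closing the injection hole without changing the behaviour.
    query = "SELECT id, role FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchone()
```

Pasted into a codebase without review, the first version is a textbook injection flaw, yet it looks plausible enough to pass a hurried glance.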
This practice stems from the immense pressure on development teams to produce more code, faster than ever. And because security teams are also overstretched, they cannot provide the same level of scrutiny as before, so overlooked and potentially harmful flaws remain in the codebase.
Creating Deployment-ready Code
The situation creates the potential for widespread disruption. According to the LLM benchmarking service BaxBench[2], LLMs are not yet capable of generating deployment-ready code, which is especially worrying given how widely AI coding tools and agents are now used in enterprise environments.
BaxBench also reports that 62% of solutions produced by even the best models are either incorrect or contain a vulnerability, and that around half of the functionally correct solutions are still insecure.
The bottom line is that, despite the productivity boost, AI coding assistants represent another major threat vector. In response, security leaders must implement safe-usage policies as part of a governance effort.
However, such policies alone will fall far short of raising developers' awareness of the inherent risks. Many developers trust AI-generated code by default and, confident in their proficiency with AI tools, introduce a steady stream of vulnerabilities throughout the SDLC.
They also often lack the expertise to review and validate AI-generated code, a disconnect that further elevates their organisation's risk profile and exposes governance gaps.
To keep the situation from becoming even more dire, chief information security officers (CISOs) must work with other organisational leaders to implement a comprehensive, automated governance plan that enforces policies and guardrails, particularly within the repository workflow.
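As one possible shape for such a guardrail, the sketch below shows a pre-commit-style check that blocks staged changes containing obviously risky patterns. It is only an illustration under assumed conventions: the pattern list is a placeholder, and in practice most teams would wire an established static-analysis or secret-scanning tool into the same hook or CI stage.

```python
#!/usr/bin/env python3
"""Illustrative pre-commit guardrail: block staged Python changes that
contain obviously risky patterns. The deny-list is a placeholder, not a
substitute for a real static-analysis or secret-scanning tool."""
import re
import subprocess
import sys
from pathlib import Path

# Hypothetical deny-list of patterns a governance policy might flag.
RISKY_PATTERNS = {
    "hard-coded secret": re.compile(r"(api_key|password|secret)\s*=\s*['\"][^'\"]+['\"]", re.I),
    "dynamic code execution": re.compile(r"\b(eval|exec)\s*\("),
    "disabled TLS verification": re.compile(r"verify\s*=\s*False"),
}

def staged_files() -> list[str]:
    # Ask git for the files staged in the current commit.
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [f for f in out.stdout.splitlines() if f.endswith(".py")]

def main() -> int:
    findings = []
    for path in staged_files():
        try:
            text = Path(path).read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(text):
                findings.append(f"{path}: possible {label}")
    if findings:
        print("Commit blocked by security guardrail:")
        print("\n".join(findings))
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

The same logic can run as a CI job on pull requests, so the policy is enforced even when local hooks are bypassed.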
To ensure the plan leads to an ideal state of “secure by design” safe-coding practices, CISOs need to build it upon three key components:
- Observability: Effective governance is incomplete without proper oversight. Continuous observability brings granular insights into code health, suspicious patterns and compromised dependencies. To achieve this, both security and development teams need to work together to gain visibility into where AI-generated code is introduced, how developers are managing their tools, and what their overall security process includes.
- Benchmarking: Governance leaders must evaluate developers' security aptitude so they can identify where skills gaps exist. Assessed skills should include the ability to write secure code and to adequately review code created with AI assistance, as well as code obtained from open-source repositories and third-party providers. Ultimately, leaders need to establish trust scores based on continuous, personalised benchmarking-driven evaluations, and use them as baselines for learning programs.
- Education: Once thorough benchmarking is in place, leaders will know where to focus upskilling investments and efforts. Raising developers' awareness of risks gives them a greater appreciation for code review and testing. Education programs should be agile, delivering tools and learning through flexible schedules and formats.
These programs are most successful when they feature hands-on sessions that address the real-world problems developers encounter on the job. Lab exercises can, for example, simulate scenarios in which an AI coding assistant proposes changes to existing code and the developer must review them properly before deciding whether to accept or reject them.
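A sketch of one such exercise is shown below, using hypothetical function names: the original code validates a webhook signature with a constant-time comparison, while the AI-suggested "simplification" quietly replaces it with a plain equality check. The developer's task is to spot the weakened comparison and reject the change.

```python
import hashlib
import hmac

# Original implementation under review.
def signature_is_valid(payload: bytes, received_sig: str, secret: bytes) -> bool:
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison prevents timing attacks on the signature check.
    return hmac.compare_digest(expected, received_sig)

# Hypothetical change proposed by an AI assistant as a "simplification".
def signature_is_valid_suggested(payload: bytes, received_sig: str, secret: bytes) -> bool:
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # A plain equality check leaks timing information, so a careful reviewer
    # should reject this change even though all functional tests still pass.
    return expected == received_sig
```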
In Summary
Development teams face relentless pressure to deliver, yet they must continue to prioritise building secure, high-quality software. Leaders play a crucial role in reinforcing how a Secure by Design approach, supported by observability, benchmarking, and ongoing education, directly enhances code quality.
By embedding these practices, organisations can close governance gaps and fully capture the benefits of AI-driven productivity and efficiency, while reducing the risk of security flaws or costly rework during the SDLC.
[1] https://stackoverflow.blog/2025/01/01/developers-want-more-more-more-the-2024-results-from-stack-overflow-s-annual-developer-survey/
[2] https://baxbench.com/