The Hidden Dangers of AI in Coding: What You Need to Know

Watch the video!

If you are coding with ChatGPT or Copilot, you may be creating some terrible security leaks!

A recent Stanford study found that in four out of five tasks, participants assisted by AI wrote less secure code than participants with the same coding experience who worked without it. Worse, the AI-assisted group was significantly more likely to overestimate the security of its code, with roughly a 3.5-fold increase in false confidence about code security.

These security leaks were mostly authentication mistakes, SQL injections, buffer overflows, and symlink vulnerabilities: flaws that can be used to crash a program, execute arbitrary code, or trick a program into reading from or writing to an unintended location.
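
To make the first of these concrete, here is a minimal, hypothetical Python sketch of the kind of SQL injection an assistant can introduce when it builds queries with string formatting, next to the parameterized version that avoids it (the table and column names are made up for illustration, not taken from the study):

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Vulnerable: the username is pasted directly into the SQL string,
    # so an input like "x' OR '1'='1" changes the meaning of the query.
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Safer: a parameterized query lets the driver escape the value,
    # so the input is always treated as data, never as SQL.
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```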

Let’s see how to avoid those threats while still benefiting from the gains of generative AI when coding. But first, let me address one point some of you might still be wondering about: where does gen AI in coding come from, and do the benefits really outweigh the risks?

Why are AIs good at coding? What exists? (ChatGPT, GitHub Copilot, others…)

AI in coding has evolved from simple autocompleters to sophisticated code-generation tools. Initially, AI provided basic syntax suggestions, but advancements in machine learning led to tools like GitHub Copilot, powered by OpenAI Codex. Released in 2021, Codex can generate entire functions and translate natural language into code across various languages, significantly enhancing developer productivity. We can now automate repetitive tasks, perform real-time code analysis, and suggest improvements, helping developers focus on complex problem-solving while maintaining high code quality (The GitHub Blog, GitHub Resources, OpenAI). Much more is coming thanks to agents and complex systems with full access to your code, the internet, and debugging features in IDEs.

Why use AI to code?

So, as you can see, using AI to code is primarily about boosting productivity by automating repetitive tasks: generating the straightforward functions or lines we already know how to write and can describe clearly, but don’t want to retype for every new project. In short, it codes for us.

That’s powerful: it democratizes coding by combining the search for code examples with their adaptation. It’s similar to what developers already do on Stack Overflow or GitHub with open-source code, looking for similar problems or bugs and copying the fix, but more efficient, since AI tools adapt the code to your problem and your current variables for you.

It even helps you pick up new programming languages, translating your ideas into code in languages you are not familiar with and making you productive in them quite fast, though far from securely.

Thanks to that, AI-assisted coding is transforming career paths in software development and becoming an integral part of educational curricula and professional training programs. Just ask any current undergraduate software engineer about Copilot or ChatGPT, and you’ll see.

The issue is: will it create overconfidence and, with it, security leaks? The answer is yes, and for a simple reason: just like with open-source code, you did not write it; someone, or in this case something, else did.

The risks of coding with AI

Such security leaks can happen to anyone, and with any tool. Understanding the risks of coding with AI is the first step toward fixing them: you won’t go looking for leaks or bugs if you trust the system completely, like the overconfident participants in the Stanford study. So before diving into solutions, let’s take a quick look at the biggest risks of using generative AI for coding.

I just want to quickly thank the sponsor of this article, Sema AI, which we will come back to later on. They’ve been a great help in identifying the various risks of using generative AI in coding that we’ll cover now, as well as the solutions to prevent them. I’ll share more about them in a few minutes.

The most obvious one is that current AI systems often generate code based on outdated information, since they may have been trained on data that is now obsolete. This is nothing new; the same issue exists when copying code from Stack Overflow.
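
As an illustration (my own example, not from the study), assistants trained on older snippets will sometimes still reach for APIs that have since been removed; pandas’ DataFrame.append is a well-known case:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ada"], "score": [10]})
new_row = {"name": "Grace", "score": 12}

# Outdated: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0,
# so code generated from older training data may simply no longer run.
# df = df.append(new_row, ignore_index=True)

# Current idiom: build a one-row frame and concatenate it.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
```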

Even though it’s often quite good, unchecked AI-generated code may lack proper documentation, use confusing variable names, and rely on suboptimal algorithms and design patterns, all of which hurt the overall quality of the codebase.
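
A small, made-up example of the difference this makes: the first version is the kind of terse, undocumented output you sometimes get back, while the second says the same thing in a way a reviewer can actually check against the requirement:

```python
# Hard to review: cryptic names, no docstring, intent unclear.
def f(l, t):
    return [x for x in l if x[1] > t]

# Same logic, but readable and verifiable.
def filter_scores_above_threshold(scored_items: list[tuple[str, float]],
                                  threshold: float) -> list[tuple[str, float]]:
    """Return the (name, score) pairs whose score exceeds the threshold."""
    return [item for item in scored_items if item[1] > threshold]
```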

The use of generative AI tools may expose companies to intellectual property (IP) risks, particularly regarding trade secrets and copyright. We see OpenAI and other companies being sued all the time.

It can also hurt a company’s technical credibility during mergers, acquisitions, or investment rounds, something that cannot be overlooked. As part of technical due diligence, the presence of AI-generated code is scrutinized much like the use of open-source components, with potential impacts on IP security and commercial viability. If this isn’t handled properly, it can hurt the whole company.

A last one that will surely apply to you, if it doesn’t already, is developing an overreliance on AI for coding tasks. This erodes a developer’s ability to code independently and innovate creatively, hurting overall programming proficiency. Of course, we’ll always have access to generative AI help from now on, but we still need to understand and follow what’s going on. Otherwise, who’s going to debug it?

How can you mitigate those risks and still code efficiently with AI?

The first steps to mitigate all those risks are (1) to keep learning, reviewing, and understanding each line of code, whether written or generated, and (2) to be fully transparent with your colleagues and managers about what was generated and what wasn’t. Transparency will be key to preventing many issues, especially those related to IP or security.

Beyond these initial steps, it’s crucial to blend AI-generated code with human oversight. Ask the AI to explain its modifications, and take the time to thoroughly understand its output to ensure it aligns with the project’s goals and adheres to security standards. Test features and see what breaks; don’t assume the code will work just because it looks good. It often doesn’t, and it won’t tell you. As always, AI will hallucinate, and that is here to stay. By the way, I just published a book with our team at Towards AI covering pretty much everything you need to know to reduce hallucinations, so learning more about LLMs and how to mitigate hallucination is certainly a good follow-up step here.
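
For the “test it, don’t trust it” part, even a couple of quick unit tests around a generated function go a long way. Here is a minimal sketch using pytest, assuming a hypothetical AI-generated slugify helper (both the helper and the test cases are illustrative):

```python
import re
import pytest

def slugify(title: str) -> str:
    """Hypothetical AI-generated helper: turn a title into a URL slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

@pytest.mark.parametrize("title,expected", [
    ("Hello World", "hello-world"),
    ("  Already--clean ", "already-clean"),
    ("", ""),                      # edge case the AI may not have considered
    ("éàç accents", "accents"),    # non-ASCII is silently dropped: is that intended?
])
def test_slugify(title, expected):
    assert slugify(title) == expected
```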

Speaking of which, work on your prompting techniques. It’s simple but quite useful. Clearly specify the task instructions, include detailed function declarations and helper functions, and give the model more context; context windows now stretch to millions of tokens, so you have room. Most gen AI coding tools also use RAG, which I covered in a video, to look into your codebase, which facilitates the process even more; you just need good, clear prompts to get what you want. You can also play with parameters like the model’s temperature, which can significantly impact the security and correctness of the code, minimizing risks and improving reliability.
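
As a rough sketch of what “clear prompt plus conservative settings” can look like in practice (assuming the OpenAI Python client; the model name, task, and constraints below are placeholders, not a recommendation):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Spell out the task, constraints, and surrounding context instead of a one-liner.
prompt = """Write a Python function `load_config(path: str) -> dict`.
Requirements:
- Parse a TOML file using only the standard library (tomllib).
- Raise FileNotFoundError if the file is missing; do not swallow exceptions.
- Include a docstring and type hints.
Context: this runs inside a CLI tool, Python 3.11, no third-party dependencies."""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,      # lower temperature favors predictable, conservative code
)
print(response.choices[0].message.content)
```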

Then, having done your best to reduce hallucinations and the risk of bugs, I think the last step is to treat AI-generated code with the same scrutiny as code written by human developers, and not just any developer: treat it like a PR from a colleague with questionable coding skills. Bring some critical thinking to it, questioning whether this line is really useful or that function is overkill. When you spot something weird, ask the AI what it is for and whether it can make that part simpler, or go back to the documentation to confirm it’s the right function to use.

Now, when it comes to companies, establishing clear organizational guidelines for AI-generated code is also essential. Set transparent policies that dictate how and where AI can, or even should, be used, make sure these guidelines are understood and followed across the entire team, and ensure people feel safe and not judged for using it. Because they will use it.

Lastly, one of the most important things to do, as always with AI, is to monitor it. Monitoring AI code usage through a system like Sema’s AI Code Monitor can provide invaluable insights into the extent and impact of AI-generated code within your projects. Such tools help teams track AI usage metrics, identify potential issues, and refine their approach to integrating AI in coding. For example, Sema just released a feature that automatically checks for AI-generated code during pull request reviews.

This is especially useful for larger companies, or whenever multiple developers are working together to build a reliable and valuable codebase.

By the way, they are offering a two-week free pilot of the AI Code Monitor, and my viewers get a discount on the first three months if you want to go for a license.

Conclusion

While AI tools offer significant benefits for enhancing coding efficiency, they also introduce various risks, such as increased vulnerability to security flaws and over-reliance on automated systems.

Tools like Sema’s AI Code Monitor can help identify and fix vulnerabilities, ensuring higher code quality and security, but there’s nothing better than double-checking and ensuring you understand the generated code you use.

Thank you for reading. I will see you in the next one!