
GenAI vs. human code: Understanding tradeoffs from a developer perspective

There are few people more intimately connected to technology than software developers.

Dec 14, 2023

While the topic of Artificial Intelligence (AI) has dominated water-cooler conversation almost everywhere, those of us who write code for a living are in a unique position. Even though many of us know far more about AI than the average person, we still quickly find ourselves out of our comfort zones.

Perhaps more importantly, few professions offer such a direct line from tool to benefit. The internet is filled with programmers of all experience levels and specialties helping one another and sharing solutions to problems on sites like Stack Overflow and GitHub. The most robust AI models have been trained extensively on this data and provide developers with solutions in their native programming languages.

I have toyed with AI coding tools for as long as they’ve been available, starting with the earliest incarnations of assistance provided by Visual Studio. While I initially dismissed earlier models as only capable of solving the most trivial problems with poor reliability, my latest experiences have proven that opinion wrong.

With the right approach to prompt engineering, I found that AI provided a major advantage, especially when working in languages in which I am not an expert. A recent programming task (a headless screen-scraping API) in an open source project that would have taken me ~8 hours on my own was sliced down to ~2, and included help on style and semantics (not just syntax).

I don’t view GenAI tools as a threat to great developers. Instead, I believe that they are an incredible way for us to do more enjoyable work and feel more competent while doing so. I’d like to take this opportunity to talk through the opportunities and challenges that come with using them.

What’s out there? 

The current landscape of GenAI coding tools can be difficult to understand in its entirety. Small steps toward code generation have been taken by a variety of Integrated Development Environment (IDE) tools and web portals. Additional strides have been made here, but the largest change over the past year has come from more general-purpose AIs that understand code. Developers typically use these as an additional element of their existing workflows.

We won’t have time to go into every possible platform that’s available to developers, but there are 3 big players in the space as of this writing: 

  1. GitHub Copilot: Copilot was delivered as a result of Microsoft and OpenAI’s partnership. GitHub was already the biggest player in the Software Development Lifecycle (SDLC) and Application Lifecycle Management (ALM) space by a huge margin, and the publicly available code stored in GitHub was a key source of training data for many GenAI models. 
  2. Bard: Alphabet/Google developed AlphaCode as one of the first movers in the code-assistance space. This was a private, experimental model focused on solving competitive programming problems, but the lessons learned from it were doubtless used to improve Bard’s performance in this arena. 
  3. ChatGPT: OpenAI’s leading product is capable of solving simple problems quickly, having ingested data from GitHub and Stack Overflow over many years. 

The market for software-focused GenAI is lucrative — the supply of high-quality developers is limited, and those who are available tend to be expensive! Because of this, the race to serve this market is unbelievably fast-paced and aggressive. What is available and what performs at the highest tier is best tracked in near-real-time via social media such as X, Reddit, and Hacker News.

Use cases 

The most obvious use case for Generative AI as a software developer is writing small snippets of code. Are you working in a language you’re not quite familiar with, but are trying to accomplish a common task? An AI solution is going to do this quickly and cleanly. However, there’s a lot more: 

Problem solving

Even before putting source code into a file for compilation, a significant amount of a software engineer’s time can be spent figuring out an approach. Many expert developers believe this hard-won practice is safe from the reach of AI models, but my experience shows this is not the case.

With careful prompting, entire solution architecture problems can be put to the test through an AI engine. Whether you’re building a serverless system integration with complicated API rate limits, creating a new Salesforce .csv importation scheme, or getting data from an Enterprise Resource Planning (ERP) platform into a data warehouse, AI can help spot opportunities and challenge your assumptions.
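As a concrete illustration, asking an AI about the rate-limit scenario above might produce a throttling sketch like the following. This is a minimal, hypothetical example (the class name and numbers are invented for illustration), not a production design:

```python
import time


class RateLimiter:
    """Illustrative sketch: space out calls to a rate-limited API."""

    def __init__(self, calls_per_second: float):
        self.min_interval = 1.0 / calls_per_second
        self.last_call = 0.0

    def wait(self) -> None:
        """Sleep just long enough to honor the configured rate."""
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()
```

A sketch like this is a starting point for the conversation; a real integration would also need to handle server-side throttling responses and retries, which is exactly the kind of assumption-checking the AI can help with.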

The latest tools can create simple and utilitarian UI designs based on your specifications and desires. Better yet, they can translate these designs directly into code for the browser.

Communication

Try as many developers might to avoid it, communication is a key part of much of our work. A concise sentence or two describing a module can become an expansive section of handoff documentation with the help of AI. Complex concepts can be crunched down into a bulleted list suitable for the business. Diagrams can be created from text, code, or crudely drawn images on a whiteboard.

Development

Last, but not least, is general development. Rote tasks that eat up developer time can be automated away, such as class generation from database models or simple mapping. Letting an AI handle these tasks can help keep a codebase consistent and let an engineer quickly get to the good stuff.
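For example, class generation from a database model is often boilerplate of the following shape. The `customers` table and `Customer` class below are hypothetical, assumed only for illustration:

```python
from dataclasses import dataclass

# Hypothetical table: customers(id INTEGER, name TEXT, email TEXT)


@dataclass
class Customer:
    id: int
    name: str
    email: str

    @classmethod
    def from_row(cls, row: dict) -> "Customer":
        """Map a raw database row onto a typed object."""
        return cls(id=int(row["id"]), name=row["name"], email=row["email"])
```

Writing dozens of these by hand is exactly the rote work an AI assistant can take off your plate while keeping naming and structure consistent.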

AI can review your code as well. There’s no doubt that code reviews performed by humans are an art that should be cultivated (and Sema’s blog posts on the topic are a great place to start). Copilot integration in GitHub reviews code in real time before you open that pull request. Continuous Integration and Deployment (CI/CD) pipelines can be configured to have AI chime in at a deeper level once your code is ready for review.

Considerations 

This all sounds like a fantastic force-multiplying opportunity for developers. But there’s a bit more to it than simply adding AI to the mix.

Prerequisite skills 

Excellent use of AI is similar to knowing exactly how to search Google for what you need, and is a skill you’ll develop with iteration and practice.  

For software engineering, though, you’ll both need and want additional prerequisite skills. If you’re a junior developer, you run the risk of never establishing these foundations if you rely too heavily on AI tools. Having a mid-level knowledge of the following topics is extremely helpful: 

  • Data storage, formats, and movement patterns 
  • Object orientation and domain-driven design 
  • At least one first-class programming language (C/C++, C#, Java, JavaScript, Python, Go, or an equivalent) 
  • Client-server architecture and internet connectivity 
  • Basic algorithms 

Without these building blocks, you’ll have difficulty developing concrete and reasonable solutions to problems. 

Prompt engineering 

Here’s the good news – many of the techniques that you must use as a developer to get the most out of an AI-enabled chatbot will help you better solve software problems in general. The bad news is that prompt engineering requires a more deft hand than you might imagine, given the pace of technological advances. 

The topic of prompt engineering could be a blog post all on its own, so we’ll summarize some key points: 

  • Keep it simple: Even the most advanced AI models with chat or voice interfaces can’t hold highly complex ideas in their “minds”. Break each problem you’re using AI to solve into smaller, bite-size chunks. 
  • Sanitize your inputs: Check to make sure you aren’t submitting any private or personal information. This is obvious for things like names, addresses, and social security numbers, but variables with company/system names or the URLs of proprietary servers are just as bad. Tokenize or substitute such information if it’s needed for the prompt. 
  • Add (limited) context: Humans make the best decisions with vast context; you’ll have to set boundaries for your AI assistant far more aggressively. On the other hand, you’ll get far better results by providing essential context. One to three well-formed sentences should be sufficient. 
  • Avoid slang and ambiguity while focusing on clarity: Adding metaphors/similes or social contexts to questions is a great way to get bad results. AI firms deliberately lobotomize their models’ ability to deal with ambiguity for safety reasons, so don’t contaminate your prompt. Here’s a hack: if you have a word processor with excellent grammar checks or a top-of-the-line tool like Grammarly, use it to clean your prompt before submission! 
  • Add structure to your inputs and output requests: Think of the chatbot as an API. As an intellectual exercise, you’re in the right headspace if you could serialize your question into an interchange format like JSON or XML. Similarly, be specific about what you want to get back and what format you want it in. 
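Putting that last point into practice, the “chatbot as an API” mindset can be sketched in Python. The prompt fields below are invented for illustration; real tools don’t require JSON, but structuring your request this way forces clarity about the task, the constraints, and the output you expect:

```python
import json

# Hypothetical structured prompt: serialize the request as if the
# chatbot were an API endpoint, and spell out the output format.
prompt = {
    "task": "Write a function that deduplicates a list while preserving order",
    "language": "Python",
    "constraints": ["standard library only", "O(n) time"],
    "output_format": "a single fenced code block, no explanation",
}

message = json.dumps(prompt, indent=2)
```

Even if you end up typing the prompt as plain sentences, drafting it in this shape first tends to surface missing constraints before the model ever sees it.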

Copyright and copying code

It can be tempting, after engineering the perfect prompt and receiving a slick output of efficient code, to drag it directly into your IDE. If there’s one thing to take away from this article, however, it’s this: Don’t do it.

If you are working on code that’s really important to your organization, such as customer-facing features or a back-end engine, it’s critical to make each piece of code you pull from an AI tool your own. Depending on how small the snippet is, this may feel like rework, but there are serious reasons to do it. 

First, and most importantly, is copyright. The question of who owns code generated by an AI tool is murky – and that’s putting it politely. For tools integrated into your IDE, like Copilot, things get even more ambiguous. As of this writing, larger organizations are signing very specific contracts governing who owns the output of these tools. Chances are you won’t know where the boundaries are, so it’s best to be safe. 

Second, as a developer, you are responsible for understanding what you’re putting in a codebase. If you haven’t refactored and tested what you’ve pulled from an AI tool, you’re vastly increasing the chances of introducing bugs or security holes. 

Finally, the best codebases are consistent (even if they’re not perfect). Your solution, if anything beyond trivial, will have its own flavor of variable naming conventions, indentation, newline style, and more. The hard work may already be done, but a few extra keystrokes to polish the raw edges of a result are worth it. 

To keep up to date with what’s going on in the world of copyright and AI, the U.S. Copyright Office maintains a well-updated page. Tech-focused outlets like The Verge and Ars Technica cover the topic regularly, as do traditional news sources. Last but certainly not least is social media: relevant communities on Reddit and X are a constant stream of the latest updates, with many users posting daily digests.

Trust, but verify 

What you get from an AI engine can be many things – raw code, advice, links, or art! 

However, there is no such thing as an infallible model. Google’s Bard has a helpful validation feature that tries to corroborate its answers with search results. Still, even this is prone to “hallucination” (the model confidently presenting information it invented) and general inaccuracy. 

Anything you consume from an AI model should be validated against a secondary source. The code you’ve modified and adapted to fit your team’s project must still be debugged and unit-tested before a PR. The .csv with mock data should be thoroughly combed through and spot-validated before loading it into a database. 
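For instance, if an assistant hands you a small helper, exercising it against inputs whose answers you already know takes only seconds. The `dedupe` function below stands in for hypothetical AI output:

```python
# Suppose an AI assistant produced this helper (hypothetical output):
def dedupe(items):
    """Remove duplicates while preserving first-seen order."""
    seen = set()
    return [x for x in items if not (x in seen or seen.add(x))]


# Verify it against cases you can check by hand before opening a PR.
assert dedupe([3, 1, 3, 2, 1]) == [3, 1, 2]
assert dedupe([]) == []
```

A few hand-picked assertions like these won’t replace a proper test suite, but they catch the most common failure mode: code that looks plausible and is simply wrong.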

There are several other approaches you can take. Many of the best AI tools, such as perplexity.ai, cite their sources for review. Skim through these yourself to see if they make sense.

You can also reach out to humans with subject matter expertise to validate your assumptions. Coming to your technical lead with a potential solution to a problem generated by AI will help you test its limitations.

Speaking of testing, ask the AI questions you already have the answers to. This is one of the exercises that really increased my trust in GenAI's capabilities. For me, this involved small programming problems that I use during an interview process that have been tested hundreds of times before. 

Finally, if these simpler methods aren’t available, you can do your own research via search engine to find reliable, human-tested sources: Academic journals, high-quality journalism, or the official documentation for a software tool.

Bottom Line: Your interaction with GenAI should look a lot like any Human-In-The-Loop workflow. The state of the art, as of this writing, still requires true human discernment to be involved every step of the way.
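The shape of that Human-In-The-Loop workflow can be sketched abstractly: generation is cheap, so loop until a human reviewer approves. This toy Python sketch (the function names are invented for illustration) shows only the structure of the loop, not any particular tool:

```python
def human_in_the_loop(generate, review, max_attempts=3):
    """Keep a human review gate between AI output and acceptance.

    generate: produces a candidate artifact (e.g. AI-generated code).
    review:   the human gate; returns the approved artifact, or None to reject.
    """
    for _ in range(max_attempts):
        draft = generate()
        approved = review(draft)
        if approved is not None:
            return approved
    raise RuntimeError(f"no draft approved after {max_attempts} attempts")
```

The important property is that nothing reaches the codebase without passing through `review`; the regeneration loop is just a convenience around that gate.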

Conclusion 

Experimenting with AI coding tools is still a novel exercise for many developers. But whether your organization is rushing headlong into the AI revolution or you’re working independently, you can often halve (or better!) your time to deliver many solutions in code. 

Regardless of how much you dip your toe in the water, it’s important to develop the skills necessary to securely prompt chat-interface AI models, build the core foundations you need to understand what you’re asking for, and protect yourself and your team from untested or copyrighted material. 

Happy coding!

About Sema Technologies, Inc. 

Sema is the leader in comprehensive codebase scans with over $1T of enterprise software organizations evaluated to inform our dataset. We are now accepting pre-orders for AI Code Monitor, which translates compliance standards into “traffic light warnings” for CTOs leading fast-paced and highly productive engineering teams. You can learn more about our solution by contacting us here.

Disclosure

Sema publications should not be construed as legal advice on any specific facts or circumstances. The contents are intended for general information purposes only. To request reprint permission for any of our publications, please use our “Contact Us” form. The availability of this publication is not intended to create, and receipt of it does not constitute, an attorney-client relationship. The views set forth herein are the personal views of the authors and do not necessarily reflect those of the Firm.

Want to learn more?
Learn more about AI Code Monitor with a Demo

Are you ready?

Sema is now accepting pre-orders for GBOMs as part of the AI Code Monitor.
