Sema is pleased to share updated Standards for GenAI Use and Risk Management in the software development lifecycle (SDLC).
These Standards are a tool for CTOs, the C-Suite, Boards of Directors, other engineering leaders and especially developers to mitigate the risks while capturing the significant benefits of GenAI coding tools.
Sema will continue to publish and update these Standards as best practices develop and emerge along with the rise in AI adoption.
We welcome your feedback and input.
GenAI Code Use and Risk Management Standards
Q2 2024: Last update on May 8
Part 1 of 2: GenAI-Originated Code
Definitions:
- Included: Code that originated with a GenAI tool, as opposed to code created by in-house developers.
- Not included: Code written in-house, copied from an external source such as Open Source or Google/Stack Overflow, or automatically generated.
Standards:
- Strength: 5-20% of the codebase
- Low Risk: <5%
- Medium Risk: 20-50%
- High Risk: >50%
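As an illustration, the Part 1 bands above can be expressed as a small lookup. This is a minimal sketch, not Sema's tooling: the function name `classify_genai_originated` is hypothetical, and because the published bands share their endpoints (20% and 50%), the sketch assumes a boundary value falls into the lower-risk band.

```python
def classify_genai_originated(share_pct: float) -> str:
    """Map the GenAI-originated share of a codebase (in percent) to a Part 1 band.

    Assumption: values exactly on a shared boundary (20, 50) are assigned
    to the lower-risk band; the Standards do not specify this.
    """
    if not 0 <= share_pct <= 100:
        raise ValueError("share_pct must be between 0 and 100")
    if share_pct < 5:
        return "Low Risk"      # under-use of GenAI is itself rated Low Risk
    if share_pct <= 20:
        return "Strength"
    if share_pct <= 50:
        return "Medium Risk"
    return "High Risk"


if __name__ == "__main__":
    for pct in (3, 12, 35, 60):
        print(f"{pct}% -> {classify_genai_originated(pct)}")
```

Note that the bands are not monotonic: a share below 5% is rated Low Risk rather than Strength, reflecting the risk of not using GenAI enough.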
Part 2 of 2: Pure GenAI Code
Definitions:
- Pure GenAI code is code that originated with a GenAI tool and was not modified by developers afterwards.
- By contrast, Blended GenAI code was modified by a developer.
Standards:
- Strength: <10% of the codebase
- Low Risk: 10-15%
- Medium Risk: 15-25%
- High Risk: >25%
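A worked example may help with the Part 2 thresholds. The sketch below is illustrative only and rests on two assumptions that the Standards do not state: the share of the codebase is measured by line count, and a value on a shared boundary (15%, 25%) falls into the lower-risk band. The function names are hypothetical.

```python
def pure_genai_share(pure_lines: int, total_lines: int) -> float:
    """Return the Pure GenAI share of a codebase as a percentage.

    Assumption: the share is measured in lines of code.
    """
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    if not 0 <= pure_lines <= total_lines:
        raise ValueError("pure_lines must be between 0 and total_lines")
    return 100.0 * pure_lines / total_lines


def classify_pure_genai(share_pct: float) -> str:
    """Map a Pure GenAI share (in percent) to a Part 2 band.

    Assumption: boundary values (15, 25) go to the lower-risk band.
    """
    if not 0 <= share_pct <= 100:
        raise ValueError("share_pct must be between 0 and 100")
    if share_pct < 10:
        return "Strength"
    if share_pct <= 15:
        return "Low Risk"
    if share_pct <= 25:
        return "Medium Risk"
    return "High Risk"


if __name__ == "__main__":
    # Hypothetical codebase: 1,200 unmodified GenAI lines out of 10,000.
    share = pure_genai_share(1200, 10000)
    print(f"{share:.1f}% -> {classify_pure_genai(share)}")
```

Blended GenAI lines are not counted in `pure_lines`; they count only toward the Part 1 share.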
Discussion/Explanation
- Like Open Source code, GenAI code can significantly increase developer productivity and job satisfaction. Both also carry intellectual property, security, and operational risks that are in scope for technical due diligence.
- The greatest risks from GenAI code usage concern intellectual property defensibility, code security, and code maintainability/quality. For all three, the more the code was written solely by GenAI, without modification by developers (Pure GenAI, as opposed to Blended GenAI), the greater the risk. Therefore, the risk thresholds are lower for Pure GenAI code than for Blended GenAI code.
- There are a few situations where no GenAI use in the SDLC is appropriate, including companies that have not yet approved GenAI use. However, given the substantial benefits of adoption (41X ROI over two years; see AI Working Paper 01), Sema rates using too little GenAI as a Low Risk.
- Engineering teams should develop their own standards for effective GenAI usage. In particular, teams may choose to exceed the thresholds for overall GenAI use. However, those teams should have a well-defended position on security, quality, and IP defensibility to prepare for a future sale or investment, including guidance and controls to blend the code sufficiently.
- Standards are applied across code usage types to guide discussion. However, GenAI usage is expected to be lower in Legacy code, and higher levels of GenAI usage will be acceptable in Proof of Concept code.
Keeping track of global GenAI compliance standards
Periodically, Sema publishes a no-cost newsletter covering new developments in GenAI code compliance. The newsletter shares snapshots and excerpts from Sema’s GenAI Code compliance Database. Topics include recent regulations, lawsuits, stakeholder requirements, mandatory standards, and optional compliance standards. The scope is global.
You can sign up to receive the newsletter here.
About Sema Technologies, Inc.
Sema is the leader in comprehensive codebase scans with over $1T of enterprise software organizations evaluated to inform our dataset. We are now accepting pre-orders for AI Code Monitor, which translates compliance standards into “traffic light warnings” for CTOs leading fast-paced and highly productive engineering teams. You can learn more about our solution by contacting us here.