This is version 2 of the article. I checked the article into source control and CodeRabbit found yet another bug in it, so I have updated it. If you are interested in the diff, contact me.
I had this article written the same way I have Augment Code write features: in a branch that I opened as a pull request on GitHub, where CodeRabbit reviews the changes for security, correctness, and optimization issues.
A case study on the importance of AI-powered code review in catching security flaws that slip through initial development
Introduction
In the rapidly evolving world of AI-assisted development, tools like Augment Code are revolutionizing how we write software. However, as this case study demonstrates, even sophisticated AI code generation tools can produce code with serious security vulnerabilities. This is where AI-powered code review tools like CodeRabbit become invaluable, serving as a critical safety net in the development process.
The Project: Redacted SSH Key Management
The Redacted project is a Flutter application designed for redacted with SSH connectivity and cloud server management. The application includes sophisticated SSH key management functionality, which handles sensitive cryptographic operations and secure data storage.
Initially, much of the security-related code was generated using Augment Code, an advanced AI coding assistant. While the generated code ran without errors and followed good architectural patterns, it contained several critical security vulnerabilities that could have compromised user data and system security.
Critical Security Issues Discovered by CodeRabbit
1. Insecure XOR Cipher Implementation
The Problem:
// Original AI-generated code
final dataBytes = utf8.encode(data);
final keyBytes = utf8.encode(key);
final encrypted = <int>[];
for (int i = 0; i < dataBytes.length; i++) {
  encrypted.add(dataBytes[i] ^ keyBytes[i % keyBytes.length] ^ salt[i % salt.length]);
}
CodeRabbit’s Analysis:
CodeRabbit immediately flagged this as a critical security vulnerability, noting that XOR ciphers are cryptographically weak and easily breakable. The AI reviewer pointed out that this implementation:
- Uses a simple XOR operation that can be easily reversed
- Lacks proper authentication
- Doesn’t provide forward secrecy
- Is vulnerable to known-plaintext attacks
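To make the known-plaintext point concrete, here is a minimal sketch that mirrors the scheme above and then undoes it without the key. The function names (`xorEncrypt`, `recoverKeystream`, `applyKeystream`) are hypothetical and not part of the project. It assumes the attacker knows or guesses a plaintext prefix at least as long as the keystream period, which is the least common multiple of the key and salt lengths. A PEM header such as `-----BEGIN RSA PRIVATE KEY-----` is exactly that kind of predictable prefix.

```dart
import 'dart:convert';

/// Mirrors the original scheme: data XOR key XOR salt, both repeated cyclically.
List<int> xorEncrypt(List<int> data, List<int> key, List<int> salt) {
  return [
    for (var i = 0; i < data.length; i++)
      data[i] ^ key[i % key.length] ^ salt[i % salt.length],
  ];
}

/// Recovers one full period of the combined keystream (key XOR salt) from a
/// known plaintext prefix. Neither the key nor the salt is needed.
List<int> recoverKeystream(
  List<int> ciphertext,
  List<int> knownPrefix,
  int period,
) {
  if (knownPrefix.length < period || ciphertext.length < period) {
    throw ArgumentError('need at least one full period of known plaintext');
  }
  return [
    for (var i = 0; i < period; i++) ciphertext[i] ^ knownPrefix[i],
  ];
}

/// Applies a repeating keystream; because XOR is its own inverse, this
/// decrypts the whole message once a single period is known.
List<int> applyKeystream(List<int> ciphertext, List<int> keystream) {
  return [
    for (var i = 0; i < ciphertext.length; i++)
      ciphertext[i] ^ keystream[i % keystream.length],
  ];
}
```

With an 8-byte key and a 4-byte salt the period is 8, so calling `recoverKeystream(ciphertext, utf8.encode('-----BEGIN RSA PRIVATE KEY-----'), 8)` and passing the result to `applyKeystream` returns the entire plaintext. An attacker who does not know the period can simply try small values until the output looks like a key.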
The Fix:
import 'dart:convert';
import 'dart:math';
import 'dart:typed_data';
import 'package:cryptography/cryptography.dart';
// Secure AES-GCM implementation with proper KDF
Future<Map<String, dynamic>> encryptData(String data, String password) async {
  // Generate cryptographically secure salt and nonce
  final random = Random.secure();
  final salt = Uint8List.fromList(List.generate(16, (_) => random.nextInt(256)));
  final nonce = Uint8List.fromList(List.generate(12, (_) => random.nextInt(256)));
  // Derive key using PBKDF2 with secure parameters
  final pbkdf2 = Pbkdf2(
    macAlgorithm: Hmac.sha256(),
    iterations: 100000, // Tune per device; current OWASP guidance for PBKDF2-HMAC-SHA256 is 600,000
    bits: 256, // 32 bytes for AES-256
  );
  final secretKey = await pbkdf2.deriveKey(
    // package:cryptography 2.x takes the password bytes as a SecretKey
    secretKey: SecretKey(utf8.encode(password)),
    nonce: salt, // the package calls the PBKDF2 salt "nonce"
  );
  // Encrypt using AES-GCM (authenticated encryption)
  final algorithm = AesGcm.with256bits();
  final secretBox = await algorithm.encrypt(
    utf8.encode(data),
    secretKey: secretKey,
    nonce: nonce,
  );
  // Return encrypted data with salt and nonce for storage
  return {
    'ciphertext': base64Encode(secretBox.cipherText),
    'salt': base64Encode(salt),
    'nonce': base64Encode(nonce),
    'mac': base64Encode(secretBox.mac.bytes),
  };
}
Security Note: This implementation uses PBKDF2 with a cryptographically secure salt to derive the encryption key from the password, preventing rainbow table attacks. The AES-GCM mode provides authenticated encryption (AEAD), ensuring both confidentiality and integrity. The salt and nonce are stored alongside the ciphertext and must be preserved for decryption. Never truncate passwords directly for key derivation as this weakens security significantly.
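The note above says the salt and nonce must be preserved for decryption. A matching decryption routine could look like the sketch below. `decryptData` is a hypothetical name, not code from the project. The sketch assumes `package:cryptography` 2.x and the four map keys returned by `encryptData`, and it takes the iteration count as a parameter because that value has to match whatever was used at encryption time.

```dart
import 'dart:convert';
import 'package:cryptography/cryptography.dart';

/// Reverses a record shaped like the one encryptData returns.
/// Hypothetical helper; `iterations` must equal the encryption-time value.
Future<String> decryptData(
  Map<String, dynamic> stored,
  String password, {
  required int iterations,
}) async {
  final salt = base64Decode(stored['salt'] as String);
  final nonce = base64Decode(stored['nonce'] as String);
  final cipherText = base64Decode(stored['ciphertext'] as String);
  final macBytes = base64Decode(stored['mac'] as String);

  // Re-derive the same 256-bit key from the password and the stored salt.
  final pbkdf2 = Pbkdf2(
    macAlgorithm: Hmac.sha256(),
    iterations: iterations,
    bits: 256,
  );
  final secretKey = await pbkdf2.deriveKey(
    secretKey: SecretKey(utf8.encode(password)),
    nonce: salt,
  );

  // decrypt() checks the GCM authentication tag before returning data and
  // throws SecretBoxAuthenticationError on a wrong password or tampering.
  final clearBytes = await AesGcm.with256bits().decrypt(
    SecretBox(cipherText, nonce: nonce, mac: Mac(macBytes)),
    secretKey: secretKey,
  );
  return utf8.decode(clearBytes);
}
```

The authentication failure is the practical difference from the XOR scheme: a modified ciphertext or a wrong password produces an error instead of silently returning garbage.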
2. Insecure Private Key Storage
The Problem:
// Original AI-generated code
final keyData = {
  'n': privateKey.n.toString(),
  'e': privateKey.exponent.toString(),
  'd': privateKey.privateExponent.toString(),
  'p': privateKey.p.toString(),
  'q': privateKey.q.toString(),
};
final keyJson = jsonEncode(keyData);
final keyBase64 = base64Encode(utf8.encode(keyJson));
CodeRabbit’s Analysis:
The AI reviewer identified this as a severe security vulnerability, explaining that:
- Private key components were stored as plain JSON
- No proper ASN.1 encoding was used
- The format was easily readable and extractable
- It violated cryptographic standards (PKCS#1/PKCS#8)
The Fix:
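// Proper PKCS#1 ASN.1 encoding (RSAPrivateKey structure, RFC 8017)
final sequence = ASN1Sequence();
sequence.add(ASN1Integer(BigInt.zero)); // version (0 = two-prime RSA)
sequence.add(ASN1Integer(privateKey.n!)); // modulus
// Assuming PointyCastle's RSAPrivateKey: `exponent` is the private exponent
// there, so the public exponent has to come from `publicExponent`.
sequence.add(ASN1Integer(privateKey.publicExponent!)); // publicExponent
sequence.add(ASN1Integer(privateKey.privateExponent!)); // privateExponent
// ... additional PKCS#1 components
final derBytes = sequence.encodedBytes;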
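The DER bytes produced above are binary, and SSH tooling usually expects them wrapped in PEM armor. The sketch below shows one way that wrapping could be done with only `dart:convert`. `derToPem` is a hypothetical helper and the default label is an assumption based on the PKCS#1 format used here. Note that PEM is an encoding, not encryption, so the result still needs to be encrypted before it is stored, for example with the AES-GCM routine from the first fix.

```dart
import 'dart:convert';

/// Wraps DER bytes in PEM armor: base64 text in 64-character lines between
/// BEGIN/END boundary lines. Hypothetical helper, not the project's code.
String derToPem(List<int> derBytes, {String label = 'RSA PRIVATE KEY'}) {
  final b64 = base64Encode(derBytes);
  final lines = <String>[];
  for (var i = 0; i < b64.length; i += 64) {
    final end = i + 64 > b64.length ? b64.length : i + 64;
    lines.add(b64.substring(i, end));
  }
  return [
    '-----BEGIN $label-----',
    ...lines,
    '-----END $label-----',
    '', // trailing newline
  ].join('\n');
}
```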
3. Invalid SSH Public Key Format
The Problem:
// Original AI-generated code
final keyData = {
  'n': publicKey.n.toString(),
  'e': publicKey.exponent.toString(),
};
final keyJson = jsonEncode(keyData);
final keyBase64 = base64Encode(utf8.encode(keyJson));
return 'ssh-rsa $keyBase64 $comment';
CodeRabbit’s Analysis:
CodeRabbit caught that this implementation:
- Generated invalid SSH keys that wouldn’t work with standard SSH tools
- Used JSON encoding instead of SSH wire format (RFC 4253)
- Lacked proper MPINT encoding for RSA components
- Would be rejected by SSH servers and clients
The Fix:
// Proper SSH wire-format encoding
final buffer = <int>[];
final algorithmBytes = utf8.encode('ssh-rsa');
_writeSSHString(buffer, algorithmBytes);
final exponentBytes = _bigIntToBytes(publicKey.exponent!);
_writeSSHMpint(buffer, exponentBytes);
final modulusBytes = _bigIntToBytes(publicKey.n!);
_writeSSHMpint(buffer, modulusBytes);
final keyBase64 = base64Encode(buffer);
return 'ssh-rsa $keyBase64 $comment';
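The fix above relies on helpers that are not shown. The following is a sketch of what they might look like, based on the `string` and `mpint` data types defined in RFC 4251. The names are hypothetical stand-ins for `_writeSSHString`, `_writeSSHMpint`, and `_bigIntToBytes`, and the sketch only handles non-negative integers, which is all an RSA public key needs.

```dart
/// Appends a big-endian 32-bit length or value.
void writeUint32(List<int> buffer, int value) {
  buffer.addAll([
    (value >> 24) & 0xff,
    (value >> 16) & 0xff,
    (value >> 8) & 0xff,
    value & 0xff,
  ]);
}

/// RFC 4251 "string": uint32 length followed by the raw bytes.
void writeSshString(List<int> buffer, List<int> bytes) {
  writeUint32(buffer, bytes.length);
  buffer.addAll(bytes);
}

/// Minimal big-endian magnitude of a non-negative BigInt (empty for zero).
List<int> bigIntToBytes(BigInt value) {
  if (value.isNegative) {
    throw ArgumentError('negative values are not handled in this sketch');
  }
  final littleEndian = <int>[];
  final mask = BigInt.from(0xff);
  var v = value;
  while (v > BigInt.zero) {
    littleEndian.add((v & mask).toInt());
    v = v >> 8;
  }
  return littleEndian.reversed.toList();
}

/// RFC 4251 "mpint" for a non-negative magnitude: leading zero bytes are
/// stripped, then a single 0x00 is prepended if the top bit is set so the
/// value is not read as negative.
void writeSshMpint(List<int> buffer, List<int> magnitude) {
  var start = 0;
  while (start < magnitude.length && magnitude[start] == 0) {
    start++;
  }
  final trimmed = magnitude.sublist(start);
  final needsPad = trimmed.isNotEmpty && (trimmed.first & 0x80) != 0;
  writeUint32(buffer, trimmed.length + (needsPad ? 1 : 0));
  if (needsPad) buffer.add(0);
  buffer.addAll(trimmed);
}
```

The padding rule is the detail that is easiest to get wrong: an RSA modulus almost always has its top bit set, so without the extra zero byte other SSH implementations would interpret it as a negative number and reject the key.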
4. Insufficient Key Validation
The Problem:
// Original AI-generated code
if (!privateKey.contains('BEGIN') || !privateKey.contains('END')) {
  return Result.failure(SSHException.invalidKey());
}
if (!publicKey.startsWith('ssh-rsa') && !publicKey.startsWith('ssh-ed25519')) {
  return Result.failure(SSHException.invalidKey());
}
CodeRabbit’s Analysis:
The AI reviewer noted that this validation was superficial and missed:
- Actual PEM structure validation
- Base64 content verification
- Key strength requirements (minimum key sizes)
- SSH wire format validation
- Key correspondence verification
The Fix:
// Comprehensive validation with proper parsing
final validationResult = _validatePrivateKeyPEM(privateKey);
if (validationResult.isFailure) return validationResult;
final publicKeyValidation = _validatePublicKeySSH(publicKey);
if (publicKeyValidation.isFailure) return publicKeyValidation;
final strengthValidation = _validateKeyStrength(privateKey, publicKey);
if (strengthValidation.isFailure) return strengthValidation;
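The validators called above are not shown in the article. As an illustration of the kind of checks they might perform, here is a sketch with two hypothetical functions that each return null when the input passes and a reason string otherwise. It only covers `ssh-rsa` public keys and plain PEM blocks without extra header lines, and the 2048-bit minimum is an assumed policy, not a value taken from the project.

```dart
import 'dart:convert';
import 'dart:typed_data';

Uint8List? _tryBase64(String text) {
  try {
    return base64Decode(text);
  } on FormatException {
    return null;
  }
}

/// Checks boundary lines, matching labels, and a decodable base64 body.
String? checkPemStructure(String pem) {
  final lines = const LineSplitter().convert(pem.trim());
  if (lines.length < 3) return 'too short to be a PEM block';
  final begin = RegExp(r'^-----BEGIN ([A-Z0-9 ]+)-----$').firstMatch(lines.first);
  final end = RegExp(r'^-----END ([A-Z0-9 ]+)-----$').firstMatch(lines.last);
  if (begin == null || end == null) return 'missing BEGIN/END boundary lines';
  if (begin.group(1) != end.group(1)) return 'BEGIN and END labels differ';
  final body = lines.sublist(1, lines.length - 1).join();
  if (_tryBase64(body) == null) return 'body is not valid base64';
  return null;
}

/// Parses "ssh-rsa <base64> [comment]" and walks the wire format:
/// string "ssh-rsa", mpint e, mpint n.
String? checkSshRsaPublicKey(String line, {int minBits = 2048}) {
  final parts = line.trim().split(RegExp(r'\s+'));
  if (parts.length < 2) return 'expected "<type> <base64> [comment]"';
  if (parts[0] != 'ssh-rsa') return 'only ssh-rsa is handled in this sketch';
  final blob = _tryBase64(parts[1]);
  if (blob == null) return 'key body is not valid base64';

  final fields = <Uint8List>[];
  var offset = 0;
  while (offset < blob.length) {
    if (offset + 4 > blob.length) return 'truncated length prefix';
    final length = ByteData.sublistView(blob, offset, offset + 4).getUint32(0);
    offset += 4;
    if (offset + length > blob.length) return 'field overruns the key blob';
    fields.add(Uint8List.sublistView(blob, offset, offset + length));
    offset += length;
  }
  if (fields.length != 3) return 'expected 3 fields, found ${fields.length}';
  if (utf8.decode(fields[0], allowMalformed: true) != 'ssh-rsa') {
    return 'embedded key type does not match the prefix';
  }
  // Rebuild the modulus to measure its real bit length (padding excluded).
  final modulus = fields[2].fold<BigInt>(
    BigInt.zero,
    (acc, byte) => (acc << 8) | BigInt.from(byte),
  );
  if (modulus.bitLength < minBits) {
    return 'modulus is ${modulus.bitLength} bits, below $minBits';
  }
  return null;
}
```

The JSON-based key from the previous section fails the second check immediately, because its first four bytes decode to an implausibly large field length. Key correspondence (confirming that the public key belongs to the private key) is not covered here and would need the parsed private key as well.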
The Impact of CodeRabbit’s Review
Security Improvements Achieved
- Encryption Security: Moved from an easily breakable XOR cipher to authenticated AES-256-GCM with a PBKDF2-derived key
- Key Storage: Implemented industry-standard PKCS#1 encoding
- SSH Compatibility: Generated RFC 4253-compliant SSH keys
- Validation Robustness: Added comprehensive security checks
Metrics of Improvement
- Vulnerability Count: All 4 critical security flaws flagged in this review were fixed
- Cryptographic Strength: Improved from trivially breakable to industry-standard
- Standards Compliance: Key encoding now follows PKCS#1 and the SSH wire format (RFC 4253)
- Test Coverage: Added 23 comprehensive test cases covering all security fixes
Lessons Learned
1. AI Code Generation Limitations
While AI tools like Augment Code excel at:
- Generating functionally correct code
- Following architectural patterns
- Implementing complex business logic
- Maintaining code consistency
They can struggle with:
- Cryptographic security best practices
- Industry-specific standards compliance
- Subtle security vulnerabilities
- Context-aware security decisions
2. The Critical Role of AI Code Review
CodeRabbit’s AI-powered review proved invaluable by:
- Catching What Humans Miss: Identifying subtle cryptographic flaws
- Providing Context: Explaining why each issue was a security risk
- Suggesting Solutions: Offering specific remediation strategies
- Ensuring Standards Compliance: Verifying adherence to security standards
3. The Importance of Layered AI Tools
This case demonstrates the power of using multiple AI tools in the development pipeline:
- Generation Phase: Augment Code for rapid development
- Review Phase: CodeRabbit for security and quality assurance
- Testing Phase: AI-assisted test generation for comprehensive coverage
Best Practices for AI-Assisted Secure Development
1. Never Skip Code Review for Security-Critical Code
Even AI-generated code should undergo thorough review, especially for:
- Cryptographic operations
- Authentication mechanisms
- Data storage and transmission
- Input validation and sanitization
2. Use Specialized AI Tools for Security
Consider using AI tools specifically trained on security patterns:
- CodeRabbit for comprehensive code review
- Security-focused static analysis tools
- AI-powered penetration testing tools
3. Implement Comprehensive Testing
Always include:
- Unit tests for cryptographic functions
- Integration tests for security workflows
- Penetration testing for real-world scenarios
- Compliance testing against industry standards
4. Stay Updated on Security Standards
Ensure your AI tools and development practices stay current with:
- Latest cryptographic standards
- Industry best practices
- Emerging threat vectors
- Regulatory requirements
Conclusion
This case study highlights both the promise and the limitations of AI-assisted development. While tools like Augment Code can dramatically accelerate development and generate sophisticated code, they are not infallible when it comes to security.
CodeRabbit’s AI-powered code review proved to be an essential safety net, catching critical security vulnerabilities that could have had serious real-world consequences. The combination of AI code generation followed by AI code review creates a powerful development pipeline that maximizes both speed and security.
The key takeaway is that AI tools should complement, not replace, security-conscious development practices. By leveraging multiple AI tools in a layered approach (generation, review, and testing), development teams can achieve both rapid development velocity and a robust security posture.
As AI continues to evolve and become more sophisticated, we can expect these tools to become even better at generating secure code. However, the principle of defense in depth remains crucial: multiple layers of AI-assisted review and validation will always be more effective than relying on any single tool, no matter how advanced.
The future of secure software development lies not in choosing between human expertise and AI assistance, but in thoughtfully combining both to create development workflows that are faster, more reliable, and more secure than either could achieve alone.