# Security Guide

CQaS includes a security analysis engine that identifies potential vulnerabilities and security anti-patterns in Python code. The engine combines static analysis with pattern matching to detect common security issues before they reach production.

## Security Analysis Features

- **Multi-category Vulnerability Detection**: 8 distinct vulnerability types
- **CVSS v3.1 Scoring**: Industry-standard severity assessment
- **Confidence Ratings**: Reliability assessment for each finding
- **Context-aware Analysis**: Understanding of Python-specific security patterns
- **False Positive Reduction**: Smart filtering to minimise noise
- **Actionable Remediation**: Specific guidance for fixing issues
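
The CVSS v3.1 specification maps numeric base scores to qualitative severity ratings (None, Low, Medium, High, Critical). The following is a minimal sketch of that mapping; `cvss_severity` is a hypothetical helper written for illustration and is not part of the CQaS API.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS scores range from 0.0 to 10.0, got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"      # 0.1 - 3.9
    if score < 7.0:
        return "Medium"   # 4.0 - 6.9
    if score < 9.0:
        return "High"     # 7.0 - 8.9
    return "Critical"     # 9.0 - 10.0


# Example scores chosen for illustration
for score in (9.8, 8.1, 4.3):
    print(score, cvss_severity(score))
```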

## Analysis Scope

The security analyser examines:

- Function calls and method invocations
- String literals and formatting operations
- Import statements and module usage
- Data serialisation and deserialisation
- Template and code generation patterns
- Cryptographic function usage
- Input validation and sanitisation
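
To illustrate how a static analyser can reach the items above, here is a rough sketch that walks a Python syntax tree with the standard `ast` module and collects called names and imported modules. It is not the CQaS implementation; `collect_calls_and_imports` and `SOURCE` are hypothetical names used only for this example.

```python
import ast

# Sample code to analyse; it is parsed, never executed.
SOURCE = '''
import pickle
from os import system

def run(cmd):
    system(cmd)
    return eval(cmd)
'''


def collect_calls_and_imports(source: str):
    """Return (called names, imported modules) found in the source."""
    calls, imports = [], []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):          # eval(...)
                calls.append(func.id)
            elif isinstance(func, ast.Attribute):   # os.system(...)
                calls.append(func.attr)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            imports.append(node.module or "")
    return sorted(calls), sorted(imports)


calls, imports = collect_calls_and_imports(SOURCE)
print(calls)    # ['eval', 'system']
print(imports)  # ['os', 'pickle']
```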

## Vulnerability Categories

### 1. SQL Injection (CVSS Base: 8.1)

SQL injection vulnerabilities occur when user input is incorporated directly into the text of a SQL query instead of being passed as a bound parameter.

#### Detection Patterns

```python
# Detected patterns
sql_query = f"SELECT * FROM users WHERE id = {user_id}"  # High risk
query = "SELECT * FROM table WHERE name = '" + name + "'"  # High risk
cursor.execute("DELETE FROM users WHERE id = %s" % user_id)  # High risk
```

#### Safe Alternatives

```python
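# Parameterised queries (safe); the placeholder style depends on the database driver
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))  # qmark style, e.g. sqlite3
cursor.execute("SELECT * FROM users WHERE id = %(id)s", {"id": user_id})  # pyformat style, e.g. psycopg2

# Using an ORM (safe); Django-style query shown
User.objects.filter(id=user_id)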
```
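
The difference between the two approaches can be observed with the standard-library `sqlite3` module and an in-memory database. This is an illustrative sketch; the table contents and the `hostile` value are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# Hypothetical attacker-controlled input
hostile = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text, so the OR clause matches every row.
unsafe_rows = conn.execute(
    f"SELECT name FROM users WHERE name = '{hostile}'"
).fetchall()

# Safe: the input is bound as a value and compared literally, so nothing matches.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows))  # 2
print(safe_rows)         # []
conn.close()
```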

#### CQaS Detection Logic

- Pattern matching for SQL keywords with string formatting
- Detection of string concatenation in SQL contexts
- Identification of `.format()` usage with SQL strings
- F-string usage with SQL operations
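
The checks listed above could be approximated as follows. This is a simplified sketch built on the standard `ast` and `re` modules, not the actual CQaS detection code; `find_sql_formatting`, `SQL_KEYWORDS`, and `SAMPLE` are hypothetical names, and the keyword list is deliberately short.

```python
import ast
import re

SQL_KEYWORDS = re.compile(r"\b(SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def _literal_text(node):
    """Join every string constant found beneath the node."""
    return " ".join(
        n.value for n in ast.walk(node)
        if isinstance(n, ast.Constant) and isinstance(n.value, str)
    )


def find_sql_formatting(source):
    """Return sorted (line, kind) pairs where SQL text is built dynamically."""
    findings = set()
    for node in ast.walk(ast.parse(source)):
        kind = None
        if isinstance(node, ast.JoinedStr):
            # Only f-strings that actually interpolate a value are of interest
            if any(isinstance(v, ast.FormattedValue) for v in node.values):
                kind = "f-string"
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            kind = "concatenation"
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mod):
            kind = "%-formatting"
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Attribute)
              and node.func.attr == "format"):
            kind = ".format()"
        if kind and SQL_KEYWORDS.search(_literal_text(node)):
            findings.add((node.lineno, kind))
    return sorted(findings)


# Sample code to analyse; it is parsed, never executed.
SAMPLE = '''
q1 = f"SELECT * FROM users WHERE id = {user_id}"
q2 = "SELECT * FROM t WHERE name = '" + name + "'"
q3 = "DELETE FROM users WHERE id = %s" % user_id
q4 = "UPDATE users SET name = '{}'".format(name)
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
'''

for line, kind in find_sql_formatting(SAMPLE):
    print(line, kind)
```

The parameterised call on the last line of `SAMPLE` is not reported, because its SQL text is a plain constant and the value travels separately.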

### 2. Command Injection (CVSS Base: 9.8)

Command injection occurs when untrusted input reaches a shell or a process-spawning call, allowing attackers to execute arbitrary system commands.

#### High-Risk Functions

```python
# Dangerous patterns detected
os.system(f"ls {user_input}")  # Critical
subprocess.call(command, shell=True)  # High risk
subprocess.run(cmd, shell=True)  # High risk
subprocess.Popen(command, shell=True)  # High risk
os.popen(f"grep {pattern} file.txt")  # High risk
```

#### Safe Alternatives

```python
# Safe command execution
subprocess.run(["ls", directory], capture_output=True)  # Safe
subprocess.call(["grep", pattern, "file.txt"])  # Safe

# If a shell cannot be avoided, quote each untrusted argument (POSIX shells only)
import shlex
safe_arg = shlex.quote(user_input)
```
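
The following sketch shows why the argument-list form is safe: without a shell, metacharacters such as `;` have no special meaning and the whole string arrives in the child process as a single argument. The `hostile` value is invented for the example, and the child process is simply the current Python interpreter echoing its first argument.

```python
import shlex
import subprocess
import sys

# Hypothetical attacker-controlled input
hostile = "notes.txt; echo INJECTED"

# Argument list, no shell: the child receives the string verbatim as argv[1].
echo_argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
result = subprocess.run(echo_argv, capture_output=True, text=True, check=True)
received = result.stdout.strip()
print(received == hostile)  # True

# shlex.quote wraps the value so a POSIX shell would treat it as one word.
quoted = shlex.quote(hostile)
print(quoted)  # 'notes.txt; echo INJECTED'
```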

### 3. Code Injection (CVSS Base: 9.3)

Code injection vulnerabilities allow execution of arbitrary Python code.

#### Dangerous Functions

```python
# Critical security risks
eval(user_input)  # Critical
exec(user_code)  # Critical
compile(source, filename, mode)  # High risk

# Dynamic attribute access
getattr(obj, user_attr)  # Medium risk
setattr(obj, user_attr, value)  # Medium risk
```

#### Safe Alternatives

```python
# Safe evaluation alternatives
import ast
def safe_eval(expr):
    return ast.literal_eval(expr)  # Only evaluates literals

# Whitelist-based attribute access
ALLOWED_ATTRS = {'name', 'email', 'age'}
if attr_name in ALLOWED_ATTRS:
    value = getattr(obj, attr_name)
```
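
As a quick check of the `ast.literal_eval` approach, the sketch below parses a literal successfully and confirms that an expression containing a function call is rejected rather than executed. The input strings are examples only.

```python
import ast


def safe_eval(expr):
    return ast.literal_eval(expr)  # Only evaluates literals


# Plain literals (dicts, lists, numbers, strings) are accepted.
parsed = safe_eval("{'retries': 3, 'hosts': ['a', 'b']}")
print(parsed["retries"])  # 3

# Anything beyond a literal, such as a call, raises instead of running.
try:
    safe_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
print(rejected)  # True
```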

### 4. Hardcoded Secrets (CVSS Base: 7.5)

Secrets hardcoded in source code are exposed to anyone who can read the repository or its history, and they are difficult to rotate once leaked.

#### Detection Patterns

```python
# Detected secret patterns
password = "admin123"  # Detected
api_key = "sk-1234567890abcdef"  # Detected
secret_token = "eyJhbGciOiJIUzI1NiJ9..."  # Detected
db_password = "MySecretPassword123"  # Detected

# Base64 encoded secrets
encoded_secret = "YWRtaW46cGFzc3dvcmQ="  # Detected

# Hex patterns
hex_key = "deadbeef12345678"  # Detected if long enough
```
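
One way such patterns might be found is to look for string literals assigned to variables with secret-sounding names. The sketch below is an assumption about the general technique, not the CQaS rule set; `find_hardcoded_secrets`, `SECRET_NAME`, and the `min_length` threshold are hypothetical.

```python
import ast
import re

SECRET_NAME = re.compile(r"(password|passwd|secret|token|api_?key)", re.IGNORECASE)


def find_hardcoded_secrets(source, min_length=6):
    """Return (line, variable) pairs where a string literal is assigned to a secret-like name."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Only plain string literals count; calls such as os.getenv(...) are skipped
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue
        if len(value.value) < min_length:
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and SECRET_NAME.search(target.id):
                findings.append((node.lineno, target.id))
    return sorted(findings)


# Sample code to analyse; it is parsed, never executed.
SAMPLE = '''
password = "admin123"
api_key = "sk-1234567890abcdef"
greeting = "hello world"
db_password = os.getenv("DB_PASSWORD")
'''

print(find_hardcoded_secrets(SAMPLE))  # [(2, 'password'), (3, 'api_key')]
```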

#### Best Practices

```python
# Environment variables (recommended)
import os
password = os.getenv('DB_PASSWORD')
api_key = os.environ['API_KEY']

# Configuration files (not in version control)
import configparser
config = configparser.ConfigParser()
config.read('secrets.ini')
secret = config['DEFAULT']['secret_key']

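# Key management services (the vault URL is a placeholder)
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
client = SecretClient(vault_url="https://<your-vault>.vault.azure.net", credential=DefaultAzureCredential())
secret = client.get_secret("database-password").value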
```
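
When reading secrets from the environment, it can help to fail fast if a variable is missing rather than continue with `None`. A minimal sketch follows; `require_env` and the `EXAMPLE_DB_PASSWORD` variable name are hypothetical.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Required environment variable {name} is not set")
    return value


# Set here only so the example runs on its own; real values come from the deployment environment.
os.environ["EXAMPLE_DB_PASSWORD"] = "placeholder-value"
db_password = require_env("EXAMPLE_DB_PASSWORD")
```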

### 5. Weak Cryptography (CVSS Base: 7.4)

This category covers the use of cryptographically weak algorithms and of random number generators that are not cryptographically secure.

#### Weak Algorithms Detected

```python
# Weak hash functions
import hashlib
md5_hash = hashlib.md5(data)  # Vulnerable
sha1_hash = hashlib.sha1(data)  # Vulnerable

# Insecure random
import random
token = random.random()  # Not cryptographically secure
session_id = random.randint(1000, 9999)  # Predictable
```

#### Secure Alternatives

```python
# Strong hash functions
import hashlib
sha256_hash = hashlib.sha256(data)  # Secure
sha3_hash = hashlib.sha3_256(data)  # Secure

# Cryptographically secure random
import secrets
token = secrets.token_urlsafe(32)  # Secure
random_bytes = secrets.token_bytes(32)  # Secure, 32 random bytes

# Password hashing
import bcrypt
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt())
```
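
Where a third-party dependency such as `bcrypt` is not available, the standard library offers `hashlib.pbkdf2_hmac`. The sketch below pairs it with a random salt and a constant-time comparison. `hash_password` and `verify_password` are hypothetical helpers, and the iteration count is an assumption that should be tuned to current guidance and your hardware.

```python
import hashlib
import hmac
import secrets


def hash_password(password, *, iterations=600_000):
    """Derive a key from the password with PBKDF2-HMAC-SHA256 and a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, *, iterations=600_000):
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))                   # False
```

The salt and iteration count must be stored alongside the digest so the same parameters can be used at verification time.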

### 6. Dangerous Imports (CVSS Base: 4.3)

This category flags imports of modules whose use can lead to security vulnerabilities.

#### Risky Modules

```python
# Deserialisation risks
import pickle  # Can execute arbitrary code
import dill    # Similar risks to pickle
import shelve  # Uses pickle internally

# Command execution
import commands  # Python 2 only (removed in Python 3), command injection risk
import popen2    # Python 2 only (removed in Python 3), command injection risk
```

#### Safer Alternatives

```python
# Safe serialisation
import json
data = json.loads(json_string)  # Safe for simple data

# Structured data
import xml.etree.ElementTree as ET
root = ET.fromstring(xml_data)  # Trusted XML only; the stdlib parser is not hardened against malicious input

# For complex objects
import marshmallow  # Schema validation and serialisation
```

### 7. Unsafe Deserialisation (CVSS Base: 8.8)

Deserialising untrusted data can lead to remote code execution.

#### Dangerous Patterns

```python
# High-risk deserialisation
import pickle
data = pickle.load(file)  # Can execute arbitrary code
obj = pickle.loads(user_data)  # Critical if user_data is untrusted

# Other risky deserialisers
import dill
obj = dill.load(file)

import yaml
data = yaml.load(stream, Loader=yaml.Loader)  # Unsafe loader: can construct arbitrary Python objects
```
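
The reason `pickle` is dangerous is that a pickled object can name any importable callable to be invoked during loading. The harmless sketch below demonstrates the mechanism with `sorted`; a real attacker would substitute something like `os.system`. The `Payload` class is a made-up example.

```python
import pickle


class Payload:
    def __reduce__(self):
        # Instructs pickle to "rebuild" this object by calling sorted([3, 1, 2])
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading does not return a Payload; it returns whatever the named callable produced.
loaded = pickle.loads(blob)
print(loaded)  # [1, 2, 3]
```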

#### Safe Deserialisation

```python
# Safe alternatives
import json
data = json.loads(json_string)  # Safe for basic data types

# Safe YAML loading
import yaml
data = yaml.safe_load(stream)  # Only loads basic YAML constructs

# Schema validation
from marshmallow import Schema, fields
class UserSchema(Schema):
    name = fields.Str(required=True)
    email = fields.Email(required=True)

schema = UserSchema()
result = schema.load(untrusted_data)  # Validated deserialisation
```

### 8. Template Injection (CVSS Base: 8.5)

Server-side template injection can lead to remote code execution.

#### Risky Template Usage

```python
# String formatting with user input
template = f"Hello {user_input}"  # Risky if the result is later rendered as a template
result = "User: {}".format(user_data)  # Risky if the result is later reused as a format string or template
```

#### Safe Template Practices

```python
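# Keep the template itself constant; pass user input only as a value and never render the result again
safe_template = f"User: {user_data}"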
```
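
The core risk is letting the user control the template or format string itself rather than only the values substituted into it. The sketch below illustrates this with `str.format` and `string.Template` from the standard library; the `User` class, its `api_token` attribute, and the input strings are hypothetical.

```python
from string import Template


class User:
    def __init__(self, name, api_token):
        self.name = name
        self.api_token = api_token


user = User("alice", "placeholder-token")

# Risky: the format string comes from the user, who can reach attributes of the arguments.
user_supplied_template = "Hello {user.api_token}"
leaked = user_supplied_template.format(user=user)
print(leaked)  # Hello placeholder-token

# Safer: the template is a constant, so braces in the input are treated as plain data.
hostile_name = "{user.api_token}"
greeting = "Hello {name}".format(name=hostile_name)
print(greeting)  # Hello {user.api_token}

# string.Template only supports simple $name substitution, with no attribute access.
simple = Template("Hello $name").substitute(name=hostile_name)
print(simple)  # Hello {user.api_token}
```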