research-school-2024/findings/stats-1.txt

THIS IS A DRAFT! SOME OF THIS INFORMATION MAY BE IN ACCURATE OR INCOMPLETE!

Baseline:

- 4 independent students
- 2 security checklists: https://git.ari.lt/ari/research-school-2024/
- Groups:
    - flask_project, flask_project_no1 - short checklist
    - flask_projektas, project_1 - long checklist

Automatic review:

* Original
    1. Pyright (typeCheckingMode=strict)
        1. flask_project: 65 errors, 0 warnings, 0 informations
        2. flask_project_no1: 33 errors, 0 warnings, 0 informations
        3. flask_projektas: 72 errors, 0 warnings, 0 informations
        4. project_1: 65 errors, 0 warnings, 0 informations

        Average: 58.75 errors per project.

    2. Pylint
        1. flask_project: 8.17/10
        2. flask_project_no1: 7.61/10
        3. flask_projektas: 6.59/10
        4. project_1: 5.87/10

        Average: 7.06/10.

    3. Bandit
        1. flask_project: 0 undefined, 2 low, 0 medium, 1 high
        2. flask_project_no1: 0 undefined, 1 low, 0 medium, 1 high
        3. flask_projektas: 0 undefined, 1 low, 7 medium, 1 high
        4. project_1: 0 undefined, 2 low, 0 medium, 1 high

        Average: 0 undefined, 2 low, 2 medium, 1 high.

    4. OWASP ZAP automated scan
        1. flask_project: 3 medium, 4 low, 4 informational
        2. flask_project_no1: 2 medium, 2 low
        3. flask_projektas: 3 medium, 3 low, 3 informational
        4. project_1: 4 medium, 3 low, 4 informational

        Average: 3 medium, 3 low, 3 informational.
* New
    1. Pyright (typeCheckingMode=strict)
        1. flask_project: 69 errors, 0 warnings, 0 informations
        2. flask_project_no1: 59 errors, 0 warnings, 0 informations
        3. flask_projektas: 96 errors, 0 warnings, 0 informations
        4. project_1: 107 errors, 0 warnings, 0 informations

        Average: 82.75 errors per project.

    2. Pylint
        1. flask_project: 8.00/10
        2. flask_project_no1: 6.47/10
        3. flask_projektas: 7.70/10
        4. project_1: 4.98/10

        Average: 6.79/10.

    3. Bandit
        1. flask_project: 2 low, 0 medium, 0 high
        2. flask_project_no1: 1 low, 0  medium, 0 high
        3. flask_projektas: 1 low, 0 medium, 0 high
        4. project_1: 1 low, 0 medium, 1 high

        Average: 1 low, 0 medium, 0 high.

    4. OWASP ZAP automated scan
        1. flask_project: 4 medium, 4 low, 5 informational.
        2. flask_project_no1: 5 medium, 4 low, 7 informational.
        3. flask_projektas: 5 medium, 7 low, 7 informational
        4. project_1: 5 medium, 7 low, 7 informational

        Average: 5 medium, 5 low, 7 informational

Manual review:

* Original
    1. flask_project
        - Weaknesses
            - Static and exposed credentials as well as keys. Violates the open security model.
            - Lack of input validation.
            - Nonreuse of connection leads to worsened performance.
            - Lack of type hints and documentation leads to worse maintainability and clarity.
            - Lack of type hints also leads to more error prone code.
            - Lack of early-exit, trap conditions
            - Does not escape template literals in templates, making it vulnerable to XSS
            - Storing of the user role (admin status) in the session rather than fetching from the database/hybrid approach.
            - Does not make use of CSRF tokens to protect the forms.
            - Depends on the client for management of the login. Once the session is leaked, there is no way to revoke it.
            - Improperly indicates user errors by not returning the proper return code.
            - A minimal Flask configuration which can be hardened.
            - Minimal, inadequate error handling.
            - Debug=true in production.
        - Strengths
            - Modularity by using blueprints.
            - Separated out database connection logic.
            - Makes use of secure password hashing and storage.
            - Uses dynamic redirects (url_for) over static in-place URLs.
            - Uses built-in SQL functions like AVG() over manually calculating the average, allowing it to be executed in C.
            - Uses SQL parameterised queries.
    2. flask_project_no1
        - Weaknesses
            - Static and exposed credentials as well as keys. Violates the open security model.
            - Lack of input validation.
            - Nonreuse of connection leads to worsened performance.
            - Lack of type hints and documentation leads to worse maintainability and clarity.
            - Lack of type hints also leads to more error prone code.
            - Lack of early-exit, trap conditions
            - Does not escape template literals in templates, making it vulnerable to XSS
            - Storing of the user role (admin status) in the session rather than fetching from the database/hybrid approach.
            - Does not make use of CSRF tokens to protect the forms.
            - Stores the passwords in plain-text in the database over using a secure hashing function.
            - Depends on the client for management of the login. Once the session is leaked, there is no way to revoke it.
            - A minimal Flask configuration which can be hardened.
            - Minimal, inadequate error handling.
            - Debug=true in production.
        - Strengths
            - The database connection is being reused.
            - The user errors are properly accompanied by the correct HTTP code.
            - Uses dynamic redirects (url_for) over static in-place URLs.
            - More appropriately used short-circuit statements.
            - Uses SQL parameterised queries.
    3. flask_projektas
        - Weaknesses
            - Static application secret key.
            - Lack of input validation.
            - Nonreuse of connection leads to worsened performance.
            - Lack of type hints and documentation leads to worse maintainability and clarity.
            - Lack of type hints also leads to more error prone code.
            - Lack of early-exit, trap conditions
            - Does not escape template literals in templates, making it vulnerable to XSS
            - Storing of the user role (admin status) in the session rather than fetching from the database/hybrid approach.
            - Does not make use of CSRF tokens to protect the forms.
            - A minimal Flask configuration which can be hardened.
            - Stores the passwords as plain-text in the database.
            - Depends on the client for management of the login. Once the session is leaked, there is no way to revoke it.
            - The SQL queries executed are templated using format strings, leaving it vulnerable to SQL injection.
            - The database connection is not being reused.
            - Absolutely zero error handling.
            - Debug=true in production.
        - Strengths
            - However it still has the problem of static as well as exposed credentials as well as keys and somewhat violates the open security model, the credentials are separated into config.py, which we can deem secret.
            - The user errors are properly accompanied by the correct HTTP code.
            - Uses dynamic redirects (url_for) over static in-place URLs.
            - More appropriately used short-circuit statements.
    4. project_1
        - Weaknesses
            - Static application secret key.
            - Lack of input validation.
            - Nonreuse of connection leads to worsened performance.
            - Lack of type hints and documentation leads to worse maintainability and clarity.
            - Lack of type hints also leads to more error prone code.
            - Lack of early-exit, trap conditions
            - Does not escape template literals in templates, making it vulnerable to XSS
            - Storing of the user role (admin status) in the session rather than fetching from the database/hybrid approach.
            - Does not make use of CSRF tokens to protect the forms.
            - Depends on the client for management of the login. Once the session is leaked, there is no way to revoke it.
            - The database connection is not being reused.
            - Calls to 3rd-party libraries without mirroring them, which can lead to a compromise.
            - A minimal Flask configuration which can be hardened.
            - Uses MD5, a deprecated and considered insecure hashing function.
            - Debug=true in production.
        - Strengths
            - However it still has the problem of static as well as exposed credentials as well as keys and somewhat violates the open security model, the credentials are separated into config.py, which we can deem secret.
            - The user errors are properly accompanied by the correct HTTP code.
            - Uses dynamic redirects (url_for) over static in-place URLs.
            - More appropriately used short-circuit statements.
* New
    1. flask_project
        - Did not take the whole checklist into account when they could, for example, enforcing more secure passwords.
        - Improved error handling and validated user input as well as forms.
        - Began using flash messages for error handling. While still a bad way to report errors, it may be clearer for the user.
        - Escaped markup before storing in the database, allowing it to be embedded without reencoding it at runtime.
        - Made better use of trap conditions.
        - Did not follow the open security model even when requested to.
        - Removed debug mode in production.

        Overall improvement: Minimal. Inadequate review of the checklist, however, there are some improvements.
        Applied: 7/32
    2. flask_project_no1
        - Albeit still stored in a file, we can inherit that they meant that the DB credentials may be filled at runtime over static inline credentials.
        - Began using CSP and CSRF.
        - Reviewed the checklist in detail.
        - Validated user input.
        - Began reusing the connection of the database, improving the performance.
        - Made better use of trap conditions.
        - Uses markupsafe to escape markup where needed, alleviating the need to escape at runtime and any XSS attacks.
        - Uses a more hardened Flask configuration, such as the samesite session cookie being strict.
        - Now, hashes the passwords using a secure hashing algorithm.
        - Improved error handling.
        - Removed debug mode in production.

        Overall improvement: Great. Checked every point and clearly put in effort to improve the application, resulting in better security.
        Applied: 26/32
    3. flask_projektas
        - Added input and form validation.
        - Began using parameterised SQL queries for querying the DB over format strings.
        - Hardened the default Flask configuration.
        - Began hashing the password over storing the passwords in plain-text.
        - Added error handling.
        - Began enforcing more secure password policies.
        - Understood the idea behind IDOR, however, understandably did not put effort into it. But did show that they could.
        - Cryptographically secure keys are still unused, while the student claims they are used.
        - Did not try running automated tools to improve their code.
        - Removed debug mode in production.

        Overall improvement: Moderate. Part of it is laziness as stated in the checklist review, which is understandable considering the checklist length.
        Applied: *17/56
            Theoretical applied: 31/56
    4. project_1
        - The application secret key is now generated at runtime and is cryptographically secure.
        - There is sufficient and good input validation.
        - The database connection is now being reused.
        - The database connection is now being reused.
        - The templates are now secure and escape the dynamic parts using the escape filter, alleviating XSS attacks.
        - Now uses both CSRF tokens as well as CSP.
        - Batches SQL requests, improving performance.
        - Calls to 3rd party, although, now has explicit CSP policies to allow only that.
        - The default Flask configuration is now hardened.
        - Now the application uses secure password hashing, the SHA() function, which is more secure than MD5. Positive.
        - Removed debug mode from production.
        - Stronger password policies are enforced.
        - Encryption is now enforced.
        - The application now collects relevant logs securely.
        - The application is more modular.

        Overall improvement: Near perfect. Most of the flaws were alleviated besides some caveats for code quality, however, securitywise, it is majorly improved.
        Applied: 38/56

Conclusion and analysis:

One of the most pervasive issues in the original projects was the poor handling of sensitive data: hardcoded credentials, static secret keys, and exposed passwords. This was present in all four projects and directly betrayed the open security model by exposing sensitive data. Most of these projects adopted more secure practices after the application of the security checklists. Even though the projects did not show major improvement in the mere storage of the credentials, the credentials were separated out of 75% of the code. Proper password hashing was also implemented for more secure password storage. In "project_1," the introduction of a secure password hashing function significantly mitigated the risks from plaintext password storage and weak hashing algorithms like MD5.

Input validation was another critical area that needed much attention. Initially, most of the projects did not validate the inputs given by users; this could result in possible vulnerabilities like injection attacks: SQL injection, Cross-Site Scripting. After following the security checklist, the projects improved their input validation practices significantly, which also improved the error handling.

Another common weakness was the lack of CSRF protection in the original projects. This made the applications vulnerable to CSRF attacks, where malicious users could trick authenticated users into performing unintended actions. After the implementation of the checklist, both "flask_project_no1" and "project_1" added CSRF tokens to their forms, hence mitigating this threat. "project_1" also implemented a content security policy, to avoid XSS attacks and make sure that only trusted resources could run within the app.

Among the key takeaways for the students from the manual reviews of these projects, there came out the much-improved settings in "project_1," which were addressed. Moreover, secret keys and credentials were not only stored in a more secure way, but performance was also optimised by reusing database connections. Improvements in "project_1" were reflected in the high number of applied checklist items - 38 out of 56. Although far from perfect, "project_1" was substantially more secure compared to its initial state and showed a very good understanding of both security principles and best coding practices.

Baseline data shows that indeed all the projects had very serious security problems before the application of the security checklist. The average number of original projects was 58.75 errors per project according to Pyright. On the other hand, for Pylint, the average of the original ones was 7.06/10, and the lowest score belonged to project_1, which was 5.87/10. The Bandit scan also revealed an average count of 0 undefined, 2 low, 2 medium, and 1 high in severity for the four projects. The OWASP ZAP automated scan reported a total average of 3 medium, 3 low, and 3 informational vulnerabilities per project.

Once the security fixes had been applied, several of the statistics shifted significantly. In many cases, the automated scan results were worse or no better than the original findings. While counterintuitive, this is not unusual as systems become increasingly complex, true review requires sophisticated human context to understand and make educated assessments on the code. Although, bandit usually improved, which correlates well with the mitigated security flaws. The security enhancements recommended by the checklists added complexity to the applications, and complexity often begets an increase in lint-time errors. It is worth pointing out, though, that even while these are not directly security-related issues, better code quality, in the sense of clean, organised, and optimized code, usually means fewer security bugs and more efficient mitigations; a fact to which the results of this project proved no exception.

Moreover, manual reviews revealed that there was a very strong correlation between the applied security checklist on improvements in the security of the projects. The most significant improvements directly addressed major vulnerabilities on sensitive data management, input validation, and password security. The projects that followed fewer items in the checklist, such as "flask_project", demonstrated very modest improvements. While basic security measures were implemented, like password hashing and input validation, the lack of comprehensive changes resulted in only minimal improvements. In contrast, "project_1" applied 38 out of 56 checklist items and showed the most robust improvement: more secure session management, improved error handling, and enhanced cryptographic and cybersec practices. These improvements were directly reflected to how many security issues could be detected, however, the mitigations implemented could be hardened as highlighted by automated scan results, manual review, and introspective comments of the students.

Lastly, while originally we found the following common security vulnerabilities prevalent in the code:

1. Project configuration and security.
2. Input validation.
3. Code structure and quality.
4. Session and authentication.
5. Cryptographic practices.
6. Template security.
7. Coding standards and practices.

In the new iteration of code, these were significantly less pervasive, and much better mitigated than before, leaving us with a conclusion that our checklist has positive outcomes in security, however, does not mean that it may improve the code quality.