1 Introduction
- we found that machine learning (ML) applied to line-level code analysis can match the accuracy of static code analysis when identifying violations of guidelines that require understanding only a limited code context (i.e., a single line or a few lines of code),
- we show that using active learning to sample training data provides the most accurate results in recognizing violations of company-specific coding guidelines and reduces the effort required to train the ML-based tool (fewer training examples were needed to achieve the same or higher prediction quality than with manual selection of examples),
- we show that token frequencies are a valuable source of information for recognizing code violations and make it possible to perform this task without parsing or compiling the code,
- we show that the approach works both on industry-wide standards applied to open source code and on company-specific, proprietary guidelines applied to professionally developed code from two large companies; using the ML-based tool can therefore help to reduce the effort of manual code review, and
- we report observations from the in situ application of ML to code analysis in industrial environments that could help practitioners adopt ML-based approaches in their companies (i.e., the strategies to minimize the effort of labeling data, and the effect that the evolution of guidelines and code can have on the accuracy of an ML-based tool).
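To make the idea of token frequencies as text-only features concrete, the following is a minimal sketch, not the feature extraction actually implemented in CCFlex. The tokenizer regular expression, the function names, and the vocabulary are assumptions chosen for illustration; the point is only that a line can be turned into a numeric vector without parsing or compiling the surrounding code.

```python
import re
from collections import Counter

# Hypothetical tokenizer (an assumption, not CCFlex's): identifiers,
# integer literals, or any single non-whitespace symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def token_frequencies(line):
    """Count the tokens of one line of code using only its text."""
    return Counter(TOKEN_RE.findall(line))


def to_vector(line, vocabulary):
    """Project the token counts onto a fixed vocabulary (bag of words)."""
    counts = token_frequencies(line)
    return [counts[token] for token in vocabulary]


# Illustrative vocabulary; a real one would be built from the training lines.
vocabulary = ["int", "=", ";", "#", "define"]
print(to_vector("int MyVAR = 10;", vocabulary))
print(to_vector("#define MAX 10", vocabulary))
```

Because the vector depends only on the characters of the line, the same code runs on files that do not compile, which is the property the contribution above relies on.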
2 Related Work
2.1 Comparison Between Tools
2.2 Machine Learning for Static Code Analysis
2.3 Machine Learning for Code-Smell Detection
2.4 Summary
Tool | Supported Languages | Compilation Requirements | Extensibility | Configurability | Interoperability | Access to Results |
---|---|---|---|---|---|---|
Static Code Analysis Tools | ||||||
Coverity Scan | C/C++ | parsing/ (linking) | imperative | N/A | stand-alone/ IDE/ collab. tools | API |
KlocWork | C/C++ | parsing/ (linking) | imperative | rules by parameter/ ruleset | stand-alone/ IDE/ collab. tools | API |
PolySpace | C/C++ | parsing/ (linking) | N/A | ruleset | stand-alone/ IDE | N/A |
Splint | C | parsing/linking | declarative by rule | N/A | stand-alone | no external |
CPPcheck | C/C++ | parsing/linking | declarative by rule | other | stand-alone/ IDE/ collab. tools | file-export |
Flawfinder | C/C++ | robust | declarative by rule | other | stand-alone/ IDE | file-export |
Style Checkers | ||||||
CodeCheck | C/C++ | parsing | imperative | N/A | stand-alone | file-export |
Uncrustify | C/C++ | parsing | no | rules by parameter | stand-alone/ IDE | file-export (config.) |
KWStyle | C/C++ | parsing/linking | no | rules by parameter | stand-alone/ collab. tool | N/A |
C++ Style Checker | C++ | parsing/linking | N/A | ruleset | stand-alone | file-export |
Learning/Example-Based Approaches | ||||||
Naturalize | C/C++ | parsing/linking | no | rules by example | stand-alone/ IDE/ collab. tools | N/A |
Code Style Analytics | N/A | parsing/linking | no | rules by example | N/A | N/A |
CCFlex | C/C++ | robust | declarative by example | ruleset | stand-alone / collab. tools | file-export |
3 The CCFlex Tool
3.1 Architecture
“@!”, then the line “@! int MyVAR = 10;” would be recognized as a line violating the coding guideline.
3.2 Feature-Extraction Filters
int a vs. class a).
3.3 Classification Algorithms
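As a rough illustration of the decision tree (CART) algorithm that CCFlex uses for classification, the sketch below performs a single CART step: it searches for the feature and threshold whose binary split minimizes the weighted Gini impurity. The function names and the toy data are invented for this example; a full tree would apply the same step recursively to each side of the split.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (feature_index, threshold, impurity) of the best binary split.

    A row goes to the left branch when row[feature_index] <= threshold.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must put at least one row on each side
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best


# Toy data (invented): columns are [line length, number of '#' characters].
rows = [[130, 0], [80, 0], [125, 1], [40, 1]]
labels = [1, 0, 1, 0]  # 1 = line violates a hypothetical maximum-length rule
print(best_split(rows, labels))
```

On this toy data the split is found on the line-length feature, since the second feature carries no information about the label. The learned threshold is simply one of the observed values that separates the classes, not the limit stated in the rule.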
3.4 Active Learning
4 Research Methodology and Design
Cycle | Diagnosing | Action planning | Action taking / Executing | Evaluating | Learning |
---|---|---|---|---|---|
1 | What coding guidelines are used in the industry? | We planned to conduct a document analysis at partner companies to understand how they design their coding guidelines and what these guidelines contain. | We changed the grouping of guidelines from thematic to scope-based; we analyzed 45 and 66 guidelines and classified them according to this new grouping. | We found that some guidelines need quality improvement because they are ambiguous or difficult to assess. | Most of the coding guidelines differ between the companies, which means that a potential tool has to be adapted and tuned on a per-company basis. |
2 | Which of the available tools can be adapted to recognize violations of coding guidelines, assuming that the code might not parse or compile? | We planned to review the most popular tools for C/C++ code analysis and compare their extensibility, readability, and usability to find a tool that is easy to extend, does not require parsing or compiling the code, and can recognize coding guideline violations. | We reviewed 13 tools for C/C++ code analysis. We refactored CCFlex (a machine-learning-based tool) and performed a benchmark study on finding violations of the popular Sun and Google coding conventions for Java in 3 open source projects. | CCFlex could recognize 98.98%–99.93% of lines violating any of Sun’s and Google’s Java coding style guidelines. | Since we could recognize almost all of the lines with violations, we set out to check the guidelines from the industrial partners. |
3 | What is the accuracy of recognizing guideline violations in industrial settings, and how does it depend on coding style? | We planned to assess different situations: an old/legacy codebase (written before the guidelines), a modern codebase (currently under development), and an in-between one (code developed alongside the guidelines). | We analyzed three codebases of Company A; we assessed 3 out of 45 guidelines using different configurations of CCFlex and a limited number of iterations. | We could achieve satisfactory results of up to 87% Recall for the evaluated guidelines. | We needed to conduct a formal evaluation with Company B, first to evaluate the generalizability of the findings and second to investigate whether we are able to reduce the number of false positives. |
4 | How much training of CCFlex is needed to minimize the percentage of false positives? | We selected one or two guidelines per type of rule (taxonomy) and defined a procedure for sampling code for training. | We assessed seven rules at Company A and used two different strategies for selecting lines for the training set: manual selection and Active Learning; we assessed the code quality of a product. We evaluated the approach on a large codebase of over three million SLOC. The number of trials varied from three to seven and the F-score varied from 0.04 to 1.00. | We found that the identified violations were accurate, and the architects used them in their quality improvement after the study. | Active learning achieves a higher F-score in fewer iterations and provides a higher F-score for the entire codebase. |
5 Execution and Results
5.1 Action Research Cycle 1 – What Coding Guidelines are used by our Industrial Partners?
5.1.1 Cycle Goal and Research Procedure
- understand the types of rules that are in the companies’ guidebooks,
- check the quality of the rules (whether they are unambiguous and their violations can be found by analyzing code).
5.1.2 Cycle Execution and Results
union).
- rules as documentation: no style-related coding guidelines, but rather hints about what libraries or interfaces to use or what protocols to follow when calling an interface,
- optional rules: either a whole rule or its part is optional to follow,
- rules on external information: rules that require information outside of the studied code, e.g. user requirements.
5.2 Action Research Cycle 2 – Selecting a Tool Capable of Recognizing Code Guidelines Violations of our Partners
5.2.1 Cycle Goal and Research Procedure
- the proposed solution needs to be easy to extend or modify without the need of learning any API or having a deep understanding of static code analysis techniques,
- running the code analysis should not require parsing or compiling the code.
Violation of the Sun’s guidelines | All% | Eclipse% | Jasper% | Spring% |
---|---|---|---|---|
Line is longer than 80 characters (Line properties) | 40.70 | 55.04 | 20.13 | 62.59 |
Parameter should be final (Uni-line context) | 19.41 | 26.37 | 11.79 | 25.72 |
’{’ should be on the previous line (Multi-line context) | 13.67 | 29.31 | ||
Missing a Javadoc comment (Multi-line context) | 10.67 | 12.52 | 9.77 | 10.32 |
Class designed for extension without Javadoc (Design semantics) | 9.03 | 6.93 | 12.88 | 4.29 |
Line has trailing spaces (Uni-line context) | 8.78 | 18.83 | ||
Expected @param tag (Multi-line context) | 5.73 | 9.26 | 2.93 | 7.00 |
Hidden field (Files context) | 3.90 | 2.73 | 3.03 | 6.75 |
Variable must be private and have accessor methods (Multi-line context) | 2.73 | 1.76 | 4.68 | 0.22 |
Symbol is not followed by whitespace (Uni-line context) | 2.50 | 2.30 | 3.94 | 0.07 |
File contains tab characters (this is the first instance) (Keyword) | 2.40 | 3.40 | 1.35 | 3.25 |
Expected an @return tag (Multi-line context) | 2.27 | 0.07 | 1.45 | 6.17 |
’}’ should be on the same line as the next part of a multi-block statement (Multi-line context) | 1.72 | 0.33 | 1.01 | 4.51 |
Avoid inline conditionals (Uni-line context) | 1.45 | 2.36 | 0.55 | 2.09 |
’if’ construct must use ’{}’s (Multi-line context) | 0.85 | 3.06 | ||
Expected @throws tag (Multi-line context) | 0.69 | 0.87 | 1.12 | |
First sentence should end with a period (Multi-line context) | 0.61 | 1.20 | 0.02 | 1.05 |
Symbol should be on a new line (Multi-line context) | 0.52 | 0.10 | 1.84 | |
Symbol is followed by whitespace (Uni-line context) | 0.50 | 0.87 | 0.55 | |
Name must match pattern ’^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$’ (Uni-line context) | 0.46 | 0.67 | 0.42 | 0.32 |
Redundant ’final’ modifier (Multi-line context) | 0.34 | 1.23 | ||
Symbol is not preceded with whitespace (Uni-line context) | 0.34 | 0.77 | 0.26 | 0.04 |
Unused import (Multi-line context) | 0.34 | 0.20 | 0.10 | 0.94 |
Redundant ’public’ modifier (Multi-line context) | 0.30 | 0.60 | 0.18 | 0.18 |
Symbol is preceded with whitespace (Uni-line context) | 0.29 | 0.03 | 0.57 | 0.04 |
Magic number (Uni-line context) | 0.20 | 0.47 | 0.06 | 0.18 |
’for’ construct must use ’{}’s (Multi-line context) | 0.14 | 0.50 | ||
’static’ modifier out of order with the JLS suggestions (Uni-line context) | 0.10 | 0.27 | 0.04 | 0.04 |
File does not end with a newline (Multi-line context) | 0.10 | 0.30 | 0.04 | |
Class should be declared as final (Uni-line context) | 0.08 | 0.30 | ||
Avoid nested blocks (Multi-line context) | 0.07 | 0.03 | 0.14 | |
Extra HTML tag found (Multi-line context) | 0.06 | 0.17 | 0.07 | |
Utility classes should not have a public or default constructor (Design semantics) | 0.05 | 0.17 | ||
Unused @param tag (Multi-line context) | 0.04 | 0.13 | ||
Inner assignments should be avoided (Multi-line context) | 0.03 | 0.10 | ||
Unclosed HTML tag found (Multi-line context) | 0.03 | 0.03 | 0.07 | |
Unknown tag (Uni-line context) | 0.03 | 0.10 | ||
’else’ construct must use ’{}’s (Multi-line context) | 0.02 | 0.07 | ||
Expression can be simplified (Multi-line context) | 0.02 | 0.07 | ||
Method length greater than 150 (Multi-line context) | 0.02 | 0.07 | ||
Unable to get class information for @throws tag (Checkstyle error) | 0.02 | 0.07 | ||
’protected’ modifier out of order with the JLS suggestions (Uni-line context) | 0.01 | 0.03 | ||
’public’ modifier out of order with the JLS suggestions (Uni-line context) | 0.01 | 0.03 | ||
Array brackets at illegal position (Uni-line context) | 0.01 | 0.03 | ||
Comment matches to-do format ’TODO:’ (Multi-line context) | 0.01 | 0.03 | ||
Redundant ’private’ modifier (Multi-line context) | 0.01 | 0.02 | ||
Switch without ”default” clause (Multi-line context) | 0.01 | 0.02 |
Violation of the Google Java style | All% | Eclipse% | Jasper% | Spring% |
---|---|---|---|---|
Line contains a tab character (Keyword) | 98.83 | 99.42 | 97.91 | 99.16 |
Incorrect indentation level (Multi-line context) | 59.48 | 55.05 | 69.01 | 53.82 |
’{’ should be on the previous line (Multi-line context) | 4.81 | 14.02 | ||
Line is longer than 100 characters (Line properties) | 3.83 | 3.47 | 2.66 | 5.66 |
First sentence of Javadoc is incomplete (period is missing) or not present (Multi-line context) | 1.36 | 1.18 | 2.36 | 0.40 |
At-clause should have a non-empty description (Uni-line context) | 1.03 | 2.43 | 0.41 | 0.01 |
< p > tag should be preceded with an empty line (Multi-line context) | 0.75 | 0.18 | 0.08 | 2.24 |
’}’ should be on the same line as the next part of a multi-block statement (Multi-line context) | 0.61 | 0.09 | 0.48 | 1.39 |
’package’ should be separated from previous statement (Uni-line context) | 0.31 | 0.25 | 0.63 | 0.01 |
’if’ construct must use ’{}’s (Multi-line context) | 0.30 | 0.82 | ||
Whitespace around a symbol is not followed by whitespace (Uni-line context) | 0.28 | 0.46 | 0.33 | |
Abbreviation in name must contain no more than ’2’ consecutive capital letters (Uni-line context) | 0.22 | 0.04 | 0.60 | |
Missing a Javadoc comment (Multi-line context) | 0.18 | 0.11 | 0.40 | 0.02 |
Symbol should be on a new line (Uni-line context) | 0.18 | 0.05 | 0.55 | |
Wrong lexicographical order for import (Uni-line context) | 0.16 | 0.14 | 0.22 | 0.11 |
Whitespace around a symbol is not preceded by whitespace (Uni-line context) | 0.12 | 0.21 | 0.12 | 0.01 |
Member name must match pattern ’^[a-z][a-z0-9][a-zA-Z0-9]*$’ (Uni-line context) | 0.11 | 0.12 | 0.19 | |
< p > tag should be placed immediately before the first word, with no space after (Uni-line context) | 0.10 | 0.20 | 0.09 | |
’)’ is preceded with whitespace (Uni-line context) | 0.10 | 0.01 | 0.27 | |
’(’ is followed by whitespace (Uni-line context) | 0.09 | 0.01 | 0.27 | |
’for’ construct must use ’{}’s (Multi-line context) | 0.05 | 0.13 | ||
’static’ modifier out of order with the JLS suggestions (Uni-line context) | 0.04 | 0.07 | 0.02 | 0.01 |
Empty line should be followed by <p> tag on the next line (Multi-line context) | 0.04 | 0.04 | 0.04 | 0.03 |
Overload methods should not be split (Multi-line context) | 0.04 | 0.01 | 0.08 | 0.02 |
Empty catch block (Multi-line context) | 0.03 | 0.07 | 0.01 | |
Javadoc comment has parse error (Multi-line context) | 0.02 | 0.04 | 0.03 | |
At-clauses have to appear in the order ’[@param, @return, @throws, @deprecated]’ (Multi-line context) | 0.02 | 0.04 | 0.01 | |
Distance between variable declaration and its first usage is more than ’3’ (Multi-line context) | 0.02 | 0.04 | ||
’METHOD_DEF’ should be separated from previous statement (Multi-line context) | 0.01 | 0.04 | ||
Single-line Javadoc comment should be multi-line (Uni-line context) | 0.01 | 0.04 | ||
Local variable name must match pattern ’^[a-z]([a-z0-9][a-zA-Z0-9]*)?$’ (Multi-line context) | 0.01 | 0.03 | ||
’else’ construct must use ’{}’s (Multi-line context) | 0.01 | 0.02 | ||
Each variable declaration must be in its own statement (Uni-line context) | 0.01 | 0.02 | ||
Parameter must match pattern ’^[a-z]([a-z0-9][a-zA-Z0-9]*)?$’ (Uni-line context) | 0.01 | 0.02 | ||
’CTOR_DEF’ should be separated from previous statement (Multi-line context) | 0.00 | 0.01 | ||
’protected’ modifier out of order with the JLS suggestions (Uni-line context) | 0.00 | 0.01 | ||
’public’ modifier out of order with the JLS suggestions (Uni-line context) | 0.00 | 0.01 | ||
Array brackets at illegal position (Uni-line context) | 0.00 | 0.01 | ||
Catch parameter name must match pattern ’^[a-z]([a-z0-9][a-zA-Z0-9]*)?$’ (Uni-line context) | 0.00 | 0.01 | ||
GenericWhitespace ’>’ is followed by whitespace (Uni-line context) | 0.00 | 0.01 | ||
Redundant < p > tag (Multi-line context) | 0.00 | 0.01 | ||
Top-level class BookmarkStack has to reside in its own source file (Files context) | 0.00 | 0.01 | ||
Switch without ”default” clause (Multi-line context) | 0.00 | 0.01 |
5.2.2 Cycle Execution and Results
- All–Sun (Count: 10,821, Ignore: 34,481)
- Eclipse-Sun (Count: 3,003, Ignore: 12,585)
- Jasper-Sun (Count: 5,046, Ignore: 9,799)
- Spring-Sun (Count: 2,772, Ignore: 12,097)
- All–Google (Count: 30,727, Ignore: 14,575)
- Eclipse-Google (Count: 11,157, Ignore: 4,431)
- Jasper-Google (Count: 10,546, Ignore: 4,299)
- Spring-Google (Count: 9,024, Ignore: 5,845)
- All–Google’ (Count: 4,423, Ignore: 40,879)
- Eclipse-Google’ (Count: 1,066, Ignore: 14,522)
- Jasper-Google’ (Count: 2,365, Ignore: 12,480)
- Spring-Google’ (Count: 992, Ignore: 13,877)
Guidelines | Dataset | Accuracy % | Precision | Recall | F-score |
---|---|---|---|---|---|
Sun | All | 99.54 ± 0.11 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 |
Sun | Eclipse | 99.05 ± 0.27 | 0.99 ± 0.00 | 0.99 ± 0.00 | 0.99 ± 0.00 |
Sun | Jasper | 99.55 ± 0.20 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 |
Sun | Spring | 99.15 ± 0.22 | 0.99 ± 0.00 | 0.99 ± 0.00 | 0.99 ± 0.00 |
Google | All | 99.87 ± 0.06 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 |
Google | Eclipse | 99.93 ± 0.06 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 |
Google | Jasper | 99.70 ± 0.14 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 |
Google | Spring | 99.88 ± 0.11 | 1.00 ± 0.00 | 1.00 ± 0.00 | 1.00 ± 0.00 |
Google’ | All | 98.98 ± 0.17 | 0.99 ± 0.00 | 0.99 ± 0.00 | 0.99 ± 0.00 |
Google’ | Eclipse | 99.26 ± 0.29 | 0.99 ± 0.00 | 0.99 ± 0.00 | 0.99 ± 0.00 |
Google’ | Jasper | 99.03 ± 0.26 | 0.99 ± 0.00 | 0.99 ± 0.00 | 0.99 ± 0.00 |
Google’ | Spring | 98.91 ± 0.29 | 0.99 ± 0.00 | 0.99 ± 0.00 | 0.99 ± 0.00 |
Rule | N | Accuracy % | Precision | Recall | F-score |
---|---|---|---|---|---|
’{’ should be on the previous line. (Multi-line) | 1,479 | 99.98 | 0.998 | 0.997 | 0.997 |
Line is longer than 80 characters (Line properties) | 4,404 | 99.93 | 0.997 | 0.996 | 0.996 |
Avoid inline conditionals. (Uni-line) | 153 | 100.00 | 0.997 | 0.991 | 0.994 |
’for’ construct must use ’{}’s. (Multi-line) | 15 | 100.00 | 1.000 | 0.987 | 0.993 |
Parameter should be final (Uni-line) | 1,597 | 99.90 | 0.983 | 0.990 | 0.987 |
Missing a Javadoc comment. (Multi-line) | 1,153 | 99.88 | 0.979 | 0.975 | 0.977 |
Variable must be private and have accessor methods (Multi-line) | 295 | 99.95 | 0.971 | 0.949 | 0.960 |
Line has trailing spaces. (Uni-line) | 950 | 99.82 | 0.958 | 0.954 | 0.956 |
’static’ modifier out of order with the JLS suggestions. (Uni-line) | 11 | 100.00 | 1.000 | 0.909 | 0.952 |
Class designed for extension without Javadoc (Design semantics) | 977 | 99.79 | 0.938 | 0.965 | 0.951 |
’if’ construct must use ’{}’s. (Multi-line) | 92 | 99.98 | 0.906 | 0.983 | 0.943 |
Symbol is preceded with whitespace. (Uni-line) | 31 | 99.99 | 0.966 | 0.919 | 0.942 |
’}’ should be on the same line as the next part of a multi-block statement (Multi-line) | 186 | 99.93 | 0.884 | 0.949 | 0.915 |
Name must match pattern ’^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$’ (Uni-line) | 50 | 99.97 | 0.810 | 0.922 | 0.862 |
Expected @throws tag (Multi-line) | 74 | 99.95 | 0.838 | 0.866 | 0.852 |
Symbol is followed by whitespace. (Uni-line) | 48 | 99.97 | 0.822 | 0.871 | 0.845 |
Symbol is not followed by whitespace. (Uni-line) | 253 | 99.81 | 0.823 | 0.831 | 0.827 |
Expected an @return tag. (Multi-line) | 246 | 99.79 | 0.814 | 0.802 | 0.808 |
Expected @param tag (Multi-line) | 508 | 99.55 | 0.796 | 0.801 | 0.799 |
Redundant ’final’ modifier. (Multi-line) | 37 | 99.96 | 0.760 | 0.805 | 0.781 |
Hidden field (Files context) | 388 | 99.57 | 0.752 | 0.744 | 0.748 |
Symbol should be on a new line. (Multi-line) | 56 | 99.94 | 0.782 | 0.709 | 0.743 |
Symbol is not preceded with whitespace. (Uni-line) | 35 | 99.95 | 0.699 | 0.683 | 0.690 |
First sentence should end with a period. (Multi-line) | 66 | 99.91 | 0.686 | 0.679 | 0.682 |
Redundant ’public’ modifier. (Multi-line) | 32 | 99.95 | 0.623 | 0.606 | 0.614 |
Magic number (Uni-line) | 22 | 99.92 | 0.268 | 0.382 | 0.315 |
Unused import (Multi-line) | 37 | 99.85 | 0.034 | 0.032 | 0.033 |
Rule | N | Accuracy % | Precision | Recall | F-score |
---|---|---|---|---|---|
Line contains a tab character. (Keyword) | 30,366 | 100.00 | 1.000 | 1.000 | 1.000 |
’(’ is followed by whitespace. (Uni-line) | 29 | 100.00 | 1.000 | 1.000 | 1.000 |
’{’ should be on the previous line (Multi-line) | 1,479 | 99.98 | 0.997 | 0.998 | 0.997 |
Incorrect indentation level (Multi-line) | 18,277 | 99.51 | 0.993 | 0.995 | 0.994 |
At-clause should have a non-empty description. (Uni-line) | 315 | 99.99 | 0.990 | 0.995 | 0.992 |
’package’ should be separated from previous statement. (Uni-line) | 95 | 99.99 | 0.979 | 0.989 | 0.984 |
’for’ construct must use ’{}’s. (Multi-line) | 15 | 100.00 | 1.000 | 0.940 | 0.969 |
’)’ is preceded with whitespace. (Uni-line) | 30 | 100.00 | 0.967 | 0.963 | 0.965 |
’if’ construct must use ’{}’s. (Multi-line) | 92 | 99.98 | 0.929 | 0.977 | 0.952 |
Line is longer than 100 characters (Line properties) | 1,178 | 99.73 | 0.954 | 0.940 | 0.947 |
First sentence of Javadoc is incomplete (period is missing) or not present. (Multi-line) | 417 | 99.88 | 0.909 | 0.963 | 0.935 |
’}’ should be on the same line as the next part of a multi-block statement (Multi-line) | 186 | 99.92 | 0.867 | 0.957 | 0.910 |
<p> tag should be preceded with an empty line. (Multi-line) | 230 | 99.91 | 0.918 | 0.900 | 0.909 |
’static’ modifier out of order with the JLS suggestions. (Uni-line) | 11 | 100.00 | 1.000 | 0.818 | 0.900 |
<p> tag should be placed immediately before the first word, with no space after. (Uni-line) | 31 | 99.97 | 0.791 | 0.790 | 0.790 |
Abbreviation in name must contain no more than ’2’ consecutive capital letters (Uni-line) | 68 | 99.93 | 0.769 | 0.794 | 0.782 |
Member name must match pattern ’^[a-z][a-z0-9][a-zA-Z0-9]*$’ (Uni-line) | 33 | 99.97 | 0.775 | 0.758 | 0.766 |
Missing a Javadoc comment. (Multi-line) | 56 | 99.93 | 0.718 | 0.721 | 0.719 |
Symbol should be on a new line (Uni-line) | 55 | 99.92 | 0.702 | 0.658 | 0.679 |
Whitespace around a symbol is not followed by whitespace (Uni-line) | 73 | 99.88 | 0.632 | 0.641 | 0.636 |
Whitespace around a symbol is not preceded by whitespace (Uni-line) | 35 | 99.94 | 0.620 | 0.643 | 0.631 |
Empty line should be followed by <p> tag on the next line. (Multi-line) | 11 | 99.98 | 0.651 | 0.545 | 0.592 |
Wrong lexicographical order for import (Uni-line) | 49 | 99.85 | 0.310 | 0.306 | 0.308 |
Overload methods should not be split (Multi-line) | 11 | 99.96 | 0.000 | 0.000 | 0.000 |
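The per-rule tables combine accuracy close to 100% with much lower Precision, Recall, and F-score for rare rules. The sketch below shows how the four measures follow from confusion-matrix counts and why they can diverge when violating lines are a tiny fraction of the dataset. The counts in the example are invented for illustration (chosen only to be of the same order as a rule with 11 violating lines in a dataset of 45,302 lines); they are not the confusion matrix from the study.

```python
def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f_score) from confusion counts.

    tp/fp: lines flagged as violations, correctly / incorrectly.
    fn/tn: lines not flagged, incorrectly / correctly.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return accuracy, precision, recall, f_score


# Invented counts: 11 violating lines, none of them found, 7 false alarms.
accuracy, precision, recall, f_score = metrics(tp=0, fp=7, fn=11, tn=45284)
print(f"accuracy={accuracy:.4f} precision={precision:.3f} "
      f"recall={recall:.3f} f_score={f_score:.3f}")
```

Because almost every line is a true negative, accuracy stays high even when no violation is recognized, which is why the F-score is the more informative column for rare rules.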
- CCFlex was able to achieve high accuracy of identifying lines violating the coding guidelines for the simplified problem of recognizing lines violating any of the rules and for the problem of identifying violations of particular rules. However, for the latter problem, we observed rules for which the tool failed to learn to recognize their violations.
- CCFlex was able to achieve high accuracy even for the smallest datasets. Therefore, it seemed that the CCFlex could also be used to identify lines violating similar coding standards even if the lines had to be labeled manually.
- Sun’s and Google’s guidelines could be mapped to the proposed taxonomy. However, by comparing these guidelines and Java Open Source code to guidelines and C/C++ code of our industrial partners, the latter seemed to be more complicated and richer when it comes to syntax and language constructs being used. Therefore, we perceived the accuracy observed in this study as an upper bound of what we could expect for the study on the code of our partners.
5.3 Action Research Cycle 3 – How Can We Recognize the Violations Provided by the Industrial Partners?
5.3.1 Cycle Goal and Research Procedure
- Pre-processor directives must be placed at the beginning of an empty line, and must never be indented (semantics, uni-line context). We chose this rule because it requires understanding the position of specific tokens in a line.
- For public enumerations, the members of the enum should follow the pattern of the component name, an underscore, and the value name, for example ComponentName_ValueOne = 0 (semantics, multi-line context). We chose this rule because it requires understanding the context, i.e., recognizing the lines within “enum” blocks. We expected that this type of recognition could be difficult for CCFlex as it is primarily designed for one-line rules.
- Names of the variables should follow the so-called camel case format, in which each word or abbreviation in the middle of the name begins with a capital letter (semantics, uni-line context). We expected that this type of recognition could be difficult for CCFlex as it requires understanding the concept of lower and upper case and the fact that the token represents a variable name.
- Bag of words — to explore whether this way of providing meaning to the constructs allows teaching the tool quicker.
- Active Learning — to explore the ease-of-use of providing examples line-by-line (suggested by active learning) rather than manually.
- Adding new features — to explore how important it is to have the right set of features for the decision tree (CART) algorithm used in CCFlex; in particular whether it is better to rely on bag-of-words or on adding the ability to recognize specific keywords.
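The line-by-line labeling suggested by active learning can be sketched as a loop that repeatedly trains a model on the labeled lines and asks a human to label the line the model is least sure about. The source does not state which query strategy CCFlex uses, so uncertainty sampling is an assumption here; the `train` function is a toy stand-in model (not the CART classifier), and `oracle` is a hypothetical callback that stands in for the person labeling a line.

```python
from collections import Counter


def train(labeled):
    """Toy stand-in model: Laplace-smoothed share of a line's tokens
    that were seen in lines labeled as violations."""
    bad, good = Counter(), Counter()
    for line, label in labeled.items():
        (bad if label else good).update(line.split())

    def predict_proba(line):
        tokens = line.split()
        b = sum(bad[t] for t in tokens)
        g = sum(good[t] for t in tokens)
        return (b + 1) / (b + g + 2)

    return predict_proba


def most_uncertain(pool, predict_proba, batch_size=1):
    """Uncertainty sampling: lines whose probability is closest to 0.5."""
    ranked = sorted(pool, key=lambda line: abs(predict_proba(line) - 0.5))
    return ranked[:batch_size]


def active_learning(seed, pool, oracle, rounds=2, batch_size=1):
    """Grow the labeled set by querying the oracle for uncertain lines."""
    labeled = dict(seed)  # line -> 0/1 label; insertion order is query order
    remaining = [line for line in pool if line not in labeled]
    for _ in range(rounds):
        if not remaining:
            break
        predict_proba = train(labeled)
        for line in most_uncertain(remaining, predict_proba, batch_size):
            labeled[line] = oracle(line)  # a human would label the line here
            remaining.remove(line)
    return labeled


# Invented example: violations are lines that use an untyped #define.
seed = {"#define MAX 10": 1, "const int kMax = 10;": 0}
pool = ["#define MIN 0", "const int kMin = 0;", "int x = f();"]
result = active_learning(seed, pool, oracle=lambda l: int(l.startswith("#define")))
print(list(result.items()))
```

The design choice worth noting is that the person labeling only ever sees the lines the current model finds ambiguous, rather than having to search the codebase for informative examples manually.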
5.3.2 Cycle Execution and Results
5.4 Action Research Cycle 4 – How Much Training of CCFlex is Required to Reduce the Percentage of False-Positives?
5.4.1 Cycle Goal and Research Procedure
- 120 characters: line length must not exceed 120 characters (semantics-free, line properties).
- Braces in compound statements: braces must be used for all compound statements (semantics, multi-line context). This rule helps to control the readability of the code and thus minimize programming mistakes.
- Do not use variants: software units must not have variants at build-time (semantics-free, keyword). This rule helps to ensure that #ifdef pre-processor statements are used sparingly, to minimize the need for understanding which code is compiled during each build.
- Named constants: named constants must be used (semantics, uni-line context). Instead of the untyped #define pre-processor directive, this rule helps to enforce the use of constants, which are typed.
- One statement per line: only one statement per line of code is allowed (semantics, uni-line context). This rule helps to enforce the simplicity of the code and reduce the cognitive burden when reading the code.
- Use Enum classes: C++11 Enum classes must be used instead of traditional enum types (semantics-free, keyword). Instead of enum types, the code should use enum classes, which can enforce constructors, typing, and destructors.
- Use constants instead of macros: C++ constructs must be used instead of pre-processor macros (semantics-free, keyword).
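The semantics-free rules in this list (line properties and keywords) depend only on the text of a single line, which is what makes them comparatively easy to recognize. The sketch below expresses four of them as hand-written checks; it is a rule-based illustration of what the guidelines ask for, not the example-based learning that CCFlex performs. The regular expressions and the function name are assumptions derived from the rule descriptions above.

```python
import re

MAX_LINE_LENGTH = 120
# Assumed patterns; the exact wording of the company rules is not public.
IFDEF_RE = re.compile(r"^\s*#\s*ifdef\b")
DEFINE_RE = re.compile(r"^\s*#\s*define\b")
# Matches "enum Name" but not "enum class Name" or "enum struct Name".
PLAIN_ENUM_RE = re.compile(r"\benum\b(?!\s+(class|struct)\b)")


def check_line(line):
    """Return the names of the semantics-free rules that the line violates.

    The keyword checks are deliberately naive: they also fire when the
    keyword appears inside a comment or a string literal.
    """
    violations = []
    if len(line.rstrip("\n")) > MAX_LINE_LENGTH:
        violations.append("120 characters")
    if IFDEF_RE.search(line):
        violations.append("Do not use variants")
    if DEFINE_RE.search(line):
        violations.append("Use constants instead of macros")
    if PLAIN_ENUM_RE.search(line):
        violations.append("Use Enum classes")
    return violations


for sample in ["#ifdef FEATURE_X", "enum Color { Red };",
               "enum class Color { Red };", "  #define MAX 10"]:
    print(repr(sample), check_line(sample))
```

Checks of this kind could, for instance, be used to pre-label candidate lines before a person reviews them, but they do not cover the rules that need semantic or multi-line context.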
5.4.2 Cycle Execution and Results
enum or comment).
5.5 Summary of the Results
6 Validity Evaluation
isECUPresent() or (ii) isEcuPresent(). Consulting the practitioners led to diverse opinions. For these cases, we chose to include the second version as correct. For all cases like this, we consulted the practitioners and discussed them in our research team to minimize the researcher bias.
7 Conclusions
- we were able to train the ML-based tool using at most around 700 SLOC to achieve an average F-score of 0.78. Although we obtained a high Recall (0.97 or higher) for all of the rules (often by using only 300 SLOC), it usually came at the cost of high false-positive rates (Precision ranged from 0.21 to 1.00 depending on the rule). The best results were obtained for the rules that require understanding the context of a single line (semantic uni-line context, semantics-free line properties, and keywords), while the rules that require understanding the context of multiple lines were far more difficult to train.
- the ML-based tool was able to recognize code guidelines violations by using features extracted directly from the text (e.g., frequencies of tokens) without the need for parsing or compiling the code,
- we observed that the best strategy for training the tool to recognize violations of company-specific guidelines was to start with the examples provided in the companies’ code guidebook and then use Active Learning to poll lines from a sample of the codebase to label.
- we have learned that, in comparison to static-code-analysis tools (which require source code modification), ML-based code analysis tools bring a new maintenance challenge: maintaining the examples in the training codebase.