It is difficult to write consistent, high-quality code when using libraries and SDKs from multiple sources and when development is distributed across several teams and time zones. Both new and experienced developers face many challenges, including missing documentation, insufficient unit test coverage, and nuances specific to each platform or SDK. Developers of one platform often need to understand complicated legacy code written for an unfamiliar platform, sometimes in a language they do not know well. It is estimated that up to 60-80% of a programmer’s time is spent maintaining a system, and that 50% of that maintenance effort is spent understanding the program (http://www.bauhaus-stuttgart.de/bauhaus/index-english.html).
It is helpful for developers to have tools that can analyze different codebases quickly. A fairly comprehensive list of static code analysis tools for each platform is available at http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis. Since an in-depth comparison of the multitude of analyzer tools is beyond the scope of this article, all figures and analysis here were produced with Understand by SciTools, a typical, albeit premium, source code analysis tool.
Understand by SciTools
Understand by SciTools can scan Ada, COBOL, C/C++, C#, Fortran, Objective-C, Java, Jovial, Pascal, PL/M, Python, and other languages. Like many multi-language analyzers it is not free, but the benefits of such a program are enormous. (For the purposes of this demonstration, a deprecated and unused codebase was analyzed.)
After the source code files have been parsed, you’ll see a multi-windowed view like the following:
Figure 1. Parsing of project sources
Figure 2. IDE source windows.
Pros of Source Code Analyzer Tools
Seeing how a variable or method is used (or not used) in the project, which methods call it, and which methods it calls is a quick way to get reference information. This can also be done quite easily in the IDEs of the respective languages, such as Xcode, Eclipse, and IntelliJ. What sets premium multi-language source code analysis tools apart from IDEs is the ability to see graphically how the source code is structured and to run metrics on the codebase as a whole. In Figure 3, for example, we notice that the androidTestApp has 113 references to the androidAdAgent library, while the flurryAdTestApp has only about 25 dependencies on it. Further analysis of this project reveals that the flurryAdTestApp is a generic sample project for testing ad functionality, while the androidTestApp is a more universal testing application. Knowing these internal dependencies has other benefits as well: for example, knowing how complex a dependency is makes it easier to estimate how much QA is required if the code is refactored.
Figure 3. Internal architecture complexity
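As a rough illustration of the kind of reference count discussed above, the sketch below counts how many Java import statements in one project point at a given library package. The package names and sample sources are hypothetical, and matching import lines is only an approximation: a commercial analyzer resolves references at the symbol level rather than per import.

```python
import re
from collections import Counter

# Matches Java import lines such as "import com.example.adagent.AdView;"
# (including static and wildcard imports).
IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.*]+)\s*;", re.MULTILINE)


def count_references(sources, library_packages):
    """Count import statements in `sources` that target each library package.

    sources: mapping of file name -> Java source text
    library_packages: mapping of library name -> package prefix
    """
    counts = Counter()
    for text in sources.values():
        for imported in IMPORT_RE.findall(text):
            for library, prefix in library_packages.items():
                if imported == prefix or imported.startswith(prefix + "."):
                    counts[library] += 1
    return counts


# Hypothetical sample sources; not taken from the analyzed codebase.
SAMPLE_SOURCES = {
    "MainActivity.java": """
        package com.example.testapp;
        import com.example.adagent.AdView;
        import com.example.adagent.AdListener;
        import java.util.List;
    """,
    "Settings.java": """
        package com.example.testapp;
        import static com.example.adagent.AdConfig.DEFAULT;
    """,
}

if __name__ == "__main__":
    result = count_references(SAMPLE_SOURCES, {"adAgent": "com.example.adagent"})
    print(dict(result))
```

Even a crude count like this gives a first impression of how tightly an application is coupled to a library before any refactoring starts.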
Figure 4. UML class diagram
Figure 4 shows the overall UML class structure of the project. This is particularly useful if you need to refactor or re-engineer your codebase toward specific design patterns. Some of these analyzer tools allow you to directly manipulate the UML diagrams, and the underlying code structure changes as a result.
Figure 5. Unused variables and parameters
The ability to see unused variables and parameters is particularly useful in reducing code bloat and keeping the codebase lean.
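To make the idea concrete, here is a minimal sketch of an unused-variable check. It is written in Python and inspects Python source through the standard `ast` module, purely for illustration; it is not how Understand works internally, and the `find_unused_names` helper and the sample function are hypothetical. It reports every parameter or local name that is assigned but never read inside a function.

```python
import ast


def find_unused_names(source):
    """Return (function, name) pairs for parameters and locals never read."""
    unused = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Positional parameters count as assigned on entry.
        assigned = {arg.arg for arg in func.args.args}
        read = set()
        for node in ast.walk(func):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Store):
                    assigned.add(node.id)
                elif isinstance(node.ctx, ast.Load):
                    read.add(node.id)
        for name in sorted(assigned - read):
            unused.append((func.name, name))
    return unused


# Hypothetical sample: "debug" and "scale" are never read.
SAMPLE = """
def area(width, height, debug):
    scale = 2
    return width * height
"""

if __name__ == "__main__":
    for function, name in find_unused_names(SAMPLE):
        print(f"{function}: '{name}' is never used")
```

A production tool would also handle nested scopes, globals, and keyword-only parameters, which this sketch deliberately ignores.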
Figure 6. Check code for various coding standards.
Quite a few of these premium analyzer tools include code checks that notify you of overly complex code (big-O complexity, number of lines, cyclomatic complexity, and unused code paths). Overly complex programs are difficult to comprehend and have many possible paths, which makes them difficult to test and validate. Most analyzer tools also let you record or program your own macros for code validation, for example to detect:
– Improper use of .equals() and .hashCode()
– Unsafe casts
– When something will always be null
– Possible StackOverflows
– Possible ignored exceptions
– Cyclomatic Complexity (modified, strict)
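As an example of what such a check computes, the sketch below approximates cyclomatic complexity for each function in a piece of Python source: one plus the number of decision points. It is an illustrative approximation built on Python's standard `ast` module, not the metric definition used by any particular tool. Because it also counts boolean operators, it is closer to what is usually called the strict variant; the sample function is hypothetical.

```python
import ast

# Node types treated as decision points in this sketch.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate complexity per function: 1 + number of decision points."""
    results = {}
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            complexity = 1
            for node in ast.walk(func):
                if isinstance(node, BRANCH_NODES):
                    complexity += 1
                elif isinstance(node, ast.BoolOp):
                    # "a and b and c" contributes two extra paths.
                    complexity += len(node.values) - 1
            results[func.name] = complexity
    return results


# Hypothetical sample with two ifs, one for loop, and one boolean operator.
SAMPLE = """
def classify(n, strict):
    if n < 0 and strict:
        return "invalid"
    for i in range(n):
        if i % 2 == 0:
            continue
    return "ok"
"""

if __name__ == "__main__":
    for name, score in cyclomatic_complexity(SAMPLE).items():
        print(f"{name}: {score}")
```

A team could compare such scores against a threshold of its own choosing to flag functions that deserve a closer look in code review.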
Cons of Source Code Analyzer Tools
One drawback of such source code analysis tools is that you have to configure each project so that the tool finds all relevant sources for its project type. A project that is not properly configured can produce too much or too little useful detail. Source code that is heavily modularized across libraries can also be difficult to analyze. In addition, many available tools are quite one-dimensional: they may check coding style but cannot provide detailed analysis of code complexity or algorithmic complexity. One open source tool that shows promise is Sonar (http://www.sonarsource.org), but its big drawback is that it requires a web server and a database. Another possible issue is that some code analyzers analyze the bytecode while others analyze the source. Whatever source analyzer tool is chosen, it may not be comprehensive enough for the organization’s needs.
At Flurry, we have multiple SDK codebases: Objective-C, Java (Android), BlackBerry 10, Windows Phone 7/8, Windows 8, and HTML5. Each additional analyzer tool comes with a significant learning curve. We try to keep the number of tools to a minimum while still getting coverage of the different codebases.
Source Code Analysis As Part of the Development Environment
There are many programs that analyze source code, but only a few support multiple languages and are open source. The most popular ones for Java are PMD, FindBugs, and Checkstyle. However, a tool that is simple, multi-language, and open source would be ideal.
The ability to easily see the UML structure of a program, understand its code complexity, and see all its dependencies with the click of a button can easily replace invalid comments and outdated documentation. A good source analyzer tool is only one part of the toolbox that should be available to the developer. Unit tests, code reviews, and pair programming should always have priority, but a source code analysis tool can definitely help developers (both new and experienced) keep track of the large codebases they work on.
Author: Richard Brett