Exploration And Visualization Of Software Architecture
Working with large, badly documented or even partly unknown code bases can be less painful for developers and architects if they can obtain knowledge about the de-facto architecture of the system. jQAssistant therefore provides very flexible capabilities to perform individual exploration and visualization of software structures.
The term „evolved structures“ often describes, with or without an ironic undertone, the state of Java applications that have been developed over several years in an enterprise context and have seen many developers come and go. People who maintain such projects keep facing the same questions: Which fundamental structures exist in the code base, how are they related to each other, and to what degree does this information match the architecture documentation (if it exists) or the mental model in the developers' heads (which hopefully exists)?
jQAssistant can scan existing software structures and store them in a graph database. Queries make it possible to enrich the data with so-called „concepts“ (e.g. abstractions like „Module“) and to create textual or graphical reports which provide insight into the „status quo“ of an application. The following example shows how this approach works in practice.
Netflix provides many of its solutions under open source licenses. One of these is „Eureka“, a discovery service for microservices. The server can be obtained as a WAR artifact from Maven Central (groupId: com.netflix.eureka, artifactId: eureka-server) and will be analysed using the jQAssistant command line distribution, which is available as a ZIP distribution from this web site. After unpacking it, the WAR file can be scanned using the following command:
jqassistant.distribution-1.1.3/bin/jqassistant.sh scan -f eureka-server-1.4.2.war
As a result, the folder structure „jqassistant/store“, which contains the database, is created within the current working directory. Now the integrated Neo4j server can be started:
Artifacts And Their Relations
The first question to answer is which files are contained within the WAR file:
MATCH (:Web:Archive)-[:CONTAINS]->(file:File) RETURN file.fileName as fileName ORDER BY fileName
By exploring the result it can be observed that the archive does not contain any class file directly but a lot of JAR files. It would be interesting to see which dependencies exist between them. This information is not directly available from the gathered data but can be enriched by applying concepts, i.e. executing pre-defined queries.
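The observation above can be cross-checked outside of jQAssistant. The following is a minimal sketch in Python (not part of the original tool chain) which lists JAR and class entries of a WAR file using the standard `zipfile` module. The function name `summarize_war` and the demo entry names are made up for illustration.

```python
import io
import zipfile


def summarize_war(war):
    """Return (jar_names, class_names) for a WAR given as path or file object."""
    with zipfile.ZipFile(war) as archive:
        names = archive.namelist()
    jars = sorted(n for n in names if n.endswith(".jar"))
    classes = sorted(n for n in names if n.endswith(".class"))
    return jars, classes


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory; the entry names are hypothetical.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as demo:
        demo.writestr("WEB-INF/web.xml", "<web-app/>")
        demo.writestr("WEB-INF/lib/example-a.jar", b"")
        demo.writestr("WEB-INF/lib/example-b.jar", b"")
    buffer.seek(0)
    jars, classes = summarize_war(buffer)
    print(len(jars), "JAR files,", len(classes), "class files")
```

Passing the path of the downloaded WAR file instead of the in-memory buffer should give the same kind of summary for the real archive.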
Concepts are provided in XML or Asciidoc files and the latter will be used for this example. Therefore a file „jqassistant/rules/eureka.adoc“ (based on the current working directory) with the following content must be created:
= Eureka Example

== Artifact Dependencies

[[artifactDependencies]]
.Propagates dependencies on type level to the artifact level.
[source,cypher,role=concept,requiresConcepts="classpath:ResolveDependency"]
----
MATCH
  (a1:Artifact)-[:CONTAINS]->(t1:Type),
  (a2:Artifact)-[:CONTAINS]->(t2:Type),
  (t1)-[:DEPENDS_ON]->(t2)
WHERE
  a1 <> a2
MERGE
  (a1)-[:DEPENDS_ON]->(a2)
RETURN
  a1, collect(a2)
----
Asciidoc is an easy-to-learn but at the same time very powerful markup language for creating HTML or PDF documents. It is used by jQAssistant to embed executable rules, e.g. concepts or constraints, in developer documentation. In this case it is a Cypher source code block which provides a unique ID, a short description and meta-information.
It describes a concept „artifactDependencies“ which creates a relation DEPENDS_ON between two artifacts a1 and a2 (e.g. JAR files) if a1 contains a Java type that depends on another Java type which is contained in a2. The concept itself is based on another one called „classpath:ResolveDependency“ which is provided by the Java plugin of jQAssistant and which propagates DEPENDS_ON relations between types which are located in different artifacts.
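The propagation logic of this concept can be illustrated independently of the graph database. The following Python sketch works on plain dictionaries instead of Neo4j nodes. The function name `artifact_dependencies` and all artifact and type names are hypothetical and only serve the illustration. It is not how jQAssistant implements the concept.

```python
from collections import defaultdict


def artifact_dependencies(contains, type_deps):
    """Lift type-level dependencies to the artifact level.

    contains:  maps artifact name -> set of type names it contains
    type_deps: iterable of (t1, t2) pairs meaning "t1 depends on t2"
    """
    # Invert the CONTAINS relation: type name -> artifacts containing it.
    owners = defaultdict(set)
    for artifact, types in contains.items():
        for type_name in types:
            owners[type_name].add(artifact)

    result = defaultdict(set)
    for t1, t2 in type_deps:
        for a1 in owners.get(t1, ()):
            for a2 in owners.get(t2, ()):
                # Mirrors "WHERE a1 <> a2": ignore dependencies within one artifact.
                if a1 != a2:
                    # Adding to a set avoids duplicates, similar to MERGE.
                    result[a1].add(a2)
    return dict(result)


if __name__ == "__main__":
    # Hypothetical sample data
    contains = {"app.jar": {"app.Main"}, "util.jar": {"util.Strings"}}
    type_deps = [("app.Main", "util.Strings")]
    print(artifact_dependencies(contains, type_deps))
```

Types that are not contained in any known artifact are skipped, just as the Cypher pattern only matches types reachable via CONTAINS.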
If the server is still running it must be stopped by pressing <Enter>. The concepts can now be applied by executing the analyze task:
jqassistant.distribution-1.1.3/bin/jqassistant.sh analyze -concepts artifactDependencies
After restarting the server the new relations can be determined using the following query:
MATCH (a1:Artifact)-[:DEPENDS_ON]->(a2:Artifact) RETURN a1.fileName, collect(a2.fileName)
The presented result is complete and correct but in most cases a graphical representation would be much more desirable. Therefore jQAssistant provides a GraphML report plugin which exports the result of a concept (i.e. of an executed query) as a GraphML file which can be visualized by other tools like yEd or Gephi.
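To give an impression of the target format, the following Python sketch writes a directed graph as a minimal GraphML document using only the standard library. It is an illustration under simplifying assumptions: the function name `to_graphml` is made up, and the files created by the jQAssistant plugin are expected to carry more information (for example labels and tool-specific attributes) than this bare structure.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize node ids and (source, target) pairs as a GraphML string."""
    # Emit the GraphML namespace as the default namespace of the document.
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(f"{{{GRAPHML_NS}}}graphml")
    graph = ET.SubElement(
        root, f"{{{GRAPHML_NS}}}graph", id="G", edgedefault="directed"
    )
    for node in nodes:
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", id=node)
    for index, (source, target) in enumerate(edges):
        ET.SubElement(
            graph,
            f"{{{GRAPHML_NS}}}edge",
            id=f"e{index}",
            source=source,
            target=target,
        )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical artifact names
    print(to_graphml(["app.jar", "util.jar"], [("app.jar", "util.jar")]))
```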
More Graph Please!
The following concept is added to the end of the Asciidoc file:
[[artifactDependencies.graphml]]
.Creates a GraphML report for artifact dependencies
[source,cypher,role=concept,requiresConcepts="artifactDependencies"]
----
MATCH
  (:Web:Archive)-[:CONTAINS]->(artifact:Artifact)
OPTIONAL MATCH // equivalent to left-outer-join in SQL
  (artifact)-[dependsOn:DEPENDS_ON]->(:Artifact)
RETURN
  artifact, dependsOn
----
The suffix „.graphml“ of the ID „artifactDependencies.graphml“ is detected by the GraphML report plugin and interpreted such that the result of the query shall be exported. The execution of
jqassistant.distribution-1.1.3/bin/jqassistant.sh analyze -concepts artifactDependencies.graphml
creates a file „jqassistant/report/artifactDependencies.graphml“ which can be opened with yEd. After applying a hierarchical layout (Layout → Hierarchical) the following visualization is created:
Zooming in reveals more details:
The graph is quite extensive and therefore a bit unwieldy. But it can already be observed that artifacts with lots of outgoing dependencies are rendered on top (e.g. Netflix artifacts like eureka-core or netflix-eventbus) and those with mostly incoming dependencies are located at the bottom (mainly libraries and APIs like stax-api or http-core).
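Why artifacts end up at the top or at the bottom can be approximated by a simple longest-path layering. The following Python sketch is a simplification and not yEd's actual layout algorithm. It assumes an acyclic dependency graph, and the function name `layers` as well as the artifact names are hypothetical.

```python
def layers(dependencies):
    """Assign a layer to every node of an acyclic dependency graph.

    Nodes without outgoing dependencies get layer 0 (bottom); every other
    node is placed one layer above its highest dependency.
    Raises ValueError if the graph contains a cycle.
    """
    nodes = set(dependencies)
    for targets in dependencies.values():
        nodes.update(targets)

    layer = {}
    in_progress = set()

    def visit(node):
        if node in layer:
            return layer[node]
        if node in in_progress:
            raise ValueError(f"cycle detected at {node}")
        in_progress.add(node)
        targets = dependencies.get(node, ())
        layer[node] = 1 + max(map(visit, targets)) if targets else 0
        in_progress.discard(node)
        return layer[node]

    for node in sorted(nodes):
        visit(node)
    return layer


if __name__ == "__main__":
    # Hypothetical artifacts: app depends on util and api, util depends on api.
    demo = {"app.jar": {"util.jar", "api.jar"}, "util.jar": {"api.jar"}}
    for name, level in sorted(layers(demo).items(), key=lambda item: -item[1]):
        print(level, name)
```

In this model pure libraries and APIs receive layer 0, while artifacts that depend on long chains of other artifacts receive the highest layers.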
Sometimes Less Is More
The latter usually play a minor role for a first analysis – „own“ artifacts and their dependencies are of much more significance. In this context usually the terms „internal“ and „external“ dependencies are used, e.g. in Maven projects. But how can these be distinguished?
In this example the Eureka server of Netflix is the subject under investigation. It seems likely that internal artifacts provide package structures with appropriate naming schemes. In fact it can be observed that executing the query
MATCH (:Web:Archive)-[:CONTAINS]->(artifact)-[:CONTAINS]->(package:Package) RETURN artifact.fileName, collect(package.fqn)
in the Neo4j server returns package names with the prefix „com.netflix“ for Netflix-owned artifacts. Using that knowledge, two additional concepts may be added to eureka.adoc:
[[internalArtifact]]
.Labels all artifacts containing the package "com.netflix" as "Internal".
[source,cypher,role=concept]
----
MATCH
  (:Web:Archive)-[:CONTAINS]->(artifact:Artifact)-[:CONTAINS]->(package:Package)
WHERE
  package.fqn = "com.netflix"
SET
  artifact:Internal
RETURN
  artifact
----

[[internalArtifactDependencies.graphml]]
.Creates a GraphML report for internal artifact dependencies.
[source,cypher,role=concept,requiresConcepts="artifactDependencies,internalArtifact"]
----
MATCH
  (artifact:Artifact:Internal)
OPTIONAL MATCH
  (artifact)-[dependsOn:DEPENDS_ON]->(:Artifact:Internal)
RETURN
  artifact, dependsOn
----
The concept „internalArtifact“ adds a label „Internal“ to all artifacts containing a package „com.netflix“. It forms the base for the concept „internalArtifactDependencies.graphml“ which uses that label for creating the filtered GraphML report. This can be triggered by executing the following command:
jqassistant.distribution-1.1.3/bin/jqassistant.sh analyze -concepts internalArtifactDependencies.graphml
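The effect of the two concepts, labeling followed by filtering, can be mimicked on plain data structures. The following Python sketch is only an illustration: the function names `internal_artifacts` and `internal_dependencies` and all artifact names are hypothetical, and the real work is done by the Cypher queries.

```python
def internal_artifacts(packages_by_artifact, marker="com.netflix"):
    """Return the artifacts containing a package whose name equals `marker`."""
    return {
        artifact
        for artifact, packages in packages_by_artifact.items()
        if marker in packages
    }


def internal_dependencies(dependencies, internal):
    """Keep internal artifacts and only their dependencies on internal ones.

    Internal artifacts without internal dependencies stay in the result with
    an empty set, comparable to the effect of OPTIONAL MATCH.
    """
    return {
        artifact: {t for t in dependencies.get(artifact, ()) if t in internal}
        for artifact in sorted(internal)
    }


if __name__ == "__main__":
    # Hypothetical sample data
    packages = {
        "own-core.jar": {"com", "com.netflix", "com.netflix.core"},
        "own-client.jar": {"com", "com.netflix"},
        "lib.jar": {"org", "org.example"},
    }
    dependencies = {
        "own-core.jar": {"own-client.jar", "lib.jar"},
        "own-client.jar": {"lib.jar"},
    }
    internal = internal_artifacts(packages)
    print(internal_dependencies(dependencies, internal))
```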
After opening the created file „jqassistant/report/internalArtifactDependencies.graphml“ in yEd and applying the hierarchical layout, the following image, which provides a useful architectural overview, can be seen:
jQAssistant allows scanning artifacts, inspecting their structures with queries and creating textual or graphical reports as needed. Users are free to enrich the data and filter the output according to their individual needs (e.g. internal and external artifacts). By creating concepts, these steps can be automated and even integrated into a build process. This allows continuous reporting on the current state of the architecture.
The approach has been demonstrated on the level of artifacts, and this is only the tip of the iceberg. Further analysis scenarios include detecting cross-cutting concerns (e.g. logging, injection), finding out which technologies are actually used (e.g. JPA provider, REST frameworks) or estimating the impact of changing the version of a library. If needed, the scope of the queries can be narrowed down to other levels like packages or classes. Examples might be reports about actual dependencies between Spring components (e.g. controllers, services and repositories) or about which parts of a large library (e.g. Google Guava) are actually used. There are hardly any limits to the imagination!