Detecting differences between JAR files
30 April 2015
Sometimes it’s necessary to switch to a newer version of a library, and it would be great to get an idea of what has actually changed in the new release. The following approach was demonstrated for the SLF4J API during the talk “Yes We Scan – Software Analysis using jQAssistant” at JavaLand 2015.
Scan the JAR files
We’ll use the jQAssistant distribution as it is available from the web site. After unzipping we can scan the JAR files which we want to compare and start the server:
jqassistant.cmd scan -f slf4j-api-1.6.6.jar
jqassistant.cmd scan -f slf4j-api-1.7.7.jar
jqassistant.cmd server
The Neo4j browser is now available at http://localhost:7474 and we can use it for our analysis:
Which types have changed?
The following query lists all types which are contained in the scanned artifacts:
match (artifact:Artifact)-[:CONTAINS]->(type:Type)
return artifact.fileName as Artifact, collect(type.fqn) as Types
We could compare the results manually to find out which types have been added or removed between the two JAR files, but let’s use the database to get this information.
First we’re going to create a “MATCHES” relation between all types with the same fully qualified name (fqn):
match (:Artifact)-[:CONTAINS]->(type:Type)
with type, type.fqn as fqn
match (matchingType:Type)
where type <> matchingType and matchingType.fqn = fqn
create unique (type)-[:MATCHES]->(matchingType)
return type, matchingType
Now it’s easy to find types without a matching type, i.e. those which have been added or removed:
match (artifact:Artifact)-[:CONTAINS]->(type:Type)
where not (type)-[:MATCHES]-()
return artifact.fileName as Artifact, type.fqn as Type
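To tell additions from removals at a glance, the artifact name can be mapped to a label. The following is a minimal sketch, assuming that the fileName property ends with the JAR names used in the scan step (it might also carry a leading path, hence the regular expressions). The column name Change is made up for illustration.

```cypher
// Sketch: label unmatched types as "added" or "removed".
// Assumption: fileName ends with the JAR names passed to "scan" above.
match (artifact:Artifact)-[:CONTAINS]->(type:Type)
where not (type)-[:MATCHES]-()
return
  case
    // only present in the old release: the type was removed
    when artifact.fileName =~ ".*slf4j-api-1\\.6\\.6\\.jar" then "removed"
    // only present in the new release: the type was added
    when artifact.fileName =~ ".*slf4j-api-1\\.7\\.7\\.jar" then "added"
    else "unknown"
  end as Change,
  type.fqn as Type
order by Change, Type
```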
There’s exactly one result: the class “org.slf4j.helpers.SubstituteLogger” from “slf4j-api-1.7.7.jar” has no equivalent in “slf4j-api-1.6.6.jar”.
We can also compare the MD5 checksums of matching class files to see if there are changes:
match (type:Type)-[:MATCHES]->(matchingType:Type)
where not type.md5 = matchingType.md5
return type.fqn as ChangedType, type.md5, matchingType.md5
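Note that the query creating the type-level “MATCHES” relations runs once for every contained type, so the relation presumably exists in both directions and each changed type may show up twice. The following sketch fixes the direction from the older to the newer artifact. It assumes the same file names as in the scan step; the identifiers oldArtifact and newArtifact and the column names are chosen for illustration.

```cypher
// Sketch: report each changed type once, old release on the left.
// Assumption: fileName ends with the JAR names passed to "scan" above.
match (oldArtifact:Artifact)-[:CONTAINS]->(type:Type)-[:MATCHES]->(matchingType:Type),
      (newArtifact:Artifact)-[:CONTAINS]->(matchingType)
where oldArtifact.fileName =~ ".*slf4j-api-1\\.6\\.6\\.jar"
  and newArtifact.fileName =~ ".*slf4j-api-1\\.7\\.7\\.jar"
  // differing checksums indicate that the class file has changed
  and type.md5 <> matchingType.md5
return type.fqn as ChangedType, type.md5 as OldMd5, matchingType.md5 as NewMd5
order by ChangedType
```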
Are there changes on method level?
We take the same approach as on the type level to identify matching methods: we create a “MATCHES” relation between two methods if they are declared by matching types and have the same signature:
match (type:Type)-[:MATCHES]->(matchingType:Type),
      (type)-[:DECLARES]->(method:Method),
      (matchingType)-[:DECLARES]->(matchingMethod:Method)
where method.signature=matchingMethod.signature
create unique (method)-[:MATCHES]->(matchingMethod)
return type, method, matchingType, matchingMethod
The following query now returns all methods without a “MATCHES” relation, i.e. those which have been added or removed:
match (artifact:Artifact)-[:CONTAINS]->(type:Type)-[:DECLARES]->(method:Method)
where not (method)-[:MATCHES]-()
return artifact.fileName as Artifact, type.fqn as Type, collect(method.signature) as Methods
order by Type, Artifact
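Methods declared by types that exist in only one JAR (such as SubstituteLogger) have no counterpart either, so they would presumably appear in this list as well. If only types present in both releases are of interest, one possible refinement is to require a “MATCHES” relation on the declaring type. A sketch built from the labels and properties used above:

```cypher
// Sketch: unmatched methods, restricted to types that exist in both JARs.
match (artifact:Artifact)-[:CONTAINS]->(type:Type)-[:DECLARES]->(method:Method)
where (type)-[:MATCHES]-()            // the declaring type has a counterpart
  and not (method)-[:MATCHES]-()      // but the method itself does not
return artifact.fileName as Artifact, type.fqn as Type, collect(method.signature) as Methods
order by Type, Artifact
```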
This is the result: [screenshot of the query result in the original post]
Wrap up
The approach is surprisingly simple:
- Scan the different releases of the same library into the database
- Execute queries to create “MATCHES” relations between the elements that are going to be compared
- Execute a second set of queries which find all elements without a matching equivalent
You can adjust these queries to your needs. For example, you could restrict the analysis to interfaces or to public methods by adding the required conditions:
- match (:Artifact)-[:CONTAINS]->(type:Type:Interface) …
- … where method.visibility="public" and not (method)-[:MATCHES]-() …
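Put together, the two restrictions could look as follows in the method-level query. This is only one way to fill in the fragments above, a sketch that reuses the return clause of the earlier query:

```cypher
// Sketch: added or removed public methods of interfaces only.
match (artifact:Artifact)-[:CONTAINS]->(type:Type:Interface)-[:DECLARES]->(method:Method)
where method.visibility = "public"
  and not (method)-[:MATCHES]-()
return artifact.fileName as Artifact, type.fqn as Type, collect(method.signature) as Methods
order by Type, Artifact
```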
If you need to run this kind of comparison on a regular basis, it makes sense to put these queries into rule files (XML or AsciiDoc) as concepts. You can then execute them using “jqassistant.sh analyze -concepts=…” and generate an HTML report with “jqassistant.sh report”.