Detecting differences between JAR files

Sometimes it’s necessary to switch to a newer library version, and it would be great to get an idea of what has actually changed in the new release. The following approach was demonstrated for the SLF4J API during the talk "Yes We Scan – Software Analysis using jQAssistant" at JavaLand 2015.

Scan the JAR files

We’ll use the jQAssistant distribution as it is available from the web site. After unzipping we can scan the JAR files which we want to compare and start the server:

jqassistant.cmd scan -f slf4j-api-1.6.6.jar
jqassistant.cmd scan -f slf4j-api-1.7.7.jar
jqassistant.cmd server

The Neo4j browser is now available at http://localhost:7474 and we can use it for our analysis.

Which types have changed?

The following query lists all types which are contained in the scanned artifacts:

match
  (artifact:Artifact)-[:CONTAINS]->(type:Type)
return
  artifact.fileName as Artifact, collect(type.fqn) as Types

Slf4j - Types Per Artifact

We could compare the results manually to find out which types have been added or removed between the two JAR files, but let’s use the database to get this information.

First we’re going to create a "MATCHES" relation between all types with the same fully qualified name (fqn):

match
  (:Artifact)-[:CONTAINS]->(type:Type)
with
  type, type.fqn as fqn
match
  (matchingType:Type)
where
  type <> matchingType
  and matchingType.fqn = fqn
merge
  (type)-[:MATCHES]->(matchingType)
return
  type, matchingType

Slf4j - Matching Types

Now it’s easy to find types without a matching type, i.e. those which have been added or removed:

match
  (artifact:Artifact)-[:CONTAINS]->(type:Type)
where
  not (type)-[:MATCHES]-()
return
  artifact.fileName as Artifact, type.fqn as Type

There’s exactly one result: the class "org.slf4j.helpers.SubstituteLogger" from "slf4j-api-1.7.7.jar" has no equivalent in "slf4j-api-1.6.6.jar".

We can also compare the MD5 checksums of matching class files to see if there are changes:

match
  (type:Type)-[:MATCHES]->(matchingType:Type)
where
  type.md5 <> matchingType.md5
return
  type.fqn as ChangedType, type.md5, matchingType.md5

Slf4j - MD5 comparison
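
Because the "MATCHES" relations are created from every type to its counterpart, they exist in both directions, so each changed pair is reported twice. The following variant is a minimal sketch that reports every changed type only once. It assumes that both JAR files were scanned from the same directory, so that comparing the file names puts the older release first; the aliases OldMd5 and NewMd5 are chosen for illustration only.

```cypher
// Sketch: one row per changed type, older release first.
// Assumption: the artifact of the older release has the lexicographically
// smaller fileName (true for slf4j-api-1.6.6.jar vs. slf4j-api-1.7.7.jar
// when both files are located in the same directory).
match
  (oldArtifact:Artifact)-[:CONTAINS]->(type:Type)-[:MATCHES]->(matchingType:Type)<-[:CONTAINS]-(newArtifact:Artifact)
where
  oldArtifact.fileName < newArtifact.fileName
  and type.md5 <> matchingType.md5
return
  type.fqn as ChangedType, type.md5 as OldMd5, matchingType.md5 as NewMd5
order by
  ChangedType
```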

Are there changes on method level?

We take the same approach we used on type level to identify matching methods, i.e. we create a "MATCHES" relation between two methods if they are declared by matching types and have the same signature:

match
  (type:Type)-[:MATCHES]->(matchingType:Type),
  (type)-[:DECLARES]->(method:Method),
  (matchingType)-[:DECLARES]->(matchingMethod:Method)
where
  method.signature=matchingMethod.signature
merge
  (method)-[:MATCHES]->(matchingMethod)
return
  type, method, matchingType, matchingMethod

Slf4j - Matching Methods

The following query now returns all methods without a "MATCHES" relation, i.e. those which have been added or removed:

match
  (artifact:Artifact)-[:CONTAINS]->(type:Type)-[:DECLARES]->(method:Method)
where
  not (method)-[:MATCHES]-()
return
  artifact.fileName as Artifact, type.fqn as Type, collect(method.signature) as Methods
order by
  Type, Artifact

This is the result:

Slf4j - Method Changes

Wrap up

The approach is surprisingly simple:

  • Scan the different releases of the same library into the database
  • Execute queries to create "MATCHES" relations between the elements that are going to be compared
  • Execute a second set of queries which find all elements without a matching equivalent
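
The same two steps should carry over to other elements. As a rough sketch, this is how they might look for fields. It assumes that fields are stored as nodes labeled "Field" with a "signature" property, analogous to methods, and that the "MATCHES" relations between types already exist. Run the two statements one after the other.

```cypher
// Step 1 (sketch): link fields of matching types which have the same signature.
// Assumption: fields carry the label Field and a signature property.
match
  (type:Type)-[:MATCHES]->(matchingType:Type),
  (type)-[:DECLARES]->(field:Field),
  (matchingType)-[:DECLARES]->(matchingField:Field)
where
  field.signature = matchingField.signature
merge
  (field)-[:MATCHES]->(matchingField)
return
  type, field, matchingType, matchingField
```

```cypher
// Step 2 (sketch): list fields without a counterpart, i.e. added or removed ones.
match
  (artifact:Artifact)-[:CONTAINS]->(type:Type)-[:DECLARES]->(field:Field)
where
  not (field)-[:MATCHES]-()
return
  artifact.fileName as Artifact, type.fqn as Type, collect(field.signature) as Fields
order by
  Type, Artifact
```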

You can adjust these queries to your needs, e.g. restrict the analysis to interfaces or to public methods by adding the required conditions:

  • match (:Artifact)-[:CONTAINS]->(type:Type:Interface) …
  • … where method.visibility="public" and not (method)-[:MATCHES]-() …
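
Put together, the two conditions above could be combined with the method query as follows. This is a sketch which assumes that the "MATCHES" relations between methods have already been created:

```cypher
// Sketch: public methods of interfaces which have been added or removed.
match
  (artifact:Artifact)-[:CONTAINS]->(type:Type:Interface)-[:DECLARES]->(method:Method)
where
  method.visibility = "public"
  and not (method)-[:MATCHES]-()
return
  artifact.fileName as Artifact, type.fqn as Type, collect(method.signature) as Methods
order by
  Type, Artifact
```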

If you need to run this kind of comparison on a regular basis, it makes sense to put these queries as concepts into rule files (XML or AsciiDoc). This allows you to execute them using "jqassistant.sh analyze -concepts=…" and to get an HTML report with "jqassistant.sh report".

About the author:

@dirkmahler
