Every distribution is inspected to extract specific library versions and key dependencies.
This is helpful when you are tracking an open source security issue, need to know which version of a library is bundled to determine whether a feature is available, or want to minimize the dependency graph of your own Java software that uses the kafka-clients or kafka-streams libraries.
Additional cross-referencing with release notes and GitHub repositories is done to track differences between Confluent Community and Apache Kafka releases.
The goal of this effort is to make it easier to understand the dependencies of the various Kafka and Confluent Community distributions and to aid in a CVE audit.
Process
Most of the information pulled into these documents is based on a script like the following:
# Apache Kafka distributions: print the NOTICE entry, then the key library jars under libs/.
for i in $(ls -dr kafka_*.tgz); do
  gzcat "$i" | tar tfv - | grep "/NOTICE$"
  gzcat "$i" | tar tfv - | (echo "$i"; grep -E 'libs/scala-library|libs/rocksdbjni-|libs/zstd-jni|libs/lz4-java|libs/snappy-java|libs/jackson-core|libs/slf4j-api|libs/slf4j-log4j|libs/zookeeper-[23]' | cut -d\/ -f2-; echo "")
done
# Confluent Community distributions: print the README entry, then the key jars under kafka/ and schema-registry/.
for i in $(ls -dr confluent-community-*.tar); do
  gzcat "$i" | tar tfv - | grep "/README$"
  gzcat "$i" | tar tfv - | (echo "$i"; grep -E 'kafka/scala-library|kafka/rocksdbjni-|kafka/zstd-jni|kafka/lz4-java|kafka/snappy-java|kafka/jackson-core|kafka/slf4j-api|kafka/slf4j-log4j|kafka/zookeeper-[23]|schema-registry/avro|schema-registry/protobuf' | cut -d\/ -f2-; echo "")
done
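The loops above list jar paths, so the library name and version still have to be read out of each file name. As a minimal sketch, and not part of the script used for these documents, a hypothetical helper named jar_version could split a jar path at the first hyphen that is followed by a digit. It assumes the usual artifact-version.jar naming; the sample paths are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): print "artifact version"
# for a jar path, assuming the version begins at the first "-<digit>".
jar_version() {
  # basename drops the directory part and the .jar suffix;
  # sed replaces only the first "-<digit>" with " <digit>".
  basename "$1" .jar | sed -E 's/-([0-9])/ \1/'
}

# Illustrative paths in the shape produced by the listing above.
for p in libs/rocksdbjni-7.1.2.jar libs/zstd-jni-1.5.2-1.jar libs/slf4j-log4j12-1.7.36.jar; do
  jar_version "$p"
done
# prints:
# rocksdbjni 7.1.2
# zstd-jni 1.5.2-1
# slf4j-log4j12 1.7.36
```

Splitting at the first hyphen followed by a digit keeps hyphenated artifact names such as zstd-jni intact, and it leaves version suffixes such as the trailing -1 attached to the version. A jar whose artifact name itself contains a hyphen followed by a digit would be split too early, so the output is worth spot-checking.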
Additional effort goes into inspecting release notes and the Apache and Confluent GitHub repositories.
Transparency
These scripts are provided to show the process behind our documentation; if you find an issue, please let us know and we will update it accordingly.
