Cloudera and StreamNative have open-sourced a connector that integrates Apache NiFi with Apache Pulsar. The connector will ship with the Cloudera platform. Together, NiFi and Pulsar can form a cloud-native streaming data platform capable of ingesting, transforming, and analysing massive amounts of data.
Cloudera’s team includes some of the original Apache NiFi developers, and StreamNative was founded by the original creators of Apache Pulsar. NiFi grew out of the NSA’s “Niagara Files” technology and was made available to the Apache Software Foundation through the NSA Technology Transfer Program. It is a visual, flow-based programming tool for building data flows that move data between systems such as databases, cloud storage, and messaging systems.
NiFi provides data provenance and traceability at the event level, and the platform comes with over 100 pre-built processors. Apache Pulsar is a cloud-native, distributed messaging and streaming platform that was developed at Yahoo! and is now an Apache Software Foundation top-level project. For long-term stream storage, Pulsar relies on a replicated distributed ledger, provided by Apache BookKeeper.
The new connector lets Apache NiFi users consume messages from Pulsar topics and produce messages to them at scale using simple configuration settings. Once the data is stored in Pulsar, it can be made available to stream processing engines such as Flink or Spark.
The connector will be available on Cloudera beginning with CDF on the Public Cloud version 7.2.14. Developers who want to use the processors in other Apache NiFi clusters can download the files from the Maven Central repository or build them from the source code on GitHub.