Proton
This tutorial shows how to integrate Upstash Kafka with Proton.
Proton is a unified streaming SQL engine that handles both streaming and historical data processing in a single binary. It helps data engineers and platform engineers solve complex real-time analytics use cases, and it powers the Timeplus streaming analytics platform.
Both Timeplus and Proton can be integrated with Upstash Kafka. Timeplus provides an intuitive web UI that minimizes SQL typing and clicks. Proton provides a SQL interface for reading data from and writing data to Upstash.
Upstash Kafka Setup
Create a Kafka cluster using Upstash Console or Upstash CLI by following Getting Started.
Create two topics by following the creating topic steps. Name the first topic `input`, since we are going to stream from this topic to Proton. Name the second topic `output`; it will receive the stream from Proton.
Setup Proton
Proton is a single binary for Linux/Mac, and it is also available as a Docker image. You can install it in any of the following ways:
- Docker image: ghcr.io/timeplus-io/proton:latest
- Homebrew: brew tap timeplus-io/timeplus; brew install proton
- Install script: curl -sSf https://raw.githubusercontent.com/timeplus-io/proton/develop/install.sh | sh
- Binary download for Linux/Mac: https://github.com/timeplus-io/proton/releases/tag/v1.3.31
With Docker engine installed on your local machine, pull and run the latest version of the Proton Docker image.
Connect to your proton container and run the proton-client tool to connect to the local Proton server:
Create an External Stream to read Kafka data
An external stream is how Proton connects to a Kafka cluster to read and write data.
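As a rough illustration of what an external stream definition carries, the Python sketch below renders a `CREATE EXTERNAL STREAM` statement from a dictionary of connection settings. The helper `render_external_stream` is hypothetical, and so are the stream name `input` and its single `raw` column. The setting names (`type`, `brokers`, `topic`, `security_protocol`, `sasl_mechanism`, `username`, `password`) are assumptions about Proton's Kafka external stream options, so check them against the Proton documentation. The endpoint and credentials are placeholders to replace with the values from your Upstash console.

```python
def render_external_stream(name, columns, settings):
    """Render a CREATE EXTERNAL STREAM statement as a string.

    Hypothetical helper: it only formats text and never talks to Proton.
    """
    for value in settings.values():
        # Refuse values that would break out of the quoted SQL literal.
        if "'" in value:
            raise ValueError("setting values must not contain single quotes")
    cols = ",\n".join(f"  {col} {typ}" for col, typ in columns.items())
    opts = ",\n".join(f"  {key}='{value}'" for key, value in settings.items())
    return f"CREATE EXTERNAL STREAM {name}(\n{cols}\n)\nSETTINGS\n{opts};"


ddl = render_external_stream(
    "input",              # assumed stream name, matching the topic
    {"raw": "string"},    # assumed: one column holding the whole message
    {
        "type": "kafka",
        "brokers": "<your-endpoint>.upstash.io:9092",  # placeholder
        "topic": "input",
        "security_protocol": "SASL_SSL",
        "sasl_mechanism": "SCRAM-SHA-256",
        "username": "<username>",                      # placeholder
        "password": "<password>",                      # placeholder
    },
)
print(ddl)
```

The printed statement can be pasted into proton-client once the placeholders are filled in. Keeping the settings in one dictionary makes it easy to reuse the same connection details for a second stream on the `output` topic.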
Run Streaming SQL
Then you can run the following streaming SQL:
Go to the Upstash UI and post a JSON message to the `input` topic:
Right after the message is posted, you should be able to see it in the Proton query result.
Apply Streaming ETL and Write Data to Upstash Kafka
Cancel the previous streaming SQL and use the following one to mask the IP addresses.
You will see results as below:
To write the data back to Kafka, create a new external stream (with `output` as the topic name) and use a materialized view as a background job that continuously writes data to the output stream.
Go back to the Upstash UI. Post a few more messages to the `input` topic; they should appear in the `output` topic with the raw IP addresses masked.
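To make the masking step concrete, here is a minimal Python sketch of the same kind of transform applied to a single message. It is not the tutorial's SQL. It assumes the message is a JSON object with hypothetical fields `requestedUrl`, `method`, and `ipAddress`, and it assumes masking means replacing the IP address with its MD5 hex digest, which is one common choice.

```python
import hashlib
import json


def mask_ip(ip: str) -> str:
    """Return the lowercase MD5 hex digest of the IP address string."""
    return hashlib.md5(ip.encode("utf-8")).hexdigest()


def transform(raw: str) -> str:
    """Parse one raw JSON message and emit a JSON row with the IP masked.

    The input and output field names are assumptions for illustration.
    """
    event = json.loads(raw)
    row = {
        "url": event["requestedUrl"],
        "method": event["method"],
        "ip": mask_ip(event["ipAddress"]),
    }
    return json.dumps(row)


# Hypothetical sample message; 203.0.113.7 is a documentation-range address.
sample = json.dumps(
    {"requestedUrl": "http://example.com/a", "method": "PUT", "ipAddress": "203.0.113.7"}
)
print(transform(sample))
```

The output row keeps the URL and method and carries a 32-character hex string in place of the address. An MD5 digest of an IPv4 address is easy to brute-force, so treat this as obfuscation rather than strong anonymization.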
Congratulations! You have just set up a streaming ETL pipeline with Proton, without any JVM components. Check out https://github.com/timeplus-io/proton for more details or join https://timeplus.com/slack