Tinybird
This tutorial shows how to set up a pipeline to stream traffic events to Upstash Kafka and analyse with Tinybird
In this tutorial series, we will show how to build an end to end real time analytics system. We will stream the traffic (click) events from our web application to Upstash Kafka then we will analyse it on real time. We will implement one simply query with different stream processing tools:
SELECT city, count() FROM kafka_topic_page_views where timestamp > now() - INTERVAL 15 MINUTE group by city
Namely, we will query the number of page views from different cities in last 15 minutes. We keep the query and scenario intentionally simple to make the series easy to understand. But you can easily extend the model for your more complex realtime analytics scenarios.
If you do not have already set up Kafka pipeline, see the first part of series where we did the set up our pipeline including Upstash Kafka and Cloudflare Workers (or Vercel).
In this part of the series, we will showcase how to use Tinybird to run a query on a Kafka topic.
Tinybird Setup
Create a Tinybird account and select a region. Create an
empty workspace. On the wizard screen click Add Data
button.
In the next screen click on Kafka tab and fill the fields with the credentials
copied from Upstash Kafka console. Key is username
, secret is password
.
Select SCRAM-SHA-256
. Then click on Connect
button.
If the connection is successful, then you should see the topic you have just
created. Select it and click Continue
.
In the next screen, you should see data is populated from your Kafka topic. It
should look like the below. Select Latest
and click Create Data Source
Click Create Pipe
on the next screen.
In the next page, you will see the query editor where you can execute queries on your data pipe. You can rename the views. Update the query as (replace the datasource):
SELECT city, count() FROM kafka_ds_534681 where timestamp > now() - INTERVAL 15 MINUTE group by city
You should see the number of page view from cities in last 15 minutes. The good thing with TinyBird is you can chain queries (new transformation node) also you can add multiple data sources (e.g. Kafka topics) and join them in a single query.
If you are happy with your query, click on Create API Endpoint
at top right.
It will create an API endpoint which returns the result for your query.
Copy the curl command and try, you should see the result like below:
curl --compressed -H 'Authorization: Bearer p.eyJ1IjogIjMyMDM1YzdjLTRkOGYtNDA3CJpZCI6ICJlNTY4ZjVjYS1hNjNiLTRlZTItYTFhMi03MjRiNzhiNmE5MWEifQ.3KgyhWoohcr_0HCt6J7y-kt71ZmWOlrjhYyFa7TaUlA' https://api.us-east.tinybird.co/v0/pipes/kafka_ds_534681_pipe_2444.json
{
"meta":
[
{
"name": "city",
"type": "String"
},
{
"name": "count()",
"type": "UInt64"
}
],
"data":
[
{
"city": "San Jose",
"count()": 8
}
],
"rows": 1,
"statistics":
{
"elapsed": 0.000562736,
"rows_read": 8,
"bytes_read": 168
}
}
Conclusion
We have built a simple data pipeline which collect data from edge to Kafka then create real time reports using SQL. Thanks to serverless technologies (Vercel, Upstash, Tinybird), we achieved this without dealing any server or infrastructural configurations. You can easily extend and adapt this example for much more complex architectures and queries.
Was this page helpful?