Dec 3, 2020

Cloud-Native Apache Pulsar 2.7 Supports Transactions and Azure Blob Storage Offloader

Penghui Li
Engineering Lead of Messaging Team, StreamNative
Jennifer Huang

Apache Pulsar is a cloud-native, distributed messaging and streaming platform originally created at Yahoo! and now a top-level Apache project. The latest 2.7 version supports transactions, an Azure Blob Storage offloader, topic-level policy, and more. The new version enables event streaming applications to consume, process, and produce messages in one atomic operation, and also allows Pulsar users to offload their historical data to Azure Cloud.

Main features in the new release include:

  1. Pulsar transactions
  2. Azure Blob Storage Offloader
  3. Topic-level policy
  4. Upgrade of Apache BookKeeper to version 4.12
  5. OAuth2 authentication
  6. Native Protobuf schema
  7. 30+ Pulsar Functions enhancements
    ……

Transactions

Transactional semantics enable event streaming applications to consume, process, and produce messages in one atomic operation. With transactions, Pulsar achieves exactly-once semantics across a single partition or multiple partitions. This enables new use cases in which a client, acting as a producer or a consumer, works with messages across multiple topics and partitions and is assured that those messages are all processed as a single unit. Transactions strengthen the message delivery semantics of Apache Pulsar and the processing guarantees of Pulsar Functions.
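To make the idea of "one atomic operation" concrete, here is a minimal in-memory sketch. It is a toy model and not the Pulsar client API: `ToyBroker`, `ToyTransaction`, and `process_once` are hypothetical names invented for illustration. The sketch assumes that messages produced and acknowledgments made inside a transaction become visible only on commit, and are discarded together on abort.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker (hypothetical, not the Pulsar API)."""

    def __init__(self):
        self.topics = defaultdict(list)  # topic name -> committed messages
        self.acked = set()               # (topic, index) pairs already consumed

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    """Buffers sends and acks until commit; drops both on abort."""

    def __init__(self, broker):
        self.broker = broker
        self.pending_sends = []
        self.pending_acks = []

    def send(self, topic, value):
        self.pending_sends.append((topic, value))

    def ack(self, topic, index):
        self.pending_acks.append((topic, index))

    def commit(self):
        # Buffered messages and acks become visible together.
        for topic, value in self.pending_sends:
            self.broker.topics[topic].append(value)
        self.broker.acked.update(self.pending_acks)

    def abort(self):
        # Nothing buffered in this transaction ever becomes visible.
        self.pending_sends.clear()
        self.pending_acks.clear()


def process_once(broker, source, sinks, transform):
    """Consume, transform, and produce to every sink as a single unit."""
    for index, value in enumerate(list(broker.topics[source])):
        if (source, index) in broker.acked:
            continue  # already processed by a committed transaction
        txn = broker.begin()
        try:
            result = transform(value)
            for sink in sinks:
                txn.send(sink, result)
            txn.ack(source, index)
            txn.commit()
        except Exception:
            txn.abort()  # message stays unacked and can be retried later
```

In this model, a message whose processing fails leaves no partial output on any sink and remains unacknowledged, so a later run can retry it without duplicating the messages that already committed.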

Currently, Pulsar transactions are in developer preview. The community will continue to enhance the feature so that it can be used in production environments soon.

Azure Blob storage offloader

Pulsar 2.7.0 adds an Azure Blob Storage offloader. With this offloader, users can offload their historical data to Azure Blob Storage. It benefits Azure Cloud users and reduces the cost of managing massive historical data in BookKeeper. Pulsar will add more support for Azure Cloud in upcoming releases.

Topic-level Policy

Pulsar 2.7.0 introduces the system topic, which stores all policy change events and makes topic-level policy possible. All policies available at the namespace level are now also available at the topic level, so users can set different policies per topic without consuming large amounts of metadata service resources. Topic-level policy lets users manage topics more flexibly and adds no burden to ZooKeeper.
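As a rough illustration of how a log of policy change events can yield a per-topic policy, the sketch below replays events in order and falls back to the namespace-level value when a topic has no setting of its own. The event shape, the `effective_policy` function, and the `retention_minutes` key are all hypothetical, chosen for this example; they do not reflect Pulsar's actual event format or admin API.

```python
def effective_policy(events, namespace_policies, topic, key):
    """Replay policy change events in order and resolve one policy for a topic.

    A topic-level value, when present, takes precedence over the
    namespace-level value (an assumption of this sketch).
    """
    topic_policies = {}
    for event in events:
        if event["topic"] != topic:
            continue  # event belongs to a different topic
        if event["action"] == "set":
            topic_policies[event["key"]] = event["value"]
        elif event["action"] == "remove":
            topic_policies.pop(event["key"], None)
    if key in topic_policies:
        return topic_policies[key]
    return namespace_policies.get(key)


# Hypothetical usage: one namespace default and a log of topic-level changes.
namespace_policies = {"retention_minutes": 60}
events = [
    {"topic": "orders", "action": "set", "key": "retention_minutes", "value": 1440},
    {"topic": "clicks", "action": "set", "key": "retention_minutes", "value": 10},
    {"topic": "clicks", "action": "remove", "key": "retention_minutes"},
]
orders_retention = effective_policy(events, namespace_policies, "orders", "retention_minutes")
clicks_retention = effective_policy(events, namespace_policies, "clicks", "retention_minutes")
```

Because the state is rebuilt from the event log itself, the sketch needs no per-topic entry in a separate metadata store, which mirrors the motivation described above.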

Deep Dive on Pulsar and Kafka Benchmark

CSDN spoke with Penghui Li, an Apache Pulsar PMC member, about the recently published Pulsar benchmark report.

Question: You recently wrote Benchmarking Pulsar and Kafka. Why did you want to conduct the benchmark?

Penghui Li: This year, Confluent ran a benchmark to evaluate how Kafka, Pulsar, and RabbitMQ compare in terms of throughput and latency. According to Confluent, Kafka was the "fastest" in all scenarios. Given our knowledge of Pulsar's capabilities, this did not seem accurate.

In the community, we have met many users who want official benchmark results for reference, and even performance comparisons with other messaging systems. We saw this as an opportunity to push ourselves to do that work, so we set out to repeat the benchmark.

Taking a deeper look at Confluent's benchmark, we noticed a number of issues with the setup, framework, and methodology. We identified and fixed these issues and added test parameters that provide insight into more real-world use cases. You can read the full benchmark.

In the test results, Pulsar is better than Kafka in many aspects of latency, but we still think these results cannot cover all user scenarios. Different physical resource environments may produce completely different results. We also recommend that users build a better understanding of Pulsar's design and performance characteristics, which will help Pulsar perform better in a real environment. We have published a whitepaper that covers many aspects of Pulsar performance tuning. You can read the full whitepaper.

High performance is only one aspect of Pulsar. Pulsar also has an advanced architecture, better scalability, and easy operations and maintenance. We sincerely invite you to download Pulsar and try it out to gain a better understanding of it. To download Apache Pulsar 2.7.0, click here.

For more information on the new release, check out the release notes on the Pulsar website.

Penghui Li
Penghui Li is passionate about helping organizations to architect and implement messaging services. Prior to StreamNative, Penghui was a Software Engineer at Zhaopin.com, where he was the leading Pulsar advocate and helped the company adopt and implement the technology. He is an Apache Pulsar Committer and PMC member. Penghui lives in Beijing, China.
Jennifer Huang
Jennifer Huang is an Apache Pulsar committer. She works as a senior content strategist at StreamNative, responsible for Apache Pulsar documentation and growth of the community.
