All rights reserved to Memphis.dev 2023

Idempotency (Duplicate processing)

What is it? And how to avoid it.

Last updated 2 years ago

How to avoid producing, and consuming, the same message more than once.

What is Idempotency?

Idempotence, in programming and mathematics, is a property of certain operations: no matter how many times you execute them, you achieve the same result.

In a streaming environment, retrying a failed send carries a small risk that both the original and the retried message were successfully written to the broker, leading to duplicates.

The problem with retries

On the producer side, this can happen as illustrated below.

  1. The producer sends a message to Memphis

  2. The message is successfully written and stored

  3. Network issues prevent the broker's acknowledgment from reaching the producer

  4. The producer treats the missing acknowledgment as a temporary network issue and retries sending the message (since it cannot know the message was received)

  5. As a result, the broker ends up storing the same message twice
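
The failure mode above can be sketched with a toy broker whose first acknowledgment is lost in transit. All names here are illustrative and not part of the Memphis SDK; this is a simplified model of the retry race, not real broker code:

```javascript
// Toy broker: stores every message it receives and returns an ack.
const broker = {
  log: [],
  receive(msg) {
    this.log.push(msg);
    return { ack: true };
  },
};

// Producer whose network drops the first ack, so it retries the send.
function produceWithRetry(msg) {
  let ackLost = true; // simulate: the first ack never reaches the producer
  let acked = false;
  while (!acked) {
    const reply = broker.receive(msg);          // the message IS written each time
    if (ackLost) { ackLost = false; continue; } // ack lost -> producer retries
    acked = reply.ack;
  }
}

produceWithRetry({ body: 'order-created' });
// broker.log now holds the same message twice
```

Note that the producer behaved correctly at every step; the duplicate is an unavoidable consequence of retrying without message IDs.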

On the consumer side, this can happen as illustrated below.

  1. A consumer group consumes a message from Memphis

  2. The consumer processes the message

  3. A sudden issue, such as a resource shortage or a network failure, prevents the CG from returning an acknowledgment to the broker

  4. Based on the maxAckTimeMs parameter, the Memphis broker retransmits the same message

  5. The consumer processes the same message again

Producer side - How to avoid?

Producer idempotence ensures that duplicate messages are not introduced due to unexpected retries.

With an idempotent producer, the process takes place as illustrated below.

How does it work internally?

  1. The producer sets a unique ID for each message

  2. The SDK attaches the ID to the message

  3. Inside the station, a table stored in the cache holds all the ingested IDs

  4. If a message is produced with an ID that already exists in that table, it is dropped

  5. The table is reset at a defined interval, configured upon station creation
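
The steps above can be sketched as a message-ID cache with an idempotency window. This is a simplified model, not the broker's actual implementation: per-ID expiry stands in for the periodic table reset, and the class and method names are invented for illustration:

```javascript
// Simplified model of the station's ID table: ingested msgIds are cached,
// and a duplicate ID arriving inside the idempotency window is dropped.
class Station {
  constructor(idempotencyWindowMs) {
    this.windowMs = idempotencyWindowMs;
    this.seenIds = new Map(); // msgId -> time of ingestion
    this.messages = [];
  }
  // `now` is passed in to keep the sketch deterministic; real code would use Date.now().
  ingest(msg, now) {
    const seenAt = this.seenIds.get(msg.msgId);
    if (seenAt !== undefined && now - seenAt < this.windowMs) {
      return 'dropped'; // duplicate ID inside the window
    }
    this.seenIds.set(msg.msgId, now);
    this.messages.push(msg);
    return 'stored';
  }
}

const station = new Station(120000); // 2-minute window, matching the default
const r1 = station.ingest({ msgId: 'a1', body: 'x' }, 0);      // stored
const r2 = station.ingest({ msgId: 'a1', body: 'x' }, 5000);   // dropped (retry inside window)
const r3 = station.ingest({ msgId: 'a1', body: 'x' }, 130000); // stored (window elapsed)
```

The third ingest shows why the window matters: a message ID reused after the window has elapsed is treated as new, so producers should retry well within the configured window.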

Step 1: Set up idempotency cache time

Via the GUI during station creation.

Or via the different SDKs.

idempotencyWindowMs: 0, // defaults to 120000

Step 2: Set up message IDs

A producer becomes idempotent once IDs are added to its messages.

For example:

await producer.produce({
    message: '<bytes array>/object', // Uint8Arrays / You can send object in case your station is schema validated
    ackWaitSec: 15, // defaults to 15
    msgId: "fdfd" // defaults to null
});

Consumer group side - How to avoid?

Consumer idempotence ensures that duplicate messages are not introduced due to unexpected retries.

To avoid this situation, it is recommended to use idempotent producers and to set maxMsgDeliveries to 1 on the consumer side.
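
Alongside maxMsgDeliveries, an application can add its own guard by tracking the IDs it has already processed. This is a generic application-level deduplication sketch, not a Memphis SDK feature; the function and variable names are invented for illustration:

```javascript
// Application-level guard: skip messages whose ID was already processed.
const processedIds = new Set();
const results = [];

function handleMessage(msg) {
  if (processedIds.has(msg.msgId)) {
    return 'skipped'; // redelivery of an already-processed message
  }
  processedIds.add(msg.msgId);
  results.push(msg.body.toUpperCase()); // the actual processing work
  return 'processed';
}

const first = handleMessage({ msgId: 'm-1', body: 'hello' });  // processed
const second = handleMessage({ msgId: 'm-1', body: 'hello' }); // skipped (redelivery)
```

In a real deployment the processed-ID set would need its own retention policy (similar to the broker's idempotency window), since it cannot grow without bound.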

Search terms: Consumed multiple times, duplicate message

As explained in "How does it work internally?", the timer is responsible for the retention of the message-ID table. For example, if 3 hours are chosen, the table is emptied every three hours.

By configuring maxMsgDeliveries to 1, if a consumer within a CG suddenly fails, the entire CG will not receive the same message again; instead, the message is automatically stored in the dead-letter station (DLS) for supervised retries.