This section describes the Memphis station entity.
A station is a distributed unit that stores messages, similar to Kafka's topics and RabbitMQ's queues. Each station has a retention policy, which defines when and how messages are removed from the station, for example by the number of stored messages, storage time, or total size.
Each station is distributed across one or more Memphis brokers, depending on the number of configured station replicas. Data is mirrored across the replicas in a RAID-1 manner.
How replicas work
A station is a virtual entity that resides on a type of file called a "stream", which stores the data. Stream files are stored in the broker's memory or on non-volatile storage, based on the user's configuration per station.
Memphis is based on NATS JetStream, which uses the Raft algorithm, a quorum-based consensus algorithm.
Raft is a consensus algorithm designed as an alternative to the Paxos family of algorithms. Raft offers a generic way to distribute a state machine across a cluster of computing systems, ensuring that each node in the cluster agrees upon the same state transitions.
Raft is not a Byzantine fault-tolerant algorithm: the nodes trust the elected leader.
Each station's stream has a single leader, placed on the most available broker, for consensus purposes. If that broker fails, the leader role is transferred to a follower among the configured replicas.
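The majority rule behind Raft and the leader handover described above can be sketched in a few lines. This is an illustrative model only: the function names are hypothetical, and picking the first surviving replica as the new leader is a simplification, since real Raft elects a leader through voting.

```python
from typing import List, Optional, Set


def quorum_size(replicas: int) -> int:
    """Smallest majority of a Raft group with `replicas` members."""
    if replicas < 1:
        raise ValueError("a Raft group needs at least one member")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return replicas - quorum_size(replicas)


def elect_new_leader(replicas: List[str], failed: Set[str]) -> Optional[str]:
    """Return a surviving replica as leader, or None if no majority survives.

    Simplification: real Raft elects the leader by voting and log
    freshness; taking the first survivor is for illustration only.
    """
    alive = [broker for broker in replicas if broker not in failed]
    if len(alive) < quorum_size(len(replicas)):
        return None
    return alive[0]
```

For example, a three-replica group has a quorum of two and keeps a leader after losing one broker, but not after losing two.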
Choosing memory persistency will improve performance, while disk-based persistency will provide higher availability.
Replicas are available in cluster mode only. During station creation, the user can choose the number of station replicas. Replicas are an exact mirror of the entire station data, and each produced message is mirrored across the configured replicas. Each replica is stored on a different broker; therefore, the maximum number of replicas is bounded by the number of brokers in the cluster.
Replicas can be defined using the SDK, GUI, or CLI.
The number of replicas cannot currently be changed after station creation. This will be supported in the future.
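The replica rules above (one replica per distinct broker, every message mirrored to all replicas, replica count capped by the broker count) can be modeled with a small in-memory sketch. The `MirroredStation` class is hypothetical and is not part of any Memphis SDK; choosing the first N brokers is an arbitrary placement for illustration.

```python
from typing import Dict, List


class MirroredStation:
    """Toy model of a station whose data is mirrored across replicas."""

    def __init__(self, brokers: List[str], replicas: int) -> None:
        if not 1 <= replicas <= len(brokers):
            raise ValueError(
                f"replicas must be between 1 and {len(brokers)} (the broker count)"
            )
        # One replica per distinct broker; taking the first N is arbitrary.
        self.logs: Dict[str, List[bytes]] = {b: [] for b in brokers[:replicas]}

    def produce(self, message: bytes) -> None:
        # Every replica receives an identical copy (RAID-1 style mirroring).
        for log in self.logs.values():
            log.append(message)
```

Requesting more replicas than there are brokers raises an error in this model, which mirrors the stated upper bound.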
A message broker is, by design, "temporary" in nature. Messages can be deleted once acknowledged, but this is not the default, so that new consumers or consumers from other consumer groups can still consume the stored messages.
To avoid filling up the station, we must choose a retention policy per station that defines the condition that triggers Memphis to remove messages from the station.
- Number of messages: The station will only retain the last X produced messages.
- Size: High threshold for station capacity in bytes.
- Time: Each produced message receives a dedicated timer and will be removed after hitting the configured time.
- Ack: Messages will be removed from a station once acknowledged by all the connected consumer groups. Great for exactly-once semantics.
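The four retention policies listed above can be illustrated with a small in-memory model. Everything here is a sketch under assumptions: the `Retention`, `StoredMessage`, and `Station` names are hypothetical rather than Memphis SDK types, and dropping the oldest messages first when the count or size limit is exceeded is an assumed behavior.

```python
import enum
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Iterable, List, Set


class Retention(enum.Enum):
    MESSAGES = "messages"  # keep only the last N messages
    BYTES = "bytes"        # cap the total stored payload size
    AGE_SECONDS = "age"    # drop messages once they reach N seconds of age
    ACK = "ack"            # drop once every consumer group has acked


@dataclass
class StoredMessage:
    payload: bytes
    produced_at: float
    acked_by: Set[str] = field(default_factory=set)


class Station:
    def __init__(self, retention: Retention, value: float = 0,
                 groups: Iterable[str] = ()) -> None:
        self.retention = retention
        self.value = value
        self.groups = set(groups)
        self.messages: Deque[StoredMessage] = deque()

    def produce(self, payload: bytes, now: float) -> StoredMessage:
        msg = StoredMessage(payload, now)
        self.messages.append(msg)
        self.enforce(now)
        return msg

    def ack(self, msg: StoredMessage, group: str, now: float) -> None:
        msg.acked_by.add(group)
        self.enforce(now)

    def enforce(self, now: float) -> None:
        """Apply the station's single retention policy (oldest first)."""
        m = self.messages
        if self.retention is Retention.MESSAGES:
            while len(m) > self.value:
                m.popleft()
        elif self.retention is Retention.BYTES:
            while m and sum(len(x.payload) for x in m) > self.value:
                m.popleft()
        elif self.retention is Retention.AGE_SECONDS:
            while m and now - m[0].produced_at >= self.value:
                m.popleft()
        elif self.retention is Retention.ACK:
            # Keep a message until all known groups have acknowledged it.
            kept = [x for x in m
                    if not (self.groups and self.groups <= x.acked_by)]
            self.messages = deque(kept)

    def payloads(self) -> List[bytes]:
        return [x.payload for x in self.messages]
```

Timestamps are passed in explicitly so the behavior is deterministic; a real broker would use its own clock and per-message timers.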
A partition is a division of a station or its constituent elements into distinct independent parts. Partitioning is normally done for manageability, performance, and/or availability reasons, and for load balancing.
Partitioning is popular in distributed systems, where each partition may be spread over multiple nodes, with users of a station performing local transactions on the partition.
Memphis partitions were released in v1.2.
Partitions are the main method of concurrency for stations. A station is broken into multiple partitions across one or more Memphis brokers.
- 1. Round-Robin (Default): Messages are produced to or consumed from a station in round-robin fashion, meaning each message is produced to or consumed from a different partition in turn. This way, without any specific logic in place, data is striped across the created partitions.
- 2. Partition Key: Producing or consuming messages with a specified partition key allows producers and consumers to work with a single partition. Memphis uses the popular Murmur3 hash function to convert partition keys into a partition ID. Note: changing the number of partitions will affect the partition key calculation.
- 1. Custom Partition Key Function: Ability to provide a custom callback function to calculate the partition ID from a given partition key.
- 2. Partition IDs: Specify a partition ID or a subset of IDs for the producer or consumer to work with. If several IDs are given, producers and consumers will work in a round-robin fashion across the specified partitions.
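To illustrate how a partition key can be turned into a partition ID, here is a pure-Python sketch of the 32-bit MurmurHash3 (x86 variant) followed by a modulo over the partition count. The seed of 0, the modulo mapping, the 1-based IDs, and the `partition_for_key` helper with its optional `key_fn` callback are all assumptions made for illustration; the exact mapping Memphis uses is not specified here.

```python
from typing import Callable, Optional

_MASK = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & _MASK


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 (x86 variant) of `data`."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & _MASK
    nblocks = len(data) // 4
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & _MASK
        k = _rotl32(k, 15)
        k = (k * c2) & _MASK
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & _MASK
    # Mix in the remaining 1 to 3 bytes, if any.
    tail = data[4 * nblocks:]
    k = 0
    if len(tail) >= 3:
        k ^= tail[2] << 16
    if len(tail) >= 2:
        k ^= tail[1] << 8
    if len(tail) >= 1:
        k ^= tail[0]
        k = (k * c1) & _MASK
        k = _rotl32(k, 15)
        k = (k * c2) & _MASK
        h ^= k
    # Finalization (avalanche) step.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & _MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & _MASK
    h ^= h >> 16
    return h


def partition_for_key(key: str, partitions: int,
                      key_fn: Optional[Callable[[str, int], int]] = None) -> int:
    """Map a key to a partition ID in 1..partitions (assumed numbering)."""
    if partitions < 1:
        raise ValueError("a station needs at least one partition")
    if key_fn is not None:
        pid = key_fn(key, partitions)  # custom partition key function
    else:
        pid = murmur3_32(key.encode("utf-8")) % partitions + 1
    if not 1 <= pid <= partitions:
        raise ValueError(f"partition ID {pid} is out of range")
    return pid
```

Because the ID depends on the partition count through the modulo, the same key can land on a different partition once the count changes, which is why changing the number of partitions affects the key calculation.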
If a station has only one partition, messages are produced to and consumed from that partition only. Any specified partition key or custom function is ignored and does not change this behavior.
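The round-robin behavior, both over all partitions and over a chosen subset of partition IDs, can be sketched as a small selector. The `RoundRobinSelector` class and the 1-based partition IDs are hypothetical and only model the behavior described above; a single-partition station naturally always yields the same partition.

```python
import itertools
from typing import Optional, Sequence


class RoundRobinSelector:
    """Cycle over all partitions, or over an explicit subset of IDs."""

    def __init__(self, partitions: int,
                 partition_ids: Optional[Sequence[int]] = None) -> None:
        if partitions < 1:
            raise ValueError("a station needs at least one partition")
        ids = list(partition_ids) if partition_ids else list(range(1, partitions + 1))
        invalid = [i for i in ids if not 1 <= i <= partitions]
        if invalid:
            raise ValueError(f"unknown partition IDs: {invalid}")
        self._cycle = itertools.cycle(ids)

    def next_partition(self) -> int:
        """Return the partition ID to use for the next message."""
        return next(self._cycle)
```

A producer or consumer would call `next_partition()` once per message to decide where the next message goes or comes from.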
Within each client library manual, you can find where and how to define partitions.
Partitions are defined during station creation, and currently, there is no change needed from the client's side to work with the created partitions.
- 1. The number of partitions cannot be changed after station creation. This will be available in the future.
- 2. Currently, producers cannot select which partition to write to.
- 3. Currently, consumers cannot select which partition to read from; consumption takes place in a round-robin manner.
- 4. Ordering is guaranteed at the partition level. If a station is configured with more than one partition, ordering is guaranteed within each partition rather than across the whole station.
- 1. Partition-level production and consumption.
- 2. Partition key assignment, to enable dynamic consumer listening.
Ordering is guaranteed only while working with a single consumer group.
As seen in the illustration below, each consumer group will receive all the messages stored within the station.
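The fan-out behavior, where every consumer group independently receives all stored messages, can be modeled by keeping one read position per group. This is a minimal sketch: the `FanOutStation` class and the per-group offset bookkeeping are assumptions for illustration, not a description of how Memphis tracks delivery internally.

```python
from typing import Dict, List


class FanOutStation:
    """Toy station in which each consumer group reads the full stream."""

    def __init__(self) -> None:
        self.messages: List[str] = []
        self.offsets: Dict[str, int] = {}  # next unread index per group

    def produce(self, message: str) -> None:
        self.messages.append(message)

    def consume(self, group: str, batch_size: int = 10) -> List[str]:
        """Return the next batch for `group`, in production order."""
        start = self.offsets.get(group, 0)
        batch = self.messages[start:start + batch_size]
        self.offsets[group] = start + len(batch)
        return batch
```

Reading with one group does not advance the position of another, so a group that joins later still sees the messages from the start of the station.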