MongoDB Sharding and CAP Theorem in Microservices

Week 9: Set Up a MongoDB Cluster

What is MongoDB Sharding?

Sharding is the process of distributing data across multiple machines. MongoDB uses sharding to handle deployments with large data volumes and high-performance demands. Horizontal scaling (scale-out) involves adding machines to manage data load, allowing nearly limitless scaling for big data.

Why Sharding?

Sharding is essential when a single server can’t handle high workloads. It allows horizontal scaling, reducing the strain on individual servers. Vertical scaling (adding resources to existing servers) is limited and costly. Sharding follows a shared-nothing architecture where nodes don’t share resources, improving system stability and performance.

Components of a Sharded Cluster

Shard: A replica set with primary and secondary nodes, containing a subset of the data.
Mongos: Acts as a query router, directing client requests to the appropriate shards.
Config Server Replica Set: Stores sharding metadata, including routing information and shard status.
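The components above are wired together through the mongos router. The following is a minimal sketch, not a full setup procedure, of registering a shard replica set with a running cluster from Python; the mongos address (localhost:27017), the replica set name shardrs0, and the host shard0a:27018 are illustrative assumptions.

```python
# Minimal sketch: register a shard replica set with a sharded cluster.
# Assumes a mongos router on localhost:27017 and an already-initiated
# shard replica set "shardrs0" on shard0a:27018 (illustrative names).
from pymongo import MongoClient

# Connect to the mongos query router, not to an individual shard.
mongos = MongoClient("mongodb://localhost:27017")

# The addShard admin command registers the new shard with the config servers.
mongos.admin.command("addShard", "shardrs0/shard0a:27018")

# listShards returns the shards currently recorded in the cluster metadata.
print(mongos.admin.command("listShards"))
```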

Shard Key and Data Distribution

MongoDB shards at the collection level using a shard key. The shard key determines document distribution across shards. MongoDB divides data into chunks, with each chunk containing a range of shard key values. Choosing the right shard key is critical for performance and even data distribution.
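As a rough illustration of how a shard key is declared, the sketch below enables sharding on a database and shards one collection on a single-field key using pymongo admin commands against a mongos; the database, collection, and field names are assumptions chosen for the example.

```python
# Minimal sketch: shard a collection on a chosen shard key.
# Database "appdb", collection "users" and field "user_id" are illustrative.
from pymongo import MongoClient

mongos = MongoClient("mongodb://localhost:27017")  # assumed mongos address

# Enable sharding for the database that will hold the collection.
mongos.admin.command("enableSharding", "appdb")

# Shard the collection; documents are partitioned into chunks by user_id ranges.
mongos.admin.command("shardCollection", "appdb.users", key={"user_id": 1})
```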

Sharding Strategy

MongoDB supports three primary sharding strategies:
Range-Based Sharding: Distributes documents based on shard key ranges, useful for targeted queries.
Hashed Sharding: Distributes documents evenly by hashing shard key values, ensuring balanced data distribution.
Zone Sharding: Groups data into zones (e.g., geographical regions), associating each zone with specific shards.
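The sketch below shows, under the same illustrative assumptions as before (a mongos on localhost, hypothetical database, collection, and shard names), how each strategy might look when issued through pymongo admin commands.

```python
# Minimal sketch of the three strategies; all names are illustrative.
from bson.max_key import MaxKey
from bson.min_key import MinKey
from pymongo import MongoClient

mongos = MongoClient("mongodb://localhost:27017")
mongos.admin.command("enableSharding", "appdb")

# Range-based sharding: chunks cover contiguous ranges of order_date values.
mongos.admin.command("shardCollection", "appdb.orders", key={"order_date": 1})

# Hashed sharding: chunks cover ranges of the hash of user_id, spreading writes evenly.
mongos.admin.command("shardCollection", "appdb.sessions", key={"user_id": "hashed"})

# Zone sharding: pin the "EU" range of a region-prefixed key to a specific shard.
mongos.admin.command("addShardToZone", "shardrs0", zone="EU")
mongos.admin.command(
    "updateZoneKeyRange",
    "appdb.customers",
    min={"region": "EU", "customer_id": MinKey()},
    max={"region": "EU", "customer_id": MaxKey()},
    zone="EU",
)
mongos.admin.command(
    "shardCollection", "appdb.customers", key={"region": 1, "customer_id": 1}
)
```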

Benefits of Sharding in MongoDB

Increased Read/Write Throughput: Parallel processing across shards improves throughput.
High Availability: Replica sets within shards enhance data redundancy and availability.
Increased Storage Capacity: Adding shards increases total storage capacity.
Data Locality: Zone sharding optimizes data storage by enforcing data residency policies.

CAP Theorem in Microservices Architecture

Introduction to CAP Theorem

Definition: The CAP theorem states that a distributed data store can provide only two of the three guarantees—Consistency (C), Availability (A), and Partition Tolerance (P)—at any given time.
Relevance to Microservices: In a microservices architecture, data is distributed and often replicated for performance and resilience. CAP considerations are critical, as microservices must ensure high availability and horizontal scalability.

Understanding CAP Elements in Microservices

Consistency (C): Every read receives the latest write. In microservices, consistency implies that all services should see the same data at any time.
Availability (A): Every request receives a response (successful or not). In microservices, each service should respond to requests even if data isn’t fully consistent.
Partition Tolerance (P): The system operates despite network partitions. Network issues might affect consistency or availability but shouldn’t halt services entirely.

CAP Theorem Scenarios in Microservices

Scenario 1: Prioritizing Consistency over Availability (CP System)
Context: Banking services where transaction data consistency is essential.
Challenge: Network partitioning risks data consistency, so responses may be delayed to ensure accuracy.
Outcome: Suitable for critical data integrity but may compromise availability during partitions.
Scenario 2: Prioritizing Availability over Consistency (AP System)
Context: A recommendation service in e-commerce.
Challenge: The service remains available, possibly with slightly outdated recommendations.
Outcome: Tolerable consistency trade-offs ensure continuous engagement.
Scenario 3: Balancing Partition Tolerance with Availability or Consistency
Context: Messaging service in a distributed social media platform.
Challenge: Network partitions may delay messages or deliver them out of order.
Outcome: Partition tolerance may be prioritized by sacrificing strict message order or availability.
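As a concrete, if simplified, illustration of these trade-offs in MongoDB terms, the sketch below configures two collection handles: one tuned toward consistency (majority write and read concerns, as a CP-leaning banking service might use) and one tuned toward availability of reads (secondary-preferred reads with local read concern, as an AP-leaning recommendation service might tolerate). The connection string and collection names are assumptions for the example.

```python
# Simplified sketch of CP-leaning vs AP-leaning access patterns in pymongo.
# Connection string, database and collection names are illustrative.
from pymongo import MongoClient, ReadPreference
from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["appdb"]

# CP-leaning: acknowledge writes on a majority of nodes and read only
# majority-committed data; during a partition some operations may block or
# fail rather than return stale data.
accounts = db.get_collection(
    "accounts",
    write_concern=WriteConcern(w="majority"),
    read_concern=ReadConcern("majority"),
)

# AP-leaning: read from any available secondary with local read concern;
# responses keep flowing during partitions, but may be slightly stale.
recommendations = db.get_collection(
    "recommendations",
    read_preference=ReadPreference.SECONDARY_PREFERRED,
    read_concern=ReadConcern("local"),
)
```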

Data Replication and Horizontal Scaling in Microservices

Impact of Data Replication: Replication enhances availability but may lead to consistency issues; replicated services may temporarily diverge until re-synchronization.
Horizontal Scaling and Partitions: With more replicas across data centers, the likelihood of partitions increases, possibly leading to stale data or delayed responses.
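One concrete way a MongoDB client can acknowledge this replication/staleness trade-off is a bounded-staleness read preference: reads may go to replicas, but not to ones estimated to lag too far behind. Below is a hedged pymongo sketch; the 90-second bound, connection string, and names are illustrative choices.

```python
# Sketch: scale reads across replicas while bounding how stale they may be.
# The 90-second bound, connection string and names are illustrative choices.
from pymongo import MongoClient
from pymongo.read_preferences import SecondaryPreferred

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")

# Secondaries estimated to lag the primary by more than ~90 seconds are skipped,
# limiting how divergent the data returned by a replica read can be.
catalog = client["appdb"].get_collection(
    "catalog",
    read_preference=SecondaryPreferred(max_staleness=90),
)
print(catalog.find_one({"sku": "A-100"}))
```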

CAP Theorem

CAP Theorem – Also known as Brewer’s theorem, it states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: Consistency (all nodes see the same data at the same time), Availability (every request receives a response about whether it succeeded or failed), and Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system).

An Informal Proof of the CAP Theorem

Eric Brewer presented his conjecture in 2000; Gilbert and Lynch proved it formally in 2002. Here is an informal presentation of their proof, showing that a highly available and partition-tolerant service may be inconsistent. Consider two nodes, N1 and N2, both containing the same copy of a data item D with value D0. A program W (write) runs on N1; a program R (read) runs on N2. In a regular situation: W writes a new value of D, say D1, on N1; N1 then sends a message M to N2 to update its copy of D to D1; R reads D and returns D1. Assume now that the network partitions and the message from N1 to N2 is not delivered: R reads D and returns the stale value D0.
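The toy Python sketch below mirrors this argument: two in-memory "nodes" replicate a value by message passing, and when the replication message is dropped, the reader on N2 returns the stale value D0.

```python
# Toy simulation of the informal CAP argument: a dropped replication message
# leaves N2 serving the stale value D0 while N1 already holds D1.

class Node:
    def __init__(self, name, value):
        self.name = name
        self.data = value          # local copy of the data item D

    def write(self, value, peer, network_up=True):
        self.data = value          # program W updates D locally
        if network_up:             # message M is delivered only if no partition
            peer.data = value

    def read(self):
        return self.data           # program R returns the local copy


n1, n2 = Node("N1", "D0"), Node("N2", "D0")
n1.write("D1", peer=n2)                    # regular situation
print(n2.read())                           # -> D1 (consistent)

n1, n2 = Node("N1", "D0"), Node("N2", "D0")
n1.write("D1", peer=n2, network_up=False)  # partition: message M is lost
print(n2.read())                           # -> D0 (available but inconsistent)
```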

CAP Theorem in Service Context

For a distributed database system offered as a service, the components of the CAP acronym mean:
Consistency – A service is consistent if, after an update operation by some writer, all readers see that update in the shared data source (all nodes holding replicas of a datum have the same value).
Availability – A service has high availability if it allows read and write operations a high proportion of the time (even when a node crashes or some hardware or software components are down due to upgrades).
Partition tolerance – The ability of a service to perform its expected operations in the presence of a network partition, unless the whole network crashes. A network partition occurs when two or more “islands” of network nodes cannot connect to each other. (Dynamic addition and removal of nodes is also considered a network partition.)

CAP Theorem and Cloud Computing

A cloud platform (or at least a large part of it) is an AP system: outages of single nodes must be tolerated (partition tolerance), and high availability (HA) is a must-have.

Comments on the CAP Theorem

If applications in a distributed and replicated network have to be highly available (i.e., working with minimal latency) and tolerant to network partitions (lost or undelivered messages, hardware outages, process failures), then it may happen that some of the nodes Ni (i = 1, 2, …) hold stale data. The decision on which of the {CA, CP, AP} property pairs to satisfy is made in the early stages of a cloud application’s design, according to the requirements and expectations defined for the application.

NoSQL Database Systems

Many cloud databases are not relational; they are termed NoSQL databases (Not only SQL). Generally, NoSQL database systems:
Adopt the key-value data model for storing semi-structured or unstructured data.
Are either schemaless, or their schemas lack the rigidity and integrity constraints of relational schemas.
Use data partitioning, replication, and distribution across several independent network nodes built from commodity machines, for scalability, availability, and network partition tolerance.
Do not support joins, and use data sharding (horizontal partitioning) to avoid the need for joins and achieve good throughput.
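The sketch below is a deliberately naive Python illustration of the key-value model with hash-based sharding across a few nodes; the node names and the modulo placement rule are assumptions for the example, not how any particular NoSQL system is implemented.

```python
# Naive key-value store sharded across nodes by hashing the key.
# Node names and the modulo placement rule are illustrative only.
import hashlib


class TinyKVCluster:
    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.order = list(node_names)

    def _node_for(self, key):
        # Hash the key and map it to one of the nodes (horizontal partitioning).
        digest = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return self.order[digest % len(self.order)]

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        return self.nodes[self._node_for(key)].get(key)


cluster = TinyKVCluster(["node-a", "node-b", "node-c"])
cluster.put("user:42", {"name": "Ada", "plan": "pro"})  # semi-structured value
print(cluster.get("user:42"))
```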

CAP Theorem and NoSQL

NoSQL database systems do not support ACID properties, since locking and logging would severely impair scalability, availability, and throughput. Still, Consistency, Availability, and network Partition tolerance (CAP) remain their desirable properties. However, according to the CAP theorem, also known as Brewer’s Conjecture, any networked, shared-data system can have at most two of these three desirable properties.

Classes of CAP DB Architectures

CA – A single-server or shared-disk-cluster RDBMS forfeits network partition tolerance for the sake of consistency and availability. Scaling is also limited.
CP – A distributed database with distributed pessimistic locking forfeits availability because of locking.
AP – A distributed database with a shared-nothing architecture and no distributed pessimistic locking forfeits consistency for the sake of availability and network partition tolerance.
NoSQL database systems fall in the AP class. Users of web applications expect AP behavior from the backing cloud database.

Comments on CAP Properties

Contrary to traditional databases, many cloud databases are not expected to satisfy any integrity constraints. High availability of a service is often characterized by low latency: Amazon claims that an increase of 0.1 sec in response time would cost them 1% in sales, and Google claims that an extra 0.5 sec of latency caused traffic to drop by 20%. It is hard to draw a clear border between availability and network partitions; network partitions may induce delays in the operations of a service, and hence greater latency.

BASE Properties

BASE (as defined by Eric Brewer) stands for:
BA (Basically Available) – The system is always available, although there may be intervals of time when some of its components are not. A basically available system answers each request in such a way that the system appears to work correctly and to be always available and responsive.
S (Soft State) – The system does not have to be in a consistent state after each transaction; the database state may change without the user’s intervention (due to eventual consistency).
E (Eventually Consistent) – If no new updates are made to a given data item, accesses to each replica of that item will eventually return the last updated value. A system that has achieved eventual consistency is often said to have converged, or achieved replica convergence.

Eventual Consistency

An eventually consistent system may provide additional guarantees to its clients:
Read Your Own Writes (RYOW) Consistency – A client sees its own updates even if it reads from a different server than the one it wrote to; updates by other clients are not instantly visible.
Session Consistency – RYOW consistency limited to a session scope (writes and reads go through the same server).
Causal Consistency – If a client reads version x and updates it to version y, all other clients see both x and y.
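In MongoDB, a causally consistent session gives a client read-your-own-writes and causal ordering guarantees within that session. The sketch below is a minimal pymongo example; the connection string, database, and collection names are assumptions.

```python
# Minimal sketch: read-your-own-writes / causal ordering within a pymongo session.
# Connection string and names are illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
coll = client["appdb"]["profiles"]

with client.start_session(causal_consistency=True) as session:
    # Write through the session...
    coll.update_one({"_id": "u42"}, {"$set": {"theme": "dark"}},
                    upsert=True, session=session)
    # ...and a read in the same session observes that write, even if it is
    # served by a different replica (RYOW within the session scope).
    print(coll.find_one({"_id": "u42"}, session=session))
```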

BASE Properties

The reconciliation of differences between multiple replicas requires exchanging and comparing versions of data between nodes; usually, the “last writer wins”. The reconciliation is performed as a read repair, a write repair, or an asynchronous repair. Asynchronous repairs run in a background, non-transactional process. Stale and approximate data in answers are tolerated.
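A rough Python sketch of the “last writer wins” rule follows: each replica holds a (timestamp, value) pair, and reconciliation keeps the version with the newest timestamp. Real systems typically use vector or hybrid logical clocks rather than wall-clock timestamps; this is only an illustration.

```python
# Illustrative "last writer wins" reconciliation over (timestamp, value) versions.
# Real systems usually use vector or hybrid logical clocks, not wall-clock time.

def reconcile(replicas):
    """Return the winning version among replica copies of one data item."""
    # Each replica copy is a (timestamp, value) pair; the newest write wins.
    return max(replicas, key=lambda version: version[0])


replicas = [
    (1710000005, {"cart": ["book"]}),          # replica A
    (1710000009, {"cart": ["book", "pen"]}),   # replica B (latest write)
    (1710000001, {"cart": []}),                # replica C (stale)
]
print(reconcile(replicas))  # -> (1710000009, {'cart': ['book', 'pen']})
```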

Observations from an experiment: “Trade-off is not clear! We suspect consistent read is less scalable and slower under datacenter failures. However, we’ve not observed any differences.”

Summary

NoSQL cloud databases are scalable, highly available, and tolerant of network partitions. A cloud database system can have only two out of Consistency, Availability, and network Partition tolerance. BASE represents a “best effort” approach to ensuring database reliability; it forfeits ACID properties, as its name suggests (base versus acid, as in chemistry). Eventual Consistency is a model used in distributed systems that guarantees that all accesses to replicas of an updated item will eventually return the last updated value.