Cassandra and Consistency - A Simplified Explanation
Cassandra is architected for read-write (RW) anywhere, enabling any client to connect to any node in any datacenter (DC). The concept of a primary or single master node does not exist in Cassandra. This allows for drastically better RW performance, since no single master node has to take all the traffic. The flip side is that there is no single source of truth (SSOT) as you would have in an RDBMS such as MySQL, where you assume the master or primary is always correct. Cassandra has mechanisms to manage this, though.
REPLICATION FACTOR (RF): This determines how many copies of the data exist.
CONSISTENCY LEVEL (CL): This determines how many nodes must acknowledge a read or write before the write is considered committed or the data is returned. Examples would be ALL, ONE, QUORUM, etc. Effectively, it is a level of assurance that a write occurred or that your read is against the freshest data.
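To make RF and CL concrete, here is a minimal sketch using the DataStax Python driver (cassandra-driver). The node address, the demo keyspace, and the users table are made-up names for illustration; the point is that RF is fixed when the keyspace is created, while CL is chosen per statement.

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])   # connect to any node; there is no master
session = cluster.connect()

# REPLICATION FACTOR: set per keyspace; here every row is stored on 2 nodes.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}
""")
session.execute("CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text)")

# CONSISTENCY LEVEL: chosen per request; this write waits for a majority of replicas.
insert = SimpleStatement(
    "INSERT INTO demo.users (id, name) VALUES (1, 'alice')",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(insert)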
Let's look at a two-node cluster as an example.
WRITES: If the data is simply split evenly between the two nodes and one node goes down, you have effectively lost half your data. To overcome this for writes, you would need to either set the CL to ALL (or TWO) to ensure data was written to both nodes, or set it to ONE with an RF of TWO to ensure data was copied to the other node. At first this seems like a plausible strategy, but there are a couple of issues. If the CL is set to the total number of nodes (ALL) and one goes down, you cannot meet the CL for any operation and it will fail, which leaves only the option of a CL of ONE with an RF of TWO, which works for writes.
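Sticking with that hypothetical two-node, RF=2 setup (and the demo table from the earlier sketch), the write trade-off looks roughly like this:

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel, Unavailable
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect()
write = "INSERT INTO demo.users (id, name) VALUES (2, 'bob')"

# CL = ALL: both replicas must acknowledge. Consistent, but the write is
# rejected as Unavailable the moment either node is down.
try:
    session.execute(SimpleStatement(write, consistency_level=ConsistencyLevel.ALL))
except Unavailable:
    print("CL=ALL cannot be satisfied while a replica is down")

# CL = ONE with RF = 2: one acknowledgement is enough and the second copy is
# replicated in the background, so writes keep working through a single outage.
session.execute(SimpleStatement(write, consistency_level=ConsistencyLevel.ONE))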
READS: The problem with the CL=ONE and RF=2 strategy comes from the reads. With a CL of ONE, when an object is updated it will replicate, but a read on that data may occur before the replication happens. If the update occurs on NODE A and the client checks NODE B, how does it know whether the data is fresh or still waiting on replication? The only way is to compare the timestamp against another node to ensure consistency. Once again the issue becomes that with a CL of ALL or TWO, a single node outage prohibits this comparison and disables reads.
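The read side of the same hypothetical setup can be sketched the same way, using CQL's WRITETIME() to expose the timestamp that replicas compare:

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel, Unavailable
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect()
read = "SELECT name, WRITETIME(name) FROM demo.users WHERE id = 1"

# CL = ONE: the coordinator trusts a single replica, which may not have
# received the latest update yet, so the read can return stale data.
print(session.execute(SimpleStatement(read, consistency_level=ConsistencyLevel.ONE)).one())

# CL = ALL: every replica is consulted and the newest timestamp wins,
# but the read fails outright if any replica is down.
try:
    print(session.execute(SimpleStatement(read, consistency_level=ConsistencyLevel.ALL)).one())
except Unavailable:
    print("CL=ALL reads are blocked while a replica is down")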
So effectively, with just two nodes, if a single node goes down due to an issue or maintenance, you sacrifice either redundancy or consistency.
QUORUM: A majority of nodes.
CL does not just support ALL or a hard number; it also supports QUORUM. This can be a local majority (LOCAL_QUORUM) or an all-encompassing majority when you look at regionally separated, fully redundant DCs, for example. With a three-node cluster the majority, or quorum, would be two. This then allows for assured redundancy and consistency, even in the event of a single node outage.
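The arithmetic behind that claim is simple enough to sketch on its own: a quorum write and a quorum read together touch more replicas than exist, so the two sets must overlap on at least one replica that holds the latest write.

def quorum(rf: int) -> int:
    # A majority of the replicas.
    return rf // 2 + 1

rf = 3
writes_ack = quorum(rf)   # 2 of 3 replicas must acknowledge each write
reads_ask = quorum(rf)    # 2 of 3 replicas are consulted on each read
print(quorum(rf))                    # 2
print(writes_ack + reads_ask > rf)   # True: at least one replica in every
                                     # quorum read already has the latest write

On the driver side this is just another consistency level, e.g. ConsistencyLevel.QUORUM, or ConsistencyLevel.LOCAL_QUORUM for a majority within a single DC.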
As to why three: it is the minimum number of nodes that ensures redundancy, consistency, and the ability to form a quorum. With Cassandra, however, the more nodes the better. If you have enough resources to stand up five Cassandra nodes in a DC, there is no downside that I know of.
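As a rough rule of thumb (assuming, as in the three-node example above, that the replication factor matches the node count), the quorum math also shows how many outages each cluster size can absorb:

for nodes in (2, 3, 5):
    majority = nodes // 2 + 1
    print(f"{nodes} nodes: quorum={majority}, tolerates {nodes - majority} down")
# 2 nodes: quorum=2, tolerates 0 down
# 3 nodes: quorum=2, tolerates 1 down
# 5 nodes: quorum=3, tolerates 2 down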