Cassandra database performance characteristics and Cassandra database design and maintenance

Cassandra Overview

Cassandra is an open-source distributed NoSQL database system. It was originally developed at Facebook to store simply structured data such as inboxes, and it combines the data model of Google Bigtable with the fully distributed architecture of Amazon Dynamo. The name comes from Greek mythology: Cassandra was a tragic prophetess of Troy, which is why the project's logo is a shining eye. Facebook open-sourced Cassandra in 2008. Since then it has become a popular distributed structured-data store thanks to its good scalability and its adoption by well-known Web 2.0 sites such as Digg and Twitter. Cassandra became an Apache Software Foundation Incubator project in 2009 and graduated in February 2010 to become a top-level project.

Cassandra features

The main feature of Cassandra is that it is not a single database server but a distributed network service made up of many database nodes. A write to Cassandra is replicated to other nodes, and a read is routed to one of the nodes that holds the data. Scaling a Cassandra cluster is relatively simple: just add nodes to the cluster.

There are many reasons to choose Cassandra for your website. Compared with other databases, the following features stand out:

Flexible schema

With Cassandra, as with a document store, you don't have to define the fields of a record in advance. You can add or remove fields while the system is running. This is a significant efficiency boost, especially in large deployments.

True (high) scalability

Cassandra scales horizontally in the purest sense. To add more capacity to the cluster, you simply bring another machine online. You don't have to restart any processes, change application queries, or manually migrate any data. You can add more hardware at any time to accommodate more customers and more data.

Multi-data-center awareness

You can adjust your node layout so that, if one data center is lost (to a fire, for example), an alternate data center still holds at least one full copy of every record.

Always-on architecture

Cassandra has no single point of failure, so it can be used for mission-critical applications that cannot afford downtime.

Fast linear-scale performance

Cassandra scales linearly: throughput increases as you add nodes to the cluster, so it maintains fast response times.

Fault tolerance

Cassandra is fault tolerant. Suppose a cluster has 4 nodes and each node holds a copy of the same data. If one node stops serving, the other three can still serve requests.
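The arithmetic behind this example can be sketched roughly as follows. The sketch assumes a replication factor of 4 (every node holds a copy, as in the example above) and uses the strict-majority rule that Cassandra's QUORUM consistency level applies. The function names are hypothetical and exist only for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: a strict majority."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int, required: int) -> int:
    """How many replicas can be down while `required` of them can still answer."""
    return replication_factor - required


rf = 4  # assumption: 4 nodes, each holding a copy of the data
print(quorum(rf))                          # replicas needed at QUORUM
print(tolerated_failures(rf, quorum(rf)))  # nodes that may fail at QUORUM
print(tolerated_failures(rf, 1))           # nodes that may fail when one reply is enough
```

With four copies, a request that needs only one reply survives the loss of three nodes, while a majority-based request survives the loss of one.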

Flexible data storage

Cassandra accommodates structured, semi-structured, and unstructured data, and it lets you change the data structure as your needs change.

Simple data distribution

Cassandra makes data distribution simple: it gives you the flexibility to place data where you need it by replicating it across multiple data centers.

Transaction support

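Cassandra offers some of the properties of ACID (atomicity, consistency, isolation, durability), but not full ACID transactions in the relational sense: writes are atomic and isolated within a single partition and are durable, while consistency is tunable per operation.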

Fast write

Cassandra is designed to run on inexpensive commodity hardware. It performs fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.

Some other features that make Cassandra competitive:

Range queries

If key-value lookups alone are not enough, you can query a range of keys.

List data structure

In hybrid mode, super columns can be added to give up to 5 dimensions. This is very convenient for per-user indexes.

Distributed writes

You can read or write any data anywhere, at any time, and there is no single point of failure.

Notable user: Facebook

Main characteristics

●Distributed

●Column-oriented structure

●High scalability

Basic architecture

Unlike BigTable or HBase, Cassandra does not rely on a central control node; it uses a decentralized P2P architecture instead. All nodes in the network are peers, and together they form a ring. The nodes exchange information every second through a P2P protocol, so that each node has information about all the other nodes, including their location and status.
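The idea of peers converging on a shared view of the cluster can be illustrated with a minimal sketch. It assumes each piece of node state carries a version number and that the higher version wins when two peers compare notes. This is a deliberate simplification, not Cassandra's actual gossip implementation, and the names NodeState, merge, and gossip_round are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeState:
    status: str   # e.g. "UP" or "DOWN"
    version: int  # increases whenever the node's state changes


def merge(local: dict, remote: dict) -> dict:
    """Keep, for every known node, the entry with the higher version."""
    merged = dict(local)
    for node, state in remote.items():
        if node not in merged or state.version > merged[node].version:
            merged[node] = state
    return merged


def gossip_round(a: dict, b: dict) -> tuple:
    """One exchange between two peers: both end up with the same merged view."""
    view = merge(a, b)
    return dict(view), dict(view)


# Two peers with partly overlapping, partly stale knowledge of the cluster.
node_a = {"10.0.0.1": NodeState("UP", 3), "10.0.0.2": NodeState("UP", 1)}
node_b = {"10.0.0.2": NodeState("DOWN", 2), "10.0.0.3": NodeState("UP", 5)}

node_a, node_b = gossip_round(node_a, node_b)
print(sorted(node_a))             # all three nodes are now known to peer A
print(node_a["10.0.0.2"].status)  # the newer (version 2) entry wins
```

Repeating such rounds between randomly chosen pairs is what lets every node eventually learn about every other node without a central coordinator.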

The core components of Cassandra include:

Gossip: A peer-to-peer communication protocol used to exchange node location and status information. Gossip information is stored locally so that it is available when a node starts, but the stored history must be cleared when a node's information changes, for example when its IP address changes. Through the Gossip protocol, every second each node exchanges information about itself and about the nodes it has already gossiped with. Each piece of exchanged information carries a version number, so that newer data overwrites older data. To keep the exchange accurate, all nodes must use the same list of contact nodes for the cluster; the nodes in that list are called seeds.

Partitioner: Responsible for distributing data across the cluster; it determines which node holds the first replica. Typically a hash of the partition key is used, so that rows are spread across different nodes, which keeps the cluster scalable.

Replica placement strategy: Determines which nodes hold the replicated data and how many replicas are kept.

Snitch: Defines a network topology map to determine how to place replicated data and efficiently route requests.

cassandra.yaml: The main configuration file. It sets the cluster's initial configuration, table cache parameters, tuning and resource-usage parameters, timeout settings, client connections, backup, and security.
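How the partitioner and the replica placement strategy work together can be sketched with a small token ring. The sketch makes several assumptions that are not in the text above: MD5 stands in for the partitioner's hash function, each node owns a single token derived from its name, and replicas are placed by walking the ring clockwise without regard to racks or data centers. The Ring class and the node names are hypothetical.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 used here for brevity)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, nodes):
        # Assumption: one token per node, derived from the node name.
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, key: str, replication_factor: int):
        """First replica: first node at or after the key's token.
        Further replicas: the next nodes clockwise on the ring."""
        size = len(self._ring)
        start = bisect.bisect_left(self._tokens, token(key)) % size
        count = min(replication_factor, size)
        return [self._ring[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42", 3))  # three distinct nodes, first one is the primary
```

Because placement depends only on the key's hash and the ring layout, any node can compute where a row lives without asking a central directory.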

Cassandra database design and maintenance summary

1. Design the partition key as a timeuuid/timestamp (for example, the timestamp of 00:00 each day or of the start of each hour) plus any bucket field (a fixed field such as type).

2. Clustering columns can be designed according to your requirements.

3. A query that uses an index must also include a condition on the partition key. Otherwise Cassandra has to consult the index across all partitions, which is inefficient, and once the query scans more than 100,000 tombstones an exception is thrown.

4. Low-cardinality fields such as true/false flags should generally not be indexed; otherwise query efficiency drops sharply.

5. Because of Cassandra's read repair mechanism, a large number of read timeouts may occur after a large number of delete operations. If that happens, run the following in the bin directory of each Cassandra node:

./nodetool flush

./nodetool compact $keyspace $table

This forces the SSTables to be merged.

6. If data is inconsistent between nodes, run ./nodetool repair $keyspace $table.

7. If the repair still does not solve the problem, run ./sstablescrub $keyspace $table to clean up the corrupted data. Because this operation can itself lose data, it is best to take a snapshot before running it.

8. By default Cassandra runs in a safe mode: when a sensitive operation such as DROP or TRUNCATE is performed, it takes a snapshot of the data first. Too many snapshots make Cassandra spend too long traversing directories at startup, so a start can take several hours.

In that case, run ./nodetool clearsnapshot $keyspace.
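The partition-key design from item 1 can be sketched as follows. The sketch assumes a daily bucket computed in UTC and a fixed "type" field as the second component. The function names and the sample event type are hypothetical.

```python
from datetime import datetime, timezone


def day_bucket(ts: datetime) -> int:
    """Epoch milliseconds of 00:00 UTC on the day containing `ts`."""
    midnight = ts.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return int(midnight.timestamp() * 1000)


def partition_key(ts: datetime, event_type: str) -> tuple:
    """Composite partition key: (daily time bucket, fixed bucket field)."""
    return (day_bucket(ts), event_type)


a = partition_key(datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc), "login")
b = partition_key(datetime(2024, 5, 1, 23, 59, tzinfo=timezone.utc), "login")
c = partition_key(datetime(2024, 5, 2, 0, 0, tzinfo=timezone.utc), "login")
print(a == b, a == c)  # same day shares a partition; the next day does not
```

Bucketing by time keeps any single partition from growing without bound, and the extra fixed field spreads one time bucket across several partitions.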
