1. Introduction
Sometimes, we need a quick reference guide to get started on our learning path. A cheat sheet serves that purpose by collecting all the critical information in one document.
In this tutorial, we’ll learn the essential concepts of Cassandra query language (CQL) and how to apply them using a cheat sheet that we’ll build along the way.
2. Cassandra at a Glance
Apache Cassandra is an open-source, distributed NoSQL data storage system. This means that instead of living on a single server, it spreads its data across multiple servers. It's also known for its high availability and partition tolerance.
To put it another way, the design of the Cassandra database is inspired by the “AP” of the CAP theorem.
Furthermore, Cassandra has a masterless architecture, is massively scalable, and above all, provides easy fault detection and recovery.
3. Data Types
Generally, Cassandra supports a rich set of data types. These include native types, collection types, user-defined types, and tuples, together with custom types.
3.1. Native Types
The native types are the built-in types and provide support to a range of constants in Cassandra.
To begin with, a string is a very popular datatype in the programming world.
CQL offers four data types that accept string constants:
Data Type | Constants Supported | Description |
ascii | string | ASCII character string |
inet | string | IPv4 or IPv6 address string |
text | string | UTF8 encoded string |
varchar | string | UTF8 encoded string |
A boolean has one of two possible values, either true or false:
Data Type | Constants Supported | Description |
boolean | boolean | true or false |
Using the blob data type, we can store images or multimedia data as a binary stream in a database:
Data Type | Constants Supported | Description |
blob | blob | Arbitrary bytes |
A duration is made up of three signed integers that represent months, days, and nanoseconds:
Data Type | Constants Supported | Description |
duration | duration | A duration value |
Cassandra offers a wide range of data types for integer data:
Data Type | Constants Supported | Description |
tinyint | integer | 8-bit signed int |
smallint | integer | 16-bit signed int |
int | integer | 32-bit signed int |
bigint | integer | 64-bit signed long |
varint | integer | Arbitrary-precision integer |
counter | integer | Counter column (64-bit signed) |
For decimal and floating-point numbers, we have three data types:
Data Type | Constants Supported | Description |
decimal | integer, float | Variable precision decimal |
double | integer, float | 64-bit floating-point |
float | integer, float | 32-bit floating-point |
For date- and time-related needs, Cassandra provides three data types:
Data Type | Constants Supported | Description |
date | integer, string | A date value (without time) |
time | integer, string | A time value (without date) |
timestamp | integer, string | A timestamp (with date & time) |
Generally, we have to avoid collisions while using the INSERT or UPDATE commands. For this purpose, Cassandra provides UUID types, which work well as unique identifiers:
Data Type | Constants Supported | Description |
uuid | uuid | A UUID (any version) |
timeuuid | uuid | A version 1 UUID |
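To see several of the native types side by side, here is a minimal sketch. The table name sensor_readings and all of its columns are hypothetical, and the statements assume a keyspace has already been selected with USE:

```cql
-- Hypothetical table that exercises several native types
CREATE TABLE IF NOT EXISTS sensor_readings (
    id         uuid PRIMARY KEY,  -- any UUID version
    created    timeuuid,          -- version 1 (time-based) UUID
    source_ip  inet,
    label      text,
    is_valid   boolean,
    payload    blob,
    sample_gap duration,
    reading    double,
    taken_on   date,
    taken_at   timestamp
);

-- uuid() and now() generate the identifiers, so we don't pick them by hand
INSERT INTO sensor_readings
    (id, created, source_ip, label, is_valid, payload, sample_gap, reading, taken_on, taken_at)
VALUES
    (uuid(), now(), '192.168.0.10', 'boiler-1', true, 0xCAFE, 1h30m, 21.5,
     '2023-01-15', '2023-01-15 10:30:00+0000');
```

Note that the blob constant is written in hexadecimal with a 0x prefix, while the inet, date, and timestamp values are passed as string constants.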
3.2. Collection Types
In a relational database, when a user has multiple values for one field, it's common to store them in a separate table. For example, a user may have several bank accounts, contact details, or email addresses. In that case, we need to join the two tables to retrieve all the data.
Cassandra provides a way to group and store data together in a column using collection types.
Let’s quickly look at those types:
- set – unique values; returned in sorted order
- list – can contain duplicate values; order matters
- map – stores data as key-value pairs
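As an illustration of the three collection types, the following sketch keeps a user's emails, tasks, and phone numbers in a single row. The table user_contacts, its columns, and the sample values are made up for this example:

```cql
-- Hypothetical table with one column per collection type
CREATE TABLE IF NOT EXISTS user_contacts (
    user_id uuid PRIMARY KEY,
    emails  set<text>,
    tasks   list<text>,
    phones  map<text, text>
);

INSERT INTO user_contacts (user_id, emails, tasks, phones)
VALUES (123e4567-e89b-12d3-a456-426614174000,
        {'a@example.com', 'b@example.com'},
        ['call bank', 'send report'],
        {'home': '555-0100', 'work': '555-0101'});

-- Add one element to each collection without rewriting the whole column
UPDATE user_contacts
SET emails = emails + {'c@example.com'},
    tasks  = tasks + ['file taxes'],
    phones['mobile'] = '555-0102'
WHERE user_id = 123e4567-e89b-12d3-a456-426614174000;
```

Sets use curly braces, lists use square brackets, and maps use curly braces with key: value pairs.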
3.3. User-Defined Types
User-defined types give us the liberty to attach multiple data fields in a single column:
CREATE TYPE student.basic_info (
    birthday timestamp,
    race text,
    weight text,
    height text
);
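Once the type exists, it can be used like any other column type. The sketch below assumes the student keyspace and the basic_info type from above are already in place; the profiles table and the sample values are hypothetical:

```cql
-- Hypothetical table that stores the user-defined type in one column
CREATE TABLE IF NOT EXISTS student.profiles (
    student_id int PRIMARY KEY,
    name       text,
    info       frozen<basic_info>
);

-- A UDT value is written as a set of field: value pairs
INSERT INTO student.profiles (student_id, name, info)
VALUES (1, 'Alice',
        {birthday: '2000-05-17', race: 'n/a', weight: '60kg', height: '170cm'});

-- Read a single field of the user-defined type
SELECT name, info.birthday FROM student.profiles WHERE student_id = 1;
```

The frozen keyword here means the value is written and read as a whole; it's used in this sketch because it works across Cassandra versions.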
3.4. Tuple Type
A tuple is an alternative to a user-defined type. It’s created using angle brackets and a comma delimiter to separate the types of elements it contains.
Here are the commands for a simple tuple:
-- create a table with a tuple column
CREATE TABLE subjects (
    k int PRIMARY KEY,
    v tuple<int, text, float>
);

-- insert a row with a tuple value
INSERT INTO subjects (k, v) VALUES (0, (3, 'cs', 2.1));

-- retrieve values
SELECT * FROM subjects;
4. Cassandra CQL Commands
Let’s look at several categories of CQL commands.
4.1. Keyspace Commands
The first thing to remember is that a keyspace in Cassandra is much like a database in an RDBMS. It is the outermost container of data and defines the replication strategy and other options that apply to all of its tables. With this in mind, a good general rule is one keyspace per application.
Let’s look at the related commands:
Command | Example | Description |
CREATE keyspace | CREATE KEYSPACE keyspace_name WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}; | To create a keyspace. |
DESCRIBE keyspace | DESCRIBE KEYSPACES; | It will list all the keyspaces. |
USE keyspace | USE keyspace_name; | This command connects the client session to a keyspace. |
ALTER keyspace | ALTER KEYSPACE keyspace_name WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3} AND DURABLE_WRITES = false; | To alter a keyspace. |
DROP keyspace | DROP KEYSPACE keyspace_name; | To drop a keyspace. |
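Put together, the life cycle of a keyspace might look like the following sketch. The keyspace name school and the replication factors are assumptions chosen for a small test cluster:

```cql
-- Create the keyspace only if it isn't there yet
CREATE KEYSPACE IF NOT EXISTS school
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Make it the default keyspace for the session
USE school;

-- cqlsh shell command that prints the keyspace definition
DESCRIBE KEYSPACE school;

-- Raise the replication factor later on
ALTER KEYSPACE school
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Remove the keyspace together with all of its tables
DROP KEYSPACE IF EXISTS school;
```

The IF NOT EXISTS and IF EXISTS clauses make the statements safe to rerun, which is handy in setup scripts.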
4.2. Table Commands
In Cassandra, a table is also referred to as a column family. We already know the importance of a primary key. In Cassandra, defining the primary key is mandatory when creating a table.
Let’s review these commands:
Command | Example | Description |
CREATE table | CREATE TABLE table_name (column_name1 UUID PRIMARY KEY, column_name2 text, column_name3 text, column_name4 timestamp); | To create a table. |
ALTER table | ALTER TABLE table_name ADD column_name int; | It will add a new column to a table. |
ALTER table | ALTER TABLE table_name ALTER column_name TYPE datatype; | We can change the data type of an existing column. Note that newer Cassandra versions have removed this option. |
ALTER table | ALTER TABLE table_name WITH caching = {'keys': 'NONE', 'rows_per_partition': '1'}; | This command helps to alter the properties of a table. |
DROP table | DROP TABLE table_name; | To drop a table. |
TRUNCATE table | TRUNCATE table_name; | Using this, we can permanently remove all the rows from a table. |
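As a slightly fuller sketch, the statements below create a table with a compound primary key and then change it. The table exam_results and its columns are hypothetical, and a keyspace is assumed to be selected already:

```cql
-- student_id is the partition key; exam_date and subject are clustering columns
CREATE TABLE IF NOT EXISTS exam_results (
    student_id uuid,
    exam_date  date,
    subject    text,
    score      int,
    PRIMARY KEY ((student_id), exam_date, subject)
) WITH CLUSTERING ORDER BY (exam_date DESC, subject ASC);

-- Add a column, then drop it again
ALTER TABLE exam_results ADD remarks text;
ALTER TABLE exam_results DROP remarks;

-- Remove every row but keep the table definition
TRUNCATE exam_results;
```

Primary key columns can't be added or dropped afterwards, so the key is worth planning before the table is created.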
4.3. Index Commands
Instead of scanning a whole table and waiting for results, we can use indexes to speed up queries. However, we must remember that the primary key in Cassandra is already indexed. Therefore, there is no need to create a separate index on it.
Let’s look at the commands:
Command | Example | Description |
CREATE index | CREATE INDEX index_name ON table_name (column_name); | To create an index. |
DROP index | DROP INDEX IF EXISTS index_name; | To drop an index. |
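Here is a small sketch of an index in use. The students table, the city column, and the index name students_city_idx are all assumptions made for the example:

```cql
CREATE TABLE IF NOT EXISTS students (
    student_id uuid PRIMARY KEY,
    name       text,
    city       text
);

-- Index a column that is not part of the primary key
CREATE INDEX IF NOT EXISTS students_city_idx ON students (city);

-- With the index in place, we can filter on the non-key column
SELECT * FROM students WHERE city = 'Lisbon';

DROP INDEX IF EXISTS students_city_idx;
```

Without the index, Cassandra would reject the SELECT above unless we added ALLOW FILTERING to it.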
4.4. Basic Commands
These commands are used to read and manipulate the table values:
Command | Example | Description |
INSERT | INSERT INTO table_name (column_name1, column_name2) VALUES(value1, value2); | To insert a record in a table. |
SELECT | SELECT * FROM table_name; | The command is used to fetch data from a specific table. |
WHERE | SELECT * FROM table_name WHERE column_name=value; | It filters records by a predicate. |
UPDATE | UPDATE table_name SET column_name2=value2 WHERE column_name1=value1; | It is used to edit records. |
DELETE | DELETE identifier FROM table_name WHERE condition; | This statement deletes the named column value from the matching rows; when no column is named, it deletes the whole row. |
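The following sketch walks through these commands on one row. The courses table and its values are hypothetical:

```cql
CREATE TABLE IF NOT EXISTS courses (
    course_id int PRIMARY KEY,
    title     text,
    teacher   text
);

-- Create a row
INSERT INTO courses (course_id, title, teacher)
VALUES (1, 'Databases', 'Ms. Smith');

-- Read it back by primary key
SELECT title, teacher FROM courses WHERE course_id = 1;

-- Change one column
UPDATE courses SET teacher = 'Mr. Jones' WHERE course_id = 1;

-- Delete a single column value, then the whole row
DELETE teacher FROM courses WHERE course_id = 1;
DELETE FROM courses WHERE course_id = 1;
```

It's worth keeping in mind that INSERT and UPDATE both behave as upserts in Cassandra: an INSERT overwrites an existing row with the same primary key, and an UPDATE creates the row if it doesn't exist.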
4.5. Other Commands
Cassandra has two different types of keys: partition key and clustering key. A partition key indicates the node(s) where the data is stored.
In comparison, the clustering key determines the order of data within a partition:
Command | Example | Description |
ORDER BY | SELECT * FROM table_name WHERE column_name1 = value ORDER BY column_name2 ASC; | For this, the partition key must be restricted in the WHERE clause. Also, the ORDER BY clause names the clustering column to use for ordering. |
GROUP BY | SELECT column_name FROM table_name GROUP BY column_name1, column_name2; | This clause supports grouping only by the partition key, or by the partition key and clustering key columns. |
LIMIT | SELECT * FROM table_name LIMIT 3; | For a large table, limit the number of rows retrieved. |
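To make the role of the two kinds of keys concrete, here is a sketch with a hypothetical page_views table, where site is the partition key and view_day and view_time are the clustering columns:

```cql
CREATE TABLE IF NOT EXISTS page_views (
    site      text,
    view_day  date,
    view_time timestamp,
    visitor   text,
    PRIMARY KEY ((site), view_day, view_time)
);

-- ORDER BY: the partition key is fixed, rows are sorted by the clustering columns
SELECT * FROM page_views
WHERE site = 'example.org'
ORDER BY view_day DESC, view_time DESC;

-- GROUP BY: the columns follow the primary key order
SELECT site, view_day, COUNT(*) FROM page_views
GROUP BY site, view_day;

-- LIMIT: return at most three rows
SELECT * FROM page_views LIMIT 3;
```

The ORDER BY here simply reverses the default ascending clustering order; an ordering that mixes directions differently from the table definition isn't accepted.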
5. Operators
Cassandra supports both arithmetic and conditional types of operators. Under the arithmetic operators, we have +, -, *, /, %, and - (unary) for addition, subtraction, multiplication, division, remainder, and negation, respectively.
The WHERE clause is significant in Cassandra. The conditional operators are used in this clause with certain scenarios and limitations. These operators are CONTAINS, CONTAINS KEY, IN, =, >, >=, <, and <=.
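A few of these operators are sketched below on a hypothetical inventory table. The two indexes are there because CONTAINS and CONTAINS KEY need an indexed collection (or ALLOW FILTERING), and the last query assumes Cassandra 4.0 or later, where arithmetic operators are available:

```cql
CREATE TABLE IF NOT EXISTS inventory (
    warehouse text,
    item_id   int,
    quantity  int,
    tags      set<text>,
    attrs     map<text, text>,
    PRIMARY KEY ((warehouse), item_id)
);

CREATE INDEX IF NOT EXISTS inventory_tags_idx ON inventory (tags);
CREATE INDEX IF NOT EXISTS inventory_attr_keys_idx ON inventory (KEYS(attrs));

-- IN on the partition key
SELECT * FROM inventory WHERE warehouse IN ('north', 'south');

-- Range operators on a clustering column
SELECT * FROM inventory
WHERE warehouse = 'north' AND item_id >= 100 AND item_id < 200;

-- CONTAINS and CONTAINS KEY on collections
SELECT * FROM inventory WHERE tags CONTAINS 'fragile';
SELECT * FROM inventory WHERE attrs CONTAINS KEY 'color';

-- Arithmetic in the selection (assumes Cassandra 4.0+)
SELECT item_id, quantity * 2 FROM inventory WHERE warehouse = 'north';
```

The range operators are applied to a clustering column here, with the partition key fixed by an equality condition, which is one of the limitations mentioned above.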
6. Common Functions
Functions, either aggregate or scalar, play an essential part in transforming values. For this reason, Cassandra offers several native functions in both categories.
Let’s look at those functions:
- Blob conversion functions
- UUID & Timeuuid functions
- Token function
- WRITETIME function
- TTL function
- MIN(), MAX(), SUM(), AVG()
Along with these native functions, Cassandra also allows users to define their own functions and aggregates.
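To give a feel for the native functions listed above, here is a minimal sketch. The events table and its columns are hypothetical, and the function names are the classic spellings (newer releases also offer snake_case variants):

```cql
CREATE TABLE IF NOT EXISTS events (
    id      timeuuid PRIMARY KEY,
    name    text,
    payload blob,
    amount  int
);

-- now() generates a timeuuid; textAsBlob() converts text to bytes;
-- USING TTL makes the row expire after one day (86400 seconds)
INSERT INTO events (id, name, payload, amount)
VALUES (now(), 'signup', textAsBlob('hello'), 10)
USING TTL 86400;

-- Scalar functions: timeuuid and blob conversion, write time, remaining TTL, token
SELECT toTimestamp(id), blobAsText(payload), WRITETIME(name), TTL(name), token(id)
FROM events;

-- Aggregate functions
SELECT MIN(amount), MAX(amount), SUM(amount), AVG(amount), COUNT(*)
FROM events;
```

WRITETIME and TTL take a regular column as their argument, while token is applied to the partition key.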
7. Conclusion
In this short article, we’ve seen what the building blocks of Cassandra’s query language are. First, we studied the data types it supports and how to define them. Then, we looked at common commands to perform database operations. Finally, we discussed the operators and functions of the language.