1. Introduction
In our previous article, we built a dashboard for viewing the current status of the Avengers using DataStax Astra, a DBaaS powered by Apache Cassandra that uses Stargate to offer additional APIs for working with it.
In this article, we will extend that dashboard to store discrete events rather than just the rolled-up summary, and to show those events in the UI. The user will be able to click on a single card and get a table of the events that have led us to this point. Unlike the summary, each event represents one Avenger at one discrete point in time. Every time a new event is received, it is appended to the table alongside all the others.
We are using Cassandra for this because it stores time-series data very efficiently when we are writing much more often than we are reading. The goal is a system that can be updated frequently (for example, every 30 seconds) and that lets users easily see the most recent events that have been recorded.
2. Building out the Database Schema
Unlike the previous article, which used the Document API, this one is built using the REST and GraphQL APIs. These work on top of a Cassandra table, and they are fully interoperable with each other and with the CQL API.
To work with these APIs, we must first define a schema for the table that will store our data. The table we are using is designed around a specific query: find the events for a given Avenger in the order in which they happened.
This schema will look as follows:
CREATE TABLE events (
    avenger text,
    timestamp timestamp,
    latitude decimal,
    longitude decimal,
    status decimal,
    PRIMARY KEY (avenger, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
With data that looks similar to this:
avenger | timestamp | latitude | longitude | status |
---|---|---|---|---|
falcon | 2021-05-16 09:00:30.000000+0000 | 40.715255 | -73.975353 | 0.999954 |
hawkeye | 2021-05-16 09:00:30.000000+0000 | 40.714602 | -73.975238 | 0.99986 |
hawkeye | 2021-05-16 09:01:00.000000+0000 | 40.713572 | -73.975289 | 0.999804 |
This defines our table to have multi-row partitions, with a partition key of “avenger” and a clustering key of “timestamp”. Cassandra uses the partition key to determine which node the data is stored on. It uses the clustering key to determine the order in which the data is stored within the partition.
Declaring “avenger” as our partition key ensures that all data for the same Avenger is kept together. Declaring “timestamp” as our clustering key stores the data within each partition in the most efficient order for us to retrieve. Our core query selects every event for a single Avenger (our partition key), ordered by the timestamp of the event (our clustering key), so Cassandra can serve it very efficiently.
In addition, the application is designed to write event data on a near-continuous basis. For example, we might get a new event from every Avenger every 30 seconds. Structuring our table in this way makes it very efficient to insert each new event into the correct position in the correct partition.
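To make the roles of the two keys concrete, the following is a minimal in-memory sketch of the same layout. It is only an analogy and not how Cassandra is implemented: the class name EventsTableModel and its methods are invented for illustration, the outer map stands in for the partition key, and the sorted inner map stands in for the clustering order.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative in-memory analogy only; this is not how Cassandra stores data.
final class EventsTableModel {
    // Partition key (avenger) -> rows sorted by clustering key (timestamp), newest first.
    private final Map<String, NavigableMap<Instant, Double>> partitions = new HashMap<>();

    // Insert a row: find the partition, then place the row in timestamp order.
    void insert(String avenger, Instant timestamp, double status) {
        partitions
            .computeIfAbsent(avenger, key -> new TreeMap<Instant, Double>(Comparator.reverseOrder()))
            .put(timestamp, status);
    }

    // Core query: the newest `limit` timestamps recorded for one avenger.
    List<Instant> latestTimestamps(String avenger, int limit) {
        NavigableMap<Instant, Double> rows = partitions.getOrDefault(avenger, new TreeMap<>());
        var result = new ArrayList<Instant>();
        for (Instant timestamp : rows.keySet()) {
            if (result.size() == limit) {
                break;
            }
            result.add(timestamp);
        }
        return result;
    }
}
```

Reading the latest events only touches one partition and walks it from the front, which is the access pattern the real table is designed around.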
For convenience, our script for pre-populating our database will also create and populate this schema.
3. Building the Client Layer Using Astra, REST, & GraphQL APIs
We are going to interact with Astra using both the REST and GraphQL APIs, for different purposes. The REST API will be used for inserting new events into the table. The GraphQL API will be used for retrieving them again.
To do this, we need a client layer that performs the interactions with Astra. We will write one client class for each of these two APIs, equivalent to the DocumentClient class that we built in the previous article.
3.1. REST Client
First, our REST client. We will use this to insert new, whole records, so it only needs a single method that takes the data to insert:
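import java.util.Map;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.RequestEntity;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.stereotype.Repository;
import org.springframework.web.client.RestTemplate;
import org.springframework.web.util.UriComponentsBuilder;

@Repository
public class RestClient {
    // Base URL of the REST API for our keyspace, built from the Astra properties
    @Value("https://${ASTRA_DB_ID}-${ASTRA_DB_REGION}.apps.astra.datastax.com/api/rest/v2/keyspaces/${ASTRA_DB_KEYSPACE}")
    private String baseUrl;

    @Value("${ASTRA_DB_APPLICATION_TOKEN}")
    private String token;

    private RestTemplate restTemplate;

    public RestClient() {
        this.restTemplate = new RestTemplate();
        this.restTemplate.setRequestFactory(new HttpComponentsClientHttpRequestFactory());
    }

    // POSTs the record, serialized as JSON, to <baseUrl>/<table>
    public <T> void createRecord(String table, T record) {
        var uri = UriComponentsBuilder.fromHttpUrl(baseUrl)
          .pathSegment(table)
          .build()
          .toUri();
        var request = RequestEntity.post(uri)
          .header("X-Cassandra-Token", token)
          .body(record);
        restTemplate.exchange(request, Map.class);
    }
}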
3.2. GraphQL Client
Then, our GraphQL Client. This time we are taking a full GraphQL query and returning the data that it fetches:
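import java.util.Map;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.RequestEntity;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.stereotype.Repository;
import org.springframework.web.client.RestTemplate;

@Repository
public class GraphqlClient {
    // URL of the GraphQL endpoint for our keyspace, built from the Astra properties
    @Value("https://${ASTRA_DB_ID}-${ASTRA_DB_REGION}.apps.astra.datastax.com/api/graphql/${ASTRA_DB_KEYSPACE}")
    private String baseUrl;

    @Value("${ASTRA_DB_APPLICATION_TOKEN}")
    private String token;

    private RestTemplate restTemplate;

    public GraphqlClient() {
        this.restTemplate = new RestTemplate();
        this.restTemplate.setRequestFactory(new HttpComponentsClientHttpRequestFactory());
    }

    // POSTs {"query": "..."} and maps the JSON response onto the given class
    public <T> T query(String query, Class<T> cls) {
        var request = RequestEntity.post(baseUrl)
          .header("X-Cassandra-Token", token)
          .body(Map.of("query", query));
        var response = restTemplate.exchange(request, cls);
        return response.getBody();
    }
}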
As before, our baseUrl and token fields are configured from our properties defining how to talk to Astra. These client classes each know how to build the complete URLs needed to interact with the database. We can use them to make the correct HTTP requests to perform the desired actions.
That’s all that’s needed to interact with Astra, since these APIs work by simply exchanging JSON documents over HTTP.
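To show how little is involved at the HTTP level, here is a minimal sketch of the same insert request built with the JDK's own java.net.http classes instead of RestTemplate. The class name AstraRequestSketch is invented for illustration, and the base URL, token, and JSON body are supplied by the caller as placeholders. The request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds, but does not send, the POST that inserts one row into a table.
final class AstraRequestSketch {
    static HttpRequest insertRow(String baseUrl, String token, String table, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + table))
            // Astra authenticates each call with the application token in this header
            .header("X-Cassandra-Token", token)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }
}
```

Sending the request would then be a matter of passing it to an HttpClient, which is the step RestTemplate performs for us in the client classes.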
4. Recording Individual Events
In order to display events, we need to be able to record them. This will build on top of the functionality we had before to update the statuses table, and will additionally insert new records into the events table.
4.1. Inserting Events
The first thing we need is a representation of the data in this table. We will represent it as a Java record:
public record Event(String avenger,
                    String timestamp,
                    Double latitude,
                    Double longitude,
                    Double status) {}
This directly correlates to the schema we defined earlier. Jackson will convert this into the correct JSON for the REST API when we actually make the API calls.
Next, we need our service layer to actually record these. This will take the appropriate details from outside, augment them with the timestamp and call our REST client to create the new record:
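import java.time.Instant;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class EventsService {
    @Autowired
    private RestClient restClient;

    public void createEvent(String avenger, Double latitude, Double longitude, Double status) {
        // The event time is the moment we receive the update, as an ISO-8601 UTC string
        var event = new Event(avenger, Instant.now().toString(), latitude, longitude, status);
        restClient.createRecord("events", event);
    }
}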
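As a rough illustration of what ends up on the wire, the sketch below builds an event for a fixed instant and renders it as JSON by hand. This is a stand-in for what Jackson does, not Jackson itself: the class EventJsonSketch and its methods are invented for illustration, and the formatting assumes that the Avenger ID contains no characters that need JSON escaping.

```java
import java.time.Instant;

// Sketch only: a hand-rolled stand-in for the JSON that Jackson generates from the record.
final class EventJsonSketch {
    record Event(String avenger, String timestamp, Double latitude, Double longitude, Double status) {}

    // Mirrors createEvent, but takes the instant as a parameter so the result is repeatable.
    static Event at(Instant when, String avenger, double latitude, double longitude, double status) {
        return new Event(avenger, when.toString(), latitude, longitude, status);
    }

    // Assumes the avenger ID needs no JSON escaping.
    static String toJson(Event e) {
        return String.format(
            "{\"avenger\":\"%s\",\"timestamp\":\"%s\",\"latitude\":%s,\"longitude\":%s,\"status\":%s}",
            e.avenger(), e.timestamp(), e.latitude(), e.longitude(), e.status());
    }
}
```

The point to notice is the timestamp: Instant.toString() produces an ISO-8601 string in UTC, such as 2021-05-16T09:00:30Z, which is the form sent to the REST API for the timestamp column.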
4.2. Update API
Finally, we need a controller to receive the events. We extend the UpdateController that we wrote in the previous article to wire in the new EventsService and call it from our update method:
@RestController
public class UpdateController {
    ......
    @Autowired
    private EventsService eventsService;

    @PostMapping("/update/{avenger}")
    public void update(@PathVariable String avenger, @RequestBody UpdateBody body) throws Exception {
        eventsService.createEvent(avenger, body.lat(), body.lng(), body.status());
        statusesService.updateStatus(avenger, lookupLocation(body.lat(), body.lng()), getStatus(body.status()));
    }
    ......
}
At this point, calls to our API to record the status of an Avenger will both update the statuses document and insert a new record into the events table. This will allow us to record every update event that happens.
This means that every single time we receive a call to update the status of an Avenger, we add a new record to this table. In a real system, we would need to manage the volume of data being stored, either by pruning old events or by adding further partitioning, but that is out of scope for this article.
5. Making Events Available to Users via the GraphQL API
Once we have events in our table, the next step is to make them available to users. We will achieve this using the GraphQL API, retrieving a page of events at a time for a given Avenger, always ordered so that the most recent ones come first.
Using GraphQL we also have the ability to only retrieve the subset of fields that we are actually interested in, rather than all of them. If we are fetching a large number of records then this can help keep the payload size down and thus improve performance.
5.1. Retrieving Events
The first thing we need is a representation of the data we are retrieving. This is a subset of the actual data stored in the table. As such, we will want a different class to represent it:
public record EventSummary(String timestamp,
                           Double latitude,
                           Double longitude,
                           Double status) {}
We also need a class that represents the GraphQL response for a list of these. This includes the list of event summaries and the page state, which acts as a cursor to the next page:
public record Events(List<EventSummary> values, String pageState) {}
We can now create a new method within our EventsService to actually perform the search:
public class EventsService {
    ......
    @Autowired
    private GraphqlClient graphqlClient;

    // offset is the pageState returned with the previous page, or null for the first page
    public Events getEvents(String avenger, String offset) {
        var query = "query {" +
          "  events(filter:{avenger:{eq:\"%s\"}}, orderBy:[timestamp_DESC], options:{pageSize:5, pageState:%s}) {" +
          "    pageState " +
          "    values {" +
          "      timestamp " +
          "      latitude " +
          "      longitude " +
          "      status" +
          "    }" +
          "  }" +
          "}";

        // Substitute the Avenger ID and either a quoted page state or the literal null
        var fullQuery = String.format(query, avenger, offset == null ? "null" : "\"" + offset + "\"");

        return graphqlClient.query(fullQuery, EventsGraphqlResponse.class).data().events();
    }

    // Wrappers matching the response envelope: { "data": { "events": { ... } } }
    private static record EventsResponse(Events events) {}
    private static record EventsGraphqlResponse(EventsResponse data) {}
}
Here we have a couple of inner classes that exist purely to represent the JSON structure returned by the GraphQL API, down to the part that is interesting to us. These are entirely an artefact of the GraphQL API.
We then have a method that constructs a GraphQL query for the details that we want, filtering by the avenger field and sorting by the timestamp field in descending order. Into this we substitute the actual Avenger ID and the page state to use before passing it on to our GraphQL client to get the actual data.
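Because the Avenger ID and page state are substituted directly into the query text, a value containing a double quote would produce a malformed query. The sketch below shows one way the substitution could be hardened. It is an illustration rather than part of the application: the class EventsQuerySketch is invented, and the escaping covers only backslashes and double quotes, not control characters.

```java
// Sketch only: builds the same events query, escaping the substituted values.
final class EventsQuerySketch {
    // Wraps a value in double quotes, escaping backslashes and double quotes inside it.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // pageState is null for the first page, otherwise the cursor from the previous page.
    static String build(String avenger, String pageState) {
        var template = "query { events(filter:{avenger:{eq:%s}}, orderBy:[timestamp_DESC], "
            + "options:{pageSize:5, pageState:%s}) "
            + "{ pageState values { timestamp latitude longitude status } } }";
        return String.format(template, quote(avenger), pageState == null ? "null" : quote(pageState));
    }
}
```

For the first page, build("hawkeye", null) yields a query with pageState:null. For later pages, the cursor is passed as a quoted string.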
5.2. Displaying Events in the UI
Now that we can fetch the events from the database, we can then wire this up to our UI.
First, we will update the StatusesController that we wrote in the previous article to support the UI endpoint for fetching the events:
public class StatusesController {
    ......
    @Autowired
    private EventsService eventsService;

    @GetMapping("/avenger/{avenger}")
    public Object getAvengerStatus(@PathVariable String avenger, @RequestParam(required = false) String page) {
        var result = new ModelAndView("dashboard");
        result.addObject("avenger", avenger);
        result.addObject("statuses", statusesService.getStatuses());
        result.addObject("events", eventsService.getEvents(avenger, page));
        return result;
    }
}
Then we need to update our templates to render the events table. We’ll add a new table to the dashboard.html file that is only rendered if the events object is present in the model received from the controller:
......
    <div th:if="${events}">
      <div class="row">
        <table class="table">
          <thead>
            <tr>
              <th scope="col">Timestamp</th>
              <th scope="col">Latitude</th>
              <th scope="col">Longitude</th>
              <th scope="col">Status</th>
            </tr>
          </thead>
          <tbody>
            <tr th:each="data, iterstat: ${events.values}">
              <th scope="row" th:text="${data.timestamp}">
              </th>
              <td th:text="${data.latitude}"></td>
              <td th:text="${data.longitude}"></td>
              <td th:text="${(data.status * 100) + '%'}"></td>
            </tr>
          </tbody>
        </table>
      </div>
      <div class="row" th:if="${events.pageState}">
        <div class="col position-relative">
          <a th:href="@{/avenger/{id}(id = ${avenger}, page = ${events.pageState})}"
            class="position-absolute top-50 start-50 translate-middle">Next
            Page</a>
        </div>
      </div>
    </div>
</div>
......
This includes a link at the bottom to go to the next page, which passes along the page state from our events data and the ID of the Avenger that we are looking at.
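The Next Page link relies on a simple cursor contract: each response carries a page state, and sending it back fetches the following page until no page state is returned. The sketch below follows that contract in a loop against any page source. The PagingSketch class, its Page record, and the fetchPage function are all invented for illustration; in our application, the user clicking the link plays the role of the loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch only: follows pageState cursors until the source reports no further page.
final class PagingSketch {
    // One page of results plus the cursor for the next page (null when there are no more).
    record Page<T>(List<T> values, String pageState) {}

    // fetchPage receives null for the first page, then each returned pageState in turn.
    static <T> List<T> fetchAll(Function<String, Page<T>> fetchPage) {
        var all = new ArrayList<T>();
        String cursor = null;
        do {
            Page<T> page = fetchPage.apply(cursor);
            all.addAll(page.values());
            cursor = page.pageState();
        } while (cursor != null);
        return all;
    }
}
```

A loop like this is only sensible for small result sets. For an ever-growing events table, fetching one page per request, as the dashboard does, keeps each read cheap.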
And finally, we need to update the status cards to allow us to link through to the events table for this entry. This is simply a hyperlink around the header in each card, rendered in status.html:
......
<a th:href="@{/avenger/{id}(id = ${data.avenger})}">
  <h5 class="card-title" th:text="${data.name}"></h5>
</a>
......
We can now start up the application and click through from the cards to see the most recent events that led up to this status.
6. Summary
Here we have seen how the Astra REST and GraphQL APIs can be used to work with row-based data, and how they can work together. We’re also starting to see how well Cassandra, and these APIs, can be used for massive data sets.
All of the code from this article can be found on GitHub.