This Week in Databend #115
October 15, 2023 · 3 min read
PsiACE
Stay up to date with the latest weekly developments on Databend!
Databend is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. Open source alternative to Snowflake. Also available in the cloud: https://app.databend.com .
What's New
Stay informed about the latest features of Databend.
AGGREGATING INDEX
Databend has recently introduced AGGREGATING INDEX to improve query performance, especially for aggregation queries involving MIN, MAX, and SUM. Aggregating indexes leverage techniques like precomputing and storing query results separately to eliminate the need to scan the entire table, thus speeding up data retrieval.
In addition, this feature includes a refresh mechanism that lets you update and persist the latest query results on demand, so the stored results stay consistent with the table. Databend recommends manually refreshing the aggregating index before executing relevant queries so that they return the most up-to-date data; Databend Cloud supports automatic refreshing of aggregating indexes.
-- Create an aggregating index
CREATE AGGREGATING INDEX my_agg_index AS SELECT MIN(a), MAX(c) FROM agg;
-- Refresh the aggregating index
REFRESH AGGREGATING INDEX my_agg_index;
The AGGREGATING INDEX requires Databend Enterprise Edition. Please contact the Databend team for upgrade information.
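To put the two statements above in context, here is a minimal sketch of the full workflow. The column layout of `agg` and the sample rows are assumptions made for illustration; only `my_agg_index` and the `MIN(a), MAX(c)` query come from the example above.

```sql
-- Hypothetical table: the example above only shows that columns a and c exist.
CREATE TABLE agg (a INT, b INT, c INT);
INSERT INTO agg VALUES (1, 1, 4), (1, 2, 1), (1, 2, 4), (2, 2, 5);

-- Define the index over the aggregation you expect to run often.
CREATE AGGREGATING INDEX my_agg_index AS SELECT MIN(a), MAX(c) FROM agg;

-- Materialize the precomputed results (repeat after new writes).
REFRESH AGGREGATING INDEX my_agg_index;

-- A query with the same shape as the index definition can be served from it.
SELECT MIN(a), MAX(c) FROM agg;

-- Inspect the plan to check whether the index was actually used.
EXPLAIN SELECT MIN(a), MAX(c) FROM agg;
```

Rows written after the last refresh are not reflected in the stored results until the index is refreshed again, which is why a manual refresh before querying is recommended outside Databend Cloud.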
Code Corner
Discover some fascinating code snippets or projects that showcase our work or learning journey.
Visualizing the MERGE INTO Pipeline
Databend recently implemented the MERGE INTO statement to provide more comprehensive data management capabilities. For those interested in how it works under the hood, check out the pipeline visualization of MERGE INTO below.
                                                                                                                                               +-------------------+
                                                                                        +-----------------------------+   output_port_row_id   |                   |
                                           +-----------------------+        Matched     |                             +------------------------>-ResizeProcessor(1)+---------------+
                                           |                       +---+--------------->|    MatchedSplitProcessor    |                        |                   |               |
                                           |                       |   |                |                             +----------+             +-------------------+               |
        +----------------------+           |                       +---+                +-----------------------------+          |                                                 |
        |   MergeIntoSource    +---------->|MergeIntoSplitProcessor|                                                    output_port_updated                                        |
        +----------------------+           |                       +---+                +-----------------------------+          |             +-------------------+               |
                                           |                       |   |  NotMatched    |                             |          |             |                   |               |
                                           |                       +---+--------------->| MergeIntoNotMatchedProcessor+----------+------------->-ResizeProcessor(1)+-----------+   |
                                           +-----------------------+                    |                             |                        |                   |           |   |
                                                                                        +-----------------------------+                        +-------------------+           |   |
                                                                                                                                                                               |   |
                                                                                                                                                                               |   |
                                                                                                                                                                               |   |
                                                                                                                                                                               |   |
                                                                                                                                                                               |   |
                                                                              +-------------------------------------------------+                                              |   |
                                                                              |                                                 |                                              |   |
                                                                              |                                                 |                                              |   |
          +--------------------------+        +-------------------------+     |         ++---------------------------+          |     +--------------------------------------+ |   |
+---------+ TransformSerializeSegment<--------+ TransformSerializeBlock <-----+---------+|TransformAddComputedColumns|<---------+-----+TransformResortAddOnWithoutSourceSchema<-+   |
|         +--------------------------+        +-------------------------+     |         ++---------------------------+          |     +--------------------------------------+     |
|                                                                             |                                                 |                                                  |
|                                                                             |                                                 |                                                  |
|                                                                             |                                                 |                                                  |
|                                                                             |                                                 |                                                  |
|          +---------------+                 +------------------------------+ |               ++---------------+                |               +---------------+                  |
+----------+ TransformDummy|<----------------+ AsyncAccumulatingTransformer <-+---------------+|TransformDummy |<---------------+---------------+TransformDummy <------------------+
|          +---------------+                 +------------------------------+ |               ++---------------+                |               +---------------+
|                                                                             |                                                 |
|                                                                             |     If it includes 'computed', this section     |
|                                                                             |  of code will be executed, otherwise it won't   |
|                                                                             |                                                 |
|                                                                            -+-------------------------------------------------+
|
|
|
|        +------------------+            +-----------------------+        +-----------+
+------->|ResizeProcessor(1)+----------->|TableMutationAggregator+------->|CommitSink |
         +------------------+            +-----------------------+        +-----------+
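For orientation, here is a minimal sketch of a statement that would exercise both branches of the pipeline above. The tables `salaries` and `employees` and their columns are hypothetical, and depending on your Databend version MERGE INTO may need to be enabled through an experimental setting first, so check the documentation for your release.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE salaries (employee_id INT, salary DECIMAL(10, 2));
CREATE TABLE employees (employee_id INT, department VARCHAR);

MERGE INTO salaries
USING (SELECT * FROM employees) AS employees
ON salaries.employee_id = employees.employee_id
-- Rows that satisfy the ON condition follow the "Matched" branch.
WHEN MATCHED THEN
    UPDATE SET salaries.salary = salaries.salary + 500.00
-- Source rows with no counterpart follow the "NotMatched" branch.
WHEN NOT MATCHED THEN
    INSERT (employee_id, salary) VALUES (employees.employee_id, 55000.00);
```

Read against the diagram, the UPDATE clause corresponds to the path through MatchedSplitProcessor, while the INSERT clause corresponds to MergeIntoNotMatchedProcessor. Both paths converge on TableMutationAggregator and CommitSink, so the statement is committed as a single mutation.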
Highlights
We have also made these improvements to Databend that we hope you will find helpful:
- MERGE INTO now supports automatic recluster and compaction.
- SQLsmith now covers DELETE, UPDATE, ALTER TABLE, and CAST.
- Added semi-structured data processing functions json_each and json_array_elements.
- Added time and date functions to_week_of_year and date_part. See Docs | Date & Time Functions for details.
- Read Sending IoT Stream Data to Databend with LF Edge eKuiper to learn how Databend integrates with eKuiper to meet growing IoT data analytics demands.
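As a quick way to try the new functions listed above, the following sketch calls each of them on literal inputs. The argument forms shown here are assumptions based on common usage, so consult the function reference for the exact signatures and return types in your version.

```sql
-- Expand a JSON object into its key/value pairs.
SELECT json_each(parse_json('{"a": 1, "b": [1, 2]}'));

-- Expand a JSON array into one row per element.
SELECT json_array_elements(parse_json('[1, 2, 3]'));

-- Week number of the year for a given date.
SELECT to_week_of_year(to_date('2023-10-15'));

-- Extract a single component (here, the month) from a date.
SELECT date_part(MONTH, to_date('2023-10-15'));
```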
What's Up Next
We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.
Enhancing Role-Based Access Control
Currently, Databend's access control system consists of Role-Based Access Control (RBAC) and Discretionary Access Control (DAC). However, there is still room for improvement to make it more comprehensive.
We plan to support more privilege checks on uncovered resources and provide privilege definition guidance in 2023 Q4.
Issue #13207 | Tracking: RBAC improvement plan in 2023 Q4
Please let us know if you're interested in contributing to this feature, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.
Changelog
You can check the changelog of Databend Nightly for details about our latest developments.
Full Changelog: https://github.com/datafuselabs/databend/compare/v1.2.147-nightly...v1.2.160-nightly
🎉 Contributors
29 contributors
Thanks a lot to the contributors for their excellent work.
🎈Connect With Us
Databend is a cutting-edge, open-source cloud-native warehouse built with Rust, designed to handle massive-scale analytics.
Join the Databend Community to try, get help, and contribute!