Partitioning

  • break large tables into pieces
  • 随着数据量的增加,以下工作会变得更加复杂(耗时很长)

    1. 对整个table的scan(从第一行一直查找到最后一行)
    2. 创建和重建Index
    3. 数据归档(data archival) 和数据删除
    4. 对表或index的统计
    5. Table relocation (移动到不同的表空间(tablespace)
    6. 数据备份

Partitioning Concepts

  • 将一个大的table拆分为两个或多个具有相同定义,但数据不重复的tables
  • 一旦table被partitioning,table就成为了一个虚拟的概念,但是用户依然可以对table进行操作就像正常的table那样
  • partitioning的一些特性

    1. partitions可以被存储到不同的表空间,和物理空间上
    2. partitions可以被不同的压缩方法进行压缩
    3. 对于某些partition Local Indexes可以被Drop
    4. 在某些partition上表的统计可以被冻结,而在其他的partition可以定时更新
    5. Individual partition 可以存储于memory或数据库的flash storage tier中。

Table Partitioning

  • horizontal partitioning
    • 根据行进行划分
  • vertical partitioning
    • 根据整列进行划分
  • partition key
    • 为表中的某一列,其数值可以指定对应某一个partition的某一行
    • 一般情况下,一个表单的partitin key包含某一列以及用于此列的partitioning function

Index Partitioning

  • global index: 保留原table内的Index
    • 使用时并不需要指名具体的partition key
  • local index: 取消原来table内的Index,而给每个Paritition新增一个Index

Partitioning Methods

Range partitioning
  • 面对有序变量
  • create demo
    CREATE TABLE $tbl (
        $fld_setting
    )
    PARTITION BY RANGE($fld)
    ( PARTITION $partition_name less than ($v1),
    )
    
  • create example

    CREATE TABLE sales
    (sale_id INT NOT NULL,
    cust_id INT NOT NULL ,
    store_id INT NOT NULL,
    sale_date DATE NOT NULL,
    amount DECIMAL(9,2))
    PARTITION BY RANGE (yearweek(sale_date))
    (
        PARTITION s1 VALUES less than (202002),
        PARTITION s2 VALUES less than (202003),
        PARTITION s3 VALUES less than (202004),
        PARTITION s4 VALUES less than (202005),
        PARTITION s5 VALUES less than (202006),
        PARTITION s999 VALUES less than (MAXVALUE)
    );
    
  • alter demo

    ALTER TABLE $tbl REORGANIZE PARTITION $partition_name INTO
    ( PARTITION $partition_name_1 less than ($v1),
      PARTITION $partition_name_2 less than ($v2),
      ...
      PARTITION $partition_name_n less than ($vn)
    );
    
  • alter example
    ALTER TABLE sales REORGANIZE PARTITION s999 INTO
        (
        PARTITION s6 VALUES less than (202007),
        PARTITION s7 VALUES less than (202008),
        PARTITION s999 VALUES less than (MAXVALUE)
        );
    
  • get data from table' specific partition
    SELECT $fld
    FROM $tbl PARTITION ($partition_name);
    
List partitioning
  • 面对离散无序变量,当某个field中的某个值在某个集合内时,将其分到该paritition内
  • create demo
    CREATE TABLE $tbl
    ($fld_setting)
    PARTITION BY LIST COLUMNS ($fld)
    (PARTITION $partition_key VALUES IN $sets ...);
    
  • create example
    CREATE TABLE sales
    (sale_id INT NOT NULL,
     cust_id INT NOT NULL ,
     store_id INT NOT NULL,
     sale_date DATE NOT NULL,
     geo_region_cd VARCHAR(6) NOT NULL,
     amount DECIMAL(9,2))
    partition by list columns (geo_region_cd)
    (partition NORTHAMERICA VALUES  IN ('US_NE', 'US_SE', 'US_MW',
        'US_NW', 'US_SW','CAN','MEX'),
    partition EUROPE VALUES IN ('EUR_E','EUR_W'),
    PARTITION ASIA VALUES IN ('CHN', 'JPN','IND'));
    
  • alter demo
    ALTER TABLE $tbl REORGANIZE PARTITION $partition_key INTO
    (PARTITION $new_partition_key VALUES IN $new_sets)
    
  • alter example
    ALTER TABLE sales REORGANIZE PARTITION ASIA INTO
    (PARTITION ASIA VALUES IN ('CHN','JPN','IND','KOR'));
    
Hash partitioning
  • 适用于大量离散数据的情形
  • SQL server会自动通过hash 函数将离散的数据均匀存储在先前设置的分区中

  • create demo

    CREATE TABLE $tbl
    ($fld_setting)
    PARTITION BY HASH ($fld)
    PARTITIONS $partitions_num
    (PARTITION $partition_key...);
    
  • create example
    CREATE TABLE sales
    (sale_id INT NOT NULL,
     cust_id INT NOT NULL ,
     store_id INT NOT NULL,
     sale_date DATE NOT NULL,
     amount DECIMAL(9,2)
    )
    PARTITION BY HASH ( cust_id )
    PARTITIONS 4
    (PARTITION H1,
        PARTITION H2,
        PARTITION H3,
        PARTITION H4
        );
    
Composite partition
  • subpartition
  • demo

    CREATE TABLE $tbl
    ($fld_setting)
    PARTITION BY ...
    SUBPARTITION BY ...
    (PARTITION $partition_key_1 ...
    (SUBPARTITION $subpartition_key_11 ...
    SUBPARTITION $subpartition_key_12 ...
    ...),
    PARTITION $partition_key_2 ...
    (SUBPARTITION $subpartition_key_21 ...
    SUBPARTITION $subpartition_key_22 ...
    ...),
    ...
    );
    
  • example

    CREATE TABLE sales
    (sale_id INT NOT NULL,
     cust_id INT NOT NULL ,
     store_id INT NOT NULL,
     sale_date DATE NOT NULL,
     amount DECIMAL(9,2)
    )
    PARTITION BY RANGE (yearweek(sale_date))
    SUBPARTITION BY HASH ( cust_id )
    (PARTITION s1 values less than (202002)
        (SUBPARTITION s1_h1,
        SUBPARTITION s1_h2,
        SUBPARTITION s1_h3,
        SUBPARTITION s1_h4),
    PARTITION s2 values less than (202003)
        (SUBPARTITION s2_h1,
        SUBPARTITION s2_h2,
        SUBPARTITION s2_h3,
        SUBPARTITION s2_h4),
    PARTITION s3 values less than (202004)
        (SUBPARTITION s3_h1,
        SUBPARTITION s3_h2,
        SUBPARTITION s3_h3,
        SUBPARTITION s3_h4),
    PARTITION s4 values less than (202005)
        (SUBPARTITION s4_h1,
        SUBPARTITION s4_h2,
        SUBPARTITION s4_h3,
        SUBPARTITION s4_h4),
    PARTITION s5 values less than (202006)
        (SUBPARTITION s5_h1,
        SUBPARTITION s5_h2,
        SUBPARTITION s5_h3,
        SUBPARTITION s5_h4),
    PARTITION s999 values less than (MAXVALUE)
        (SUBPARTITION s999_h1,
        SUBPARTITION s999_h2,
        SUBPARTITION s999_h3,
        SUBPARTITION s999_h4)
    );
    

Partitioning Benefits

  • partition pruning
    • 例如,当需要的数据集仅为所有数据的一部分时(如,仅需要今年的数据)可以使用partition根据需要进行分块。
  • partition-wise join
    • 使用partitioned table进行join操作时,SQL服务器会自动使用需要的部分partitions

results matching ""

    No results matching ""