运维联盟俱乐部

 找回密码
 立即注册
查看: 331|回复: 0

[日常管理] Gathering Statistics for the Cost Based Optimizer (Pre 10g) (Doc ID 114671.1)

[复制链接]
  • TA的每日心情
    开心
    2023-8-9 11:05
  • 发表于 2023-12-14 16:12:30 | 显示全部楼层 |阅读模式
    The cost-based optimization approach uses statistics to estimate
    the cost of each execution plan. You should gather statistics periodically
    for objects where the statistics become stale over time because of changing
    data volumes or changes in column values.
    For more information about when to gather new statistics, see
    Note:44961.1 Gathering statistics frequency and strategy guidelines

    There are 2 ways to collect statistics on schema objects:

    1. DBMS_STATS (Oracle 8.1.5 and higher)
    1.1 GATHER_DATABASE_STATS Statistics for all objects in a database
    1.2 GATHER_SCHEMA_STATS Statistics for all objects in a schema
    1.3 GATHER_TABLE_STATS Table, column, and index statistics
    1.4 GATHER_INDEX_STATS Index statistics

    2. ANALYZE or PL/SQL package with Recursive ANALYZE statement (old fashion)
    2.1 ANALYZE command
    2.2 DBMS_UTILITY
    2.3 DBMS_DDL

    There are new enhancements to dbms_stats but not to analyze.


    ****************************************
    1. DBMS_STATS package (8.1.5 and higher)
    ****************************************

    - Oracle Corporation strongly recommends that you use the DBMS_STATS package
    rather than ANALYZE to collect optimizer statistics.

    - The DBMS_STATS package can gather statistics on indexes, tables, columns,
    and partitions, as well as statistics on all schema objects in a schema
    or database.

    - It does not gather cluster statistics, but you can use DBMS_STATS to gather
    statistics on the individual tables instead of the whole cluster.

    - The statistics-gathering operations can run either serially or in parallel.
    Whenever possible, DBMS_STATS calls a parallel query to gather statistics
    with the specified degree of parallelism;
    otherwise, it calls a serial query or the ANALYZE statement.

    o For example ,With 8.1.x DBMS_STATS uses recursive ANALYZE to gather statistics on
    - All Index Statistics
    - All Histograms

    o For example ,With 9.2.x DBMS_STATS uses recursive ANALYZE to gather statistics only on
    * CLUSTER INDEX
    * DOMAIN INDEX
    * JOIN INDEX

    - By default DBMS_STATS will generate statistics for the table and not it's
    indexes (By default CASCADE => FALSE).

    - Global statistics Note:236935.1

    o For partitioned tables and indexes, DBMS_STATS can gather separate
    statistics for each partition as well as global statistics for the entire
    table or index.

    o Similary, for composite partitioning DBMS_STATS can gather separate
    statistics for subpartitions, partitions, and the entire table or index.

    o Depending on the SQL statement being optimized, the optimizer may choose
    to use either the partition (or subpartition) statistics or the global
    statistics. Unless the query predicate narrows the query to a single
    partition, the optimizer uses the global statistics.
    Because most queries are not likely to be this restrictive,
    it is most important to have accurate global statistics.

    o Therefore, actually gathering global statistics with the DBMS_STATS package
    is highly recommended, rather than calculating them with the ANALYZE statement.

    - It can collect statistics only on the "stale" objects when monitoring is
    enabled. See: Note:102334.1

    - In 9i and above, it can also do
    o Estimate statistics with automatically determined adequate sample size,
    o Create histograms only on the columns they would be useful.


    1.1 DBMS_STATS.GATHER_DATABASE_STATS
    ====================================
    Can gather statistics on all the tables and indexes in a database.

    The GATHER_DATABASE_STATS procedure - some parameters - :

    - estimate_percent
    o NULL means compute
    o The valid range is [0.000001,100].
    o DBMS_STATS.AUTO_SAMPLE_SIZE so Oracle determines the
    best sample size for good statistics (Recommended value 9i and above).

    - block_sample
    o Only pertinent when doing an estimate statistics.
    o Use or not random block sampling instead of random row sampling.
    o Random block sampling is more efficient,
    but if the data is not randomly distributed on disk,
    then the sample values may be somewhat correlated.

    - method_opt
    o DEFAULT (prior to 10g 'FOR ALL COLUMNS SIZE 1', starting with 10g 'FOR ALL COLUMNS SIZE AUTO')
    o FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
    . INDEXED can be used when predicate columns are all indexed
    . HIDDEN for columns in Function-Based Indexes
    . ALL when all columns appear in predicates
    o FOR COLUMNS [size clause] column|attribute [size_clause]
    [,column|attribute [size_clause]...],
    o size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}
    . integer Number of histogram buckets. Valid Range [1,254]
    . REPEAT Collects histograms only on the columns that
    already have histograms (9i and above).
    . AUTO Oracle determines the columns to collect histograms
    based on data distribution and the workload of the
    columns (9i and above).
    . SKEWONLY Oracle determines the columns to collect histograms
    based on the data distribution of the columns (9i and above).
    o Note:252361.1 DBMS_STATS always sets AVG_ROW_LEN=100 or left the old value
    - degree
    o Degree of parallelism.
    o NULL means use the table default value
    o DBMS_STATS.DEFAULT_DEGREE means value based on the init parameters (9i and above).

    - granularity
    o Only pertinent if the table is partitioned.
    o DEFAULT: Gather global- and partition-level statistics.
    o GLOBAL: Gather global statistics.
    o PARTITION: Gather partition-level statistics.
    o SUBPARTITION: Gather subpartition-level statistics.
    o ALL: Gather all (subpartition, partition, and global)stats.
    o Analyze table does not replace statistics
    o Global statistics - DBMS_STATS versus ANALYZE

    - cascade
    o Default (CASCADE => FALSE). No Index Gathering.
    o Gathers statistics on the indexes as well (TRUE).
    o Index statistics gathering is not parallelized prior to 9iR2
    o Equivalent to gather_index_stats procedure on each of the indexes in
    the database in addition to gathering table and column statistics.
    o Note:159374.1 DBMS_STATS.GATHER_SCHEMA_STATS don't analyze indexes

    - options
    o GATHER: Default. Gathers statistics on all objects.
    o GATHER AUTO: Gathers all necessary statistics automatically.
    Oracle implicitly determines which objects need new statistics,
    and determines how to gather those statistics (9i and above).
    o GATHER STALE: Gathers statistics on stale objects as determined by
    looking at the *_tab_modifications views.
    o GATHER EMPTY: Gathers statistics on objects which currently have no
    statistics.
    o Note:102334.1 How to Automate Change Based Statistic Gathering - Monitoring Table
    o Note:228186.1 Differences between GATHER STALE and GATHER AUTO

    - gather_sys
    o Gathers statistics on the objects owned by the 'SYS' user (TRUE)


    1.2 DBMS_STATS.GATHER SCHEMA_STATS
    ==================================
    Can gather statistics on all the tables and indexes in a schema.

    The GATHER SCHEMA_STATS procedure - some parameters - :
    o ownname => Schema to analyze (NULL means current schema).
    o See GATHER_DATABASE_STATS except GATHER_SYS


    1.3 DBMS_STATS.GATHER_TABLE_STATS
    =================================
    Can gather statistics on a table and its indexes.

    The GATHER_TABLE_STATS procedure - some parameters - :
    o ownname => Schema to analyze (NULL means current schema).
    o tabname => Name of table.
    o partname => Name of partition.
    o See DBMS_STATS.GATHER SCHEMA_STATS


    1.4 DBMS_STATS.GATHER_INDEX_STATS
    =================================
    Can gather statistics on an index.

    The GATHER_INDEX_STATS procedure - some parameters - :
    o ownname => Schema to analyze (NULL means current schema).
    o indname => Name of Index.
    o partname => Name of partition.
    o See DBMS_STATS.GATHER SCHEMA_STATS





    ***********
    2. ANALYZE
    ***********

    ===================
    2.1 ANALYZE Command
    ===================

    - Old fashion
    - Oracle Corporation strongly recommends that you use the DBMS_STATS package
    rather than ANALYZE to collect optimizer statistics.
    - The ANALYZE command gathers statistics in a serial manner.
    - It can be used with an ESTIMATE or COMPUTE statistics.
    - If you specify ANALYZE TABLE ESTIMATE STATISTICS without a sample size,
    Oracle will only gather statistics based on 1064 rows.
    When using the ANALYZE command, you should always specify a sample size so that
    the default does not take place.
    (i.e. analyze table test estimate statistics sample size 20 percent;)
    - The COMPUTE statistics option will guarantee that the optimizer has the best
    statistics available in order to determine an execution path for a given query.
    (i.e. analyze table test compute statistics;)

    - ANALYZE collects only partition level statistics and derives the entire
    table level statistics by aggregation.

    - By default the ANALYZE TABLE command will generate statistics for the table
    and all it's indexes provided that the FOR clause is not used.


    ========================
    2.2 DBMS_UTILITY Package
    ========================

    The DBMS_UTILITY package provides 2 procedures to assist in the gathering of
    statistics. It generates ANALYZE command for different objects.

    2.2.1. ANALYZE_SCHEMA
    =====================
    Gathers statistics on all the tables, clusters and indexes in a schema.

    The ANALYZE_SCHEMA procedure accepts 5 arguments:
    - schema => the schema to be analyzed
    - method => ESTIMATE, COMPUTE or DELETE.
    If ESTIMATE, then either
    . estimate_rows or
    . estimate_percent must be non-zero.
    - estimate_rows => Number of rows to estimate.
    - estimate_percent => Percentage of rows to estimate.
    If estimate_rows is specified,
    then this parameter is ignored.
    - method_opt => [ FOR TABLE ] [ FOR ALL [INDEXED] COLUMNS] [SIZE n]
    [ FOR ALL INDEXES ]

    For more information on the ANALYZE_SCHEMA procedure Note:67615.1
    For more information on the ANALYZE_SCHEMA procedure Note:1011835.102


    2.2.2. ANALYZE_DATABASE
    =======================

    Gathers statistics on all the tables, clusters and indexes in a database.

    The ANALYZE_DATABASE procedure accepts 4 arguments:
    - method => ESTIMATE, COMPUTE or DELETE.
    If ESTIMATE, then either
    . estimate_rows or
    . estimate_percent must be non-zero.
    - estimate_rows => Number of rows to estimate.
    - estimate_percent => Percentage of rows to estimate.
    If estimate_rows is specified,
    then this parameter is ignored.
    - method_opt => [ FOR TABLE ] [ FOR ALL [INDEXED] COLUMNS] [SIZE n]
    [ FOR ALL INDEXES ]

    For more information on the ANALYZE_DATABASE procedure Note:67616.1


    ====================
    2.3 DBMS_DDL Package
    ====================

    The DBMS_DDL package provides one procedure which gathers also statistics:

    2.3.1 ANALYZE_OBJECT
    ====================

    Equivalent to SQL

    "ANALYZE
    TABLE|CLUSTER|INDEX [.]
    [] STATISTICS
    [SAMPLE [ROWS|PERCENT]]"

    For more information on the ANALYZE_OBJECT procedure Note:77283.1



    RELATED DOCUMENTS
    -----------------

    回复

    使用道具 举报

    您需要登录后才可以回帖 登录 | 立即注册

    本版积分规则

    运维联盟俱乐部 ( 冀ICP备19036648号 )

    GMT+8, 2024-5-2 11:38 , Processed in 0.049206 second(s), 21 queries , Gzip On.

    Powered by Discuz! X3.4

    © 2001-2023 Discuz! Team.

    快速回复 返回顶部 返回列表