Metrics vs Statistics

September 4th, 2013



Here are  the tuning metrics tables (SQL  stats are not in “metric” tables per say)

(*DBA_HIST_…_HISTORY views are sort of confusing. AFAI remember they were storing alert history, but apparently they are used for adaptive thresholds – an area for future investigation)

MH900182646

I’ve noticed a number of people posting queries using DBA_HIST_SYSSTAT instead of DBA_HIST_SYSMETRIC_SUMMARY which leads me to believe that there is some confusion or lack of information on the metric tables.

Oracle 10g introduced metric tables which compute deltas and rates of statistics thus hugely simplifying the ability to answer simple questions like “what is the I/O rate on my databases right now.” This question, before 10g, was surprisingly tedious to answer. To answer the question one would have to query v$sysstat  for example:

Select value from v$sysstat where name=’physical reads’;

but querying v$sysstat just once fails to answer the question but instead answers the question “How much I/O has been done since the database was started”. To answer the original question one would have to query v$sysstat twice and take the delta between the two values:

  • Take value at time A
  • Take value at time B
  • Delta = (B-A)
  • and/or get Rate = (B-A)/elapsed time

Getting these deltas and rates could be a pesky task especially working with a customer over the phone. Then 10g Oracle introduced metric tables which answer the questions in one single query using

V$SYSMETRIC

such as

Select  VALUE , METRIC_UNIT,INTSIZE_CSEC
from v$sysmetric
where metric_name='Physical Reads Per Sec';
VALUE METRIC_UNIT           INTSIZE_CSEC
---------- ----------------- ------------
654.6736 Reads Per Second          5959
134.9835 Reads Per Second          1515

Notice that the query returns 2 rows. The first row is the the last minute (ie 59.59 seconds) and the second row is the last 15 seconds (ie 15.15 seconds). Oracle collects both the deltas and rates for 60 second and 15 second intervals.

Oracle has the average, maximum, minimum for the values for the last hour in

V$SYSMETRIC_SUMMARY

that one can query like:

select MAXVAL,MINVAL,AVERAGE,STANDARD_DEVIATION
from V$SYSMETRIC_SUMMARY
where metric_name='Physical Reads Per Sec';

MAXVAL     MINVAL    AVERAGE      STANDARD_DEVIATION
---------- ---------- ---------- ------------------
3.71784232          0 .076930034         .478529283

Also for the last hour Oracle stores the 60 second intervals and for the last 3 minutes the 15 second intervals in

V$SYSMETRIC_HISTORY

 

Then for the last week by default, Oracle saves the values for each hour including the maximum, minimum, average, stddev etc in

 

DBA_HIST_SYSMETRIC_SUMMARY

 

Issues

One issue with using

  • V$SYSMETRIC – last 15 and 60 seconds
  • V$SYSMETRIC_SUMMARY – values  last hour (last snapshot)  like avg, max, min etc
  • V$SYSMETRIC_HISTORY – last hour for 1 minute, last 3 mintes for 15 second deltas
  • DBA_HIST_SYSMETRIC_SUMMARY – hour summaries for last week.

is becoming familiar with the statistics names which are different from v$sysstat. We can look at

V$METRICNAME

For the group_names (statistic definitions)

For trending data over multiple days, the view DBA_HIST_SYSMETRIC_SUMMARY can be used.  The view has a history of all theSystem Metrics Long Duration statistics.  If you store multiple databases in the same AWR repository you can check the statistics available to a particular DBID with the view DBA_HIST_METRIC_NAME.

 

 

Easy query

 

The view DBA_HIST_SYSMETRIC_SUMMARY  can be queried easily for trending metrics, such as the simple query for bytes read by the database per second:

select   dbid, to_char( begin_time ,'YYYY/MM/DD HH24:MI'),
         round(average)
from     dba_hist_sysmetric_summary
where    metric_name= 'Physical Read Total Bytes Per Sec' /* and DBID=[dbid if share repository] */
order by begin_time;

Tougher Query

 

Compare the above query  to the same query on DBA_HIST_SYSSTAT (note there are a lot of stats in v$sysstat)

with stats as (
           select sn.dbid,
                  st.stat_name,
                  to_char(cast(begin_interval_time as date ), 'YYYY/MM/DD HH24:MI') btime,
                  -- Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid ORDER BY sn.snap_id)
                  Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid,st.startup_time ORDER BY sn.snap_id)
                               value_beg,
                  st.value     value_end,
                  (cast(end_interval_time as date) - cast(begin_interval_time as date )) * (24*60*60) delta
           from
                  DBA_HIST_SYSSTAT  st,
                  DBA_HIST_SNAPSHOT sn
           where
                    sn.snap_id=st.snap_id and
                    sn.dbid=st.dbid and
                    (st.stat_name= 'physical read total bytes')
           order by begin_interval_time
     )
   select
          dbid, btime,
          round((value_end-value_beg)/delta) rate_per_sec
   from stats
   where (value_end-value_beg) > 0
;

Its a bit disconcerting to note that the above two queries don’t return the exact same data on my laptop. If it was roughly the same that would be fine, and in general the stats are the similar but there are cases where they differ dramatically.  I don’t see anything obvious in the way the queries are written.  Possibly has to do with database bounces or the way the database is affected by the laptop’s sleep and hibernate modes. Will have to look into this farther.

One trick to make the data easy to load into Excel is to use the html output format and spool to a file with an “.html” extension

SET markup HTML on
spool output.html

 

Other info

 

List of DBA_HIST views

https://sites.google.com/site/oraclemonitor/awr-views

 

Power Struggle Between a Man and a Woman


Oracle, performance
, , ,

  1. Trackbacks

  1. Comments

  2. September 5th, 2013 at 05:57 | #1

    Hi Kyle,

    add startup_time to the partition by clause. That will fix the wrong data problem around db bounces.
    Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid ORDER BY sn.snap_id)
    becomes
    Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid,st.startup_time ORDER BY sn.snap_id)

  3. khailey
    September 5th, 2013 at 14:14 | #2

    @Ronald. Thanks! good info. One of those nagging questions was how to protect output from spurious values due to db bounces

  4. Andrea
    September 25th, 2013 at 15:43 | #3

    Hi Mr. Hailey
    (sorry for my english)

    I noted 2 things.

    =1=
    At line 6, the startup_time belong to DBA_HIST_SNAPSHOT, so it shoul be:

    Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid, sn.startup_time ORDER BY sn.snap_id)

    =2=
    In RAC environment, the query could be confusing because there are 2 (or more) lines, one for each instance. So I modified the statement as follow (I checked it on single instance db and two RAC node);

    with stats as (
    select si.instance_name, — changed instance number with instance name
    st.stat_name,
    to_char(cast(begin_interval_time as date ), ‘YYYY/MM/DD HH24:MI’) btime,
    — Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid ORDER BY sn.snap_id)
    Lag (st.value) OVER( PARTITION BY st.stat_name,st.dbid,sn.startup_time ORDER BY sn.snap_id) value_beg, st.value value_end,
    (cast(end_interval_time as date) – cast(begin_interval_time as date )) * (24*60*60) delta
    from
    DBA_HIST_SYSSTAT st,
    DBA_HIST_SNAPSHOT sn,
    DBA_HIST_DATABASE_INSTANCE si — this is for instance name
    where
    sn.snap_id=st.snap_id and
    sn.dbid=st.dbid and
    (st.stat_name= ‘physical read total bytes’) and
    sn.dbid=si.dbid and sn.instance_number=si.instance_number
    order by begin_interval_time
    )
    select
    instance_name, btime,
    round((value_end-value_beg)/delta) rate_per_sec
    from stats
    where (value_end-value_beg) > 0
    /


four − = 2