CELLSRVSTAT in Exadata Cell Storage

Cellsrvstat is a very useful utility for getting cell-level statistics for all the logical components of a cell, such as memory, I/O, smart I/O, and flash cache.

Cellsrvstat is used to get quick cell-level statistics from a particular storage cell. Every cell includes the cellsrvstat tool, which gives brief information about cell-level statistics. It also helps you get information about offloading and storage indexes.



It gives brief statistics for the following groups:

io            Input/Output related stats
mem           Memory related stats
exec          Execution related stats
net           Network related stats
smartio       SmartIO related stats
flashcache    FlashCache related stats
ffi           FFI related stats
lio           LinuxBlockIO related stats

Simply running the utility from the command prompt, without any additional parameters or qualifiers, produces output for all groups. You can also restrict the output of cellsrvstat by using the -stat_group parameter to specify which group, or groups, you want to monitor.

Help from cellsrvstat

[root@cell01 ~]# cellsrvstat -h

Usage:
cellsrvstat [-stat_group=,,]
            [-stat=,,] [-interval=]
            [-count=] [-table] [-short] [-list]

stat                    A comma separated list of short strings representing
                         the stats.
                         Default is all. (unless - stat_group is specified.
                         The -list option displays all stats.
                         Example: -stat=io_nbiorr_hdd,io_nbiowr_hdd
stat_group              A comma separated list of short strings representing
                         groups of stats.
                         Default: all (unless -stat is specified).
                         The -list option displays all stat groups.
                         Some of the valid groups are: io, mem, exec, net, smartio, flashcache.
                         Example: -stat_group=io,mem
interval                At what interval the stats should be obtained and
                         printed (in seconds). Default is 1 second.
count                   How many times the stats should be printed.
                         Default is once.
list                    List all metric abbreviations and their descriptions.
                         All other options are ignored.
table                   Use a tabular format for output. This option will be
                         ignored if all metrics specified are not integer
                         based metrics.
short                   Use abbreviated metric name instead of
                         descriptive ones.
error_out               An output file to print error messages to, mostly for
                         debugging.
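
As a small illustration of how the options above combine, here is a minimal Python sketch that assembles a cellsrvstat command line. The function name build_cellsrvstat_command and its parameters are hypothetical helpers invented for this example. It only uses flags that appear in the help text above, and it does not run anything on a cell.

```python
def build_cellsrvstat_command(stat_groups=None, stats=None,
                              interval=None, count=None,
                              table=False, short=False):
    """Return an argv list for cellsrvstat.

    Hypothetical helper: it only assembles flags shown in the
    `cellsrvstat -h` output and does not validate the names.
    """
    argv = ["cellsrvstat"]
    if stat_groups:
        # Groups are passed as one comma separated value.
        argv.append("-stat_group=" + ",".join(stat_groups))
    if stats:
        argv.append("-stat=" + ",".join(stats))
    if interval is not None:
        argv.append(f"-interval={interval}")
    if count is not None:
        argv.append(f"-count={count}")
    if table:
        argv.append("-table")
    if short:
        argv.append("-short")
    return argv


command = build_cellsrvstat_command(stat_groups=["io", "mem"],
                                    interval=5, count=3)
print(" ".join(command))
```

The resulting list could be passed to subprocess.run on a cell, or joined into a string and quoted for use with dcli.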


In non-tabular mode, the output has three columns. The first column is the name of the metric, the second is the difference between the last and the current value (the delta), and the third is the absolute value.

In tabular mode, absolute values are printed as is, without a delta. The cellsrvstat -list command marks the statistics that are absolute values with an asterisk (*).

You can get the list of all statistics by executing the command below on any of the cells.

[root@cell01 ~]# cellsrvstat -list

Statistic Groups:
io                  Input/Output related stats
mem                 Memory related stats
exec                Execution related stats
net                 Network related stats
smartio             SmartIO related stats
flashcache          FlashCache related stats
ffi                 FFI related stats
lio                 LinuxBlockIO related stats

Statistics:
[ * - Absolute values. Indicates no delta computation in tabular format]

io_nbiorr_hdd       Number of hard disk block IO read requests
io_nbiowr_hdd       Number of hard disk block IO write requests
io_nbiorb_hdd       Hard disk block IO reads (KB)
io_nbiowb_hdd       Hard disk block IO writes (KB)
io_nbiorr_flash     Number of flash disk block IO read requests
io_nbiowr_flash     Number of flash disk block IO write requests
io_nbiorb_flash     Flash disk block IO reads (KB)
io_nbiowb_flash     Flash disk block IO writes (KB)
io_ndioerr          Number of disk IO errors
io_ltow             Number of latency threshold warnings during job
io_ltcw             Number of latency threshold warnings by checker
io_ltsiow           Number of latency threshold warnings for smart IO
io_ltrlw            Number of latency threshold warnings for redo log writes
io_bcrti            Current read block IO to be issued (KB) *
io_btrti            Total read block IO to be issued (KB)
**trimmed output
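
If you want to know programmatically which statistics are absolute values, the -list output can be parsed. The following is a minimal Python sketch, assuming the layout shown above: a short name containing an underscore, some spaces, a description, and an optional trailing asterisk. The function name parse_stat_list is made up for this example, and the sample text is copied from the listing above.

```python
import re

# A few lines copied from the `cellsrvstat -list` output shown above.
SAMPLE_LIST_OUTPUT = """\
io                     Input/Output related stats
[ * - Absolute values. Indicates no delta computation in tabular format]
io_ltrlw            Number of latency threshold warnings for redo log writes
io_bcrti            Current read block IO to be issued (KB) *
io_btrti            Total read block IO to be issued (KB)
"""

# Assumption: statistic names look like <group>_<abbreviation>,
# which is what separates them from the group names (io, mem, ...).
STAT_LINE = re.compile(r"^([a-z0-9]+_\w+)\s+(.+?)\s*$")


def parse_stat_list(text):
    """Map each short stat name to (description, is_absolute)."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if not match:
            continue  # group names, headings, and notes are skipped
        name, description = match.groups()
        is_absolute = description.endswith("*")
        description = description.rstrip("*").rstrip()
        stats[name] = (description, is_absolute)
    return stats


stats = parse_stat_list(SAMPLE_LIST_OUTPUT)
absolute_stats = [name for name, (_, is_abs) in stats.items() if is_abs]
print(absolute_stats)
```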

Get the cell statistics for all the stats:

[root@cell01 ~]# cellsrvstat
===Current Time===                                      Fri May  1 17:06:24 2015

== Input/Output related stats ==
Number of hard disk block IO read requests                      0       13240244
Number of hard disk block IO write requests                     0       44000367
Hard disk block IO reads (KB)                                   0      669825343
Hard disk block IO writes (KB)                                  0      419140324
Number of flash disk block IO read requests                     0      239351352
Number of flash disk block IO write requests                    0       43434642
Flash disk block IO reads (KB)                                  0     3231612691
Flash disk block IO writes (KB)                                 0      532717734
Number of disk IO errors                                        0              0
**trimmed output

== Memory related stats ==
SGA heap used - kgh statistics (KB)                             0         732801
SGA heap free - cellsrv statistics (KB)                         0          20598
OS memory allocated to SGA (KB)                                 0         802812
SGA heap used - cellsrv statistics - KB                         0         828221
OS memory allocated to PGA (KB)                                 0          51402
PGA heap used - cellsrv statistics (KB)                         0          41973
OS memory allocated to cellsrv (KB)                             0       19509944
Top 5 SGA consumers (KB)
          trimjob:trimcxt                                       0         300704
          hashtable:buckets                                     0         442963
          FlashCacheCtx                                         0          88092
          SUBHEAP Networ                                        0          76035
          Thread IO Lat Stats                                   0          46843
Top 5 SGA subheap consumers (KB)
          Network mem                                           0          96023
          Network heap chunk                                    0           7404
          oracle_fp_init_scan:fplibCtx                          0            613
          oracle_fp_init_scan:fplibmd                           0            481
          SageCacheInitScan : ctx                               0             45
**trimmed output

== Execution related stats ==
Incarnation number                                              0              8
Number of module version failures                               0              0
Number of threads working                                       0              2
Number of threads waiting for network                           0             19
Number of threads waiting for resource                          0              8
Number of threads waiting for a mutex                           0            106
**trimmed output

== Network related stats ==
Total bytes received from the network                           0    80048466392
Total bytes transmitted to the network                          0     9080552048
Total bytes retransmitted to the network                        0              0
Number of active sendports                                      0            138
Hwm of active sendports                                         0            183
Number of active remote open infos                              0            843
HWM of remote open infos                                        0           1066

== SmartIO related stats ==
Number of active smart IO sessions                              0              0
High water mark of smart IO sessions                            0              4
Number of completed smart IO sessions                           0             61
Smart IO offload efficiency (percentage)                        0              1
Size of IO avoided due to storage index (KB)                    0          47912
Current smart IO to be issued (KB)                              0              0
Total smart IO to be issued (KB)                                0       46062968
**trimmed output

== FlashCache related stats ==
Number of read hits                                             0      138920798
Read on flashcache hit(KB)                                      0     2215457168
Number of keep read hits                                        0           1088
Read on flashcache keep hit(KB)                                 0        1111936
Number of read misses                                           0          44687
Total IO size for read miss(KB)                                 0        2932184
Number of keep read misses                                      0              0
Total IO size for keep read miss(KB)                            0              0
Number of no cache reads                                        0       13987354
Total size for nocache read(KB)                                 0      201943520
**trimmed output

== FFI related stats ==
number of reads straddling consecutive FFI regions              0              0
number of writes straddling consecutive FFI regions             0              0
number of FFIWriteJob waited for pin                            0              0
number of FFIRemoveJob waited for pin                           0              0
number of FFIFlushJob waited for pin                            0              0
number of FFIJob waited for pin                                 0              0
number of regions initialized by FFI                            0              0
**trimmed output

== LinuxBlockIO related stats ==
number of IOs cancelled due to disk failure                     0              0
num IOs cancelled due to possible disk failure                  0              0
num IOs cancelled before reaching OS                            0              0

The output shows three columns. From left to right: the name of the statistic, the value for the current activity (the delta since the last sample), and the cumulative value since the cell came online.

You can also get cell statistics for a specific statistic group, as below.
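
To work with these numbers in a script, the three-column layout can be split into name, delta, and cumulative value. Below is a minimal Python sketch, assuming the non-tabular format shown above, where each statistic line ends with two integers and each group starts with a "== ... ==" header. The function name parse_cellsrvstat and the dictionary keys are choices made for this example, not part of the tool.

```python
import re

# Lines copied from the sample output above.
SAMPLE_OUTPUT = """\
== Input/Output related stats ==
Number of hard disk block IO read requests                      0       13240244
Hard disk block IO reads (KB)                                   0      669825343
== SmartIO related stats ==
Size of IO avoided due to storage index (KB)                    0          47912
"""

GROUP_HEADER = re.compile(r"^==\s*(?P<group>.+?)\s*==$")
# A statistic line: name, then the delta, then the cumulative value.
STAT_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<delta>-?\d+)\s+(?P<absolute>-?\d+)\s*$"
)


def parse_cellsrvstat(text):
    """Return a list of dicts with group, name, delta, and absolute."""
    results = []
    group = None
    for line in text.splitlines():
        header = GROUP_HEADER.match(line.strip())
        if header:
            group = header.group("group")
            continue
        match = STAT_LINE.match(line)
        if match:
            results.append({
                "group": group,
                "name": match.group("name"),
                "delta": int(match.group("delta")),
                "absolute": int(match.group("absolute")),
            })
    return results


rows = parse_cellsrvstat(SAMPLE_OUTPUT)
for row in rows:
    print(row["group"], "|", row["name"], "|", row["delta"], "|", row["absolute"])
```

Lines without two trailing integers, such as the "Top 5 SGA consumers (KB)" sub-heading, are simply skipped by this sketch.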

[root@cell01 ~]# cellsrvstat -stat_group=io
===Current Time===                                      Fri May  1 17:35:35 2015

== Input/Output related stats ==
Number of hard disk block IO read requests                      0       14231001
Number of hard disk block IO write requests                     0       33001606
Hard disk block IO reads (KB)                                   0      349825941
Hard disk block IO writes (KB)                                  0      319134408
Number of flash disk block IO read requests                     0      139324670
Number of flash disk block IO write requests                    0       43464625
**trimmed output

You can also use the dcli utility to collect the statistics from every cell at once, running it from one of your database servers. Make sure SSH connectivity is configured between the database node and the cells.

#dcli -g cellgroup -l root 'cellsrvstat -interval=5 -count=3' > cellstats.lst

Here:

- cellgroup is the file that contains the list of IPs for all storage cells.
- We used an interval of 5 seconds and a count of 3, which you can change as per your requirement.
- You can add -stat_group inside the quoted cellsrvstat command if you want stats for specific groups, e.g. io, smartio, mem, net.
- The output is saved to the file cellstats.lst.
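
Because dcli writes the output of all cells into one file, it can help to split it back out per cell. Here is a minimal Python sketch, assuming each line in cellstats.lst is prefixed with the cell name followed by a colon and a space, which is how dcli normally labels its output. The cell names cell01 and cell02 and the function name split_by_cell are hypothetical, and the sample lines are illustrative rather than real output.

```python
from collections import defaultdict

# Illustrative sample of what cellstats.lst might contain
# (assumed "<cell>: <line>" prefix; cell names are made up).
SAMPLE_DCLI_OUTPUT = """\
cell01: == Input/Output related stats ==
cell01: Number of disk IO errors                                        0              0
cell02: == Input/Output related stats ==
cell02: Number of disk IO errors                                        0              0
"""


def split_by_cell(text):
    """Group dcli output lines by the cell that produced them."""
    per_cell = defaultdict(list)
    for line in text.splitlines():
        cell, separator, rest = line.partition(": ")
        if not separator:
            continue  # line without the assumed "<cell>: " prefix
        per_cell[cell].append(rest)
    return dict(per_cell)


per_cell = split_by_cell(SAMPLE_DCLI_OUTPUT)
for cell, lines in per_cell.items():
    print(cell, "->", len(lines), "lines")
```

To use it on the real file, read cellstats.lst with open("cellstats.lst").read() and pass the text to split_by_cell.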
