
Performance Data Collection & Important DBC Tables

DBQL – Database Query Logging

What is it?

DBQL captures important information about queries that run on your system.  With DBQL, you can find out everything from who uses the most CPU and when, to which step in a particular query was skewed and how much CPU each step used, and much more.  This information is critical for understanding what is going on with your system, and it becomes even more important in upgrade situations.

The tables include:
• DBC.DBQLogTbl (default table, core performance data of the query)
• DBC.DBQLSqlTbl (full SQL text of the query)
• DBC.DBQLObjTbl (objects accessed by the query)
• DBC.DBQLStepTbl (step processing performance by the query)
• DBC.DBQLExplainTbl (explain text of the query)
• DBC.DBQLSummaryTbl (summary construct typically for tactical queries)
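
As a quick illustration of the kind of answers DBQL can provide, a query along these lines pulls the heaviest CPU consumers out of the detail table (a sketch only; adjust the date window for your site):

-- Top 10 users by total AMP CPU over the last 7 days of detail logging
SELECT TOP 10
       UserName
     , SUM(AMPCPUTime)   AS TotalAMPCPU
     , SUM(TotalIOCount) AS TotalIO
     , COUNT(*)          AS QueryCnt
FROM DBC.DBQLogTbl
WHERE CAST(StartTime AS DATE) >= CURRENT_DATE - 7
GROUP BY UserName
ORDER BY 2 DESC;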

How should you collect as a normal course of business?

The following is the recommended practice from the Performance and Workload Management COE:

Logging is best accomplished by one logging statement for each accountstring on the system. If DBQL logging is currently active, but is not done according to these guidelines, end all current logging before running the recommended statements.
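
The rules currently in effect can be reviewed, and an old rule ended, before the recommended statements are issued. A minimal sketch, using the same sample accountstring as the statements below:

-- Review the DBQL rules currently in effect
SELECT * FROM DBC.DBQLRules;

-- End an existing account-level rule before re-issuing the recommended one
END QUERY LOGGING ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';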

Example for all regular work (detail logging statement):

BEGIN QUERY LOGGING WITH SQL, OBJECTS LIMIT SQLTEXT=0 ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';

For tactical (sub second, known work only) V2R5:

BEGIN QUERY LOGGING LIMIT SUMMARY=1, 3, 5 ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';

For tactical V2R6 and up:

BEGIN QUERY LOGGING LIMIT SUMMARY=10, 50, 100 CPUTIME ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';

Another good option for the Tactical work would be to use Threshold logging in place of Summary logging above.  Threshold logging works like Summary logging, except if a defined threshold is met the query is logged in the DBQLogTbl (detail) table instead of the summary table.  The PS Performance and Workload Management COE is beginning to recommend threshold logging in place of summary logging for Tactical work, since it catches a little more information for longer queries.

(The temptation may be to use Threshold logging for everything, but this is not recommended.  When using threshold logging, you can never log to the other tables.  Therefore, you cannot capture full SQL, the explain or object data.  Only use threshold for Tactical work.)
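
A threshold rule for the tactical account might look like the following (a sketch; the threshold value is an assumption and should reflect your site's definition of tactical work, with the syntax verified for your release):

-- Queries that meet the threshold are logged in DBQLogTbl (detail);
-- anything below it is only counted in the summary table
BEGIN QUERY LOGGING LIMIT THRESHOLD=100 CPUTIME ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';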

If a mature DBQL maintenance process does not exist, a standard process can be provided upon request or can be downloaded from the toolbox on Teradata.com.  (More information at the bottom of this blog)

How should you collect prior to and following an upgrade?

With upgrades, DBQL data is even more important.  By capturing good detail data before and after the upgrade, you will better understand the reason behind any unanticipated changes in performance.

So, if you were not following the logging recommendations above before, you really need to start doing that for the upgrade.  Of particular importance is that you log in Detail mode, and capture Step, Explain, SQL and Object data.

An example of a logging statement to capture everything:

BEGIN QUERY LOGGING WITH SQL, STEPINFO, EXPLAIN, OBJECTS LIMIT SQLTEXT=0 ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';

Detail data should be accumulated for all non-tactical workloads for AT LEAST two weeks; four weeks is preferred.  It is important to collect a large enough sample of this detail data to represent a typical period for your business.

When working with a tactical workload that would typically be logged in the summary table, ensure that a representative sample is present in the detail logging tables. AT LEAST one day of tactical workload should be available in the detail tables.  This means you should temporarily change your regular logging method for tactical work in order to capture all the detail to use for comparison after the upgrade.
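
One way to do that, sketched against the same sample accountstring used earlier, is to end the normal tactical rule and replace it with a detail rule for the capture window:

-- Temporarily end the summary/threshold rule for the tactical account ...
END QUERY LOGGING ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';

-- ... log it in detail for at least one representative day ...
BEGIN QUERY LOGGING WITH SQL, STEPINFO, EXPLAIN, OBJECTS LIMIT SQLTEXT=0 ON ALL ACCOUNT = '$M1$WXYZ&S&D&H';

-- ... and remember to switch back to the normal tactical logging afterwards.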

Resusage

What is it?

Resusage data is critical for understanding hardware performance characteristics. The tables involved include:
• DBC.ResUsageSpma (vproc quantity, CPU, I/O, memory, processes, BYNET)
• DBC.ResUsageScpu (CPU per node)
• DBC.ResUsageSvpr (vproc data - AMPs and PEs)
• DBC.ResUsageSldv (logical device I/O data)
• DBC.ResUsageShst (host traffic data - channel and LAN)
• DBC.ResUsageSpdsk (disk-level data)
• DBC.ResUsageIvpr (Internal GCC troubleshooting data)
• DBC.ResUsageIpma (Internal GCC troubleshooting data)
• DBC.ResUsageSawt (AMP Worker Task data)
• DBC.ResUsageSps (schmon data)

Detailed descriptions of each of these tables can be found in the Teradata Database Resource Usage Macros and Tables Manual (B035-1099-083A).
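
As a simple example of putting the SPMA data to work, a query like the following gives a rough node CPU utilization per logging interval (a sketch; confirm the CPU column names against the Resource Usage manual for your release):

-- Approximate CPU busy % by node and logging interval
SELECT TheDate
     , TheTime
     , NodeID
     , 100 * (CPUUServ + CPUUExec)
       / NULLIFZERO(CPUUServ + CPUUExec + CPUIdle + CPUIoWait) AS CPUBusyPct
FROM DBC.ResUsageSpma
WHERE TheDate >= CURRENT_DATE - 7
ORDER BY TheDate, TheTime, NodeID;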

How should you collect as a normal course of business?

System resource data is collected by the database, but collection is not on by default. The xctl utility (ctl on Windows and Linux), using the RSS screen, can be used to turn on collection selectively and to set the rate of collection. This data is collected in the DBC.ResUsage tables. Note the RSS Active Row Filter Mode Enable setting, which limits the output to only active components; it is strongly recommended that this filter always be used with the SAWT and SPS tables (AWT and SPS information is available starting in TD12). Collection and logging rates should be set to no more than 600 seconds. A typical implementation has an RSS Collection Rate of 60 seconds and Node/Vproc Logging Rates of 600 seconds.

Make sure the following collections are enabled. Please note the Active Row Filter Mode for SAWT and SPS. Bolded items apply only to Teradata 12.0. DO NOT use summary mode for SPS and SAWT.
tapout1:~ # ctl
> sc rss
(0) RSS Collection Rate: 60 sec
(1) Node Logging Rate : 600 sec (2) Vproc Logging Rate: 600 sec
RSS Table Logging Enable
(3) SPMA : On (4) IPMA : Off (5) SCPU: On 
(6) SVPR : On (7) IVPR : Off (8) SLDV: On (9) SHST: On
(A) SPDSK: On (B) SVDSK: On (C) SAWT: On 
(D) SPS : On 

RSS Summary Mode Enable
Summarize SPMA: Off Summarize IPMA : Off (E) Summarize SCPU : Off
(F) Summarize SVPR: Off (G) Summarize IVPR : Off (H) Summarize SLDV : Off
(I) Summarize SHST: Off (J) Summarize SPDSK: Off (K) Summarize SVDSK: Off
(L) Summarize SAWT: Off (M) Summarize SPS : Off
RSS Active Row Filter Mode Enable
Active SPMA: Off Active IPMA : Off Active SCPU : Off
Active SVPR: Off Active IVPR : Off Active SLDV : Off
Active SHST: Off Active SPDSK: On Active SVDSK: On 
Active SAWT: On Active SPS: On

How should you collect prior to and following an upgrade?

The above recommendation should be in effect for a minimum of 4 weeks prior to an upgrade, and immediately activated on the new system. If this collection will not provide a representative set of data, consider collecting for a longer period of time.

For upgrade purposes, consider turning on ResUsageIpma and ResUsageIvpr as well.  After the upgrade is successful, they can be turned off as a normal course of action.

Accounting (DBC.Acctg or AmpUsage)

What is it?

Accounting data enables a view of database query I/O and CPU at a userid/accountstring level. This data provides the ability to measure "backlog" (or pent-up demand) prior to the change, and allows analysis of how much that backlog has been relieved after the upgrade.

While this data is also available in DBQL, AmpUsage data tends to be more accurate as DBQL does not capture all CPU and I/O for queries that are aborted or end in errors, and for some of the work done by utilities.

How should you collect as a normal course of business?

The collection of this data does not require the activation of a process because it is always on. However, it is collected based upon the accountstring expansion (ASE) variables found within the accountstring. Best practice for the ASE variables is:

Sample: account = $M1$MSIR&S&D&H

Performance Group: $M1$   substr(accountstring,2,2) = prfgrp
Workgroup: MSIR           substr(accountstring,5,4) = wrkgrp

Here "MSI" is the application or other identifier and "R" is for Reporting; other designations may be B (Batch), T (Tactical), O (Online), and A (Ad-hoc). The ASE variables &S, &D and &H expand to the session, date, and hour. The best practice is to collect at the hour level, which the accountstring format above accomplishes with &H. This data is used in conjunction with DBQL to provide a clear picture of system query performance and is extremely valuable for capacity planning analysis.
 

How should you collect prior to and following an upgrade?

No special logging beyond that described above is needed during upgrades.

PSF Usage Data  (Schmon data)

What is it?

PSF Usage data, or Schmon data, reports information about system usage as seen from the vantage point of the Priority Scheduler Facility.  Whether you use PSF or TASM, under the covers PSF is in charge of connecting queries to resources.

PSF usage data shows CPU and I/O by Allocation Group or TASM Workload.  It also shows important information related to AMP Worker Task (AWT) usage, including things like service time and queue length.

If you are on Teradata 12 or higher, then this information is collected in the ResUsageSPS table described in the Resource Usage section above.  For versions prior to TD 12, the following information applies.

How should you collect as a normal course of business?

Teradata Manager contains a function that provides a mechanism for capturing PSF usage data. The procedure below details how to activate this collection within Teradata Manager.

 

Screen shots provided reflect Teradata Manager Version 7.1

Start Teradata Manager and select Administer and then Teradata Manager to access the setup.

Click on the Data Collection Tab.

 

Priority Scheduler Data Collection has three configuration entries:

  1. Priority Scheduler Configuration – Collects priority scheduler configuration definitions at the specified collection time.
  2. Priority Scheduler Node Performance – Collects priority scheduler node level monitor output at the specified collection time.
  3. Priority Scheduler System Performance – Collects priority scheduler system level monitor output at the specified collection time.

Click on Priority Scheduler Configuration, then click Configure.

  

Click on Schedule. The times should be set to match any known customer configuration changes. For example, if the customer switches Priority Scheduler settings at 8:00 a.m. and 8:00 p.m., set the collection times to match those changes.


Click on Priority Scheduler Node Performance, then click Configure.

The next entry is for setting up node level collection. This example shows the retention set to 2 years and 10-minute collection intervals. The intent is to time collection as close as possible to Resusage collection.

 

Click on Schedule

 

Click on Priority Scheduler System Performance, then click Configure.

The next entry is for setting up system level collection. This example shows the retention set to 2 years and 10-minute collection intervals. The intent is to time collection as close as possible to Resusage collection.

 

Click on Schedule

How should you collect prior to and following an upgrade?

Same as above.

PSF/TASM Configuration

What is it?

Teradata Active System Management (TASM) is an advanced workload management system on Teradata that helps manage and control work on the system.  Prior to TASM, the same functions were handled by Priority Scheduler Facility (PSF).

How should you collect as a normal course of business?

I would recommend periodically saving the information documented in the "How should you collect prior to and following an upgrade?" section below, simply as a matter of disaster recovery.

How should you collect prior to and following an upgrade?

Ensure that the PSF/TASM configuration information is captured prior to the upgrade to facilitate comparisons. Executing a "schmon -d" command on any Teradata node will provide PSF configuration information. The Teradata Manager collection procedure for PSF data will also capture PSF configurations.

TASM rulesets can be saved as flat files through the Teradata Dynamic Workload Manager (TDWM) application. Load the active ruleset, select File and then Save, and save the .dqm file to an accessible location. It will need to be opened using the same client version from which it was saved.

AMP Worker Task Data

What is it?

AMP Worker Task data is critical for identifying system concurrency saturation. Large adjustments in system configurations may cause the overall system workload to be processed differently. Maintaining consistent records of AWT usage data is necessary to validate upgrades.

How should you collect as a normal course of business?

Prior to TD12, AWT collection is done by executing PUMA commands on a system node and populating a table with the output. If no AWT collection process is available, scripts can be obtained through the Performance and Workload Management COE. Starting with Teradata 12, AWT data is collected in the resusage tables (DBC.ResUsageSawt).
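
For the pre-TD12 approach, the target table is site-defined; the following is a minimal sketch of one (all names here are hypothetical) that the PUMA output could be loaded into:

-- Hypothetical history table for AWT samples collected from PUMA output
CREATE TABLE PERF_HIST.AWT_History
( Sample_TS   TIMESTAMP(0)
, Node_Id     INTEGER
, Amp_No      INTEGER
, AWTs_In_Use INTEGER
)
PRIMARY INDEX (Sample_TS, Node_Id, Amp_No);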

How should you collect prior to and following an upgrade?

Same as above.

Canary Data

Customers often have canary query processes that monitor the response times of critical workloads. Maintaining pre-upgrade canary data for comparison with post-upgrade canary data is an easy way to provide quick performance comparisons.
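
If the canary queries run under a dedicated userid, their response times can also be pulled straight from DBQL for the before/after comparison (a sketch; CANARY_USER is a hypothetical name):

-- Canary query response times from the DBQL detail table
SELECT StartTime
     , FirstRespTime
     , (FirstRespTime - StartTime) DAY(4) TO SECOND AS RespTime
     , AMPCPUTime
     , TotalIOCount
FROM DBC.DBQLogTbl
WHERE UserName = 'CANARY_USER'
ORDER BY StartTime;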

Data Structures

Capturing a list of the data structures present before and after an upgrade is important. Capture the contents of the following tables prior to the upgrade:

DBC.DBASE
DBC.TVM
DBC.Indexes
DBC.TVFields
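
One simple way to keep the pre-upgrade copy is to snapshot these tables into the historical database (a sketch; PERF_HIST is a hypothetical database name):

-- Pre-upgrade snapshots of the dictionary tables listed above
CREATE TABLE PERF_HIST.Dbase_Pre    AS (SELECT * FROM DBC.Dbase)    WITH DATA;
CREATE TABLE PERF_HIST.TVM_Pre      AS (SELECT * FROM DBC.TVM)      WITH DATA;
CREATE TABLE PERF_HIST.Indexes_Pre  AS (SELECT * FROM DBC.Indexes)  WITH DATA;
CREATE TABLE PERF_HIST.TVFields_Pre AS (SELECT * FROM DBC.TVFields) WITH DATA;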

Performance Data Maintenance

Ensure that all of the performance data mentioned above is maintained in a separate database outside of DBC; filling up the DBC database has many unfortunate consequences. Move the performance data to a historical database and purge the appropriate DBC tables. DBQL maintenance procedures can be obtained from the toolbox on Teradata.com.
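
If a starting point is needed before the standard process is obtained, the basic pattern is an INSERT/SELECT into the history database followed by a purge of the copied rows (a sketch only; DBQL_HIST and its table are hypothetical names, and the 30-day retention is an assumption):

-- Copy detail rows older than 30 days to the history database
INSERT INTO DBQL_HIST.DBQLogTbl_Hst
SELECT * FROM DBC.DBQLogTbl
WHERE CAST(CollectTimeStamp AS DATE) < CURRENT_DATE - 30;

-- Then purge the copied rows from DBC
DELETE FROM DBC.DBQLogTbl
WHERE CAST(CollectTimeStamp AS DATE) < CURRENT_DATE - 30;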

It may be necessary to archive and restore pre-upgrade performance data onto the new machine. Ensure the archives are taken before the old machine is taken offline.
