Teradata Interview Question

 

Teradata Interview Question : SET 1

 

1) Can we compress a column which is already present in table using ALTER ?

Answer : No , A new table structure needs to be created which includes compression values and data needs to be inserted in the column. ALTER can only add new columns with compression values to table.

2) What does Pseudo table mean ?
Answer: Pseudo tables are the dummy tables, Whenever we place a request which involves full table scan and needs data to be retrieved from all AMPS then the parsing engine identifies a gatekeeper AMP which can command all other AMP to lock for a particular request from the user and does not allow multiple user to access the same table to prevent global deadlock.

3) What does ” Pseudo Table” Locks mean ?
Answer : Psuedo table lock is a false lock applied on table to prevent two user from applying locks with all amp request.

4) What is residual condition?
Answer: Residual condition means a filter is applied to a particular table to limit the number of rows to be retrieved into the spool.

Ex : sel * from employee where emp_num=100. This will limit the number of rows which will be retrived from table employee and number of rows stored in spool temporarily will be limited.

5) What do you mean by Spool “ Last use”
Answer: The particular spool file used in the step will be used for last time and the spool space will be released since it is no longer used in further steps of the query execution.

6) Why is BTET transaction processing overhead in Teradata.
Answer: 

  • BTET makes all the queries running between BT and ET as single transaction . If any of query fails then the entire set of queries will not be committed
  • BTET makes all the queries running between BT and ET as single transaction . If any of query fails then the entire set of queries will not be committed.
  • BTET also has an overhead with locking , since it holds locks on tables till the ET is occured or all the queries have executed successfully
  • DDL statements cannot be used everywhere in BTET processing , but these statements can be given towards the end of BTET transaction.
  • Using large number of BTET caused transient Journal to grow and has tendancy for System Restart.

7) What are the options not available for global temporary table ?
Answer :

  • Global temporary tables are the temporary tables and occupies the temporary space defined.
  • Temporary tables definition is stored in Data Dictionary, It act as template and provide separate instance to each user who refers to them.
  • The following options are not available for global temporary tables:
  • Any kind of constraints like check/referential cannot be applied on table
  • Identity columns since data in GTT are materialized only for session
  • Permanent Journaling cannot be done as data in tables are instances only to user in that session
  • PPI cannot be applied as data does not get stored in PERM , only TEMP space is utilized here.

8) What are the options not available for Volatile tables ?
Answer: Volatile tables occupies the spool space in Teradata, They exist for a particular session and are dropped off.

The following options are not available for volatile tables because table definition is not stored in data dictionary

  • Default values for columns
  • Title clause for columns
  • Named Indexes for table
  • Compression on columns/table level since table data are spooled
  • Stats cannot be collected since data is materialized only for session
  • Identity columns as these again would need entry in IDcol tables
  • PPI cannot be done on tables
  • Any kind of constraints like check/referential cannot be applied on table

9) What are permanent journals in Teradata ?
Answer:

  • Journals are used to capture information about table in Teradata. In case of Permanent journals they capture details of Journal enabled tables in teradata with all the pre transaction and post transaction details .
  • Journal tables are assigned PERM space and they reside in same database as of parent or they can reside on different database.
  • They are mainly used for protection of data and sometimes also for disaster recovery ( fallback is better in this case ) Permanent journal tables can be enabled or disabled by running alter database <databasename> ‘no journal’ /’ journal = <databasename.jrnltbl>’
  • Arcmain utility provides the feature of backing up Journal tables
  • We can find details about all journal tables present in teradata database using DBC.JOURNALS table.

10) How to find the skew factor of the table.
Answers:

SELECT
TABLENAME
,SUM(CURRENTPERM) /1024/1024 AS CURRENTPERM,
(100 (AVG(CURRENTPERM)/MAX(CURRENTPERM)*100)) AS SKEWFACTOR
FROM
DBC.TABLESIZE
WHERE DATABASENAME= <DATABASENAME>
AND
TABLENAME =<TABLENAME>
GROUP BY 1;




11) What is acceptable range of skew factor in the table?
Answer:
There is no particular range for skew factor. In case of production systems, it is suggested to keep skew factor between 5-10.
There are various considerations for skew factor

  • Number of AMPS
  • Size of row in a table
  • number of records in a table
  • PI of a table
  • Frequent access of table (Performance consideration)
  • Whether table getting loaded daily /monthly or how frequently data is being refreshed.

12) What is MultiValues compression in Teradata ?
Answer:
Multivalued compression or just MVC is a compression technique applied on columns in Teradata . MVC has a unique feature of compressing up-to 255 distinct values per column in a given table.
The advantage of compression are :

  • Reduced Storage cost by storing more of a logical data than physical data.
  • Performance is greatly improves due to reduced retrieval of physical data for that column.
  • Tables having compression always have an advantage since optimizer considers reduced I/O as one of the factors for generating EXPLAIN plan.

13) What are various ways by which we can use zero to replace a null value for a given column ?
Answer:
By using Teradata SQL supported command as follows

  • ZEROIFNULL : Select Col1, ZEROIFNULL(Col2) from Table_name;
  • COALESCE :       Select Col1,COALESCE(Col2,0) from Table_name;
  • CASE OPERATOR : Select Case When Col2 Is Not Null Then Col2 Else 0 End From Table_Name;

It is always suggested to use ANSI standard while coding in Teradata , since any changes in Teradata version due to upgrade/patches installation will lead to Time for regression testing, rework of code.

14) What is a join index ? What are benefits of using join index ?
Answer:
It is a index that is maintained in a system .It maintains rows joined on two or more tables. Join index is useful for queries where the index structure contains all the columns referenced by one or more joins, thereby allowing the index to cover all or part of the query
Benefits if using join index is

  • To eliminate base table access.
  • Aggregate processing is eliminated by creating aggregate join index.
  • It reduces redistribution of data since data is materialized by JI.
  • Reduces complex join conditions by using covering queries.

15) Can a macro be called inside a macro ?
Answer:
The main purpose of run a set of repeated sql queries. Macro supports only DML queries .
Hence We cant call any-other macro or not even a procedure in a macro.

One trick to have closest functionality of this is to copy all the sql queries from macro2 to macro1 and add parameters if it is necessary as shown below.

Replace macro1( val int)
as
( sel * from employee_table where empid= :val );

replace macro2( dept_no int)
as
( sel * from employee_table where deptno= :dept_no );

so, to call a macro2 inside a macro1..it is not possible. Hence follow this approach

Replace macro1 (val int, dept_no int)
as
(
sel * from employee_table where empid= :val;
sel * from employee_table where deptno= :dept_no ;
);

16 ) How do you find the list of employees named “john” in an employee table without using Like operator??
Answer :
This question seems tricky.. but yes there is another way by which we can find names/patters without using like operator.
By using “BETWEEN” , we can find the list of employees named john…

Sel * from employee where name betwee ‘J’ and ‘K’;

But at times usage of between is tricky, if there are other employees starting with J, those employees will also be listed by this query.

17) How do you list all the objects available in given database?
Answers :

  • select * from dbc.tables where databasename=’<DATABASENAME>’;
  • By running a normal help command on that database as follows.
  • help database <DATABASENAME’;

18) What are different types of Spaces available in Teradata ?
Answers :
There are 3 types of Spaces available in teradata ,they are
1. Perm space

  • This is disk space used for storing user data rows in any tables located on the database.
  • Both Users & databases can be given perm space.
  • This Space is not pre-allocated , it is used up when the data rows are stored on disk.

2.Spool Space

  • It is a temporary workspace which is used for processing Rows for given SQL statements.
  • Spool space is assigned only to users . –
  • Once the SQL processing is complete the spool is freed and given to some other query.
  • Unused Perm space is automatically available for Spool .

3. TEMP space

  • It is allocated to any databases/users where Global temporary tables are created and data is stored in them.
  • Unused perm space is available for TEMP space

19) What is hash collision ?
Answers : This occurs when there is same hash value generated for two different Primary Index Values. It is a rare occurance and Has been taken care in future versions of TD.

20) What is RAID, What are the types of RAID?
Answers : Redundant Array of Inexpensive Disks (RAID) is a type of protection available in Teradata. RAID provides Data protection at the disk Drive level. It ensures data is available even when the disk drive had failed.

There are around 6 levels of RAID ( RAID0 to RAID5) . Teradata supports Two levels of RAID protection
RAID 1 Mirrored copy of data in other disk
RAID 5 Parity bit (XOR) based Data protection on each disk array.
One of the major overhead’s of RAID is Space consumption




21) How do you find the No of AMP ‘s in the teradata Database?
Answers :

  • You can do a SELECT HASHAMP()+1 to get the total number of Amps in the given teradata system.
  • Details about amps can also be checked in Configuration management on teradata PMON tool.

22) How do you see a DDL for an existing table in Teradata?
Answers :
By using show table command as follows
show table tablename ;
This will display DDL structure of table along with following details

  • Fallback
  • Before/after journal
  • set/multiset table type
  • Index(PI,SI,PPI)
  • Details about column datatype ,default,identity/primary -foreign key .

23) How to find duplicates in a table?
Answers: To find duplicates in the table , we can use group by function on those columns which are to be used and then listing them if their count is >1 .
Following sample query can be used to find duplicates in table having 3 columns

select col1, col2,col3, count(*) from table
group by col1, col2, col3
having count (*) > 1 ;

24) What are different types of journals in teradata?
Answers :
There are 3 different types of journals available in Teradata. They are
1. Transient Journal : This maintains current transaction history. Once the query is successful it deletes entries from its table . If the current query transaction fails, It rolls back data from its table.
2. Permanent Journal : This is defined when a table is created. It can store BEFORE or AFTER image of tables. DUAL copies can also be stored. Permanent Journal maintains Complete history of table.
3.Down AMP recovery Journal (DARJ):  This journal activates when the AMP which was supposed to process goes down. This journal will store all entries for AMP which went down. Once that AMP is up, DARJ copies all entries to that AMP and makes that AMP is sync with current environment.

25) What are the reasons for product joins ?
Answers :

  • Stale or no stats causing optimizer to use product join
  • Improper usage of aliases in the query.
  • Missing where clause ( or Cartesian product join 1=1 )
  • Non equality conditions like > ,< , between example ( date)
  • Few join conditions
  • When or conditions are used.
  • last but not the least product joins are not bad always!! sometimes PJ are better compared to other types of joins.

26) How to rename columns using views?
Answers :

  • Create a view which is a subset of the employee table as follows.
  • Renaming columns in views will help increase security of sensitive tables and hiding columns under alias names

Replace view EmployeeV( number,fullname,addr,phno,depnum,sal, expr) as
locking row for access

sel
EmpNo ,
Name ,
Address,
Phone ,
DeptNo ,
Salary ,
YrsExp
from employee;

When we run sel * form employeeV, Only alias columns are displayed .

27) What are Restrictions on Views in Teradata?
Answers :

  • An index cannot be Created on a view.
  • It cannot contain an ORDER BY clause.
  • All the derived columns and aggregate columns used in the view must have an AS clause (alias defined).
  • A view cannot be used to UPDATE if it contains the following :
  • Data from more than one table (JOIN VIEW)
  • The same column twice
  • Derived columns
  • A DISTINCT clause
  • A GROUP BY clause

28) How do you you implement Multi valued compression in an existing table?
Answers : MVC can be implemented on following conditions

  • A new column with multi valued compression can be added to an existing table, but cannot modify existing compressed column.
  • Create a new table with column which has MVC and do insert .. select from original table

Or
CREATE TABLE… as with column designated MVC.

29) List Built-in functions used in teradata ?
Answers :
The main functionality of built in functions is that they dont need any arguments or paramaters and can be used directly with select to return system values.
Some of them are
• SESSION: – Returns a number for the session for current user .
• TIME: – this function provides the current time based on a 24-hour day
• USER: – This one gives the user name of the current user.
• ACCOUNT: – display’s your Teradata Account information
• CURRENT_DATE: – Returns the current system date
• CURRENT_TIME: This function returns the current system time and current session ‘Time Zone’ displacement.
• CURRENT_TIMESTAMP: Returns the current system timestamp with TimeZone
• DATABASE: – It returns the name of the default database for the current user.
• DATE: – same as Current_DATE and is teradata built in .

30) How do you whether table is locked or not?
Answer :
Just run the following query on the table.

Lock Table DBNAME.TABLENAME write nowait
Select * from DBNAME.TABLENAME;

If this table is locked currently then , then the above statement would fail as Nowait will not wait for locks to be released on that table .

Other way, You can find it by using Show locks utility in Teradata Manager Tool.

You can use Teradata Manager -> Administrator -> Database Console Utility -> Show locks




31) What are advantages of compression on tables?
Answers :

  • They take less physical space then uncompressed columns hence reducing space cost
  • They improve system performance as less data will be retrieved per row fetched , more data is fetched per data block thus increasing data loading speed
  • They reduce overall I/O

32) How many error tables are there in fload and Mload and what is their significance/use?
Answers:
Fload uses 2 error tables
ET TABLE 1: where format of data is not correct.
ET TABLE 2: violations of UPI

It maintains only error field name, error code and data parcel only.
Mload also uses 2 error tables (ET and UV), 1 work table and 1 log table

1. ET TABLE Data error
MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.

2. UV TABLE UPI violations
MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task
Apart from error tables, it also has work and log tables

3. WORK TABLE WT
Mload loads the selected records in the work table

4. LOG TABLE
A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in mload job. This table will be useful in case you have a job abort or restart due to any reason.

34) How do you find out number of AMP’s in the Given system?
Answer

  • Running following query in queryman

Select HASHAMP () +1;

  • We can find out complete configuration details of nodes and amps in configuration screen of Performance monitor

35) What are the difference types of temporary tables in Teradata?
Answers:

  • Global temporary tables
  • Volatile temporary tables
  • Derived tables

Global Temporary tables (GTT) –

  • When they are created, its definition goes into Data Dictionary.
  • When materialized data goes in temp space.
  • That’s why, data is active up to the session ends, and definition will remain there up-to its not dropped using Drop table statement. If dropped from some other session then its should be Drop table all;
  • You can collect stats on GTT.
  • Defined with the CREATE GLOBAL TEMPORARY TABLE sql

Volatile Temporary tables (VTT) –

  • Local to a session (deleted automatically when the session terminates)
  • Table Definition is stored in System cache .A permanent table definition is stored in the DBC data dictionary database (DBC.Temptables) .
  • Data is stored in spool space.
  • That’s why; data and table definition both are active only up to session ends.
  • No collect stats for VTT.If you are using volatile table, you can not put the default values on column level (while creating table)
  • Created by the CREATE VOLATILE TABLE sql statement

Derived tables

  • Derived tables are local to an SQL query.
  • Not included in the DBC data dictionary database, the definition is kept in cache.
  • They are specified on a query level with an AS keyword in an sql statement.

36) List types of HASH functions used in Teradata?
Answer:
SELECT HASHAMP (HASHBUCKET (HASHROW ())) AS “AMP#”, COUNT (*) FROM GROUP BY 1 ORDER BY 2 DESC;
There are HASHROW, HASHBUCKET, HASHAMP and HASHBAKAMP.
The SQL hash functions are:
* HASHROW (column(s))
* HASHBUCKET (hashrow)
* HASHAMP (hashbucket)
* HASHBAKAMP (hashbucket)

Example:

SELECT
HASHROW (‘Teradata’) AS “Hash Value”
, HASHBUCKET (HASHROW (‘Teradata’)) AS “Bucket Num”
, HASHAMP (HASHBUCKET (HASHROW (‘Teradata’))) AS “AMP Num”
, HASHBAKAMP (HASHBUCKET (HASHROW (‘Teradata’))) AS “AMP Fallback Num” ;

This is really good, by looking into the result set of above written query you can easily find out the Data Distribution across all AMPs in your system and further you can easily identify un-even data distribution.

37) What are the advantages and dis-advantages of secondary Indexes?
Answer:
Advantages:

  • A secondary index might be created and dropped dynamically
  • A table may have up to 32 secondary indexes.
  • Secondary index can be created on any column. .Either Unique or Non-Unique
  • It is used as alternate path or Least frequently used cases. ex. defining SI on non indexed column can improve the performance, if it is used in join or filter condition of a given query.
  • Collecting Statistics on SI columns make sure Optimizer choses SI if it is better than doing Full Table Scans

Disadvantages

  • Since Sub tables are to be created, there is always an overhead for additional spaces.
  • They require additional I/Os to maintain their sub tables.
  • The Optimizer may, or may not, use a NUSI, depending on its selectivity.
  • If the base table is Fallback, the secondary index sub table is Fallback as well.
  • If statistics are not collected accordingly, then the optimizer would go for Full Table Scan.

38) Where does TD store transient journal?
Answer :
In perm space -> dbc.transientjournal
But that special table can grow over dbc’s perm limit until the whole system runs out of perm space.

39) How to select first N Records in Teradata?
Answers:
To select N records in Teradata you can use RANK function. Query syntax would be as follows:

SELECT BOOK_NAME, BOOK_COUNT, RANK(BOOK_COUNT) A FROM LIBRARY QUALIFY A <= 10;

40) How to view every column and the columns contained in indexes in Teradata?
Answers:
Following query describes each column in the Teradata RDBMS
SELECT * FROM DBC.TVFields;
Following query describes columns contained in indexes in the Teradata RDBMS
SELECT * FROM DBC.Indexes;




41) How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table?
Answers:
Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate.

42) What is a clique?
Answers : A clique is a set of Teradata nodes that share a common set of disk arrays which are connected in daisy chain network to each disk array controller.
Cliques provide data accessibility if a node fails for any reason, Proms are distributed across all nodes in the system. Large multiple node systems will have clique mechanisms associated with them.

43) What is the difference between MultiLoad & Fastload interns of Performance?
Answers:
If you want to load, empty table then you use the fastload, so it will very useful than the MultiLoad ,because fastload performs the loading of the data in 2phase and its no need a work table for loading the data .
So it is faster as well as it follows the below steps to load the data in the table

  • Phase1 It moves all the records to the entire AMP first without any hashing
  • Phase2 After giving end loading command, Amp will hashes the record and send it to the appropriate AMPS.

MultiLoad:
It does the loading in the 5 phases

  • Phase1 : It will get the import file and checks the script
  • Phase2 : It reads the record from the base table and store in the work table
  • Phase3 : In this acquisition phase it locks the table header
  • Phase4 : In the DML operation will done in the tables
  • Phase5 : In this table locks will be released and work tables will be dropped.

44) How does indexing improve query performance?
Answers:

  • Indexing is a way to physically reorganize the records to enable some frequently used queries to run faster.
  • The index can be used as a pointer to the large table. It helps to locate the required row quickly and then return it back to the user.

Or

  • The frequently used queries need not hit a large table for data. They can get what they want from the index itself. cover queries.
  • Index comes with the overhead of maintenance. Teradata maintains its index by itself. Each time an insert/update/delete is done on the table the indexes will also need to be updated and maintained.
  • Indexes cannot be accessed directly by users. Only the optimizer has access to the index.

45) What is error table? What is the use of error table?
Answers:
The Error Table contains information concerning:

  • Data conversion errors Constraint violations and other error conditions:
  • Contains rows which failed to be manipulated due to constraint violations or Translation error
  • Captures rows that contain duplicate Values for UPIs.
  • It logs errors & exceptions that occurs during the apply phase.
  • It logs errors that are occurs during the acquisition phase.

46) How to find out list of indexes in Teradata?
Answer :
IndexType        Description
P                            Nonpartitioned Primary
Q                            Partitioned Primary
S                             Secondary
J                             Join index
N                            Hash index
K                            Primary key
U                            Unique constraint
V                            Value ordered secondary
H                            Hash ordered ALL covering secondary
O                            Valued ordered ALL covering secondary
I                             Ordering column of a composite secondary index
M                           Multi column statistics
D                            Derived column partition statistics
1                             Field1 column of a join or hash index
2                             Field2 column of a join or hash index

SELECT
databasename, tablename, columnname, indextype, indexnumber, indexname
FROM
dbc.indices
ORDER BY
databasename, tablename, indexnumber;

47) When should the statistics be collected?
Answer :
Here are some excellent guidelines on when to collect statistics:

  • All Non-Unique indices
  • Non-index join columns
  • The Primary Index of small tables
  • Primary Index of a Join Index
  • Secondary Indices defined on any join index
  • Join index columns that frequently appear on any additional join index columns that frequently appear in WHERE search conditions
  • Columns that frequently appear in WHERE search conditions or in the WHERE clause of joins.

48) How to make sure Optimizer chooses NUSI over Full Table Scan (FTS)?
Answer:

  • A optimizer would prefer FTS over NUSI, when there are no Statistics defined on NUSI columns.
  • It would prefer scanning for full table instead of going for Subtables in NUSI since optimized does not have information about subtables of NUSI
  • It is always suggested to collect statistics whenever NUSI columns are defined on the table.
  • Verify whether index is being used by checking in Explain plan.

49) What are the basic rules that define how PI is defined in Teradata?
Answer:
The following rules govern how Primary Indexes in a Teradata Database must be defined as well as how they function:

  • One Primary Index per table.
  • A Primary Index value can be unique or non-unique.
  • The Primary Index value can be NULL.
  • The Primary Index value can be modified.
  • The Primary Index of a populated table cannot be modified.
  • A Primary Index has a limit of 64 columns.

50) What are the basic criteria to select Primary Index column for a given table?
Answer:
A thumb rule of ADV demographics is followed.
Access Demographics

  • Identify index candidates that maximize one-AMP operations.
  • Columns most frequently used for access (Value and Join).

Distribution Demographics

  • Identify index candidates that optimize parallel processing.
  • Columns that provide good distribution.

Volatility Demographics

  • Identify index candidates with low maintenance I/O.

51) What is explain in teradata?
Answer : The EXPLAIN facility is a teradata extension that provides you with an “ENGLISH” translation of the steps chosen by the optimizer to execute an SQL statement. It may be used on any valid teradata database with a preface called “EXPLAIN”.
The following is an example:-

  • EXPLAIN select last_name first_name FROM employees;
  • The EXPLAIN parses the SQL statement but does not execute it.
  • This provides the designer with an “execution strategy”.
  • The execution strategy provides what an optimizer does but not why it chooses them.
  • The EXPLAIN facility is used to analyze all joins and complex queries.

52) What are the different return codes(severity errors) in Teradata utilities?
Answer :
There are 3 basic return codes (severity errors) in teradata utilities.
0      Success
4      Warning
8       User error
12     System error
16     System error
Please note that apart from this there are separate error codes for each of the error returned from sql queries.

For more details on error codes and fixing them please refer to General reference manual available from teradata documentation section in the teradata site.

53) What are the different date formats available in Teradata system?
Answer:
There are two different date formats available , they are
The Teradata default format is:  YY/MM/DD
The ANSI display format is:         YYYY-MM-DD

54) How do you set default date setting in BTEQ?
Answer
There are two default date setting in BTEQ. They have to be set after logging on to the session
Set session dateform = ANSIDATE ; /*format is yyyy-mm-dd */
Set session dateform = integerdate ; /* format is yy/mm/dd -teradata date format */

55) What does DROP table command do?
Answer :

  • It deletes all data in table_name
  • Removes the definition from the data dictionary
  • Removes all explicit access rights on the table

56) Is Like comparison case-sensitive in Teradata?
Answer
LIKE operator is not case sensitive in Teradata session mode.
Consider the following example

Select F_name from employee where F_name like ‘%JO%’;

  • The following query will pick values matching with ‘JO’ and ‘jo’ as well, since Teradata is not case-sensitive
  • To overcome this problem, a new function called “CASESPECIFIC” is used in TERADATA as follows

Select F_name from employee where F_name (CASESPECIFIC) like ‘%JO%’;

57) How do you submit bteq script (batch mode)?
Answer
Start the BTEQ , by typing BTEQ
Enter the following command
.run file = BTEQScript.btq
                    OR

1. Bteq < BTEQScript.btq
BTEQScript.btq contains following
.logon 127.0.0.1/dbc, dbc;
sel top 10 * from dbc.tables;
.quit

58) What is the command in BTEQ to check for session settings ?
Answer
The BTEQ .SHOW CONTROL command displays BTEQ settings.
The .SHOW CONTROL command shows all BTEQ session parameters
Example
[SET] FOLDLINE = OFF ALL
[SET] FOOTING = NULL
[SET] FORMAT = OFF
[SET] FORMCHAR = OFF
[SET] HEADING = NULL
[SET] INDICDATA = OFF
[SET] NOTIFY = OFF
[SET] NULL = ?
[SET] OMIT = OFF ALL
[SET] PAGEBREAK = OFF ALL
[SET] PAGELENGTH = 55
[SET] QUIET = OFF
[SET] RECORDMODE = OFF
[SET] RETCANCEL = OFF
[SET] RETLIMIT = No Limit
[SET] RETRY = ON
[SET] RTITLE = NULL
[SET] SECURITY = NONE
[SET] SEPARATOR = two blanks
[SET] SESSION CHARSET = ASCII
[SET] SESSION SQLFLAG = NONE
[SET] SESSION TRANSACTION = BTET
[SET] SESSIONS = 1
[SET] SIDETITLES = OFF for the normal report.
[SET] SKIPDOUBLE = OFF ALL
[SET] SKIPLINE = OFF ALL
[SET] SUPPRESS = OFF ALL
[SET] TDP = l5442
[SET] TITLEDASHES = ON for the normal report.
And, it is ON for results of WITH clause number: 1 2 3 4 5 6 7 8 9.
[SET] UNDERLINE = OFF ALL
[SET] WIDTH = 75

59) What are benefits of Permanent Journal?
Answer :
The benefits of Permanent Journal are

  • Permits capture of before images for database rollback.
  • Permits capture of after images for database roll forward.
  • Permits archiving change images during table maintenance.
  • Reduces need for full-table backups.
  • Provides a means of recovering NO FALLBACK tables.
  • Requires additional disk space for change images.
  • Requires user intervention for archive and recovery activity

60) What are the benefits of fallback?
Answer :
The benefits of fallback are

  • Protects your data from hardware (disk) failure.
  • Protects your data from software (node) failure.
  • Automatically recovers with minimum recovery time, after repairs or fixes are complete




61) What’s the difference between TIMESTAMP (0) and TIMESTAMP (6)?
Answer :
TIMESTAMP (0) is CHAR (19) and TIMESTAMP (6) is CHAR (26)
Timestamp(0) is YYYY-MM-DDbHH:MI:SS
Timestamp(6) is YYYY-MM-DDbHH:MI:SS.ssssss ( milliseconds extra)

62) What is a Dirty-Read or Stale-Read Lock?
Answer :
This occurs when a access lock is applied on the table which is doing a update.
May produce erroneous results if performed during table maintenance resulting in Dirty Read or stale read , which might result in inconsistent result set.

63) Difference between BTEQ and Sql assistant (query man)?
Answer :
BTEQ : Basic Teradata Query utility

  • SQL front-end : Report writing and formatting features
  • Interactive and batch queries
  • Import/Export across all platforms
  • The default number of sessions, upon login, is 1.
  • Teradata Query Manager / Queryman / TeradataSQL Assistant

SQL front-end for ODBC compliant databases

  • Historical record of queries including:
  • Timings
  • Status
  • Row counts
  • Random sampling feature
  • Limit amount of data returned
  • Import/Export between database and PC
  • Export to EXCEL or ACCESS

64) How do you execute the given SQL statement repeatedly in BTEQ?
Answer:
Select top 1* from database.table1;
=n
Here “=n” is to run the previous sql statement, “n” number of times.

65) What are types of PARTITION PRIMARY INDEX (PPI) in Teradata?
Answer:
1. Partition by CASE
CREATE TABLE ORDER_Table
(
ORD_number integer NOT NULL,
customer_number integer NOT NULL,
order_date date ,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY case1 (
order_total < 10000 ,
order_total < 20000 ,
order_total < 30000,
NO CASE OR UNKNOWN ) ;

2. Partition by Range example using date range
CREATE TABLE ORDER_Table
(
ORD_number integer NOT NULL,
customer_number integer NOT NULL,
order_date date ,
order_total integer
)
PRIMARY INDEX (customer_number)
PARTITION BY range1 (
Order_date BETWEEN date ’2010-01-01′ AND date ’2010-12-01′
EACH interval ’1′ month ,
NO RANGE
OR UNKNOWN);

P.S:If we use NO RANGE or NO CASE then all values not in this range will be in a single partition.
If we specify UNKNOWN, then all null values will be placed in this partition

66) Can we define PARTITION PRIMARY INDEX (PPI) on a Primary Index column in Table? Explain Why?
Answer:
PPI cannot be defined on PI column in Table. Since PI is used to locate an AMP and store the data based on HASH value (ROW ID ) it cannot be used as PPI column.
In case of PPI , The data stored in AMP’s are Partitioned based on PPI column after they are row hashed (ROW KEY = ROW ID +PPI value )
P.S: If you want to create UPI on table, then PPI column can be added as part of PI .

67) How to skip 1st record while using Bteq IMPORT?
How to skip 1st record while using Bteq IMPORT?
Answer:
By using SKIP=1 ; , we can skip first record.
.import infile=<filename>, skip=1;

68) What is TENACITY? What is its default value?
Answer
TENACITY specifies the amount of time in hours, to retry to obtain a loader slot or to establish all requested sessions to logon. The default for Fast Load is “no tenacity”, meaning that it will not retry at all. If several FastLoad jobs are executed at the same time, we recommend setting the TENACITY to 4, meaning that the system will continue trying to logon for the number of sessions requested for up to four hours.

69) What does SLEEP function does in Fast load?
Answer

  • The SLEEP command specifies the amount minutes to wait before retrying to logon and establish all sessions.
  • Sleep command can be used with all load utilities not only fastload.
  • This situation can occur if all of the loader slots are used or if the number of requested sessions is not available. The default value is 6 minutes. If tenacity was set to 2 hours and sleep 10 minutes, Sleep command will try to logon for every 10 minutes up to 2 hours duration.

70) What is Cross Join?
Answer:
It is a Teradata specified Join, which is used as equivalent to product join.
There is no “On” clause in case of CROSS join

SELECT EMP.ename , DPT.Dname
FROM employee EMP
CROSS JOIN
Department DPT
WHERE
EMp.deptno = DPT.depto ;




71) How many types of Index are present in teradata?
Answer:
There are 5 different indices present in Teradata
1. Primary Index
a.Unique primary index
b. non Unique primary index
2. Secondary Index
a. Unique Secondary index
b. non Unique Secondary index
3. Partitioned Primary Index
a. Case partition (ex. age, salary…)
b. range partition ( ex. date)
4. Join index
a. Single table join index
b. multiple table join index
c. Sparse Join index ( constraint applied on join index in where clause)
5. Hash index

71) Difference between Stored Procedure and Macro?
Answer:
Stored Procedure:

  • It does not return rows to the user.
  • It has to use cursors to fetch multiple rows
  • It used Inout/Out to send values to user
  • It Contains comprehensive SPL
  • It is stored in DATABASE or USER PERM
  • A stored procedure also provides output/Input capabilities

Macros:

  • It returns set of rows to the user.
  • It is stored in DBC SPOOL space
  • A macro that allows only input values

72) What is a Sparse Index?
Answer:
Sparse Join Indexes are a type of Join Index which contains a WHERE clause that reduces the number of rows which would otherwise be included in the index. All types of join indexes, including single table, multitable, simple or aggregate can be sparse.

73) How to handle nulls in Teradata??? How many columns can be there in a table???
Answer

  • Use zeroifnull, nullifzero in select and NULL in insert directly.
  • 256 columns max per table.

74) How to find average sal with out using avg function????
Answer
Without using “avg” we can find the avg salary by using sum (sal)/count (sal);
sel sum(sal)/count(sal) as avgsal from tablename

75) What is difference B/w User and database in Teradata?
Answer:

  • User is a database with password but database cannot have password
  • Both can contain Tables , views and macros
  • Both users and databases may or may not hold privileges
  • Only users can login, establish a session with Teradata database and they can submit requests

76) How do you create materialized view in Teradata?
Answer:

  • There is no such thing as a “materialized view” in Teradata. The equivalent in Teradata would be a join index (or aggregate index) with a corresponding view put on top of it. The command to create one is “CREATE JOIN INDEX…(lots of options)”.
  • Join indices are maintained when the data is maintained. They cannot be accessed directly, but are invoked when the optimizer determines it is beneficial. Even though you cannot invoke a join index directly, you create a view that looks like the join index and in that way, it would be similar to having direct access. However, this approach does not guarantee that the join index will be used.

77) What are Differences between Teradata and ANSI Session modes in Teradata?
Answer:
TERADATA MODE
————-
1. Comparison is not Case sensitive
2. Create table are default to SET tables
3. Each transaction is committed implicitly
4. Supports all Teradata commands
5. It follows BTET (Begin and End Transaction) Mode

ANSI MODE
————-
1. Comparison is CASE sensitive
2. Create table are default to MULTISET tables
3. Each transaction has to be committed explicitly
4. Does not support all Teradata commands
5. It does not follow BTET Mode

78) What are the scenarios in which Full Table Scans occurs?
Answer :

  • The where clause in SELECT statement does not use either primary index or secondary index
  • SQL Statement which uses a partial value (like or not like), in the WHERE statement.
  • SQL Statement which does not contain where clause.
  • SQL statement using range in where clause. Ex. (col1 > 40 or col1 < =10000)

79) How to identify PPI columns?
Answer:
Select databasename , tablename , columnposition ,columnname from dbc.indices
where indextype =’Q’
order by 1 ,2,3 ;

80) How to skip the header row in the fastload script
Answer
RECORD 2; /* this skips first record in the source file */
DEFINE …




81) Explain types of re-distribution of data happening for joining of columns from two tables
Answer:
Case 1 P.I = P.I joins
Case 2 P.I = N.U.P.I joins
Case 3 N.U.P.I = N.U.P.I joins
Case1 there is no redistribution of data over amp’s. Since amp local joins happen as data are present in same AMP and need not be re-distributed. These types of joins on unique primary index are very fast.
Case2 data from second table will be re-distributed on all amps since joins are happening on PI vs. NUPI column. Ideal scenario is when small table is redistributed to be joined with large table records on same amp
case3 data from both the tables are redistributed on all AMPs. This is one of the longest processing queries , Care should be taken to see that stats are collected on these columns

82) Can you load multiple data files for same target table using Fastload?
Answer:
Yes, we can Load a table using multiple datafiles in Fastload.
Before giving “end loading” statement user can define file path and use insert sql for multiple source files and give “end loading” statement at the end

83) Why does varchar occupy 2 extra bytes?
Answer :
The two bytes are for the number of bytes for the binary length of the field.
It stores the exact no of characters stored in varchar

84) How many types of Skew exist?
Answer:
If you utilized unequally TD resources (CPU,AMP,IO,Disk and etc) this is called skew exists. Major are 3 types of skews (CPU skew, AMP/Data skew, IO Skew).
-Data skew?
When data is not distributed equally on all the AMPs.
-Cpu skew?
Who is taking/consuming more CPU called cpu skew.
-IO skew?
Who perform more IO Operation? Resulting in IO Skew

85) Why Fload doesn’t support multiset table?
Answer

  • Fload does not support Multiset table because of following reason.
  • Say, the fastload job fails. Till the fastload failed, some number of rows was sent to the AMP’s.
  • Now if you restart FLOAD, it would start loading record from the last checkpoint and some of the consecutive rows are sent for the second time. These will be caught as duplicate rows are found after sorting of data.
  • This restart logic is the reason that Fastload will not load duplicate rows into a MULTISET table. It assumes they are duplicates because of this logic. Fastload support Multiset table but does not support the duplicate rows. Multiset tables are tables that allow duplicate rows. When Fastload finds the duplicate rows it discards it. Fast Load can load data into multiset table but will not load the duplicate rows.

86) What is Global Temporary table? What is the use of this Global Temporary table?
Can we take collect stats on Derived Tables and Volatile tables and Temporary tables?
What is Global Temporary table?
What is the use of this Global Temporary table?
Answer

  • No for volatile and derived tables and yes for global tables.
  • Global tables are temp tables like volatile tables but unlike volatile tables, their definitions are retained in dd.
  • It is used whenever there is a need for a temporary table with same table definition for all users.

87) What is the default join strategy in Teradata???
Answer:
The Teradata Cost based optimizer will decide the join strategy based on the optimum path. The common strategies followed are from Merge, Hash & Nested Join
Three strategies followed by optimizer are:
1. Duplication of rows of one table to every amp
–> This one is opted by optimizer when the non-PI column is on a small table.
2. Redistribution of the non PI rows of the table to the amp containing the matching PI row amp.
–> This one is opted when the tables are large. The non PI column is in one table is redistributed to the amp containing the matching PI.
3. Redistribute both the rows of the table by hash values.
–> This is chosen when the join is on a column that is not the PI in either table. This will require the most spool space.

88) What do High confidence, Low confidence and No confidence mean in EXPLAIN plan?
Answer
Explain gives the execution strategy means what are the different steps that the query will go through.
HIGH CONFIDENCE: Statistics are collected.
LOW CONFIDENCE: Statistics are not collected. But the where condition is having the condition on indexed column. Then estimations can be based on sampling.
NO CONFIDENCE: Statistics are not collected and the condition is on non indexed column.

89) How to Skip or Get first and Last Record from Flat File through MultiLoad?
Answer

  • In .IMPORT command in Mload we have a option to give record no. from which processing should begin. i.e. ‘FROM m’ ‘m’ is a logical record number, as an integer, of the record in the identified data source where processing is to begin. You can mention ’m’ as 2 and processing will start from second record.
  • THRU k and FOR n are two options in the same Mload command, functions same towards the end of the processing.
  • Adding to the above, if from n”start record” and for n “stop record” are not mentioned, mload considers records from start till the end of the file.

90) Which is faster – MultiLoad delete or Delete command?
Answer
MultiLoad delete is faster then normal Delete command, since the deletion happens in data blocks of 64Kbytes, where as delete command deletes data row by row. Transient journal maintains entries only for Delete command since Teradata utilities doesn’t support Transient journal loading.




91) What is Teradata Virtual storage?
Answer

  • This concept is introduced in TD12. It does the following tasks
  • Maintains Information On Frequency Of Data Access
  • Tracks Data Storage Task On Physical Media
  • Migrating Frequently Used Data To Fast Disks And Less Frequently Used Data To Slower Disks.
  • Allocating Cyclinders From Storage To Individual Amps

92) How to start / stop a database in windows?
Answer

  • logon to CMD
  • check for state pdestate -d
  • run the following command to start “net start recond”
  • check for status pdestate -d
  • to STOP the database Trusted Parallel Application or TPA
  • tpareset -x comment
  • The -x option stops Teradata without stopping the OS.

93) What is a role?
Answer
A role is a set of access rights which can be assigned to the users. They indirectly help in performance by reducing the number of rows entered in DBC.accessrights

94) What is a profile?
Answer
A profile contains set of user parameters like accounts, default database, spool
space, and temporary space for a group of users
To assign the profile to a user, use the AS PROFILE modified in a CREATE USER or
MODIFY USER statement:
MODIFY USER username AS PROFILE=profilename ;
To remove a profile from a member but retain the profile itself:
MODIFY USER username AS PROFILE=NULL ;

95) Why are AMPs and PEs called as vprocs ?
Answer

  • AMPs and PEs are implemented as “virtual processors vprocs”.
  • They run under the control of PDE and their number is software configurable.
  • AMPs are associated with “virtual disks – vdisks” which are associated with logical units (LUNs) within a disk array

Answer2:

  • Vprocs:Virtual process From PE to AMP (This is the network root via MSP(message passing layer),The processing data will store in Disks(These are Physical disks),Each Amp have too many P.disks,to migrate these P.disks The Bynet Network maintains Virtual disks.These V.disks will responsible for data migration.hence they are called as Virtual Process(VPROCS).

96) What is residual condition in explain plan ?
Answer
It is a condition which help u to reduce the number used for join condition. Residual condition does not help in locating a row

97) How to check if given object is a database or user ?
Answer
To check whether the given object is user or database , we can use following query
sel * from dbc.databases where dbkind =’U’ or dbkind=’D’;

98) What are set tables and multiset tables in Teradata?Explain with an appropriate example?
Answers:
1) Set tables cannot accept duplicate at row level not Index or key level.
Example of rows for set table:
R1 c2 c3 c4 ..cn
1 2 3 4 … 9

Accepted
1 2 3 4 … 9
Duplicate is Rejected
2 1 2 4 … 9
3 2 4 4 … 9
4 3 4 4 … 9

2) Multi set Tables can accept
duplicate at row level not Index or key level.Exmaple of rows for multi set table:
R1 c2 c3 c4 .. cn
1 2 3 4 … 9
1 2 3 4 … 9
Duplicate is Accepted
2 1 2 4 …9
3 2 4 4 …9
3 2 4 4 …9
Duplicate is Accepted

99) In a table can we use primary key in one column and in another column both unique and not null constrains.if yes how?
Answers:
Yes, you can have a column for Primary key and have another column which will have no duplicates or null.e.g.A Salary Table will have employee ID as primary key.
The table also contains TAX-ID which can not be null or duplicate

100) How many codd’s rules are satisfied by teradata database?
Answers:
12 codd’s rules.