PROVE IT !!If you know it then PROVE IT !! Skill Proficiency Test

Identity Column

  • Identity columns are used mainly to ensure row uniqueness by taking a system-generated unique value. They are valuable for generating simple unique indexes and primary and surrogate keys when composite indexes or keys are not desired.
  • Identity columns are also useful for ensuring column uniqueness when merging several tables or to avoid significant preprocessing when loading and unloading tables.

You should not create an identity column for a table that you intend to use with Teradata Unity. Teradata Unity provides its own mechanism for generating deterministic unique values that do not present the problems that identity columns do for managing multiple Teradata Database instances

Note: Loading a row from a client system that duplicates an existing row in an identity column table is permitted because the assignment of the identity column to the row makes it unique. If this presents a problem, you must filter your data for duplicates before you load it into an identity column table.

The various parameters for identity columns are maintained by the DBC.IdCol system table.

Identity columns have the following properties.

  • Begin with the value 1 unless a different START WITH value is specified.
  • Increment by 1 unless a different INCREMENT BY value is specified.
  • Can decrement if a negative INCREMENT BY value is specified.
  • Can specify maximum values for incrementing or minimum values for decrementing a value series.
  • Values can be recycled.

Identity columns have the following rules and restrictions.

  • Cannot be used for tables that are managed by the Teradata Unity product. See Teradata Unity Installation and User Guidefor details.
  • Support the following bulk insert operations only.

o   Single statement INSERT requests through multiple concurrent sessions.

For example, parallel BTEQ imports into the same table.

  • Multistatement INSERT requests through multiple concurrent sessions.

For example, Teradata Parallel Data Pump inserts.

o   INSERT … SELECT requests.

o   Replicated tables only under the following conditions.

o   The column is specified as GENERATED BY DEFAULT on the destination (subscriber) side to enable the correct values to be inserted.

  • If replication is bidirectional, then the identity columns on both systems must be specified as GENERATED BY DEFAULT.
  • If replication is bidirectional, then the value ranges on the 2 systems must be disjoint (odd on one system and even on the other).

Note: Teradata Replication Services Using Oracle GoldenGate is not scheduled for future enhancements after Release 13.10, although it continues to be supported. Teradata Replication Services 13.10 can be used with Teradata 14.0; however, attempts to use Teradata 14.0 new features with replication will return an error. Teradata Access Module for Oracle GoldenGate (TAM) 13.10 is compatible with Teradata 14.0. No further enhancements of TAM is planned. 

  • If a triggered event is an INSERT into a table with an identity column, then the triggered action statement block or the WHEN clause of the trigger cannot reference the identity column.
  • GENERATED ALWAYS columns cannot be updated.
  • GENERATED ALWAYS column values that also specify NO CYCLE are always unique.
  • GENERATED BY DEFAULT columns generate a value only when the column is set null by an INSERT request in one of the following ways.

o   Explicit specification of NULL in the multivalue.

For example, INSERT INTO table VALUES (1,NULL);

o   Implicit specification of NULL by the omission of a value in the multivalue when no column name list is provided.

For example, INSERT INTO table VALUES (1,);

o   Omission of a column name from the column name list.

For example, INSERT INTO table (x) VALUES (1);

where table has more than a single column.

  • GENERATED BY DEFAULT columns can be updated.
  • GENERATED BY DEFAULT column values are not guaranteed to be unique.
  • You must comply with all of the following restrictions if you want to guarantee the uniqueness of GENERATED BY DEFAULT column values.
  • You must specify NO CYCLE.
  • Any user‑specified values you specify must be outside the range of any system‑generated values.
  • You must enforce the uniqueness of the user‑specified values yourself.
  • Identity columns cannot have a UDT, Geospatial, ARRAY, VARRAY, or Period data type.
  • GENERATED ALWAYS identity columns cannot be null.
  • The cardinality of a table with unique identity column values is limited by the maximum value for the identity column data type.
  • Because of this, you should specify a numeric type that ensures the largest possible number of unique values for the identity column can be generated.
  • The maximum numeric data type ranges are DECIMAL(18,0), NUMERIC(18,0), and exact NUMBER(18,0), or approximately 1 x 1018 rows. You cannot use approximate NUMBER columns for identity columns.
  • This is true even when the DBS Control parameter MaxDecimal is set to 38 (see SQL Data Types and Literals and Utilities: Volume 1 (A-K)). You can define an identity column with more than 18 digits of precision, or even as a large fixed NUMBER or BIGINT type, without the CREATE TABLE or ALTER TABLE request aborting, or even returning a warning message, but the values generated by the identity column feature remain limited to the DECIMAL(18,0) type and size.
  • Inserts into GENERATED BY DEFAULT identity columns using Teradata Parallel Data Pump cannot reuse Teradata Parallel Data Pump field variables in another parameter of the same insert operation.
  • The following examples are based on this table definition.

CREATE MULTISET TABLE test01 (

a1 INTEGER GENERATED BY DEFAULT AS IDENTITY

(START WITH 1

INCREMENT BY 20

MAXVALUE 1000

),

a2 INTEGER);

The following simple INSERT requests fail because they reuse Teradata Parallel Data Pump field variables.

.LAYOUT layoutname;

.FIELD uc1 INTEGER;

.FIELD uc2 INTEGER;

.DML LABEL labelname;

INSERT INTO test01 VALUES (:uc1, :uc1);

.LAYOUT layoutname;

.FIELD uc1 INTEGER;

.FIELD uc2 INTEGER;

.DML LABEL labelname;

INSERT INTO test01 VALUES (:uc2, (:uc2 + :uc2));

The following simple INSERT requests succeed because they do not reuse Teradata Parallel Data Pump field variables.

.LAYOUT layoutname;

.FIELD uc1 INTEGER;

.FIELD uc2 INTEGER;

.DML LABEL labelname;

INSERT INTO test01 VALUES (:uc1, :uc2);

.LAYOUT layoutname;

.FIELD uc1 INTEGER;

.FIELD uc2 INTEGER;

.DML LABEL labelname;

INSERT INTO test01 VALUES (:uc2, (:uc1 + :uc1));

o   INSERT operations into GENERATED BY DEFAULT identity columns using Teradata Parallel Data Pump field variables cannot specify field variables if the value inserted into the identity column is derived from an expression that includes a Teradata Parallel Data Pump field variable.

o   INSERT operations into both GENERATED BY DEFAULT and GENERATED ALWAYS identity columns using Teradata Parallel Data Pump field variables cannot insert into the identity column of more than 1 identity column table.

o   You cannot specify an identity column for an unpartitioned NoPI table, but you can specify an identity column for a column‑partitioned table.

The GENERATED … AS IDENTITY keywords introduce a clause that specifies the following.

o   The column is an identity column.

o   Teradata Database generates values for the column when a new row is inserted into the table except in certain cases described later.

IF you specify GENERATED … THEN the system …  
ALWAYS Always generates a unique value for the column when Teradata Database inserts a new row into the table and NO CYCLE is specified.  
BY DEFAULT Generates a unique value for the column when Teradata Database inserts a new row into the table only if the INSERT request does not specify a value for the column.  
     
IF you are using the identity column for this purpose … THEN you should specify this option …  
 
To ensure a UPI, USI, PK, or some other row uniqueness property ALWAYS … NO CYCLE  
• To load data into or unload data from a table
• To copy rows from one table to another table
• To fill in gaps in the sequence
• To reuse numbers that once belonged to now-deleted rows
BY DEFAULT.  

TroubleShooting

1)      Failure 7545 Numbering for Identity Column TEST_Table_HP.Row_I is over its limit. 

Solution :

  • The error persist when the value being inserted is higher than the limit set (this is because of DBS Control setting (IdCol Batch Size).Identity columns will be generated and this general DBS control setting controls how large the number you could insert. There is an upper limit , In this case you can drop / Recreate the table and insert the data on the table.
  • Incase drop and recreate does not work , then verify if the table structure has “ALWAYS” for the Identity column, If YES then change it to BY DEFAULT”

 2)      Failure 5753 Numbering for Identity Column %VSTR is over its limit

Solution :  Identity values are assigned by PEs or AMPs based on the type of query:
– INSERT VALUES: by the session's PE
– INSERT SELECT: by the source AMP
– MLoad/FastLoad: by the AMP which receives the data in Acquisition phase

Each AMP/PE requests a batch of values from dbc.idcol:
– Whenever you logoff and logon again you're probably assigned to a different PE requesting a new batch of values for INSERT VALUES.
– When you submit an INSERT SELECT each AMP will request a batch of values

By default the range of values for a batch is 100,000, thus the first PE already reaches the maximum of 32767 and your second session hits the limit. And no, that default can't be set on a table level, only on system level (this is IMHO quite stupid).

You might use following query to check for the current max value(if you got appropriate rights on dbc):

SELECT
dbase.DatabaseName,
tvm.TVMName AS TableName,
tvfields.FieldName AS ColumnName,
tvfields.FieldType AS ColumnType,
tvfields.Nullable,
tvfields.CommentString,
tvfields.TotalDigits(FORMAT 'Z9') AS DecimalTotalDigits,
tvfields.ImpliedPoint(FORMAT 'Z9') AS DecimalFractionalDigits,
tvfields.IdColType,
idcol.AvailValue,
idcol.StartValue,
idcol.MinValue,
idcol.MaxValue,
idcol.Increment,
idcol.Cyc
FROM DBC.TVM
JOIN DBC.tvfields
ON tvm.tvmid = tvfields.tableid
JOIN DBC.Dbase
ON tvm.DatabaseId = dbase.DatabaseId
JOIN dbc.idcol
ON idcol.TableId = tvm.tvmid
AND IdColType IS NOT NULL;

Never use SMALLINT for an identity, even INT might be bad:
For a 2000 AMP system the first INS SEL will reserve 2,000*100,000 = 200,000,000. Those values are lost uring a restart, thus after 10 restart you might already run out of values.

3)    Duplicate Unique Prime key error in Identity column Table

Solution : Check the max(identity col) value against the avail_value using the following query:

SELECT
dbase.DatabaseName,
tvm.TVMName AS TableName,
tvfields.FieldName AS ColumnName,
tvfields.FieldType AS ColumnType,
tvfields.Nullable,
tvfields.CommentString,
tvfields.TotalDigits(FORMAT 'Z9') AS DecimalTotalDigits,
tvfields.ImpliedPoint(FORMAT 'Z9') AS DecimalFractionalDigits,
tvfields.IdColType,
idcol.AvailValue,
idcol.StartValue,
idcol.MinValue,
idcol.MaxValue,
idcol.Increment,
idcol.Cyc
FROM DBC.TVM
JOIN DBC.tvfields
ON tvm.tvmid = tvfields.tableid
JOIN DBC.Dbase
ON tvm.DatabaseId = dbase.DatabaseId
JOIN dbc.idcol
ON idcol.TableId = tvm.tvmid
AND IdColType IS NOT NULL
and dbase.databasename = 'dbname'
and tvm.tvmname = 'tblname'
order by 1,2,3,4;

Drop the table and then re-create (preserving the data) without adjusting the starts with in the table definition, the avail_value in the dictionary will start over at 1 setting the stage for a potential dupe. The reason I say 'potential', is because there are gaps in the sequencing and one can get lucky for a period of time.

4)     Illegal usage of Identity Column %VSTR.

Solution : User attempted to define an invalid identity column or use an identity column incorrectly. The error is returned if: 1) an identity column is defined as a) part of a composite index b) a join index or hash index c) a primary partition index d) a value-ordered index. 2) the input parameter of an INSERT into identity column is a using field (e.g., :F1) which is part of an expression, e.g. :F1+:F2 or :F1+2. 3) an identity column is defined in a temporary or volatile table. It may only be defined in a permanent table. 4) a USING statement contains multiple INSERT statements that insert into different identity column tables. 5) the input parameter of an INSERT into identity column of type BY DEFAULT is a using field (e.g., :F1) that is being reused in another parameter in the insert statement, e.g. USING(F1 INT, F2 INT) INS tab(:F1,:F1);

Remedy: Modify the statement accordingly and re-submit the request.

Add a Comment

Your email address will not be published. Required fields are marked *