Monday, 3 August 2015

Top Teradata Interview Questions and Answers (2 of 7)

Top Teradata Interview Questions and Answers
#Top Teradata Interview Questions and Answers:
The teradata interview questions of second set.
  • What is the usage of Fallback?
A) When a table is fallback protected the rows are duplicated on other AMPs. Fallback is similar to mirroring, but different. The similarities is that both provide a duplicate copy, but the difference is that Fallback places copies of its rows on multiple AMPs so if a failure occurs Teradata can use the parallelism to help the failed AMP.

Teradata+Interview +Questions-Part-1
  • How many primary keys we will apply on a table?
A) Only one
  • How does rows store in Teradata?
A) It spreads the rows across all AMPs
  • Tell me the types of Indexes?
A) PI - Primary index
NoPI-No primary index
UPI-Unique primary index
NUPI-Non unique primary index
USI-Unique secondary index
NUSI-Non unique secondary index
  • Difference between UPI and NUPI and the situation where exactly? We will use these?
Teradata indexes
  • The role of SI in teradata and explain the types?
Teradata Secondary indexes
  • How do we create a join and hash indexes and explain the scenerios? When we will use exactly?

A) There are four basic types of joins that Teradata can perform depending on the characteristics of the table definition. When the join domain is the primary index (PI) column, with a unique secondary index (USI) the join is referred to as a nested join and involves, at most, three AMPs. The second type of join is a merge join, with three different forms of a merge join, based on the request.

  • The newest type of join in Teradata is the Row Hash join using the pre-sorted Row Hash value instead of a sorted data value match. This is beneficial since the data row is stored based on the row hash value and not the data value. The last type is the product join.
  • In Teradata, each AMP performs all join processing in parallel locally. This means that matching values in the join columns must be on the same AMP to be matched. When the rows are not distributed and stored on the same AMP, they must be temporarily moved to the same AMP, in spool. Remember, rows are distributed on the value in the PI column(s). If joins are performed on the PI of both tables, no row movement is necessary. This is because the rows with the same PI value are on the same AMP - easy, but not always practical. Most joins use a primary key, which might be the UPI and a foreign key, which is probably not the PI.
  • Regardless of the join type, in a parallel environment, the movement of at least one row is normally required. This movement puts all matching rows together on the same AMP. The movement is usually required due to the user's choice of a PI. Remember, it is the PI data value that is used for hashing and row distribution to an AMP. Therefore, since the joined columns are mostly columns other than the PI, rows need to be redistributed to another AMP. The redistributed rows will be temporarily stored in spool space and used from there for the join processing.

  • What is the meaning of Transient journal overhead?
A) Transient Journal – Each AMP has a distinct Transient Journal for database integrity. If an AMP has a row that is about to change (Insert, Update, Delete), it will take a before picture and store it in its Transient Journal. If the transaction fails, then the AMP can Rollback the before image. If, however, the transaction is successful, the Transient Journal throws away the image! The Transient Journal lives for the life of a transaction.

Permanent Journal – All AMPs copy any changes to the database and store them in the Permanent Journal permanently. The DBA or Table Creator must request a Permanent Journal. Each day, the DBA usually does a "Check Point with Save", and that days Permanent Journal is stored permanently on tape or disk off of the Teradata system
  • What is skewness in teradata?
A) Skewness is the statistical term, which refers to the row distribution on AMPs. If the data is highly skewed, it means some AMPs are having more rows and some very less i.e. data is not properly/evenly distributed. This affects the performance/Teradata's parallelism. The data distribution or skewness can be controlled by choosing indexes.
  • What is a subtable and when it requires?
A) As soon as the DBA uses the SQL to create a secondary index Teradata immediately gets to work. Teradata must build the secondary index Subtable immediately before it can become an alternate path to the data. Each AMP Hashes the secondary index value for each row they own with the Hash Formula. The result is a 32-bit Row Hash which points to a bucket in the Hash Map, which tells the secondary index row which AMPs Subtable it will be on. All UNIQUE Secondary Indexes are hashed and the value plus the real Row-ID of the base table are sent to the proper AMP over the BYNET. 

No comments:

Post a Comment