Teradata has the following advantages over other RDBMSs:
- Parallel processing
- Shared nothing architecture
- Super fast data retrieval
- TASM (Teradata Active System Management) controls workload traffic efficiently
- Teradata Viewpoint gives DBAs and users their own customizable view of the system
- Excellent data processing options: joins, secondary indexes, hash indexes, and partitioned tables for range queries
2. Draw a diagram of the Teradata warehouse.
Refer to the diagram.
3. What are the differences between a star schema and a snowflake schema?
- In a star schema every dimension will have a primary key.
- In a star schema, a dimension table will not have any parent table.
- In a snowflake schema, by contrast, a dimension table will have one or more parent tables.
- In a star schema, the hierarchies for a dimension are stored in the dimension table itself.
- In a snowflake schema, the hierarchies are broken out into separate tables. These hierarchies help to drill down the data from the topmost level to the lowest level.
A snowflake schema is similar to a star schema. "A schema is called a snowflake if one or more dimension tables do not join directly to the fact table but must join through other dimension tables."
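The difference shows up in the number of joins a query needs. The following is a minimal sketch in Python using the standard `sqlite3` module rather than Teradata itself; the table and column names (`star_product`, `sf_product`, `sf_category`, `sales_fact`) are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the category hierarchy lives inside the product dimension itself.
CREATE TABLE star_product (product_id INTEGER PRIMARY KEY,
                           product_name TEXT, category_name TEXT);
-- Snowflake: the category is split out into a parent table of the dimension.
CREATE TABLE sf_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE sf_product (product_id INTEGER PRIMARY KEY, product_name TEXT,
                         category_id INTEGER REFERENCES sf_category(category_id));
-- One fact table shared by both variants (hypothetical sample data).
CREATE TABLE sales_fact (product_id INTEGER, amount REAL);

INSERT INTO star_product VALUES (1, 'Pen', 'Stationery'), (2, 'Mug', 'Kitchen');
INSERT INTO sf_category VALUES (10, 'Stationery'), (20, 'Kitchen');
INSERT INTO sf_product VALUES (1, 'Pen', 10), (2, 'Mug', 20);
INSERT INTO sales_fact VALUES (1, 2.5), (1, 3.5), (2, 8.0);
""")

# Star schema: the fact table joins directly to one dimension table.
star_sql = """
SELECT p.category_name, SUM(f.amount)
FROM sales_fact f
JOIN star_product p ON p.product_id = f.product_id
GROUP BY p.category_name ORDER BY p.category_name
"""

# Snowflake schema: reaching the category requires a second join
# through the product dimension to its parent table.
snowflake_sql = """
SELECT c.category_name, SUM(f.amount)
FROM sales_fact f
JOIN sf_product p ON p.product_id = f.product_id
JOIN sf_category c ON c.category_id = p.category_id
GROUP BY c.category_name ORDER BY c.category_name
"""

star_rows = conn.execute(star_sql).fetchall()
snowflake_rows = conn.execute(snowflake_sql).fetchall()
print(star_rows)
print(snowflake_rows)
```

Both queries return the same totals per category; the snowflake version simply pays for the normalized hierarchy with an extra join.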
Fact tables are large and usually distributed by hash. Each star schema contains a fact table that is home to measurements describing a particular process. The measurements, or facts, are given context by their related dimensions. The grain of the fact table describes the level of detail at which the facts are recorded.
Dimension tables are usually small and often distributed by replication, but they can also be distributed by hash. The dimensions provide contextual information, without which reports would be meaningless. Successful dimension design hinges on the proper use of keys, the development of a richly detailed set of dimension columns, and a rejection of the urge to save space.
- A fact table that contains no facts is called a factless fact table. The name sounds like an oxymoron, but it is apt: although no facts are explicitly recorded in a factless fact table, it does support measurement. A factless fact table is useful in two kinds of situations:
- Factless fact tables for events record the occurrence of activities. Although no facts are stored explicitly, these events can be counted, producing meaningful process measurements. Examples include the number of documents processed or approved, the number of calls to a customer support center, or the number of impressions of an advertisement.
- Factless fact tables for conditions are used to capture significant information that is not part of a business activity. Conditions associate various dimensions at a point in time. When compared with activities, they provide valuable insight. Examples of conditions include eligibility of people for programs, the assignment of sales reps to customers, active marketing programs for a product, or special weather conditions in effect.
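As a rough illustration of the event case, the sketch below uses Python's standard `sqlite3` module. The `support_call_fact` table and its columns are hypothetical; the point is only that a table holding nothing but dimension keys can still be measured by counting rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Factless fact table: only dimension keys, no numeric fact columns.
CREATE TABLE support_call_fact (call_date TEXT, customer_id INTEGER, agent_id INTEGER);
INSERT INTO support_call_fact VALUES
  ('2024-01-01', 1, 100),
  ('2024-01-01', 2, 100),
  ('2024-01-02', 1, 101);
""")

# The measurement comes from counting the events themselves.
calls_per_day = conn.execute(
    "SELECT call_date, COUNT(*) FROM support_call_fact "
    "GROUP BY call_date ORDER BY call_date"
).fetchall()
print(calls_per_day)
```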
- Star schema
- Galaxy schema
- Fact constellation schema
- Snowflake schema
7. Explain the important components of Teradata along with its architecture.
Refer to the Teradata architecture.
8. What is the difference between SMP and MPP?
MPP:
- Stands for massively parallel processing
- A computing model that uses many CPUs in parallel to execute a single program
- Each CPU has its own memory, which prevents hold-ups
- Does not suffer from a bottleneck, because the CPUs do not all compete for the same memory at once
- Harder to program, because an application must be divided into parts that communicate with each other
SMP:
- Stands for symmetric multiprocessing
- A computing model in which many CPUs are available to individual processes simultaneously
- All CPUs access the same memory, so there are hold-ups
- Suffers from a bottleneck when all CPUs attempt to access memory at once
- Easier to program
If the user request asks for all of the rows in a table, every AMP must participate, along with all the other AMPs, to complete the retrieval of all rows. This type of processing is called an all-AMP operation and an all-rows scan.
However, each AMP is only responsible for its own rows, not the rows that belong to a different AMP. As far as each AMP is concerned, it owns all of the rows it can see.
Within Teradata, the AMP environment is a "shared nothing" configuration. The AMPs cannot access each other's data rows, and there is no need for them to do so.
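The idea can be sketched with a small Python simulation. This is not how Teradata is implemented: the AMP count, the `amp_for` helper, and the use of CRC32 as the hash are all assumptions made for illustration (Teradata uses its own row-hash algorithm and hash map).

```python
import zlib

NUM_AMPS = 4  # hypothetical AMP count for the sketch


def amp_for(primary_index_value, num_amps=NUM_AMPS):
    """Pick the owning AMP from a hash of the primary index value.

    CRC32 modulo the AMP count is a stand-in, not Teradata's real hashing.
    """
    return zlib.crc32(str(primary_index_value).encode("utf-8")) % num_amps


def distribute(rows, key, num_amps=NUM_AMPS):
    """Place every row on exactly one AMP; no row is shared between AMPs."""
    amps = [[] for _ in range(num_amps)]
    for row in rows:
        amps[amp_for(row[key], num_amps)].append(row)
    return amps


def all_amp_scan(amps):
    """All-rows scan: each AMP reads only its own rows, then results are merged."""
    result = []
    for local_rows in amps:
        result.extend(local_rows)
    return result


rows = [{"emp_id": i, "name": f"emp{i}"} for i in range(1, 21)]
amps = distribute(rows, "emp_id")
scanned = all_amp_scan(amps)
print([len(local_rows) for local_rows in amps])  # rows held by each AMP
print(len(scanned))                              # total rows returned by the scan
```

Because the same primary index value always hashes to the same AMP, a single-row lookup in this model would touch one AMP only, while a request for the whole table has to visit all of them.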
Set table: A SET table rejects duplicate rows. If your system is in Teradata mode, SET tables are the default. You can be in Teradata mode and still explicitly define a MULTISET table.
Multiset table: A MULTISET table allows duplicate rows. If your system is in ANSI mode, MULTISET tables are the default. In either Teradata mode or ANSI mode, you can explicitly state the table type you want (SET or MULTISET). The risk with MULTISET tables is that if the table has a non-unique primary index and you accidentally load it twice, you end up with duplicate rows.
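The accidental double load can be modeled with a small Python sketch. The `Table` class and `load` helper are hypothetical and only imitate the duplicate-row rule described above; they are not Teradata APIs, and a rejected row is modeled simply as an insert that returns `False`.

```python
class Table:
    """Toy model of a table that is either SET or MULTISET."""

    def __init__(self, kind):
        if kind not in ("SET", "MULTISET"):
            raise ValueError("kind must be 'SET' or 'MULTISET'")
        self.kind = kind
        self.rows = []

    def insert(self, row):
        """Return True if the row was stored, False if it was rejected."""
        row = tuple(row)
        # A SET table rejects a row that matches an existing row in every column.
        if self.kind == "SET" and row in self.rows:
            return False
        self.rows.append(row)
        return True


def load(table, batch):
    """Insert a batch and return how many rows were actually stored."""
    return sum(table.insert(row) for row in batch)


batch = [(1, "Ann"), (2, "Bob")]

set_table = Table("SET")
multiset_table = Table("MULTISET")

# Simulate loading the same batch twice by accident.
for _ in range(2):
    load(set_table, batch)
    load(multiset_table, batch)

print(len(set_table.rows))       # the second load adds nothing
print(len(multiset_table.rows))  # the second load doubles the rows
```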