G C Reddy
Software Testing Complete Reference
DATA INTEGRITY
DATA INTEGRITY
It is very important for the data in a database to be accurate, consistent and reliable. This implies that it is very crucial to enforce data integrity.
Data integrity ensures the consistency and correctness of data stored in a database. It is broadly classified into the following four categories.
- Entity integrity – ensures that each row in a table is unique. It enforces integrity of the data contained in the columns, which uniquely identify the rows in a table.
- Domain integrity – ensures that only valid ranges of values are allowed to be stored in a column. It can be enforced, by restricting the type of data, the range of values and the format of the data.
- Referential integrity – ensures that the data in the database remains uniformly consistent, accurate and usable even after the data in it has been changed. It maintains the integrity of data by ensuring that the changes made in the parent table are also reflected in all the dependent tables.
- User defined integrity – refers to a set of rules specified by a user, which do not belong to the entity, domain and referential integrity categories.
ENFORCING DATA INTEGRITY
Incorporating business rules and specifying the relevant constraints can enforce data integrity.
Business rules refer to specific policies followed by an organization for the smooth running of its business. Business rules ensure that the database stores accurate data in relation to the business policies.
For example an organization may have a personnel policy, which states that the minimum salary of an employee is $500. This is a business rule.
Constraints refer to rules, which restrict the values that are inserted in the columns of a table.
CONSTRAINTS:
A Constraint can either be created at the time of creating a table or can be added later. When a constraint is created after table creation, it checks the existing data. If the existing data does not conform to the rule being enforced by the constraint, then the constraint is rejected.
A constraint can be created using either of the following statements:
- CREATE TABLE statement.
- ALTER TABLE statement.
Using CREATE TABLE statement:
A constraint can be defined on a column at the time of creating a table. It can be created with the CREATE TABLE statement:
The syntax is :
|
Using ALTER TABLE statement:
A constraint can also be defined on a column after a table has been created. This can be done using the ALTER TABLE statement.
The syntax is:
|
TYPES OF CONSTRAINTS
The following are the different types of Constraints:
- CHECK constraint
- UNIQUE KEY constraint
- PRIMARY KEY constraint
- FOREIGN KEY constraint
CHECK CONSTRAINT
A CHECK constraint enforces domain integrity by restricting the values to be inserted in a column. It allows only valid values in a column.
It is possible to define multiple CHECK constraints on a single column.
Syntax:
|
Example:
|
The rules regarding the creation of CHECK constraint are as follows:
- It can be created at the column level as well as table level.
- It is used to limit the values that can be inserted into a column.
- It can contain user-specified search conditions.
- It cannot contain sub queries. A sub query is one or more select statements embedded within another select statement.
- It does not check the existing data into the table if created with the No Check Option.
- It can reference other columns of the same table.
UNIQUE CONSTRAINT
UNIQUE constraints are used to enforce uniqueness on non-primary key columns.
A primary key constraint column automatically includes a restriction for uniqueness. However a unique constraint allows null values.
It is not advisable to have columns with null values, though these are allowed. A null value is also validated, hence there can only be one record in the table with a null value. The table can’t contain another row with null value.
For example:
|
The rules regarding the creation of UNIQUE constraint are:
- It does not allow two rows to have the same non-null values in a table.
- It gets enforced automatically when a UNIQUE index is created.
- Multiple UNIQUE constraints can be placed on a table.
PRIMARY KEY constraint:
A primary key constraint is defined on a column or a set of columns whose values uniquely identify the rows in a table. These columns are referred to as primary key columns. A primary key column cannot contain NULL values since it is used to uni8quely identify rows in a table.
While defining a PK constraint, you need to specify a name for the constraint. If no name is specified, SQL Server automatically assigns a unique name to the constraint.
Any column or set of columns that uniquely identifies a row in a table can be a candidate for the primary key. These set of columns are referred to as candidate keys. One of the candidate keys is chosen to be the primary key, based on familiarity and greater usage. The other key, which is not chosen as the primary key is called as alternate key.
Syntax:
|
Example:
|
CLUSTERED AND NON CLUSTERED INDEX
An index is a database object which provides faster access to data by searching the data using the key values. When an index is created on a table, the data in the table is sorted based on the key value on which the index is created.
Sql server provides two types of indexes:
- Clustered index
- Non clustered index
In a clustered index the data is physically sorted, as a result, only one clustered index is possible per table. A clustered index changes the physical order of the rows. Therefore, it is advisable to create a clustered index before a non-clustered index so that the non clustered index is not rebuilt. A clustered index is faster in performance in comparison to a con clustered index.
Hence a clustered index is one in which the logical order of the index is same as the physical sorted order of the corresponding rows that exist in a table.
A non-clustered index is a separate index structure independent of the physical sort order of the data in the table. As such, the data rows may be randomly placed in the table.
Hence a non-clustered index one in which the logical order of the index does not match the physical sort order of the rows on disk.
If a column has a PRIMARY KEY constraint with a clustered index defined on it, then the constraint needs to be dropped before a new constraint is defined on the column
- It creates unique clustered index by default
- It cannot be created on columns that have been defined as NULL during table creation
- It can be created only once for a table, that is only primary key constraint can be created per table
- It always checks for existing data even if it is created with the WITH NO CHECK OPTION
When a PK constraint is created a unique clustered index gets created automatically. The index cannot be dropped explicitly. It is dropped only when the table or the constraint is dropped. If a clustered index already exists, then the PK constraint gets rejected.l
Foreign key constraints
A foreign key is a column or combination of columns whose values match the primary key of another table.
A foreign key does not have to be unique. However, foreign key values must be copies of the primary key values of the master table.
A foreign key is a column on which a FOREIGN KEY constraint has been defined. A FOREIGN KEY constraint associates one or more columns of a table with an identical set of columns on which a PRIMARY KEY constraint has been defined (primary key column) in another table.A foreign key may refer to the primary key of another table or same table.
Syntax:
|
For Example:
|
The rules regarding the creation of a FOREIGN KEY constraint are as follows:
- The no.of columns specified in the foreign key statement must match the no.of columns in the references clause and must be specified in the same order.
- The foreign key can reference the primary key or a UNIQUE constraint of another table.
- The foreign key constraint that is created using the WITH NOCHECK option will prevent the foreign key constraint from validating existing data.
DROPPING CONSTRAINTS:
A constraint can be dropped using the alter table statement in the query analyzer.
The syntax of the ALTER TABLE statement for dropping a constraint is;
|
Drops the constraint, con_city, defined on the city column of the publishers table.
All the constraints defined on a table are dropped automatically when the table is dropped.
CONSTRAINTS AND TRIGGERS:
Each category of data integrity is best enforced through the following constraints.
Entity integrity is best enforced through the PRIMARY KEY constraint, UNIQUE constraint and the IDENTITY property.
The IDENTITY property creates an identity column in the table. An identity column is a primary key column for which numeric values are generated automatically. Each generated value is different from the previous value, thus ensuring uniqueness of data in each row at the table. The IDENTITY property also ensures that while inserting data, the user need not explicitly insert a value for the identity column.
Domain integrity is best enforced through the DEFAULT constraint, CHECK constraint and the FOREIGH KEY constraint.
Referential integrity is best enforced through the FOREIGN KEY constraint and the check constraint.
User-defined integrity is best enforced through all column and table level constraints that are specified with the CREATE TABLE constraint.
RULES AND DEFAULTS
Rules and defaults are objects, which help enforce the data integrity. These objects are bound to columns or user defined data types to ensure that only valid values are allowed to insert into the tables.
Rules
A rule provides a mechanism for enforcing domain constraints for columns or user defined data types. The rule is applied before any modification is to be done on the table. In other words, a rule specifies the restriction on the values for a column or a user defined data type.
The syntax of the CREATE RULE statement is:
|
More Examples:
CREATE RULE dept_name_rule
AS @deptname NOT IN (‘accounts’,’stores’)
CREATE RULE max_price_rule
AS @maxprice >=$5000
CREATE RULE emp_code_rule
AS @empcode LIKE ‘[F-M][A-Z]
Information on a rule can be obtained using the sp_help system stored procedure. The text of a rule can be displayed using the sp_helptext system stored procedure with the name of the rule as its parameter.
Binding Rules
A rule can be bound using the sp_bindrule system stored procedure.
Syntax:
|
Binds the rule, type_rule,to the type column of the titles table
The restrictions on the use of the rules are as follows:
- Only one rule can be bound to a column or a user defined datatype.
- A rule cannot be bound to system datatypes.
- If a new rule is bound to a column or datatype that is already been inserted in the table, the existing values in the tables do not have to meet the criteria specified by the rule.
- A rule cannot be defined for a system-defined datatype
Unbinding Rules
A rule can be unbound from a column or user-defined datatype using the sp_unbindrule system stored procedure.
The syntax of the sp_unbindrule is:
|
Unbinds the rule attached to the type column of the titles table.
DEFAULTS
Default is a constant value assigned to a column, into which the user need not insert values.A default can be bound to a column or a user-defined datatype.
The syntax of the CREATE DEFAULT statement is:
|
Any constant, built-in funtion, mathematical expression or a global variable can be used in the constant expression. The character and date constants must be included in single quotation marks (‘), whereas money, integer, and floating-point constants can be specified without quotation marks.
Binding Defaults
The sp_bindefault system stored procedure is used for binding a default.
The syntax of sp_bindefault is:
|
Creates a default, city_default, and binds the default value, Oakland, to city column of the authors table.
The statement
|
Binds the default city_default to the user-defined datatype, city_datatype.
UNBINDING DEFAULTS
Defaults can be unbound from a column or user defined datatype using the sp_unbindefault system stored procedure.
The syntax of sp_unbindefault is:
|
Unbinds the default specified on the city column of the authors table.
RENAMING RULES AND DEFAULTS
The sp_rename system stored procedure can be used for renaming rules and defaults.
The syntax of sp_rename is:
|
DROPPING RULES AND DEFAULTS
The DROP RULE and DROP DEFAULT statement can be used to drop a rule and default respectively.A rule or default must be unbound from the column or the user-defined datatype before it is dropped.
The syntax for dropping a rule is:
|
Where rule_name is the name of the rule to be dropped.
The syntax for dropping a default is :
|
Where default_name is the name of the default to be dropped.
Tags: CONSTRAINTS, DATA INTEGRITY, Data integrity ensures the consistency and correctness of data stored in a database., User defined integrity

[...] DATA INTEGRITY [...]