In: Categories » Computers and technology » Programming » Creating your physical data model
Physical data models are extracted from their logical counterparts by mapping logical objects to physical objects supported by the target database platform. The following table shows the relationship between logical and physical objects. In many cases, your physical data model almost mirrors your logical data model because entities typically map directly to the tables that store them. Exceptions to this rule are logical data models that contain nonspecific relationships and entity subtypes. Although entity subtypes are an advanced topic, beyond the scope of this article, we do have a nonspecific relationships that must be resolved into a physical database structure. Remember that the physical mechanism used to relate a child to its parent is the sharing of key values: The child’s foreign key matches the parent’s primary key. A nonspecific relationship doesn’t migrate keys, so something must change between the structure of the logical data model and the physical data model extracted from it. To store these shared keys you must create a separate table—a table with no entity counterpart. Now you can represent related categories and inventory items simply by storing their shared keys in the newly created CategoryInventoryItem table. In other words, for every category to which a particular inventory item belongs, you create a row in the CategoryInventoryItem table that contains foreign keys to both that InventoryItem row and the related Category row. So if that inventory item is related to three categories, three rows in the CategoryInventoryItem table all contain the same foreign key to that InventoryItem row, but each row contains a different foreign key—one for each related Category row. No real-world entity is called a CategoryInventoryItem; this table exists in the physical realm only to store the keys shared by the Category and InventoryItem entities. Therefore, CategoryInventoryItem is a physical table, but it has no corresponding logical entity. Data types must also be mapped between the logical and physical data models. The data types that you specify for the attributes in your logical data model may not be directly supported in the target database platform, so each must be mapped to its closest possible supported type. The Datetime data type, for example, is directly supported by SQL Server 2000 but not by Oracle 9i, which maps Datetime to the DATE data type. If the relationships in our database could talk, they would say things such as the following: “All values entered in OrderItem.SalesOrderID must match values in SalesOrder.SalesOrderID.” “If a user deletes a SalesOrder, automatically delete all OrderItems related to it.” But how does the database enforce these hard-and-fast rules? In two ways: declarative referential integrity constraints and triggers. A constraint restricts or constrains a table to accept only rows that satisfy the constraint’s rules. A declarative referential integrity (DRI) constraint’s rules constrain a child table, but they operate with respect to operations and conditions in the related parent table. Take, for example, the following statements: “All values entered in OrderItem.SalesOrderID must match values in SalesOrder.SalesOrderID.” “If a user deletes a SalesOrder, automatically delete all OrderItems related to it.” If you formalize them as a DRI constraint the statements become the following code:
ALTER TABLE OrderItem ADD CONSTRAINT FK__OrderItem_SalesOrder FOREIGN KEY (SalesOrderID) REFERENCES SalesOrder(SalesOrderID) ON DELETE CASCADE
Why is this statement called a declarative referential integrity constraint? Because it declares the table’s referential integrity behavior as part of the table definition itself, as opposed to implementing such behavior as a separate piece of executable computer code. We now break down this DRI constraint and look at what each phrase means, starting with the following:
ALTER TABLE OrderItem ADD CONSTRAINT FK__OrderItem_SalesOrder
The preceding phrase modifies the definition of the OrderItem child table to add the constraint object named FK__OrderItem_SalesOrder. This object contains the remaining three lines of code, which make up the actual instructions of the constraint, as follows:
FOREIGN KEY (SalesOrderID) REFERENCES SalesOrder(SalesOrderID)
This phrase is the “meat and potatoes” of the DRI constraint. It creates the restriction between the foreign key OrderItem.SalesOrderID and the primary key SalesOrder. SalesOrderID so that, for every value stored in OrderItem.SalesOrderID, an equal value must be stored in SalesOrder.SalesOrderID with which OrderItem.SalesOrderID can match. If any other value is attempted in OrderItem.SalesOrderID, the database throws an error and the attempt fails. Finally, the last line of the DRI constraint is as follows:
ON DELETE CASCADE
This line instructs the database to first delete all OrderItem rows related to a SalesOrder row that is about to be deleted. Without the ON DELETE CASCADE clause in this DRI constraint, attempting to delete a SalesOrder row with related OrderItem rows would throw an error, and no changes would be made to either table’s contents. The first four lines of constraint code are standard DRI syntax supported by most database platforms. Some database platforms, such as SQL Server 2000 and Oracle 9i, support the ON DELETE CASCADE extension to the basic DRI syntax to facilitate these cascading deletes. For database platforms that do not support cascading deletes in DRI constraints, the cascading delete behavior must be implemented by using a separate piece of code that automatically executes whenever a user attempts to delete a SalesOrder row. Such a piece of automatically executing code is called a trigger. Although we thoroughly discuss triggers later, we need to cover the basics of what they are and how they work here so that you can contrast and compare them to DRI constraints. First of all, why are they called triggers? The answer is that the execution of one is “triggered” by a database event (INSERT, UPDATE, or DELETE). A trigger is attached to a specific table and is instructed to automatically execute whenever one or more specific database events take place on that table. You can place just about any SQL code that you want into a trigger, including code that modifies the contents of other tables. Take, for example, the following clause:
ON DELETE CASCADE
If implemented as a trigger, the preceding clause becomes the following code:
CREATE TRIGGER tD_SalesOrder ON SalesOrder FOR DELETE AS BEGIN DECLARE @Rows int SELECT @Rows = @@rowcount IF @Rows = 0 RETURN DELETE OrderItem FROM OrderItem ch, deleted WHERE ch.SalesOrderID = deleted.SalesOrderID END
Now to break down what’s happening in this code. First, start with the trigger header. This header both creates the trigger and attaches it to a specific table and also specifies the event or events that fires it, as follows:
CREATE TRIGGER tD_SalesOrder ON SalesOrder FOR DELETE AS
In other words, you’re creating a trigger on the SalesOrder table named tD_SalesOrder, and you’re instructing it to execute every time that a DELETE event occurs on the SalesOrder table. So far, so good. Now look at the body of the trigger—the part that actually executes— as follows:
BEGIN DECLARE @Rows int SELECT @Rows = @@rowcount IF @Rows = 0 RETURN
You start the code block with BEGIN and then immediately declare a local variable of type Integer named @Rows. This variable holds the count of rows in the SalesOrder table affected by the DELETE event. The global variable @@rowcount is controlled by the database server; it always contains the count of rows affected in the table most recently modified by the currently executing code, which in this case is the SalesOrder table. So SELECT @Rows = @@rowcount places the count of rows deleted from the SalesOrder table into the local variable that you just declared. If the number of rows affected is zero, the trigger code gracefully exits by using the RETURN statement, and nothing further happens. If this RETURN test doesn’t exist in the trigger code, the rest of the trigger code executes, and you don’t want that to happen because you’d be executing the code that should only run if there are related child rows. Now to the part of the trigger that actually performs the cascading delete, as follows:
DELETE OrderItem FROM OrderItem ch, deleted WHERE ch.SalesOrderID = deleted.SalesOrderID END
This relational delete deletes those rows in the OrderItem table that are related to all rows in the Deleted pseudotable—in essence, all OrderItem rows related to the SalesOrder rows deleted. Finally, you close the body of the trigger by using END. Triggers have very different capabilities on the various database platforms. The trigger that we just discussed, for example, operates on the SQL Server 2000 platform, which defines an event at the statement level—meaning that this trigger fires once for each DELETE statement executed on the table, no matter how many rows are affected by that statement. By comparison, Oracle triggers can operate at either the statement level, as in SQL Server, or at the row level, such that the trigger code executes once for each individual row that is affected by the statement. Another distinction between database platforms is when they can fire triggers. SQL Server fires triggers after the event takes place and all values are modified by the statement (but before they are committed to disk); Oracle enables you to define a trigger to execute either before or after the event takes place. Other distinctions also exist. So when do you implement referential integrity by using DRI constraints and when do you use triggers? Well, a DRI constraint is almost always faster to execute than the equivalent operation implemented in a trigger—and they certainly are simpler to code—but you may sometimes need more complex logic to be involved, and for that, you need to use triggers.
Put another way, normalization is the process of structuring data to fit the formal definition of a relational database. The layered approach that we recommend that you use to design your databases helps you create inherently normalized database models from the very beginning, but you often inherit a database with a very bad data structure. This article gives you the basics of how to normalize a bad database structure into a good one. You begin with a very bad design that places everything into one big entity. The example here is a database that tracks puppies and the tricks that they can perform. Your first clue that the database shouldn’t be structured as a single entity is that huge part of repeating attributes (TrickName_1, TrickName_2, etc.), but how do you go about normalizing it? You normalize a database by modifying it through increasing degrees of normalization known as normal forms. To achieve First Normal Form, you must eliminate all repeating groups of attributes from your entities. A repeating group of attributes is fairly easy to spot, such as the group of TrickID_N, TrickName_N, TrickWhereLearned_N, and SkillLevel_N, where N with Databases goes from 1 to 40. When you build all those repeating groups into a single entity, you invariably leave one or more of them empty, which not only requires that you leave most or all of their attributes nullable but also leaves a lot of empty space in the physical tables produced from your design. Both of these practices constitute bad database design. You move this repeating group into an entity of its own so that you can have as many or as few of these groups as you need, and you don’t have to make any columns nullable. Now your database is said to be in First Normal Form, or 1NF. To achieve Second Normal Form, we must eliminate redundant attributes from all our entities. Redundant data depends on only part of a multivalued key. The name of a trick, for example, depends on the TrickID, which is only one part of the primary key of the Trick entity. This means that you can move these attributes to an entity of their own. To achieve Third Normal Form, we must remove all attributes that are not dependent on the keys of each entity. None of the Kennel data, for example, depends on the key of the Puppy table in any way; the entire collection of Kennel attributes is redundantly stored in each row of the Puppy table. You can split the Kennel attributes into a separate entity because none of them depend on the value of the Puppy entity’s key. Now your database is said to be in Third Normal Form, or 3NF. Further degrees of normalization are possible, but they are beyond the scope of this article. If you normalize your database designs to Third Normal Form, you are typically normalizing far enough.
Where normalization is the process of structuring data to fit the formal definition of a relational database, denormalization is the process of introducing controlled redundancy into your normalized database for specific performance enhancement reasons. The more that you denormalize your database structure, the less it fits the formal definition of a relational database, and the less its data can take advantage of relational database features and functionality. For this reason, you should denormalize only if you have absolutely compelling reasons to do so (such as unbearably slow relational selects)—and even then only as little as possible. Normalization takes place in the logical data model; denormalization takes place in the physical data model for a specific database platform and uses performance-enhancement techniques best suited to that platform. The most common denormalization technique is to redundantly store ancestor columns in a child table. If you don’t want to perform a relational join every time that you want a report of customers and their customer type, for example, you can redundantly store CustomerTypeName in the Customer table. Now you can return the value of CustomerTypeName directly from the Customer table, which saves the database from needing to perform a costly relational join and thereby increases reporting performance. The downside to denormalization is that, because the denormalized portion of the database no longer conforms to the formal definition of a relational database, you must implement hand-crafted code to automatically handle editing changes migrating from CustomerType. CustomerTypeName to Customer.CustomerTypeName. You also must implement code that performs an initial copy of CustomerType.CustomerTypeName to Customer. CustomerTypeName after the Customer row is first inserted. Such code would be implemented as triggers so that these operations occur automatically. The more that you denormalize, the more code you must write to maintain this nonrelational, migrated data, so denormalize sparingly.
Many times you want to automatically supply default values to rows inserted into a table— for example, the current date and time of a sale, or a status value that represents some beginning point in a business process. This behavior is implemented by using defaults Defaults are defined on specific table columns, and they tell your database, “If this column isn’t included in the INSERT statement for this table, go ahead and automatically enter this default value. If the INSERT statement does include this column, use whatever value was supplied for it.” Here’s how a default declaration looks in the table definition:
CREATE TABLE SalesOrder( SalesOrderID int IDENTITY(1,1), CustomerNumber varchar(12) NULL, SaleDate datetime DEFAULT GetDate() NOT NULL, <more column definitions> CONSTRAINT PK_SalesOrder PRIMARY KEY NONCLUSTERED (SalesOrderID))
You’ll notice throughout this article that we occasionally boldface portions of code. This is a simple yet effective mechanism for helping you “see the forest for the trees” as we discuss a topic. For example, highlighting the DEFAULT phrase in a block of code while discussing defaults helps you quickly acquire the most applicable part of the code without being distracted by the rest during discussion. This default declaration automatically places the result of SQL Server’s GetDate() function—the current date and time—into the SaleDate column as long as the SaleDate column is not a part of the INSERT statement. If the SaleDate column is part of the INSERT statement, the value supplied by the statement will be used.
legal notice
Our website is not responsible for the information contained by this article. Web-articles is a free articles resource.
Suggestion: If you need fresh, daily updated content for your website, feel free to use our service. Click here for more information.
Useful tools and features
related articles
In this article, you create the template that puts your form data into the database. Create a file named CompanyAddAction.cfm inside the Ch02 directory, type the code into the file’s editing window, and save the file. <cfquery name=”InsertCompany” datasource=”#Request.MainDSN#”> INSERT INTO Company( CompanyName, Address, City, State, ZipCode, Comments ) VALUES ( ‘#Trim(Form.CompanyName)#’, ‘#Trim(Form.Address)#’, ...
2. Modifying a Company in the Database
The data that CFQUERY requests from the database comes back in a result set, and the Name attribute tells ColdFusion what that result set is to be named. The SQL statement consists of three clauses: SELECT, FROM, and ORDER BY. SELECT tells the database which columns to retrieve from the database; FROM tells the database which table to retrieve those columns from; and ORDER BY tells the database how to sort the results. The result set returned from CFQUERY contains multiple rows of data, and each row has multiple columns. It would ...
3. Building the company edit action template
The code in CompanyEditAction.cfm is like the code in CompanyAddAction.cfm, but the edit action updates rather than inserts. Create a file named CompanyEditAction.cfm inside the Ch02 directory, type the code into the file’s editing window, and save the file. <cfquery name=”UpdateCompany” datasource=”#Request.MainDSN#”> UPDATE Company SET CompanyName = ‘#Trim(Form.CompanyName)#’, Address = ‘#Trim(Form.Address)#’, City = ‘#Trim(Form.City)...
4. Adding a New Employee to the Database
The CFQUERY in CompanyDeleteAction.cfm uses a DELETE statement with two SQL clauses: DELETE and WHERE. DELETE tells the database the table from which to delete a record, and WHERE tells the database which record to delete. You can watch the company delete process in action. Point your Web browser to http://<yourserver>/CFMXBible/Ch02/CompanyGetDeleteForm.cfm and enter a CompanyID. (To get a valid ID, go to the company list and pick a number from the ID column.) Click Submit to see the chosen company’s informatio...
5. Modifying an Employee in the Database
The DateFormat() function around the DateOfBirth column in the codereturns the employee’s birth date reformatted according to a display mask. DateOfBirth normally comes back from the database in the following format: 2002-01-01 00:00:00.0 That format is not very user-friendly. Calling DateFormat() with a mask of “mm/dd/yyyy” returns the date as follows: 01/01/2002 This version is, of course, more natural and easy to read. The same is true for the employee edit process...
6. Removing an Employee From the Database
The user must have the capability to remove employees from the database. The employee delete process is a simple combination of techniques that you have already learned, such as retrieving a record from the database, displaying that record in a template, and so on. The first page in this process is nearly identical to the employee get edit form. Create a file named EmployeeGetDeleteForm.cfm inside the Ch02 directory, type the code into the file’s editing window, and save the file. <html> <head> <ti...
7. Making direct links to the forms
Say that you want to modify or delete a company. Right now, you need to remember the company’s ID, go back to the launch pad, click Company Edit, and enter the company ID, all just to get to the edit form. Wouldn’t you rather click a company in the list and go directly to the edit form? ... <table> <tr> <td><b>ID</b></td> <td><b>Name</b></td> <td><b>Address</b></td> <td><b>City</b></t...
8. Using a Custom Tag
Custom tags are reusable, developer-authored extensions to the ColdFusion language. The custom tag that you create in the following parts displays today’s date in a familiar format. Create a file named TodaysDate.cfm inside the Ch02 directory, type the code into the file’s editing window, and save the file. <cfoutput>#DateFormat(Now(), “ddd, mmm d, yyyy”)#</cfoutput> Now() returns the current date and time, and DateFormat() reformats the date. To call your custom tag, open index.cfm, a...
9. Understanding All Relational Result Sets
This article can help you better understand complicated SQL containing multitable joins, group-related clauses, and aggregate functions—by far the most problematic topics for most database developers. You also learn the correct way to handle database exceptions and incorporate them as actual functionality in your ColdFusion application. You learn, too, how to increase performance by caching queries in memory for fast access. You can memorize SQL clauses and Bachus-Naur forms until you’re blue in the face, ...
