A column in a table is functionally dependent upon another column when there is only one value in the second column for each value in the first column. If this is true, the table is said to be normalized.
$ cat inventoryItem Quantity Cost Value Description ---- -------- ---- ----- -------------- 1 3 50 150 rubber gloves 2 100 5 500 test tubes 3 5 80 400 clamps 4 23 19 437 plates 5 99 24 2376 cleaning cloth 6 89 147 13083 bunsen burners 7 5 175 875 scales
The first column, Item, is the key column for the table. It is supposed to functionally determine all of the other columns. Each value in Item is unique so each value determines, "or is the key to," its row and all of the column values on it.
Cost does not functionally depend on Quantity because there are two 5 values in Quantity and each has a different value for Cost. The common sense or semantics of the table implies that knowing the quantity of items in the inventory does not tell us how much they cost.
Now let's see if fd can find the dependencies by answering the following questions: ``is the Cost of the item functionally dependent on the Item?'' and ``is the Quantity of the item functionally dependent on its Cost?''
$ fd Item Cost < inventory trueNORMALIZATION$ fd Amount Cost < inventory false
When functional dependency is found with non-key columns, the columns should be projected and uniqued to form a second table. Then the functionally dependent column, but not the determining column, should be removed from the first table.
$ column column1 column2 < table | uniq > newtable $ rmcol column2 < table > tmp $ mv tmp table
In newtable the column that determines the other column is the key for that table. The original table can be recreated by joining table and newtable with the jointable command:
$ jointable -j column1 table newtable > oldtable
Normalization, or simplification, is an important process because normalized files are smaller, easier to maintain and easier to conceptualize.