Further Details
This chapter fleshes out some more details of the concept of a Relation Distinguishing Tuple-Mark.
Types of Missing Information
Earlier I mentioned that there could be multiple meanings of missing information, which would have to have independent relations and relation variables. The example was
S_UnknownCity
S_MultiNational
Which have identical attribute sets and the same "missing" information (the City) but have different meanings for the missing information. For example, maybe all MultNationals must be located in Delaware so their location is guaranteed to not be in London which is quite different from just not knowing the location (so it could be in London).
The tuple-marks discussed so far do not allow for a multi-relation variable to have two relations with the same set of attributes. There would be no way to distinguish among the tuples. There is nothing that prevents this if the appropriate mechanisms are added. If there were multiple marks available ("--m1--", "--m2--", etc.) and the choose-marked restriction could select which mark, we could have any number of relations in a single variable:
S_All : (S_City.2, S_NoCity.2)
|
Relation
|
S#
|
SName
|
City
|
|
|
S_City
|
S1
|
Jones
|
London
|
|
|
S_City
|
S2
|
Smith
|
Bristol
|
|
|
S_City
|
S4
|
Eiffel
|
Paris
|
|
|
S_UnknownCity
|
S3
|
DuPont
|
--m1--
|
|
|
S_MultiNational
|
S5
|
Grid
|
--m2--
|
| We can answer the questions:
Q1: Find the suppliers that can proved to be in London.
Q2: Find the suppliers that can proved to be possibly in London.
Q3: Find the suppliers that can proved to be NOT in London.
Q4: Find the suppliers that can proved to be in Deleware
Using the following queries:
Q1: SELECT S# FROM S_All WHERE City = London.
Q2: Q1 UNION (SELECT S# FROM S_All [!m1!City])
Q3: (SELECT S# FROM S_All WHERE City <> London) UNION (SELECT S# FROM S_All [!m2! City])
Q4: (SELECT S# FROM S_All WHERE City = Deleware) UNION (SELECT S# FROM S_All [!m2! City])
Although adding the functionality to the database appears to mostly a syntax addition, it is certainly a much more difficult mental model to keep track of. The invisibility of the multiple overlapping relations would significantly hinder a user from understanding the database.
Relations with no Attributes: Dee and Dum
What if we have only a single attribute column that can be tuple marked? For example:
Names : (Name, NoName)
|
Name |
|
Jones |
|
Smith |
|
DuPont |
|
-- | What does it mean to have a "NULL" in the only column left? Well, to know that we have to define what relations we have. Suppose we have the following two base relations.
Name: {Name} There exists a supplier with the name Name.
NoName: {} There exists a supplier for whom we do not know the name.
So by adding a tuple with the relation NoName we specify that there exists a supplier for whom we do not know the name, where otherwise we know all the names of the suppliers. The relation for Names as a whole is:
Names: {} There exists a supplier
We can ask several interesting (informally phrased) questions:
Q1: What are the names of the Suppliers? "Names [Name]"
Q2: Are there any suppliers without names? "Names [!Name]"
Q3: Are there any suppliers? "Names []"
Note that Q2 and Q3 will return a relation with no attributes and either zero rows or one row. These are Tweedle-Dum and Tweedle-Dee respectively (see [Warden 90]). Tuple-marks have no problem with them.
Outer Joins
Tuple-Marks can be used just as NULLs are normally used in outer joins. The number of intermediary relations will be quite extensive and it may be difficult to process the intermediary results. [To be completed]
Tuple-Marks in final results
It is expected that the final results of a query would include the tuple-mark "NULL"s for presentation and application purposes when no explicit mention of the attribute occurred within the query. For example, a query of
SELECT * FROM S_ALL
Would return
|
S#
|
SName
|
City
|
|
|
S1
|
Jones
|
London
|
|
|
S2
|
Smith
|
Bristol
|
|
|
S4
|
Eiffel
|
Paris
|
|
|
S3
|
DuPont
|
--
|
|
|
S5
|
Grid
|
--
|
| Although
SELECT * FROM S_ALL [*]
would return
|
S#
|
SName
|
City
|
|
|
S1
|
Jones
|
London
|
|
|
S2
|
Smith
|
Bristol
|
|
|
S4
|
Eiffel
|
Paris
|
| Note that the SELECT operation is performing two operations: it is projecting as part of the query and it is ordering the output columns for the application API. The second has nothing to do with the relational model itself so it would be nice to order the columns without implying a projection and without removing tuples that have tuple-marks in that column.
|