When you use a topic to perform a search, the search agent starts its analysis by considering the evidence topics for that topic. If an evidence topic is present, it is given a score of 1.00 and is considered relevant to the search. If an evidence topic is absent, it is given a score of 0.00 and is considered irrelevant to the search. If the evidence topics are weighted, their scores are multiplied by the weights, and the resulting products are combined in the manner specified by the operator of the parent topic. If this parent topic is, in turn, the child of another topic that is being searched, its score is multiplied by its assigned weight, and the resulting product is combined with the products of its siblings in the manner specified by the operator assigned to its own parent topic. This process continues until the top-level topic used in the search is reached.
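As a minimal sketch of this bottom-up process (not the search agent's actual implementation), the following Python walks a topic tree recursively. The `Topic` class, the `score` function, the topic names, and the weights are all hypothetical; Python's built-in `max` and `min` stand in for a topic's operator.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Topic:
    """Hypothetical topic node; a node without children is an evidence topic."""
    name: str
    weight: float = 1.0
    # Stand-in for the topic's operator: combines the children's weighted scores.
    combine: Callable[[List[float]], float] = max
    children: List["Topic"] = field(default_factory=list)


def score(topic: Topic, present: Set[str]) -> float:
    """Return the unweighted score of `topic` for one document."""
    if not topic.children:
        # Evidence topic: 1.00 if present in the document, 0.00 if absent.
        return 1.0 if topic.name in present else 0.0
    # Multiply each child's score by its weight, then combine the products.
    products = [child.weight * score(child, present) for child in topic.children]
    return topic.combine(products)


# Illustrative tree with made-up names and weights.
tree = Topic("root", combine=max, children=[
    Topic("a", weight=0.5, combine=min, children=[Topic("x"), Topic("y")]),
    Topic("b", weight=0.8, combine=max, children=[Topic("z")]),
])

result = score(tree, {"x", "y"})
```

Here the evidence topics `x` and `y` are present, so topic `a` scores 1.00, is weighted down to 0.50, and the root takes the larger of its two children's products.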
The operators you use determine how parent and child scores contribute to the importance of a selected document. As each child in the topic is given an importance score, the following calculations are performed:
If a topic uses an ACCRUE operator, the products of each child’s weight and score are compared, the highest product is taken, and a small amount is added to the score for each child that is present in the document.
If a topic uses an AND operator, the products of each child’s own weight and score are compared, and the lowest product (the minimum) is taken as the score.
If a topic uses an OR operator, the products of each child’s weight and score are compared, and the highest product (the maximum) is taken as the score.
If a child uses a proximity operator (PHRASE, SENTENCE, or PARAGRAPH), or a relational operator, the child receives a score of 1.00 if the topic is present, and a score of 0.00 if the topic is not present.
An evidence topic receives a score of 1.00 if it is present, and a score of 0.00 if it is not present.
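The operator rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's code: the function names are hypothetical, and because the text only says that ACCRUE "adds a little" per present child, the `bonus` increment and the cap at 1.00 are assumptions made purely for the example.

```python
from typing import List, Tuple

# Each child is represented as a (weight, score) pair.
Child = Tuple[float, float]


def products(children: List[Child]) -> List[float]:
    """Multiply each child's weight by its score."""
    return [weight * score for weight, score in children]


def and_score(children: List[Child]) -> float:
    """AND: the lowest product (the minimum) is the score."""
    return min(products(children))


def or_score(children: List[Child]) -> float:
    """OR: the highest product (the maximum) is the score."""
    return max(products(children))


def accrue_score(children: List[Child], bonus: float = 0.05) -> float:
    """ACCRUE: highest product plus a small amount per present child.

    ASSUMPTION: the size of `bonus` and the cap at 1.0 are not given in
    the text; they are placeholders for illustration only.
    """
    present = sum(1 for _, score in children if score > 0)
    return min(1.0, max(products(children)) + bonus * present)


def evidence_score(is_present: bool) -> float:
    """Evidence topic: 1.00 if present, 0.00 if not."""
    return 1.0 if is_present else 0.0


# Made-up children: two present (score 1.0) and one absent (score 0.0).
children = [(0.5, 1.0), (0.4, 1.0), (0.8, 0.0)]
```

With these made-up children, AND yields the minimum product, OR yields the maximum, and ACCRUE starts from the maximum and grows with the number of present children.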
Once the final calculations for the parent topic have been performed, a matched document becomes available to the Verity application so that users can view it with its highlights.
The following example provides a breakdown of how evidence topics and subtopics are calculated to illustrate the process by which importance is assigned to selected documents.
In the following illustration, the parent topic BOEINGCO is being used in a search.
The evidence topics of each subtopic are first checked against the documents to determine if they are present. Evidence topics that are present are assigned scores of 1.00; evidence topics that are absent are assigned a score of 0.00.
The operators at the next level of a topic structure are used to combine the scores of the evidence topics. Because the operators at this level are all proximity operators (and therefore have no weights assigned), they all produce scores that are either 0.00 or 1.00.
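To illustrate why a proximity operator can only yield 0.00 or 1.00, here is a rough Python sketch. The tokenization, the sentence splitting, and the function names are simplifying assumptions and are not taken from the product; the real matching rules may differ.

```python
import re
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_score(words: List[str], text: str) -> float:
    """PHRASE-style check: 1.0 if the words occur consecutively, else 0.0."""
    toks = tokens(text)
    n = len(words)
    hit = any(toks[i:i + n] == words for i in range(len(toks) - n + 1))
    return 1.0 if hit else 0.0


def within_unit_score(words: List[str], units: List[str]) -> float:
    """SENTENCE- or PARAGRAPH-style check: 1.0 if all words share one unit."""
    hit = any(set(words) <= set(tokens(unit)) for unit in units)
    return 1.0 if hit else 0.0


# Made-up document text for illustration.
doc = "Boeing Computer Services won the contract. The defense unit of Boeing grew."
# ASSUMPTION: sentences end at '.', '!' or '?' followed by whitespace.
sentences = re.split(r"(?<=[.!?])\s+", doc)

as_phrase = phrase_score(["boeing", "computer", "services"], doc)
not_phrase = phrase_score(["boeing", "defense"], doc)
same_sentence = within_unit_score(["boeing", "defense"], sentences)
```

In this sample text, "boeing defense" does not occur as a phrase, but both words fall within one sentence, so the phrase check and the sentence check give different binary results.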
For example, assume that the following evidence topics appear within a given document:
The evidence topic “Boeing Computer Services” appears within a phrase.
The evidence topic “Boeing Defense” appears within a paragraph.
The evidence topic “Boeing Company” appears within the document.
The evidence topic “Ron Woodard” appears within a phrase.
The other evidence topics are only partially present, or are absent. Table 8-9 shows how the presence or absence of these evidence topics affects topic scores. The score for each topic reflects the presence of all related evidence topics, based on the operators that have been assigned to the parent topics.
Topic | Evidence topic | Evidence topic present | Evidence topic absent | Topic score |
---|---|---|---|---|
boeing-comp-services | boeing computer services | 1 1 1 | | 1 |
boeing-aerospace | boeing aerospace electronics | 1 | 1 1 | 0 |
boeing-defense | boeing defense | 1 1 | | 1 |
boeing-label | boeing company | 1 1 | | 1 |
paul-binder | paul binder | | 1 1 | 0 |
frank-shrontz | frank shrontz | | 1 1 | 0 |
ron-woodard | ron woodard | 1 1 | | 1 |
Given the above topic scores, the operators at the next level of topics in the structure are calculated as follows:
The subtopic boeing-comps, which uses the AND operator, has a score of 0.50.
The subtopic boeing-people, which uses the ACCRUE operator, has a score of 0.50.
Finally, the topic BOEINGCO, which uses the OR operator, compares the products of each child’s weight and score, and takes the highest product (the maximum) as its score. The selected document is thus scored as 0.50.
This process is repeated for each document. The documents are sorted by the scores of the BOEINGCO topic, and displayed in ranked order.
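The final ranking step amounts to a descending sort on the top-level topic score. A minimal Python sketch, assuming hypothetical document names and scores (only the 0.50 value echoes the example above) and an arbitrary tie-break on document name:

```python
from typing import Dict, List, Tuple


def rank(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort documents by score, highest first.

    ASSUMPTION: ties are broken alphabetically by document name so the
    order is deterministic; the text does not say how ties are handled.
    """
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up per-document scores for the top-level topic.
ranked = rank({"doc-a": 0.50, "doc-b": 0.00, "doc-c": 0.85})
```

Whether documents that score 0.00 are listed at all is not stated in the text; this sketch keeps them at the bottom of the list.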