A category groups documents by content, independent of location or type of document store. Use categories to filter your search results.
By setting up a well-organized category strategy, you can manage information by grouping documents of similar content. You can also view lists of documents for each category. This lets you present users with a predefined set of categories; they can then browse the documents in each category.
Set up categories by defining an initial query and a relevance threshold. Sybase Search assigns a document to the category if the document's relevance percent is equal to or greater than the threshold.
For example, a query that consists of search terms and a minimum document relevance creates a category of documents grouped by their relevance to those search terms. The minimum relevance helps ensure that the documents in the category are valid matches.
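The threshold rule can be illustrated with a minimal sketch. This is not the Sybase Search API or its scoring method: the `Category` class, the `relevance_percent` function, and the sample documents are hypothetical, and relevance is approximated here as the percentage of query terms that appear in a document.

```python
from dataclasses import dataclass


@dataclass
class Category:
    name: str
    query_terms: set   # search terms of the initial query (hypothetical model)
    threshold: float   # minimum relevance percent, 0-100


def relevance_percent(query_terms, text):
    """Toy relevance score: percentage of query terms found in the text.

    Assumption: a real search engine uses a far richer relevance model.
    """
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    hits = sum(1 for term in query_terms if term in words)
    return 100.0 * hits / len(query_terms)


def assign(category, documents):
    """Return IDs of documents whose relevance is equal to or greater than the threshold."""
    return [
        doc_id
        for doc_id, text in documents.items()
        if relevance_percent(category.query_terms, text) >= category.threshold
    ]


documents = {
    "doc1": "the football team won the league",
    "doc2": "a team of engineers shipped the release",
    "doc3": "quarterly sales figures",
}

category = Category("Football", {"football", "team"}, threshold=50.0)
members = assign(category, documents)
print(members)
```

In this sketch, a document that scores exactly at the threshold is still assigned to the category. Raising the threshold narrows the category to closer matches.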
Another way to categorize documents is to base the category on the content of one or more training documents. In this method, Sybase Search extracts the most relevant content from the training documents and uses this information as a new internal query to generate matching documents. This categorization is like the "find similar" feature, except that category training extracts relevant content from more than one document.
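The idea of deriving an internal query from training documents can be sketched as follows. This is only an illustration under stated assumptions: the source does not describe how Sybase Search selects the most relevant content, so this example substitutes simple term frequency with a small stopword list. The `train_query` function and the sample texts are hypothetical.

```python
from collections import Counter

# Assumption: a tiny stopword list stands in for real content analysis.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "with", "for", "to"}


def train_query(training_docs, top_n=3):
    """Build a list of query terms from the most frequent terms across all training documents."""
    counts = Counter()
    for text in training_docs:
        counts.update(w for w in text.lower().split() if w not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]


training_docs = [
    "java programmer with spring experience",
    "senior java programmer for backend services",
]

# The extracted terms play the role of the new internal query.
query_terms = train_query(training_docs, top_n=2)
print(query_terms)
```

Because the terms are counted across every training document, content shared by several examples outweighs content that appears in only one of them.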
Extracting content from training documents has these benefits:
Categories are created automatically, based on example documents and without a base query.
For example, a recruitment company wants to create a category based on a Java programmer’s resume. If categories were not created automatically, the company would need to produce a "seed" query, which would vary, depending on the individual who was creating the category. With training documents, Sybase Search extracts the most relevant content and creates a newly trained Java programmer category relevant to the original Java programmer’s resume.
Categories that were originally created using a seed query can be retrained. Categories can be trained repeatedly on new training documents to achieve the best results.
For example, a category created using the seed query "football team" can contain documents on English football or American football, depending on the documents that have been indexed by the system. Retraining this category on a few sample documents about American football makes the documents in the category more relevant to American football.
Manual tagging and continual maintenance of untrained categories are no longer needed.