Enhancing Search by Improving Content Quality
One of the most common complaints about search concerns the quality and relevance of results. The result set does not contain all the relevant documents, or it contains too many irrelevant ones, or both.
Maybe the expected, most relevant results are not displayed on the top of the result set. Or the presented results are old and outdated.
These complaints are common, and in most cases users blame the search engine. However, many of them can be fixed by making the right changes in the search configuration:
- Content sources: Make sure that all the relevant content sources are added to search, and they are indexed correctly.
- Crawl rules: The more content we add to search, the more garbage we have to deal with. Filtering out the unnecessary, irrelevant content is essential - this is where Crawl rules can help us.
- Search schema: The essence of search is metadata. Having proper metadata is critical - in the search engine, we can define and configure them in the search schema.
- Crawl schedule: Content freshness is one of the most important components of search success. Scheduling crawls the right way is essential - you have to plan it thoroughly.
- Result sources: In the previous versions of SharePoint, we could create pre-defined search scopes. In SharePoint 2013 and 2016 (and Office 365), we can create result sources to define subsets of the whole search index, to use them on separate pages (verticals), for example.
- Search pages: Search pages can be configured to display different result sources, with different refiners, different display templates, etc. Each search page has its own configuration which is independent of others’.
- Display Templates: With the help of Display Templates, we can define how the results, hover panels and refiners are supposed to be presented on the user interface. It is a very modern and dynamic way of UI customization.
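To make the crawl rule idea from the list above more concrete, here is a minimal Python sketch of ordered include/exclude rules. It is not SharePoint's API: the `RULES` list, the `should_crawl` function, the URLs, and the choice to crawl anything that matches no rule are all assumptions made for illustration. The only point it demonstrates is that the first matching rule decides, so the order of the rules matters.

```python
from fnmatch import fnmatchcase

# Hypothetical rules as (wildcard pattern, include?) pairs.
# They are evaluated top to bottom and the first match wins,
# so the narrow "keep" exception must come before the broad exclusion.
RULES = [
    ("*/archive/keep/*", True),   # exception: still crawl this folder
    ("*/archive/*", False),       # exclude everything else under archive
    ("*/_layouts/*", False),      # exclude system pages
]

def should_crawl(url, rules=RULES, default=True):
    """Return True if the URL should be crawled under the given rules."""
    lowered = url.lower()  # compare case-insensitively
    for pattern, include in rules:
        if fnmatchcase(lowered, pattern.lower()):
            return include
    # Assumption of this sketch: URLs matching no rule are crawled.
    return default

if __name__ == "__main__":
    base = "https://intranet.example.com/sites/hr"
    for path in ("/policy.docx", "/archive/2009/old.docx",
                 "/archive/keep/contract.docx", "/_layouts/15/settings.aspx"):
        print(path, should_crawl(base + path))
```

Swapping the first two rules would exclude the "keep" folder as well, which is the kind of ordering mistake worth checking for when results go missing.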
However, in many cases, changing the search configuration is not enough. Search quality also relies on the quality of the content: on its accuracy, timeliness, and completeness.
- Accuracy: Accurate content contains precise and correct information. Accuracy can be improved by working on the content itself, mostly by investing more time and human effort.
- Timeliness: Getting the information too late is not acceptable in most business situations. Getting an outdated document instead of the current version is to be avoided, too. However, getting something in the search results too early (before it is supposed to be published) is a problem as well.
- Completeness: Content is complete if it has all the necessary information, and no further research is needed to get it ready. Complete content helps users get their jobs done, rather than creating a demand for more and more information over time.
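The timeliness factor above can be checked in an automated way once documents carry date metadata. The following Python sketch flags both sides of the problem: content that is possibly outdated and content that would surface before its publication date. The `Doc` class, its `modified` and `publish_on` fields, and the 365-day threshold are hypothetical; a real implementation would read these values from whatever metadata your content actually has.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class Doc:
    title: str
    modified: date                     # last modification date
    publish_on: Optional[date] = None  # planned publication date, if any

def timeliness_issue(doc, today, max_age_days=365):
    """Return a short label for a timeliness problem, or None if there is none."""
    # Too early: the document is not supposed to be visible yet.
    if doc.publish_on is not None and doc.publish_on > today:
        return "too early"
    # Too old: not modified within the allowed window (threshold is an assumption).
    if today - doc.modified > timedelta(days=max_age_days):
        return "possibly outdated"
    return None

if __name__ == "__main__":
    today = date(2024, 1, 1)
    docs = [
        Doc("Travel policy", date(2023, 6, 1)),
        Doc("Old price list", date(2020, 1, 1)),
        Doc("Reorg announcement", date(2023, 12, 1), publish_on=date(2024, 2, 1)),
    ]
    for doc in docs:
        print(doc.title, "->", timeliness_issue(doc, today))
```

A report like this does not fix the content by itself, but it tells content owners where to spend their time.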
These content quality factors can be improved in several ways: by involving more human resources, or by using automated tools. These tools can help with several things, including auto-tagging (generating proper metadata in an automated way) as well as OCRing documents to make their content fully searchable (see Aquaforest Searchlight). Now, it is your turn. What’s going to be your next step to improve search quality in your organization?