Full Text Search in PostgreSQL

We all have apply search using like or ilike in our SQL queries. And if you have used full text search then you know how useful is to use the full text search on large database instead of using like or equal. But if you haven’t used or starting new on full text search then check out these points for better understanding:-

  1. Full text search use an “inverted index” support . This is an index where the keys are individual terms, and the associated values are sets of records that contain the term. Indexing increases the efficiency of searching to greater extent.
  2. It allows searches to find variant forms of the same word, without tediously entering all the possible variants. Also, this step typically eliminates stop words, which are words that are so common that they are useless for searching.
  3. Full text search is optimized to compute the intersection, union, etc. of the record sets, and usually provides a ranking algorithm to quantify how strongly a given record matches search keywords.
  4. It provide multilingual support like english, spanish etc.
  5. It gives you a facility of less irrelevant results and higher relevant results.

HOW TO USE IN QUERY

I will explain you with an example for easy understanding. Let us suppose you have a table named as article, it has columns like title, sub_heading, article_content , start_date and so on…

You have to apply your text search on three columns of this table like on title, sub_heading, and article content and you want to your search result in ranking order.

Method to do :-

STEP-1:Full text searching in PostgreSQL is based on the match operator @@, which returns true if a tsvector (document ) matches a tsquery (query). In more easy terms ts_vector contains the columns in which search  is applied  and ts_query contains search text.

Now this Query will search on article table , on columns title, sub_heading and article_content . ? is search term  .

This is the most used query . Now there are more function available to modify your query according to your need and use.

plainto_tsquery : If you want to search the text with whitespaces e.g. Auto Bajaj , then you have to use function plainto_tsquery in place of ts_query. Now Query would be :-

STOP WORDS : Now you see there are few words in your search which are not searchable that are very common words like a, the , doing etc

PGSQL full text search disallows those words search by default. They are called as stop words .

If you wants those words in your search then you have to create its seperate dictionary in pg_catalog schema and then use it in your query. I have created a english_stem_nostop dictionary that copy the same english language , template of snowball and no stopwords .

 

You can check in pg_ts_dict table of pg catalog  if you want.

CREATE TEXT SEARCH DICTIONARY english_stem_nostop ( Template = snowball , Language = english );

Then create text search configuration as  english_nostop.

CREATE TEXT SEARCH CONFIGURATION public.english_nostop ( COPY = pg_catalog.english );

 

ALTER TEXT SEARCH CONFIGURATION public.english_nostop ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, hword, hword_part, word WITH english_stem_nostop;

 

Then you use that configuration  i.e. english_nostop in the query  

 

STEP- 2:- As i consider you know how to create index on columns of table. So i am creating index on these three columns fro a better results .

CREATE INDEX artcile_title_indx ON article USING gin(title);

CREATE INDEX artcile_sub_heading_indx ON article USING gin(sub_heading);

CREATE INDEX artcile_article_content_indx ON article USING gin(article_content);

To study more about indexing refer this link.
For more detailed explanation about full text search you can this page.

Check out our post for Solving session problems in CodeIgniter after upgrading PHP version.

 

Please follow and like us:
20