Full-text search & trigram similarity

SearchVector, SearchQuery, and SearchRank expose PostgreSQL full-text search; the websearch search type parses Google-style queries (OR, quoted phrases, - to exclude a term).

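# ======================================================================
# `SearchVector`/`SearchQuery`/`SearchRank` are PostgreSQL full-text search;
# the `websearch` search type parses Google-style queries (`OR`, quotes, `-`).
# `TrigramSimilarity` (pg_trgm) is typo-tolerant fuzzy matching, and
# `__unaccent` strips diacritics so 'muller' finds 'Müller'.
#
# The `__unaccent` lookup needs "django.contrib.postgres" in INSTALLED_APPS;
# pg_trgm and unaccent must be installed as database extensions.
# ======================================================================

from django.contrib.postgres.search import (
    SearchQuery,
    SearchRank,
    TrigramSimilarity,
)

# Assumed import path, inferred from the "examples_" table prefix in the SQL.
from examples.models import Author, Book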

# ----------------------------------------------------------------------
# websearch 'lord OR postgres' ranked
# ----------------------------------------------------------------------

# Django:
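# Passing the field name as a string makes SearchRank wrap it in SearchVector,
# which is why the SQL below re-runs to_tsvector() on the column cast to text.
# If `search` is a SearchVectorField, SearchRank(F("search"), query) ranks the
# stored vector directly (the rank values may then differ).
query = SearchQuery("lord OR postgres", search_type="websearch")
(
    Book.objects.annotate(rank=SearchRank("search", query))
    .filter(search=query)
    .order_by("-rank")
    .values_list("title", "rank")
)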

# SQL:
#   SELECT "examples_book"."title" AS "title", ts_rank(to_tsvector(COALESCE(("examples_book"."search")::text, '')), websearch_to_tsquery('lord OR postgres')) AS "rank"
#   FROM "examples_book"
#   WHERE "examples_book"."search" @@ (websearch_to_tsquery('lord OR postgres'))
#   ORDER BY 2 DESC

# Result:
#   The Lord of the Rings: 0.0304
#   Postgres for Authors: 0.0304

# ----------------------------------------------------------------------
# trigram similarity to 'potter'
# ----------------------------------------------------------------------

# Django:
# sim ranges from 0 (no shared trigrams) to 1 (identical trigram sets).
(
    Book.objects.annotate(sim=TrigramSimilarity("title", "potter"))
    .filter(sim__gt=0.2)
    .order_by("-sim")
    .values_list("title", "sim")
)

# SQL:
#   SELECT "examples_book"."title" AS "title", SIMILARITY("examples_book"."title", 'potter') AS "sim"
#   FROM "examples_book"
#   WHERE SIMILARITY("examples_book"."title", 'potter') > 0.2
#   ORDER BY 2 DESC

# Result:
#   Harry Potter: 0.538

# ----------------------------------------------------------------------
# __unaccent: 'muller' matches 'Müller'
# ----------------------------------------------------------------------

# Django:
Author.objects.filter(name__unaccent__icontains="muller")

# SQL:
#   SELECT "examples_author"."id", "examples_author"."name", "examples_author"."bio", "examples_author"."born", "examples_author"."rating", "examples_author"."nickname"
#   FROM "examples_author"
#   WHERE UPPER(UNACCENT("examples_author"."name")::text) LIKE '%' || UPPER(REPLACE(REPLACE(REPLACE((UNACCENT('muller')), E'\\', E'\\\\'), E'%', E'\\%'), E'_', E'\\_')) || '%'

# Result:
#   Karl Müller
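
The 0.538 score in the trigram example can be reproduced by hand. The sketch below is a plain-Python approximation of how pg_trgm scores two strings, not the extension's actual code: it assumes ASCII input, lowercases it, pads each word with two leading spaces and one trailing space, and divides the number of shared trigrams by the size of the combined set. The helper names `trigrams` and `similarity` are made up for illustration.

```python
import re


def trigrams(text: str) -> set:
    """Return the set of trigrams for text, using pg_trgm-style word padding."""
    result = set()
    # Lowercase, then treat anything that is not an ASCII letter or digit
    # as a word separator (a simplification of pg_trgm's locale-aware rules).
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by the size of the union of both trigram sets."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# "Harry Potter" yields 13 trigrams and "potter" yields 7, all 7 shared: 7 / 13.
print(round(similarity("Harry Potter", "potter"), 3))  # 0.538
```

Because every extra word in the title adds trigrams to the union, longer titles score lower against a short search term, which is one reason to filter with a threshold below 1.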

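To see why `name__unaccent__icontains="muller"` matches 'Karl Müller', the comparison can be imitated in plain Python. This is only a rough stand-in: PostgreSQL's `unaccent()` uses its own rules file rather than Unicode decomposition, so results can differ for some characters. The helper names `strip_accents` and `unaccent_icontains` are hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after NFKD decomposition (approximates unaccent())."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def unaccent_icontains(haystack: str, needle: str) -> bool:
    """Accent-insensitive, case-insensitive substring test.

    Mirrors the shape of the generated SQL: both sides are unaccented,
    then uppercased, then compared as a substring match.
    """
    return strip_accents(needle).upper() in strip_accents(haystack).upper()


print(strip_accents("Müller"))                       # Muller
print(unaccent_icontains("Karl Müller", "muller"))   # True
```
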