Torvald's Tech Tales

Relevancy-Sorted Searching Using Fuzzy Logic

2022-10-02 18:08 Share This Article

This article tries to explain the relevancy-sorted searching implemented on the Furdex.

If you know a thing or two about the search engine, using the Fuzzy Logic for this problem is almost a basic thing; and this article, indeed, is going to explain the basics.


Parsing the Search Query

This is an almost completely different topic, but I’m going to explain the gist of it to make the following sections more sense.

The website runs customised version of the TerranBASIC. The user-entered search form is examined and the searching conditions are translated into the TerranBASIC syntax and the tiny BASIC interpreter is executed for each records on the website’s database.

Each operator of the TerranBASIC is a Javascript function that takes the given condition and a record on the database to check if the condition is met, then returns the result.

Relevancy Scoring

Normally, a predicate function (e.g. “does this entry’s colour combination contain x and y?”) returns a boolean value: Yes or No. This is not useful when we need to know how closely the condition was matched, but we can modify it to return a relevancy between 0% and 100%.

Let’s say you search for the colour combination of [Indigo, Violet] over the following records:

KeyValuesRelevancy
TaimuIndigo, Violet, White, Black???
CharBlue, Violet, Indigo, White???
MariBlue, Indigo, Violet, White???
SpicyViolet, Indigo, White???

and the relevancy is defined as following:

Relevancy= MatchedRank+UnmatchedRank 2

where:

MatchedRank= 1 2one-based matched index
UnmatchedRank= 1 Count of the unmatched+1

If we apply the equation to our records, this is the outcome:

KeyValuesMatchedUnmatchedRelevancy
TaimuIndigo, Violet, White, Black[1,2]→0.752→0.3330.5417
CharBlue, Violet, Indigo, White[3,2]→0.3752→0.3330.3542
MariBlue, Indigo, Violet, White[2,3]→0.3752→0.3330.3542
SpicyViolet, Indigo, White[1,2]→0.751→0.50.625

and thus, our search result will be sorted as either [Spicy, Taimu, Char, Mari], or [Spicy, Taimu, Mari, Char].

Notes:

Fuzzifying the Boolean Logic

But the relevancy scoring only solves the half of our problem: the boolean logic runs over boolean values, not real numbers; and this is where the Fuzzy Logic comes in: we are going to expand the boolean functions over a domain of real numbers.

The AND, OR, and NOT will be redefined as such:

FunctionExpressionRemarks
AND p qp × q
NOT p1 - p
OR p q1 - (1 - p)(1 - q)(1 - …)derived from NOT(AND(NOT(p), NOT(q)))

And if you put 1 or 0 over the p and q, the functions work identical to their boolean counterparts.

🖘 Go Back to General