
Boost PDF files if name contains query


Hi!

I have a request from a client to boost PDF files when the name of the PDF contains the query, but I haven't found a good way to accomplish that.

Basically what I want to do is this:

var searchResult = SearchClient.Instance.UnifiedSearchFor(searchString, findLanguage)
                                          .BoostMatching(p => p.MatchTypeHierarchy(typeof(PdfFile)) & ((PdfFile)p).Name.Contains(searchString), 4)
                                          .ApplyBestBets()
                                          .Track()
                                          .Skip((page - 1) * hitsPerPage)
                                          .Take(hitsPerPage)
                                          .GetResult();

Of course, .BoostMatching(p => p.MatchTypeHierarchy(typeof(PdfFile)) & ((PdfFile)p).Name.Contains(searchString), 4) doesn't work because it can't be compiled to a filter, but it's the essence of what I want. I can't use AnyWordBeginsWith(), since the requirement isn't that a word in the name starts with the query. MatchFuzzy() won't work either because, as I understand it, it matches similar words, which isn't the requirement either.

Have I missed something, or is this just not possible? It seems quite simple, so I thought it would be possible. Hopefully I've just missed something :)

Best,
Petra

#188256
Feb 15, 2018 20:04

Try changing

.BoostMatching(p => p.MatchTypeHierarchy(typeof(PdfFile)) & ((PdfFile)p).Name.Contains(searchString), 4)

to this

.BoostMatching(p => p.MatchTypeHierarchy(typeof(PdfFile)) & ((PdfFile)p).Name.Match(searchString), 4)
#188267
Feb 16, 2018 12:15

But that would only match PDF files where the Name property matches the search query exactly, right? At least that's how I was taught Match works. I want PDF files where Name fully or partly contains the search query, so if the query is "brev", I want to boost PDF files named "Skicka brev" and so on.

#188271
Feb 16, 2018 12:45

Ah, you are right. Then I guess you have to add some kind of wildcard search extension.

Something similar to the FuzzyFilter method specified here:
https://world.episerver.com/forum/developer-forum/-Episerver-75-CMS/Thread-Container/2016/12/episerver-find---how-to-do-fuzzy-search/

I did a quick test: a Document with the Name "Sending letters" matched when using FuzzyFilter with "letters".

#188272
Feb 16, 2018 13:27

Hmm, do you mind showing how you use that filter? I saw that post but didn't think it would work together with the boosting. I don't just want a fuzzy search in general; I want to actually boost those items.

#188275
Feb 16, 2018 14:03

I just did a quick test to verify that the filter would actually match. I altered the extension to this:

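using System;
using System.Linq.Expressions;
using EPiServer.Find;
using EPiServer.Find.Api.Querying.Queries;

public static class FuzzySearchExtensions
{
    public static ITypeSearch<T> FuzzyFilter<T>(
        this ITypeSearch<T> search,
        Expression<Func<T, string>> fieldSelector,
        string almost,
        double? minSimilarity = null)
    {
        // Resolve the name of the analyzed field for the selected property
        var fieldName = search.Client.Conventions
            .FieldNameConvention
            .GetFieldNameForAnalyzed(fieldSelector);

        // Lowercase the term before building the fuzzy query
        var fuzzyQuery = new FuzzyQuery(fieldName, almost.ToLowerInvariant())
        {
            MinSimilarity = minSimilarity
        };

        // Add it to the search request body
        return new Search<T, FuzzyQuery>(search, context =>
        {
            if (context.RequestBody.Query != null)
            {
                // Keep the existing query and accept a hit on either clause
                var boolQuery = new BoolQuery();
                boolQuery.Should.Add(context.RequestBody.Query);
                boolQuery.Should.Add(fuzzyQuery);
                boolQuery.MinimumNumberShouldMatch = 1;
                context.RequestBody.Query = boolQuery;
            }
            else
            {
                // No query yet, so the fuzzy query becomes the whole query
                context.RequestBody.Query = fuzzyQuery;
            }
        });
    }
}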

And then used it like this:

_client.Search<Document>()
    .FuzzyFilter(d => d.Name, query, 1)
    .GetResult();
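
For the "contains" requirement, a wildcard variant of the same extension may be closer to what is needed. The following is only a rough, untested sketch: the class name WildcardSearchExtensions and the method name WildcardBoost are made up for illustration, and it assumes that WildcardQuery in EPiServer.Find.Api.Querying.Queries takes a field name and a pattern and exposes a Boost property. It wraps the term in asterisks so that a name like "Skicka brev" would be expected to match the query "brev".

```csharp
using System;
using System.Linq.Expressions;
using EPiServer.Find;
using EPiServer.Find.Api.Querying.Queries;

// Hypothetical helper, not part of the Find API.
public static class WildcardSearchExtensions
{
    public static ITypeSearch<T> WildcardBoost<T>(
        this ITypeSearch<T> search,
        Expression<Func<T, string>> fieldSelector,
        string term,
        double? boost = null)
    {
        // Nothing to match on, so leave the search untouched
        if (string.IsNullOrWhiteSpace(term))
        {
            return search;
        }

        // Same field name lookup as in FuzzyFilter above
        var fieldName = search.Client.Conventions
            .FieldNameConvention
            .GetFieldNameForAnalyzed(fieldSelector);

        // Assumption: WildcardQuery(field, pattern) with an optional Boost.
        // Leading and trailing asterisks give "contains" semantics per term.
        var wildcardQuery = new WildcardQuery(
            fieldName, "*" + term.Trim().ToLowerInvariant() + "*")
        {
            Boost = boost
        };

        return new Search<T, WildcardQuery>(search, context =>
        {
            if (context.RequestBody.Query != null)
            {
                // Combine with the existing query; a hit on the wildcard
                // clause adds its boosted score to the document
                var boolQuery = new BoolQuery();
                boolQuery.Should.Add(context.RequestBody.Query);
                boolQuery.Should.Add(wildcardQuery);
                boolQuery.MinimumNumberShouldMatch = 1;
                context.RequestBody.Query = boolQuery;
            }
            else
            {
                context.RequestBody.Query = wildcardQuery;
            }
        });
    }
}
```

It would be called the same way as FuzzyFilter, for example .WildcardBoost(d => d.Name, query, 4). Note that, like FuzzyFilter, this also lets in documents that match only the wildcard clause, and it does nothing to limit the boost to PDF files, so it would have to run on a search that is already typed or filtered to PdfFile. Leading wildcards can also be slow on large indexes.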
#188277
Feb 16, 2018 14:10

That works fine on its own, as a separate filter that matches the Name property against the query (which could of course be useful), but it doesn't really work with boosting PDF files specifically, which is what I want. However, I'm not sure where the requirement came from or what the background is, so I'm going to dig a bit deeper into that to see if this could be a solution.

Thanks a bunch for this!

#188284
Feb 16, 2018 18:12