top of page
Writer's pictureThe Tech Platform

Full-Text Search with Entity Framework Core & SQL Server

Updated: May 26, 2023

When it comes to searching for addresses in a database table, a typical approach involves using the SQL Like (%) operator for an exact match. However, this method becomes impractical when dealing with large datasets containing millions of addresses, as it can take an incredibly long time to return results. This limitation makes it unsuitable for enterprise applications.


To address this issue, Full-Text Search offers a faster alternative. It provides a highly efficient query language for quickly searching through extensive datasets. However, using Full-Text Search requires careful consideration of various parameters to ensure it aligns with specific requirements.


In the case of rewriting the previous query using Full-Text Search, it would look like this:

SELECT * FROM dbo.Addresses a WHERE CONTAINS(a.FullAddress, ' "1/18 Clifford" ')

There are a few important aspects to note when working with Full-Text Search:


1. Exact Match in Full-Text Search:

By default, SQL discards commonly occurring strings in Full-Text Search indexes. This means that exact matches may not work if the stoplist is enabled. However, by including double quotation marks (" ") within the query, it is possible to disable the stoplist and enable exact matching. Disabling the stoplist will be explained later in the article.


2. Free-Text vs. Full-Text:

When it comes to efficient data retrieval from database columns, there are two types of searches: Free-Text and Full-Text. While Full-Text searches for exact word matches, Free-Text also considers the semantic meaning of the words for potential matches. For address lookup purposes, Full-Text search is used. The CONTAINS operator is used for Full-Text search, while FREETEXT is used for Free-Text search.


3. Setting Up SQL Server for Full-Text Search:

Setting up Full-Text Search involves several elements, which are discussed in detail later. These elements include the catalog, unique index, and the FullText index.


3.1 Catalog:

A Full-Text catalog serves as a logical container for a group of Full-Text indexes. Before creating a Full-Text index, a catalog must be created. The following command is used to create a catalog:

CREATE FULLTEXT CATALOG FTCAddress AS DEFAULT;

3.2 Unique Index:

After creating a catalog, a unique index is necessary to distinguish the results. In this case, relying on the table's primary key, which is a GUID column, can serve as the unique index.


3.3 Full-Text Index:

Finally, the Full-Text index is created using the catalog and the unique index with the following command:

CREATE FULLTEXT INDEX ON dbo.Addresses(FullAddress) KEY INDEX PK_Addresses ON FTCAddress WITH STOPLIST = OFF, CHANGE_TRACKING AUTO;

The command creates the index on the FullAddress column of the Addresses table, specifying the unique index and catalog. Disabling the stoplist is achieved by setting it to OFF for the purpose of the work.


3.3.1 Maintaining the Full-Text Index:

Once the Full-Text index is created, the database starts populating it. While an initial full population is required, it's essential to address the issue of data changes, such as adding or removing rows. There are three methods to adjust the index:

  • Full Population: Performing another full population is resource-intensive and affects searches while the population is in progress. Therefore, it's recommended to rely on the first full population and utilize one of the following methods for subsequent changes.

  • Change Tracking: SQL Engine internally tracks changes in a separate table and applies those changes to the Full-Text index. This method incurs a minor overhead on the database.

  • Time Stamp: This method requires adding a Time Stamp column to the Address table and tracking the timestamps to identify changed rows.


4. Entity Framework Setup

Now it’s time to set up our ASP.NET API and Entity Framework to run the Full-Text query.

As we are using Code-First, every change that goes into our database should be through codified Migrations managed by the ASP.NET API.


The first step in doing so is to create a database migration:

dotnet-ef migrations — startup-project ..\Api\LocationApi.csproj add FullAddressIndex — verbose

This creates a migration in which we can inject all the Catalogue, and Full-Text index setup:

public partial class FullAddressFullTextIndex : Migration
{
    protected override void Up(MigrationBuilder  migrationBuilder)    
    {
        string sql = GeneralFunctions.ReadSql
                (typeof(FullAddressFullTextIndex), 
                "20220702124309_FullAddressFullTextIndex.sql");
        migrationBuilder.Sql(sql, true);    
    }
    
    protected override void Down(MigrationBuilder migrationBuilder)    
    {
        string sql = GeneralFunctions.ReadSql
                (typeof(FullAddressFullTextIndex), 
                "20220702124309_FullAddressFullTextIndexUndo.sql");
        migrationBuilder.Sql(sql, true);    
    }
}

To keep my migrations clean, put the actual SQL scripts in a separate file, and loaded them at the time of migration as embedded resources.


Here is the Roll Forward migration:

IF  EXISTS (SELECT * FROM sys.fulltext_indexes fti WHERE fti.object_id = OBJECT_ID(N'[dbo].[Addresses]'))
ALTER FULLTEXT INDEX ON [dbo].[Addresses] DISABLE

GO
IF  EXISTS (SELECT * FROM sys.fulltext_indexes fti WHERE fti.object_id = OBJECT_ID(N'[dbo].[Addresses]'))
BEGIN	
    DROP FULLTEXT INDEX ON [dbo].[Addresses]
End

Go
IF EXISTS (SELECT * FROM sys.fulltext_catalogs WHERE [name]='FTCAddress')   
BEGIN	
    DROP FULLTEXT CATALOG FTCAddress 
END

CREATE FULLTEXT CATALOG FTCAddress AS DEFAULT;
CREATE FULLTEXT INDEX ON dbo.Addresses(FullAddress) KEY INDEX PK_Addresses ON FTCAddress WITH STOPLIST = OFF, CHANGE_TRACKING AUTO;

and here is the Roll Back migration:

DROP FULLTEXT INDEX on dbo.Addresses;
DROP FULLTEXT CATALOG FTCAddress;

Once run, the initial Full Population gets started, and after some time, the Full-Text Queries can be run against the database.


But how do you run a Full-Text Query from a .NET application?


5. Run Full-Text Query

To run Full-Text Queries, Entity Framework provides a set of helper functions to execute the Contains method using the EF construct as below:

public IQueryable<Address> GetFullText(
        Expression<Func<Address, string>> predicate,
        string query,
        int currentPage = 0,
        int pageSize = 0)
{
     if (pageSize == 0)    
     {
          return Context.Addresses            
               .Where(e=>EF.Functions.Contains(e.FullAddress, $"\"
               {query}\""));    
         }
     return Context.Addresses            
          .Where(e=>EF.Functions.Contains(e.FullAddress, $"\"
          {query}\""))            
          .Skip((currentPage-1) *pageSize)            
          .Take(pageSize);
}

And finally here is how to call this method to fetch all the records fast:

public async Task<IEnumerable<AddressViewModel>> GetAddresses(string query)
{
    var foundAddresses = await 
    LocationApiRepository.AddressRepository.GetFullText
    (t => t.FullAddress, query)        
        .Select(c=>newAddressViewModel()        
        {
            Id=c.Id,
            FullAddress=c.FullAddress,        
        })        
        .Take(30)        
        .ToListAsync();
    
    return foundAddresses;
}


Exploring Other Search Options:

In addition to Full-Text Search, there are alternative options available for efficient search operations. One such option is Azure Cognitive Search, which offers advanced semantic search capabilities. However, it comes with a higher cost, which grows as the data size increases.


By implementing Full-Text Search and understanding its nuances, developers can significantly enhance the speed and accuracy of address lookup operations within their applications.

0 comments

Comments


bottom of page