top of page

Writing SQL in C# or When You should not use ORM


I noticed that when it becomes necessary to work with SQL databases in .Net applications, developers often chose some ORM library (.Net Entity Framework in most cases) without even considering other alternatives. This is a reasonable decision at first sight, since for popular ORM libraries there are many tutorials, great toolkit, and a lot of experienced developers. However, it is not always a good choice — I would at least consider other options if your application:

  1. Mainly works with fact relationships rather than objects

  2. Requires using dynamic (not predefined) queries


Fact Relationships

If your application mainly implements use cases similar to these ones:

  1. “Give me the list of employees with the highest salary broken down by department”

  2. “Give a discount to all customers registered in April”

  3. “Reduce the price of T-shirts from last year’s collection by 20%”

then your application is focused on fact relationships rather than objects since all these scenarios involve handling an unpredictable amount of data. For example, organization might have 2 departments and 30 employees or hundreds of departments and hundreds of thousands of employees. If some ORM is used, it is assumed that all the data is loaded into memory, then analyzed and, in the case of write operations, uploaded back into the SQL database. Obviously, this cannot but affect performance.


Note. Modern ORM libraries have many ways to solve the performance problems such as LINQ expressions convertible to SQL, lazy loading, etc. But we must remember that these are just tricks that partially mitigate the conceptual issue and problems inevitably arise at some point.


On the other hand, SQL is specially designed to deal with fact relationships and the scenarios described above could be implemented with a couple relatively simple expressions. Obviously, it makes sense to use pure SQL without any ORMs here, but what options do we have? Not so many. The common ones will be:

  1. Stored Procedures

  2. Text based queries

  3. Some LINQ based query builders (e.g. “LINQ to DB”)

  4. My library “SqExpress” :)


Stored Procedures

Stored procedures are the first thing that comes to mind when you need to use the full power of native SQL, but this approach has several serious disadvantages:

  1. Difficulties in maintenance — it is always necessary to make sure that the .Net code is synchronized with code of stored procedures in all database instances that your application works with. This means that you are doomed to mess with migration scripts in almost every deployment and inevitably something will go wrong at some point.

  2. SQL is a great query language, but a very bad programming language — Once you start writing stored procedures, it will be difficult to resist the temptation to move as much logic into them as possible, because it is such an easy way to solve performance problems and quickly fix issues in application design. But the downside is that you will not be able to break your logic into small modules and test them in isolation (how often I have seen “super” procedures for 5 thousand lines with hundreds of variables inside)

  3. Copy-paste — The code in procedures is very difficult to reuse (functions, however, partly solve this problem), so some common logic will be duplicated in many procedures.

  4. It is difficult to implement batch processing. UpdateUserById — it is the typical name for a stored procedure. But what if I have 100 users? No problem! Let’s call the procedure 100 times! The only problem is that it will take 100 times longer (Some databases support Table-Valued Parameters, but such parameters are difficult to use and maintain).

  5. Stored procedures don’t help if you need dynamic queries

  6. If you actively use stored procedures, then migrating to another SQL database will be incredibly difficult.


Text based queries

Queries resulting from simple string concatenation are pure evil. Such code is difficult to write, read, and maintain. Security is also a huge issue — Sql Injections are still relevant today.


However, surprisingly, there are some positive aspects to this approach:

  1. You can use the approach in addition to any other one

  2. It is still the most popular way to create dynamic queries which is available to everyone.


LINQ based query builders

Such query builders (e.g. “LINQ to DB”) can be a good alternative when you need reach SQL abilities. “LEFT”, “RIGHT”, “FULL”, “CROSS” JOINS etc. are no longer a problem — you can express them directly in C# code (using LINQ):

//Full Join
var query =
    from c in db.Category
    from p in db.Product.FullJoin(pr => pr.CategoryID == c.CategoryID)
    where !p.Discontinued
    select c;

CTEs are also supported

//CTE
var employeeSubordinatesReport  =
   from e in db.Employee
   select new
   {
      e.EmployeeID,
      e.LastName,
      e.FirstName,
      NumberOfSubordinates = db.Employee
          .Where(e2 => e2.ReportsTo == e.ReportsTo)
          .Count(),
      e.ReportsTo
   };

var employeeSubordinatesReportCte = employeeSubordinatesReport
                                       .AsCte("EmployeeSubordinatesReport");var result =
   from employee in employeeSubordinatesReportCte
   from manager in employeeSubordinatesReportCte
                      .LeftJoin(manager => employee.ReportsTo == manager.EmployeeID)
   select new
   {
      employee.LastName,
      employee.FirstName,
      employee.NumberOfSubordinates,
      ManagerLastName = manager.LastName,
      ManagerFirstName = manager.FirstName,
      ManagerNumberOfSubordinates = manager.NumberOfSubordinates
   };

However, when it comes to dynamic queries, LINQ is not the best solution. Theoretically it is possible to change the expression but it is not a simple action.


SqExpresss

For many years I was unhappy working with SQL from .Net, but then I wrote my own library (SqExpress) that allows me to write C # code as close to SQL as possible. It does not use LINQ — query building is implemented through helper functions and operator overloading. As a result, there are no problems with dynamic queries. Here’s an example:

//Sample url: users?field=LastName&field=FirstName&field=Address&NameFilter=John
[HttpGet]
public async Task<ContentResult> 
    Get(string nameFilter = null, [FromQuery]string[] field=null)
{
    //Table descriptors
    var user = Tables.GetUser();
    var userExtraFields=Tables.GetUserExtraFields();
    
    //Selecting column list
    var columns = new List<ExprColumn> {user.UserId};
    
    //Adding requested fields
    //If there are fields from the second table it will be joined
    bool joinExtraTable = false;
    if (field! = null && field.Length>0)    
    {
        columns.AddRange(user.Columns.Where
            (c=>field.Any(f=>string.Equals
            (f, c.ColumnName.Name, 
            StringComparison.InvariantCultureIgnoreCase))));
        int l = columns.Count;
        columns.AddRange(userExtraFields.Columns.Where
            (c=>field.Any(f=>string.Equals(f, c.ColumnName.Name, 
            StringComparison.InvariantCultureIgnoreCase))));
            joinExtraTable=l<columns.Count;    
     }
     
     //Result filter
     var filter = !string.IsNullOrEmpty(nameFilter)
         //Operators overload is used here
         ?user.FirstName==nameFilter|user.LastName==nameFilter
         :null;
     
     //!!!!!!
     //Building the base SQL expression
     //!!!!!!
     var selectExpression=Select(columns)        
         .From(user)        
         .Where(filter)        
         .OrderBy(user.FirstName, user.LastName)        
         .Done();
      
      //LEFT join with the second table (if it is required)
      if (joinExtraTable)    
      {
          selectExpression=selectExpression            
              .WithLeftJoin(
                  userExtraFields,
                  on: userExtraFields.UserId==user.UserId);    
       }
       
       //Request to database
       var result = await selectExpression.Query
       (this. _database, new JArray(),        
           (array,r) =>        
           {
           //All requested fields are put into a json object
           JObjecto = new JObject();
           foreach (var column in columns)            
           {
               switch (column)                
               {
                   case Int32TableColumncol:
                       o[column.ColumnName.Name] =col.Read(r);
                       break;
                   
                   case StringTableColumncol:
                     //LEFT JOIN might lead to NULL
                     o[column.ColumnName.Name] =col.ReadNullable(r);
                     break;
                   
                   case NullableStringTableColumncol:
                       o[column.ColumnName.Name] =col.Read(r);
                      break;
                  default: throw new Exception("Unknown column type");                
               }            
         }
         array.Add(o);
         
         return array;        
    });
    
  return new ContentResult    
  {
      Content=result.ToString(),
      ContentType="application/json",
      StatusCode=200



Still ORM?

There is one more question left: “When ORM is a good solution?” Well.. let’s just invert those conditions under which ORM is not recommended to use:

  1. In the vast majority of scenarios, you can accurately predict the number of rows in database tables that will be read or modified (that means that all the data can be mapped to objects)

  2. No frequent need in dynamic queries

If these conditions are met, then most likely ORM is the right choice.



Source: paper.li


The Tech Platform

0 comments

Comments


bottom of page