I noticed that when it becomes necessary to work with SQL databases in .Net applications, developers often chose some ORM library (.Net Entity Framework in most cases) without even considering other alternatives. This is a reasonable decision at first sight, since for popular ORM libraries there are many tutorials, great toolkit, and a lot of experienced developers. However, it is not always a good choice — I would at least consider other options if your application:
Mainly works with fact relationships rather than objects
Requires using dynamic (not predefined) queries
Fact Relationships
If your application mainly implements use cases similar to these ones:
“Give me the list of employees with the highest salary broken down by department”
“Give a discount to all customers registered in April”
“Reduce the price of T-shirts from last year’s collection by 20%”
then your application is focused on fact relationships rather than objects since all these scenarios involve handling an unpredictable amount of data. For example, organization might have 2 departments and 30 employees or hundreds of departments and hundreds of thousands of employees. If some ORM is used, it is assumed that all the data is loaded into memory, then analyzed and, in the case of write operations, uploaded back into the SQL database. Obviously, this cannot but affect performance.
Note. Modern ORM libraries have many ways to solve the performance problems such as LINQ expressions convertible to SQL, lazy loading, etc. But we must remember that these are just tricks that partially mitigate the conceptual issue and problems inevitably arise at some point.
On the other hand, SQL is specially designed to deal with fact relationships and the scenarios described above could be implemented with a couple relatively simple expressions. Obviously, it makes sense to use pure SQL without any ORMs here, but what options do we have? Not so many. The common ones will be:
Stored Procedures
Text based queries
Some LINQ based query builders (e.g. “LINQ to DB”)
My library “SqExpress” :)
Stored Procedures
Stored procedures are the first thing that comes to mind when you need to use the full power of native SQL, but this approach has several serious disadvantages:
Difficulties in maintenance — it is always necessary to make sure that the .Net code is synchronized with code of stored procedures in all database instances that your application works with. This means that you are doomed to mess with migration scripts in almost every deployment and inevitably something will go wrong at some point.
SQL is a great query language, but a very bad programming language — Once you start writing stored procedures, it will be difficult to resist the temptation to move as much logic into them as possible, because it is such an easy way to solve performance problems and quickly fix issues in application design. But the downside is that you will not be able to break your logic into small modules and test them in isolation (how often I have seen “super” procedures for 5 thousand lines with hundreds of variables inside)
Copy-paste — The code in procedures is very difficult to reuse (functions, however, partly solve this problem), so some common logic will be duplicated in many procedures.
It is difficult to implement batch processing. UpdateUserById — it is the typical name for a stored procedure. But what if I have 100 users? No problem! Let’s call the procedure 100 times! The only problem is that it will take 100 times longer (Some databases support Table-Valued Parameters, but such parameters are difficult to use and maintain).
Stored procedures don’t help if you need dynamic queries
If you actively use stored procedures, then migrating to another SQL database will be incredibly difficult.
Text based queries
Queries resulting from simple string concatenation are pure evil. Such code is difficult to write, read, and maintain. Security is also a huge issue — Sql Injections are still relevant today.
However, surprisingly, there are some positive aspects to this approach:
You can use the approach in addition to any other one
It is still the most popular way to create dynamic queries which is available to everyone.
LINQ based query builders
Such query builders (e.g. “LINQ to DB”) can be a good alternative when you need reach SQL abilities. “LEFT”, “RIGHT”, “FULL”, “CROSS” JOINS etc. are no longer a problem — you can express them directly in C# code (using LINQ):
//Full Join
var query =
from c in db.Category
from p in db.Product.FullJoin(pr => pr.CategoryID == c.CategoryID)
where !p.Discontinued
select c;
CTEs are also supported
//CTE
var employeeSubordinatesReport =
from e in db.Employee
select new
{
e.EmployeeID,
e.LastName,
e.FirstName,
NumberOfSubordinates = db.Employee
.Where(e2 => e2.ReportsTo == e.ReportsTo)
.Count(),
e.ReportsTo
};
var employeeSubordinatesReportCte = employeeSubordinatesReport
.AsCte("EmployeeSubordinatesReport");var result =
from employee in employeeSubordinatesReportCte
from manager in employeeSubordinatesReportCte
.LeftJoin(manager => employee.ReportsTo == manager.EmployeeID)
select new
{
employee.LastName,
employee.FirstName,
employee.NumberOfSubordinates,
ManagerLastName = manager.LastName,
ManagerFirstName = manager.FirstName,
ManagerNumberOfSubordinates = manager.NumberOfSubordinates
};
However, when it comes to dynamic queries, LINQ is not the best solution. Theoretically it is possible to change the expression but it is not a simple action.
SqExpresss
For many years I was unhappy working with SQL from .Net, but then I wrote my own library (SqExpress) that allows me to write C # code as close to SQL as possible. It does not use LINQ — query building is implemented through helper functions and operator overloading. As a result, there are no problems with dynamic queries. Here’s an example:
//Sample url: users?field=LastName&field=FirstName&field=Address&NameFilter=John
[HttpGet]
public async Task<ContentResult>
Get(string nameFilter = null, [FromQuery]string[] field=null)
{
//Table descriptors
var user = Tables.GetUser();
var userExtraFields=Tables.GetUserExtraFields();
//Selecting column list
var columns = new List<ExprColumn> {user.UserId};
//Adding requested fields
//If there are fields from the second table it will be joined
bool joinExtraTable = false;
if (field! = null && field.Length>0)
{
columns.AddRange(user.Columns.Where
(c=>field.Any(f=>string.Equals
(f, c.ColumnName.Name,
StringComparison.InvariantCultureIgnoreCase))));
int l = columns.Count;
columns.AddRange(userExtraFields.Columns.Where
(c=>field.Any(f=>string.Equals(f, c.ColumnName.Name,
StringComparison.InvariantCultureIgnoreCase))));
joinExtraTable=l<columns.Count;
}
//Result filter
var filter = !string.IsNullOrEmpty(nameFilter)
//Operators overload is used here
?user.FirstName==nameFilter|user.LastName==nameFilter
:null;
//!!!!!!
//Building the base SQL expression
//!!!!!!
var selectExpression=Select(columns)
.From(user)
.Where(filter)
.OrderBy(user.FirstName, user.LastName)
.Done();
//LEFT join with the second table (if it is required)
if (joinExtraTable)
{
selectExpression=selectExpression
.WithLeftJoin(
userExtraFields,
on: userExtraFields.UserId==user.UserId);
}
//Request to database
var result = await selectExpression.Query
(this. _database, new JArray(),
(array,r) =>
{
//All requested fields are put into a json object
JObjecto = new JObject();
foreach (var column in columns)
{
switch (column)
{
case Int32TableColumncol:
o[column.ColumnName.Name] =col.Read(r);
break;
case StringTableColumncol:
//LEFT JOIN might lead to NULL
o[column.ColumnName.Name] =col.ReadNullable(r);
break;
case NullableStringTableColumncol:
o[column.ColumnName.Name] =col.Read(r);
break;
default: throw new Exception("Unknown column type");
}
}
array.Add(o);
return array;
});
return new ContentResult
{
Content=result.ToString(),
ContentType="application/json",
StatusCode=200
Still ORM?
There is one more question left: “When ORM is a good solution?” Well.. let’s just invert those conditions under which ORM is not recommended to use:
In the vast majority of scenarios, you can accurately predict the number of rows in database tables that will be read or modified (that means that all the data can be mapped to objects)
No frequent need in dynamic queries
If these conditions are met, then most likely ORM is the right choice.
Source: paper.li
The Tech Platform
Comments