Porting LINQ-2-SQL to Dapper for great justice
about 13 years ago
I got this email over the weekend:
I created the page which is extra embarrassing. Mister performance created a big fat performance mess. It is not the most important page on the site, it is only hit a few 100 times a day at most - however - perfomance is a feature.
First thing I do when I see these issues is dig in with MiniProfiler.
In seconds the reason for the big slowdown is isolated. The ugly looking query is LINQ-2-SQL doing its magic. The 700ms+ gap is LINQ-2-SQL generating SQL from an expression and starting to materialize objects. It takes it almost a second after that to pull all the objects out. This happens every damn hit.
Enough is enough. Time to port it to Dapper.
There is a snag though, the page is using some pretty fancy expression stuff.
var baseQuery = Current.DB.TagSynonyms as IQueryable<TagSynonym>;
if (filter == TagSynonymsViewModel.Filter.Active)
{
baseQuery = baseQuery.Where(t => t.ApprovalDate != null);
}
else if (filter == TagSynonymsViewModel.Filter.Suggested)
{
baseQuery = baseQuery.Where(t => t.ApprovalDate == null);
if (!CurrentUser.IsAnonymous && !CurrentUser.IsModerator)
{
// exclude ones I can not vote on
baseQuery = from ts in Current.DB.TagSynonyms
join tags in Current.DB.Tags on ts.TargetTagName equals tags.Name
join stats in Current.DB.UserTagTotals on
new { Id = tags.Id, UserId = CurrentUser.Id }
equals
new { Id = stats.TagId, UserId = stats.UserId }
where ts.ApprovalDate == null && stats.TotalAnswerScore > Current.Site.Settings.TagSynonyms.TagScoreRequiredToVote
select ts;
}
}
switch (tab.Value)
{
case TagSynonymsViewModel.Tab.Newest:
baseQuery = baseQuery.OrderByDescending(s => s.CreationDate);
break;
case TagSynonymsViewModel.Tab.Master:
baseQuery = baseQuery.OrderBy(t => t.TargetTagName).ThenBy(t => t.AutoRenameCount);
break;
case TagSynonymsViewModel.Tab.Synonym:
baseQuery = baseQuery.OrderBy(t => t.SourceTagName).ThenBy(t => t.AutoRenameCount);
break;
case TagSynonymsViewModel.Tab.Votes:
baseQuery = baseQuery.OrderByDescending(t => t.Score).ThenBy(t => t.TargetTagName).ThenBy(t => t.AutoRenameCount);
break;
case TagSynonymsViewModel.Tab.Creator:
baseQuery = (
from s in baseQuery
join users1 in Current.DB.Users on s.OwnerUserId equals users1.Id into users1temp
from users in users1temp.DefaultIfEmpty()
orderby users == null ? "" : users.DisplayName
select s
);
break;
case TagSynonymsViewModel.Tab.Renames:
baseQuery = baseQuery.OrderByDescending(t => t.AutoRenameCount).ThenBy(t => t.TargetTagName);
break;
default:
break;
}
if (search != null)
{
baseQuery = baseQuery.Where(t => t.SourceTagName.Contains(search) || t.TargetTagName.Contains(search));
}
var viewModel = new TagSynonymsViewModel();
viewModel.CurrentTab = tab.Value;
viewModel.CurrentFilter = filter.Value;
viewModel.TagSynonyms =
(
from synonym in baseQuery
join sourceTagsTemp in Current.DB.Tags on synonym.SourceTagName equals sourceTagsTemp.Name into sourceTagsTemp1
join targetTagsTemp in Current.DB.Tags on synonym.TargetTagName equals targetTagsTemp.Name into targetTagsTemp1
from sourceTag in sourceTagsTemp1.DefaultIfEmpty()
from targetTag in targetTagsTemp1.DefaultIfEmpty()
where filter != TagSynonymsViewModel.Filter.Merge || (synonym.ApprovalDate != null && sourceTag != null && sourceTag.Count > 0)
select new TagSynonymsViewModel.TagSynonymRow() { SourceTag = sourceTag, TargetTag = targetTag, TagSynonym = synonym }
).ToPagedList(page.Value, pageSize.Value);
return viewModel;
Depending on the input parameters the code could be running one of 30 or so variations of the same SQL. Translating this to SQL without an API that allows you to compose a query can easily result in spaghetti code that is very hard to maintain. You need to attach different parameters to the query, depending on the inputs. You need to be able to sort it in many ways. You need to be able to page through the results.
To overcome this mess I created a new SqlBuilder that is somewhat inspired by PetaPoco’s one. It lives in Dapper.Contrib. Even in its early prototype phase it is incredibly powerful:
var builder = new SqlBuilder();
int start = (page.Value - 1) * pageSize.Value + 1;
int finish = page.Value * pageSize.Value;
var selectTemplate = builder.AddTemplate(
@"select X.*, st1.*, tt1.*, u1.* from
(
select ts.*, ROW_NUMBER() OVER (/**orderby**/) AS RowNumber from TagSynonyms ts
left join Tags st on SourceTagName = st.Name
left join Tags tt on TargetTagName = tt.Name
/**leftjoin**/
/**where**/
) as X
left join Tags st1 on SourceTagName = st1.Name
left join Tags tt1 on TargetTagName = tt1.Name
left join Users u1 on u1.Id = X.OwnerUserId
where RowNumber between @start and @finish", new { start, finish }
);
var countTemplate = builder.AddTemplate(@"select count(*) from TagSynonyms ts
left join Tags st on SourceTagName = st.Name
left join Tags tt on TargetTagName = tt.Name
/**leftjoin**/
/**where**/");
if (filter == TagSynonymsViewModel.Filter.Active)
{
builder.Where("ts.ApprovalDate is not null");
}
else if (filter == TagSynonymsViewModel.Filter.Suggested)
{
builder.Where("ts.ApprovalDate is null");
if (!CurrentUser.IsAnonymous && !CurrentUser.IsModerator)
{
builder.Where(@"ts.TargetTagName in (
select Name from Tags where Id in
(select Id from UserTagTotals where UserId = @CurrentUserId and TotalAnswerScort > @TagScoreRequiredToVote)
)", new { CurrentUserId = CurrentUser.Id, Current.Site.Settings.TagSynonyms.TagScoreRequiredToVote});
}
}
switch (tab.Value)
{
case TagSynonymsViewModel.Tab.Newest:
builder.OrderBy("ts.CreationDate desc");
break;
case TagSynonymsViewModel.Tab.Master:
builder.OrderBy("ts.TargetTagName asc, ts.AutoRenameCount desc");
break;
case TagSynonymsViewModel.Tab.Synonym:
builder.OrderBy("ts.SourceTagName asc, ts.AutoRenameCount desc");
break;
case TagSynonymsViewModel.Tab.Votes:
builder.OrderBy("ts.Score desc, TargetTagName asc, AutoRenameCount desc");
break;
case TagSynonymsViewModel.Tab.Creator:
builder.LeftJoin("Users u on u.Id = ts.OwnerUserId");
builder.OrderBy("u.DisplayName");
break;
case TagSynonymsViewModel.Tab.Renames:
builder.OrderBy("ts.AutoRenameCount desc, ts.TargetTagName");
break;
default:
break;
}
if (search != null)
{
builder.Where("(SourceTagName like @search or TargetTagName like @search)", new { search = "%" + search + "%"});
}
if (filter.Value == TagSynonymsViewModel.Filter.Merge)
{
builder.Where("ApprovalDate is not null and isnull(st.Count,0) > 0");
}
var viewModel = new TagSynonymsViewModel();
viewModel.CurrentTab = tab.Value;
viewModel.CurrentFilter = filter.Value;
int count = Current.DB.Query<int>(countTemplate.RawSql, countTemplate.Parameters).First();
var rows = Current.DB.Query<TagSynonym, Tag, Tag, User, TagSynonymsViewModel.TagSynonymRow>(
selectTemplate.RawSql,
(ts,t1,t2,u) =>
{
var row = new TagSynonymsViewModel.TagSynonymRow { TagSynonym = ts, SourceTag = t1, TargetTag = t2 };
row.TagSynonym.User = u;
return row;
},
selectTemplate.Parameters);
var list = new PagedList<TagSynonymsViewModel.TagSynonymRow>(rows, page.Value, pageSize.Value,
forceIndexInBounds: true, prePagedTotalCount: count);
viewModel.TagSynonyms = list;
return viewModel;
Not only is resulting code is both simpler and way faster it is also much more flexible. I no longer need to juggle LINQ’s horrible LEFT JOIN
semantics in my head. I can easily tell when I am attaching parameters and how.
This new implementation retained all of the complex conditional selection and counting logic and happens to be 2 or so orders of magnitude faster.
###How the SqlBuilder works
The SqlBuilder allows you to generate N SQL templates from a composed query.
Here is a trivial example:
var builder = new SqlBuilder();
var count = builder.AddTemplate("select count(*) from table /**where**/");
var selector = builder.AddTemplate("select * from table /**where**/ /**orderby**/");
builder.Where("a = @a", new {a = 1});
// defaults to composing with AND
builder.Where("b = @b", new {b = 2})
builder.OrderBy("a");
// defaults to composing with ,
builder.OrderBy("b");
var count = cnn.Query(count.RawSql, count.Parameters).Single();
var rows = cnn.Query(selector.RawSql, selector.Parameters);
// Same as:
var count = cnn.Query("select count(*) from table where a = @a and b = @b", new {a=1,b=1});
var rows = cnn.Query("select * from table where a = @a and b = @b order by a, b", new {a=1,b=1});
This API is not finalized. However, this simple templating solution allows us to solve most of the complex need for expressions and still have clean and maintainable code.
We dug ourselves into this performance mess, we are slowly paying back the technical debt of using LINQ-2-SQL. Are you?
Compiled queries dont work in this case? is it because the query is being composed?