Foliotek Developer Blog

SQL Query Optimization for Reporting: a Different Approach

Synopsis–Not Just Faster Queries, but Fewer

Several months ago, we identified that some of the reports on our site were simply taking extremely too long to run (sometimes 30 minutes to an hour even). ?Users were getting frustrated, and it just made the system annoying to use. ?Consequently, we began investigating how we could best optimize the report queries. ?(We were using SQL Server with Linq.) ?What we eventually found was that what we needed was not just faster queries–but fewer.

The Old Approach

Here’s an example of how the reports were being generated:

  • Query the database for all the 1st Level Options associated with the selected Item
  • Bind this to a repeater.
  • For each repeater item/1st LevelOption (on data bind) - Query the database for all the different 2nd Level Options associated with the current 1st Level Option
  • Bind this to a repeater.
  • For each 2nd Level Option (on data bind) - Query the database to get all the 3rd Level options
  • Bind this to a repeater.
  • For each 3rd level option - Compute any necessary values for the report

Our initial approach was to identify which of the queries were taking the longest amount of time and optimize them as best we could (computed queries, adding indexes, etc.). ?While we did see marginal improvements, what we really needed was something that would give us drastic improvements–not just marginal ones.

The Problem

What we realized was that no matter how optimized the inner queries were, each of those queries was being run hundreds and even thousands of times, so there was a real limit to how much improvement we could ever gain in this way. ?In fact, some of the inner queries would be run, only to find that there was no data to be returned, making the time it took to run the query a complete waste.

The New Approach

The new approach we took was to get all of the data we would need for the report at the very beginning. ?So, we went from hundreds or thousands of queries (depending on the report data) to 3-5 queries that returned all the data we would need. ?Then in code, we just filtered these results when we needed them.

The Results

Admittedly, the first time I ran the report after implementing the new approach (and fixing the errors that arose), I thought something was wrong. ?What had been taking 30 minutes had just taken a few seconds and was giving the same data. ?Thinking something weird had happened, I reran the report from the start and found once again that* it completed in just a few seconds. *

The Conclusion

So, what we needed in this case was not just faster queries, but fewer queries. Before, we were running a few queries that returned very small amounts of data and doing this hundreds, thousands, and even tens of thousands times. ?Now, we run a few queries that return a lot of data only one time a piece. ?We continued to implement this strategy across all of our reports, and the results were quite significant in every case, making our system more responsive and easier to use.

*Note. *We wondered how the system would respond by collecting such a large amount of data on the front end, but we’ve experienced no problems in this regard. In fact, since the queries finish so quickly, the amount of memory required only takes up a few seconds anyway before the report is generated and the memory released.

An Example

[sourcecode lang="csharp"] protected void repListItemDataBound(object sender, System.Web.UI.WebControls.RepeaterItemEventArgs e) { if (e.Item.ItemType ListItemType.Item || e.Item.ItemType ListItemType.AlternatingItem) { CurrentSection = (dac.ScoringGuideSection)e.Item.DataItem; Repeater repEvaluators = (Repeater)e.Item.FindControl("repEvaluators"); //EvaluationScorers was run once and contains all scorers on any section in the report. repEvaluators.DataSource = EvaluationScorers; repEvaluators.DataBind(); Repeater repSubSections = (Repeater)e.Item.FindControl("repSubSections"); //ScoringGuideSections contains all sections in the report and is filtered here to get only the subsections of this articular main section. repSubSections.DataSource = ScoringGuideSections.Where(s => s.ParentID == CurrentSection.SectionID); repSubSections.DataBind(); } } [/sourcecode]