Pages

Wednesday, February 1, 2012

SQL Server: Fixing Integer Division Division Issues

This was an interesting one I wasn't expecting to stumble upon this morning.

At the moment I'm writing some cost monitoring SQL that looks at the Execution logs on a reporting services instance and works out, at a rate per hour, how much time and money the business is spending just waiting for reports to execute (either fetching the data, process the data, or rendering) - this is on the assumption that when we add processors/ memory to a box it's nice to be able to show users the direct saving to them.

We also have a target of "6 seconds" so we can also measure how much we're missing our target by (and how much that's costing).

Now in order to do this I was trying to use the SQL;

SELECT 
  DATEADD(dd,0, datediff(dd,0, [TimeStart])) "Day",
  SUM([TimeDataRetrieval])/1000/60/60 "Time Spent Retrieving Data",
  SUM([TimeProcessing])/1000/60/60 "Time Spent Processing",
  SUM([TimeRendering])/1000/60/60 "Time Spent Rendering",
  (SUM([TimeDataRetrieval]) + SUM([TimeProcessing]) + SUM([TimeRendering]))/1000/60/60 "Total Time Spent",
  (6000 * COUNT(DISTINCT [ExecutionId]))/1000/60/60 "Six Second Watermark",
  (SUM([TimeDataRetrieval]) + SUM([TimeProcessing]) + SUM([TimeRendering]))/1000/60/60 * @Hourly_Rate "Total Cost",
  (6000 * COUNT(DISTINCT [ExecutionId]))/1000/60/60 * @Hourly_Rate "Six Second Cost"
  FROM [ReportServer].[dbo].[ExecutionLog2]
 GROUP BY DATEADD(dd,0, datediff(dd,0, [TimeStart]))
 ORDER BY DATEADD(dd,0, datediff(dd,0, [TimeStart])) DESC

As you can see it's pretty simple, it relies on the ExecutionLog2 view (included in Reporting Services as standard), applies a bit of daily grouping, and then works out some times and costs by dividing the time from the views (in milliseconds) by 1000, 60, and 60 in order to give hours.

The surprising thing (to me anyway!) was when you ran it;
SQL Execution: Integer-only Results?

Everything seemed to be coming back as nice round numbers. Clearly that's not going to be the case with execution times so what was going on?

After a bit of digging I turned up this useful page;


http://msdn.microsoft.com/en-us/library/aa276874%28v=sql.80%29.aspx

And looking at the section, abotu half-way down, titled "Result Types" you see the text;

"Returns the data type of the argument with the higher precedence. For more information about data type precedence, see Data Type Precedence."

The effect of this is that if you divide an integer by an integer you can only get an integer in return. Apparently, to the designers of SQL Server, one divided-by two is zero. In what way did they think this would be useful?!

Looking back at the SQL you can fix it either explicitly or implicitly. I'll provide both below and you can choose which one to go for.


First the explicit fix (using CASE(XXX As Float);

SELECT 
  DATEADD(dd,0, datediff(dd,0, [TimeStart])) "Day",
  CAST(SUM([TimeDataRetrieval]) As Float)/1000/60/60 "Time Spent Retrieving Data",
  CAST(SUM([TimeProcessing]) As Float)/1000/60/60 "Time Spent Processing",
  CAST(SUM([TimeRendering]) As Float)/1000/60/60 "Time Spent Rendering",
  CAST((SUM([TimeDataRetrieval]) + SUM([TimeProcessing]) + SUM([TimeRendering])) As Float)/1000/60/60 "Total Time Spent",
  CAST((6000 * COUNT(DISTINCT [ExecutionId])) As Float)/1000/60/60 "Six Second Watermark",
  CAST((SUM([TimeDataRetrieval]) + SUM([TimeProcessing]) + SUM([TimeRendering])) As Float)/1000/60/60 * @Hourly_Rate "Total Cost",
  CAST((6000 * COUNT(DISTINCT [ExecutionId])) As Float)/1000/60/60 * @Hourly_Rate "Six Second Cost"
  FROM [ReportServer].[dbo].[ExecutionLog2]
 GROUP BY DATEADD(dd,0, datediff(dd,0, [TimeStart]))
 ORDER BY DATEADD(dd,0, datediff(dd,0, [TimeStart])) DESC

The result;
SQL Server: Floating Point Values Displayed
Now the implicit fix;

SELECT 
  DATEADD(dd,0, datediff(dd,0, [TimeStart])) "Day",
  SUM([TimeDataRetrieval])/3600000.0 "Time Spent Retrieving Data",
  SUM([TimeProcessing])/3600000.0 "Time Spent Processing",
  SUM([TimeRendering])/3600000.0 "Time Spent Rendering",
  (SUM([TimeDataRetrieval]) + SUM([TimeProcessing]) + SUM([TimeRendering]))/3600000.0 "Total Time Spent",
  (6000 * COUNT(DISTINCT [ExecutionId]))/3600000.0 "Six Second Watermark",
  (SUM([TimeDataRetrieval]) + SUM([TimeProcessing]) + SUM([TimeRendering]))/3600000.0 * @Hourly_Rate "Total Cost",
  (6000 * COUNT(DISTINCT [ExecutionId]))/3600000.0 * @Hourly_Rate "Six Second Cost"
  FROM [ReportServer].[dbo].[ExecutionLog2]
 GROUP BY DATEADD(dd,0, datediff(dd,0, [TimeStart]))
 ORDER BY DATEADD(dd,0, datediff(dd,0, [TimeStart])) DESC

The result;
SQL Server: Floating Point Values Displayed

The advantage of the implicit fix is speed but that speed comes at a cost of clarity. I know if I'd spotted something that didn't seem to be doing anything useful in a piece of SQL I'm supporting I'd remove it and maybe in a couple of years time you'll have forgotten why the .0 is there.

Both work through so I'll leave it up to you to decide which to use.

NOTE: You'll notice that the two "result" images above are slightly different. This is due to the same problem (integer division) being caused by the milliseconds to hours conversion (/1000/60/60 vs /3600000). Not quite sure what to make of this!


No comments: