SQLug.se Challenge 2010

By | October 8, 2014

In Sweden, we have since 2010 arranged an official national SQL Server developer championship. The competition has been run in Swedish (the local language). However I think the competition problems could be interesting also outside of Sweden. Therefore I will be publishing them in English on my blog.

The competition has been arranged by the Swedish SQL Server User Group and the winner each year has been allowed to display this badge on their website/blog.

SQLug.se Challenge Award

Competition problem 2010

We have collected data about fuel consumption for a number of vehicles.

ID Timestamp VehicleName MeterValue
1 2010-02-01 08:00:00.000 LIFT1 10000
2 2010-02-01 08:05:00.000 LIFT1 10025
3 2010-02-01 08:15:00.000 LIFT1 10065
4 2010-02-01 08:15:00.000 LIFT2 20000
5 2010-02-01 08:30:00.000 LIFT2 20150
6 2010-02-01 09:05:00.000 LIFT1 10315
7 2010-02-01 09:10:00.000 LIFT2 20510
8 2010-02-01 09:55:00.000 LIFT2 20810
9 2010-02-01 10:00:00.000 LIFT1 10480
10 2010-02-01 10:00:00.000 LIFT2 20885
11 2010-02-01 10:10:00.000 LIFT1 10530
12 2010-02-01 10:15:00.000 LIFT1 10575
13 2010-02-01 11:00:00.000 LIFT1 10725
14 2010-02-01 11:05:00.000 LIFT2 21445
15 2010-02-01 11:05:00.000 LIFT1 10755
16 2010-02-01 11:50:00.000 LIFT1 11295
17 2010-02-01 12:00:00.000 LIFT1 11325

We can calculate the consumption for a given time period by taking the difference in MeterValue. For example LIFT1 had a consumption of 25 between 08:00 and 08:05.

Task

Calculate the average consumption for each vehicle for the 20% periods that had the highest consumption per minute. Calculate the time-weighted average of these periods.

  • The 20% number of periods should be rounded upwards. If there are 11 periods, the 20% should be rounded to 3 periods.
  • If multiple periods has the same consumption per minute, chose the latest periods.

Your task is to calculate this time-weighted average for all vehicles as fast as possible using T-SQL or SQLCLR (safe mode only).

Other restrictions in the competition:

  • The solutions had to run on a SQL Server 2008 R2
  • You are not allowed to make any changes to the MeterReadings table except creating non clustered indexes
  • Indexed views are not allowed

Example

Let’s look at the data for LIFT1.

VehicleName StartTime EndTime Consumption TimeInMinutes ConsumptionPerMinute
LIFT1 08:00:00 08:05:00 25 5 5
LIFT1 08:05:00 08:15:00 40 10 4
LIFT1 08:15:00 09:05:00 250 50 5
LIFT1 09:05:00 10:00:00 165 55 3
LIFT1 10:00:00 10:10:00 50 10 5
LIFT1 10:10:00 10:15:00 45 5 9
LIFT1 10:15:00 11:00:00 150 45 3,33333333333333
LIFT1 11:00:00 11:05:00 30 5 6
LIFT1 11:05:00 11:50:00 540 45 12
LIFT1 11:50:00 12:00:00 30 10 3

This example contains 10 periods, which means the 20% will be 2 periods. The 2 periods with the highest consumption per minute are marked in bold above.

The time-weighted average consumption for these periods is:

(12 x 45 + 9 x 5) / (45 + 5) = 11,7

For LIFT2 the time-weighted average consumption would be 15.

Author

This competition problem was authored by Johan Åhlén and based on a real world scenario.

SQL Script & Test data

SQL setup script
Testdata

Winning solution

-------------------------------------
-- Author: Henrik Nilsson
-- Company: Brummer & Partners
-- Edition: 1
-- Version: 1
-------------------------------------

-- Setup
SET NOCOUNT ON
CREATE UNIQUE NONCLUSTERED INDEX IX ON dbo.MeterReadings(VehicleName ASC, MeterReadingID ASC) INCLUDE (StartTime, MeterValue) WITH (FILLFACTOR = 100) ON [PRIMARY]
CREATE NONCLUSTERED INDEX IX2 ON dbo.MeterReadings (VehicleName ASC) WITH (FILLFACTOR = 100) ON [PRIMARY]
TRUNCATE TABLE dbo.ConsumptionSummary
GO

-- Main Run
INSERT INTO dbo.ConsumptionSummary(VehicleName, Top20PctAveragePerMinute)
SELECT m.VehicleName
,(
SELECT (60000.0 * SUM(d.Consumtion) / SUM(d.TimeInMilliseconds))
FROM (
SELECT TOP 20 PERCENT
m1.VehicleName
,m2.MeterValue - m1.MeterValue AS Consumtion
,DATEDIFF(millisecond, m1.StartTime, m2.StartTime) AS TimeInMilliseconds
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY mr.MeterReadingID) AS row
,mr.VehicleName
,mr.StartTime
,mr.MeterValue
FROM dbo.MeterReadings AS mr
WHERE mr.VehicleName = m.VehicleName
) AS m1
INNER JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY mr.MeterReadingID) AS row
,mr.VehicleName
,mr.StartTime
,mr.MeterValue
FROM dbo.MeterReadings AS mr
WHERE mr.VehicleName = m.VehicleName
) AS m2 ON m2.row = m1.row - 1
WHERE 1 = 1
ORDER BY (m2.MeterValue - m1.MeterValue) /(1.0 * datediff(ms, m1.StartTime, m2.StartTime)) DESC, m1.StartTime DESC
) AS d
WHERE 1 = 1
GROUP BY d.VehicleName
) AS Top20PctAveragePerMinute
FROM dbo.MeterReadings AS m
GROUP BY m.VehicleName
GO

-- Cleanup
DROP INDEX IX ON dbo.MeterReadings
DROP INDEX IX2 ON dbo.MeterReadings
SET NOCOUNT OFF