I have a SQL query that runs on a SQL 2008 R2 server. It is run on the 1st and the 16th of the month. The query contains the following logic for setting StartDate and EndDate parameters:
DECLARE @RunDateTime datetime
DECLARE @RunDate datetime
DECLARE @StartDate datetime
DECLARE @EndDate datetime
DECLARE @Month int
DECLARE @Year int
DECLARE @strStartDate varchar(10)
SET @RunDateTime = GetDate()
SET @RunDate = CAST(@RunDateTime AS DATE)
IF DATEPART(d, @RunDate) = 16
BEGIN
SET @StartDate = DATEADD(d, -15, @RunDate)
SET @EndDate = @RunDate
END
ELSE
BEGIN
IF Month(@RunDate) = 1
SET @Month = 12
ELSE
SET @Month = Month(@RunDate) - 1
IF Month(@RunDate) = 1
SET @Year = Year(@RunDate) - 1
ELSE
SET @Year = Year(@RunDate)
SET @strStartDate = CAST(@Year AS VARCHAR) + CAST(@Month AS VARCHAR) + '16'
SET @StartDate = CONVERT(datetime, @strStartDate)
SET @EndDate = @RunDate
END
If the query is run on the 16th, we want the date range to be from the 1st to the 15th of the month. If the query is run on the 1st, we want the date range to be from the 16th to the end of the previous month.
Due to changing requirements at work, we’ve been told we need to find a way to do this without using any conversion of dates to strings. Is this possible, and does anyone have any idea how it might be done? I am very quickly getting out of my depth here.
This will meet your specification. It doesn’t care what day you run it on, and it adjusts the EndDate to be the current date if it is not on an exact boundary. I could not compare exactly to your given code because it has errors for some dates (such as 20120201).
I tested this code against a year of dates and I believe everything is working correctly.
Verify its correctness yourself by checking out the results of a year of calculations in this SqlFiddle.
Basically: Calculate the first of the month from yesterday’s date. Using this date, determine the offset from the start of the month to compute the starting date (if the 1st – 15th, use 0; if the 16th – 31st, use 15). Finally, compute the end date from the start of the month + the calculated day, making a special adjustment for the 1st and 16th to use the prior day’s end period.
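The query itself is not reproduced above, so what follows is a minimal sketch of the steps just described, not the original code. It assumes the run date is today's date truncated to midnight and, like the code in the question, uses the run date itself as the end date. The variable names @Yesterday and @MonthStart are made up for illustration. No dates are converted to or from strings.

```sql
-- Truncate GETDATE() to midnight using date arithmetic only (no strings).
DECLARE @RunDate datetime = DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0);

-- The period being reported on is the one that contains yesterday.
DECLARE @Yesterday datetime = DATEADD(day, -1, @RunDate);

-- First day of yesterday's month: count whole months since the base
-- date 0 (1900-01-01), then add them back onto that base date.
DECLARE @MonthStart datetime = DATEADD(month, DATEDIFF(month, 0, @Yesterday), 0);

-- Offset 0 for the 1st-15th, offset 15 for the 16th onward.
DECLARE @StartDate datetime =
    DATEADD(day, CASE WHEN DAY(@Yesterday) >= 16 THEN 15 ELSE 0 END, @MonthStart);

-- Assumption: the end date is simply the run date, as in the question's code.
DECLARE @EndDate datetime = @RunDate;

SELECT StartDate = @StartDate, EndDate = @EndDate;
```

Run on the 16th, this gives the 1st through the 16th of the same month. Run on the 1st, it gives the 16th of the previous month through the 1st, and the same arithmetic handles a run on January 1st without any special case for the year.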
There’s actually no harm in always specifying through to the end of the period (1 – 16 or 1 – 28/29/30/31) rather than through the day run on, but that’s not what you asked for.
To answer your questions about what is going on in the query.
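Since the points that follow cover several pieces of syntax, here is a small self-contained sketch of each one. The sample names and the aliases P and L are invented for illustration and are not taken from the original query.

```sql
-- Multi-row VALUES used as a derived table. The column names come from
-- the alias list that follows the table alias P.
SELECT P.FirstName, P.LastName
FROM (VALUES ('Ada', 'Lovelace'), ('Alan', 'Turing')) P (FirstName, LastName);

-- Derived table X. The expression is named inside the query, so the
-- outer query can refer to it as X.FullName.
SELECT X.FullName
FROM (
    SELECT FullName = P.FirstName + ' ' + P.LastName
    FROM (VALUES ('Ada', 'Lovelace'), ('Alan', 'Turing')) P (FirstName, LastName)
) X;

-- CROSS APPLY with an outer reference: L uses a column from P, which was
-- introduced before it, to compute a value that could be reused.
SELECT P.FirstName, L.NameLength
FROM (VALUES ('Ada'), ('Alan')) P (FirstName)
CROSS APPLY (SELECT NameLength = LEN(P.FirstName)) L;

-- Assigning to a variable instead of returning a rowset. TOP with
-- ORDER BY pins down which row supplies the value.
DECLARE @First varchar(20);
SELECT TOP (1) @First = P.FirstName
FROM (VALUES ('Ada'), ('Alan')) P (FirstName)
ORDER BY P.FirstName;
SELECT FirstSelected = @First;
```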
1. A derived table is a rowset-returning query, normally starting with a SELECT, wrapped in parentheses and given an alias. For example:
   Here we have derived table X consisting of the result of the query inside it. I like to skip the AS part because to me it just adds clutter. Note this was just a simple example; derived tables are generally more complex than this and are useful in a variety of situations. I used no alias on the dbo.User table but it is best practice to do so. But it is important to note that all column references in a derived table are resolved from within the derived table. No using tables from outside.
2. VALUES is a way to create a rowset from values provided in a query. The normal syntax you might be familiar with is INSERT dbo.Table VALUES (1, 'a');. In SQL Server 2008 this was extended to allow multiple rows at the same time, as in INSERT dbo.Table VALUES (1, 'a'), (2, 'b');. It additionally allows this special multi-row notation in place of a SELECT query inside of a derived table. For example:
3. A derived table’s column names are normally discovered from the query. See that in #1, X.FullName actually refers to an expression containing multiple columns from table dbo.User. The expression was explicitly given a new alias in the query. If you did not provide an alias, you would get an error, because the expression has no intrinsic name. In fact, the query from the prior point will not run as-is because derived table X has no column names! However, there is a syntax for providing explicit column aliases outside of a derived table:
   I prefer this second syntax because it really helps make clear what is intended, it helpfully separates the expressions from their names, and generally to me is all around better for column-aliasing in derived tables. Many times I am copying and pasting what is inside the derived table from elsewhere, and it is nice to not have to cobble in column names each time. The aliases F, T, and S are for me From, To, and Start. I could have made them the longer names, but chose not to.
4. Using SELECT Alias = <Expression> is just a shortcut for SELECT <Expression> AS Alias. I prefer the = method because then the aliases are reliably on the left side in one easy scannable column instead of at the ends of variable-length expressions. This syntax also has benefits due to being able to change to variable names with the addition of one character, or convert the query to an UPDATE statement easily.
5. You can change any query to assign values to variables instead of returning a rowset just by adding @Variable = in front of each column expression. One gotcha to be aware of is that if a query returns multiple rows, while you can still use the @Variable = syntax, the server will still do all the work of materializing all the rows, and your variable(s) will simply have the value(s) from some one row. It may be the first row, or the last one, but even if you think there is a consistent row returned you should always assume it will be a random row. If you need a particular row then provide a WHERE clause or a TOP clause with an ORDER BY to force a particular row’s values to be used.
6. CROSS APPLY is simply a derived table that has the special property of being allowed to have an “outer reference”, that is, it can use column values from tables introduced before it, within the parentheses. It restricts rows when no rows are returned (just like an INNER JOINed derived table), while OUTER APPLY does not restrict rows (just like an OUTER JOINed derived table). It requires no ON clause because all the filtering is done via a WHERE clause. The optimizer is quite good about understanding the intent of your query and doesn’t end up running the APPLY once for every outer row; it almost always is able to get the data intelligently as if it were just a regular join. I used it here as a cheap way to get a calculated value in my query that I could use more than once. I could as well have done DECLARE @MonthDate datetime = <same expression> and used that instead, but some part of me likes not declaring variables when I don’t have to (as the @MonthDate variable is only needed for a single query).
P.S. I would like to point out one thing I see in your example code (if you will allow me). Consider this section:
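The snippet is not shown here. From the discussion that follows, it appears to be the month and year logic from the question, repeated below. The indentation is an assumption about the original layout, and the declarations and sample date are added only so the fragment runs on its own.

```sql
-- Declarations added for self-containment; not part of the quoted section.
DECLARE @RunDate datetime = DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0);
DECLARE @Month int, @Year int;

-- The section under discussion: two single-statement IF / ELSE pairs
-- that test the same condition.
IF Month(@RunDate) = 1
    SET @Month = 12
ELSE
    SET @Month = Month(@RunDate) - 1

IF Month(@RunDate) = 1
    SET @Year = Year(@RunDate) - 1
ELSE
    SET @Year = Year(@RunDate)
```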
All this lining up is great for trying to convey the intent of the IF / ELSE blocks, but in my professional opinion, you should be using BEGIN and END. One reason that the person who coded this chose to lay out the conditions this way is that the indenting style is (to me) a little excessive. Each BEGIN ends up being 2 indents deep, one for the IF (treating it as one indent block) and one for the BEGIN .. END. But this attachment to single-statement IFs without blocks leads to several problems.
The above expression does not convey intent properly and forces a reviewer to evaluate why something is being done twice and that in fact the two conditions are the same. For starters, the block should be rewritten as so:
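The rewritten block is not shown here. A plausible version, assuming the two identical conditions are simply merged and each branch is wrapped in BEGIN and END, using the block layout from the question:

```sql
-- Declarations added so the fragment runs on its own.
DECLARE @RunDate datetime = DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0);
DECLARE @Month int, @Year int;

-- One test, with both assignments for each outcome grouped together.
IF Month(@RunDate) = 1
BEGIN
    SET @Month = 12;
    SET @Year = Year(@RunDate) - 1;
END
ELSE
BEGIN
    SET @Month = Month(@RunDate) - 1;
    SET @Year = Year(@RunDate);
END;
```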
This can now be sensibly understood and reviewed.
As soon as you start to add multiple conditions or nested levels, you can get in deep trouble and find it almost impossible to debug. What if some enterprising developer, recognizing that the two conditions were identical and it made the most sense to combine their contents, wrote this code:
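That code is not shown here either. Below is a sketch of what it might have looked like, together with the @SpecialFlag variation discussed next. Both are reconstructions under assumptions, and the sample values are hypothetical.

```sql
DECLARE @RunDate datetime = '20120110';  -- hypothetical run date in January
DECLARE @SpecialFlag bit = 0;            -- hypothetical flag value
DECLARE @Month int, @Year int;

/* Variant 1: the two IFs merged without BEGIN/END. Kept inside a comment
   because the orphaned ELSE makes it a syntax error.

IF Month(@RunDate) = 1
    SET @Month = 12
    SET @Year = Year(@RunDate) - 1
ELSE
    SET @Month = Month(@RunDate) - 1
    SET @Year = Year(@RunDate)
*/

-- Variant 2: an assumed later change that adds a nested IF. This parses,
-- but the ELSE now pairs with the @SpecialFlag test, and the final SET
-- runs unconditionally, whatever the indentation suggests.
IF Month(@RunDate) = 1
    SET @Month = 12
    IF @SpecialFlag = 1 SET @Year = Year(@RunDate) - 1
ELSE
    SET @Month = Month(@RunDate) - 1
    SET @Year = Year(@RunDate)

SELECT [Month] = @Month, [Year] = @Year;
```

With these sample values the second variant ends up with a month of 0 and a year of 2012, which is not a valid previous period.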
This looks good, but hides the fact that the second SET statement does not belong to the IF before it. Now, it won’t compile properly because the ELSE is orphaned. But what if the change were to this:
Now we have an awful mess! This will parse correctly, but be far from the result the developer intended: the ELSE block is part of the @SpecialFlag condition! It sure doesn’t look like it in the code, due to the indentation.
So while I understand that code formatting conventions can be preferred with strongly-held convictions, and organizations and people can be greatly resistant to changes, I would like to suggest that you will achieve some benefits if 1) you use BEGIN and END in all IF blocks, and 2) in order to alleviate the struggle this causes due to the hassle of double-indenting everything, that you reformulate your block-indentation practice like so:
- Instead of IF and ELSE beginning a block that has no end, and BEGIN being matched with END at a deeper level;
- Make IF and ELSE begin blocks that are ended with END, and put BEGIN at the end of the line.
This would then be:
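The example is not shown here. A sketch of the layout being described, assuming the same month and year logic as before, with BEGIN moved to the end of the IF and ELSE lines and a single indent level per block:

```sql
-- Declarations added so the fragment runs on its own.
DECLARE @RunDate datetime = DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0);
DECLARE @Month int, @Year int;

-- IF and ELSE open the block, END closes it at the same indent level.
IF Month(@RunDate) = 1 BEGIN
    SET @Month = 12;
    SET @Year = Year(@RunDate) - 1;
END
ELSE BEGIN
    SET @Month = Month(@RunDate) - 1;
    SET @Year = Year(@RunDate);
END;
```

Adding a third statement to either branch is then a one-line change inside an existing block.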
I recognize this looks very strange to someone used to a different style of indenting; all change is hard at first. But I believe that with a little practice it will grow on you and become less painful. It is just a matter of training the eye to match END with something besides BEGIN. Eventually you realize that you don’t care to even look for BEGIN because it adds no additional meaning to the lines beginning with IF or ELSE. Finally, you don’t have to do crazy, labor-intensive indenting patterns (including putting multiple spaces after IF and ELSE to get everything to line up) because the indenting just works, and you’re only doing a single level for each block instead of many. And you’ll never have code that falls out of the IF block unexpectedly as shown above.
To make it 100% clear: with this style, if a BEGIN / END block has only one statement in it, nothing has to be rearranged in order to add a second, and the code will not break.
Finally, please note that I have added semicolons in my own examples, for the simple reason that in SQL Server they will one day be required and I’d like all my production code to keep working without needing a giant and painful semicoloning project! This also gives the benefit of explicitly indicating when your block has stopped (though frankly, I don’t know if a semicolon is required after the END preceding an immediate ELSE; if so, I’ll at least have a small semicoloning project instead of a giant one).