VistaDB 6
VistaDB / Developer's Guide / SQL Reference / CASE in a WHERE clause / Optimizing conditionals in where clauses
In This Topic
    Optimizing conditionals in where clauses
    In This Topic

    Optimizing Conditional WHERE Clauses: Avoiding ORs and CASE Expressions

    This is originally from Jeff's SQL Server Blog. Posted here just in case your are offline, or the site goes away.

    Often, we need to create a flexible stored procedure that returns data that is optionally filtered by some parameters. If you wish to apply a filter, you set the parameter to the necessary value, if not, you leave it null. This is pretty standard stuff, of course, that we can write fairly easily, without the need for dynamic SQL, like this:

    create procedure GetData
        @MinDate int = null,
        @MaxDate int = null,
        @MinAmount money = null,
        @MaxAmount money = null,
        @ProductCode varchar(200) = null,
        @CompanyID int
        as
    
        
    select *
        from Data
        where
        (@MinDate is null OR @MinDate <= Date) and
        (@MaxDate is null OR @MaxDate >= Date) and
        (@MinAmount is null OR @MinAmount >= Amount) and
        (@MaxAmount is null OR @MaxAmount <= Amount) and
        (@ProductCode is null OR ProductCode = @ProductCode) and
        (@CompanyID is null OR CompanyID = @CompanyID)
    

    Note that we are using good boolean algebra as discussed here, and not using inefficient CASE or COALESCE() expressions around the columns in our WHERE clause. It is pretty easy to read and quite logical, and certainly it is easy to edit and to extend with more filtering options as necessary. We also made sure to use parenthesis in our WHERE clause to organize it and to enforce a clear and correct order of operations regarding our AND and OR clauses.

    However, if we manipulate our parameters a little and re-organize our criteria, we can potentially write this much more efficiently.
    First, note that we are allowing the user to specify an optional date range via the @MinDate and @MaxDate parameters. If either is NULL, we simply do not use a minimum or maximum date in our filter. However, we can simplify things by writing our criteria like this:

    select *
        from Data
        where
        Date between coalesce(@MinDate, '1/1/1900') and coalesce(@MaxDate, '12/31/2999') and
        ... etc
    

    All we need to do is use COALESCE() to transform our NULL values to dates that will encompass the entire range of values, thus "eliminating" that filter from the results. This greatly simplifies our WHERE clause and ensures that indexes on our Date column can be used.
    We can do the exact same thing for our Amount range as well:

    select *
        from Data
        where
        Date between coalesce(@MinDate, '1/1/1900') and coalesce(@MaxDate, '12/31/2999') and
        Amount between coalesce(@MinAmount,-99999999) and coalesce(@MaxAmount,99999999) and ...
    

    For numeric values, this can be trickier because you really want to make sure that you have your entire range of values covered. When in doubt, just use the smallest and largest possible values that your data type will allow.
    Next, we have a string comparison based on product code. These also can be very efficiently rewritten by using LIKE. Normally, LIKE is much slower than using equals, but if you do not use any wild cards and it allows you to eliminate an "OR" in your WHERE clause, it may actually be more efficient:

    select *
        from Data
        where
        Date between coalesce(@MinDate, '1/1/1900') and coalesce(@MaxDate, '12/31/2999') and
        Amount between coalesce(@MinAmount,-999999999) and coalesce(@MaxAmount,999999999) and
        ProductCode like coalesce(@ProductCode,'%') and ...
    

    Note that if your product codes contain symbols like % or _, this will not work for you. But in general, using LIKE in this case can be really handy. I also sometimes see this:

    where (@Name is null OR Name like '%' + @Name + '%')

    .. which can of course be written much simply and more efficiently, since we are using LIKE already, as:

    where (Name like '%' + coalesce(@Name,'') + '%')

    The idea is to eliminate those OR operators and simplify your WHERE condition wherever possible.

    Finally, we have an optional filter for a specific @CompanyID integer parameter. We don't want to use a LIKE wildcard, since this will require that all of the integer values in the table must be converted to strings for the comparison -- thus, no indexes will be used. What we can do, even though it might not seem intuitive, is use a range! For example:

    select *
        from Data
        where
        Date between coalesce(@MinDate, '1/1/1900') and coalesce(@MaxDate, '12/31/2999') and
        Amount between coalesce(@MinAmount,-99999999) and coalesce(@MaxAmount,99999999) and
        ProductCode like coalesce(@ProductCode,'%') and
        CustomerID between coalesce(@CustomerID,0) and coalesce(@CustomerID,999999999)
    

    Now, we have no ORs in our WHERE clause at all, and we have not wrapped any columns in our table in any expressions, so all indexes can potentially be used. Our SQL is still relatively clear and easy to read and work with, and we can easily add more conditions or criteria as necessary.

    You may also find that it is more efficient and/or easier to maintain if you first set your parameter values as necessary before the SELECT, like this:

    declare @MinID int, @MaxID int
    
        
    set @MinDate = coalesce(@MinDate, '1/1/1900')
        set @MaxDate = coalesce(@MaxDate,'12/31/2999')
        set @MinAmount = coalesce(@MinAmount,-99999999)
        set @MaxAmount = coalesce(@MaxAmount,99999999)
        set @ProductCode = coalesce(@ProductCode,'%')
        set @MinID = coalesce(@CustomerID,-99999999)
        set @MaxID = coalesce(@CustomerID,99999999)
    
        
    select *
        from Data
        where
        Date between @MinDate and @MaxDate and
        Amount between @MinAmount and @MaxAmount and
        ProductCode like @ProductCode and
        CustomerID between @MinID and @MaxID
    

    This will ensure that the coalesce() expressions are evaluated only once, and it does make things a little easier to read.

    The overall trick is to think to yourself: Can I take a parameter value and either alter it or use it to create new values, and then efficiently use those new values in my WHERE clause instead of simply using that raw parameter value? Sometimes, it takes some "outside-of-the-box" thinking, but it often can be done and the resulting performance gains can be tremendous.

    Ultimately, you always should test the different ways of writing your WHERE clauses and always double check the indexing on your tables to get the maximum efficiency. Writing things a particular way in one situation may work great, but then applying it to another may not work quite as well.

    See Also