Thursday, March 3, 2011

SQL Server: LIKE vs. "=" for matching strings

Whenever I write a stored procedure for selecting data based on a string variable (varchar, nvarchar, char), I have something like:

create procedure dbo.p_get_user_by_username(
    @username      nvarchar(256)
)
as
begin
    select
     u.username 
     ,u.email
     --,etc
    from
     sampleUserTable u
    where
     u.username = @username
end

So in other words to match the record I would have

u.username = @username

But sometimes I come across code that would use LIKE in place of =

u.username like(@username)

When would you use it? Shouldn't that be used only when you need some wildcard matching?

EDIT

Thanks for the answers.

I think I need to clarify what I was really trying to ask: whether there could be a situation where it is preferable to use LIKE in place of "=" for exact string matching. From the answers, I gather that there is not. In my own experience, even when I need to ignore e.g. case and leading and trailing spaces, I would use ltrim, rtrim, and lower on both strings and then "=". Thanks again for your input.
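A minimal sketch of that trim-and-lower comparison, for illustration only. The table variable @users and its rows are hypothetical stand-ins for sampleUserTable, and the sample parameter value is made up:

```sql
-- Hypothetical stand-in for sampleUserTable; the rows are illustrative only.
declare @users table (username nvarchar(256), email nvarchar(256));
insert into @users (username, email) values (N'Sam', N'sam@example.com');
insert into @users (username, email) values (N'kristof', N'kristof@example.com');

declare @username nvarchar(256);
set @username = N'  sam ';

-- Normalise both sides (trim spaces, lower-case), then compare with "=".
select
    u.username
    ,u.email
from
    @users u
where
    lower(ltrim(rtrim(u.username))) = lower(ltrim(rtrim(@username)));
```

Two caveats on this approach: wrapping the column in functions generally prevents an index seek on u.username, and under a case-insensitive collation the lower() calls are redundant.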

From stackoverflow
  • With the LIKE keyword you can match the field u.username against a specified pattern instead of a fixed "string".

  • You are correct. There is no benefit in using LIKE unless you are doing wildcard matching. In addition, using it without a wildcard could lead to the use of an inefficient query plan.

    Sam Saffron : This is not correct. The optimiser will work fine with LIKE statements on indexed fields; it's only when you have wildcards that you have to be very careful. So select * from t where field like '%anything' will force a table scan, but like 'hello%' will not (if it's indexed).
    Sam Saffron : Also, see my answer, like will not match on trailing spaces in an out-of-the-box sql 2005 installation.
  • Yes, as far as I know, using LIKE without any wildcards is the same as using the = operator. Are you sure the input parameter doesn't have wildcards in it?

  • Yes, you are right: it should only be used for wildcard matching. It should be used sparingly, especially on very large tables on non-indexed fields, as it can slow your queries WAY WAY down.

    Sam Saffron : Not really correct. Doing a like '%tail' will force an index/table scan, but like 'head%' will perform quite well if the field is indexed ...
  • If no wildcards are used, then the difference is that "=" makes an exact match, but LIKE will match a string with trailing spaces (from SSBO):

    When you perform string comparisons with LIKE, all characters in the pattern string are significant, including leading or trailing spaces. If a comparison in a query is to return all rows with a string LIKE 'abc ' (abc followed by a single space), a row in which the value of that column is abc (abc without a space) is not returned. However, trailing blanks, in the expression to which the pattern is matched, are ignored. If a comparison in a query is to return all rows with the string LIKE 'abc' (abc without a space), all rows that start with abc and have zero or more trailing blanks are returned.

    avgbody : Good point. The ideal way is to trim the username.
  • If you're seeing this in other people's code maybe they intended to allow a person to pass in a string that included a pattern or wildcards.

  • Sunny almost got it right :)

    Run the following in QA in a default install of SQL2005

    select * from sysobjects where name = 'sysbinobjs   '
    -- returns 1 row
    select * from sysobjects where name like 'sysbinobjs   '
    -- returns 0 rows
    

    So LIKE does not match on trailing spaces. On the query plan side both perform almost equally, but the '=' version performs a tiny bit better.

    An additional thing you MUST keep in mind when using LIKE is to escape your string properly.

    declare @s varchar(40) 
    set @s = 'escaped[_]_%'
    
    select 1 where 'escaped[_]_%'  like @s 
    --Return nothing = BAD 
    
    set @s = '_e_s_c_a_p_e_d_[___]___%' 
    
    select 1 where 'escaped[_]_%'  like @s escape '_'
    --Returns 1 = GOOD
    

    In general, people do not use LIKE for exact matching because the escaping issues cause all sorts of complications and subtle bugs: people forget to escape, and there is a world of pain.

    But ... if you want a real exact match that is efficient, LIKE can solve the problem.

    Say, you want to match username to "sam" and do not want to get "Sam" or "Sam " and unfortunately the collation of the column is case insensitive.

    Something like the following (with the escaping added) is the way to go.

    select * from sysobjects
    WHERE name = 'sysbinobjs' and name COLLATE Latin1_General_BIN LIKE 'sysbinobjs'
    

    The reason you do a double match is to avoid a table scan.

    However ....

    I think the varbinary casting trick is less prone to bugs and easier to remember.

    kristof : Thanks sambo (and upvote of course) for the good points about using LIKE. Actually, I was not aware of the escape issue. What I was really wondering in my question was whether someone would ever use LIKE instead of = for exact string matching.
  • LIKE is for wildcard matching, whereas = (equals) is for exact matches.

    I also think it is used for fields that have been catalogued by FULL TEXT CATALOGUES for hard-core string comparisons.
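One of the answers above mentions a "varbinary casting trick" for exact matching without showing it. The following is only one plausible reading of that remark, sketched under assumptions: the table variable @users, its case-insensitive collation, and the three sample rows are all hypothetical, chosen to mirror the "sam" / "Sam" / trailing-space scenario from that answer.

```sql
-- Hypothetical sample data with a case-insensitive collation.
declare @users table (username varchar(50) collate Latin1_General_CI_AS);
insert into @users (username) values ('sam');
insert into @users (username) values ('Sam');
insert into @users (username) values ('sam ');

-- Plain "=" under this collation ignores both case and trailing spaces,
-- so all three rows qualify.
select username from @users where username = 'sam';

-- The "=" predicate stays first so an index on the column could still be used;
-- the varbinary comparison then keeps only the byte-for-byte identical value.
select username
from @users
where username = 'sam'
  and cast(username as varbinary(50)) = cast('sam' as varbinary(50));
```

Because the second predicate compares raw bytes, it needs no LIKE escaping. Note that the literal must be of the same character type as the column (an N'...' literal for nvarchar columns), otherwise the byte representations differ.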
