Options

Performance of LIKE vs. ILIKE

If I don't care about the cases, i.e. I know for sure whether my string is in upper case or lower case, would LIKE be faster, or ILIKE, or both are about the same?

Comments

  • Options
    Good question. You could benchmark this, but if LIKE is a possibility, then use it because it would be faster or equally fast as ILIKE that has to do more work. When parsing log files I noticed grep -i ( case insensitive) being substantially slower than grep, so i would not be surprised to see a similar finding here. The differences are probably very small if the look up space for string searching is limited.
  • Options
    Since Vertica's niche is all about fast queries, you're also best off normalizing your data on the way into the database, bringing it to a consistent case. Similarly if there are expressions that you evaluate often with LIKE/ILIKE, consider pulling them out during ETL.
  • Options
    Thank you for the replies! I benchmarked it, not surprisingly, LIKE is faster than ILIKE.
  • Options
    Guys, Vertica is Case-sensitive!! so, ILIKE makes your search bit easier , by not typing the exact word as it is in the database while searching for a record.. Eg; say there is a name like 'raj' in the database table select * from table where name LIKE '%Raj%'; It wont fetch my record. select * from table where name ILIKE '%Raj%'; It fetches my record.
  • Options
    Hi! Case-sensitive search ALWAYS will be faster than non sensitive, because it compares chars by their order in characters table. For example ASCII table comparison: small "a" = 97, while capital "A" = 65.
      test=> select ascii('a'), ascii('A'), 'a' > 'A';   ascii | ascii | ?column?   -------+-------+----------      97 |    65 | t  (1 row)  
    So non-case sensitive search require 2 comparison for decision: character compared to "small" AND "capital" letter. Can be another strategy: convert all strings to same case (lower or upper) and prepare case-sensitive comparison. I don't know what strategy Vertica choose, both strategies has same complexity in worst case ==> O(length of string). FYI: For UTF/Unicode its a big problem/challenge - case sensitive comparison, because same letters in different languages can has different order. Or: how to compare chars from different languages? How to compare Cyrillic "T" to latin "T"? In UTF/Unicode such comparison is possible (and Vertica data encoded in UTF-8). So even here no stable solutions!

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file