Case Insensitive strstr

Environment: C/C++

It’s a frequent task to make a case insensitive search for a string into another string. I believe everyone is able to write code that does this. And I am sure everyone did this many times 🙂 I don’t want to bring up here a discussion about code reusability and the benefits of having functions to handle one task instead of pasting same code locally wherever it is needed. However, my experience with programs written by professional developers led me to implement case insensitive search with the same interface as strstr() and post it here, so that everyone can take it and put it to their utility library.

I am including two different ways how to do the case insensitive search. They show the classical memory/speed trade-off. I have the feeling that the reason why case insensitive search function is not included in the runtime library is because there’s no solution which is clearly better than all others. The first technique allocates memory for copies of both strings, but it is significantly more time efficient for large strings. The other way has no memory overhead, but takes more CPU time. I didn’t compare performance for small strings, where I expect the efficiency to be pretty much the same. If anyone wants to investigate, please post your results.

The second implementation has the additional benefit that it is possible to specify a substring of the first string that will be searched.

I wrote the functions as C++ templates, but they can be converted to C functions very easily.

A. stristr – makes internal copies, converts to lower case and uses strstr

#include <string.h>
#include <malloc.h>
#include <tchar.h>

template <typename CHAR_TYPE>
CHAR_TYPE *stristr
(
CHAR_TYPE * szStringToBeSearched,
const CHAR_TYPE * szSubstringToSearchFor
)
{
CHAR_TYPE * pPos = NULL;
CHAR_TYPE * szCopy1 = NULL;
CHAR_TYPE * szCopy2 = NULL;

// verify parameters
if ( szStringToBeSearched == NULL ||
szSubstringToSearchFor == NULL )
{
return szStringToBeSearched;
}

// empty substring – return input (consistent with strstr)
if ( _tcslen(szSubstringToSearchFor) == 0 ) {
return szStringToBeSearched;
}

szCopy1 = _tcslwr(_tcsdup(szStringToBeSearched));
szCopy2 = _tcslwr(_tcsdup(szSubstringToSearchFor));

if ( szCopy1 == NULL || szCopy2 == NULL ) {
// another option is to raise an exception here
free((void*)szCopy1);
free((void*)szCopy2);
return NULL;
}

pPos = strstr(szCopy1, szCopy2);

if ( pPos != NULL ) {
// map to the original string
pPos = szStringToBeSearched + (pPos – szCopy1);
}

free((void*)szCopy1);
free((void*)szCopy2);

return pPos;
} // stristr(…)

B. strnistr – uses strnicmp in a loop, and may search in a substring of the searched string

#include <string.h>
#include <malloc.h>
#include <tchar.h>

template <typename CHAR_TYPE>
CHAR_TYPE *strnistr
(
CHAR_TYPE * szStringToBeSearched,
const CHAR_TYPE * szSubstringToSearchFor,
const int nStringLen = -1
)
{
int nLen;
int nOffset;
int nMaxOffset;
CHAR_TYPE * pPos;
int nStringLenInt;

// verify parameters
if ( szStringToBeSearched == NULL ||
szSubstringToSearchFor == NULL )
{
return szStringToBeSearched;
}

// get length of the substring
nLen = _tcslen(szSubstringToSearchFor);

// empty substring-return input (consistent w/ strstr)
if ( nLen == 0 ) {
return szStringToBeSearched;
}

if ( nStringLen == -1 || nStringLen >
(int)_tcslen(szStringToBeSearched) )
{
nStringLenInt = _tcslen(szStringToBeSearched);
} else {
nStringLenInt = nStringLen;
}

nMaxOffset = nStringLenInt – nLen;

pPos = szStringToBeSearched;

for ( nOffset = 0; nOffset <= nMaxOffset; nOffset++ ) {

if ( _tcsnicmp(pPos, szSubstringToSearchFor, nLen) == 0 ) {
return pPos;
}
// move on to the next character
pPos++; //_tcsinc was causing problems 🙁
}

return NULL;
} // strnistr(…)

Here’s a fragment of code I used to compare the performance of the functions:

{
int i;
char x1[3 * 100000];
char x2[3 * 10000];

for ( i = 0; i < sizeof(x1); i++ ) x1[i] = ‘A’+ (i % 3);
for ( i = 0; i < sizeof(x2); i++ ) x2[i] = ‘a’+ (i % 3);
x1[sizeof(x1) – 2] = x2[sizeof(x2) – 2] = ‘y’;
x1[sizeof(x1) – 1] = x2[sizeof(x2) – 1] = 0;
// use a profiler to compare
printf(“%pn”, stristr(x1, x2));
printf(“%pn”, strnistr(x1, x2));
}

More by Author

Get the Free Newsletter!

Subscribe to Developer Insider for top news, trends & analysis

Must Read