xsharp.eu • Fastest way to determine if a string is contained in another

Posted: Thu May 02, 2019 10:22 pm
by lumberjack
Hi all Pearlers,

Doing some research I came across this webpage, comparing different ways of determining if a string contains a substring.

Hope it is of interest to some.

Posted: Fri May 03, 2019 6:39 am
by FoxProMatt
Did you intend to share a link with us?

EDIT - I was on my iPhone when I first read the message, and did not notice the link in your message.

Posted: Fri May 03, 2019 7:43 am
by FFF
? It works ;)
BTW, reading and trying out the X# runtime sample, I wrote
? (STRING)uXSharpUsual:Contains( "run")
resulting in:
error XS1061: 'XSharp.__Usual' does not contain a definition for 'Contains' and no accessible extension method 'Contains' accepting a first argument of type 'XSharp.__Usual' could be found (are you missing a using directive or an assembly reference?) 9,1 Start.prg strings
while
VAR x :=(STRING)uXSharpUsual
? x:Contains("run")
works flawlessly.
Is that to be expected?

Posted: Fri May 03, 2019 8:14 am
by SHirsch
Hi,

try:
? ((STRING)uXSharpUsual):Contains( "run")

Stefan

Posted: Fri May 03, 2019 8:24 am
by lumberjack
FoxProMatt_MattSlay wrote: Did you intend to share a link with us?
Yes "this webpage" is the link...

Posted: Fri May 03, 2019 8:26 am
by FFF
Stefan,
That works, thx.
So, it seems, the implicit priority of the conversion is higher than the method call.

Karl

Posted: Fri May 03, 2019 8:42 am
by lumberjack
Karl,
FFF wrote: So, it seems, the implicit priority of the conversion is higher than the method call.
USUAL doesn't contain a :Contains() method, from what I read in the error message.
Your example is casting the Logic returned from Contains() to a string, which is fine.

Code: Select all

? uUsual:Contains(" ") // Error
? " " $ uUsual
? At(" ", uUsual) > 0

Posted: Fri May 03, 2019 9:08 am
by FFF
Johan,
Nope, I cast the usual to a string, for which Contains IS defined.
I was fooled, thinking "(String)myUsual" would be evaluated prior to myUsual:Contains... ;)
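
For reference, the precedence issue discussed above can be sketched as follows. This is only a minimal sketch, assuming the VO dialect with the X# runtime referenced; uValue and cValue are illustrative names, not taken from the original sample.

```xsharp
// Assumption: VO dialect with the X# runtime (XSharp.Core, XSharp.RT) referenced.
FUNCTION Start() AS VOID
    LOCAL uValue AS USUAL          // illustrative name
    uValue := "X# runtime"

    // The method call binds tighter than the cast, so the next line is read as
    // (STRING)(uValue:Contains("run")) and gave error XS1061 in this thread:
    // ? (STRING)uValue:Contains("run")

    // Parenthesise the cast so that Contains() is called on the STRING:
    ? ((STRING)uValue):Contains("run")

    // Or cast once into a typed local and call the method on that:
    VAR cValue := (STRING)uValue
    ? cValue:Contains("run")
    RETURN
```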

Anyway, the link made an interesting read, THX!

Posted: Fri May 03, 2019 2:38 pm
by Chris
The strange part about his tests is that string:Replace() is apparently (slightly) faster than string:Contains(), at least when the search string is not actually included in the target string. When the search string is part of the target string, the Contains() method is, as expected, a lot faster than the "replace" method. But because his tests use random strings (not a common real-life scenario), his method with replace comes out a little faster. In real-life conditions, Contains() is better.

Also, the reason why IndexOf() appears to be slower is that by default it does culture-dependent string comparisons, while Contains() only does an ordinal comparison (compares byte by byte). It is easy to make IndexOf() perform an ordinal comparison as well (with the StringComparison.Ordinal parameter), in which case it has the same performance as Contains().

Plus, IndexOf() is a lot more powerful than Contains(): it allows optionally ignoring case and searching specifically for single chars, which is even faster. So the best/most powerful/fastest method to use is IndexOf(). It's just that "Contains()" is much more self-explanatory when read in the code, so I personally tend to use it more often.
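
The IndexOf() variants mentioned here might look like the sketch below. It is a minimal illustration rather than a benchmark; cText and its content are made-up sample data, and the c'#' character literal syntax is assumed to be available in the dialect in use.

```xsharp
USING System

FUNCTION Start() AS VOID
    LOCAL cText AS STRING          // illustrative sample data
    cText := "The X# runtime"

    // Contains(string) always performs an ordinal comparison
    ? cText:Contains("run")

    // IndexOf(string) alone is culture-dependent; StringComparison.Ordinal
    // requests the same ordinal comparison that Contains() uses
    ? cText:IndexOf("run", StringComparison.Ordinal) >= 0

    // Case-insensitive search, which Contains(string) does not offer
    ? cText:IndexOf("RUNTIME", StringComparison.OrdinalIgnoreCase) >= 0

    // Searching for a single character
    ? cText:IndexOf(c'#') >= 0
    RETURN
```

IndexOf() returns -1 when nothing is found, hence the ">= 0" checks.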

Posted: Fri May 03, 2019 3:13 pm
by lumberjack
Hi Chris,
Chris wrote: Strange part about his tests is that apparently string:Replace() is (slightly) faster than string:Contains()
Yes, I found that quite interesting; my initial thought was that this would be extremely slow...
Chris wrote: then the Contains() method is as expected a lot faster than the "replace" method
Yes, this is what I also expected.
Chris wrote: Also the reason why IndexOf() appears to be slower, is because by default it does culture-dependent string comparisons, while Contains() only does ordinal comparison (compares byte by byte). It is easy to make IndexOf() perform an ordinal comparison as well (with the StringComparison.Ordinal parameter), in which case it has the same performance as Contains().
I think this just highlights some of the issues with the "standard" implementation. IndexOf() is so much more powerful, and I prefer using it, especially instead of doing:

Code: Select all

If myStr:Contains("blah blah")
  iPos := myStr:IndexOf("blah blah")
EndIf
One might think this gives a "slight" performance improvement, but it obviously does not, since the string is searched twice.
Chris wrote: Plus, IndexOf() is a lot more powerful than Contains(), allows for optionally ignoring case and for specifically search for single chars which is even faster, so the best/more powerful/faster method to use is that, IndexOf(). It's just that "Contains()" is much more self explanatory when read in the code, so I personally tend to use it more often.
I also use Contains() quite a lot, but by intelligently making use of IndexOf() one can eliminate the speed penalty.
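
A single IndexOf() call can replace the Contains()/IndexOf() pair, roughly as in the sketch below. The sample string is made up for illustration, and the ordinal comparison is an assumption chosen to match the behaviour of Contains().

```xsharp
USING System

FUNCTION Start() AS VOID
    LOCAL myStr AS STRING
    LOCAL iPos AS INT
    myStr := "some blah blah text"   // illustrative sample data

    // One ordinal search yields both the answer and the position;
    // IndexOf() returns -1 when the substring is absent
    iPos := myStr:IndexOf("blah blah", StringComparison.Ordinal)
    IF iPos >= 0
        ? "Found at zero-based position", iPos
    ELSE
        ? "Not found"
    ENDIF
    RETURN
```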