xsharp.eu • Comparison of (Ansi) strings

Comparison of (Ansi) strings

Posted: Wed Feb 26, 2020 4:43 pm
by ArneOrtlinghaus
How can I influence the sorting of strings?
The following code does not give the same order as in VO. In VO I get the same order as the input array (ordered by the internal ASCII value). In X# I get something like chr(1), chr(2), ..., chr(6), chr(149), chr(7), ..., chr(15), chr(164), chr(16), ..., chr(19), chr(182), chr(20), chr(167), chr(21), ..., chr(33), chr(147), chr(148), chr(168), chr(34), ...

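LOCAL a AS ARRAY
LOCAL c AS STRING
LOCAL i AS DWORD
a := {}
FOR i := 1 UPTO 255
    c := Chr(i) + "_" + NTrim(i)   // character followed by its code, e.g. "A_65"
    AAdd(a, c)
NEXT

ASort(a)
msginfo(array2string(a, crlf))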

Posted: Wed Feb 26, 2020 7:12 pm
by Chris
Hi Arne,

Is this real text that you are trying to sort, or is it binary data? Because strings in .NET are Unicode, Chr() for values 128...255 does not return a char with that Unicode value; it returns the Unicode char that corresponds to that ANSI value, so you cannot directly compare these routines in VO and X#.

If it's binary data that you are working with (it does look like it), then I think the best way to deal with it is to create your own sorting and/or use standard .NET functions like String.Compare().
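
To illustrate the point about Chr() and Unicode, here is a minimal sketch that is not taken from the X# runtime. It assumes the ANSI code page is Windows-1252 and that the program targets the .NET Framework, where Encoding.GetEncoding(1252) works without extra setup. It decodes three ANSI bytes, prints the Unicode code points they turn into, and then sorts a small list with StringComparer.Ordinal as one example of a standard .NET comparison.

```xsharp
USING System
USING System.Collections.Generic
USING System.Text

FUNCTION Start() AS VOID
    // Assumption: Windows-1252 is the ANSI code page in use
    LOCAL enc := Encoding.GetEncoding(1252) AS Encoding
    LOCAL raw := <BYTE>{149, 164, 255} AS BYTE[]
    LOCAL s := enc:GetString(raw) AS STRING

    FOREACH VAR ch IN s
        // Byte 149 becomes U+2022; bytes 164 and 255 keep their values (U+00A4, U+00FF)
        Console.WriteLine("U+" + Convert.ToInt32(ch):ToString("X4"))
    NEXT

    // Ordinal sort: compares UTF-16 code units and ignores culture rules
    LOCAL items := List<STRING>{} AS List<STRING>
    items:Add("b")
    items:Add("A")
    items:Add("a")
    items:Sort(StringComparer.Ordinal)
    Console.WriteLine(System.String.Join(",", items))   // A,a,b
    RETURN
```

Because byte 149 ends up at U+2022, even an ordinal comparison of the decoded strings does not reproduce the ANSI byte order for every value in the 128...159 range.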

Posted: Wed Feb 26, 2020 7:18 pm
by Terry
Arne

Don't know why things are different in VO.

But format the composite string (c in this case) in the order you want.

for example c:=ntrim(i) + chr(i).

Convert i to a string and pad it left before adding it to the array.

This should work for any string complexity. Others may know a better way.
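
A minimal sketch of the padding step, assuming a fixed width of three digits is enough for the values involved (the width and the sample values are illustrative choices, not from the original code):

```xsharp
USING System

FUNCTION Start() AS VOID
    FOREACH VAR i IN <INT>{2, 10, 149}
        // PadLeft gives "002", "010", "149"; without padding, "10" sorts before "2"
        Console.WriteLine(i:ToString():PadLeft(3, c'0'))
    NEXT
    RETURN
```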

Terry.

Posted: Thu Feb 27, 2020 8:07 am
by ArneOrtlinghaus
Hi Chris and Terry,

In VO we use this technique to order records by several columns: the columns are combined into one sort string, which is then used by a Quicksort procedure. For descending columns, each ANSI character is replaced by chr(255)-chr(xxx). We probably have other places where the sorting depends on the ANSI values.

Now I am looking for something similar. I have looked into the DLL Xsharp.core and tried to understand how ASC and CHR work. They use a helper class, stringhelpers, which contains a method CompareWindows that seems to be what I need. Is it possible to use this method?

Arne

Posted: Thu Feb 27, 2020 8:25 am
by Chris
Hi Arne,

Chr(255) - Chr(xxx) will not work in .Net, unless you only use standard Latin chars (< 128 ascii), so no umlauts etc. Why don't you just sort in reverse order?

Posted: Thu Feb 27, 2020 8:39 am
by ArneOrtlinghaus
Hi Chris,
Some columns are sorted in ascending order and some in descending order, as the user selects. If the first column is descending, I can simply sort in reverse order. But if other columns have a different direction than the first, I have to "reverse" the character sequence of that part of the key. Until now this has also worked quite well for the Western European characters above ASCII 128.

Posted: Thu Feb 27, 2020 9:55 am
by Terry
Hi Arne

>> There a helper class stringhelpers is used. This class contains a method CompareWindows, which seems to be what I could need. Is it possible to use this method?

Sorry I can't answer that - I just don't know.

But the way I suggested will work, and with suitable conditional extractions you'll be able to sort things in whatever way or ways you like.

You are only sorting 256 strings, which is not much, so the following does not apply here. But it is worth bearing in mind if you ever do the same thing over thousands: strings are immutable, and some forced garbage collection may improve performance.

It may be that the "CompareWindows" method includes something like that.

Terry

Posted: Thu Feb 27, 2020 10:00 am
by ArneOrtlinghaus
A workaround now works: transforming the single bytes into hex strings. Indeed, it is more a binary comparison than a real string comparison. I have found some other places that I have to look at.
There are places where we have used chr(255), assuming that it is a placeholder for the last possible character. In my tests in X#, chr(255) = ÿ comes at position 225 and sorts, for example, before Ö chr(214) and Ü chr(220), which are used in German.
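
The hex workaround could look roughly like the sketch below. AnsiHexKey is a hypothetical helper name, not the code actually used and not part of the X# runtime, and Windows-1252 is assumed as the ANSI code page. Each ANSI byte becomes two hex digits, so comparing the keys follows the byte values; the lDescending flag complements each byte, which mirrors the chr(255)-chr(xxx) idea in byte space and assumes fixed-width columns.

```xsharp
USING System
USING System.Text

// Hypothetical helper, not part of the X# runtime.
// Returns two hex digits per ANSI byte; lDescending complements each byte.
FUNCTION AnsiHexKey(s AS STRING, lDescending AS LOGIC) AS STRING
    // Assumption: Windows-1252 is the ANSI code page in use
    LOCAL enc := Encoding.GetEncoding(1252) AS Encoding
    LOCAL sb := StringBuilder{} AS StringBuilder
    FOREACH VAR b IN enc:GetBytes(s)
        LOCAL v := Convert.ToInt32(b) AS INT
        IF lDescending
            v := 255 - v
        ENDIF
        sb:Append(v:ToString("X2"))
    NEXT
    RETURN sb:ToString()

FUNCTION Start() AS VOID
    Console.WriteLine(AnsiHexKey("Ö", FALSE))   // D6
    Console.WriteLine(AnsiHexKey("ÿ", FALSE))   // FF, sorts after D6
    Console.WriteLine(AnsiHexKey("ÿ", TRUE))    // 00, sorts first when reversed
    RETURN
```

Since the keys contain only the digits 0-9 and the letters A-F, they should sort the same way under any of the collation settings, with ÿ after Ö and Ü in ascending order.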

Posted: Fri Feb 28, 2020 4:36 pm
by ArneOrtlinghaus
Looking again at the code, I have found some switches that influence string comparisons:
- The compiler switch /VO13 (Compatible string comparisons)
- If it is set, the SetExact() and SetCollation() settings.

In our case /VO13 is set and SetExact is false. Testing the collation modes SetCollation(#CLIPPER), SetCollation(#ORDINAL), SetCollation(#WINDOWS) and SetCollation(#UNICODE) changed the sorting behavior, but none of them reproduced the same result as in VO.

Posted: Fri Feb 28, 2020 5:07 pm
by Chris
Hi Arne,

Yes, for compatibility with VO you need to use /vo13, as with almost every other VO compatibility option. Also note that #ORDINAL and #UNICODE did not exist as collation settings in VO, so you need to use either #CLIPPER or #WINDOWS.

Also please remember the fundamental difference between VO and X#: VO has 8-bit strings, while X# (.NET) has Unicode ones. String comparison in X# was designed to give correct (VO-compatible) results when "real" text is used, so it makes the necessary ANSI<->Unicode conversions. It was not designed for binary data, and using Chr(255)-Chr(realchar) effectively turns the text into binary data.

Have you seen any difference between the behavior of X# and that of VO with text data? If you have, can you please post a sample so we can look into it?