|
Post by honky on Feb 16, 2023 12:02:09 GMT
I found (probably here) an algo that removes duplicate strings, but it uses asc() codes, which causes problems with accents. Does anyone have an algo that uses pure string comparisons? Thank you.
|
|
|
Post by tsh73 on Feb 16, 2023 12:54:06 GMT
Give us examples.
What are the inputs: is it an array, or is it one huge string?
How big is it?
What do you need as output? Is the order of the strings important?
What accents are you talking about? - Don't accented characters have different codes? - I'm pretty sure string comparison would treat them as different characters too.
|
|
|
Post by honky on Feb 16, 2023 13:13:08 GMT
@tsh73: The strings are just names, and a comparison on only the first four letters would suffice. The problem with asc() is:
asc("d") = 100
asc("e") = 101
asc("é") = 233 <--- !!!
asc("f") = 102
If we compare the strings as they are, it should solve the problem.
EDIT:
a$="a r de é g tà n"
b$="a r de é g tà n"
if a$=b$ then print "yess"
'It says: "yess"
a$="a à g é t"
b$="a a g é t"
if a$=b$ then
    print "yess"
else
    print "no"
end if
'It says: "no"
|
|
|
Post by tsh73 on Feb 16, 2023 13:26:24 GMT
I've heard that for JB/LB it could depend on the International settings. For me:
c$(1)=chr$(100)
c$(2)=chr$(101)
c$(3)=chr$(233)
c$(4)=chr$(102)
print c$(1)<c$(2) '1- Yes
print c$(2)<c$(3) '1- Yes
print c$(3)<c$(4) '0- No
(and chr$(233) does not look like anything near "e" in my locale)
What numbers does it print for you?
|
|
|
Post by tsh73 on Feb 16, 2023 13:32:40 GMT
So does your example prove that "If we compare the strings as is" will NOT solve your problem? Or that they will not match exactly but will sort right?
Oh, just run a sort on the letters and see for yourself whether the accented characters fall in the right places:
dim c$(256)
for i = 2 to 15
    for j = 0 to 15
        n = i*16+j
        c$(n) = chr$(n)
        print c$(n);
    next
    print
next
print
print "after sort"
print
sort c$(), 0, 255
for i = 2 to 15
    for j = 0 to 15
        n = i*16+j
        print c$(n);
    next
    print
next
Funny. I was not aware my locale has accented characters at all (though they are not used in my language) and sort() orders them reasonably. Good thing to know.
|
|
|
Post by honky on Feb 16, 2023 13:35:58 GMT
233 is not "e" but "é" (with accent).
EDIT: this is not about sorting, it is about eliminating duplicates.
|
|
|
Post by tsh73 on Feb 16, 2023 13:40:15 GMT
So string comparison does not treat "e" (asc = 101) and "é" (asc = 233) as equal.
Do you need them to be considered equal for duplicate removal?
Then I would suggest recoding the strings, changing all variants of "e" to plain "e", and all other accented letters too, and then looking for duplicates.
|
|
|
Post by honky on Feb 16, 2023 14:04:06 GMT
Yes, it is possible to replace the accented letters with standard ones. But I would like an algo that compares the raw strings, a$ = b$ or a$ <> b$. I won't even mention the number of attempts (with transfer arrays) that I have gone back over.
|
|
|
Post by tsh73 on Feb 16, 2023 15:05:58 GMT
But BASIC does not do it.
Write your own function that takes the raw strings and, internally,
. replaces accented characters with ordinary ones
. compares the results.
Return True or False (1 or 0).
That's it.
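A minimal sketch of such a function, assuming a single-byte Windows-1252 code page in which é is chr$(233) and à is chr$(224), as in the codes quoted earlier in the thread. The names stripAccents$ and sameIgnoringAccents are made up for illustration, and the table covers only a handful of lowercase French letters; extend it for whatever your data contains.

```basic
' Sketch only: stripAccents$ and sameIgnoringAccents are hypothetical names.
' Assumes Windows-1252 codes (224 = a grave, 233 = e acute, and so on).
a$ = "a " + chr$(224) + " g " + chr$(233) + " t"
b$ = "a a g e t"
print sameIgnoringAccents(a$, b$)   ' 1 = equal once accents are ignored, 0 = different
end

function stripAccents$(s$)
    ' Accented codes and their plain replacements, matched position by position
    accented$ = chr$(224) + chr$(226) + chr$(231) + chr$(232) + chr$(233) + chr$(234)
    accented$ = accented$ + chr$(235) + chr$(238) + chr$(239) + chr$(244) + chr$(249) + chr$(251)
    plainSet$ = "aaceeeeiiouu"
    r$ = ""
    for i = 1 to len(s$)
        c$ = mid$(s$, i, 1)
        p = instr(accented$, c$)
        ' Swap the character only if it appears in the accented list
        if p > 0 then c$ = mid$(plainSet$, p, 1)
        r$ = r$ + c$
    next i
    stripAccents$ = r$
end function

function sameIgnoringAccents(a$, b$)
    ' The raw strings go in; the comparison is made on accent-free copies
    sameIgnoringAccents = 0
    if stripAccents$(a$) = stripAccents$(b$) then sameIgnoringAccents = 1
end function
```

The accented letters are built with chr$() rather than typed literally, so the sketch does not depend on how the source file is saved. The caller's strings are left unchanged; only the copies inside the function are recoded.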
|
|
|
Post by Rod on Feb 16, 2023 15:10:10 GMT
A string is just a collection of bytes, ASCII bytes. So they will never compare as equal unless you substitute. It won't matter whether you do a string comparison or an asc() code comparison, they will still differ.
a$="ardeégtàn"
b$="ardeégtàn"
if a$=b$ then print "yess"
'It says: "yess"
a$="aàgét"
b$="aagét"
if a$=b$ then
    print "yess"
else
    print "no"
end if
'It says: "no"
for n = 1 to len(a$)
    print asc(mid$(a$,n,1)),asc(mid$(b$,n,1))
next
if a$>b$ then print "greater"
if a$<b$ then print "lesser"
c$(1)=a$
c$(2)=b$
print c$(1),c$(2)
sort c$(), 1, 2
print c$(1),c$(2)
a$=replstr$(a$,chr$(224),chr$(97))
if a$=b$ then
    print "yess"
else
    print "no"
end if
print a$,b$
|
|
|
Post by Rod on Feb 16, 2023 15:13:18 GMT
Can I ask, because we don't use accents in Scotland: how is it that you can get both a and accented a as input? Are both a and accented a on the keyboard? Which of a$ and b$ is correctly input and spelt?
|
|
|
Post by honky on Feb 16, 2023 16:45:47 GMT
On my keyboard (French) the accented letters are on the first row of the keyboard (the digits are typed with Shift), except the "ù" of "où" (where), which is after the "m". And JB distinguishes properly between plain and accented letters. How lucky you are not to have accents.
|
|
|
Post by honky on Feb 16, 2023 17:56:07 GMT
The solution exists and works. I'm holding off on posting it, in case anyone wants to think about it. Tell me when you give your tongue to the cat (that is, give up), and I will post it.
|
|
|
Post by cundo on Feb 16, 2023 21:51:40 GMT
My keyboard has the accent as a separate key, the ¨´{ key. It has 3 functions. If I press that key alone, it doesn't do anything but wait for me to press a vowel. So special key + vowel, pressed one after the other and not simultaneously, writes the accented vowel.
Plus I have the Ñ next to the L. That's all.
|
|
|
Post by honky on Feb 21, 2023 14:58:05 GMT
Due to the shortage of tongues, the cat died of hunger. So here is the solution:
Note: JB considers "é" (accented e) as "e", but distinguishes "â" from "a".
dim k$(500)
dim resu$(500) 'results array, sized like k$() so more than 10 unique words also fit
a$="aa bb cc dd ee ff bbbb cc ée âa dd cc bbbb ee aa " 'a string
n=15
for x=1 to n 'put the words into a single array
    k$(x)=word$(a$,x)
next x
for i=1 to n
    nouveau=1 'assume the word is new until a match is found
    for j = 1 to i-1
        if k$(i) = k$(j) then nouveau=0
    next j
    if nouveau = 1 then
        dedoub = dedoub+1
        resu$(dedoub) = k$(i)
        print(resu$(dedoub))
    end if
next i
|
|