|
Post by tsh73 on Jan 23, 2020 13:57:04 GMT
Indeed! I tried it, and it really does behave better on the 1/0 problem.
Nice thing to know - and a great find!
|
|
|
Post by B+ on Jan 23, 2020 16:27:23 GMT
casco wrote:
"We were lucky Chris Iverson wrote the Mersenne Twister dll for us."
There is no need for Mersenne Twister. I amuse myself with various random models and was very unhappy to find out years ago that the JB RNG had a noticeable bias. So I looked into the problem and found that the RND function preferred smaller values. Whenever I generated a set of integers from 0 to 9, the result failed every chi-square test I ran on it, and always for the same reason: the distribution was skewed, with higher values less frequent to a statistically significant degree. Seeing that, I applied the simple remedy of discarding the first digit after the decimal point. For example, if RND(1) returns 0.628251, you drop the 6 to get 0.28251. Here is the short adjustment function I have been using ever since:
function rand()
    r=10*rnd(1)
    rand=r-int(r)
end function
I ran lots of different tests, and the adjustment held up in every one. Try it for yourself to see...

Outstanding, I will try it next time I need Rnd for JB, thanks!
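The chi-square check described above can be sketched in Just BASIC along these lines. It is only a minimal sketch, not the test that was actually run: the 100000 draws and the names freqRaw(), freqAdj(), chiRaw and chiAdj are assumptions made for illustration, and 16.92 is the standard 5% critical value for 9 degrees of freedom.

```basic
' Chi-square check of digit frequencies 0..9 (illustrative sketch).
' Assumptions: 100000 draws; all variable names are hypothetical.
dim freqRaw(9)
dim freqAdj(9)
draws = 100000

for i = 1 to draws
    d = int(10*rnd(1))          ' digit from the native generator
    freqRaw(d) = freqRaw(d) + 1
    d = int(10*rand())          ' digit from the adjusted generator
    freqAdj(d) = freqAdj(d) + 1
next i

expected = draws/10             ' each digit should turn up equally often
for d = 0 to 9
    chiRaw = chiRaw + (freqRaw(d) - expected)^2/expected
    chiAdj = chiAdj + (freqAdj(d) - expected)^2/expected
    print d; "   rnd(1): "; freqRaw(d); "   rand(): "; freqAdj(d)
next d

print "chi-square, rnd(1): "; chiRaw
print "chi-square, rand(): "; chiAdj
print "5% critical value for 9 degrees of freedom: 16.92"
end

' The adjustment from the post: drop the first decimal digit.
function rand()
    r = 10*rnd(1)
    rand = r - int(r)
end function
```

A statistic well above 16.92 suggests the digits are not uniform; a single run proves little either way, so it is worth repeating a few times.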
|
|
casco
New Member
Posts: 16
|
Post by casco on Jan 25, 2020 7:39:01 GMT
Thanks for sharing. I still value Chris though! Rod, I think you are right in preferring Mersenne Twister if it can be implemented one way or another in a programming language. There is no MersTwistJB file available, and that was the reason I said that there was actually no need for that particular PRNG.
|
|
|
Post by Rod on Jan 25, 2020 7:57:50 GMT
I really prefer simple. rnd(0) is perfectly adequate for most of my needs, indeed up to now all my needs. Folks should not worry too much about using it. Your solution is a really simple fix for the slight bias. I can see myself using it. Less so the twister.
|
|
|
Post by casco on Feb 1, 2020 5:48:02 GMT
Rod wrote: I really prefer simple. rnd(0) is perfectly adequate for most of my needs, indeed up to now all my needs. Folks should not worry too much about using it. Your solution is a really simple fix for the slight bias. I can see myself using it. Less so the twister.

Rod, is the Mersenne Twister RNG available to you? I wonder how it performs in the case I will try to describe. You generate a random sequence of 1's and 0's. The sequence goes on and on and terminates only when the RNG returns ten identical digits (ones or zeroes) in a row. If you repeat the process one hundred times, you would expect the terminating strings to be about evenly divided between ten 1's and ten 0's, with each count staying, within the expected variability, above 40 and below 60. This is not the case at all with the JB RNG, though. Put under this much pressure, it returns heavily biased results: if you run those 100 samples 20 times, 0000000000 almost always prevails over 1111111111 as the terminating string in each batch of 100. I managed to come up with code that illustrates it. And I wonder... what if the twister suffers from a similar bias? I should add that I do agree that the JB RNG is there for practical applications and is not meant to be an object to be messed with one way or the other.
for k=1 to 20 '-----------------------------K
for n=1 to 100 '--------------------N
a$=""
for i=1 to 10
r=int(2*rnd(1))
a$=a$+str$(r)
next i
do
sum=0
a$=right$(a$,9)
a$=a$+str$(int(2*rnd(1)))
for i=1 to 10
sum=sum+val(mid$(a$,i,1))
next
loop until sum=10 or sum=0
select case sum
case 10: one=one+1
case 0: zero=zero+1
end select
next n '----------------------------N
print "1's ";one
print "0's ";zero
if zero=one then
print "tie"
goto [skip]
end if
if zero<one then
print "1111111111 win"
else
print "0000000000 win"
end if
[skip]
one=0 : zero=0
print
next k '------------------------------------K
print "__"
end
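As a rough yardstick for the "above 40 and below 60" band: assuming an unbiased generator, each run is equally likely to end in ten 1's or ten 0's by symmetry, so the number of 1111111111 endings in a batch of 100 is binomial. The symbols X, n and p are introduced here only for illustration.

```latex
\begin{align*}
X &\sim \mathrm{Bin}(n, p), \qquad n = 100,\quad p = \tfrac{1}{2} \\
\mathbb{E}[X] &= np = 50 \\
\sigma_X &= \sqrt{np(1-p)} = \sqrt{25} = 5 \\
40 \le X \le 60 &\iff |X - 50| \le 2\sigma_X
\end{align*}
```

So the 40 to 60 band is a two-sigma interval, which an unbiased generator should stay inside in roughly 19 batches out of 20. One batch slightly outside it is unremarkable; 0000000000 winning almost every batch is not.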
|
|
|
Post by tenochtitlanuk on Feb 1, 2020 11:42:28 GMT
There's little point adding new tests which just show the old JB/LB PRNG is biassed! I've done a LOT of playing with assorted ways to generate and test PRNGs. It is of course more fun with the ones that are biassed, repeat, or cluster in multi-dimensioned k-space!
Move to LB (free) and use Mersenne... I got
1's 42     0's 58     0000000000 win
1's 54     0's 46     1111111111 win
1's 61     0's 39     1111111111 win
1's 43     0's 57     0000000000 win
1's 49     0's 51     0000000000 win
1's 46     0's 54     0000000000 win
1's 43     0's 57     0000000000 win
1's 60     0's 40     1111111111 win
1's 55     0's 45     1111111111 win
1's 45     0's 55     0000000000 win
1's 55     0's 45     1111111111 win
1's 47     0's 53     0000000000 win
1's 50     0's 50     tie
1's 41     0's 59     0000000000 win
1's 45     0's 55     0000000000 win
1's 59     0's 41     1111111111 win
1's 53     0's 47     1111111111 win
1's 46     0's 54     0000000000 win
1's 47     0's 53     0000000000 win
1's 44     0's 56     0000000000 win
__
NB The following version runs in LB not JB, and you need the Mersenne dll we repeatedly refer to.
open "MersTwistLB" for dll as #MT
calldll #MT, "SeedWithTime", ret as void

for k =1 to 20 '-----------------------------K
    for n =1 to 100 '--------------------N
        a$ =""
        for i =1 to 10
            r =int( 2 *rand( 1))
            a$ =a$ +str$( r)
        next i
        do
            sum =0
            a$ =right$( a$, 9)
            a$ =a$+str$( int( 2 *rand( 1)))
            for i =1 to 10
                sum =sum +val( mid$( a$, i, 1))
            next
        loop until sum =10 or sum =0
        select case sum
            case 10: one =one +1
            case 0: zero =zero +1
        end select
    next n '----------------------------N
    print "1's ";one,
    print "0's ";zero,
    if zero =one then
        print "tie"
        goto [skip]
    end if
    if zero <one then
        print "1111111111 win"
    else
        print "0000000000 win"
    end if
    [skip]
    one =0 : zero =0
next k '------------------------------------K
print "__"
close #MT
end

function rand( dummy)
    'getRandomFloat() returns a float.
    calldll #MT, "getRandomFloat", rand as double
end function
|
|
|
Post by casco on Feb 2, 2020 0:20:06 GMT
tenochtitlanuk wrote: There's little point adding new tests which just show the old JB/LB PRNG is biassed! I've done a LOT of playing with assorted ways to generate and test PRNGs. It is of course more fun with the ones that are biassed, repeat, or cluster in multi-dimensioned k-space! Move to LB (free) and use Mersenne...

Thanks a lot. Given my needs, I think I'll stick with JB for the time being. Btw, your version of the code is really neatly written. No wonder I usually get lost in mine after fifty lines or so.
|
|
|
Post by tenochtitlanuk on Feb 2, 2020 20:43:08 GMT
It was indeed interesting to confirm another aspect of LB/JB's rnd(). Fine for games etc., but it misbehaves when you examine it closely. The screenshot shows the bias in how often a chosen 10-char string turns up ('0000000000' and '1111111111' are used). We again count how many times each of these appears and plot them. The white divider is where 'ties' appear. The numbers increment if a particular pair of results crops up again. The whole block of results approaches a kind of 2D 'normal curve'. A chosen 10-char string should on average appear about once every 2^10 (1024) tries, and equally often for ANY ten-char string you choose. The plot should be symmetrical but isn't, until you either use the native rnd() with the first digit removed and the decimal point moved or, in LB, use Mersenne. All three are shown in the code; just rem in/out a few lines in function rand().
dim result( 200, 200)
WindowWidth =700
WindowHeight =720

open "Scatter graph" for graphics_nsb as #wg

#wg "trapclose quit"
#wg "down ; font 5"
#wg "fill darkblue ; color white ; line 37 37 637 637" ' dividing line where equal numbers of each pattern
#wg "color cyan ; backcolor darkblue"

open "MersTwistLB" for dll as #MT
flag =1
calldll #MT, "SeedWithTime", ret as void

zeros$ ="0000000000"
ones$ ="1111111111"

runLength =10
zeros$ =left$( zeros$, runLength)
ones$ =left$( ones$, runLength)

for l =1 to 5000
    allZerosCount =0
    allOnesCount =0

    for n =1 to 15000
        a$ =""
        for i =1 to runLength
            r =int( 2 *rand( 1))
            a$ =a$ +str$( r)
        next i
        if a$ =zeros$ then allZerosCount =allZerosCount +1
        if a$ =ones$ then allOnesCount =allOnesCount +1
        scan
    next n

    result( allOnesCount, allZerosCount) =result( allOnesCount, allZerosCount) +1

    print runLength; " 1's "; allOnesCount,
    print runLength; " 0's "; allZerosCount,

    if ( allOnesCount =allZerosCount) and ( allZerosCount <>0) then
        print "Both equal times- tie"
        #wg "color white": goto [skip]
    end if

    if allZerosCount <allOnesCount then
        print "all ones wins."
        #wg "color red"
    else
        print "all zeros wins.": #wg "color green"
    end if

    [skip]
    #wg "place "; 10 + 14 *( allOnesCount); " "; 10 +14 *( allZerosCount)
    #wg "down"
    '#wg "\"; right$( " " +str$( result( allOnesCount, allZerosCount)), 3)
    #wg "\"; result( allOnesCount, allZerosCount)
next l

print "__ Done __"

close #MT
flag =0

wait

sub quit h$
    if flag =1 then close #MT
    close #wg
    end
end sub

function rand( dummy)
    'getRandomFloat() returns a float.
    'calldll #MT, "getRandomFloat", rand as double
    'rand =rnd( 1)
    rand =randTrunc()
end function

function randTrunc( )
    R =rnd( 1)
    S =10 *R
    randTrunc =S -int( S)
end function
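For scale, the counts that the code above plots can be estimated as follows. This assumes an unbiased generator and treats the 15000 ten-character strings in each round as independent; the symbols p, n and C are introduced here only for illustration.

```latex
\begin{align*}
p &= 2^{-10} = \tfrac{1}{1024}, \qquad n = 15000 \\
\mathbb{E}[C] &= np = \tfrac{15000}{1024} \approx 14.6 \\
\sigma_C &= \sqrt{np(1-p)} \approx 3.8
\end{align*}
```

So allOnesCount and allZerosCount should each scatter around 14 or 15 with a spread of about 4, and the cloud of plotted points should sit symmetrically across the white diagonal.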
|
|
|
Post by tenochtitlanuk on Feb 4, 2020 18:17:35 GMT
For reassurance of those of us trying out LB5 v350, this was run on one of my Pi's. Can't fault that distribution!
|
|
|
Post by casco on Feb 5, 2020 3:46:56 GMT
tenochtitlanuk wrote: For reassurance of those of us trying out LB5 v350, this was run on one of my Pi's. Can't fault that distribution!

Similar graphics can be helpful in solving problems of "self-terminating" sequences. For example, you have a sequence made of randomly chosen digits from 0 to 9, such as 3766012378... You have a sample of, let's say, 20 such sequences and you record their lengths. Surely there must be an instruction in the code that brings each sequence to a stop, but you don't know what it is. There is a line of code that says 'if something then stop'. What is that something?
The null hypothesis states that the length of each sequence has been randomly chosen, meaning the RNG itself decides the length. The alternative hypothesis says that this is not so. The only way to prove the alternative hypothesis is to find the other mechanism that is responsible for terminating the sequences. There are instances where you have some chance of finding it; there are others, even simple ones, where it is impossible.
Here is a simple example of the possible kind. There are 20 sequences with their lengths recorded and sorted in ascending order:
1, 2, 3, 3, 5, 5, 6, 6, 7, 8, 46, 60, 64, 458, 574, 577, 1274, 1919, 2240, 8600.
The lengths don't seem to be uniformly distributed, so an RNG set to produce integers from 1 to N is very likely not the reason for the stoppage. You can prove it by comparing the ending digits of each sequence with its length, starting with the longest:
length of sequence= 8600, ending digits= ...7016735 8600
length of sequence= 2240, ending digits= ...6622871 2240
length of sequence= 1919, ending digits= ...0061046 1919
and so on.
In each case, the sequence stops when its last digit(s) agree with the number that represents the sequence's length. The biggest clue lies in the shortest sequences, where the agreement between the length and the ending digit(s) is most apparent.
It's usually the distribution of the samples that tips you off to what kind of trick might have been used to stop the sequences.
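The stopping rule in that example can be sketched in Just BASIC. This is only a guess at the kind of code that could produce such sequences, assuming the rule is exactly "stop when the trailing digits spell out the current length". The names tail$, n and n$ are made up for the illustration, and only the last 12 digits are kept so that long sequences stay cheap to build.

```basic
' Hypothetical reconstruction of a "self-terminating" digit sequence.
' Assumed rule: stop as soon as the last digits equal the sequence length.
for s = 1 to 20
    tail$ = ""      ' last few digits of the sequence
    n = 0           ' current length of the sequence
    do
        n = n + 1
        ' append one random digit 0..9, keeping only the last 12 characters
        tail$ = right$(tail$ + str$(int(10*rnd(1))), 12)
        n$ = str$(n)
    loop until right$(tail$, len(n$)) = n$
    print "length of sequence= "; n; ", ending digits= ..."; tail$
next s
end
```

Under this rule roughly six runs in ten stop before length 10 (1 - 0.9^9 is about 0.61), which fits the ten single-digit lengths in the sample of 20.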
|
|
|
Post by tenochtitlanuk on Feb 5, 2020 12:47:27 GMT
Interesting points raised there for me to ponder. Thanks Casco!
|
|
ntech
Junior Member
Posts: 99
|
Post by ntech on Feb 11, 2020 17:12:36 GMT
tenochtitlanuk wrote: For reassurance of those of us trying out LB5 v350, this was run on one of my Pi's. Can't fault that distribution!
Great work! Last time I tried LB on my Pi, it was pretty buggy. I'd love to know when it's pretty much ready.
|
|