|
Post by daveylibra on Nov 26, 2019 0:59:01 GMT
I must ask about the way the RND function works. I wrote a simple program to generate a large number of 0s and 1s,
and then count how many series of 2 or more numbers there are, against series of 1 number. The outcome should be the
same, not accounting for variance.
The program runs through a string of 10,000 numbers 20 times.
I was very surprised to find, then, that the total steadily increased! This happened every time I ran the program.
But I was even more surprised when, by chance, I made the program generate an extra random variable using RND.
Note the line B = INT(RND(1)*2)
This variable B is not used. But by including the line, I do get random results!
You can see what I mean if you put REM in front of this line, effectively nullifying it.
Having this line in or out of the program should make no difference, but you can easily see that it does!
If anyone has the time, could you run the program a few times both with and without the line? It's the only way
to see what I mean.
I just cannot understand why! Any ideas?
Many thanks..
Dave
DIM N(10000)
FOR A = 1 TO 20
FOR X=1 TO 10000
N(X) = INT(RND(1)*2)
B = INT(RND(1)*2)
NEXT X
FOR X= 2 TO 9999
IF N(X-1)<>N(X) AND N(X)=N(X+1) THEN TOTAL=TOTAL+1
IF N(X-1)<>N(X) AND N(X)<>N(X+1) THEN TOTAL=TOTAL-1
REM LOOK FOR THE START OF A SERIES, THEN COUNT HOW MANY MORE SERIES >1 THERE ARE.
NEXT X
PRINT TOTAL
NEXT A
|
|
|
Post by B+ on Nov 26, 2019 2:08:02 GMT
Hi Davey,
Your total is increasing because you don't zero it out at the start of another A loop.
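A minimal sketch of that fix, assuming the aim is one independent score per pass of the A loop. It is Davey's listing with a single added line, TOTAL = 0; if a running total is wanted instead, leave that line out.

```basic
DIM N(10000)
FOR A = 1 TO 20
    TOTAL = 0   ' added line: score each pass on its own
    FOR X = 1 TO 10000
        N(X) = INT(RND(1)*2)
        B = INT(RND(1)*2)   ' the unused "throwaway" value from the original post
    NEXT X
    FOR X = 2 TO 9999
        ' a series starts at X; +1 if it continues, -1 if it is a series of 1
        IF N(X-1)<>N(X) AND N(X)=N(X+1) THEN TOTAL = TOTAL + 1
        IF N(X-1)<>N(X) AND N(X)<>N(X+1) THEN TOTAL = TOTAL - 1
    NEXT X
    PRINT TOTAL
NEXT A
```

With the reset in place, each printed value is the score for one string of 10,000 numbers, so positive and negative results can be compared pass by pass.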
|
|
|
Post by B+ on Nov 26, 2019 2:44:19 GMT
I will bet nobody can find a seed that gives more positive totals than negative totals in 40 trials. I tried maybe half a dozen and already know!
'use RANDOMIZE n n between 0 and 1, to create a seed for sharing results that can be replicated.
randomize .71 '.3, .4 least neg so far whew .5!
FOR A = 1 TO 40
    scan
    FOR X=1 TO 10000
        scan
        B = INT(RND(0)*3) - 1
        'print B;
        BTOTAL = BTOTAL + B
    NEXT X
    PRINT BTOTAL
    if BTOTAL < 0 then NEG = NEG + 1 else POS = POS + 1
    BTOTAL = 0 'zero total here if you don't want it to accumulate
NEXT A
print "Negative totals ";NEG;" Positive totals ";POS
print "Done"
|
|
|
Post by B+ on Nov 26, 2019 3:42:38 GMT
JB RND stinks but you can get better distributions like this:
'use RANDOMIZE n n between 0 and 1, to create a seed for sharing results that can be replicated.
randomize .7
' pos .1, .5, .25, .95, .7
' neg .65, .35, .83
' exact split!!! .2
dim d(9)
for i = 0 to 9 : d(i) = i : next '<<<<<<<<<<<<<<<<<<<<< set up digits deck for shuffling in between draws of digit
for j = 1 to 20 'take a peek at some shuffles
    for i = 0 to 9 : print d(i);","; : next : print
    call shuffle
next
FOR A = 1 TO 40
    scan
    FOR X=1 TO 10000
        scan
        call shuffle
        if d(1) < 5 then BTOTAL = BTOTAL - 1 else BTOTAL = BTOTAL + 1 ' <<<<< always draw same digit after shuffle
    NEXT X
    PRINT BTOTAL
    if BTOTAL < 0 then NEG = NEG + 1 else POS = POS + 1
    BTOTAL = 0 'zero total here if you don't want it to accumulate
NEXT A
print "Negative totals ";NEG;" Positive totals ";POS
print "Done"
sub shuffle ' shuffle the digits Fisher-Yates algo
    for i = 9 to 1 step -1
        r = int(rnd(0) * (i + 1))
        t = d(i) : d(i) = d(r) : d(r) = t
    next
end sub
|
|
|
Post by tenochtitlanuk on Nov 26, 2019 10:19:56 GMT
<rant> Just an aside- to say JB's rnd() stinks is overkill and offensive. Its small bias has been investigated over the years with just about every statistical tool we users can throw at it. On my web site you'll find examples, including chi-squared tests done on the output.
For 99% of use- usually games- it is fine. And for most 'Monte Carlo' science simulations. For fun look up the history of bad pseudo-random generators, from the likes of Microsoft and IBM.
You always have the option of running your JB code in 'big brother' Liberty BASIC, where you can call a Mersenne Twister dll and you'd really struggle to find any bias. ( LB is a free download, but with a nag screen. I'd strongly recommend paying for it- Carl has put thousands of hours into developing his languages and it has not made him rich!) </rant>
|
|
|
Post by B+ on Nov 26, 2019 15:00:38 GMT
LOL maybe "across the pond" this does sound harsh, I don't know. I do know when something doesn't smell right, in Ohio we say "it stinks" when we sense something ain't right but can't say for sure, with Chi-Squared and all, why.
I remember trying out the old classic game Hammurabi (sp?); in a short period of time it was obvious RND was off.
But I offer a way to fix it right here, right now with JB! without need to resort to extreme measures ;D
It is probable that Davey has picked up on the scent, he throws away every other RND and senses better results! As he rightly points out, that ain't supposed to happen.
But again, I offer a way to fix it right here, right now with JB! without need to resort to extreme measures ;D
|
|
|
Post by tsh73 on Nov 26, 2019 17:15:06 GMT
Probably you already told that story - what exactly seemed off running Hammurabi? If so, I completely forgot. Still, I think it's unusual.
I found that thread in the archive backup. Now, if I manage to make it readable, I'll repost it here...
|
|
|
Post by B+ on Nov 26, 2019 18:07:46 GMT
As I recall, it was negatively biased, and poor Hammurabi: no matter what he did, disaster would soon wipe out his kingdom. Not a good and fun game of skillful decision making if RND is off.
|
|
|
Post by B+ on Nov 26, 2019 18:19:10 GMT
I tried to run Davey's program as intended by him with a fixed RND. In my estimation you are equally likely to get a repeat 0 or 1 as NOT, after some sequence of 0's and 1's is broken, which is what I think he is doing with his code example. When I ran his original code with TOTAL = 0 after each trial, there was a definite bias, which the throwaway of B did seem to help, strange as that is. So here is a code test with a fixed RND; so far the limited results look better, but the testing process is so tedious...
' Here is Davey's experiment using a fixed RND number technique
randomize .1 ' commented block so results can be replicated for sharing 13, 6, 1
'randomize .1 ' uncommented block so results can be replicated for sharing 7, 12, 1

'randomize .2 ' commented block so results can be replicated for sharing 8, 10, 2
'randomize .2 ' uncommented block so results can be replicated for sharing 12, 8, 0

'randomize .3 ' commented block so results can be replicated for sharing 8, 10, 2
'randomize .3 ' uncommented block so results can be replicated for sharing 12, 8, 0

dim d(9)
for i = 0 to 9 : d(i) = i : next

DIM N(10000)
FOR A = 1 TO 20
    FOR X=1 TO 10000

        'replace N(X) = INT(RND(1)*2) ' <<<<< we want Random 0's and 1's
        call shuffle
        if d(1) < 5 then R = 0 else R = 1 ' d(1) is choice of 10 digits 5 are under 5, 5 are >= 5
        N(X) = R

        ''comment this block on and off
        ''create throw away random number to see if it makes any difference in overall results IT SHOULD NOT
        'call shuffle
        'if d(1) < 5 then R = 0 else R = 1
        'B = R

    NEXT X
    FOR X= 2 TO 9999
        IF N(X-1)<>N(X) AND N(X)=N(X+1) THEN TOTAL=TOTAL+1
        IF N(X-1)<>N(X) AND N(X)<>N(X+1) THEN TOTAL=TOTAL-1
        REM LOOK FOR THE START OF A SERIES, THEN COUNT HOW MANY MORE SERIES >1 THERE ARE.
    NEXT X
    PRINT TOTAL
    if TOTAL > 0 then POSITIVES = POSITIVES + 1
    if TOTAL < 0 then NEGATIVES = NEGATIVES + 1
    if TOTAL = 0 then TIES = TIES + 1
    TOTAL = 0 ' <<<<<<<<<<<<<<<<<<<<< I am pretty sure Davey wants this to happen
NEXT A
print "Negative Totals = ";NEGATIVES;" Positive Totals = ";POSITIVES;" Tied Totals = ";TIES

sub shuffle ' shuffle the digits Fisher-Yates algo
    for i = 9 to 1 step -1
        r = int(rnd(0) * (i + 1))
        t = d(i) : d(i) = d(r) : d(r) = t
    next
end sub
The results seemed mixed much better.
|
|
|
Post by B+ on Nov 26, 2019 18:24:45 GMT
Here is another demo of bias. Challenge: find a RANDOMIZE seed that DOES NOT result in more negative totals than positive!
And I am even giving positives all the ties!!!
'use RANDOMIZE n n between 0 and 1, to create a seed for sharing results that can be replicated.
randomize .71 '.3, .4 least neg so far whew .5!
FOR A = 1 TO 40
    scan
    FOR X=1 TO 10000
        scan
        B = INT(RND(0)*3) - 1
        'print B;
        BTOTAL = BTOTAL + B
    NEXT X
    PRINT BTOTAL
    if BTOTAL < 0 then NEG = NEG + 1 else POS = POS + 1
    BTOTAL = 0 'zero total here if you don't want it to accumulate
NEXT A
print "Negative totals ";NEG;" Positive totals ";POS
print "Done"
|
|
|
Post by B+ on Nov 26, 2019 19:50:49 GMT
I just ran this and NOTHING, no seed .01 to .99 produced more positives than negatives:
for seed = .01 to 1 step .01
    scan
    print seed
    POS = 0 : NEG = 0
    randomize seed '.3, .4 least neg so far whew .5!
    FOR A = 1 TO 40
        scan
        FOR X=1 TO 10000
            scan
            B = INT(RND(0)*3) - 1
            'print B;
            BTOTAL = BTOTAL + B
        NEXT X
        'PRINT BTOTAL
        if BTOTAL < 0 then NEG = NEG + 1 else POS = POS + 1
        BTOTAL = 0 'zero total here if you don't want it to accumulate
    NEXT A
    'print "Negative totals ";NEG;" Positive totals ";POS
    if POS > NEG then print "Positive outcome with seed ";seed
next
|
|
|
Post by B+ on Nov 26, 2019 21:09:22 GMT
And this proves bias is fixed:
'use RANDOMIZE n n between 0 and 1, to create a seed for sharing results that can be replicated.
dim d(9)
for i = 0 to 9 : d(i) = i : next
for seed = .01 to 1 step .01
    scan
    'print seed
    POS = 0 : NEG = 0
    randomize seed '.3, .4 least neg so far whew .5!
    FOR A = 1 TO 40
        scan
        FOR X=1 TO 10000
            scan
            call shuffle
            if d(1) < 5 then B = -1 else B = 1
            'print B;
            BTOTAL = BTOTAL + B
        NEXT X
        'PRINT BTOTAL
        if BTOTAL < 0 then NEG = NEG + 1 else POS = POS + 1
        BTOTAL = 0 'zero total here if you don't want it to accumulate
    NEXT A
    'print "Negative totals ";NEG;" Positive totals ";POS
    if POS > NEG then print "Positive outcome with seed ";seed
next
sub shuffle ' shuffle the digits Fisher-Yates algo
    for i = 9 to 1 step -1
        r = int(rnd(0) * (i + 1))
        t = d(i) : d(i) = d(r) : d(r) = t
    next
end sub
And yes, it takes just a bit longer to run ;-)) But what do you want, quality or a quick pile of junk? Output:
|
|
|
Post by Rod on Nov 26, 2019 21:52:07 GMT
I'm not sure just yet, have not had much time, but I don't think the rnd() function is as bad as you say. Yes, it does have a slight bias, and yes, it does have a narrow seeding problem when first run. Those are the two issues I know about. For all BASIC games and normal human randomness it is perfectly adequate. Only mathematicians get picky.
In your program I think you are accumulating floating point errors. Have a look at these two variants.
for seed = .01 to 1 step .01
    scan
    print seed
    POS = 0 : NEG = 0
    randomize seed '.3, .4 least neg so far whew .5!
    FOR A = 1 TO 10
        scan
        FOR X=1 TO 1000
            scan
            B = INT(RND(0)*3) - .995
            'print B;
            BTOTAL = BTOTAL + B
        NEXT X
        'PRINT BTOTAL
        if BTOTAL < 0 then NEG = NEG + 1 else POS = POS + 1
        BTOTAL = 0 'zero total here if you don't want it to accumulate
    NEXT A
    print "Negative totals ";NEG;" Positive totals ";POS
    if POS > NEG then print "Positive outcome with seed ";seed
next
print
print "________________________"
print
for seed = .01 to 1 step .01
    scan
    print seed
    POS = 0 : NEG = 0
    randomize seed '.3, .4 least neg so far whew .5!
    FOR A = 1 TO 10
        scan
        FOR X=1 TO 1000
            scan
            B = INT(RND(0)*3) - 1
            'print B;
            if B>0 then BTOTAL = BTOTAL + 1
            if B<0 then BTOTAL = BTOTAL - 1
        NEXT X
        'PRINT BTOTAL
        if BTOTAL < 0 then NEG = NEG + 1 else POS = POS + 1
        BTOTAL = 0 'zero total here if you don't want it to accumulate
    NEXT A
    print "Negative totals ";NEG;" Positive totals ";POS
    if POS > NEG then print "Positive outcome with seed ";seed
next
|
|
|
Post by tenochtitlanuk on Nov 26, 2019 23:16:33 GMT
Fascinated by your replacement of rnd() with a sub 'shuffle' which can only produce between .0123456789 ( lowest) and .9876543210 ( highest). I'd not expect that to average 0.50000000000 since it only samples a subset of all the ten-digit possibilities, while ignoring any that have repeated use of digits and others not used...
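One way to check that concern directly is to tally which digit lands in d(1) over many shuffles. With a uniform RND, a Fisher-Yates shuffle leaves each of the ten digits in any given slot with probability 1/10, so d(1) < 5 should come up about half the time even though the deck never repeats a digit. A sketch only: the names tally, draws and low are made up for this example, and the sub is copied from B+'s posts.

```basic
' Tally which digit ends up in d(1) after each shuffle.
dim d(9)
dim tally(9)
for i = 0 to 9 : d(i) = i : next i

draws = 20000   ' hypothetical sample size, raise it for a steadier estimate
for s = 1 to draws
    call shuffle
    k = d(1)
    tally(k) = tally(k) + 1
next s

low = 0
for i = 0 to 9
    print i; " : "; tally(i) / draws
    if i < 5 then low = low + tally(i)
next i
print "fraction with d(1) < 5 = "; low / draws
end

sub shuffle ' Fisher-Yates shuffle of the global array d()
    for i = 9 to 1 step -1
        r = int(rnd(0) * (i + 1))
        t = d(i) : d(i) = d(r) : d(r) = t
    next i
end sub
```

If the ten fractions stray well away from 0.1, the problem lies in RND or in the shuffle, not in the idea of drawing from a ten-digit deck.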
I'm really not sure what we are supposed to be testing here... implementations of randoms; statistics of randoms; or statistics of digit sequences. What, in simple English, is the bias you say is so striking? eg 'in many calls to LB's rnd() I find that I get x% more values below 0.5 than above it' or ' I find that when used to generate -1 and +1 at random but with equal probability the average is not 0.0' or 'I find that if I look at the result of calling rnd() repeatedly and looking at the third digit, the following digit is not equally likely to be any of 0 to 9'.
EDIT My first Hamurabi was on a Commodore PET 8K. My version in LB a decade or so ago showed no significant biases- but half the fun was adding things like random plagues of locusts, invasions, or sex-strikes by the women so no children born. Educational for the school kids we developed these variations with!
PPS My surname is Fisher, My father Ronald was a top-grade mathematician between the World Wars but not the R Fisher whose statistics work is now suspect. And none of the three did any shuffling..
|
|
|
Post by daveylibra on Nov 26, 2019 23:25:58 GMT
Hi all, thanks for all the testing! Just to explain further what I was trying to do: a series of length 1 is a single 0 or 1 whose neighbours are different. So, the 0 in 1,1,0,1,1 is a series of 1. Anything longer is obviously a series of 2 or more. Now, the maths is (I've had someone check this for me): in a string of numbers of length n, there are on average (n/4)+0.5 series of 1, and the total number of series is on average (n/2)+0.5. E.g., if n=100 then the average number of series of 1 is 25.5 and the average total number of series is 50.5, so series of 2 or more average 25. So there should be slightly more series of 1 than series of 2 or more put together. This difference becomes more insignificant as n increases.
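Those two averages can be checked without any RND at all by walking through every possible string of a short length. The sketch below does that for n = 10 (all 1024 strings) and prints the averages next to the formulas. The variable names are made up for this example, and it assumes every string is equally likely, which is what a fair generator should give.

```basic
' Exhaustive check of the series averages over every 0/1 string of length n.
n = 10
dim b(n + 1)            ' b(n + 1) is only a spare slot so the test below never reads out of range
numStrings = 2 ^ n
runs = 0 : singles = 0

for k = 0 to numStrings - 1
    ' unpack k into the bits b(1)..b(n)
    v = k
    for i = 1 to n
        b(i) = v mod 2
        v = int(v / 2)
    next i

    ' walk the string; a series closes when the value changes or the string ends
    runLen = 1
    for i = 2 to n + 1
        if i <= n and b(i) = b(i - 1) then
            runLen = runLen + 1
        else
            runs = runs + 1
            if runLen = 1 then singles = singles + 1
            runLen = 1
        end if
    next i
next k

print "average number of series      = "; runs / numStrings; "   formula (n/2)+0.5 = "; n / 2 + 0.5
print "average number of series of 1 = "; singles / numStrings; "   formula (n/4)+0.5 = "; n / 4 + 0.5
```

The extra 0.5 in the series-of-1 formula comes from the two end positions, which have only one neighbour each. The loop FOR X = 2 TO 9999 in the original program skips the ends, so within that loop a continuing series and a series of 1 are equally likely and TOTAL should average zero.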
Having said all this, I thought I would test this out with a program. The line IF N(X-1)<>N(X) just tests for the beginning of a sequence, i.e. a swap from 0 to 1 or from 1 to 0. I'm sure you get the idea.
I deliberately let TOTAL run, and printed it every so often, which was easier for me than plotting a graph. I expected TOTAL to hover around 0. I was very surprised when it increased steadily every run! If this were true, we could bet on a 0 after every ...1,0 and on a 1 after every ...0,1 and win in the long run. (A game of heads or tails, anyone?)
By chance, when adapting the program, I included the "throwaway" RND variable, which, to further surprise, does seem to make the results random.
How this can work is a complete mystery. After all, if there is a bias in RND, how can creating an unused RND "every other go" make any difference? But, if it works, it is a simpler, and shorter, solution than the complicated efforts with seeds that are a bit beyond me.
I even had the program count the number of 0s and 1s both using the throwaway RND and not using it. I didn't notice bias.
I guess we can never solve the mystery!
PS I was testing statistics in a sequence, but this actual question is about RND, and the fact that generating an extra, unused RND inside a loop seems to solve the bias problem!
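To put the effect in plain terms, it may help to measure one number: after the value has just changed (…1,0 or …0,1), how often does the next value repeat? For independent fair bits that should be close to 0.5, and a steadily rising TOTAL corresponds to a value above 0.5. The sketch below measures it twice, once using every RND value and once discarding every other one. The names trials, throwaway, starts and repeats are made up for this example. It only measures the effect and does not explain it; one guess, not established anywhere in this thread, is that neighbouring RND values are slightly correlated, in which case skipping values changes which ones end up next to each other.

```basic
' Estimate P(next value repeats | the value has just changed),
' with and without a discarded RND call between draws.
trials = 100000
for throwaway = 0 to 1
    starts = 0 : repeats = 0
    a = int(rnd(1)*2)
    if throwaway = 1 then junk = rnd(1)
    b = int(rnd(1)*2)
    for x = 1 to trials
        if throwaway = 1 then junk = rnd(1)   ' the unused extra draw
        c = int(rnd(1)*2)
        if a <> b then
            ' b starts a new series; does c continue it?
            starts = starts + 1
            if b = c then repeats = repeats + 1
        end if
        a = b : b = c
    next x
    if starts > 0 then print "throwaway = "; throwaway; "   repeat after a change: "; repeats / starts
next throwaway
```

Running it with a few different RANDOMIZE seeds would show whether the gap between the two lines is consistent or just noise.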
|
|