Pandas Python and a little probability

Ok, so after looking through different tutorials, I found out pandas is way easier to handle then SQL.

So for the first step in that machine learning experiment from the last post, I came up with this little piece of code.
It uses random/randomint and pandas as import.
Resources and references are in the code comments.
As always, drop me a comment, hate it, love it, what to improve.
Next step is to get this rolling with more math involved and a bigger sets of probability data to store.
Adding an 'old_odd_probability' and 'old_even_probability' would probably be the next step toward getting that bayes to interpret a bigger picture of when and how random chooses it's *random integers*.

For example, to use the water droplet example from Jurassic Park (assuming you repeat the experiment 5 times)...

https://scienceonblog.wordpress.com/2017/04/13/chaos-theory-in-jurassic-park/
 
 If the droplet tends to roll left off your hand the second time you try, and you do it the same way every second time,  chances are the way in which you dribble that drop of water on your hand during that second event are probably influenced by something that tends to tilt the results toward rolling left.

In the end I want some proof that on a run of random, it will create very similar choices over a long run.  If you've been following my chatbot experiment, it would choose very much the same lines of script every time I ran her.  Maybe it's only on small ranges it does this, that's another theory, but that's also another experiment.




Here's the code:

---Start code block ----


import pandas as pd
#import numpy as np #<-- not used yet
#import matplotlib.pyplot as plt #<-- not used yet
from random import randint

"""
resources:
          1) https://ourcodingclub.github.io/2018/04/18/pandas-python-intro.html
          2) https://stackoverflow.com/questions/31511997/pandas-dataframe-replace-all-values-in-a-column-based-on-condition
          3) https://stackoverflow.com/questions/13842088/set-value-for-particular-cell-in-pandas-dataframe-using-index
"""

### this set up is from resource 1) ###

pd.set_option('max_columns', 20)
random_data = { 'run number': [0,1,2,3,4], #'run number:' as key produces NaN results
                'odd probability': [0.50, 0.50, 0.60, 0.40, 0.30],
                'even probability': [0.50, 0.50, 0.40, 0.60, 0.70],
                }

prob_data = pd.DataFrame(random_data, columns = ['run number', 'odd probability', 'even probability'])
### End setup from resource 1 ###





##########  tutorial good to knows #########
# reference 1= R1.  reference 2= R2  ect...
# To get one column (R1):  dataframe.iloc[0] iloc=integer location
# To sort data (R1):  dataframe.sort_values(by=['Height'], ascending=False)
# to change data values (R3):  database.set_value(<row>, <column>, <new_value>) <-- soon depreciated
# also set values(R3): database.ix['x','C']=10

######  print statement checks ######
#print(prob_data)
#row = prob_data.iloc[3][1]
###### figuring out how to change my probability values --->
#prob_data.loc[prob_data['odd probability'][3], 'odd probability'] = 1.00
#prob_data.set_value(3, 'odd probability', 0.99) #<-- soon depreciated (R3)
## the one I need >>>> prob_data.ix[3, 'odd probability'] = 0.01



def run_prob():

    """
    run a for loop 5 times, and calculate probability of randomint
    being an even or an odd 5 times.   This will not compare old
    probability with new probabilities.  That is the next step
    in the experiment.
    probability(event) = occurance of event / total possible events
    probability(odd) = total occurances of an odd / total occurance of odd + even
    probability(even) = total occurances of an even / total occurance of even + odd
    """

    count = 1
    even_count = 0
    odd_count = 0
    # I found out you can use almost anything as your for loop variable
    # for <something>  ,   thus spamalot and carrots

    for spamalot in range(4):
        x = randint(1, 20)
        if x % 2 == 0:
            even_count += 1
            count += 1
            prob = round(even_count / count , 2)
            index = count - 1
            prob_data.ix[index, 'even probability'] = prob

        else:
            odd_count += 1
            count += 1
            prob = round(odd_count / count, 2)
            index = count - 1
            prob_data.ix[index, 'odd probability'] = prob

            #print(x) #<--- uncomment to see randomint's choices

    print(prob_data)

########  run it 5 times to see change in data table ########

for carrots in range(5):
    run_prob()











---End code block---







Comments

Popular posts from this blog

Statistics, Python, Making a Stem Plot

parenting, learning, and code