writing some manual test methods for binary suffix string tree

I am waiting for a Python machine learning book to get here via the Amazon fairy, so in the meantime I decided to play with the class exercise I left off on.

The goal I had in mind was to store each word as a key in the binary tree, with a list of its suffixes as the value, and then add binary searches through those suffix lists to the tree.

That way someone could search the tree for a suffix and get back every key whose suffix list contains a match.

Say I had a tree of DNA strings and wanted to find all the key DNA sequences that contained 'AcTGAT': it could return me a list of them.
I haven't accomplished that yet; I just started doing this to get my head back into this file's code.
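
To make the goal concrete, here is a minimal sketch of the lookup idea. It uses a plain dict as a stand-in for the tree, so it is not the BinaryTree code from this post; the names suffixes, keys_containing and table are made up for illustration. The trick it relies on is that a string contains a pattern exactly when one of its suffixes starts with that pattern.

```python
def suffixes(word):
    # Every suffix of word, longest first: 'abc' -> ['abc', 'bc', 'c'].
    return [word[i:] for i in range(len(word))]


def keys_containing(table, pattern):
    # A key contains pattern if any of its suffixes starts with pattern.
    return [
        key
        for key, suffix_list in table.items()
        if any(s.startswith(pattern) for s in suffix_list)
    ]


# Hypothetical sample data; a dict stands in for the tree here.
sequences = ['AcTGATGG', 'TTAcTGAT', 'GGGGcATG']
table = {seq: suffixes(seq) for seq in sequences}

print(keys_containing(table, 'AcTGAT'))  # ['AcTGATGG', 'TTAcTGAT']
```

This scans every suffix list linearly, so it only shows the shape of the result, not the speed a tree plus binary search is supposed to buy.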

I find it extremely useful to just tinker around with stuff.  See what I can make it do.  Might not be useful to whoever is reading this, but I find it fun and worthwhile.

Picture: the result of test_tree_two(3), drawn with the PrintTree class I made a ways back.

I still need to clean up a lot in the binary string tree and incorporate the suffix searches, but here are a few methods I tagged on to the end of the file to test it manually.  No idea if anyone out there actually needs anything remotely similar, but hey, learning and sharing, that's the goal.
Update: I did those errors and messages all wrong... fixed them.



def make_alias(name, number):
    # Build a label such as 'dnaData0' from a name and an index.
    if not isinstance(name, str):
        message = "Error: make_alias(name, number) name - must be a string"
        raise TypeError(message)
    if not isinstance(number, int):
        message = "Error: make_alias(name, number) number - must be an integer"
        raise TypeError(message)

    astring = name + str(number)
    return astring

import random  # generate_string relies on random.choice

def generate_string(stringsize, alphatype=None):
    # Return a random string of length stringsize drawn from the chosen alphabet.
    alpha26 = 'abcdefghijklmnopqrstuvwxyz'
    alpha8 = 'abcdwxyz'
    alphaDNA = 'cATG'
    if not isinstance(stringsize, int):
        message = "Error: generate_string method requires integer as parameter"
        raise TypeError(message)
    if alphatype is None:
        alphabet = 'abcd'
    elif alphatype == 'dna':
        alphabet = alphaDNA
    elif alphatype == 'short':
        alphabet = alpha8
    elif alphatype == 'long':
        alphabet = alpha26
    else:
        # An unknown option is a bad value rather than a bad type.
        message = "Error: generate_string second parameter options= 'dna', 'short', 'long', or None"
        raise ValueError(message)
    return ''.join(random.choice(alphabet) for _ in range(stringsize))

def generate_suffix_list(astring):
    # Return every suffix of astring, longest first.
    if not isinstance(astring, str):
        message = "Parameter for generate_suffix_list(astring), astring must be string type"
        raise TypeError(message)

    return [astring[i:] for i in range(len(astring))]

def test_tree():
    # Store each suffix of one random string as a key, with its index as the value.
    herbal = generate_string(8)
    herblist = generate_suffix_list(herbal)
    plant = BinaryTree('herb')
    for i, word in enumerate(herblist):
        plant.set(word, i)

    plant.dump()

    x = plant.find_all_sub('a')
    print(x)

def test_tree_two(treesize):
    # Validate the argument before building anything.
    if not isinstance(treesize, int):
        message = "Error: parameter for test_tree_two(treesize) has to be an integer"
        raise TypeError(message)
    plant = BinaryTree("BCDEFGHI")
    for i in range(treesize):
        # Each random DNA string is a key; its suffix list is the value.
        herbal = generate_string(8, 'dna')
        herblist = generate_suffix_list(herbal)
        alias = make_alias('dnaData', i)
        plant.set(herbal, herblist, alias)
    printpretty = PrintTree()
    printpretty.graphical(plant.root_node)
#test_tree()
test_tree_two(3)
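
The binary search through a suffix list isn't in the tree yet. One possible shape for it, as a rough sketch: keep the suffixes sorted and use the standard library's bisect module. The helper name has_substring and the sample word are assumptions for illustration, not part of the BinaryTree code above.

```python
from bisect import bisect_left


def has_substring(sorted_suffixes, pattern):
    # In a sorted list, the first entry >= pattern is the only candidate:
    # if any suffix starts with pattern, that entry does too.
    i = bisect_left(sorted_suffixes, pattern)
    return i < len(sorted_suffixes) and sorted_suffixes[i].startswith(pattern)


# Hypothetical sample word built from the same 'cATG' alphabet.
word = 'cATGATTc'
sorted_suffixes = sorted(word[i:] for i in range(len(word)))

print(has_substring(sorted_suffixes, 'GAT'))  # True
print(has_substring(sorted_suffixes, 'GGG'))  # False
```

Sorting costs a little up front, but each lookup afterwards takes a logarithmic number of comparisons instead of a scan over every suffix.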
