Pseudo code for the NLTK, SQL, Python chatbot, part 1
Experiment: Part One, figuring out the pseudo code.
"How much do we want to tackle at a time?"
updated: 8-21-19 the table set-up section was all messed up. Apologies.
A while back I read an article, or a bit of an email from the IBM newsletter, that talked about having your chatbot tell people about yourself. I really like that idea, so finally I'm about to set that as my task.
But first, I need to see how to implement something like that.
I decided I'd like to integrate some manual training for the chatbot, so some kind of database is in order. My pythonanywhere site uses SQL, so we're gonna use that for the experiment. I also need to refresh on how NLTK tags words (https://www.nltk.org/), and then start breaking some code. I can start by coming up with some pseudo code. It'll be a guideline for what I want the actual code to do.
--------------------------------------------------------
The table set-up:
Just setting up an idea for now.
type(positive, negative, informative, non-question)
name(same as type)
trigger(words that NLTK picked out that we want to grab this table with)
rating(This will be used in the future 'manual training' bit to give satisfactory responses a higher rating than others, or even eliminate one from use in some cases.)
response(The response that the chatbot should return for trigger words.)
We just want to see that a response is being returned from a trigger word for now.
*Maybe even Who, What, Why, When, Where, WIWA*
- Definitely need Whispering wall, 'WIWA', or whatever your chatbot's name is, to return a carrot for people to follow when its own name is used.
- The one response for anything with the chatbot's name in it might be: "I am a chatbot designed by <your name>. She/he/they programmed me to respond to my name, but the response is limited to the one you are seeing now. You can ask about <your name> if you like."
* Some trigger words will fit in all the tables, but the manual training bit will help decide how we need to modify ratings and responses. Because I'm adding this bit after the pseudo code (&& blog post) was written, I know that this part will be a future experiment. We are not modifying the table yet.*
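To make the table idea a bit more concrete, here is a minimal sketch of how those fields could look as real SQL. It is only a guess at a starting point: SQLite (from Python's standard library) stands in for the SQL on my host, the table name `responses` and the column name `trigger_word` are made up for the example, and I've folded the separate tables into one table with a `type` column to keep it short. The -1 to 5 rating range is the one I settle on further down in this post.

```python
import sqlite3

# Assumption: SQLite stands in for the site's SQL database.
# The table and column names are hypothetical, based on the fields listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS responses (
    id           INTEGER PRIMARY KEY,
    type         TEXT NOT NULL
                 CHECK (type IN ('positive', 'negative', 'informative', 'non-question')),
    trigger_word TEXT NOT NULL,
    rating       INTEGER NOT NULL DEFAULT 0 CHECK (rating BETWEEN -1 AND 5),
    response     TEXT NOT NULL
);
"""


def create_tables(connection):
    """Create the responses table if it does not exist yet."""
    connection.execute(SCHEMA)
    connection.commit()


def add_response(connection, type_, trigger, response, rating=0):
    """Store one hand-written response for a trigger word."""
    connection.execute(
        "INSERT INTO responses (type, trigger_word, rating, response) "
        "VALUES (?, ?, ?, ?)",
        (type_, trigger.lower(), rating, response),
    )
    connection.commit()


def responses_for(connection, trigger):
    """Return every stored response for a trigger word (case-insensitive)."""
    rows = connection.execute(
        "SELECT response FROM responses WHERE trigger_word = ?",
        (trigger.lower(),),
    )
    return [row[0] for row in rows]
```

For now this only proves the one thing we care about: a trigger word goes in, a response comes back.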
---------------------------------------------------
The python:
import the needed SQL and NLTK
SQL should have a table for responses to questions about me, and a set of tables for when someone types something with a negative connotation. This could also be a great way to enforce positive affirmations.
The order below is the order in which I'm coming up with what needs to be done. The functions in italics show the proper placement in Python for the file to read.
The order is very important to Python.
def get_input():
    get the user input.
def process_engineer_data():
    This will go above the function that uses it, which is parse_input().
def default_response():
    return a default response ("This data is not relevant to the experiment except to see if the input is processed correctly.")
def parse_input():
    use NLTK to parse out words to see if it's a question about the chatbot creator.
    if the creator is in the input:
        use: process_engineer_data()
    else:
        use: default_response()
Because this is an experiment to just process the data relating to the creator, we don't need to delve into the responses that are not about the creator. It can also be a good way to see if it is processing (as a fallback) the creator from the input properly.
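Here's a rough sketch of that branching, just to see the shape of it. To keep it runnable without NLTK installed, it uses a tiny regex tokenizer as a stand-in; the idea is that `nltk.word_tokenize` could be passed in as the `tokenize` argument once NLTK and its tokenizer data are set up. The `CREATOR_NAMES` set, the placeholder creator response, and the `tokenize` parameter are all assumptions made for the example.

```python
import re

# Hypothetical: the name(s) the chatbot should treat as "the creator".
CREATOR_NAMES = {"nellie"}


def simple_tokenize(text):
    """Stand-in tokenizer; nltk.word_tokenize could be passed in instead."""
    return re.findall(r"[A-Za-z']+", text)


def process_engineer_data(tokens):
    """Placeholder for the SQL lookup sketched later in the pseudo code."""
    return "Placeholder response about the creator."


def default_response():
    return ("This data is not relevant to the experiment except to see "
            "if the input is processed correctly.")


def parse_input(text, tokenize=simple_tokenize):
    """Route the input: creator questions one way, everything else to default."""
    tokens = [token.lower() for token in tokenize(text)]
    if CREATOR_NAMES.intersection(tokens):
        return process_engineer_data(tokens)
    return default_response()
```

Something like `parse_input("Where does Nellie live?")` should take the creator branch, and anything without the name falls through to the default.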
def process_engineer_data():
    Get the appropriate SQL table.
    Return a base response made by me before training.
    Use another method with that response to ask if this is a good answer.
    If it is a good answer, we give it a rating. Let's do -1 to 5. The higher the number, the more appropriate the response. We want the chatbot to only pick the highest-rated responses in the table.
Use -1 for responses that are NOT to be used, 0 for responses that should go to default, and anything above that the chatbot can pick up and use at random, with a preference for higher-rated table items. Because this part is getting extensive, we might want a class object to process all of that.
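The rating rules can be sketched in a few lines. This is one possible reading of them, not the final design: anything rated -1 or 0 is skipped (so the caller falls back to the default), and anything above 0 is picked at random with the rating used as the weight. The function name `pick_response` and its `rng` parameter are made up for the example.

```python
import random


def pick_response(rated_responses, default, rng=random):
    """Pick a response from (text, rating) pairs rated -1 to 5.

    -1 means never use it, 0 means go to the default, and anything above 0
    can be picked at random, weighted so higher ratings come up more often.
    """
    usable = [(text, rating) for text, rating in rated_responses if rating > 0]
    if not usable:
        return default
    texts = [text for text, _ in usable]
    weights = [rating for _, rating in usable]
    return rng.choices(texts, weights=weights, k=1)[0]
```

Using the rating directly as the weight is the simplest thing that gives a "preference for higher rated" items; a different weighting could be swapped in later without touching the rest.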
class ProcessEngineer(object):
    def __init__(self, data, rating):
        "Since data will be new on every use of this object, we don't want to initiate it with the rating or the data. It will change on every use, and it's preferable that the class object not be mutated on every run. That would lead to errors and problems in the future."
        self.data = data
        self.rating = rating
    def get_sql(self, type):
        get the table from SQL related to the trigger word type.
        Possible types: Positive feedback, Negative feedback, Informative feedback, non_questions
        return associated table for 'get_rating_data(table)'
    def get_rating_data(self, table):
        in that table, we want the chatbot to choose the highest-rated response for now. The randomizing is a whole 'nother can of worms that can be a totally different experiment.
        So, now to parse the SQL table:
        Wow, perfect, thank you StackOverflow!!
        https://stackoverflow.com/questions/24495791/sql-query-to-find-highest-rated
        Return a response.
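As a sketch of the "choose the highest-rated response" step, here is one way the query could look. It is not taken from the StackOverflow answer above; it's just the plain ORDER BY ... LIMIT 1 approach, with SQLite standing in for the real database and a throwaway in-memory table (hypothetical name `responses`, made-up rows) so the example runs on its own.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory table with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE responses (type TEXT, rating INTEGER, response TEXT)"
    )
    conn.executemany(
        "INSERT INTO responses VALUES (?, ?, ?)",
        [
            ("informative", 2, "Answer rated 2"),
            ("informative", 5, "Answer rated 5"),
            ("informative", -1, "Answer that must not be used"),
        ],
    )
    return conn


def get_rating_data(connection, table_type):
    """Return the highest-rated usable response for a type, or None."""
    row = connection.execute(
        "SELECT response FROM responses "
        "WHERE type = ? AND rating > 0 "
        "ORDER BY rating DESC LIMIT 1",
        (table_type,),
    ).fetchone()
    return row[0] if row else None
```

Returning None when nothing usable is found leaves it up to the caller to fall back to the default response.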
**********************************************************
For the next bit, where we'd want to manipulate the SQL, we probably want another class to handle making changes to the SQL.
class UpdateSQL(object):
    Again, do we want this initiated with data that will be changed frequently? Probably not, so no initiations.
    def process_rating(self, response, table_item):
        This is the manual training bit.
        We're going to decide manually if the response that was given from get_rating_data in the above class is appropriate, and change the table accordingly. So first, send the response to the command line and ask if it is appropriate.
OH. We need a possible questions table too....
So two tables.
This may get too big too fast. So let's just focus on getting the made responses classified into the right tables. Let's say they want to ask "Where does Nellie live?" This is different from "How does Nellie live?" For one we would want an informative answer, for the other a positive feedback answer. We want 'where' to always trigger the location/information response. We might add that as a trigger to the table:
informative.
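A tiny sketch of that 'where' versus 'how' idea, using a plain dictionary instead of SQL for now. The mapping only holds the two examples from above, and the function name and the "non-question" fallback are assumptions for the example, not a decision about the final tables.

```python
# Hypothetical mapping from question words to the table type they should trigger.
QUESTION_WORD_TABLES = {
    "where": "informative",
    "how": "positive",
}


def classify_question(text, fallback="non-question"):
    """Return the table type for the first known question word in the text."""
    for word in text.lower().split():
        table = QUESTION_WORD_TABLES.get(word.strip("?!.,"))
        if table:
            return table
    return fallback
```

So "Where does Nellie live?" would land in the informative table and "How does Nellie live?" in the positive one, which is the split described above.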
OK, so that's still getting way too big. Small chunks.
So for the first bit, let's just have the stuff above the **** line.
The manual training processing can come after we've made sure the SQL and the NLTK parser are working properly. Can't build a proper roof if you have no walls, can't build sturdy walls without a sturdy base. Start small.
Next up: Part two, tackling the SQL and NLTK bits and bobs.