#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    12
    Rep Power
    0

    [Solved] - Making A Chatbot With Text File Database - WindowsOS


    [CODE=python]import os
    import string

    usr_name = raw_input(' What is your name?: ')
    cur_user = usr_name
    prg_runn = True
    cur_line = ''
    prv_line = ''

    while prg_runn == True :
        print ''
        print raw_input(' Press enter to continue: ')
        prv_line = cur_line

        if cur_user == usr_name :
            print (' ') + str(cur_user)

            # Text Processing
            print (' The user will enter data:')
            cur_line = raw_input(' Please speak to the bot: ')
            print (' Striping text from input:')
            x = "".join([c for c in cur_line if c in string.letters or c in string.whitespace or c in string.digits])
            x = x.lower()
            x = x.strip(' ')
            print (' Data compared to previous line:')
            print (' Testing to see is response exists:')
            print (' If it does, increasing integer:')

            if len(x) <= 100 and prv_line != '':
                z = list(x) #+ ('_' * (99-len(x))))
                cur_line = ''.join(z[0:])
                wordfind = x.split(' ')
                if not os.path.exists('memory'):
                    os.makedirs('memory')
                f = open('memory/' + prv_line +'.txt', 'a')
                f.write('\n' + cur_line + ' 1')
                f.close()

            print (' This text has been saved to memory:')
            cur_user = 'Ashiel'

        elif cur_user == 'Ashiel' :
            print (' ') + str(cur_user)
            print (' Reading current line:')
            print (' Searching text file with name equal to current line:')
            print (' If non existent, respond with stock:')
            print (' If so, respond from generation:')
            print (' This is the bots response')
            cur_line = raw_input(' Please enter a test response: ')
            cur_user = usr_name[/CODE]

    The part I'm having trouble with is file management. Writing to the file works just as I would like. My goal is to keep a running count of all responses on the first line of the saved text file, and to write each human response on its own line with its own count at the end of that line, like below.

    Code:
    1
    this is my response 1
    Every response will be in this format, with the words split on spaces into a list:

    Code:
    >>> ['this', 'is', 'my', 'response', 1]
    Suppose there are 4 responses in a file, and the human responds to a bot's statement that came from another file.

    [CODE=how are you doing.txt]13
    i am doing fine 5
    why do you ask 4
    pretty good 2
    pretty good i guess 2[/CODE]

    Code:
    >>> Ashiel
    >>> how are you doing
    >>>
    >>> User
    >>> raw_input() = pretty good i guess
    Memory should be able to find the line equal to 'pretty good i guess' and increment its '2' by '1', totaling '3', and bump the first line to '14'. I hope this makes sense.

    My problem is with these parts of the file management:

    Reading from a text file with a filename == prv_line.
    Finding a string in a text file.
    Comparing and incrementing a count.

    I am new, and I do not understand the documentation on file.find/read... etc. I was hoping that someone could help.
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,997
    Rep Power
    481

    I hope this answers some of your questions


    Code:
    # plan
    #   read the response file into a list of lines
    #   change the list as needed
    #   write the entire list back to the response file
    
    
    
    # but first
    # required setup
    filename = 'how are you doing.txt'
    response = 'i am doing fine'
    response_words = response.split()
    
    
    
    
    #   read the response file into a list of lines
    # read the data file
    with open(filename,'r') as inf:
        lines = inf.readlines()
    
    #print(lines) # debug
    
    
    
    
    #   change the list as needed
    # adjust the total number of responses
    counts = int(lines[0])
    counts += 1
    lines[0] = '%d\n'%counts
    
    # adjust the matching response, if there is one
    for (i,line) in enumerate(lines[1:]):
        if line.startswith(response):
            fields = line.split()
            if len(fields) == len(response_words)+1: # this is a match!
                counts_for_this_response = int(fields[-1])+1
                lines[i+1] = '%s %d\n'%(response,counts_for_this_response) # change this line
                break
    else:                                   # append the new response
        lines.append('%s %d\n'%(response,1))
    
    
    
    
    
    #   write the entire list back to the response file
    # rewrite the entire file
    with open(filename,'w') as ouf:
        ouf.write(''.join(lines))
    
    
    
    
    '''
    conclusion:
      use int() to convert from string to integer
      use string formatting to convert from integer to string
      I chose `for (i,line) in enumerate(lines[1:]):'
        to avoid messing up the first line of the file.
        I foresaw possible trouble if the input had been an empty string.
        The enumeration starts at 0, that's why I later had to adjust
        `lines[i+1]'
      use `with' statement to open files.
        The context manager automatically closes the file at the end of the block.
      changes to the length of lines in the middle of the file prevent
        random access.  We had to rewrite the entire file.
    '''

    Comments on this post

    • Xenith agrees : Very Helpful.
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
    Registered User
    I had originally intended to limit responses, and yes all of that made excellent sense. I think this is exactly what I was trying to achieve. I'll work on the chatbot tomorrow, and hopefully I'll at least have the ability to read from the files and respond. Interestingly, I can generate a random set of memory files for testing.

    Is the way this bot saves data excessive? If so, would there be a neater or more efficient way to store the memory files? I might split them into files by the second word in the string or something.
  6. #4
  7. Contributing User
    If the files get large you could make fixed-length records and store them sorted. Then you'd have to rewrite large portions of the file only on new responses.

    Or you can move to a database.
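A rough sketch of the database option, assuming Python 3 and the standard-library sqlite3 module (the thread's code is Python 2). The table name `responses`, the column `hits`, and the helpers `open_memory`, `record_response`, and `replies_for` are made up for illustration and are not from the thread. One row per (prompt, reply) pair stands in for one line in one text file, and the count is bumped in place instead of rewriting a file.

```python
import sqlite3

def open_memory(path=":memory:"):
    """Open (or create) the response store. ':memory:' keeps it in RAM for testing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS responses ("
        " prompt TEXT NOT NULL,"
        " reply  TEXT NOT NULL,"
        " hits   INTEGER NOT NULL DEFAULT 0,"
        " PRIMARY KEY (prompt, reply))"
    )
    return conn

def record_response(conn, prompt, reply):
    """Add 1 to the count for this (prompt, reply) pair, creating the row if needed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE responses SET hits = hits + 1 WHERE prompt = ? AND reply = ?",
            (prompt, reply),
        )
        if cur.rowcount == 0:  # nothing updated, so this reply is new
            conn.execute(
                "INSERT INTO responses (prompt, reply, hits) VALUES (?, ?, 1)",
                (prompt, reply),
            )

def replies_for(conn, prompt):
    """Return [(reply, hits), ...] for a prompt, most frequent first."""
    return conn.execute(
        "SELECT reply, hits FROM responses WHERE prompt = ? ORDER BY hits DESC, reply",
        (prompt,),
    ).fetchall()
```

The total that the text-file scheme keeps on the first line would not need to be stored here; it can be computed as the sum of `hits` for a prompt whenever it is needed.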

    The other issue: your scheme relies on the file system name space and access speed for the other responses. At least I think that's what happens.

    Another plan:
    Investigate atexit.
    Store the data in memory with periodic (timed or by number of changes) backups to the files, and a final backup to file guaranteed with atexit. Well, unless the cleaner trips over the power cord.
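A minimal sketch of that plan, assuming Python 3. The class name `Memory`, the `save_every` parameter, and the single JSON file are illustrative choices, not something from the thread, which uses one text file per prompt. It keeps the counts in a dict, backs up after a number of changes, and registers a final save with atexit.

```python
import atexit
import json
import os

class Memory:
    """In-memory response counts with a save that also runs at interpreter exit."""

    def __init__(self, path, save_every=50):
        self.path = path
        self.save_every = save_every   # back up after this many changes
        self.changes = 0
        self.counts = {}               # {prompt: {reply: count}}
        if os.path.exists(path):
            with open(path, "r") as inf:
                self.counts = json.load(inf)
        atexit.register(self.save)     # final backup on normal exit

    def record(self, prompt, reply):
        """Count one more occurrence of reply as an answer to prompt."""
        replies = self.counts.setdefault(prompt, {})
        replies[reply] = replies.get(reply, 0) + 1
        self.changes += 1
        if self.changes >= self.save_every:
            self.save()

    def save(self):
        # Write to a temporary file first, then swap it in, so a crash
        # mid-write does not leave a half-written memory file behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as ouf:
            json.dump(self.counts, ouf)
        os.replace(tmp, self.path)
        self.changes = 0
```

Note that atexit handlers run on a normal interpreter exit only. They do not run if the process is killed by an unhandled signal or loses power, which is why the periodic backups are still worth having.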
  8. #5
    Registered User
    Yes, generally the reply speed should be almost instantaneous, and most files will only have 1 or 2 responses, at worst a file may have 100 or so responses (if the bot is used in excess). It's also just doing a file name search, which is relatively quick.

    I'm not worried about the size of the files themselves, but rather the number of files. Each response will have a different file name associated with it. This memory is more like a rainbow table of user responses. The purpose of the response frequency is to randomize a response based on how many times that response was made to the same prior line. So the top line will be the sum of all lines, and the count at the end of a response line, divided by that total, will be the probability that the response is selected.
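To make the weighting concrete, here is a small sketch, assuming Python 3.6+ for `random.choices` (the thread's code is Python 2, which lacks it). The helper names `parse_memory` and `pick_response` are hypothetical. It reads the memory-file layout from the first post (total on the first line, then one `response count` per line) and picks a response with probability count / total.

```python
import random

def parse_memory(lines):
    """Split a memory file's lines into (total, [(response, count), ...])."""
    total = int(lines[0])
    entries = []
    for line in lines[1:]:
        line = line.strip()
        if not line:
            continue
        text, count = line.rsplit(' ', 1)   # the count is the last word
        entries.append((text, int(count)))
    return total, entries

def pick_response(entries, rng=random):
    """Pick one response with probability count / sum of counts."""
    texts = [text for text, _ in entries]
    weights = [count for _, count in entries]
    return rng.choices(texts, weights=weights, k=1)[0]
```

With the `how are you doing.txt` example from the first post, 'i am doing fine' would come back about 5 times in 13 and 'pretty good i guess' about 2 times in 13.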

    The hope is that over the course of millions of responses, the responses self-equalize and just make sense universally. I also plan to add the ability to decrease a response's count if it feels contextually incorrect, and to remove lines whose "frequency" becomes negative. I hope these will make the bot sound somewhat human after a couple of gigs of conversational brute force.

    I have another program that makes the memory files for entire conversations, just by loading a line-delimited file in the order (speaker A, B, A, B, A and so on). If I get the increment working, it's as simple as looping over every line of the file for self-comparison and writing to the files.
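One way the conversation loader could look, as a sketch in Python 3. The function name `learn_conversation` is made up, and it builds the counts in memory rather than writing the per-prompt files, which is an assumption on my part. Each line is treated as a reply to the line before it.

```python
from collections import Counter, defaultdict

def learn_conversation(lines):
    """Count how often each line follows the line before it."""
    memory = defaultdict(Counter)            # {previous line: Counter of replies}
    cleaned = [line.strip().lower() for line in lines if line.strip()]
    # Pair every line with the one after it: (A1, B1), (B1, A2), (A2, B2), ...
    for prompt, reply in zip(cleaned, cleaned[1:]):
        memory[prompt][reply] += 1
    return memory
```

Each `memory[prompt]` then holds the same information as one memory file: the replies with their counts, and the first-line total is `sum(memory[prompt].values())`.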

    I can also make special cases for words like "pi" "radius" "diameter" "mile" "celsius"... etc. for all the conversions, so that it searches for an int in the line and tries to convert by the word.

    ^^; I'm very motivated to get this working... I started learning python 2 weeks ago, and I love every minute I've worked with it.
  10. #6
  11. Contributing User
  12. #7
    Registered User
    edX

    CS188.1x Artificial Intelligence➔

    CS188.1x is an online adaptation of the first half of UC Berkeley’s upper division course
    CS188: Introduction to Artificial Intelligence.
    BerkeleyX
  14. #8
    Registered User
    Oh great Python guru, I need to check text files for duplicate lines now. I will deal with the increment after I get responses working. I am trying to use parts from other files to do this comparison and can't get this to work as I would like.

    What am I doing incorrectly? The error message is quite vague.

    Code:
    import os
    import string
    import random
    
    def file_len(fname):
        with open(fname) as f:
            for i, l in enumerate(f):
                pass
        return i
    
    usr_name = raw_input('  What is your name?: ')
    cur_user = usr_name
    prg_runn = True
    cur_line = ''
    prv_line = ''
    
    while prg_runn == True :
        prv_line = cur_line
        
        if cur_user == usr_name :
            
            ### Text Processing
            # The user will enter data
            cur_line = raw_input('  [' + str(cur_user) + ']: ')
            
            # Striping text from input
            x = "".join([c for c in cur_line if c in string.letters or c in string.whitespace or c in string.digits]) 
            x = x.lower()
            x = x.strip(' ')
            
            # Data compared to previous line
            # Testing to see is response exists
            
            if len(x) <= 100 and prv_line != '' and prv_line != 'ASHIEL DOES NOT UNDERSTAND' and prv_line != 'my name is ashiel':
                z = list(x) # + ('_' * (99-len(x))))
                cur_line = ''.join(z[0:])
                wordfind = x.split(' ')
                if not os.path.exists('memory'):
                    os.makedirs('memory')
                #file = open('memory/' + str(prv_line) +'.txt', 'r')
                #flines = file.readlines()
                
                #res = filter(lambda x: str(cur_line) in x, flines)
                #if len(res) == 0:
                    #file.write(str(cur_line)+"\n")
                    #file.close()
                    
                f = open('memory/' + str(prv_line) +'.txt', 'a')
                f.write(str(cur_line) + '\n')
                
                f.close()
            
            # This text has been saved to memory
            cur_user = 'Ashiel'
            
        elif cur_user == 'Ashiel' :
            
            # Reading current line
            # Searching text file with the name: ' + str(cur_line) + '.txt'
            
            if 'name' in cur_line:
                bot_response = 'my name is ashiel'
                
            elif not os.path.exists('memory/' + cur_line +'.txt'):
                with open('memory/@stock_response.txt','r') as inf:
                    lines = inf.readlines()
                    inf.close()
                    bot_response = lines[random.randint(0,4)]
                    bot_response = bot_response.strip('\n')
                    bot_response = "".join([c for c in bot_response if c in string.letters or c in string.whitespace or c in string.digits]) 
                    
            else:
                with open('memory/' + cur_line +'.txt','r') as inf:
                    lines = inf.readlines()
                    inf.close()
                    
                
                #If non existent, respond with stock:
                
                #If so, respond from generation
                y = "".join([c for c in lines[random.randint(0,file_len('memory/' + cur_line +'.txt')-1)] if c in string.letters or c in string.whitespace or c in string.digits])
                bot_response = y
                bot_response = bot_response.strip('\n')
                
            print '  [' + str(cur_user) + ']: ' + str(bot_response)
            cur_line = bot_response
            cur_user = usr_name
  16. #9
  17. Contributing User

    Unable to reproduce the error message, vague or clear.


    I populated a file named
    memory/@stock_response.txt
    and ran your program. There were no errors. I now have

    Code:
    $ ls memory
    3 response stock.txt   response 4 stock.txt  @stock_response.txt
    hi.txt                 response stock 2.txt
    how are you doing.txt  stock response 1.txt
    with

    $ cat memory/@stock_response.txt
    stock response 1
    response stock 2
    3 response stock
    response 4 stock
  18. #10
    Registered User
    I was referring to the commented portion here, I apologize.

    Code:
                #file = open('memory/' + str(prv_line) +'.txt', 'r')
                #flines = file.readlines()
                
                #res = filter(lambda x: str(cur_line) in x, flines)
                #if len(res) == 0:
                    #file.write(str(cur_line)+"\n")
                    #file.close()
    It was an attempt to compare the lines of the file to the cur_line var, in order to check if a new response must be added or not... Otherwise I will have duplicate responses in the same file.

    I'm trying not to get something like.

    [CODE=response.txt]this is a response
    this is a second response
    this is a third response
    this is a response
    this is a response[/CODE]

    I need to compare the cur_line to see if the file contains "this is a response" before adding it to the file.

    On a side note, this is my @stock_response.txt generator.

    Code:
    import os
    if not os.path.exists('memory'):
        os.makedirs('memory')
    f = open('memory/@stock_response.txt', 'a')
    f.write('how are you doing today\n')
    f.write('i am doing well\n')
    f.write('who are you\n')
    f.write('what are you doing\n')
    f.write('its nice to meet you\n')
    f.close()
  20. #11
  21. Contributing User
    file = open('memory/' + str(prv_line) +'.txt', 'r')
    flines = file.readlines()


    # Have you considered the set type?
    # Remember that the strings in the set
    # will have newline at the end.

    set_of_lines = set(flines)




    You have this, to which I appended comments:
    Code:
                res = filter(lambda x: str(cur_line) in x, flines)
                if len(res) == 0:
                    file.write(str(cur_line)+"\n") # cannot write to file opened for read  mode='r'
                    file.close() # closing file within "if" conditional is risky business.
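
Putting the set suggestion and the two comments together, a sketch of the duplicate check might look like this (Python 3 syntax; the function name `remember_once` is hypothetical). It reads the file in one step, then reopens it in append mode only if the response is new.

```python
import os

def remember_once(path, response):
    """Append response to the file at path unless that exact line is already there.

    Returns True if the line was written, False if it was a duplicate.
    """
    seen = set()
    if os.path.exists(path):
        with open(path, 'r') as inf:
            # strip the trailing newline so lines compare equal to the response
            seen = set(line.rstrip('\n') for line in inf)
    if response in seen:
        return False
    with open(path, 'a') as ouf:             # reopen in append mode to write
        ouf.write(response + '\n')
    return True
```

Unlike the `str(cur_line) in x` test in the filter above, the set lookup only matches whole lines, so 'pretty good' is not treated as a duplicate of 'pretty good i guess'.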

    Comments on this post

    • Xenith agrees : Good it working.
  22. #12
    Registered User
    b49P23TIvg - Thank you very much for the help, it is working now just as I always intended it to. Now I only need to gather a line-delimited list of random conversation starters, and then start working on the weighted random.

    Another non-important question.

    I am developing this 'software' in Aptana Studio 3, so the IDE is nice and clean when I am messing with the bot. I would like to have a similar look and feel in a console window separate from the program. I have no idea how to even do this kind of thing. Do you know where I could find the resources? I assume that I would be forced to use Tkinter?
  24. #13
  25. Contributing User
    I've almost completely avoided "IDE"s for 35 years. I use emacs in some way for just about all computer tasks.

    How about a console window interface with gnu readline?