#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2011
    Posts
    139
    Rep Power
    3

    Accessing the next element


    Hello, I have a text file, say 'Text.list', laid out like this:

    element[1]
    element[2]
    element[3]
    ...
    element[n]

    I have the following code:

    Code:
    Text = open('Text.list','r').readlines()
    for Line in Text :
        # now I want to see if the current value of Line is equal to the
        # next element in the file, i.e. is Line == element[x+1]
        pass  # placeholder so the loop body is valid; the comparison goes here

    Is there a slick way in Python to do this without having to use a counter variable, etc.?

    Thank you.
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,714
    Rep Power
    480

    zip is your friend. zip it!


    Code:
    #What do you mean by element?  How does an "element" differ from Line?
    #this code compares pairs of lines without counters, at some memory
    #cost.
    
    # the names curtail and behead come from the j definitions
    # http://www.jsoftware.com
    # curtail=: }:
    # behead=: }.
    def curtail(LIST):
        '''
            >>> curtail([1,2,3])
            [1, 2]
        '''
        return LIST[:-1]
    
    def behead(LIST):
        '''
            >>> behead([1,2,3])
            [2, 3]
        '''
        return LIST[1:]
    
    # preferred file use.  Use the "with" statement.
    with open('Text.list','r') as input_stream:
        Text = input_stream.readlines()
    
    # compare pairs of lines
    for (LINE,NEXTLINE,) in zip(curtail(Text),behead(Text)):
        if LINE == NEXTLINE:
            print('duplicate adjacent lines of\n"{}"'.format(LINE))
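
    A shorter variant of the same idea, offered only as a sketch: zip() stops as soon as its shortest argument runs out, so the list can be paired with its own tail directly and the curtail step becomes optional. The function name adjacent_duplicates and the sample data are made up for illustration.

```python
def adjacent_duplicates(lines):
    """Return each line that is immediately followed by an identical line."""
    # zip() stops at the end of the shorter argument, so pairing the list
    # with its own tail yields (line, next_line) for every adjacent pair.
    return [line for line, next_line in zip(lines, lines[1:]) if line == next_line]


# made-up sample standing in for the result of readlines()
sample = ["a\n", "b\n", "b\n", "c\n", "c\n", "c\n"]
print(adjacent_duplicates(sample))
```

    A run of three identical lines contains two adjacent pairs, so that line is reported twice. The slice still copies the list, so the memory cost mentioned in the code comments above applies here too.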

    Comments on this post

    • WynnDeezl agrees
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2011
    Posts
    139
    Rep Power
    3

    clarification


    Thanks!

    What I meant by element[x] is just some line of text.

    Line is just a variable containing element[x].

    Wow. I'll have to study the J stuff.
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2012
    Posts
    39
    Rep Power
    2
    The easiest way to do this without counter variables, and without loading the entire file into memory, is to turn the check on its head. Rather than trying to compare against the next line, which you have not read yet, compare against the previous line, which you already have.

    Code:
    lastline = None
    with open('Text.list','r') as Text:
        for Line in Text:
            if Line == lastline:
                # duplicate line found
                print("Duplicate line: {}".format(Line))
            lastline = Line
    Depending on what you are trying to do, this may not be practical, but since it keeps only two lines and a file pointer in memory, it is very memory-efficient.
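
    If the previous-line trick is needed in more than one place, it could be wrapped in a small generator. This is a sketch, not a standard recipe: the helper name pairs is made up, and io.StringIO stands in for the open file so the example runs without Text.list on disk.

```python
import io


def pairs(iterable):
    """Yield (previous, current) for each adjacent pair of items."""
    iterator = iter(iterable)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # empty input: there is nothing to pair
    for current in iterator:
        yield previous, current
        previous = current  # remember only the most recent item


# io.StringIO stands in for the file object returned by open()
fake_file = io.StringIO("one\ntwo\ntwo\nthree\n")
repeated = [current for previous, current in pairs(fake_file) if previous == current]
print(repeated)
```

    Like the loop above, this only ever holds two lines at a time, so it should work on files too large to read with readlines().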

    Comments on this post

    • WynnDeezl agrees : Perfect. Thanks!!!
  8. #5
  9. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,714
    Rep Power
    480
    And of course the GNU/Linux command

    uniq --repeated file

    does the job.
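
    For comparison, a rough Python counterpart can be sketched with itertools.groupby, which groups consecutive equal items. It assumes that uniq --repeated prints one copy of each group of adjacent duplicate lines; the function name repeated_runs and the sample list are made up.

```python
from itertools import groupby


def repeated_runs(lines):
    """Return one copy of each line that occurs two or more times in a row."""
    result = []
    for line, run in groupby(lines):
        # Count the run lazily instead of building a list from it.
        if sum(1 for _ in run) > 1:
            result.append(line)
    return result


sample = ["a", "b", "b", "a", "c", "c", "c"]
print(repeated_runs(sample))
```

    Only adjacent repeats count here, so the two separate "a" entries are not reported.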
  10. #6
  11. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Feb 2005
    Posts
    588
    Rep Power
    64
    A bit more general, maybe you can adapt for adjacent lines:
    [code=python]'''list_get_duplicates1.py

    create a list of duplicates in a given list
    '''

    def get_duplicate_items(mylist):
        """
        return a list of duplicate items in mylist
        """
        return [item for item in set(mylist) if mylist.count(item) > 1]


    # testing ...
    mylist1 = [ 'oranges' , 'apples' , 'oranges' , 'grapes' ]
    mylist2 = list('abcdefgbd')

    sf = 'Duplicates = %s'

    print(mylist1)
    print(sf % get_duplicate_items(mylist1))

    print('-'*50)

    print(mylist2)
    print(sf % get_duplicate_items(mylist2))

    '''result -->
    ['oranges', 'apples', 'oranges', 'grapes']
    Duplicates = ['oranges']
    --------------------------------------------------
    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'b', 'd']
    Duplicates = ['b', 'd']
    '''
    [/code]
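
    One possible refinement, sketched under assumptions: mylist.count() rescans the whole list for every distinct item, and iterating over a set gives no guaranteed order. collections.Counter tallies everything in a single pass and, on Python 3.7 or later, keeps items in first-seen order. The name get_duplicate_items_counted is made up for this example.

```python
from collections import Counter


def get_duplicate_items_counted(items):
    """Return the items that occur more than once, in first-seen order."""
    counts = Counter(items)  # one pass over the input
    return [item for item, n in counts.items() if n > 1]


print(get_duplicate_items_counted(['oranges', 'apples', 'oranges', 'grapes']))
print(get_duplicate_items_counted(list('abcdefgbd')))
```

    As with the version above, this finds duplicates anywhere in the list, not only adjacent ones.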
    Real Programmers always confuse Christmas and Halloween because Oct31 == Dec25
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2012
    Posts
    12
    Rep Power
    0
    I am not sure I understand the original question correctly, but if you are looking for the unique entries in the file, you can build a "set" from a generator expression. Example below:

    Code:
    if __name__ == '__main__':
        with open("unsorted.txt", "r") as f:
            x = list(set(line for line in f))
        print(x)
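
    A set does not keep the lines in file order. If order matters, one option (a sketch that assumes Python 3.7 or later, where dicts preserve insertion order) is dict.fromkeys. The name unique_lines is made up, and io.StringIO stands in for the open file.

```python
import io


def unique_lines(lines):
    """Return each distinct line once, in order of first appearance."""
    # dict keys are unique and keep insertion order on Python 3.7+
    return list(dict.fromkeys(lines))


# io.StringIO stands in for the file object returned by open()
fake_file = io.StringIO("pear\napple\npear\nfig\napple\n")
print(unique_lines(fake_file))
```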
