#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2013
    Posts
    18
    Rep Power
    0

    Pairwise calculation in a list


    Dear all,
    I have a list of data as text, and I want to apply a function to it. The function calculates the distance between two coordinates, so I first need to group the lines whose first field starts with the same string.
    1- I split off the first column of each line, but I don't know how to collect the lines that share the same leading string.
    2- I also don't know how to apply the function to every pair of points within each group.
    I have been stuck on this problem for weeks!
    This is part of my list:
    AFJ.SPZ.IR.3 46.812 38.433
    AFJ.SPZ.IR.8 46.84 38.463
    AKL.SPZ.IR.11 46.691 38.399
    AKL.SPZ.IR.12 46.722 38.407
    AKL.SPZ.IR.13 46.654 38.404
    AKL.SPZ.IR.25 46.699 38.442
    AKL.SPZ.IR.3 46.812 38.433
    AKL.SPZ.IR.8 46.84 38.463
    ALA.SPZ.IR.3 46.812 38.433
    ANAR.BHZ.IR.8 46.84 38.463
    ANJ.SPZ.IR.13 46.654 38.404
    ANJ.SPZ.IR.18 46.662 38.399
    ANJ.SPZ.IR.27 46.763 38.377
    ANJ.SPZ.IR.3 46.812 38.433
    ANJ.SPZ.IR.8 46.84 38.463
    BST.SPZ.IR.1 46.732 38.457
    BST.SPZ.IR.10 46.707 38.448
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    Originally Posted by sahar sa
    AFJ.SPZ.IR.3 46.812 38.433
    might be a "tab separated" string. Which part is the numeric vector? Which is the key?

    By distance I presume you want
    EuclideanNorm(V1-V2)

    I don't know how to proceed after this (partly because I don't think I split the lines as you intend).
    Code:
    # runs in python 3
    # doctest using command line   python -m doctest this_file.py
    
    import collections, itertools, io
    
    data = '''
        AFJ.SPZ.IR.3	46.812	38.433
        AFJ.SPZ.IR.8	46.84	 38.463
        AKL.SPZ.IR.11	46.691	38.399
        AKL.SPZ.IR.12	46.722	38.407
        AKL.SPZ.IR.13	46.654	38.404
        AKL.SPZ.IR.25	46.699	38.442
        AKL.SPZ.IR.3	46.812	38.433
        AKL.SPZ.IR.8	46.84	 38.463
        ALA.SPZ.IR.3	46.812	38.433
        ANAR.BHZ.IR.8	46.84 38.463
        ANJ.SPZ.IR.13	46.654	38.404
        ANJ.SPZ.IR.18	46.662	38.399
        ANJ.SPZ.IR.27	46.763	38.377
        ANJ.SPZ.IR.3	46.812	38.433
        ANJ.SPZ.IR.8	46.84 38.463
        BST.SPZ.IR.1	46.732	38.457
        BST.SPZ.IR.10	46.707	38.448
    '''.strip()
    
    def group(s:'multi-line string'):
        '''
            Each line of the string is split into 2 fields.
            The first field is the dictionary key.
            The second field is the associated item.
            All items of like keys are appended to a list as the
              value of the dictionary key.
            Group returns that dictionary.
            >>> LF = '{:c}'.format(10)
            >>> # print(dict(group('a 1 4'+LF+'b 4'+LF+'a 2')))
            >>> dict(group('a 1 4'+LF+'b 4'+LF+'a 2')) == dict(a = ['1 4', '2'], b = ['4'])
            True
        '''
        result = collections.defaultdict(list)
        for line in io.StringIO(s):
            (key, item) = line.strip().split(maxsplit = 1)
            result[key].append(item)
        return result
    
    def convert_numeric_string_to_vector(s):
        '''
            >>> convert_numeric_string_to_vector('1 2 3.75') == [1, 2, 3.75]
            True
        '''
        return [float(representation) for representation in s.split()]
    
    def main(data = data):
        dictionary_of_unknown_unguessed = group(data)
        d = dictionary_of_unknown_unguessed
        for (key, value,) in d.items():
            L = d[key]
            for (i, item,) in enumerate(L):
                L[i] = convert_numeric_string_to_vector(item)
            d[key] = L
        return d
    
    if __name__ == '__main__':
        import pprint
        pprint.pprint(main())
    
    run_the_module_to_display = '''
    {'AFJ.SPZ.IR.3': [[46.812, 38.433]],
     'AFJ.SPZ.IR.8': [[46.84, 38.463]],
     'AKL.SPZ.IR.11': [[46.691, 38.399]],
     'AKL.SPZ.IR.12': [[46.722, 38.407]],
     'AKL.SPZ.IR.13': [[46.654, 38.404]],
     'AKL.SPZ.IR.25': [[46.699, 38.442]],
     'AKL.SPZ.IR.3': [[46.812, 38.433]],
     'AKL.SPZ.IR.8': [[46.84, 38.463]],
     'ALA.SPZ.IR.3': [[46.812, 38.433]],
     'ANAR.BHZ.IR.8': [[46.84, 38.463]],
     'ANJ.SPZ.IR.13': [[46.654, 38.404]],
     'ANJ.SPZ.IR.18': [[46.662, 38.399]],
     'ANJ.SPZ.IR.27': [[46.763, 38.377]],
     'ANJ.SPZ.IR.3': [[46.812, 38.433]],
     'ANJ.SPZ.IR.8': [[46.84, 38.463]],
     'BST.SPZ.IR.1': [[46.732, 38.457]],
     'BST.SPZ.IR.10': [[46.707, 38.448]]}
    '''
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. Registered User
    Thank you a lot.
    I was looking for this classification:
    AFJ ['AFJ.SPZ.IR.3 46.812 38.433', 'AFJ.SPZ.IR.8 46.84 38.463', 'AFJ.SPZ.IR.8 46.84 38.463']
    AKL ['AKL.SPZ.IR.11 46.691 38.399', 'AKL.SPZ.IR.11 46.691 38.399', 'AKL.SPZ.IR.12 46.722 38.407', 'AKL.SPZ.IR.12 46.722 38.407', 'AKL.SPZ.IR.13 46.654 38.404', 'AKL.SPZ.IR.25 46.699 38.442', 'AKL.SPZ.IR.3 46.812 38.433', 'AKL.SPZ.IR.8 46.84 38.463']
    ALA ['ALA.SPZ.IR.3 46.812 38.433']
    ANAR ['ANAR.BHZ.IR.8 46.84 38.463']
    ANJ ['ANJ.SPZ.IR.13 46.654 38.404', 'ANJ.SPZ.IR.18 46.662 38.399', 'ANJ.SPZ.IR.3 46.812 38.433']

    However, I still don't know how to apply my function pairwise to the coordinates in each class!
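
A minimal sketch of the classification asked for here, assuming the class is the part of the first field before the first '.' and that each line reads "name value value". The helper name group_by_station is made up for illustration and is not from the thread.

```python
import collections

def group_by_station(lines):
    """Map the text before the first '.' to the list of full lines."""
    groups = collections.defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        station, _, _ = line.partition('.')
        groups[station].append(line)
    return dict(groups)

sample = [
    'AFJ.SPZ.IR.3 46.812 38.433',
    'AFJ.SPZ.IR.8 46.84 38.463',
    'ALA.SPZ.IR.3 46.812 38.433',
]
grouped = group_by_station(sample)
for station, members in grouped.items():
    print(station, members)
```

Each line lands in exactly one class, so no entry is repeated unless it is repeated in the input.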

  6. #4
  7. Contributing User

    OK, I give up.


    Originally Posted by sahar sa
    I was looking for this classification:
    AFJ ['AFJ.SPZ.IR.3 46.812 38.433', 'AFJ.SPZ.IR.8 46.84 38.463', 'AFJ.SPZ.IR.8 46.84 38.463']
    ...
    Why is AFJ.SPZ.IR.8 46.84 38.463 repeated? (other duplicates exist)
    Why is the BST key omitted? I find no indication in your descriptions that would give anyone a clue about the algorithm.

    Either you're careless or you don't know how to specify the problem clearly. Small wonder you've struggled for weeks. Or I could be insufficiently smart.


    Here's a program that might do part of what you actually want. It does not do what you say you want.
    Code:
    # runs in python 3
    # doctest using command line   python -m doctest -v this_file.py
    # Another test: the command    python this_file.py               should have no errors.
    
    import pdb                              # yes, needed it.
    import collections, itertools
    
    def group(inf:'an io input stream', split_character:'easily generalized to a split_function, or to a regular expression' = None):
        '''
            Lines are first stripped, then
            Each line of the string is split into 2 fields.
            The first field is the dictionary key.
            An item is the entire line.
            All items of like keys are appended to a list as the
              value of the dictionary key.
            Group returns that dictionary.
            >>> import io
            >>> LF = '{:c}'.format(10)
            >>> result = dict(group(io.StringIO('a 1 4'+LF+'b 4'+LF+'a 2')))
            >>> result == dict(a = ['a 1 4', 'a 2'], b = ['b 4'])
            True
        '''
        result = collections.defaultdict(list)
        for line in inf:
            stripped_line = line.strip()
            (key, *junk,) = stripped_line.split(sep = split_character, maxsplit = 1)
            result[key].append(stripped_line)
        return result
    
    if __name__ == '__main__':
    
        import io, pprint
    
        test0 = '''
            AFJ.SPZ.IR.3	46.812	38.433
            AFJ.SPZ.IR.8	46.84	38.463
            AKL.SPZ.IR.11	46.691	38.399
            AKL.SPZ.IR.12	46.722	38.407
            AKL.SPZ.IR.13	46.654	38.404
            AKL.SPZ.IR.25	46.699	38.442
            AKL.SPZ.IR.3	46.812	38.433
            AKL.SPZ.IR.8	46.84	38.463
            ALA.SPZ.IR.3	46.812	38.433
            ANAR.BHZ.IR.8	46.84	38.463
            ANJ.SPZ.IR.13	46.654	38.404
            ANJ.SPZ.IR.18	46.662	38.399
            ANJ.SPZ.IR.27	46.763	38.377
            ANJ.SPZ.IR.3	46.812	38.433
            ANJ.SPZ.IR.8	46.84	38.463
            BST.SPZ.IR.1	46.732	38.457
            BST.SPZ.IR.10	46.707	38.448
        '''.strip()
    
        expectation0 = dict(
            AFJ = ['AFJ.SPZ.IR.3	46.812	38.433',
                   'AFJ.SPZ.IR.8	46.84	38.463',
                   'AFJ.SPZ.IR.8	46.84	38.463',],
            AKL = ['AKL.SPZ.IR.11	46.691	38.399',
                   'AKL.SPZ.IR.11	46.691	38.399',
                   'AKL.SPZ.IR.12	46.722	38.407',
                   'AKL.SPZ.IR.12	46.722	38.407',
                   'AKL.SPZ.IR.13	46.654	38.404',
                   'AKL.SPZ.IR.25	46.699	38.442',
                   'AKL.SPZ.IR.3	46.812	38.433',
                   'AKL.SPZ.IR.8	46.84	38.463',],
            ALA = ['ALA.SPZ.IR.3	46.812	38.433',],
            ANAR= ['ANAR.BHZ.IR.8	46.84	38.463',],
            ANJ = ['ANJ.SPZ.IR.13	46.654	38.404',
                   'ANJ.SPZ.IR.18	46.662	38.399',
                   'ANJ.SPZ.IR.3	46.812	38.433',],
            BST = ['BST.SPZ.IR.1	46.732	38.457',
                   'BST.SPZ.IR.10	46.707	38.448',],
        )
    
        result0 = group(io.StringIO(test0), '.')
        try:
            assert result0 == expectation0
        except AssertionError:
            print('\nexpect:\n')
            pprint.pprint(expectation0)
            print('\n\nresult:\n')
            pprint.pprint(result0)
            raise
    Last edited by b49P23TIvg; November 20th, 2013 at 07:44 AM.
  8. #5
  9. Registered User
    Thank you a lot for your time and code.
    Actually, the AFJ repetition was a pasting mistake! I didn't paste all the data; I just wanted to show the grouping.
    Excuse my carelessness!
    This is the function that calculates the distance between coordinates:
    #!/usr/bin/env python
    from math import radians, cos, sin, asin, sqrt

    def haversine(lon1, lat1, lon2, lat2):
        """
        Calculate the great circle distance between two points
        on the earth (specified in decimal degrees)
        """
        # convert decimal degrees to radians
        lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
        # haversine formula
        dlon = lon2 - lon1
        dlat = lat2 - lat1
        a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
        c = 2 * asin(sqrt(a))
        km = 6367 * c  # 6367 km: approximate earth radius
        return km

    I need to run it within every group and calculate the distance between every two coordinates in that group.


  10. #6
  11. Contributing User
    [edit]Take note! 6367 km better represents the earth radius than the value shown here.[/edit]

    Changed haversine to return the haversine, not a distance. You should hand check one of the coordinate pairs to verify my algebraic transformations. Best to measure the distance on a map and compare that way. In this case you can use a very different method to check a result.

    Used itertools.product to generate the Cartesian product pairs you seem to request.

    Used split and float(numeric_string) to extract the coordinate fields from the strings and convert them to numbers.
    Code:
    #!/usr/bin/env python
    
    # runs in python 3
    # doctest using command line   python -m doctest -v this_file.py
    # Another test: the command    python this_file.py               should have no errors.
    # Correct the expectation and uncomment to restore functionality.
    
    import collections, itertools
    from math import radians, cos, sin, asin, sqrt
    
    def haversine(lon1, lat1, lon2, lat2):
        dlon = lon2 - lon1 
        dlat = lat2 - lat1 
        return sqrt(sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2)
    
    def great_circle_distance(a, b, radius = 6367/2):
        """
            Calculate the great circle distance between two points
            on a sphere. 
            Arguments a and b are iterable specifications of
            (longitude, latitude,) in decimal degrees.  
            The default radius is that of the earth, in km.
        """
        # convert decimal degrees to radians 
        radian_coordinates = map(radians, list(a)+list(b))
        angle = asin(haversine(*radian_coordinates))
        return 2 * radius * angle
    
    def group(inf:'an io input stream', split_character:'easily generalized to a split_function, or to a regular expression' = None):
        '''
            Lines are first stripped, then
            Each line of the string is split into 2 fields.
            The first field is the dictionary key.
            An item is the entire line.
            All items of like keys are appended to a list as the
              value of the dictionary key.
            Group returns that dictionary.
            >>> import io
            >>> LF = '{:c}'.format(10)
            >>> result = dict(group(io.StringIO('a 1 4'+LF+'b 4'+LF+'a 2')))
            >>> result == dict(a = ['a 1 4', 'a 2'], b = ['b 4'])
            True
        '''
        result = collections.defaultdict(list)
        for line in inf:
            stripped_line = line.strip()
            (key, *junk,) = stripped_line.split(sep = split_character, maxsplit = 1)
            result[key].append(stripped_line)
        return result
    
    def main(d):
        for (key, value,) in d.items():
            print('Distances between points in group {}'.format(key))
            for (A, B,) in itertools.product(value, repeat=2):
                (titleA, *a,) = A.split()
                (titleB, *b,) = B.split()
                km = great_circle_distance(map(float, a), map(float, b))
                print('{:9.3f} {} {}'.format(km, titleA, titleB))
    
    if __name__ == '__main__':
    
        import io, pprint
    
        test0 = '''
            AFJ.SPZ.IR.3	46.812	38.433
            AFJ.SPZ.IR.8	46.84	38.463
            AKL.SPZ.IR.11	46.691	38.399
            AKL.SPZ.IR.12	46.722	38.407
            AKL.SPZ.IR.13	46.654	38.404
            AKL.SPZ.IR.25	46.699	38.442
            AKL.SPZ.IR.3	46.812	38.433
            AKL.SPZ.IR.8	46.84	38.463
            ALA.SPZ.IR.3	46.812	38.433
            ANAR.BHZ.IR.8	46.84	38.463
            ANJ.SPZ.IR.13	46.654	38.404
            ANJ.SPZ.IR.18	46.662	38.399
            ANJ.SPZ.IR.27	46.763	38.377
            ANJ.SPZ.IR.3	46.812	38.433
            ANJ.SPZ.IR.8	46.84	38.463
            BST.SPZ.IR.1	46.732	38.457
            BST.SPZ.IR.10	46.707	38.448
        '''.strip()
    
        expectation0 = dict(  # sahar sa agreed these incorrect
            AFJ = ['AFJ.SPZ.IR.3	46.812	38.433',
                   'AFJ.SPZ.IR.8	46.84	38.463',
                   'AFJ.SPZ.IR.8	46.84	38.463',],
            AKL = ['AKL.SPZ.IR.11	46.691	38.399',
                   'AKL.SPZ.IR.11	46.691	38.399',
                   'AKL.SPZ.IR.12	46.722	38.407',
                   'AKL.SPZ.IR.12	46.722	38.407',
                   'AKL.SPZ.IR.13	46.654	38.404',
                   'AKL.SPZ.IR.25	46.699	38.442',
                   'AKL.SPZ.IR.3	46.812	38.433',
                   'AKL.SPZ.IR.8	46.84	38.463',],
            ALA = ['ALA.SPZ.IR.3	46.812	38.433',],
            ANAR= ['ANAR.BHZ.IR.8	46.84	38.463',],
            ANJ = ['ANJ.SPZ.IR.13	46.654	38.404',
                   'ANJ.SPZ.IR.18	46.662	38.399',
                   'ANJ.SPZ.IR.3	46.812	38.433',],
            BST = ['BST.SPZ.IR.1	46.732	38.457',
                   'BST.SPZ.IR.10	46.707	38.448',],
        )
    
        result0 = group(io.StringIO(test0), '.') # sahar sa tacitly agreed these are correct
    
        main(result0)
    
        #try:
        #    assert result0 == expectation0
        #except AssertionError:
        #    print('\nexpect:\n')
        #    pprint.pprint(expectation0)
        #    print('\n\nresult:\n')
        #    pprint.pprint(result0)
        #    raise
        #
    Last edited by b49P23TIvg; November 21st, 2013 at 08:35 AM. Reason: Remark on faulty $R_e$.
  12. #7
  13. Registered User
    Thank you so much!
    I ran it as you said and everything was OK except the output. I checked the result for the two AFJ points on this website: http://andrew.hedges.name/experiments/haversine/
    Your code gives 2.065 km (2.065 AFJ.SPZ.IR.3 AFJ.SPZ.IR.8), but the website gives 4.13 km.
    The reason should be the radius you defined, 6367/2, which should be 6367 km.


    I really appreciate your kindness and help. You are awesome.
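
The discrepancy reported in this post can be reproduced with a short check. This sketch reuses the haversine formula posted earlier in the thread, with the radius turned into an explicit parameter (the names haversine_km and radius_km are assumptions for illustration), and evaluates the two AFJ points with both radii.

```python
from math import radians, cos, sin, asin, sqrt

def haversine_km(lon1, lat1, lon2, lat2, radius_km=6367):
    """Great-circle distance between two (lon, lat) points in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))

afj3 = (46.812, 38.433)   # AFJ.SPZ.IR.3
afj8 = (46.84, 38.463)    # AFJ.SPZ.IR.8

full = haversine_km(*afj3, *afj8)                      # radius 6367 km
half = haversine_km(*afj3, *afj8, radius_km=6367 / 2)  # radius used in the posted code
print('{:.3f} km with radius 6367'.format(full))
print('{:.3f} km with radius 6367/2'.format(half))
```

With these inputs the first value should land close to the website's 4.13 km and the second at exactly half of it, which points at the halved radius rather than the formula.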


  14. #8
  15. Contributing User
    Originally Posted by wikipedia
    Earth radius is the distance from Earth's center to its surface, about 6,371 kilometers
    Yes, I confused miles and kilometers (roughly a factor of 2).
    Given that a nautical mile is one arc minute (at the earth's equator, on the surface), multiplying 360 degrees by 60 minutes per degree by 1 nautical mile per minute gives the earth's circumference in nautical miles, and there being 5 nautical miles per 6 statute miles ... et cetera, et cetera, all computed with my sorry brain, led me to conclude that you'd supplied the earth's diameter. So I divided by 2.

    Turns out I had to look up the value anyway.

    Allowing for the confusion, my approximation wasn't terrible. Earth diameter in miles is 7800 or so, and the number you provided was 6300. With the exception of the horrible mistake, it was a pretty good guess.
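
The mental arithmetic described in this post can be written out. This is a sketch that assumes the international definition of the nautical mile (1852 m) and treats one nautical mile as one arc minute of a great circle; the variable names are illustrative only.

```python
import math

KM_PER_NAUTICAL_MILE = 1.852  # international nautical mile, exactly 1852 m

# 360 degrees * 60 arc minutes per degree, one nautical mile per arc minute
circumference_nmi = 360 * 60
circumference_km = circumference_nmi * KM_PER_NAUTICAL_MILE
radius_km = circumference_km / (2 * math.pi)
diameter_km = 2 * radius_km

print('circumference: {} nmi = {:.1f} km'.format(circumference_nmi, circumference_km))
print('radius:   {:.0f} km'.format(radius_km))
print('diameter: {:.0f} km'.format(diameter_km))
```

Under these assumptions the radius comes out at about 6367 km, which suggests the constant in the original haversine function is a radius in kilometers, not a diameter.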

    So change my program: the default radius should be the full earth radius, and the doc string was incorrect.
    Code:
    def great_circle_distance(a, b, radius = 6367):
        """
            Calculate the great circle distance between two points
            on a sphere. 
            Arguments a and b are iterable specifications of
            (longitude, latitude,) in decimal degrees.  
            The default is the radius of the earth, in km.
        """
        # convert decimal degrees to radians 
        radian_coordinates = map(radians, list(a)+list(b))
        # haversine() returns the square root term, so the central angle is 2*asin(...)
        angle = asin(haversine(*radian_coordinates))
        return 2 * radius * angle
  16. #9
  17. Registered User
    Dear David,
    I can't get the value of the main function returned so that it can be written to a text file. I've added the following script to your code, but it doesn't work. Can you help me, please?

    if __name__ == '__main__':

        import io, pprint
        fid=open('text','r+')
        fir=fid.read()
        result0 = group(io.StringIO(fir), '.')
        fout=open("output.txt","w")
        fout.write(str(main(result0)) + "\n")
        fout.close()
  18. #10
  19. Contributing User
    The main function prints directly and by default returns None.
    Code:
    import sys
    
    def main(d, output_stream = sys.stdout):
        for (key, value,) in d.items():
            print('Distances between points in group {}'.format(key), file=output_stream)
            for (A, B,) in itertools.product(value, repeat=2):
                (titleA, *a,) = A.split()
                (titleB, *b,) = B.split()
                km = great_circle_distance(map(float, a), map(float, b))
                print('{:9.3f} {} {}'.format(km, titleA, titleB), file=output_stream)
    
    
    if __name__ == '__main__':
        import io
        with open('text') as fid:  # read-only access is enough; the file is closed on exit
            fir=fid.read()
        result0 = group(io.StringIO(fir), '.')
        with open("output.txt","w") as fout:
            main(result0, fout)
            fout.write('\n') # if you still need extra new line
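
An alternative to passing a stream, sketched under the assumption that each group is a list of "name value value" strings as produced by group: a generator yields the formatted rows and the caller decides where to write them. The names distance_rows and toy_distance are hypothetical; toy_distance is a plain Euclidean stand-in for great_circle_distance so that the sketch is self-contained.

```python
import io
import itertools

def toy_distance(a, b):
    """Placeholder metric (plain Euclidean), standing in for great_circle_distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def distance_rows(groups, distance=toy_distance):
    """Yield one text line per group header and per ordered pair of members."""
    for key, members in groups.items():
        yield 'Distances between points in group {}'.format(key)
        for A, B in itertools.product(members, repeat=2):
            titleA, *a = A.split()
            titleB, *b = B.split()
            value = distance(list(map(float, a)), list(map(float, b)))
            yield '{:9.3f} {} {}'.format(value, titleA, titleB)

groups = {'X': ['X.1 0 0', 'X.2 3 4']}
rows = list(distance_rows(groups))

out = io.StringIO()  # a file object from open('output.txt', 'w') can be used the same way
out.write('\n'.join(rows) + '\n')
print(out.getvalue(), end='')
```

Because the function returns data instead of printing, str(main(...)) style code no longer ends up writing "None" to the file.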
  20. #11
  21. Registered User
    How should it write the output to output.txt? Also, the main function takes just 1 argument, so main(result0, fout) gives an error.
  22. #12
  23. Registered User
    Dear David,
    Sorry for my humble questions!
    Do you know a way to get the distance between every two waveforms only once, without duplicates? I mean we have "distance waveform1 waveform2", but we also have "distance waveform2 waveform1" in our output, which is useless.
  24. #13
  25. Contributing User
    The new definition of main in post 10 takes two arguments, the latter being an open output stream.

    I'll need to review the thread to eliminate one of (a, b) and (b, a).

    (Probably)
    Instead of printing, main will build a dictionary whose keys are frozenset([a, b]). This will automatically eliminate duplicates. Some sort of comparison of (a, b) against reversed((b, a)) would also work, but without sorting and whatnot we'd end up with an n squared time algorithm, and who wants that? In other words, that algorithm works, slowly.
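
One way to get each unordered pair exactly once, offered as a sketch rather than as the approach the poster had in mind: itertools.combinations(value, 2) yields (a, b) but never (b, a) and never (a, a), so it could stand in for itertools.product(value, repeat=2) in main. The frozenset idea is shown next to it for comparison; the sample names are placeholders.

```python
import itertools

points = ['P1', 'P2', 'P3']

# Each unordered pair once, in input order, with no self-pairs.
unique_pairs = list(itertools.combinations(points, 2))
print(unique_pairs)

# The frozenset approach: (a, b) and (b, a) collapse onto one dictionary key.
seen = {}
for a, b in itertools.product(points, repeat=2):
    if a != b:
        seen.setdefault(frozenset([a, b]), (a, b))
print(sorted(seen.values()))
```

For n points in a group, combinations produces n*(n-1)/2 pairs, so the work is done once per pair instead of being computed twice and then discarded.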
