### Thread: Pairwise calculation in a list

1. No Profile Picture
Registered User
Devshed Newbie (0 - 499 posts)

Join Date
Nov 2013
Posts
18
Rep Power
0

#### Pairwise calculation in a list

Dear all,
I have a list of data as text, and I want to apply a function to this list. Since the function calculates the distance between two coordinates, I first need to group the lines whose first field starts with the same string.
1- I split off the first column of each line, but I don't know how to collect the lines that share the same first string.
2- I also don't know how to apply the function to every pair of points within each group.
I have been stuck on this problem for weeks!
This is a part of my list:
AFJ.SPZ.IR.3 46.812 38.433
AFJ.SPZ.IR.8 46.84 38.463
AKL.SPZ.IR.11 46.691 38.399
AKL.SPZ.IR.12 46.722 38.407
AKL.SPZ.IR.13 46.654 38.404
AKL.SPZ.IR.25 46.699 38.442
AKL.SPZ.IR.3 46.812 38.433
AKL.SPZ.IR.8 46.84 38.463
ALA.SPZ.IR.3 46.812 38.433
ANAR.BHZ.IR.8 46.84 38.463
ANJ.SPZ.IR.13 46.654 38.404
ANJ.SPZ.IR.18 46.662 38.399
ANJ.SPZ.IR.27 46.763 38.377
ANJ.SPZ.IR.3 46.812 38.433
ANJ.SPZ.IR.8 46.84 38.463
BST.SPZ.IR.1 46.732 38.457
BST.SPZ.IR.10 46.707 38.448
2. Originally Posted by sahar sa
AFJ.SPZ.IR.3 46.812 38.433
might be a "tab separated" string. Which part is the numeric vector? Which is the key?

By distance I presume you want
EuclideanNorm(V1-V2)

I don't know how to proceed after this (partly because I don't think I split the lines as you intend).
Code:
```
# runs in python 3
# doctest using command line   python -m doctest this_file.py

import collections, itertools, io

data = '''
AFJ.SPZ.IR.3	46.812	38.433
AFJ.SPZ.IR.8	46.84	 38.463
AKL.SPZ.IR.11	46.691	38.399
AKL.SPZ.IR.12	46.722	38.407
AKL.SPZ.IR.13	46.654	38.404
AKL.SPZ.IR.25	46.699	38.442
AKL.SPZ.IR.3	46.812	38.433
AKL.SPZ.IR.8	46.84	 38.463
ALA.SPZ.IR.3	46.812	38.433
ANAR.BHZ.IR.8	46.84 38.463
ANJ.SPZ.IR.13	46.654	38.404
ANJ.SPZ.IR.18	46.662	38.399
ANJ.SPZ.IR.27	46.763	38.377
ANJ.SPZ.IR.3	46.812	38.433
ANJ.SPZ.IR.8	46.84 38.463
BST.SPZ.IR.1	46.732	38.457
BST.SPZ.IR.10	46.707	38.448
'''.strip()

def group(s:'multi-line string'):
    '''
        Each line of the string is split into 2 fields.
        The first field is the dictionary key.
        The second field is the associated item.
        All items of like keys are appended to a list as the
        value of the dictionary key.
        Group returns that dictionary.
        >>> LF = '{:c}'.format(10)
        >>> # print(dict(group('a 1 4'+LF+'b 4'+LF+'a 2')))
        >>> dict(group('a 1 4'+LF+'b 4'+LF+'a 2')) == dict(a = ['1 4', '2'], b = ['4'])
        True
    '''
    result = collections.defaultdict(list)
    for line in io.StringIO(s):
        (key, item) = line.strip().split(maxsplit = 1)
        result[key].append(item)
    return result

def convert_numeric_string_to_vector(s):
    '''
        >>> convert_numeric_string_to_vector('1 2 3.75') == [1, 2, 3.75]
        True
    '''
    return [float(representation) for representation in s.split()]

def main(data = data):
    dictionary_of_unknown_unguessed = group(data)
    d = dictionary_of_unknown_unguessed
    for (key, value,) in d.items():
        L = d[key]
        for (i, item,) in enumerate(L):
            L[i] = convert_numeric_string_to_vector(item)
        d[key] = L
    return d

if __name__ == '__main__':
    import pprint
    pprint.pprint(main())

run_the_module_to_display = '''
{'AFJ.SPZ.IR.3': [[46.812, 38.433]],
'AFJ.SPZ.IR.8': [[46.84, 38.463]],
'AKL.SPZ.IR.11': [[46.691, 38.399]],
'AKL.SPZ.IR.12': [[46.722, 38.407]],
'AKL.SPZ.IR.13': [[46.654, 38.404]],
'AKL.SPZ.IR.25': [[46.699, 38.442]],
'AKL.SPZ.IR.3': [[46.812, 38.433]],
'AKL.SPZ.IR.8': [[46.84, 38.463]],
'ALA.SPZ.IR.3': [[46.812, 38.433]],
'ANAR.BHZ.IR.8': [[46.84, 38.463]],
'ANJ.SPZ.IR.13': [[46.654, 38.404]],
'ANJ.SPZ.IR.18': [[46.662, 38.399]],
'ANJ.SPZ.IR.27': [[46.763, 38.377]],
'ANJ.SPZ.IR.3': [[46.812, 38.433]],
'ANJ.SPZ.IR.8': [[46.84, 38.463]],
'BST.SPZ.IR.1': [[46.732, 38.457]],
'BST.SPZ.IR.10': [[46.707, 38.448]]}
'''
```
3.
Thank you a lot.
I was looking for this classification:
AFJ ['AFJ.SPZ.IR.3 46.812 38.433', 'AFJ.SPZ.IR.8 46.84 38.463', 'AFJ.SPZ.IR.8 46.84 38.463']
AKL ['AKL.SPZ.IR.11 46.691 38.399', 'AKL.SPZ.IR.11 46.691 38.399', 'AKL.SPZ.IR.12 46.722 38.407', 'AKL.SPZ.IR.12 46.722 38.407', 'AKL.SPZ.IR.13 46.654 38.404', 'AKL.SPZ.IR.25 46.699 38.442', 'AKL.SPZ.IR.3 46.812 38.433', 'AKL.SPZ.IR.8 46.84 38.463']
ALA ['ALA.SPZ.IR.3 46.812 38.433']
ANAR ['ANAR.BHZ.IR.8 46.84 38.463']
ANJ ['ANJ.SPZ.IR.13 46.654 38.404', 'ANJ.SPZ.IR.18 46.662 38.399', 'ANJ.SPZ.IR.3 46.812 38.433']

Although I still don't know how to apply my function pairwise to the coordinates in every class!

4. #### OK, I give up.

Originally Posted by sahar sa
I was looking for this classification:
AFJ ['AFJ.SPZ.IR.3 46.812 38.433', 'AFJ.SPZ.IR.8 46.84 38.463', 'AFJ.SPZ.IR.8 46.84 38.463']
...
Why is AFJ.SPZ.IR.8 46.84 38.463 repeated? (other duplicates exist)
Why is the BST key omitted? I find no indication in your descriptions that would give anyone a clue about the algorithm.

Either you're careless or you don't know how to specify the problem clearly. Small wonder you've struggled for weeks. Or I could be insufficiently smart.

Here's a program that might do part of what you actually want. It does not do what you say you want.
Code:
```
# runs in python 3
# doctest using command line   python -m doctest -v this_file.py
# Another test: the command    python this_file.py               should have no errors.

import pdb                              # yes, needed it.
import collections, itertools

def group(inf:'an io input stream', split_character:'easily generalized to a split_function, or to a regular expression' = None):
    '''
        Lines are first stripped, then
        Each line of the string is split into 2 fields.
        The first field is the dictionary key.
        An item is the entire line.
        All items of like keys are appended to a list as the
        value of the dictionary key.
        Group returns that dictionary.
        >>> import io
        >>> LF = '{:c}'.format(10)
        >>> result = dict(group(io.StringIO('a 1 4'+LF+'b 4'+LF+'a 2')))
        >>> result == dict(a = ['a 1 4', 'a 2'], b = ['b 4'])
        True
    '''
    result = collections.defaultdict(list)
    for line in inf:
        stripped_line = line.strip()
        (key, *junk,) = stripped_line.split(sep = split_character, maxsplit = 1)
        result[key].append(stripped_line)
    return result

if __name__ == '__main__':

    import io, pprint

    test0 = '''
AFJ.SPZ.IR.3	46.812	38.433
AFJ.SPZ.IR.8	46.84	38.463
AKL.SPZ.IR.11	46.691	38.399
AKL.SPZ.IR.12	46.722	38.407
AKL.SPZ.IR.13	46.654	38.404
AKL.SPZ.IR.25	46.699	38.442
AKL.SPZ.IR.3	46.812	38.433
AKL.SPZ.IR.8	46.84	38.463
ALA.SPZ.IR.3	46.812	38.433
ANAR.BHZ.IR.8	46.84	38.463
ANJ.SPZ.IR.13	46.654	38.404
ANJ.SPZ.IR.18	46.662	38.399
ANJ.SPZ.IR.27	46.763	38.377
ANJ.SPZ.IR.3	46.812	38.433
ANJ.SPZ.IR.8	46.84	38.463
BST.SPZ.IR.1	46.732	38.457
BST.SPZ.IR.10	46.707	38.448
'''.strip()

    expectation0 = dict(
        AFJ = ['AFJ.SPZ.IR.3	46.812	38.433',
               'AFJ.SPZ.IR.8	46.84	38.463',
               'AFJ.SPZ.IR.8	46.84	38.463',],
        AKL = ['AKL.SPZ.IR.11	46.691	38.399',
               'AKL.SPZ.IR.11	46.691	38.399',
               'AKL.SPZ.IR.12	46.722	38.407',
               'AKL.SPZ.IR.12	46.722	38.407',
               'AKL.SPZ.IR.13	46.654	38.404',
               'AKL.SPZ.IR.25	46.699	38.442',
               'AKL.SPZ.IR.3	46.812	38.433',
               'AKL.SPZ.IR.8	46.84	38.463',],
        ALA = ['ALA.SPZ.IR.3	46.812	38.433',],
        ANAR= ['ANAR.BHZ.IR.8	46.84	38.463',],
        ANJ = ['ANJ.SPZ.IR.13	46.654	38.404',
               'ANJ.SPZ.IR.18	46.662	38.399',
               'ANJ.SPZ.IR.3	46.812	38.433',],
        BST = ['BST.SPZ.IR.1	46.732	38.457',
               'BST.SPZ.IR.10	46.707	38.448',],
    )

    result0 = group(io.StringIO(test0), '.')
    try:
        assert result0 == expectation0
    except AssertionError:
        print('\nexpect:\n')
        pprint.pprint(expectation0)
        print('\n\nresult:\n')
        pprint.pprint(result0)
        raise
```
Last edited by b49P23TIvg; November 20th, 2013 at 07:44 AM.
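To see how the key is pulled out of each line in `group`, here is a minimal sketch of the split-and-unpack step on its own. The helper name `station_key` is made up for this illustration and is not part of the posted program.

```python
def station_key(line, split_character='.'):
    # Split at most once; the part before the first separator is the key.
    # The starred name absorbs the remainder (or nothing, if the separator is absent).
    (key, *rest) = line.strip().split(sep=split_character, maxsplit=1)
    return key

print(station_key('AFJ.SPZ.IR.3\t46.812\t38.433'))    # AFJ
print(station_key('ANAR.BHZ.IR.8\t46.84\t38.463'))    # ANAR
```

Because of the starred target, a line with no separator at all still unpacks cleanly and becomes its own key.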
5.
Thank you a lot for your time and code.
Actually, the AFJ repetition was a pasting mistake! I didn't paste all the data; I just wanted to show the grouping.
Excuse my carelessness!
This is the function that calculates the distance between coordinates:
#!/usr/bin/env python
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    km = 6367 * c
    return km

I need to run it within every group and calculate the distance between every two coordinates in that group.

6. Take note! 6367 km better represents the earth radius than the value shown here.

Changed haversine to return the haversine, not a distance. You should hand check one of the coordinate pairs to verify my algebraic transformations. Best to measure the distance on a map and compare that way. In this case you can use a very different method to check a result.

Used itertools.product to generate the Cartesian product pairs you seem to request.

Used split and float(numeric_string) to extract the coordinate fields from the strings and convert them to numbers.
Code:
```
#!/usr/bin/env python

# runs in python 3
# doctest using command line   python -m doctest -v this_file.py
# Another test: the command    python this_file.py               should have no errors.
# Correct the expectation and uncomment to restore functionality.

import collections, itertools
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    return sqrt(sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2)

def great_circle_distance(a, b, radius = 6367/2):
    """
        Calculate the great circle distance between two points
        on a sphere.
        Arguments a and b are iterable specifications of
        (longitude, latitude,) in decimal degrees.
        The default radius is that of the earth, in km.
    """
    # convert decimal degrees to radians
    # the data columns are (longitude, latitude), the order haversine expects
    (lon1, lat1,) = map(radians, a)
    (lon2, lat2,) = map(radians, b)
    angle = asin(haversine(lon1, lat1, lon2, lat2))
    return 2 * radius * angle

def group(inf:'an io input stream', split_character:'easily generalized to a split_function, or to a regular expression' = None):
    '''
        Lines are first stripped, then
        Each line of the string is split into 2 fields.
        The first field is the dictionary key.
        An item is the entire line.
        All items of like keys are appended to a list as the
        value of the dictionary key.
        Group returns that dictionary.
        >>> import io
        >>> LF = '{:c}'.format(10)
        >>> result = dict(group(io.StringIO('a 1 4'+LF+'b 4'+LF+'a 2')))
        >>> result == dict(a = ['a 1 4', 'a 2'], b = ['b 4'])
        True
    '''
    result = collections.defaultdict(list)
    for line in inf:
        stripped_line = line.strip()
        (key, *junk,) = stripped_line.split(sep = split_character, maxsplit = 1)
        result[key].append(stripped_line)
    return result

def main(d):
    for (key, value,) in d.items():
        print('Distances between points in group {}'.format(key))
        for (A, B,) in itertools.product(value, repeat=2):
            (titleA, *a,) = A.split()
            (titleB, *b,) = B.split()
            km = great_circle_distance(map(float, a), map(float, b))
            print('{:9.3f} {} {}'.format(km, titleA, titleB))

if __name__ == '__main__':

    import io, pprint

    test0 = '''
AFJ.SPZ.IR.3	46.812	38.433
AFJ.SPZ.IR.8	46.84	38.463
AKL.SPZ.IR.11	46.691	38.399
AKL.SPZ.IR.12	46.722	38.407
AKL.SPZ.IR.13	46.654	38.404
AKL.SPZ.IR.25	46.699	38.442
AKL.SPZ.IR.3	46.812	38.433
AKL.SPZ.IR.8	46.84	38.463
ALA.SPZ.IR.3	46.812	38.433
ANAR.BHZ.IR.8	46.84	38.463
ANJ.SPZ.IR.13	46.654	38.404
ANJ.SPZ.IR.18	46.662	38.399
ANJ.SPZ.IR.27	46.763	38.377
ANJ.SPZ.IR.3	46.812	38.433
ANJ.SPZ.IR.8	46.84	38.463
BST.SPZ.IR.1	46.732	38.457
BST.SPZ.IR.10	46.707	38.448
'''.strip()

    expectation0 = dict(  # sahar sa agreed these incorrect
        AFJ = ['AFJ.SPZ.IR.3	46.812	38.433',
               'AFJ.SPZ.IR.8	46.84	38.463',
               'AFJ.SPZ.IR.8	46.84	38.463',],
        AKL = ['AKL.SPZ.IR.11	46.691	38.399',
               'AKL.SPZ.IR.11	46.691	38.399',
               'AKL.SPZ.IR.12	46.722	38.407',
               'AKL.SPZ.IR.12	46.722	38.407',
               'AKL.SPZ.IR.13	46.654	38.404',
               'AKL.SPZ.IR.25	46.699	38.442',
               'AKL.SPZ.IR.3	46.812	38.433',
               'AKL.SPZ.IR.8	46.84	38.463',],
        ALA = ['ALA.SPZ.IR.3	46.812	38.433',],
        ANAR= ['ANAR.BHZ.IR.8	46.84	38.463',],
        ANJ = ['ANJ.SPZ.IR.13	46.654	38.404',
               'ANJ.SPZ.IR.18	46.662	38.399',
               'ANJ.SPZ.IR.3	46.812	38.433',],
        BST = ['BST.SPZ.IR.1	46.732	38.457',
               'BST.SPZ.IR.10	46.707	38.448',],
    )

    result0 = group(io.StringIO(test0), '.') # sahar sa tacitly agreed these are correct

    main(result0)

    #try:
    #    assert result0 == expectation0
    #except AssertionError:
    #    print('\nexpect:\n')
    #    pprint.pprint(expectation0)
    #    print('\n\nresult:\n')
    #    pprint.pprint(result0)
    #    raise
```
Last edited by b49P23TIvg; November 21st, 2013 at 08:35 AM. Reason: Remark on faulty \$R_e\$.
7.
Thank you a lot!
I ran it as you said and everything was OK except the output. I checked the result for the two AFJ points on this website: http://andrew.hedges.name/experiments/haversine/
The result from your code is 2.065 km (2.065 AFJ.SPZ.IR.3 AFJ.SPZ.IR.8), but the result on the website is 4.13 km.

I really appreciate your kindness and help. You are awesome.

8. Originally Posted by wikipedia
Earth radius is the distance from Earth's center to its surface, about 6,371 kilometers
Yes, I confused miles and kilometers (roughly a factor of 2).
Given that a nautical mile is one arc minute (at the earth's equator, on the surface), multiplying 360 degrees by 60 minutes per degree by 1 mile per minute gives the earth's circumference in nautical miles, and there being 5 nautical miles per 6 statute miles ... et cetera, et cetera, all computed with my sorry brain, led me to conclude that you'd supplied the earth's diameter. So I divided by 2.

Turns out I had to look up the value anyway.

Allowing for the confusion, my approximation wasn't terrible. Earth diameter in miles is 7800 or so, and the number you provided was 6300. With the exception of the horrible mistake, it was a pretty good guess.
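The nautical-mile argument can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes the modern definition of the nautical mile (1852 m), which the post does not state, and it treats the earth as a perfect sphere. Under those assumptions the result suggests that 6367 km is a radius, not a diameter.

```python
from math import pi

KM_PER_NAUTICAL_MILE = 1.852           # assumption: international nautical mile

circumference_nmi = 360 * 60           # one nautical mile per arc minute
circumference_km = circumference_nmi * KM_PER_NAUTICAL_MILE
radius_km = circumference_km / (2 * pi)

print(circumference_nmi)               # 21600
print(round(radius_km))                # 6367
```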

So change my program. The doc string was incorrect.
Code:
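```
def great_circle_distance(a, b, radius = 6367/2):
    """
        Calculate the great circle distance between two points
        on a sphere.
        Arguments a and b are iterable specifications of
        (longitude, latitude,) in decimal degrees.
        The default is the half radius of the earth, in km.
    """
    # convert decimal degrees to radians
    # the data columns are (longitude, latitude), the order haversine expects
    (lon1, lat1,) = map(radians, a)
    (lon2, lat2,) = map(radians, b)
    angle = asin(haversine(lon1, lat1, lon2, lat2))
    return 2 * radius * angle
```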
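As a cross-check of the factor of 2, here is a self-contained sketch of the same haversine formula with the full radius as the default. The function name `great_circle_km` is made up for this example, and it assumes the two data columns are (longitude, latitude), which is the reading that agrees with the 4.13 km reported from the website.

```python
from math import radians, cos, sin, asin, sqrt

def great_circle_km(a, b, radius=6367.0):
    """a and b are (longitude, latitude) pairs in decimal degrees."""
    (lon1, lat1) = map(radians, a)
    (lon2, lat2) = map(radians, b)
    # haversine of the central angle between the two points
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * radius * asin(sqrt(h))

afj3 = (46.812, 38.433)
afj8 = (46.84, 38.463)
print('{:.2f}'.format(great_circle_km(afj3, afj8)))   # 4.13
```

Passing `radius=6367/2` to the same function reproduces the halved value of about 2.065 km.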
9.
Dear David,
I can't get the value of the main function returned so that it can be written to a text file. I've added the following script to your code, but it doesn't work. Can you help me, please?

if __name__ == '__main__':

    import io, pprint
    fid=open('text','r+')
    result0 = group(io.StringIO(fir), '.')
    fout=open("output.txt","w")
    fout.write(str(main(result0)) + "\n")
    fout.close()
10. The main function prints directly and by default returns None.
Code:
```
import sys

def main(d, output_stream = sys.stdout):
    for (key, value,) in d.items():
        print('Distances between points in group {}'.format(key), file=output_stream)
        for (A, B,) in itertools.product(value, repeat=2):
            (titleA, *a,) = A.split()
            (titleB, *b,) = B.split()
            km = great_circle_distance(map(float, a), map(float, b))
            print('{:9.3f} {} {}'.format(km, titleA, titleB), file=output_stream)

if __name__ == '__main__':
    # group() accepts any input stream, so the open file is passed directly
    with open('text') as fid:
        result0 = group(fid, '.')
    with open("output.txt","w") as fout:
        main(result0, fout)
        fout.write('\n') # if you still need extra new line
```
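The part that does the work here is the `file=` keyword of `print`. A small sketch, using an in-memory `io.StringIO` in place of an open file; the function name `report` and the sample row are made up for illustration.

```python
import io
import sys

def report(rows, output_stream=sys.stdout):
    # print() writes to whatever stream is named by the file= keyword
    for (km, title_a, title_b) in rows:
        print('{:9.3f} {} {}'.format(km, title_a, title_b), file=output_stream)

buffer = io.StringIO()          # in-memory stand-in for an open file
report([(4.129, 'AFJ.SPZ.IR.3', 'AFJ.SPZ.IR.8')], buffer)
print(repr(buffer.getvalue()))
```

Any object with a `write` method can be passed the same way, including the `fout` returned by `open("output.txt", "w")`.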
11.
How should it write the output to output.txt? Also, the main function takes just 1 argument, so main(result0, fout) gives an error.
12.
Dear David,
Sorry for my humble questions!
Do you know a way to get the distance between every two waveforms only once, without duplicates? I mean, the output has "distance waveform1 waveform2", but it also has "distance waveform2 waveform1", which is useless.
13. The new definition of main in post 10 takes two arguments, the latter being an open output stream.

I'll need to review the thread to eliminate one of a,b b,a.

(Probably)
Instead of printing, main will build a dictionary whose keys are frozenset([a, b]). This will automatically eliminate duplicates. Some sort of comparison of (a, b) against reversed((b, a)) would also work, but without sorting and whatnot we'd end up with an n squared time algorithm, and who wants that? In other words, the algorithm works, slowly.
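One alternative to the frozenset idea, offered as a sketch and not as the poster's stated approach: `itertools.combinations` yields each unordered pair exactly once and never pairs an item with itself. The helper name `unordered_pairs` and the short sample list are made up for illustration.

```python
import itertools

def unordered_pairs(items):
    # combinations(items, 2) yields each pair once, in input order,
    # and never pairs an item with itself
    return list(itertools.combinations(items, 2))

group_akl = ['AKL.SPZ.IR.11', 'AKL.SPZ.IR.12', 'AKL.SPZ.IR.13']
for (a, b) in unordered_pairs(group_akl):
    print(a, b)
```

For a group of n lines this gives n(n-1)/2 pairs instead of the n squared produced by `itertools.product(value, repeat=2)`.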