#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    7
    Rep Power
    0

    Why is the confidence for the second tuple in L not calculated?


    Code:
    supportData = {('nas','fat'): 0.5, ('nas'): 1.0, ('fat'):0.6, ('van'):0.72, ('jos'):0.55,('van','jos'):0.10}
    
    L = [('nas','fat'),('van','jos')]
    
    #for i in L:
    for freqSet in L:#only get the sets with two or more items
        H = [''.join(list(i)) for i in freqSet]
        
        for conseq in H:
            print H
            freqsetlist = list(freqSet)
            freqsetlist.remove(conseq)
            print freqsetlist
            conf = supportData[freqSet]/supportData[tuple(freqsetlist)[0]]
            if conf >= 0.5:
                  print freqsetlist,'-->',conseq,'conf:',conf
    If I run the code I get:
    Code:
    ['nas', 'fat']
    ['fat']
    ['fat'] --> nas conf: 0.833333333333
    ['nas', 'fat']
    ['nas']
    ['nas'] --> fat conf: 0.5
    ['van', 'jos']
    ['jos']
    ['van', 'jos']
    ['van']
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    Given that your program doesn't work, and that I couldn't find the result you anticipated, I've modified your program to print better diagnostic information. First, you have this overly complicated statement
    H = [''.join(list(i)) for i in freqSet]
    which is equivalent to
    H = list(freqSet)
    making me think the program doesn't do what you expect.
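A quick way to check that equivalence, as a minimal sketch: it assumes every element of freqSet is already a string, as in the posted data. The names H_original and H_simple are only for illustration, and the snippet uses the print() function so it also runs on Python 3.

```python
freqSet = ('nas', 'fat')

# Each i is already a string; list(i) splits it into characters
# and ''.join(...) glues those characters straight back together.
H_original = [''.join(list(i)) for i in freqSet]
H_simple = list(freqSet)

print(H_original)
print(H_simple)
print(H_original == H_simple)
```

If the elements were not strings (for example nested tuples), the two expressions would no longer agree, so the shortcut only holds for data shaped like this.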
    Code:
    supportData = {('nas','fat'): 0.5, ('nas'): 1.0, ('fat'):0.6, ('van'):0.72, ('jos'):0.55,('van','jos'):0.10}
    
    L = [('nas','fat'),('van','jos')]
    
    for freqSet in L: #only get the sets with two or more items
        H = list(freqSet)
        for conseq in H:
            freqsetlist = list(freqSet)
            freqsetlist.remove(conseq)
            print'numerator   key:value  == %s:%f'%(freqSet,supportData[freqSet])
            key = tuple(freqsetlist)[0]
            print'denominator key:value  == %s:%f'%(key,supportData[key])
            conf = supportData[freqSet]/supportData[tuple(freqsetlist)[0]]
            if conf >= 0.5:
                  print freqsetlist,'-->',conseq,'conf:',conf
    Run:
    Code:
    $ ( cd /tmp && python q.py )
    numerator   key:value  == ('nas', 'fat'):0.500000
    denominator key:value  == fat:0.600000
    ['fat'] --> nas conf: 0.833333333333
    numerator   key:value  == ('nas', 'fat'):0.500000
    denominator key:value  == nas:1.000000
    ['nas'] --> fat conf: 0.5
    numerator   key:value  == ('van', 'jos'):0.100000
    denominator key:value  == jos:0.550000
    numerator   key:value  == ('van', 'jos'):0.100000
    denominator key:value  == van:0.720000
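A side note on why the denominator keys above print as bare strings: in supportData, ('nas') is just the string 'nas', because parentheses without a trailing comma do not make a tuple. That seems to be why the lookup needs tuple(freqsetlist)[0] rather than tuple(freqsetlist). A small sketch to confirm it, using a shortened copy of the dictionary (the names single and one_tuple are only for illustration):

```python
single = ('nas')      # parentheses only group; this is the string 'nas'
one_tuple = ('nas',)  # the trailing comma makes a one-element tuple

print(type(single).__name__)
print(type(one_tuple).__name__)

supportData = {('nas', 'fat'): 0.5, ('nas'): 1.0, ('fat'): 0.6}
freqsetlist = ['fat']

# The string key is present, the one-element tuple key is not.
print(tuple(freqsetlist)[0] in supportData)
print(tuple(freqsetlist) in supportData)
```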
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    7
    Rep Power
    0
    But the confidence for the tuple ('van','jos') hasn't been calculated like that of ('nas','fat'). The question is, what is stopping the confidence calculation for the second tuple?
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,894
    Rep Power
    481
    Nothing's stopping the calculation. Both confidences are computed, but
    0.10/0.55 (about 0.18) is less than a half, and
    0.10/0.72 (about 0.14) is also less than a half.
    This statement prevents them from printing:
    Code:
            if conf >= 0.5:
                  print freqsetlist,'-->',conseq,'conf:',conf
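To make the filtering visible, here is a rough sketch that prints every rule together with whether it passes the threshold. It is not the original program: rule_confidences and MIN_CONF are made-up names, the single-item keys are written as plain strings (which is what ('nas') evaluates to anyway), and it uses the print() function so it also runs on Python 3.

```python
supportData = {('nas', 'fat'): 0.5, 'nas': 1.0, 'fat': 0.6,
               'van': 0.72, 'jos': 0.55, ('van', 'jos'): 0.10}
L = [('nas', 'fat'), ('van', 'jos')]
MIN_CONF = 0.5  # same threshold as the `if` in the thread


def rule_confidences(freq_sets, support):
    """Return (antecedent, consequent, confidence) for every two-item set."""
    rules = []
    for freq_set in freq_sets:
        for conseq in freq_set:
            # With two items, the antecedent is simply the other item.
            antecedent = [item for item in freq_set if item != conseq][0]
            conf = support[freq_set] / support[antecedent]
            rules.append((antecedent, conseq, conf))
    return rules


rules = rule_confidences(L, supportData)
for antecedent, conseq, conf in rules:
    status = 'kept' if conf >= MIN_CONF else 'filtered out'
    print('%s --> %s conf: %.3f (%s)' % (antecedent, conseq, conf, status))
```

All four confidences are computed; only the two from ('nas','fat') reach 0.5, so a lower MIN_CONF would be needed for the ('van','jos') rules to show up.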
  8. #5
  9. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    7
    Rep Power
    0
    Poor me! Thank you. I am very grateful
