#1
Devshed Newbie (0 - 499 posts)
Join Date: Dec 2002 | Posts: 296 | Rep Power: 12

Shannon's information theory: log e, or log2?


Shannon's entropy formula looks like this:

H = - Σ P(i) log P(i)

(Σ representing a sum over all messages i, and P(i) the probability of message i)

The log: should that be the natural log (base e), or log2? Does anyone know?
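For concreteness, here is a small Python sketch of that formula (my own illustration, not from the thread), with the log base left as a parameter so either convention can be tried:

```python
import math

def entropy(probs, base=2):
    """Shannon entropy H = -sum(P(i) * log P(i)), in the given log base."""
    # Terms with p == 0 contribute nothing (lim p->0 of p log p is 0), so skip them.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin toss carries exactly 1 bit of information in base 2:
print(entropy([0.5, 0.5]))  # → 1.0
```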
#2
Contributing User
Devshed Newbie (0 - 499 posts)
Join Date: May 2003 | Location: Bucharest, Romania | Posts: 70 | Rep Power: 12


    log2

    Should be obvious, since the code's alphabet is {0,1}
    Andrei
#3
Devshed Newbie (0 - 499 posts)
Join Date: Dec 2002 | Posts: 296 | Rep Power: 12
Thanks. Hmm, it might be obvious to you, but not to me, and I'm still not sure, because a book I have says "log e" in that formula. Also, in the original paper it's just "log", and isn't "log" on its own the natural log? It is on my calculator and in the C language, in any case. But then I've seen log2 in a version of that formula on a web page somewhere, and what you say makes sense, so I'm not sure. Not sure how to find out definitely, either.
#4
Contributing User
Devshed Newbie (0 - 499 posts)
Join Date: May 2003 | Location: Bucharest, Romania | Posts: 70 | Rep Power: 12
You're right. It's confusing when they don't write the logarithm's base. I found it obvious because Shannon's talking about binary information: 0 or 1, which is base 2. You should read Shannon's book on data compression; it's a very complete guide:
    http://www.data-compression.com/theory.html
    Good Luck!
    Andrei
#5
Devshed Newbie (0 - 499 posts)
Join Date: Dec 2002 | Posts: 296 | Rep Power: 12
OK, thanks. So the book I have that says "log e" is incorrect. Maybe I'll email the author to tell him he's a silly sod. You'd have thought someone would check and double-check before putting it into their book.

I have "The Mathematical Theory of Communication", the book with Shannon's original paper and a follow-up by Warren Weaver. The Warren Weaver part is great, as it's in reasonably descriptive English, but unfortunately I get lost within the first few pages of Shannon's part.

    thanks for the reply and link.
#6
Devshed Newbie (0 - 499 posts)
Join Date: Dec 2002 | Posts: 296 | Rep Power: 12
Stating that the base is 2 in Shannon's equation turns out not to be accurate after all. The base was left unspecified on purpose, because the formula can be used with any base.

From Shannon's paper:
"The choice of a logarithmic base corresponds to the choice of a unit for measuring information. If the base 2 is used the resulting units may be called binary digits, or more briefly bits....If the base 10 is used the units may be called decimal digits."

So basically it depends on what unit you want the information measured in.

Yes, base 2 is most often used, but any other base can be used, including base e.
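To illustrate that point with a quick sketch (my own example, not from the thread): entropies computed in different bases differ only by a constant factor, since log2(x) = ln(x) / ln(2).

```python
import math

def entropy(probs, base):
    # H = -sum(P(i) * log P(i)) in the chosen base
    return -sum(p * math.log(p, base) for p in probs if p > 0)

p = [0.5, 0.25, 0.125, 0.125]
h_bits = entropy(p, 2)        # entropy in bits (base 2)
h_nats = entropy(p, math.e)   # entropy in nats (base e)

# Converting nats to bits is just division by ln 2:
print(h_bits, h_nats / math.log(2))  # the two values agree
```

So whichever base you pick, the shape of the entropy function is the same; only the unit changes.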
#7
Contributing User
Devshed Newbie (0 - 499 posts)
Join Date: Feb 2002 | Location: BCN | Posts: 84 | Rep Power: 13
Actually, it does not really matter, because the difference between the two logs is only a constant factor. And since we are more interested in the maxima and minima of information, constant factors are no big deal.
