#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2010
    Posts
    22
    Rep Power
    0

    Converting text file to binary


    Hi,
    I have a text file with numbers,
    like this:
    123
    1045
    4928

    and I want to save this file in memory as a binary file.

    I will explain.
    Right now, this file is 11 bytes (1 byte per character).
    By converting each number to its binary (base-2) representation, I can store each number in 1 or 2 bytes.

    123 => 1111011 = 1 byte
    1045 => 10000010101 = 2 bytes
    4928 => 1001101000000 = 2 bytes

    How can I do that? Is there an easy way?

    Thanks!
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Apr 2009
    Posts
    1,941
    Rep Power
    1225
    See:
    perldoc -f sprintf
    perldoc -f printf
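A minimal sketch of what those functions give you, using the three numbers from the question. `sprintf '%b'` only produces the binary digits as text; the `pack 'w*'` part is an addition not mentioned in the thread, shown as one possible way to get real 1- and 2-byte encodings.

```perl
use strict;
use warnings;

my @numbers = (123, 1045, 4928);

# sprintf with %b returns the base-2 digits as text (still one byte per digit).
for my $n (@numbers) {
    my $bits  = sprintf '%b', $n;
    my $bytes = int((length($bits) + 7) / 8);   # whole bytes needed for that many bits
    print "$n => $bits = $bytes byte(s)\n";
}

# pack produces actual binary data. The 'w' template is a BER compressed
# integer: 7 payload bits per byte, so values below 128 take 1 byte and
# values below 16384 take 2 bytes.
my $packed = pack 'w*', @numbers;
print length($packed), " bytes packed\n";

# unpack with the same template recovers the original numbers.
my @restored = unpack 'w*', $packed;
print "@restored\n";
```

To store `$packed` on disk, open the output handle with the `:raw` layer (or call `binmode` on it) so no newline or encoding translation is applied.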

    I suspect that you have an XY problem.
    "XY Problem" explanations
    Last edited by FishMonger; June 16th, 2013 at 09:27 AM.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    837
    Rep Power
    496
    Originally Posted by FishMonger
    I suspect that you have an XY problem.
    "XY Problem" explanations
    So do I. Why would anyone want to save 6 bytes with today's hardware?
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Apr 2009
    Posts
    1,941
    Rep Power
    1225
    Originally Posted by Laurent_R
    So do I. Why would anyone want to save 6 bytes with today's hardware?
    I suspect that it's related to the OP's prior question: Allocating constant space/memory for big hash.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2010
    Posts
    22
    Rep Power
    0
    Originally Posted by Laurent_R
    So do I. Why would anyone want to save 6 bytes with today's hardware?
    Can you explain yourself?
    Maybe I am missing something that I don't know.
    I am just trying to save disk space.
  10. #6
  11. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,263
    Rep Power
    1810
    Well, there are core modules such as this:

    IO::Compress::Zip

    which is why the XY problem was mentioned, I suspect. There may be easier/better ways to get to your end goal.
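As a rough illustration of that route, here is a sketch using the functional interface of IO::Compress::Zip on an in-memory buffer. The sample data and the member name `numbers.txt` are assumptions made up for the example; real savings depend entirely on the actual data.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Hypothetical sample data: one number per line, as in the original post.
my $text = join '', map { "$_\n" } (123, 1045, 4928) x 1000;

# Compress from one in-memory buffer to another.
# 'numbers.txt' is an assumed name for the member inside the archive.
my $zipped;
zip \$text => \$zipped, Name => 'numbers.txt'
    or die "zip failed: $ZipError\n";

printf "%d bytes as text, %d bytes zipped\n", length($text), length($zipped);

# Round trip to check that nothing was lost.
my $restored;
unzip \$zipped => \$restored
    or die "unzip failed: $UnzipError\n";
print $restored eq $text ? "round trip ok\n" : "round trip mismatch\n";
```

Passing file names instead of scalar references to `zip` and `unzip` works the same way for data on disk.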
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Jun 2012
    Posts
    837
    Rep Power
    496
    Just explain what you really need, rather than how to implement the approach you think is right, which might simply not be the best one.
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2010
    Posts
    22
    Rep Power
    0
    I will try to explain what I need.

    I have a file with many numbers; each number is a line number in another file, reference_file.
    For example, if I have 1342,
    that means I need to remember line 1342 of the reference file in order to represent the new file (each line contains some text).
    I have billions of these.
    So I am looking for a better way to save disk space than the regular format (characters).
    Maybe storing a reference would be better than a line number?
  16. #9
  17. !~ /m$/
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    May 2004
    Location
    Reno, NV
    Posts
    4,263
    Rep Power
    1810
    To me, that's a database. An integer is an excellent key, also.

    Of course, without knowing more I question whether these should all exist in a single flat file or table, or whether they could be split into different categories, etc.
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Apr 2009
    Posts
    1,941
    Rep Power
    1225
    First you say that your file is only 11 bytes containing an integer per line, now you say you have billions of lines like that, which is certainly going to be greater than 11 bytes.

    Saving disk space is not the main concern, or at least it shouldn't be. The main concern should be how to access that data in an efficient way.

    You haven't provided enough info for us to say with any confidence how you should store/access the data, but I'll agree with keath and say you should be using a database.

    If you don't want to (note that I said "don't want to" not "can't") use a database, then instead of tracking the desired line numbers in that separate file, you should be tracking the desired byte offsets.
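The byte-offset idea can be sketched with `tell` and `seek`. The file name `reference_demo.txt` and its contents are hypothetical, created only so the example runs on its own; a real reference file would be indexed once and the offsets stored instead of line numbers.

```perl
use strict;
use warnings;

# Hypothetical reference file, created here only so the sketch is runnable.
my $ref_file = 'reference_demo.txt';
open my $out, '>:raw', $ref_file or die "Cannot write $ref_file: $!";
print {$out} "first line\n", "second line\n", "third line\n";
close $out;

# One pass over the file: record the byte offset where each line starts.
open my $in, '<:raw', $ref_file or die "Cannot read $ref_file: $!";
my @offset = (0);
while (<$in>) {
    push @offset, tell($in);
}
pop @offset;   # the last entry is end-of-file, not the start of a line

# Later, jump straight to a line instead of reading everything before it.
my $wanted = 3;                       # 1-based line number
seek $in, $offset[$wanted - 1], 0;    # 0 means "from the start of the file"
my $line = <$in>;
close $in;
unlink $ref_file;

print $line;
```

With offsets in hand, fetching a line costs one `seek` and one read, regardless of how far into the file the line sits.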
