1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013

    Old dog learning new tricks

    I used to know how to program back in the 80's, but after 20 years of driving trucks, I've gotten real rusty.
    I have an idea for a data compression algorithm, but I'm having trouble with reading information from disk into memory where it can be manipulated. I've been trying to do this using Python, while teaching myself the language, but I just can't seem to get the program to manipulate the data the way I need it to.
    I need the program to look at ALL information found on disk as its binary or hex numeric values. It still needs to understand EOF markers and directories, but it doesn't need to interpret text symbols or try to process any of the code it examines. I just need the numerical values off the disk, and I'm having a heck of a time getting Python to do the job.
    I used to know BASIC and QuickBasic, but that was back in the day. Any suggestions on what language I should use, or what modules I should try using in Python? I'm kinda stumped with this whole file manipulation thing.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Usually Japan when not on contract
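    On Windows you have to add a "b" to the file mode when you open it. On adult operating systems (anything Unix-like) Python 2 sees no difference between "binary" and "text" files, so adding a "b" to the mode doesn't hurt anything (it's just ignored). In Python 3 the "b" matters on every platform, because it makes read() return raw bytes instead of decoded text.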
    python Code:
    f = open('file.name', 'rb')

    Individual bytes can be written and manipulated in binary, octal, hex, whatever you like. Just be explicit about which one you mean by using the proper escape codes ('\x01' notation) or the encode()/decode() methods.
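
    As a rough sketch of what that looks like in practice, the snippet below reads a file in binary mode and turns it into plain integers (0 to 255), then shows the same values as hex and binary strings. The helper name read_byte_values and the throwaway temp file are made up for the demo; nothing here is specific to any particular compression scheme.

```python
import os
import tempfile


def read_byte_values(path):
    """Return the file's contents as a list of integers in the range 0-255."""
    with open(path, 'rb') as f:
        data = f.read()
    # bytearray yields integers when iterated, on both Python 2 and Python 3
    return list(bytearray(data))


# Demo on a throwaway file (hypothetical data, only for illustration)
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, 'wb') as f:
    f.write(b'\x00\x01\xfeHi')

values = read_byte_values(demo_path)
os.remove(demo_path)

print(values)                              # [0, 1, 254, 72, 105]
print([format(v, '02x') for v in values])  # ['00', '01', 'fe', '48', '69']
print([format(v, '08b') for v in values])  # same values as 8-bit binary strings
```

    Reading the whole file at once is fine for small inputs. For big files you would probably want to call f.read(n) in a loop with some chunk size n instead.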

    There is also the struct module you might want to look into.
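
    For a feel of what struct does, here is a minimal sketch that packs two numbers into raw bytes and unpacks them again. The record layout (a 16-bit count followed by a 32-bit length, little-endian) is invented for the example and is not any real file format.

```python
import struct

# Hypothetical layout: '<' = little-endian with no padding,
# 'H' = unsigned 16-bit integer, 'I' = unsigned 32-bit integer
LAYOUT = '<HI'

# Pack two Python integers into 6 raw bytes
record = struct.pack(LAYOUT, 513, 67305985)
print(struct.calcsize(LAYOUT))   # 6
print(list(bytearray(record)))   # [1, 2, 1, 2, 3, 4]

# Unpack the same bytes back into integers
count, length = struct.unpack(LAYOUT, record)
print(count, length)
```

    The format string is what tells struct how many bytes each field takes and in which byte order, so the same bytes can be read back very differently if you change it.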

    Hopefully this is enough to get you thinking in a new direction.

    When doing low-level manipulations of the sort required for data compression, I personally find it easier to think about things in C than in Python, since it's already closer to that level (but not nearly as tedious as assembler!). One of the awesome things about Python, though, is that you can write your compression routine in (what I find to be) the more natural language for it, C, and use it directly in a larger Python program as a native module. So you can write the interface stuff in Python and the bit manipulation functions in C, and use both in the same program.
