#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2014
    Posts
    6
    Rep Power
    0

    Bpe compression help


    I have compiled Philip Gage's bpe.c source program and managed to make it fit my type of BPE files, though not completely. I am using a hex editor to compare the new BPE file I compress against the original BPE file. There is a problem: the two bytes after 00 00, especially after any 00 00 8x value, are always swapped.

    I compiled the bpe.c source with Dev-C++ and ran it from cmd with the values 4000 8192 200 3.

    If anyone can help, or just modify the bpe.c source program, I would appreciate it, because I don't know how to work with the C language. If you need the decompress source program, I have it,
    and it works perfectly.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    186
    Rep Power
    82
    I have compiled Philip Gage's bpe.c source program and managed to make it fit my type of BPE files, though not completely. I am using a hex editor to compare the new BPE file I compress against the original BPE file. There is a problem: the two bytes after 00 00, especially after any 00 00 8x value, are always swapped.
    I really don't understand what you're trying to convey from the above quote. I have Philip Gage's compress and decompress source code and can try to recreate the issue if only I had some understanding of the issue.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2014
    Posts
    6
    Rep Power
    0
    Well, first I built Philip Gage's compress source code into a DOS (console) program with Dev-Cpp 5.6.1 TDM-GCC x64 4.8.1, you know, like this.
    Then I tried to compress my files with it, but it gives me wrong files every time.
    I have files compressed in BPE. I have to decompress them to start my work, and after I finish I have to compress them again.
    So I decompressed a file and tried to compress it, but every time the new file is different from the original one.
    I use a hex editor to find the difference between the file compressed by Philip Gage's source code and the original one: after every 00 00 81 or 82 or 83 etc., the next two bytes are swapped. It wore me out to arrive at these values: 4000 is the block size, 8192 is the hash size, 200 is the maxchars and 3 is the THRESHOLD. Look man, all I want is a compress tool that fits my files, nothing else.
    I don't know where the fault is, but I have the right decompress source code, which decompresses the files perfectly, so I thought we could use it to modify Philip Gage's source code to fit my files.
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    186
    Rep Power
    82
    NOTE!!!!!: the code tags just don't seem to work properly for me in this post. So, you'll have to sort out the source code listed below

    I just can't recreate your problem.

    My test consisted of creating an ASCII test file named test.txt. This file only contained the string test.

    I then compressed this ASCII text file as follows....

    Code:
    bpe test.txt out.txt 4000 8192 200 3
    The contents of out.txt are as follows...

    fe 7f fe ff 00 08 74 65 73 74 0d 0a 0d 0a

    I then decompressed the out.txt file as follows....

    Code:
    expand out.txt test.txt
    Next, I again compressed test.txt as follows

    Code:
    bpe test.txt out.txt 4000 8192 200 3
    And again, the contents of out.txt are as follows...

    fe 7f fe ff 00 08 74 65 73 74 0d 0a 0d 0a

    The bottom line here is that the compressed out.txt file always contains the same hex
    values no matter how many times I run the compression and decompression utilities.
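
    One way to check that two files are byte-for-byte identical without reading hex dumps by eye is a small comparison routine. The following is only a sketch: the names first_difference and first_file_difference are made up for this example and are not part of Gage's sources.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from bpe.c or expand.c).
   Return the offset of the first byte at which the two buffers differ.
   Return -1 if they have the same length and content. If one buffer is
   a prefix of the other, return the length of the shorter one. */
long first_difference(const unsigned char *a, size_t na,
                      const unsigned char *b, size_t nb)
{
    size_t i;
    size_t n = na < nb ? na : nb;

    for (i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return (long)i;
        }
    }
    return na == nb ? -1L : (long)n;
}

/* The same check on two streams opened in binary mode ("rb"). */
long first_file_difference(FILE *fa, FILE *fb)
{
    long offset = 0;
    int ca, cb;

    for (;;) {
        ca = getc(fa);
        cb = getc(fb);
        if (ca != cb) {
            return offset;      /* also covers one file ending early */
        }
        if (ca == EOF) {
            return -1L;         /* both ended together: identical */
        }
        offset++;
    }
}
```

    Calling first_file_difference on the original and the re-compressed file gives the offset to jump to in the hex editor, or -1 when there is nothing to look at.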

    I have attached the Gage source code that I am using for my tests to ensure that we're comparing apples to apples.

    Code:
    /* bpe.c - rewritten to handle parameterised command line input */

    /* from compress.c */
    /* Copyright Philip Gage */
    /* printed in 'The C Users Journal' February, 1994 */

    #include <stdio.h>
    #include <stdlib.h>

    #define BLOCKSIZE 10000 /* maximum block size */
    #define HASHSIZE 8192 /* size of hash table */
    #define MAXCHARS 220 /* char set per block */
    #define THRESHOLD 3 /* minimum pair count */

    unsigned char buffer[BLOCKSIZE]; /* data block */
    unsigned char leftcode[256]; /* pair table */
    unsigned char rightcode[256]; /* pair table */
    unsigned char left[HASHSIZE]; /* hash table */
    unsigned char right[HASHSIZE]; /* hash table */
    unsigned char count[HASHSIZE]; /* pair count */
    int size; /* size of current data block */

    /* function prototypes */
    int lookup (unsigned char, unsigned char, int );
    int fileread (FILE *, int, int, int);
    void filewrite (FILE *);
    void compress (FILE *, FILE *, int, int, int, int);

    /* return index of character pair in hash table */
    /* deleted nodes have a count of 1 for hashing */
    int lookup (unsigned char a, unsigned char b, int hs)
    {
    int index; /* slot in the hash table for the pair (a, b) */


    /* compute hash key from both characters */
    index = (a ^ (b << 5)) & (hs-1);
    /* b is promoted to int before the shift, so no bits are lost: */
    /* if b = 10110101 then (b << 5) = 1011010100000. The mask (hs-1) then */
    /* keeps only the low bits, which requires hs to be a power of two */

    /* search for pair or first empty slot */
    while ((left[index] != a || right[index] != b) && count[index] != 0)
    {
    index = (index + 1) & (hs - 1);
    }

    left[index] = a;
    right[index] = b;
    return index;
    }

    /* read next block from input file into buffer */
    int fileread (FILE *input, int bs, int hs, int mc)
    {
    int c, index, used=0;

    /* reset hash table and pair table */
    for (c = 0; c < hs; c++)
    count[c] = 0;
    for (c = 0; c < 256; c++)
    {
    leftcode[c] = c;
    rightcode[c] = 0;
    }
    size = 0;

    /* read data until full or few unused chars */
    while (size < bs && used < mc && (c = getc(input)) != EOF)
    {
    if (size > 0)
    {
    index = lookup(buffer[size-1], c, hs);
    if (count[index] < 255)
    {
    ++count[index];
    }
    }
    buffer[size++] = c;

    /* use right code to flag data chars found */
    if (!rightcode[c])
    {
    rightcode[c] = 1;
    used++;
    }
    }
    return c == EOF;
    }

    /* write each pair table and data block to output */
    void filewrite( FILE *output )
    {
    int i, len, c = 0;

    /* for each character 0..255 */
    while ( c < 256 )
    {
    /* if not a pair code, count run of literals */
    if ( c == leftcode[c] )
    {
    len = 1; c++;
    while ( len < 127 && c < 256 && c == leftcode[c])
    {
    len++; c++;
    }
    putc( len + 127, output );
    len = 0;
    if ( c == 256 ) break;
    }

    /* else count run of pair codes */
    else
    {
    len = 0;
    c++;


    /* original, will add extra brackets per compiler suggestions: while ( len < 127 && c < 256 && c != leftcode[c] || len < 125 && c < 254 && c+1 != leftcode[c+1]) */
    while (( len < 127 && c < 256 && c != leftcode[c]) || (len < 125
    && c < 254 && c+1 != leftcode[c+1]))
    {
    len++;
    c++;
    }
    putc(len, output);
    c -= len+1;
    }

    /* write range of pairs to output */
    for ( i = 0; i <= len; i++ )
    {
    putc(leftcode[c], output);
    if ( c != leftcode[c] )
    { putc(rightcode[c], output); }
    c++;
    }
    }
    /* write size bytes and compressed data block */
    putc(size/256, output);
    putc(size%256, output);
    fwrite(buffer, size, 1, output);
    }

    /* compress from input file to output file */
    void compress( FILE *infile, FILE *outfile,
    int bs, int hs, int mc, int th )
    {
    int leftch, rightch, code, oldsize;
    int index, r, w, best, done = 0;

    /* compress each data block until end of file */
    while ( !done )
    {
    done = fileread(infile, bs, hs, mc);
    code = 256;

    /* compress this block */
    for(;;)
    {
    /* get next unused chr for pair code */
    for ( code--; code >= 0; code-- )
    {
    if ( code == leftcode[code] && !rightcode[code] )
    {
    break;
    }
    }

    /* must quit if no unused chars left */
    if ( code < 0 )
    {
    break;
    }

    /* find most frequent pair of chars */
    for ( best = 2, index = 0; index < hs; index++ )
    {
    if (count[index] > best)
    {
    best = count[index];
    leftch = left[index];
    rightch = right[index];
    }
    }

    /* done if no more compression possible */
    if ( best < th )
    {
    break;
    }

    /* Replace pairs in data, adjust pair counts */
    oldsize = size - 1;
    for ( w = 0, r = 0; r < oldsize ; r++ )
    {
    if (buffer[r] == leftch && buffer[r+1] == rightch)
    {
    if ( r > 0 )
    {
    index = lookup(buffer[w-1], leftch, hs);
    if ( count[index] > 1 )
    {
    --count[index];
    }
    index = lookup( buffer[w-1], code, hs );
    if ( count[index] < 255 )
    {
    ++count[index];
    }
    }
    if ( r < oldsize - 1 )
    {
    index = lookup( rightch, buffer[r+2] , hs);
    if ( count[index] > 1 )
    {
    --count[index];
    }
    index = lookup( code, buffer[r+2], hs );
    if ( count[index] < 255 )
    {
    ++count[index];
    }
    }
    buffer[w++] = code;
    r++;
    size--;
    }
    else
    {
    buffer[w++] = buffer[r];
    }
    }
    buffer[w] = buffer[r];

    /* add to pair substitution table */
    leftcode[code] = leftch;
    rightcode[code] = rightch;
        
          /* delete pair from hash table */
          index = lookup( leftch, rightch, hs );
          count[index] = 1;
        }
        filewrite( outfile );
      }
    }
    
    int main (int argc, char *argv[] )
    {
      FILE *infile, *outfile;
    /* argc = 7              */
    /*   argv[0] = command   */
    /*   argv[1] = infile    */
    /*   argv[2] = outfile   */
    /*   argv[3] = BLOCKSIZE */
    /*   argv[4] = HASHSIZE  */
    /*   argv[5] = MAXCHARS  */
    /*   argv[6] = THRESHOLD */
      int bs, hs, mc, th;
    
      if (argc != 7)
      {
        printf("Usage: bpe infile outfile blocksize hashsize maxchars threshold\n");
        printf("typical: bpe infile outfile 5000 4096 200 3\n");
        return 1;
      }
    
      bs = atoi( argv[3] );
      hs = atoi( argv[4] );
      mc = atoi( argv[5] );
      th = atoi( argv[6] );
    
      /* the global tables are fixed-size arrays, so reject values that
         would overrun them; lookup() masks with hs-1, so hs must be a
         power of two; maxchars below 1 would never read any data
      */
      if (bs < 1 || bs > BLOCKSIZE || hs < 1 || hs > HASHSIZE ||
          (hs & (hs - 1)) != 0 || mc < 1 || mc > 256)
      {
        printf("Need 1 <= blocksize <= %d, hashsize a power of two <= %d, 1 <= maxchars <= 256\n",
               BLOCKSIZE, HASHSIZE);
        return 1;
      }
    
      if (( infile = fopen( argv[1], "rb" )) == NULL )
      {
        printf("Error opening input %s\n",argv[1]);
        return 1;
      }
      if (( outfile = fopen( argv[2], "wb" )) == NULL )
      {
        printf("Error opening output %s\n",argv[2]);
        fclose( infile );
        return 1;
      }
    
      compress( infile, outfile, bs, hs, mc, th );
      fclose( outfile );
      fclose( infile );
      return 0;
    }
    
    /* end of file */
    Code:
    /* expand.c */
    /* Copyright 1994 by Philip Gage */
    
    #include <stdio.h>
    
    /* decompress data from input to output */
    void expand (FILE *input, FILE *output)
    {
      unsigned char left[256], right[256], stack[30];
      short int c, count, i, size;
    
      /* unpack each block until end of file */
      while (( count = getc ( input )) != EOF )
      {
        /* set left to itself as literal flag */
        for ( i = 0 ; i < 256; i++ )
        {
          left[i] = i;
        }
    
        /* read pair table */
        for ( c = 0 ; ; )
        {
          /* skip range of literal bytes */
          if ( count > 127 )
          {
            c += count -127;
            count = 0;
          }
          if ( c==256 )
          { 
            break;
          }
       
          /* read pairs, skip right if literal */
          for ( i = 0; i <= count; i++, c++ )
          {
            left[c] = getc(input);
            if ( c != left[c] )
            {
              right[c] = getc(input);
            }
          }
          if (c == 256)
          {
            break;
          }
          count = getc(input);
        }
        
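        /* calculate packed data block size: high byte first, then low byte. */
        /* The bytes are read in separate statements because C does not */
        /* specify which getc() in a single expression is evaluated first */
        size = getc(input);
        size = 256 * size + getc(input);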
    
        /* unpack data block */
        for ( i = 0 ; ; )
        {
          /* pop byte from stack or read byte */
          if ( i )
          { 
            c = stack[--i];
          }
          else
          {
            if ( !size--)
            {
              break;
            }
            c = getc(input);
          }
    
          /* output byte or push pair on stack */
          if ( c == left[c] )
          {
            putc(c, output);
          }
          else
          {
            stack[i++] = right[c];
            stack[i++] = left[c];
          }
        }
      }
    }
    
    int main ( int argc, char *argv[] )
    {
      FILE *infile, *outfile;
    
      if ( argc != 3 )
      {
        printf("Usage: expand infile outfile\n");
        return 1;
      }
      if (( infile = fopen(argv[1],"rb"))==NULL)
      {
        printf("Error opening input %s\n",argv[1]);
        return 1;
      }
      if ((outfile=fopen(argv[2],"wb"))==NULL)
      {
        printf("Error opening output %s\n", argv[2]);
        fclose ( infile );
        return 1;
      }
      expand ( infile, outfile );
      fclose ( outfile );
      fclose ( infile );
      return 0;
    }
    
    /* end of file */
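
    To see where the pair table ends and the two size bytes begin in a dump such as fe 7f fe ff 00 08 ..., the table-walking loop from expand.c can be run over a byte buffer instead of a file. This is a sketch under assumptions: size_field_offset is a hypothetical helper name, it looks only at the first block, and the bounds checks are additions that expand.c itself does not have.

```c
#include <stddef.h>

/* Hypothetical helper: walk one block's pair table the way expand.c
   does and return the offset of the two size bytes that follow it.
   Returns -1 if the buffer ends before the table is complete. */
long size_field_offset(const unsigned char *buf, size_t n)
{
    size_t pos = 0;
    int c = 0;          /* next character code to be described */
    int count, i;

    if (n == 0) {
        return -1;
    }
    count = buf[pos++];
    for (;;) {
        /* a count above 127 skips a run of (count - 127) literal codes */
        if (count > 127) {
            c += count - 127;
            count = 0;
        }
        if (c >= 256) {
            break;
        }
        /* count + 1 entries follow: a left byte, plus a right byte
           whenever the left byte differs from the code itself */
        for (i = 0; i <= count && c < 256; i++, c++) {
            unsigned char left;
            if (pos >= n) {
                return -1;
            }
            left = buf[pos++];
            if (left != c) {
                if (pos >= n) {
                    return -1;
                }
                pos++;  /* skip the right byte of the pair */
            }
        }
        if (c >= 256) {
            break;
        }
        if (pos >= n) {
            return -1;
        }
        count = buf[pos++];
    }
    return (long)pos;
}
```

    For the 14-byte out.txt dump above this returns 4, so the size bytes are 00 08 and exactly 8 data bytes (74 65 73 74 0d 0a 0d 0a) follow.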
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2014
    Posts
    6
    Rep Power
    0
    I don't want to compress txt files. I want to compress these types of files:
    the original bpe
    original
    the uncompressed file:-
    http://www.mediafire.com/download/81..._file.unpacked
    the new one I have created:-
    http://www.mediafire.com/download/1e...my_new_one.bpe

    (look man all I need Is to make a bpe compressor can compress these files again & if you want to see my problem just open the new one & the original in hex editor & you will see the hole files are the same except in some points always after 00 00)
    I don't know how to make the source code compress my files again so that they are the same as the original.
    I can also provide the decompress source code I use; it decompresses them perfectly.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    186
    Rep Power
    82
    if you want to see my problem just open the new one & the original in hex editor & you will see the hole files are the same except in some points always after 00 00)
    I just don't see your problem. I've listed the hex dumps below from both of the files you had uploaded. The "hole" files do not appear to be the same per the hex dump. Nor are the file sizes the same.

    Maybe I'm extremely dense but I can't see the problem that you're describing.


    Your unpacked file that you uploaded....

    Code:
    10 00 00 00 03 00 00 00 38 7e 00 00 00 00 00 00
    00 00 0a 00 00 00 00 00 38 7e 00 01 00 00 00 00
    0a 00 15 00 00 00 00 00 ff ff 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 65 00 51 00 6f 00 51 00
    6a 00 5a 00 c9 00 67 00 68 00 6a 00 31 23 6c 00
    23 03 7a 00 66 00 86 00 55 03 ab 00 30 23 29 01
    65 00 23 00 6a 00 3b 00 66 00 46 00 b9 0b 63 00
    65 00 64 00 6f 00 64 00 67 00 74 00 65 00 85 00
    66 00 96 00 65 00 ae 00 66 00 c4 00 20 03 e6 00
    65 00 f4 00 6f 00 f4 00 66 00 05 01 65 00 19 01
    66 00 2f 01 65 00 42 01 66 00 57 01 65 00 6c 01
    66 00 82 01
    Your packed file that you uploaded...

    Code:
    fe 7f f0 01 fb 0c fd fd 00 f7 f4 00 f6 6f 00 6a
    00 01 fc 00 fc 00 fb 66 00 65 00 fe fe 00 00 80
    00 6a 10 fe 00 03 fe 00 38 7e f2 0a fd 00 38 7e
    00 01 fd 0a 00 15 fd 00 ff ff f2 fd fe fc 51 f5
    51 f3 5a 00 c9 00 67 00 68 f3 31 23 6c 00 23 03
    7a fa 86 00 55 03 ab 00 30 23 29 f8 23 f3 3b fa
    46 00 b9 0b 63 f9 64 f5 64 00 67 00 74 f9 85 fa
    96 f9 ae fa c4 00 20 03 e6 f9 f4 f5 f4 fa 05 f8
    19 f1 2f f8 42 f1 57 f8 6c f1 82 01
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2014
    Posts
    6
    Rep Power
    0
    The unpacked file is the original BPE after decompression. I tried to compress it again using the bpe.c source code, but the result isn't the same as the original.
    I was talking about comparing the new file produced by the compression attempt with the original compressed file, not the unpacked file with the original.
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    186
    Rep Power
    82
    The unpacked file is the original BPE after decompression. I tried to compress it again using the bpe.c source code, but the result isn't the same as the original.
    We can only work with the files that you have uploaded. I have packed (compressed) and unpacked (expanded) your original uncompressed file that you uploaded. I could not find any deviations whatsoever in either the unpacked or packed files. I have listed the hex values of the unpacked files and packed files below. The unpacked files are consistently the same and the packed files are consistently the same.

    I could continue this packing/unpacking process repeatedly and I still would not find any deviations. In other words, Unpack3 is the same hex format as the original unpacked file.


    Original unpack (downloaded file)

    10 00 00 00 03 00 00 00 38 7e 00 00 00 00 00 00
    00 00 0a 00 00 00 00 00 38 7e 00 01 00 00 00 00
    0a 00 15 00 00 00 00 00 ff ff 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 65 00 51 00 6f 00 51 00
    6a 00 5a 00 c9 00 67 00 68 00 6a 00 31 23 6c 00
    23 03 7a 00 66 00 86 00 55 03 ab 00 30 23 29 01
    65 00 23 00 6a 00 3b 00 66 00 46 00 b9 0b 63 00
    65 00 64 00 6f 00 64 00 67 00 74 00 65 00 85 00
    66 00 96 00 65 00 ae 00 66 00 c4 00 20 03 e6 00
    65 00 f4 00 6f 00 f4 00 66 00 05 01 65 00 19 01
    66 00 2f 01 65 00 42 01 66 00 57 01 65 00 6c 01
    66 00 82 01

    Pack1

    fe 7f f0 01 fb 0c fd fd 00 f7 f4 00 f6 6f 00 6a
    00 01 fc 00 fc 00 fb 66 00 65 00 fe fe 00 00 80
    00 6a 10 fe 00 03 fe 00 38 7e f2 0a fd 00 38 7e
    00 01 fd 0a 00 15 fd 00 ff ff f2 fd fe fc 51 f5
    51 f3 5a 00 c9 00 67 00 68 f3 31 23 6c 00 23 03
    7a fa 86 00 55 03 ab 00 30 23 29 f8 23 f3 3b fa
    46 00 b9 0b 63 f9 64 f5 64 00 67 00 74 f9 85 fa
    96 f9 ae fa c4 00 20 03 e6 f9 f4 f5 f4 fa 05 f8
    19 f1 2f f8 42 f1 57 f8 6c f1 82 01

    Unpack2

    10 00 00 00 03 00 00 00 38 7e 00 00 00 00 00 00
    00 00 0a 00 00 00 00 00 38 7e 00 01 00 00 00 00
    0a 00 15 00 00 00 00 00 ff ff 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 65 00 51 00 6f 00 51 00
    6a 00 5a 00 c9 00 67 00 68 00 6a 00 31 23 6c 00
    23 03 7a 00 66 00 86 00 55 03 ab 00 30 23 29 01
    65 00 23 00 6a 00 3b 00 66 00 46 00 b9 0b 63 00
    65 00 64 00 6f 00 64 00 67 00 74 00 65 00 85 00
    66 00 96 00 65 00 ae 00 66 00 c4 00 20 03 e6 00
    65 00 f4 00 6f 00 f4 00 66 00 05 01 65 00 19 01
    66 00 2f 01 65 00 42 01 66 00 57 01 65 00 6c 01
    66 00 82 01

    Pack2

    fe 7f f0 01 fb 0c fd fd 00 f7 f4 00 f6 6f 00 6a
    00 01 fc 00 fc 00 fb 66 00 65 00 fe fe 00 00 80
    00 6a 10 fe 00 03 fe 00 38 7e f2 0a fd 00 38 7e
    00 01 fd 0a 00 15 fd 00 ff ff f2 fd fe fc 51 f5
    51 f3 5a 00 c9 00 67 00 68 f3 31 23 6c 00 23 03
    7a fa 86 00 55 03 ab 00 30 23 29 f8 23 f3 3b fa
    46 00 b9 0b 63 f9 64 f5 64 00 67 00 74 f9 85 fa
    96 f9 ae fa c4 00 20 03 e6 f9 f4 f5 f4 fa 05 f8
    19 f1 2f f8 42 f1 57 f8 6c f1 82 01

    Unpack3

    10 00 00 00 03 00 00 00 38 7e 00 00 00 00 00 00
    00 00 0a 00 00 00 00 00 38 7e 00 01 00 00 00 00
    0a 00 15 00 00 00 00 00 ff ff 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 65 00 51 00 6f 00 51 00
    6a 00 5a 00 c9 00 67 00 68 00 6a 00 31 23 6c 00
    23 03 7a 00 66 00 86 00 55 03 ab 00 30 23 29 01
    65 00 23 00 6a 00 3b 00 66 00 46 00 b9 0b 63 00
    65 00 64 00 6f 00 64 00 67 00 74 00 65 00 85 00
    66 00 96 00 65 00 ae 00 66 00 c4 00 20 03 e6 00
    65 00 f4 00 6f 00 f4 00 66 00 05 01 65 00 19 01
    66 00 2f 01 65 00 42 01 66 00 57 01 65 00 6c 01
    66 00 82 01




  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2014
    Posts
    6
    Rep Power
    0
    ok now I am confused but here is what I mean

    the original bpe in hex
    FE 7F F0 01 FB 0C FD FD 00 F7 F4 00 F6 6F 00 6A
    00 01 FC 00 FC 00 FB 66 00 65 00 FE FE 00 00 80
    6A 00 10 FE 00 03 FE 00 38 7E F2 0A FD 00 38 7E
    00 01 FD 0A 00 15 FD 00 FF FF F2 FD FE FC 51 F5
    51 F3 5A 00 C9 00 67 00 68 F3 31 23 6C 00 23 03
    7A FA 86 00 55 03 AB 00 30 23 29 F8 23 F3 3B FA
    46 00 B9 0B 63 F9 64 F5 64 00 67 00 74 F9 85 FA
    96 F9 AE FA C4 00 20 03 E6 F9 F4 F5 F4 FA 05 F8
    19 F1 2F F8 42 F1 57 F8 6C F1 82 01

    the new one which I made in hex
    FE 7F F0 01 FB 0C FD FD 00 F7 F4 00 F6 6F 00 6A
    00 01 FC 00 FC 00 FB 66 00 65 00 FE FE 00 00 80
    00 6A 10 FE 00 03 FE 00 38 7E F2 0A FD 00 38 7E
    00 01 FD 0A 00 15 FD 00 FF FF F2 FD FE FC 51 F5
    51 F3 5A 00 C9 00 67 00 68 F3 31 23 6C 00 23 03
    7A FA 86 00 55 03 AB 00 30 23 29 F8 23 F3 3B FA
    46 00 B9 0B 63 F9 64 F5 64 00 67 00 74 F9 85 FA
    96 F9 AE FA C4 00 20 03 E6 F9 F4 F5 F4 FA 05 F8
    19 F1 2F F8 42 F1 57 F8 6C F1 82 01

    You see, the two bytes after 00 00 80 are 6A 00 in the original BPE,
    but 00 6A in the new one. That is my problem.
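
    Walking the pair table at the start of both dumps the way expand.c does, the table appears to end at the 80 byte, so the two bytes that differ would be the block size that filewrite emits with putc(size/256) followed by putc(size%256). The data after them is 106 bytes long (0x6A), which matches 00 6A read high byte first and 6A 00 read low byte first. So the original files seem to store the size low byte first. A minimal sketch of writing the size either way; encode_block_size, write_block_size and the enum are hypothetical names, not part of bpe.c:

```c
#include <stdio.h>

/* Hypothetical names for this sketch, not taken from Gage's bpe.c. */
enum size_order { SIZE_HIGH_BYTE_FIRST, SIZE_LOW_BYTE_FIRST };

/* Split a 16-bit block size into two bytes in the requested order. */
void encode_block_size(int size, enum size_order order, unsigned char out[2])
{
    unsigned char hi = (unsigned char)((size >> 8) & 0xFF);
    unsigned char lo = (unsigned char)(size & 0xFF);

    if (order == SIZE_HIGH_BYTE_FIRST) {
        out[0] = hi;    /* what bpe.c writes: size/256 then size%256 */
        out[1] = lo;
    } else {
        out[0] = lo;    /* what the original files appear to contain */
        out[1] = hi;
    }
}

/* Write the size field to an open binary stream.
   Returns 0 on success and -1 if the write failed. */
int write_block_size(FILE *output, int size, enum size_order order)
{
    unsigned char bytes[2];

    encode_block_size(size, order, bytes);
    return fwrite(bytes, 1, 2, output) == 2 ? 0 : -1;
}
```

    If that reading is right, the corresponding change in bpe.c would be to swap the two putc calls for the size in filewrite so that size%256 is written first, and a matching decompressor would read the low byte first as well.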
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2014
    Posts
    6
    Rep Power
    0
    the image I don't know why it didn't show up but
