Any Way To Compress A 256 Bytes "md5-like" String Into 160 Bytes Or Less?
Solution 1:
If you convert to binary, you go from 256 hex digits to 128 bytes. Then use (or modify) one of the techniques mentioned in this thread to convert to an acceptable character set for SMS. (That thread deals with targeting JSON, but the same ideas can be applied to SMS.)
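As a minimal sketch of the hex-to-binary step (the sample string here is a hypothetical stand-in built from eight MD5 digests, not the asker's data), Python's built-in `bytes.fromhex` does the conversion:

```python
import hashlib

# Hypothetical sample input: eight MD5 digests joined into 256 hex digits.
hex_text = "".join(hashlib.md5(bytes([i])).hexdigest() for i in range(8))

# Every pair of hex digits becomes one byte.
raw = bytes.fromhex(hex_text)

print(len(hex_text))  # 256
print(len(raw))       # 128

# The conversion is lossless: converting back yields the original string.
assert raw.hex() == hex_text
```

The 128 raw bytes still have to be re-encoded into characters that the transport accepts, which is what the other answers discuss.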
Solution 2:
You could use Ascii85 (the variant used by PostScript), which also shortens any all-zero 4-byte group to a single 'z'. Here is the transformation in a Python shell:
>>> a = b'633a88d35a0f8fd172bd21158a03a8bb17ddc0acc6edb8ae19a9dbd1aa855b75319e540910fb70cf7bb51d608219dd4b387623f94262705a9c2c19332240e2a6d696d4cb896abf0101afae1aeebf3d6299675e0e67904e7a544de9e3e65fb9def9b0b047fb57a0b742226d602d386d9e2fe176a88837eddd0c77d6911d386c2e'
>>> ascii85_encoded = base85_encode(hex_decode(a))
>>> repr(ascii85_encoded)
b'@lfFp=q?\\AEkNV2M?Bfh(Yum.`pL:=)6)B<WeFZ"0qM>N&GpFmHaOl%Jf3B;3-HPB6=On;S1GO6,!b.bes=h/M/\'d+!O&XEm_:noR:fh9B95l7<))W;k$P[Uq67(nqcBH"66^8S/N@U=0B%)QLc=_W%!U9b*B7jf'
>>> len(ascii85_encoded)
160
The code above is Python and is based on the base85 codec in python-mom:
https://code.google.com/p/python-mom/source/browse/mom/codec/base85.py
You may want to port it to Java for your needs.
HTH.
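If installing a third-party module is not an option, the same transformation can probably be reproduced with the standard library alone. This is a sketch that assumes Python 3.4 or later, where `base64.a85encode` is available; it swaps in the stdlib codec for the python-mom functions used above:

```python
import base64

# The 256-digit hex string from the answer above, split for readability.
hex_text = (
    "633a88d35a0f8fd172bd21158a03a8bb"
    "17ddc0acc6edb8ae19a9dbd1aa855b75"
    "319e540910fb70cf7bb51d608219dd4b"
    "387623f94262705a9c2c19332240e2a6"
    "d696d4cb896abf0101afae1aeebf3d62"
    "99675e0e67904e7a544de9e3e65fb9de"
    "f9b0b047fb57a0b742226d602d386d9e"
    "2fe176a88837eddd0c77d6911d386c2e"
)

raw = bytes.fromhex(hex_text)      # 128 bytes
encoded = base64.a85encode(raw)    # 4 bytes -> 5 characters

print(len(encoded))  # 160

# Decoding restores the original bytes.
assert base64.a85decode(encoded) == raw
```

Passing `adobe=True` would wrap the output in `<~` and `~>` delimiters, which costs four extra characters and pushes the result past 160, so the sketch leaves it off.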
Solution 3:
You can't quite do it. The reason is that MD5-like data maximizes entropy, so gzip and friends will have a hard time getting close to 50% efficiency, and even if they did, it would be hit or miss.
The optimal 2:1 compression is: treat every 2 chars as one byte written in hex, and convert them into a single binary byte. That cuts the size in half, to 128 bytes. However, the binary data can't be sent as-is, so you have to base64 encode it, which adds 33%. That leaves you at ~170 chars (172 with padding). "Base-128" encoding won't help, since there aren't 128 chars that are certain to transmit.
In short, you need to cut the data down. After all, the easiest way to send less data is to have less data :)
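The numbers above can be checked with a rough sketch. The input is a hypothetical stand-in (eight MD5 digests joined together), and `zlib` is used here in place of gzip since both rely on DEFLATE; exact compressed sizes will vary with the data:

```python
import base64
import hashlib
import zlib

# Hypothetical sample input: eight MD5 digests joined into 256 hex digits.
hex_text = "".join(hashlib.md5(bytes([i])).hexdigest() for i in range(8))

# General-purpose compression of the hex text, at the highest level.
compressed = zlib.compress(hex_text.encode("ascii"), 9)

# The plain hex-to-binary conversion, then base64 for transport.
raw = bytes.fromhex(hex_text)
b64 = base64.b64encode(raw)

print("zlib output bytes:", len(compressed))
print("raw bytes:        ", len(raw))   # 128
print("base64 characters:", len(b64))   # 172
```

The compressed output is itself binary, so it would still need a text encoding on top before it could be sent.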
Solution 4:
It really depends on the exact type of data you are trying to send.
If there are predictable patterns in your data you can probably use http://en.wikipedia.org/wiki/Huffman_coding with a pre-defined alphabet of symbols to bring your size down.
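One way to judge whether a pre-defined Huffman table could help is to estimate the entropy of the symbols first. The sketch below uses a hypothetical hash-like sample (eight MD5 digests); for real data with skewed symbol frequencies the figure would come out lower, and that is the case where Huffman coding pays off:

```python
import hashlib
import math
from collections import Counter

# Hypothetical sample input: eight MD5 digests joined into 256 hex digits.
hex_text = "".join(hashlib.md5(bytes([i])).hexdigest() for i in range(8))

counts = Counter(hex_text)
total = len(hex_text)

# Empirical Shannon entropy in bits per hex digit (at most log2(16) = 4).
entropy_bits = -sum(
    (c / total) * math.log2(c / total) for c in counts.values()
)

# Lower bound on the size of any per-symbol code such as Huffman.
min_bytes = math.ceil(entropy_bits * total / 8)

print(f"{entropy_bits:.3f} bits per hex digit")
print(f"at least {min_bytes} bytes for {total} digits")
```

For hash-like input the estimate sits close to 4 bits per digit, so a Huffman code would do little better than the plain 128-byte binary form.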
Solution 5:
That string is hex-encoded. Therefore it's using 200% of the space of the binary message.
If you used base64 encoding instead, it would use about 133%, which is 171 characters (172 with padding). Still a bit too much.
Base85, which was invented by a relative of mine, could do it. It would use exactly 160 characters.
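The percentages above follow from how many input bytes each encoding packs into how many output characters. A small sketch of that arithmetic, where `encoded_length` is a hypothetical helper that counts whole (padded) groups:

```python
import math


def encoded_length(n_bytes: int, bytes_in: int, chars_out: int) -> int:
    """Characters needed when every `bytes_in` bytes become `chars_out` chars."""
    return math.ceil(n_bytes / bytes_in) * chars_out


n = 128  # size of the binary message in bytes

sizes = {
    "hex": encoded_length(n, 1, 2),     # 1 byte  -> 2 chars
    "base64": encoded_length(n, 3, 4),  # 3 bytes -> 4 chars
    "base85": encoded_length(n, 4, 5),  # 4 bytes -> 5 chars
}

for name, chars in sizes.items():
    print(f"{name:7s} {chars:4d} chars  {100 * chars / n:.0f}% of binary size")
```

Because 128 is a multiple of 4, base85 needs no padding here and lands on 160 characters exactly.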