Is there a better compression library for strings than DotNetZip or LZMA?

This answer is related to Guffa's answer. He said that QR code can accept binary data and that it must be a limitation of the library you are using. I looked at the source code of the library.

You call the Encode function, right? These are the contents of the Encode function:

    public virtual Bitmap Encode(String content, Encoding encoding)
    {
        bool[][] matrix = calQrcode(encoding.GetBytes(content));
        SolidBrush brush = new SolidBrush(qrCodeBackgroundColor);
        Bitmap image = new Bitmap((matrix.Length * qrCodeScale) + 1, (matrix.Length * qrCodeScale) + 1);
        Graphics g = Graphics.FromImage(image);
        g.FillRectangle(brush, new Rectangle(0, 0, image.Width, image.Height));
        brush.Color = qrCodeForegroundColor;
        for (int i = 0; i < matrix.Length; i++)
        {
            for (int j = 0; j < matrix.Length; j++)
            {
                if (matrix[j][i])
                {
                    g.FillRectangle(brush, j * qrCodeScale, i * qrCodeScale, qrCodeScale, qrCodeScale);
                }
            }
        }
        return image;
    }

The first line (encoding.GetBytes(content)) converts the string to bytes. Get the source code, then modify it to have this function: "public virtual Bitmap Encode(byte[] content)".

The compression works by removing redundancy in the data, but the string seems to contain random/encrypted data, so there is no redundancy to remove. However, it's data encoded using base-64, so each character only carries six bits of information. If you keep the binary data instead of base-64 encoding it, it's only 631 bytes.
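As a rough illustration of keeping the data binary until the last step, the sketch below compresses raw bytes first and base-64 encodes them only at the end, for a library that insists on a string. It uses the framework's own System.IO.Compression.DeflateStream rather than DotNetZip or LZMA, and the class and method names (BinaryFirst, Pack, Unpack, Base64Length) are made up for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration; not part of any QR code library.
static class BinaryFirst
{
    // Characters produced by Convert.ToBase64String for byteCount input bytes:
    // every 3 bytes become 4 characters, and the last group is padded.
    public static int Base64Length(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    // binary -> compress -> base64 -> string
    public static string Pack(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            } // disposing the DeflateStream writes the final block
            return Convert.ToBase64String(output.ToArray());
        }
    }

    // string -> base64 decode -> decompress -> binary
    public static byte[] Unpack(string text)
    {
        using (var input = new MemoryStream(Convert.FromBase64String(text)))
        using (var inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflate.CopyTo(output);
            return output.ToArray();
        }
    }

    static void Main()
    {
        Console.WriteLine(Base64Length(631)); // 844
    }
}
```

For the 631 bytes mentioned above, Base64Length(631) gives 844 characters, so base-64 alone adds about a third. Pack only helps when the input has redundancy; on encrypted bytes the Deflate step is expected to gain nothing.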

The QR code library I'm using (MessagingToolkit.QRCode) encodes the data as a string. Wouldn't I have to convert the byte array to a string to encode it as the QR code? (Sorry for my ineptitude.) Is there some mechanism that keeps it as binary data while still making it a string? I've been using Convert.ToBase64String on the byte array to create a string that can be encoded. How would I do it the way you suggest? Thanks!

– Sam Aug 20 at 6:36

The QR code supports binary, so that would be a limitation in the library that you use. Anyway, you would be better off doing binary -> compress -> base64 -> string instead of binary -> base64 -> string -> binary -> compress -> binary -> base64 -> string. – Guffa Aug 20 at 6:58

If you don't go with binary, base64 is suboptimal by a fair amount. Base64 includes lowercase letters, and those aren't included in the QR character set for alphanumeric, so the encoder is still going to use 8 bits per base64 character and thus you're throwing away two bits. – smparkes Aug 20 at 15:37

Sorry ... meant to edit that (I hate SO doing a submit when I hit return). Just wanted to add that alphanumeric in QR is A-Z, 0-9, space, and $%*+-./: for 45 symbols. But I don't have any references to encoders that can encode to a given symbol set size. I'm sure they exist, but may be obscure. You could encode in base32, but then you're throwing away symbols on the other side and it may be a wash. – smparkes Aug 20 at 15:50
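One published scheme that targets exactly this 45-symbol set is Base45 (RFC 9285), which maps every two bytes to three alphanumeric characters. The sketch below follows that mapping for encoding only; the class name is made up, and whether a given QR library then actually selects alphanumeric mode for the result is an assumption you would have to verify.

```csharp
using System;
using System.Text;

// Sketch of a Base45-style encoder over the QR alphanumeric character set.
static class Base45
{
    // 10 digits + 26 uppercase letters + 9 punctuation symbols = 45 symbols.
    const string Alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

    public static string Encode(byte[] data)
    {
        var sb = new StringBuilder((data.Length / 2) * 3 + (data.Length % 2) * 2);
        int i = 0;

        // Each pair of bytes is a number below 65536, written as three
        // base-45 digits, least significant digit first.
        for (; i + 1 < data.Length; i += 2)
        {
            int n = data[i] * 256 + data[i + 1];
            sb.Append(Alphabet[n % 45]);
            sb.Append(Alphabet[(n / 45) % 45]);
            sb.Append(Alphabet[n / (45 * 45)]);
        }

        // A single trailing byte is written as two base-45 digits.
        if (i < data.Length)
        {
            int n = data[i];
            sb.Append(Alphabet[n % 45]);
            sb.Append(Alphabet[n / 45]);
        }

        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Encode(Encoding.ASCII.GetBytes("AB"))); // BB8
    }
}
```

QR alphanumeric mode packs two characters into 11 bits, so three characters cost about 16.5 bits for 16 bits of data. Base64 stored in byte mode costs 32 bits for every 24 bits of data.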

You are comparing different compressors. The Zip family (Deflate) combines LZ77 dictionary matching with Huffman coding, a statistical method, while LZMA, from the LZ family (LZ being short for Lempel-Ziv), pairs dictionary matching with range coding. Either way, compression works by removing superfluous information. It works well on text files and images, not so well on audio, video and program files.

For the latter there is lossy compression, but not for program files. Given your example string, it contains too much entropy to be compressed well. You can calculate the information content of a character, in bits, with -log(p)/log(2), where p is the probability of that character occurring in your text; averaging this over all characters gives the entropy.

See also information theory and Shannon's source coding theorem.
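A small sketch of that entropy estimate, assuming each character's probability is simply its relative frequency in the string (an order-0 estimate, so it ignores patterns between characters). The class and method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;

static class Entropy
{
    // Average information per character in bits: the sum over all distinct
    // characters of -p * log2(p), where p is the character's relative frequency.
    public static double BitsPerCharacter(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0.0;
        }

        // Count how often each character occurs.
        var counts = new Dictionary<char, int>();
        foreach (char c in text)
        {
            int seen;
            counts.TryGetValue(c, out seen);
            counts[c] = seen + 1;
        }

        double entropy = 0.0;
        foreach (int count in counts.Values)
        {
            double p = (double)count / text.Length;
            entropy -= p * Math.Log(p, 2);
        }
        return entropy;
    }

    static void Main()
    {
        Console.WriteLine(BitsPerCharacter("aaaa"));
        Console.WriteLine(BitsPerCharacter("abab"));
    }
}
```

A string of one repeated character scores 0 bits per character and "abab" scores 1 bit. Base-64 text cannot exceed 6 bits per character, so a result close to 6 suggests there is little left for a compressor to remove.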

As can be guessed from one of the asker's previous questions, the data is in an encrypted form, so the data is expected to contain a high entropy, and a lossy compression algorithm would be harmful to the encrypted data. – Peter O. Aug 20 at 8:42

No upvote? Did you understand what I wrote? I don't think I've spoken to you, because I didn't suggest a lossy compression; I wrote BUT NOT FOR PROGRAM FILES. Should I clarify this? – David Aug 20 at 8:52

Sorry, I misunderstood "program files" to mean programs, that is, files that contain machine code. The encrypted data given by the asker is neither text, images, audio, video, nor "program files" as I understood it. – Peter O. Aug 20 at 8:54

Well, maybe my answer is useless, but it doesn't contain wrong or harmful information. An upvote would be nice. A downvote and I will delete my answer. – David Aug 20 at 8:57

I have voted you up. – Peter O. Aug 20 at 9:00

At first glance, it appears that you are trying to take some data and convert it into a QR code with this process: --> encrypt --> base64 encode --> compress --> make QR code. I suggest using this process instead: --> compress --> encrypt --> make QR code.

When you want to both encrypt and compress, pretty much everyone recommends compress-then-encrypt. (Because encryption works just as well with compressed data as with uncompressed data. But compression usually makes plaintext shorter and encrypted files longer. For more details, see: "Can I compress an encrypted file?"; "Compress and then encrypt, or vice-versa?"; "Composing Compression and Encryption"; "Compress, then encrypt tapes"; "Is it better to encrypt a message and then compress it or the other way around? Which provides more security?"; "Compressing and Encrypting files on Windows"; "Encryption and Compression"; "Do encrypted compression containers like zip and 7z compress or encrypt first?"; "When compressing and encrypting, should I compress first, or encrypt first?"; etc.)

"am I able to use alphanumeric encoding for the QR code, or do I have to use binary?"

Most encryption algorithms produce binary output, so it will be simplest to directly convert that to binary-encoded QR code. I suppose you could somehow convert the encrypted data to something that QR alphanumeric coding could handle, but why?

"Is there some better compression algorithm"

For encrypted data, no. It is (almost certainly) impossible to compress well-encrypted data, no matter what algorithm you use.
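To see the effect of the ordering, here is a minimal sketch that runs the same repetitive sample text through both orders and prints the sizes. It assumes Deflate from System.IO.Compression and AES with the framework defaults and a throwaway random key. The class name, helpers and sample text are made up for this example, and the unauthenticated encryption here is for size comparison only, not a recommendation.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Illustration only: compares output sizes of the two orderings.
static class OrderDemo
{
    public static byte[] Deflate(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // AES with the framework's default mode and padding.
    public static byte[] Encrypt(byte[] data, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        {
            aes.Key = key;
            aes.IV = iv;
            using (var encryptor = aes.CreateEncryptor())
            {
                return encryptor.TransformFinalBlock(data, 0, data.Length);
            }
        }
    }

    public static byte[] RandomBytes(int count)
    {
        var bytes = new byte[count];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        return bytes;
    }

    static void Main()
    {
        // Hypothetical, highly redundant input.
        byte[] plain = Encoding.ASCII.GetBytes(
            string.Concat(Enumerable.Repeat("some repetitive sample text; ", 40)));
        byte[] key = RandomBytes(32);
        byte[] iv = RandomBytes(16);

        Console.WriteLine("plain:                 " + plain.Length);
        Console.WriteLine("compress then encrypt: " + Encrypt(Deflate(plain), key, iv).Length);
        Console.WriteLine("encrypt then compress: " + Deflate(Encrypt(plain, key, iv)).Length);
    }
}
```

With input like this, the compress-then-encrypt result should come out far smaller than the plaintext, while encrypt-then-compress should end up no smaller than the ciphertext, because the compressor finds no redundancy in it.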

If you compress-then-encrypt, as recommended, then the effectiveness of various compression algorithms depends on the particular kinds of input data, not on what you do with it after compression. What kind of data is your input data? If, hypothetically, your input data is some sort of ASCII text, perhaps you could use one of the compression algorithms mentioned at "Really simple short string compression"; "Best compression algorithm for short text strings"; "Compression of ASCII strings in C"; "Twitter text compression challenge".

If, on the other hand, your input data is some sort of photograph, perhaps you could use one of the many compression algorithms mentioned at "Twitter image encoding challenge".
