Is there a way to compress a JavaScript array of 250+ 1s and 0s into something a little more manageable (say a shorter string) and then manageably decompress the same? Sort of like the way Google did its image encodings…
Thanks!
I can give you roughly 5:1 compression by encoding as base 32, since each base-32 character carries 5 bits. I chose to include a simple length value so that variable-length input is supported. Please see this fiddle demonstrating the technique with two functions that allow you to round-trip the value. (Or you can see an earlier, more naive hexadecimal version I created before @slebetman reminded me of the native number base conversion that exists in JavaScript.)
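As a rough illustration of the idea (not the code from the fiddle), here is a minimal sketch of a base-32 round trip. The function names `encodeBits32` and `decodeBits32` and the `length|payload` layout are my assumptions for this example; it relies only on the built-in `parseInt` and `Number.prototype.toString` radix conversions.

```javascript
// Sketch only: names and the "length|payload" layout are assumptions.
function encodeBits32(bits) {
  const s = bits.join('');
  // Pad with zeros on the right up to a multiple of 5 bits.
  const padded = s.padEnd(Math.ceil(s.length / 5) * 5, '0');
  let out = '';
  for (let i = 0; i < padded.length; i += 5) {
    // Each 5-bit group becomes one base-32 digit (0-9, a-v).
    out += parseInt(padded.slice(i, i + 5), 2).toString(32);
  }
  // Store the true bit count so the padding can be removed later.
  return s.length + '|' + out;
}

function decodeBits32(encoded) {
  const sep = encoded.indexOf('|');
  const length = Number(encoded.slice(0, sep));
  let s = '';
  for (const ch of encoded.slice(sep + 1)) {
    // Each base-32 digit expands back to exactly 5 bits.
    s += parseInt(ch, 32).toString(2).padStart(5, '0');
  }
  // Drop the padding bits and convert characters back to numbers.
  return Array.from(s.slice(0, length), Number);
}
```

With this sketch, `encodeBits32([1, 0, 1, 1, 0, 0, 1])` gives `"7|m8"`, and a 250-element array produces a 50-character payload after the `250|` prefix.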
Here’s sample output for one set of 250 1s and 0s. The number of characters does not count the leading “250|”:
You can use a base 64 encoding to get it down to 42 characters, but be aware that with both the base 32 and base 64 versions you can end up with words in your final result that may be objectionable (please see the fiddle above for an example). The hex version can also have objectionable content, but much less so (a bad face bade a dad be a cad?)
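To show where the 42 comes from: base 64 carries 6 bits per character, and 250 / 6 rounds up to 42. Here is a minimal sketch under my own assumptions (the URL-safe alphabet, the names `ALPHABET64`, `encodeBits64`, `decodeBits64`, and the same `length|payload` layout); JavaScript has no radix-64 `toString`, so the lookup is done by hand.

```javascript
// Sketch only: the alphabet and function names are assumptions.
const ALPHABET64 =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';

function encodeBits64(bits) {
  const s = bits.join('');
  // Pad with zeros on the right up to a multiple of 6 bits.
  const padded = s.padEnd(Math.ceil(s.length / 6) * 6, '0');
  let out = '';
  for (let i = 0; i < padded.length; i += 6) {
    // Each 6-bit group indexes one of the 64 alphabet characters.
    out += ALPHABET64[parseInt(padded.slice(i, i + 6), 2)];
  }
  return s.length + '|' + out;
}

function decodeBits64(encoded) {
  const sep = encoded.indexOf('|');
  const length = Number(encoded.slice(0, sep));
  let s = '';
  for (const ch of encoded.slice(sep + 1)) {
    s += ALPHABET64.indexOf(ch).toString(2).padStart(6, '0');
  }
  // Drop the padding bits and convert characters back to numbers.
  return Array.from(s.slice(0, length), Number);
}
```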
Please let me know if you need to save 8 more characters and I will work up additional script for you. Avoiding vowels could be one way to deal with the objectionable word problem. Let me know if you need to do this as well.
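One possible way to do the vowel avoidance, sketched under my own assumptions: keep the base-32 encoding, but translate its digits into a 32-character alphabet that contains no vowels. The alphabet `NO_VOWELS32` (digits, the 21 consonants, and `_` to reach 32 symbols) and the helper names are hypothetical, and the translation should be applied to the payload only, not to the length prefix.

```javascript
// Sketch only: both alphabets below are assumptions for illustration.
const STANDARD32 = '0123456789abcdefghijklmnopqrstuv'; // digits used by toString(32)
const NO_VOWELS32 = '0123456789bcdfghjklmnpqrstvwxyz_'; // same size, no a/e/i/o/u

// Map every character from one alphabet to the same position in the other.
function translate(text, from, to) {
  return Array.from(text, ch => to[from.indexOf(ch)]).join('');
}

const hideVowels = payload => translate(payload, STANDARD32, NO_VOWELS32);
const restoreVowels = payload => translate(payload, NO_VOWELS32, STANDARD32);
```

For example, `hideVowels('bad')` returns `'cbf'`, and `restoreVowels('cbf')` returns `'bad'`. Digits can still read as letters to some eyes, so this reduces the problem rather than eliminating it.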
If your bit strings will always be 250 characters, then the functions can be simplified a bit, but I didn’t want to make this assumption.
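To give a feel for that simplification, here is a sketch that assumes exactly 250 bits. Since 250 is a multiple of 5, no padding and no length prefix are needed. The names `FIXED_LENGTH`, `encodeFixed250`, and `decodeFixed250` are made up for this example.

```javascript
// Sketch only: assumes the input always holds exactly 250 bits.
const FIXED_LENGTH = 250;

function encodeFixed250(bits) {
  if (bits.length !== FIXED_LENGTH) {
    throw new RangeError('expected exactly 250 bits');
  }
  let out = '';
  for (let i = 0; i < FIXED_LENGTH; i += 5) {
    // 250 is divisible by 5, so every group is a full 5 bits.
    out += parseInt(bits.slice(i, i + 5).join(''), 2).toString(32);
  }
  return out; // always 50 characters, no prefix
}

function decodeFixed250(text) {
  let s = '';
  for (const ch of text) {
    s += parseInt(ch, 32).toString(2).padStart(5, '0');
  }
  return Array.from(s, Number);
}
```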
For reference here’s the bits-to-base-32 function.
This function pads the input to a multiple of 5 bits and, depending on the length you provide, may generate a spurious extra character at the end. I also included a second version of each conversion function that pads to a multiple of 10 bits, which may generate up to two spurious extra characters. If speed is important, these may (or may not) be faster, since they take larger chunks from the input.
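For comparison, a 10-bit variant might look like the sketch below. This is my own illustration, not the fiddle's code: the names `encodeBits32By10` and `decodeBits32By10` are hypothetical, and each 10-bit chunk is written as exactly two base-32 characters. Whether it is actually faster is something you would have to measure.

```javascript
// Sketch only: names and the "length|payload" layout are assumptions.
function encodeBits32By10(bits) {
  const s = bits.join('');
  // Pad with zeros on the right up to a multiple of 10 bits.
  const padded = s.padEnd(Math.ceil(s.length / 10) * 10, '0');
  let out = '';
  for (let i = 0; i < padded.length; i += 10) {
    // A 10-bit value (0..1023) always fits in two base-32 digits.
    out += parseInt(padded.slice(i, i + 10), 2).toString(32).padStart(2, '0');
  }
  return s.length + '|' + out;
}

function decodeBits32By10(encoded) {
  const sep = encoded.indexOf('|');
  const length = Number(encoded.slice(0, sep));
  const body = encoded.slice(sep + 1);
  let s = '';
  for (let i = 0; i < body.length; i += 2) {
    // Two base-32 digits expand back to exactly 10 bits.
    s += parseInt(body.slice(i, i + 2), 32).toString(2).padStart(10, '0');
  }
  // Drop the padding bits and convert characters back to numbers.
  return Array.from(s.slice(0, length), Number);
}
```

In this sketch a 253-bit input yields a 52-character payload, one more than 5-bit chunking would need, which is the cost of the coarser padding.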