The Archive Base

Editorial Team
Asked: May 10, 2026 (2026-05-10T14:33:12+00:00)

The most common method for corrupting compressed files is to inadvertently do an ASCII-mode

The most common method for corrupting compressed files is to inadvertently do an ASCII-mode FTP transfer, which causes a many-to-one trashing of CR and/or LF characters.

Obviously, there is information loss, and the best way to fix this problem is to transfer again, in FTP binary mode.
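The many-to-one loss described above can be sketched in a few lines of Python. The exact byte mapping depends on the FTP client and server involved; the model below (every CR becomes LF) is an assumption for illustration, and `smash_cr_to_lf` and `candidate_count` are hypothetical helper names.

```python
def smash_cr_to_lf(data: bytes) -> bytes:
    """Assumed damage model: every CR (0x0D) is rewritten as LF (0x0A)."""
    return data.replace(b"\r", b"\n")


# Two different originals...
original_a = bytes([0x1F, 0x0D, 0x42, 0x0A])
original_b = bytes([0x1F, 0x0A, 0x42, 0x0D])

# ...collapse to the same damaged bytes, so the mapping cannot be inverted.
damaged = smash_cr_to_lf(original_a)
assert original_a != original_b
assert damaged == smash_cr_to_lf(original_b)


def candidate_count(damaged: bytes) -> int:
    """Each LF in the damaged data may have been CR or LF originally."""
    return 2 ** damaged.count(b"\n")


print(candidate_count(damaged))  # 4 possible originals for this tiny input
```

Under this model the number of candidate originals doubles with every LF in the damaged file, which is why a blind search is impractical for anything but very small inputs.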

However, if the original is lost, and it’s important, how recoverable is the data?

[Actually, I already know what I think is the best answer (it’s very difficult but sometimes possible; I’ll post more later) and the common non-answers (lots of off-the-shelf programs that repair CRCs without repairing the data), but I thought it would be interesting to try out this question during the Stack Overflow beta period and see whether anyone else has gone down the successful-recovery path or discovered tools I don’t know about.]

1 Answer

  1. Added an answer on May 10, 2026 at 2:33 pm (2026-05-10T14:33:13+00:00)

    From Bukys Software

    Approximately 1 in 256 bytes is known to be corrupted, and the corruption is known to occur only in bytes with the value ‘\012’. So the byte error rate is 1/256 (0.39% of input), and 2/256 bytes (0.78% of input) are suspect. But since only three bits per smashed byte are affected, the bit error rate is only 3/(256*8): 0.15% is bad, 0.29% is suspect.
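The figures quoted above can be checked with a little arithmetic. This sketch assumes that byte values in compressed data are roughly uniform, and that the smashed bytes were originally CR ('\015'), which is one reading consistent with "three bits per smashed byte". Under that reading, every '\012' in the damaged file is suspect, but only about half of them were actually changed.

```python
CR, LF = 0x0D, 0x0A  # '\015' and '\012' in octal

# Number of bits that differ between the two byte values.
flipped_bits = bin(CR ^ LF).count("1")

# Assumption: each of the 256 byte values appears about once per 256 bytes.
byte_error_rate = 1 / 256      # original CR bytes, now wrong
suspect_byte_rate = 2 / 256    # every LF in the damaged file
bit_error_rate = flipped_bits / (256 * 8)
suspect_bit_rate = 2 * flipped_bits / (256 * 8)

for name, rate in [
    ("bytes corrupted", byte_error_rate),
    ("bytes suspect", suspect_byte_rate),
    ("bits corrupted", bit_error_rate),
    ("bits suspect", suspect_bit_rate),
]:
    print(f"{name}: {rate:.2%}")
```

The printed values round to the 0.39%, 0.78%, 0.15%, and 0.29% given in the quoted text.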

    …

    An error in the compressed input disrupts the decompression process for all subsequent bytes…The fact that the decompressed output is recognizably bad so quickly is cause for hope — a search for the correct answer can identify wrong answers quickly.
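As a rough illustration of why quick failure helps, the sketch below brute-forces every CR/LF choice for a small zlib stream and keeps only the candidates that decompress cleanly. It is not the method used in the quoted work: the source does not say which compression format was involved, the CR-to-LF damage model is an assumption, and `smash`, `recover`, and `make_demo` are hypothetical names. Exhaustive search like this only works for a handful of suspect bytes.

```python
import itertools
import random
import zlib


def smash(data: bytes) -> bytes:
    """Assumed damage model: every CR (0x0D) is rewritten as LF (0x0A)."""
    return data.replace(b"\r", b"\n")


def recover(damaged: bytes, max_suspects: int = 16) -> list[bytes]:
    """Try CR or LF at every suspect position; keep payloads that decompress."""
    suspects = [i for i, b in enumerate(damaged) if b == 0x0A]
    if len(suspects) > max_suspects:
        raise ValueError("too many suspect bytes for brute force")
    found = []
    for choice in itertools.product((0x0A, 0x0D), repeat=len(suspects)):
        candidate = bytearray(damaged)
        for pos, value in zip(suspects, choice):
            candidate[pos] = value
        try:
            found.append(zlib.decompress(bytes(candidate)))
        except zlib.error:
            continue  # wrong guess: the stream is rejected
    return found


def make_demo(max_suspects: int = 8):
    """Find a small payload whose compressed form contains at least one CR."""
    for seed in range(1000):
        rng = random.Random(seed)
        payload = bytes(rng.randrange(32, 127) for _ in range(200))
        packed = zlib.compress(payload)
        if b"\r" in packed and packed.count(b"\r") + packed.count(b"\n") <= max_suspects:
            return payload, packed
    raise RuntimeError("no suitable demo stream found")


payload, packed = make_demo()
damaged = smash(packed)
print(payload in recover(damaged))
```

Here zlib's own format checks and checksum stand in for "recognizably bad" output. With real files the suspect count is far too large for exhaustive search, which is where the guided techniques listed in the answer come in.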

    Ultimately, several techniques were combined to successfully extract reasonable data from these files:

    • Domain-specific parsing of fields and quoted strings
    • Machine learning from previous data with low probability of damage
    • Tolerance for file damage due to other causes (e.g. disk full while logging)
    • Lookahead for guiding the search along the highest-probability paths

    These techniques identify 75% of the necessary repairs with certainty, and the remainder are explored highest-probability-first, so that plausible reconstructions are identified immediately.
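Highest-probability-first exploration can be sketched as a best-first search over the uncertain bytes. The code below is a generic illustration, not the quoted author's implementation: it assumes each remaining suspect byte already has an estimated probability of having been CR (in the quoted work such estimates would come from the parsing and learning steps), and the function name and example probabilities are made up.

```python
import heapq
import math


def best_first_assignments(p_cr):
    """Yield (probability, assignment) in descending joint probability.

    p_cr[i] is the estimated probability that suspect i was originally CR;
    assignment[i] is True when suspect i is taken to be CR.
    """
    n = len(p_cr)
    # best_rest[i]: lowest possible cost (-log probability) of suspects i..n-1.
    # It acts as a lookahead bound so promising prefixes are expanded first.
    best_rest = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        best_rest[i] = best_rest[i + 1] - math.log(max(p_cr[i], 1 - p_cr[i]))

    heap = [(best_rest[0], 0.0, ())]
    while heap:
        _bound, cost, prefix = heapq.heappop(heap)
        i = len(prefix)
        if i == n:
            yield math.exp(-cost), prefix
            continue
        for is_cr, p in ((True, p_cr[i]), (False, 1 - p_cr[i])):
            if p > 0:
                c = cost - math.log(p)
                heapq.heappush(heap, (c + best_rest[i + 1], c, prefix + (is_cr,)))


# Hypothetical estimates for three suspect bytes.
ranked = list(best_first_assignments([0.9, 0.2, 0.6]))
for prob, assignment in ranked[:3]:
    print(f"{prob:.3f}", assignment)
```

In a real repair each yielded assignment would be applied to the damaged stream and tested (for example by attempting decompression), stopping at the first plausible reconstruction rather than enumerating everything.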


© 2021 The Archive Base. All Rights Reserved