I have a text file in this format:
abc? cdfde" nhj.cde' dfwe-df$sde.....
How can I ignore all the special characters, blanks, numbers, line endings, etc. and write only the alphabetic characters to another file? For example, the above file becomes
abccdfdenhjcdedfwedfsde.....
And from this output file,
- Should be able to read one character at a time until the end of the file.
- Should be able to read two characters at a time, like ab,bc,cc,cd,df,… from the above file
- Should be able to read three characters at a time, like abc,bcc,ccd,cdf,… from the above file
First of all, how can I read only the alphabetic characters and write them to an external file?
I can read one character at a time using f.read(1) until the end of the file. How can I extend this to read 2 or 3 characters at a time while advancing by only one character? That is, given abcd, I should read ab, bc, cd rather than ab, cd (which, I think, is what f.read(2) would give). Thanks. I am doing this for cryptanalysis work, analyzing ciphertexts by frequency.
If you need to peek ahead (read a few extra characters at a time), you need a buffered file object. The following class does just that:
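A minimal sketch of what such a class might look like. Only the method name readalpha() comes from this answer; the class name AlphaReader, the (count, extra) signature and the blocksize parameter are assumptions made for illustration:

```python
import io


class AlphaReader:
    """Wrap a text file object and hand back letters only, with optional lookahead.

    Hypothetical sketch: the class name, the blocksize parameter and the
    readalpha(count, extra) signature are assumptions.
    """

    def __init__(self, fileobj, blocksize=1024):
        self._file = fileobj
        self._blocksize = blocksize
        self._buffer = ''  # letters already read from the file but not yet consumed

    def _fill(self, wanted, best_effort=False):
        """Top up the buffer until it holds `wanted` letters or the file ends."""
        while len(self._buffer) < wanted:
            block = self._file.read(self._blocksize)
            if not block:  # end of file
                break
            # str.isalpha() also accepts non-ASCII letters; tighten if needed.
            self._buffer += ''.join(c for c in block if c.isalpha())
            if best_effort:  # lookahead only tries one more block
                break

    def readalpha(self, count=1, extra=0):
        """Consume `count` letters; also return up to `extra` letters of lookahead."""
        self._fill(count)
        self._fill(count + extra, best_effort=True)
        result = self._buffer[:count + extra]
        self._buffer = self._buffer[count:]  # only `count` letters are consumed
        return result


reader = AlphaReader(io.StringIO('abc? cdfde" nhj'))
print(reader.readalpha(1))     # a
print(reader.readalpha(1, 2))  # bcc
```

The lookahead letters stay in the buffer, so the next call starts one letter further along rather than skipping ahead.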
This attempts to return extra characters beyond the one character you are reading, but doesn’t guarantee it can supply all of them. It could return fewer at the end of the file, or if the next block contains a lot of non-alphabetic text.
Usage:
Demo, using a file with your example input:
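For input small enough to hold in memory, the same result can be sketched without any buffering class. This is an alternative illustration rather than the demo itself; the helper names letters_only and ngrams are made up for the example, and the sample string is the line from the question:

```python
from collections import Counter


def letters_only(text):
    """Drop everything except letters (punctuation, digits, blanks, newlines)."""
    return ''.join(c for c in text if c.isalpha())


def ngrams(text, n):
    """Overlapping n-character windows, advancing one character at a time."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


sample = 'abc? cdfde" nhj.cde\' dfwe-df$sde'
clean = letters_only(sample)
print(clean)                 # abccdfdenhjcdedfwedfsde
print(ngrams(clean, 2)[:5])  # ['ab', 'bc', 'cc', 'cd', 'df']
print(ngrams(clean, 3)[:4])  # ['abc', 'bcc', 'ccd', 'cdf']

# Frequency table of the two-letter windows, for the cryptanalysis use case.
bigram_counts = Counter(ngrams(clean, 2))
print(bigram_counts.most_common(3))
```

Writing clean to a second file with an ordinary open(..., 'w') call would give the letters-only output file the question asks for.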
To use the readalpha() calls in a loop, where you get each and every character separately plus the next two characters, use iter() with a sentinel:
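A self-contained sketch of the iter() pattern, assuming the input has already been reduced to letters. The function read_with_lookahead is a hypothetical stand-in for readalpha(); it peeks with tell() and seek() instead of keeping its own buffer:

```python
import io


def read_with_lookahead(f, extra=2):
    """Consume one character and peek at up to `extra` more without consuming them."""
    first = f.read(1)
    if not first:
        return ''      # end of file: this empty string is the sentinel
    pos = f.tell()     # remember where the next read should start
    ahead = f.read(extra)
    f.seek(pos)        # rewind so only one character was consumed
    return first + ahead


f = io.StringIO('abccdfde')  # stands in for the already-filtered file
# iter(callable, sentinel) keeps calling until the callable returns ''.
for chunk in iter(lambda: read_with_lookahead(f), ''):
    print(chunk)
```

The last two chunks are shorter than three characters because there is nothing left to peek at, so a frequency count should probably skip chunks whose length is below the window size.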