Some days ago I asked a question about tagging differences between 2 text files, and it was answered quickly.
Now I have a rather similar question, but a bit more complicated.
I have 2 pairs of files with the following characteristics:
pair1: (File1.txt , File2.txt)
pair2: (File3.txt , File4.txt)
There is a line-by-line correspondence between the files in each pair. Say that File1.txt and File3.txt contain some English words, and File2.txt and File4.txt contain their Arabic and French translations respectively. In addition, File1.txt and File3.txt are very similar (and in some cases the same).
File1.txt File2.txt
EnWord1 ArTrans1
EnWord2 ArTrans2
EnWord3 ArTrans3
Enword4 ArTrans4
File3.txt File4.txt
EnWord1 FrTrans1
EnWord3 FrTrans3
Enword4 FrTrans4
Enword5 FrTrans5
Now what I want to do is compare the English sides of these pairs, find the common words that appear in both files (EnWord1, EnWord3, and Enword4) and extract their corresponding translations.
In short, using two bilingual dictionaries (English-Arabic and English-French), I am trying to build a trilingual English-Arabic-French dictionary.
How is this possible?
I have to add that there are many such pairs: the dictionaries are stored in different files, each file contains a part of the words, and for some reasons it is not possible to merge the files and then process them. So the speed of the code is very important, and I am looking for a fast way to do this.
Finally, please give me some pointers (or, if possible, even the complete code) to do this in Perl.
Best regards,
Hakim
I assume that the order you would like to maintain follows File1.txt. The following perl should accomplish what you're looking for:
Run like this:
grep "" NEW_File*
results:
May not be the most efficient way to do things, but should give you somewhere to start at least. HTH.
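For reference, here is a minimal sketch of one way to do the lookup described in the question. It is not necessarily the script mentioned above. It assumes the files are UTF-8, that each line holds exactly one entry, and that English words must match exactly (including case). The sub names load_pair and merge_pairs are made up for illustration. The second pair is loaded into a hash, and the first pair is then streamed in File1.txt order, so each word costs one hash lookup:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read two line-aligned files and return a hash ref: English word => translation.
# (Hypothetical helper; assumes one entry per line and UTF-8 input.)
sub load_pair {
    my ($en_file, $tr_file) = @_;
    open my $en, '<:encoding(UTF-8)', $en_file or die "Cannot open $en_file: $!";
    open my $tr, '<:encoding(UTF-8)', $tr_file or die "Cannot open $tr_file: $!";
    my %dict;
    while (defined(my $word = <$en>)) {
        my $trans = <$tr>;
        last unless defined $trans;    # stop if the translation file is shorter
        chomp($word, $trans);
        $dict{$word} = $trans;
    }
    close $en;
    close $tr;
    return \%dict;
}

# Return [English, Arabic, French] rows for words present in both pairs,
# keeping the order of the first English file.
sub merge_pairs {
    my ($en1_file, $ar_file, $en2_file, $fr_file) = @_;
    my $fr_for = load_pair($en2_file, $fr_file);
    open my $en, '<:encoding(UTF-8)', $en1_file or die "Cannot open $en1_file: $!";
    open my $ar, '<:encoding(UTF-8)', $ar_file  or die "Cannot open $ar_file: $!";
    my @rows;
    while (defined(my $word = <$en>)) {
        my $trans = <$ar>;
        last unless defined $trans;
        chomp($word, $trans);
        push @rows, [ $word, $trans, $fr_for->{$word} ] if exists $fr_for->{$word};
    }
    close $en;
    close $ar;
    return \@rows;
}

# Only run when called with the four file names of one pair-of-pairs.
if (@ARGV == 4) {
    binmode STDOUT, ':encoding(UTF-8)';
    print join("\t", @$_), "\n" for @{ merge_pairs(@ARGV) };
}
```

If saved as, say, merge.pl, it could be run once per pair-of-pairs, e.g. `perl merge.pl File1.txt File2.txt File3.txt File4.txt > merged.txt`, which prints tab-separated English, Arabic and French columns. Only one of the two dictionaries is held in memory at a time, so it may be worth putting the smaller one second.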