I want to check whether my array contains repeated items. There are over 16,000 of them, so I want to automate it.
There may be other ways, but I started with this approach and would like to finish it unless there is a straightforward command. I shift items off one array and push them onto another, checking the destination array each time to see whether the item is already “in array” (PHP has a command like that).
So I wrote this subroutine. It works with literals but not with variables. I think the problem is the ‘eq’, or whatever operator I should be using instead. The ‘sourcefile’ will contain one or more of the words in the destination array.
# Here I just fetch my file
$listamails = <STDIN>;
# Remove the trailing newline from the filename
chomp $listamails;
# Open the file, or exit
unless ( open(MAILS, $listamails) ) {
    print "Cannot open file \"$listamails\"\n\n";
    exit;
}
# Read the list of mails from the file, and store it
# into the array variable @sourcefile
@sourcefile = <MAILS>;
# Close the handle - we've read all the data into @sourcefile now.
close MAILS;
my @destination = ('hi', 'bye');
sub in_array
{
    my ($destination, $search_for) = @_;
    return grep { $search_for eq $_ } @$destination;
}
for ($i = 0; $i <= 100; $i++)
{
    $elemento = shift @sourcefile;
    if (in_array(\@destination, $elemento))
    {
        print "it is";
    }
    else
    {
        print "it aint there";
    }
}
If I pass the literal ‘hi’ instead of $elemento, it works. I have also printed $elemento and it shows ‘hi’, yet with the variable the check fails. I assume the ‘eq’ is to blame, but I don’t know what else to use: with == it complains that ‘hi’ is not a numeric value.
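A likely cause, assuming the file holds one entry per line: every element read with <MAILS> still ends in a newline, so the comparison is really "hi\n" eq "hi", which is false. Printing hides this because the trailing newline just looks like a line break. ‘eq’ is the right operator for strings; the missing step is probably a chomp on the array. The sketch below uses a hard-coded @sourcefile in place of the real file, and that sample data is an assumption for illustration only.

```perl
use strict;
use warnings;

# Returns how many elements of @$list are string-equal to $search_for.
sub in_array {
    my ($list, $search_for) = @_;
    return scalar grep { $search_for eq $_ } @$list;
}

my @destination = ('hi', 'bye');

# Stand-in for the file contents (hypothetical data): lines read
# with <FH> keep their trailing newline.
my @sourcefile = ("hi\n", "hello\n", "bye\n");

# chomp on an array strips the trailing newline from every element.
chomp @sourcefile;

for my $elemento (@sourcefile) {
    if (in_array(\@destination, $elemento)) {
        print "$elemento: it is\n";
    }
    else {
        print "$elemento: it ain't there\n";
    }
}
```

Without the chomp line, all three entries would be reported as missing.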
By the way, when looking for duplicates in a large number of items, it’s much faster to use a strategy based on sorting. After sorting the items, all duplicates will be right next to each other, so to tell if something is a duplicate, all you have to do is compare it with the previous one:
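A minimal sketch of that idea follows. The name find_dupes and the sample list are made up for illustration; in the script above the input would be @sourcefile after it has been chomped.

```perl
use strict;
use warnings;

# Sorts a copy of the items, then returns every element that is
# string-equal to the element just before it.
sub find_dupes {
    my @sorted = sort @_;    # default sort compares as strings
    my @dupes;
    # Start at index 1 so each element has a predecessor.
    for my $i (1 .. $#sorted) {
        push @dupes, $sorted[$i] if $sorted[$i] eq $sorted[$i - 1];
    }
    return @dupes;
}

# Hypothetical sample data.
my @items = ('bye', 'hi', 'hello', 'hi', 'bye', 'hi');

print "dupe: $_\n" for find_dupes(@items);
```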
This prints one dupe message for each repeated occurrence, so an item that appears three times is reported twice, but you can clean that up.
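One possible clean-up, again only a sketch with a made-up name (unique_dupes) and sample data: because the list is sorted, a duplicate that has already been recorded is always the last entry in the result, so checking that one entry is enough to report each value once.

```perl
use strict;
use warnings;

# Like the adjacent-comparison approach, but records each
# duplicated value only once.
sub unique_dupes {
    my @sorted = sort @_;
    my @dupes;
    for my $i (1 .. $#sorted) {
        # Not a duplicate of its predecessor: nothing to record.
        next unless $sorted[$i] eq $sorted[$i - 1];
        # Already recorded on an earlier iteration: skip it.
        next if @dupes && $dupes[-1] eq $sorted[$i];
        push @dupes, $sorted[$i];
    }
    return @dupes;
}

# Hypothetical sample data.
my @items = ('bye', 'hi', 'hello', 'hi', 'bye', 'hi');

print "dupe: $_\n" for unique_dupes(@items);
```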