I want to create a Java application, deployable on Hadoop, with the following purpose:
- I have many large log files from various servers (Tuxedo logs, WebSphere logs, and IIS logs).
- I want to analyze these files and generate a report that states, for example, how many errors came from Tuxedo and how many errors/warnings came from WebSphere.
With my limited Hadoop knowledge, I need help with the following:
- Most MapReduce examples work on files of a single type, which is not my case: my log files come from various sources (Tuxedo, WebSphere, IIS, etc.).
- How do I design my map() and reduce() functions in this case?
- How do I store the log reports (for example, the errors/warnings from Tuxedo and the errors/warnings/info messages from WebSphere)?
Thanks in advance
Apache Flume is the answer for this scenario.
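Flume is mainly a tool for collecting log data from the servers and moving it into HDFS, so the counting itself still needs a job of some kind. As a rough illustration of how map() and reduce() could be designed for mixed log sources, here is a minimal sketch in plain Java (no Hadoop dependency, so it runs on its own). The idea is to emit a composite key of source plus severity from the map step, so that one reducer logic works for every log type. Everything named here is an assumption for illustration: the class `LogSeverityCount`, the file-name rule in `sourceOf`, the keyword rule in `severityOf`, and the made-up sample lines, which are not real Tuxedo, WebSphere, or IIS formats. In a real Hadoop job the file name would come from the input split (for example via `FileSplit.getPath().getName()`), or each source could get its own mapper class through `MultipleInputs.addInputPath`; neither is used below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LogSeverityCount {

    // ASSUMPTION: the source system can be inferred from the file name.
    static String sourceOf(String fileName) {
        String name = fileName.toLowerCase();
        if (name.contains("tux")) return "tuxedo";
        if (name.contains("websphere")) return "websphere";
        if (name.contains("iis")) return "iis";
        return "unknown";
    }

    // ASSUMPTION: severity is detected by a plain keyword match.
    // Real logs need a parser per source; IIS access logs, for instance,
    // typically carry an HTTP status code rather than a severity word.
    static String severityOf(String line) {
        String upper = line.toUpperCase();
        if (upper.contains("ERROR")) return "ERROR";
        if (upper.contains("WARN")) return "WARNING";
        if (upper.contains("INFO")) return "INFO";
        return null; // line carries no severity we count
    }

    // Map step: one log line in, zero or one composite key out.
    // Each emitted key stands for the pair (key, 1).
    static List<String> map(String fileName, String line) {
        List<String> keys = new ArrayList<>();
        String severity = severityOf(line);
        if (severity != null) {
            keys.add(sourceOf(fileName) + "\t" + severity);
        }
        return keys;
    }

    // Reduce step: sum all the 1s collected for a single key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    // Local stand-in for the framework: group map output by key
    // (the "shuffle"), then reduce each group.
    static Map<String, Integer> run(Map<String, List<String>> files) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, List<String>> file : files.entrySet()) {
            for (String line : file.getValue()) {
                for (String key : map(file.getKey(), line)) {
                    grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
                }
            }
        }
        Map<String, Integer> report = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            report.put(e.getKey(), reduce(e.getValue()));
        }
        return report;
    }

    public static void main(String[] args) {
        // Made-up sample lines, only to exercise the code.
        Map<String, List<String>> files = new TreeMap<>();
        files.put("tuxedo.log", Arrays.asList(
                "sample ERROR: service failed",
                "sample WARN: queue slow"));
        files.put("websphere.log", Arrays.asList(
                "sample INFO: server started",
                "sample ERROR: connection lost"));
        // One report line per (source, severity): source TAB severity TAB count
        run(files).forEach((key, count) -> System.out.println(key + "\t" + count));
    }
}
```

On the storage question, each output line here is source, severity, and count separated by tabs, which is the same key-TAB-value shape that Hadoop's default text output writes. Keeping the report as such text files in HDFS, one line per (source, severity) pair, is one simple option; whether that is enough depends on how the report will be consumed.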