The Archive Base

Editorial Team
Asked: May 13, 2026

I have some HUGE log files (50Mb; ~500K lines) I need to start filtering

I have some HUGE log files (50 MB; ~500K lines) that I need to start filtering some of the crap out of. The log files are produced by log4j and follow this basic pattern:

[log-level] date-time class etc, etc  
log-message  

I’m looking for a way to specify a start regex and an end regex (or something similar) so that matching entries are filtered out of the file and I can more easily wade through these massive files. My thinking is that the start regex would match the log level and the end regex would match something in the log message. I’m sure I could write a Java program to do this, but I thought I’d ask the community before going down that path. Thanks in advance.


Let me expand on my question. Let’s assume I have the following snippet in my log file:

[DEBUG] date-time class etc, etc  
log-message-1

[WARN] date-time class etc, etc  
log-message-2

[DEBUG] date-time class etc, etc  
log-message-3

[DEBUG] date-time class etc, etc  
log-message-1

[WARN] date-time class etc, etc  
log-message-2

[DEBUG] date-time class etc, etc  
log-message-6

I’d like a way to filter out the entries containing log-message-1 and log-message-2 (call them logEntry1 and logEntry2) so I end up with:

[DEBUG] date-time class etc, etc  
log-message-3

[DEBUG] date-time class etc, etc  
log-message-6

I would hope to accomplish this by defining a set of regex pattern pairs. In my example above, I’d want to define one pair for logEntry1 and another for logEntry2.

I hope that helps clarify my question.

1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 11:57 pm

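    Assuming log-message-1 and log-message-2 are unique patterns, awk's paragraph mode (an empty RS makes each blank-line-separated entry one record) can drop the matching entries: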

    $ awk -vRS= '!/log-message-[12]/' ORS="\n\n" file
    [DEBUG] date-time class etc, etc
    log-message-3
    
    [DEBUG] date-time class etc, etc
    log-message-6
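
If the filter needs the start/end regex pairs described in the question rather than a single message pattern, the same blank-line-separated approach could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `filter_entries`, the `PAIRS` list and its patterns are hypothetical, each entry is assumed to be a header line followed by its message, and entries are assumed to be separated by blank lines as in the sample above.

```python
import re

# Hypothetical (start, end) pairs: the start regex is tested against the
# header line of an entry, the end regex against the rest of the entry.
PAIRS = [
    (r"^\[DEBUG\]", r"log-message-1\b"),
    (r"^\[WARN\]", r"log-message-2\b"),
]


def filter_entries(text, pairs):
    """Return text without the entries matched by any (start, end) pair.

    Assumes entries are separated by one or more blank lines and that the
    first line of each entry is the header ([log-level] date-time ...).
    """
    compiled = [(re.compile(start), re.compile(end)) for start, end in pairs]
    kept = []
    for entry in re.split(r"\n\s*\n", text.strip()):
        if not entry.strip():
            continue  # nothing left after stripping, e.g. an empty file
        header, _, message = entry.partition("\n")
        # Drop the entry only when both halves of some pair match.
        if any(s.search(header) and e.search(message) for s, e in compiled):
            continue
        kept.append(entry)
    return ("\n\n".join(kept) + "\n") if kept else ""
```

Calling `filter_entries(log_text, PAIRS)` on the contents of a log file returns the remaining entries as one string. The sketch reads the whole file into memory, which should be acceptable at around 50 MB; a line-by-line variant would be needed for much larger files.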
    