The Archive Base

Asked by Editorial Team on June 13, 2026

I get the following input that I want to split into four parts:

-
KPDX 021453Z 16004KT 10SM FEW007 SCT060 BKN200 11/09 A3002 RMK
     AO2 SLP166 T01060094 55008
TAF AMD KPDX 021453Z 0215/0312 10005KT P6SM FEW006 SCT060 BKN150
     FM021800 11005KT P6SM SCT050 OVC100
     FM022200 11007KT P6SM -RA OVC050
     FM030500 12005KT P6SM -RA OVC035
KSEA 021453Z 15003KT 10SM FEW035 BKN180 11/09 A3001 RMK AO2
     SLP168 60000 T01110089 58010
TAF AMD KSEA 021501Z 0215/0318 14004KT P6SM SCT020 BKN150
     FM021800 16005KT P6SM SCT025 OVC090
     FM030100 19005KT P6SM OVC070
     FM030200 15005KT P6SM -RA OVC045
     FM030600 16007KT P6SM -RA BKN025 OVC045

It’s a METAR, then a TAF, then a METAR, then a TAF.

Input rules:

  1. The airport codes can change, but should always be 3 or 4 letters.
  2. METARs will start with either the airport code, or “SPECI” followed by the airport code (SPECI KPDX).
  3. TAFs will start with either the airport code, or “TAF AMD” followed by the airport code (TAF AMD KPDX).
  4. In any report, the airport code will always be followed by the datetime stamp.
  5. In a TAF, the datetime stamp will always be followed by the valid times (0215/0318 for example).
  6. There could be as few as 2 reports, or many more than 4.
  7. Any report could be just a single line.

I want to grab each report by itself, so I’m using the regex ^(\\w+.*?)(?:^\\b|\\Z) in the following code:

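import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// input holds the raw text containing all reports
ArrayList<String> reports = new ArrayList<>();
Pattern pattern = Pattern.compile( "^(\\w+.*?)(?:^\\b|\\Z)", Pattern.DOTALL | Pattern.MULTILINE );
Matcher matcher = pattern.matcher( input );
while( matcher.find() )
    reports.add( matcher.group( 1 ).trim() );  // trim() already returns a String, no copy needed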

It works great, I get 4 results:

1:

KPDX 021453Z 16004KT 10SM FEW007 SCT060 BKN200 11/09 A3002 RMK
     AO2 SLP166 T01060094 55008

2:

TAF AMD KPDX 021453Z 0215/0312 10005KT P6SM FEW006 SCT060 BKN150
     FM021800 11005KT P6SM SCT050 OVC100
     FM022200 11007KT P6SM -RA OVC050
     FM030500 12005KT P6SM -RA OVC035

3:

KSEA 021453Z 15003KT 10SM FEW035 BKN180 11/09 A3001 RMK AO2
     SLP168 60000 T01110089 58010

4:

TAF AMD KSEA 021501Z 0215/0318 14004KT P6SM SCT020 BKN150
     FM021800 16005KT P6SM SCT025 OVC090
     FM030100 19005KT P6SM OVC070
     FM030200 15005KT P6SM -RA OVC045
     FM030600 16007KT P6SM -RA BKN025 OVC045

I have encountered a case where my regex fails. Occasionally, a TAF line will run too long and will be wrapped (I have no control over this), so it might look like this (notice the “BKN150” right below “TAF AMD KPDX”):

-
KPDX 021453Z 16004KT 10SM FEW007 SCT060 BKN200 11/09 A3002 RMK
     AO2 SLP166 T01060094 55008
TAF AMD KPDX 021453Z 0215/0312 10005KT P6SM FEW006 SCT060
BKN150
     FM021800 11005KT P6SM SCT050 OVC100
     FM022200 11007KT P6SM -RA OVC050
     FM030500 12005KT P6SM -RA OVC035
KSEA 021453Z 15003KT 10SM FEW035 BKN180 11/09 A3001 RMK AO2
     SLP168 60000 T01110089 58010
TAF AMD KSEA 021501Z 0215/0318 14004KT P6SM SCT020 BKN150
     FM021800 16005KT P6SM SCT025 OVC090
     FM030100 19005KT P6SM OVC070
     FM030200 15005KT P6SM -RA OVC045
     FM030600 16007KT P6SM -RA BKN025 OVC045

When this happens, I get 5 results:

1:

KPDX 021453Z 16004KT 10SM FEW007 SCT060 BKN200 11/09 A3002 RMK
     AO2 SLP166 T01060094 55008

2:

TAF AMD KPDX 021453Z 0215/0312 10005KT P6SM FEW006 SCT060

3:

BKN150
     FM021800 11005KT P6SM SCT050 OVC100
     FM022200 11007KT P6SM -RA OVC050
     FM030500 12005KT P6SM -RA OVC035

4:

KSEA 021453Z 15003KT 10SM FEW035 BKN180 11/09 A3001 RMK AO2
     SLP168 60000 T01110089 58010

5:

TAF AMD KSEA 021501Z 0215/0318 14004KT P6SM SCT020 BKN150
     FM021800 16005KT P6SM SCT025 OVC090
     FM030100 19005KT P6SM OVC070
     FM030200 15005KT P6SM -RA OVC045
     FM030600 16007KT P6SM -RA BKN025 OVC045

Can anyone figure out a regex that will correctly split this odd case? Alternatively I could try to remove the problem line break in the input string before running the regex on it, but I can’t figure out how to detect it.

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 9:11 pm

    You could require each report to start with a line that begins with a word character, then contain at least one line that starts with five spaces (you could easily loosen that condition to at least one whitespace character), and then run until the next line that starts with a word character.

    "^(\\w+.*?^[ ]{5}.*?)(?:^\\b|\\Z)"
    

    The [] around the space is not necessary, but I like to include it for readability. If you only want to assert that there is a line starting with any whitespace character, replace [ ]{5} with \\s.

    Note that you do not have to use the capturing group. A lookahead will make sure that you end at a position that is followed by either a new report or the end of the file:

    "^\\w+.*?^[ ]{5}.*?(?=^\\b|\\Z)"
    

    This is slightly more efficient and cleans up the following code a bit, because you can use the full match instead of retrieving the group.
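A minimal sketch of how the lookahead variant might be wired into the question's loop, using `matcher.group()` (the full match) instead of a capturing group. The class name `LookaheadSplitDemo` and the helper `split` are hypothetical names chosen for illustration, and the sample input is a shortened version of the wrapped case from the question. Note that this pattern still assumes every report has at least one continuation line indented by five spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadSplitDemo {
    // A line starting with a word character, at least one later line that
    // starts with five spaces, then everything up to (but not including) the
    // next line that starts with a word character, or the end of the input.
    private static final Pattern REPORT = Pattern.compile(
            "^\\w+.*?^[ ]{5}.*?(?=^\\b|\\Z)",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Hypothetical helper: returns each report as one trimmed string.
    static List<String> split(String input) {
        List<String> reports = new ArrayList<>();
        Matcher matcher = REPORT.matcher(input);
        while (matcher.find()) {
            // The lookahead consumes nothing, so the full match is the report.
            reports.add(matcher.group().trim());
        }
        return reports;
    }

    public static void main(String[] args) {
        String input = String.join("\n",
                "KPDX 021453Z 16004KT 10SM FEW007 SCT060 BKN200 11/09 A3002 RMK",
                "     AO2 SLP166 T01060094 55008",
                "TAF AMD KPDX 021453Z 0215/0312 10005KT P6SM FEW006 SCT060",
                "BKN150",
                "     FM021800 11005KT P6SM SCT050 OVC100",
                "KSEA 021453Z 15003KT 10SM FEW035 BKN180 11/09 A3001 RMK AO2",
                "     SLP168 60000 T01110089 58010");
        // Print each report followed by a separator line.
        split(input).forEach(r -> System.out.println(r + "\n---"));
    }
}
```

With this input, the wrapped “BKN150” line should stay inside the TAF report, because the pattern keeps consuming lines until it has seen one that starts with five spaces.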

    Update:

    To accommodate single-line reports (and in general), it is even easier to change the ending condition ^\\b so that it matches the beginning of a new report. According to the format rules given in the question, you could use:

    "^\\w+.*?(?=^(?:SPECI\\s|TAF\\sAMD\\s)?[A-Z]{3,4}\\s\\d+Z|\\Z)"
    

    This requires a new report to start with an optional “SPECI” or “TAF AMD” prefix, followed by a 3- or 4-letter airport code and a timestamp ending in Z.
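As a rough sketch, the updated pattern could be used as follows. The class name `HeaderSplitDemo` and the helper `split` are hypothetical, and the sample input is made up from the question's data to combine a single-line SPECI report with the wrapped TAF line. It assumes that wrapped lines never look like a report header themselves (code, whitespace, digits, Z).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderSplitDemo {
    // A report runs from a line starting with a word character up to the next
    // line that looks like a report header: optional "SPECI " or "TAF AMD ",
    // a 3- or 4-letter airport code, whitespace, and a timestamp ending in Z.
    private static final Pattern REPORT = Pattern.compile(
            "^\\w+.*?(?=^(?:SPECI\\s|TAF\\sAMD\\s)?[A-Z]{3,4}\\s\\d+Z|\\Z)",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Hypothetical helper: returns each report as one trimmed string.
    static List<String> split(String input) {
        List<String> reports = new ArrayList<>();
        Matcher matcher = REPORT.matcher(input);
        while (matcher.find()) {
            reports.add(matcher.group().trim());
        }
        return reports;
    }

    public static void main(String[] args) {
        String input = String.join("\n",
                "SPECI KPDX 021453Z 16004KT 10SM FEW007 SCT060 BKN200 11/09 A3002",
                "TAF AMD KPDX 021453Z 0215/0312 10005KT P6SM FEW006 SCT060",
                "BKN150",
                "     FM021800 11005KT P6SM SCT050 OVC100",
                "KSEA 021453Z 15003KT 10SM FEW035 BKN180 11/09 A3001 RMK AO2",
                "     SLP168 60000 T01110089 58010");
        // Print each report followed by a separator line.
        split(input).forEach(r -> System.out.println(r + "\n---"));
    }
}
```

Because the end condition now checks for a header rather than for indentation, a one-line report ends as soon as the next header appears, and “BKN150” is not mistaken for a header since the letters are followed directly by digits rather than whitespace.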
