The Archive Base

Editorial Team
Asked: June 9, 2026 (2026-06-09T05:10:15+00:00)

So I found a script online for xml parsing in linux that I am


I found a script online for XML parsing on Linux that I want to use, and I was hoping to get some help understanding how the script works and how to edit it for my own use.

Here is the script (credit)

#!/bin/bash

cat $1 | awk '

START {
   pos=1;
   xml=$0
   len=length(xml);
   endp=1
}

{
   while(pos <= len)
   {
      if(substr(xml,pos,7) == "<title>")
      {
         pos=pos+7;
         endp=pos;
         while((substr(xml,endp,8) != "</title>") && (endp < len))
         {
            endp++;
         }
         print "   ",substr(xml,pos,endp-pos)," * ";
         pos=endp+7;
      }
      pos++;
   }
}'

Here is a simplified sample of the XML data I will be using

I have already removed the extra characters on both sides of the tags and made a few other adjustments by changing the script to this:

#!/bin/bash

cat $1 | awk '

START {
   pos=1;
   xml=$0
   len=length(xml);
   endp=1
}

{
   while(pos <= len)
   {
      if(substr(xml,pos,16) == "<sport><![CDATA[")
      {
         pos=pos+16;
         endp=pos;
         while((substr(xml,endp,11) != "]]></sport>") && (endp < len))
         {
            endp++;
         }
         print "",substr(xml,pos,endp-pos),"";
         pos=endp+10;
      }
      pos++;
   }
}'

Using this script leaves me with a plain text file containing this result:

Women's Soccer
Men's Soccer
Women's Soccer

Ultimately I’d like to have a script output the following

Women's Soccer Away @ South Carolina (Exhibition) at 7:00 PM
Men's Soccer Home vs. Ohio State at 7:00 PM
Women's Soccer Away @ William and Mary at 7:00 PM

For those wondering, this is the shell that calls the parse script (ignore file names and locations)

wget -O rss.xml http://en-us.fxfeeds.mozilla.com/en-US/firefox/headlines.xml
~dsl/bin/rssparse! rss.xml > headlines_$$.tmp
cd /tmp/ldmtrx
split --lines=30 /tmp/headlines_$$.tmp ldmtrxnews
cd /tmp
rm headlines_$$.tmp rss.xml

While it would be greatly appreciated, I don’t expect anyone to complete this task for me; I’d just really like some tips and help getting started. I’m not sure how to run this script on a different element and then print both elements (for example <sport> and <homeaway>). I could run the script again, but then the elements would be printed on different lines.

Lastly, I’d like to know how to exclude all data that does not have a <date> matching today’s date. Thanks for your help.


1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 5:10 am

    Note that your example XML does not validate as posted. It needs some tweaks.

    Check this pastie instead of that pastie.

    Then, using xmlstarlet, you can supersede everything this script does.

    $ wget --output-document - http://pastie.org/pastes/4408130/download | xmlstarlet sel -t -m rss/channel/item -v sport -o ' Away @ ' -v opponent -o ' at ' -v time -n
    

    That outputs:

    Women's Soccer Away @ South Carolina (Exhibition) at 7:00 PM
    Men's Soccer Away @ Ohio State (Exhibition) at 7:00 PM
    Women's Soccer Away @ William and Mary at 7:00 PM
    

    Once the output is what you need, you can pass -C to xmlstarlet sel to display the generated XSLT, which you can then use from any language where you need this particular parsing.
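
For the two follow-up points in the question (printing several elements on one line, and keeping only entries with today's date), a minimal awk sketch in the spirit of the original script might look like the following. It is only an illustration under stated assumptions: the function name parse_items is hypothetical, and because the sample XML is not shown here, the YYYY-MM-DD format of <date> and the "Home"/"Away" values of <homeaway> are guesses that would need adjusting to the real feed.

```shell
#!/bin/bash
# Sketch: print several elements of each <item> on one line, for one date only.
# Assumptions (not from the original post): <date> holds YYYY-MM-DD text,
# <homeaway> holds "Home" or "Away", and <item> tags carry no attributes.

parse_items() {   # usage: parse_items FILE DATE
    awk -v today="$2" '
    # Return the text of the first <tag>...</tag> in s, minus any CDATA wrapper.
    function field(s, tag,    otag, ctag, a, b, v) {
        otag = "<" tag ">"; ctag = "</" tag ">"
        a = index(s, otag)
        if (a == 0) return ""
        v = substr(s, a + length(otag))
        b = index(v, ctag)
        if (b == 0) return ""
        v = substr(v, 1, b - 1)
        sub(/^<!\[CDATA\[/, "", v)
        sub(/\]\]>$/, "", v)
        return v
    }
    { xml = xml $0 " " }     # collect the whole file; items may span lines
    END {
        # Cut the text into <item>...</item> chunks and handle each in turn.
        while ((start = index(xml, "<item>")) > 0) {
            xml = substr(xml, start + 6)
            stop = index(xml, "</item>")
            if (stop == 0) break
            item = substr(xml, 1, stop - 1)
            xml = substr(xml, stop + 7)
            if (field(item, "date") != today) continue   # skip other dates
            sep = (field(item, "homeaway") == "Home") ? "vs." : "@"
            print field(item, "sport"), field(item, "homeaway"), sep, field(item, "opponent"), "at", field(item, "time")
        }
    }' "$1"
}
```

Called as parse_items rss.xml "$(date +%Y-%m-%d)", this would print one line per item dated today. Because every element of an item is read from the same chunk of text, the fields end up on the same output line instead of needing a second pass over the file.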
