The Archive Base

Editorial Team
Asked: June 14, 2026


I’m a novice Java programmer, and am just now beginning to expand into the world of libraries, APIs, and the like. I’m at the point where I have an idea that is relatively simple, and can be my pet project when I’m not working on homework.

I’m interested in scraping HTML from a few different sites and building strings that look like Artist – "Track Name". I’ve got one site working the way I want, but I feel it could be done a lot more smoothly. Here’s the rundown on what I do for Site A:

I have Jsoup create Elements for everything with the class plrow, like so:

<p class="plrow"><b><a href="playlist.php?station=foo">Artist</a></b> “Title” (<span class="sn_ld"><a href="playlist.php?station=foo">Label</a></span>) <SMALL><b>N </b></SMALL></p></td></tr><tr class="ev"><td><a name="98069"></a><p class="pltime">Time</p>

From there, I create a String array of lines that are split after the last </p>, then use the following code to process the array:

for (int i = 0; i < tracks.length; i++) {
    tracks[i] = Jsoup.parse(tracks[i]).text();
    tracks[i] = tracks[i].split("”")[0];
    tracks[i] = tracks[i] + "”";
}

Which is a pretty hackish way to get Artist "Title" the way I want, but the result is fine for me.

Site B is a little bit different.

I’ve determined that the Artists and Titles are all contained like this:
<span class="artist" property="foaf:name">Artist Name</span> </a> </span> <span class="title" property="dc:title">Title</span>

along with more information, all inside of <li id="segmentevent-random" class="segment track" typeof="po:MusicSegment" about="/url"> song info </li>

I was trying to go through and snag all of the artists first, and then the titles and then merge the two, but I was having trouble with that because the “dc:title” property used to display the track title is used for other non music things, so I can’t directly match up the artist with a track.

I have spent the lion’s share of this weekend trying to get this working, reading countless questions tagged with Jsoup as well as the Jsoup cookbook and API guide. I suspect part of my trouble stems from my relatively limited knowledge of how web pages are coded, though most of it is probably that I don’t yet understand how to plug these bits of code into Jsoup.

I appreciate any help or guidance, and I’ve got to say, it’s really nice to ask a non-homework question here (though I find quite a few hints from what others have asked! 😉 )

1 Answer

  1. Editorial Team
    Added an answer on June 14, 2026 at 4:22 am

    General:

    If you want to parse content from several different websites, it's a good idea to distinguish between them. For example, you can decide from the URL whether to parse Page A or Page B.

    Example:

    if( urlToPage.contains("pagea.com") )
    {
        // Call the parse method for Page A, or create a parser class
    }
    else if( urlToPage.contains("pageb.com") )
    {
        // Call the parse method for Page B, or create a parser class
    }
    // ... 
    else
    {
        // E.g. throw an exception because no parser is available
    }
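
A plain `contains` check also matches URLs that merely mention the domain somewhere, for example in a query string. As a rough sketch, assuming Java 9 or later and reusing the placeholder hosts from above, the host can be extracted with `java.net.URI` and looked up in a map. The `ParserRegistry` class and the parser names are hypothetical and only serve as illustration:

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class ParserRegistry {
    // Hypothetical mapping from host name to a parser identifier.
    // Real code would probably map to parser objects instead of strings.
    private static final Map<String, String> PARSERS = Map.of(
            "pagea.com", "PageAParser",
            "pageb.com", "PageBParser");

    /** Returns the parser name registered for the URL's host, if any. */
    static Optional<String> parserFor(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // not a syntactically valid URI
        }
        if (host == null) {
            return Optional.empty(); // e.g. the protocol is missing
        }
        host = host.toLowerCase(Locale.ROOT);
        if (host.startsWith("www.")) {
            host = host.substring(4); // treat www.pagea.com like pagea.com
        }
        return Optional.ofNullable(PARSERS.get(host));
    }

    public static void main(String[] args) {
        System.out.println(parserFor("http://www.pagea.com/playlist.php?station=foo"));
        System.out.println(parserFor("http://example.org/?ref=pagea.com"));
    }
}
```

An empty result corresponds to the final `else` branch: no parser is available for that site.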
    

    You can connect and parse each page into a document with a single line of code:

    // Note: the protocol (http) is required here
    Document doc = Jsoup.connect("http://pagewhaterver.com").get(); 
    

    Without knowing the HTML or the structure of each page, here are some basic approaches:

    Page A:

    for( Element element : doc.select("p.plrow") )
    {
        String title = element.ownText();                           // Title - returns '“Title” ()', so the quotes and "()" still have to be stripped
        String artist = element.select("a").first().text();         // Artist
        String label = element.select("span.sn_ld").first().text(); // Label
    
        // etc.
    }
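
To get from the loop above to the `Artist "Title"` strings asked for in the question, one option is to pull the quoted part out of `ownText()` with a regular expression. This is only a sketch based on the single `p.plrow` sample in the question: it assumes the title is always wrapped in curly quotes, that the artist is the first link inside a `<b>`, and that the Jsoup version on the classpath provides `selectFirst`. The class name `PageATracks` is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class PageATracks {
    // Matches the text between a left and a right curly double quote.
    private static final Pattern QUOTED = Pattern.compile("\u201C([^\u201D]*)\u201D");

    /** Builds one 'Artist "Title"' string per p.plrow element. */
    static List<String> extract(Document doc) {
        List<String> tracks = new ArrayList<>();
        for (Element row : doc.select("p.plrow")) {
            Element artistLink = row.selectFirst("b > a");
            Matcher title = QUOTED.matcher(row.ownText());
            if (artistLink == null || !title.find()) {
                continue; // row does not look like the sample; skip it
            }
            tracks.add(artistLink.text() + " \"" + title.group(1) + "\"");
        }
        return tracks;
    }

    public static void main(String[] args) {
        // Shortened version of the sample row from the question.
        String html = "<p class='plrow'><b><a href='playlist.php?station=foo'>Artist</a></b>"
                + " \u201CTitle\u201D (<span class='sn_ld'><a href='playlist.php?station=foo'>Label</a></span>)"
                + " <small><b>N </b></small></p>";
        System.out.println(extract(Jsoup.parse(html)));
    }
}
```

Working on the parsed elements this way avoids splitting the raw HTML into a String array first.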
    

    Page B:

    Similar to Page A, the artist and title can be selected like this:

    String artist = doc.select("span.artist").first().text();
    String title = doc.select("span.title").first().text();
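
Note that `first()` only returns the first match in the whole document. Since the question mentions that `dc:title` is also used for non-music items, a possible way to keep artist and title paired is to iterate over the `li.segment.track` elements and select inside each one. The following is a sketch under the assumption that every track lives in such an `<li>` as described in the question; `PageBTracks` is a hypothetical class name, and `selectFirst` needs a Jsoup version that provides it.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class PageBTracks {
    /** Builds one 'Artist "Title"' string per li.segment.track element. */
    static List<String> extract(Document doc) {
        List<String> tracks = new ArrayList<>();
        for (Element segment : doc.select("li.segment.track")) {
            // Selecting relative to the segment keeps artist and title together.
            Element artist = segment.selectFirst("span.artist");
            Element title = segment.selectFirst("span.title");
            if (artist == null || title == null) {
                continue; // incomplete segment; skip it
            }
            tracks.add(artist.text() + " \"" + title.text() + "\"");
        }
        return tracks;
    }

    public static void main(String[] args) {
        // One music segment modelled on the question, plus a made-up
        // non-music item that also carries a dc:title span.
        String html = "<ul>"
                + "<li class='segment track' typeof='po:MusicSegment'>"
                + "<span class='artist' property='foaf:name'>Artist Name</span> "
                + "<span class='title' property='dc:title'>Title</span></li>"
                + "<li class='segment speech'>"
                + "<span class='title' property='dc:title'>Not a song</span></li>"
                + "</ul>";
        System.out.println(extract(Jsoup.parse(html)));
    }
}
```

Because the selector requires both the `segment` and `track` classes, titles that belong to other kinds of list items are never considered.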
    

    Here’s a good overview of the Jsoup Selector API: http://jsoup.org/cookbook/extracting-data/selector-syntax
