Editorial Team
Asked: May 22, 2026 (2026-05-22T18:35:01+00:00)

I am already parsing pages with the HtmlAgilityPack and getting most img sources. However, many websites include image URLs in places other than the img src attribute (e.g. inlined JavaScript, a different attribute, a different element). I would like to cast a slightly wider net and run a regex over the entire HTML string to capture URLs that meet the following conditions:

  1. Must begin with http://, https://, //, or /
  2. Then, any number of valid url path characters
  3. Must end with .jpeg, .jpg, .png, or .gif

I imagine this would be simple to write, but I am not an awesome regexer. I imagine the parts would look like this:

  1. ^((https?\:\/\/)|(\/{1,2}))
  2. (any ideas?)
  3. (.(jpe?g|png|gif))$

Can anyone help me fill the blanks?

Thanks


1 Answer

  1. Editorial Team
    Added an answer on May 22, 2026 at 6:35 pm

    There are a number of ad-hoc regular expressions for matching URLs out there, but none that I am aware of claims total reliability. The one at the end of this answer attempts to satisfy your conditions.

    According to [1], the valid unreserved URL characters are the alphanumerics plus the symbols $-_.+!*'(),. There are reserved characters as well, +/?%#&, which [2] lists concisely; I couldn't find such a list in the bulk of the RFC. Other characters appear in query strings too, namely = and ;, so those need to be included. You then run into the problem that not everyone properly encodes their URLs, so spaces (among other things) may be present. I do not know how to account for that, since the way a browser auto-corrects such things can be mystifying.
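
The character lists above could be turned into an explicit character class. The following is only a sketch in Python's `re` module (an assumption, since the question uses .NET), and the names `URL_CHARS`, `STRICT_IMAGE_URL` and `find_strict` are hypothetical. It also shows the weakness just described: a URL containing an unencoded space is missed.

```python
import re

# Characters listed in the answer: alphanumerics, the unreserved symbols
# $-_.+!*'(), the reserved symbols +/?%#& and the query-string extras = and ;
# Assumption: ':' and '@' are left out because the answer does not list them.
URL_CHARS = r"[A-Za-z0-9$\-_.+!*'(),/?%#&=;]"

STRICT_IMAGE_URL = re.compile(
    r"(?:https?:)?//?"           # http://, https://, // or /
    + URL_CHARS + r"+?"          # lazily consume allowed characters only
    + r"\.(?:jpe?g|png|gif)",    # required image extension
    re.IGNORECASE,
)

def find_strict(html):
    """Return every substring of html that matches the strict pattern."""
    return [m.group(0) for m in STRICT_IMAGE_URL.finditer(html)]

# Hypothetical sample: the second URL contains a raw space and is skipped.
html = '<img src="http://example.com/pic.jpg"><div data-bg="/img/a b.png"></div>'
print(find_strict(html))
```

Whether a strict class like this is preferable depends on how well-formed the target pages are; it trades missed URLs for fewer false positives.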

    Therefore, you might simply assume that a URL can contain almost anything (apart from quotes and angle brackets, which delimit it in HTML), as long as it starts and ends with the particular patterns you provided. This is still unreliable.

    @(https?:)?//?[^'"<>]+?\.(jpg|jpeg|gif|png)@
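
As a rough illustration, the pattern above can be run over a whole HTML string. This sketch uses Python's `re` module rather than .NET (an assumption made for the example), drops the `@...@` wrapper on the assumption that it is a pair of regex delimiters, and uses hypothetical names (`IMAGE_URL`, `find_image_urls`) and sample markup. The `^` and `$` anchors from the question are left out because the URLs sit inside a larger string.

```python
import re

# The answer's pattern without the surrounding @ delimiters.
IMAGE_URL = re.compile(r"""(https?:)?//?[^'"<>]+?\.(jpg|jpeg|gif|png)""")

def find_image_urls(html):
    """Return unique candidate image URLs in order of first appearance."""
    seen = []
    for match in IMAGE_URL.finditer(html):
        url = match.group(0)
        if url not in seen:
            seen.append(url)
    return seen

# Hypothetical sample markup covering an attribute, a custom attribute,
# and inlined JavaScript.
sample = """
<img src="https://example.com/a/logo.png">
<div data-bg='//cdn.example.com/hero.jpeg'></div>
<script>var thumb = "/static/thumb.gif";</script>
"""
print(find_image_urls(sample))
```

Because the match is lazy and stops at the first image extension, anything after the extension (such as a query string) is cut off, so the results are best treated as candidates to verify rather than final URLs.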

© 2021 The Archive Base. All Rights Reserved