The Archive Base

Editorial Team
Asked: May 23, 2026

I need to parse a URL. I’m currently using urlparse.urlparse() and urlparse.urlsplit().

The problem is that I can’t get the “netloc” (host) from the URL when the scheme is not present.
I mean, if I have the following URL:

www.amazon.com/Programming-Python-Mark-Lutz/dp/0596158106/ref=sr_1_1?ie=UTF8&qid=1308060974&sr=8-1

I can’t get the netloc: www.amazon.com

According to python docs:

Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by '//'. Otherwise the input is presumed to be a relative URL and thus to start with a path component.

So, it’s this way on purpose. But I still don’t know how to get the netloc from that URL.

I think I could check whether the scheme is present and, if it’s not, add it and then parse the URL. But this solution doesn’t seem very good.

Do you have a better idea?
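
A minimal sketch of the behavior the docs describe, assuming Python 3, where the urlparse module became urllib.parse (on Python 2, import urlparse instead). Without a leading '//' the host ends up in the path; with it, urlparse reports the host as the netloc.

```python
from urllib.parse import urlparse

url = "www.amazon.com/Programming-Python-Mark-Lutz/dp/0596158106/ref=sr_1_1?ie=UTF8&qid=1308060974&sr=8-1"

# No scheme and no '//': the input is treated as a relative URL.
without_slashes = urlparse(url)
print(repr(without_slashes.netloc))  # ''
print(without_slashes.path)          # starts with 'www.amazon.com/'

# Prefixing '//' marks the first component as the network location.
with_slashes = urlparse("//" + url)
print(with_slashes.netloc)           # www.amazon.com
```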

EDIT:
Thanks for all the answers. But I cannot do the “startswith” thing proposed by Corey and others, because if I get a URL with another protocol/scheme I would mess it up. See:

If I get this URL:

ftp://something.com

With the proposed code I would add “http://” to the start and mess it up.

The solution I found:

import urlparse

def parse_url(url):  # illustrative name
    if not urlparse.urlparse(url).scheme:
        url = "http://" + url  # no scheme given: assume http
    return urlparse.urlparse(url)

Something to note:

I do some validation first, and if no scheme is given I consider it to be http://

1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 5:34 am

    The documentation has this exact example, just below the text you pasted. Adding '//' if it’s not there will get what you want. If you don’t know whether it’ll have the protocol and '//', you can use a regex (or even just see if it already contains '//') to determine whether or not you need to add it.
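
    The '//' check above could be sketched as follows, assuming Python 3's urllib.parse (the urlparse module on Python 2). The helper name netloc_of is hypothetical. Note the limitation: a URL with no scheme that happens to contain '//' later on (for example inside a query string) would not get the prefix.

```python
from urllib.parse import urlparse

def netloc_of(url):
    """Return the host part of url (netloc_of is an illustrative name)."""
    # Only add '//' when the URL does not already contain it,
    # so inputs like 'ftp://something.com' are left untouched.
    if "//" not in url:
        url = "//" + url
    return urlparse(url).netloc

print(netloc_of("www.amazon.com/dp/0596158106"))  # www.amazon.com
print(netloc_of("ftp://something.com"))           # something.com
```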

    Your other option would be to use split('/') and take the first element of the list it returns, which will ONLY work when the URL has no protocol or '//'.
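
    A short sketch of the split('/') option, which also shows why it only works for URLs without a protocol. The function name first_segment is made up for illustration.

```python
def first_segment(url):
    """Return everything before the first '/' (illustrative helper)."""
    return url.split("/")[0]

# Works when the URL starts directly with the host.
print(first_segment("www.amazon.com/Programming-Python-Mark-Lutz/dp/0596158106"))  # www.amazon.com

# With a protocol, the first element is the scheme plus ':', not the host.
print(first_segment("http://www.amazon.com/dp/0596158106"))  # http:
```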

    EDIT (adding for future readers): a regex for detecting the protocol would be something like re.match('(?:http|ftp|https)://', url)
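
    One way to combine that regex with the parse step, as a sketch only. It assumes Python 3's urllib.parse, and both the function name parse_with_default_scheme and the default of "http" are assumptions for illustration. Any scheme not listed in the pattern (for example sftp://) would still get the default prepended, so the list would need to cover every scheme you expect.

```python
import re
from urllib.parse import urlparse

# Same pattern as above, compiled once for reuse.
SCHEME_RE = re.compile(r"(?:http|ftp|https)://")

def parse_with_default_scheme(url, default="http"):
    """Parse url, prepending default + '://' when no known scheme is found."""
    if not SCHEME_RE.match(url):
        url = default + "://" + url
    return urlparse(url)

print(parse_with_default_scheme("www.amazon.com/dp/0596158106").netloc)  # www.amazon.com
print(parse_with_default_scheme("ftp://something.com").scheme)           # ftp
```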

© 2021 The Archive Base. All Rights Reserved