I need to uniquely identify and store some URLs. The problem is that sometimes

Question

Asked: May 18, 2026

I need to uniquely identify and store some URLs. The problem is that they sometimes contain “..” segments, as in http://somedomain.com/foo/bar/../../some/url, which is equivalent to http://somedomain.com/some/url, if I’m not mistaken.

Is there a Python function or some trick to resolve these URLs?

1 Answer

Editorial Team · Answer 1 · 2026-05-18T08:25:00+00:00

There’s a simple solution using urllib.parse.urljoin:

>>> from urllib.parse import urljoin
>>> urljoin('http://www.example.com/foo/bar/../../baz/bux/', '.')
'http://www.example.com/baz/bux/'

However, if there is no trailing slash (the last component is a file, not a directory), the last component will be removed.
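To make the caveat concrete, here is a minimal sketch that runs both cases through the same urljoin call. The helper name resolve_with_urljoin is hypothetical, introduced only for illustration:

```python
from urllib.parse import urljoin

def resolve_with_urljoin(url):
    # Hypothetical helper: resolve dot segments by joining the URL with '.'
    return urljoin(url, '.')

# Trailing slash: the last component is treated as a directory and kept.
dir_result = resolve_with_urljoin('http://www.example.com/foo/bar/../../baz/bux/')
print(dir_result)   # http://www.example.com/baz/bux/

# No trailing slash: '.' resolves to the containing directory,
# so 'file.ext' is dropped from the result.
file_result = resolve_with_urljoin('http://www.example.com/some/path/../file.ext')
print(file_result)  # http://www.example.com/some/
```

So the urljoin trick is only safe when every URL is known to end in a slash.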

To handle that case, use urlparse to extract the path, normalize its components with posixpath.normpath (the POSIX flavor of os.path.normpath), restore the trailing slash that normpath strips, and then join the URL back together. The following is doctestable:

from urllib.parse import urlparse
import posixpath

def resolve_components(url):
    """
    >>> resolve_components('http://www.example.com/foo/bar/../../baz/bux/')
    'http://www.example.com/baz/bux/'
    >>> resolve_components('http://www.example.com/some/path/../file.ext')
    'http://www.example.com/some/file.ext'
    """
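    parsed = urlparse(url)
    if not parsed.path:
        # Nothing to normalize; normpath('') would return '.'
        return url
    new_path = posixpath.normpath(parsed.path)
    if parsed.path.endswith('/') and not new_path.endswith('/'):
        # normpath strips the trailing slash (issue1707768); restore it
        new_path += '/'
    cleaned = parsed._replace(path=new_path)
    return cleaned.geturl()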
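
Since the goal in the question is to store each URL only once, the normalized form can serve as the lookup key. The following is a rough sketch of that idea, not a definitive implementation: normalize_url and unique_urls are hypothetical names, and normalize_url is a condensed take on the same urlparse plus posixpath.normpath approach. It assumes that collapsing repeated slashes (which normpath also does) is acceptable for the URLs being stored.

```python
from urllib.parse import urlparse
import posixpath

def normalize_url(url):
    # Hypothetical helper: resolve '.' and '..' segments in the URL path.
    parsed = urlparse(url)
    if not parsed.path:
        return url  # no path, nothing to normalize
    new_path = posixpath.normpath(parsed.path)
    if parsed.path.endswith('/') and not new_path.endswith('/'):
        new_path += '/'  # normpath strips the trailing slash; restore it
    return parsed._replace(path=new_path).geturl()

def unique_urls(urls):
    # Hypothetical helper: keep the first occurrence of each normalized URL.
    seen = set()
    result = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result

candidates = [
    'http://somedomain.com/foo/bar/../../some/url',
    'http://somedomain.com/some/url',
]
print(unique_urls(candidates))  # ['http://somedomain.com/some/url']
```

This only deals with dot segments in the path. Other differences between equivalent URLs, such as host name case or default ports, are left untouched.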
