The Archive Base

Editorial Team
Asked: May 25, 2026

I can’t seem to find a question on SO about my particular problem, so forgive me if this has been asked before!

Anyway, I’m writing a script to loop through a set of URLs and give me a list of unique URLs with unique parameters.

The trouble I’m having is actually comparing the parameters to eliminate multiple duplicates. It’s a bit hard to explain, so some examples are probably in order:

Say I have a list of URLs like this:

  • hxxp://www.somesite.com/page.php?id=3&title=derp
  • hxxp://www.somesite.com/page.php?id=4&title=blah
  • hxxp://www.somesite.com/page.php?id=3&c=32&title=thing
  • hxxp://www.somesite.com/page.php?b=33&id=3

I have it parsing each URL into a list of lists, so eventually I have a list like this:

sort = [['id', 'title'], ['id', 'c', 'title'], ['b', 'id']]
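
For reference, here is a minimal sketch of how that parsing step could look, assuming Python 3 and the standard `urllib.parse` module. The names `urls` and `param_names` are illustrative and not from my actual script; `sort` is the list shown above.

```python
from urllib.parse import urlsplit, parse_qsl

# The example URLs from above (kept in their defanged "hxxp" form).
urls = [
    "hxxp://www.somesite.com/page.php?id=3&title=derp",
    "hxxp://www.somesite.com/page.php?id=4&title=blah",
    "hxxp://www.somesite.com/page.php?id=3&c=32&title=thing",
    "hxxp://www.somesite.com/page.php?b=33&id=3",
]

def param_names(url):
    """Return the query parameter names of url, in order of appearance."""
    query = urlsplit(url).query
    # keep_blank_values=True so parameters like "?id=" are not dropped
    return [name for name, _ in parse_qsl(query, keep_blank_values=True)]

# Collect each distinct list of parameter names once.
sort = []
for url in urls:
    names = param_names(url)
    if names not in sort:
        sort.append(names)

print(sort)
```

The first two URLs share the same parameter names, so they collapse into a single entry, which is why four URLs give three lists.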

I need to figure out a way to end up with just 2 lists in my list at that point:

new = [['id', 'c', 'title'], ['b', 'id']]

Right now I’ve got a bit of code that sorts it out a little. I know I’m close, and I’ve been banging my head against this for a couple of days now :(. Any ideas?

Thanks in advance! 🙂

EDIT: Sorry for not being clear! This script is aimed at finding unique entry points for web applications post-spidering. Basically if a URL has 3 unique entry points

['id', 'c', 'title']

I’d prefer that to the same link with 2 unique entry points, such as:

['id', 'title']

So I need my new list of lists to eliminate the one with 2 and prefer the one with 3 ONLY if the smaller variables are in the larger set. If it’s still unclear let me know, and thank you for the quick responses! 🙂

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 9:50 pm

    I’ll assume that a query whose parameters are a subset of another query’s counts as a “duplicate” of it (but not the other way around, of course)…

    Start by converting each query into a set and ordering them all from largest to smallest. Then add each query to a new list if it isn’t a subset of an already-added query. Since any set is a subset of itself, this logic covers exact duplicates:

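    # `sort` is the list of parameter-name lists from the question
    a = []
    # Visit the largest sets first, so every superset is kept before its subsets
    for q in sorted((set(q) for q in sort), key=len, reverse=True):
        # Keep q only if no already-kept set contains all of its parameters
        if not any(q.issubset(Q) for Q in a):
            a.append(q)
    a = [list(q) for q in a]  # Back to lists, if you want (order within each list may change)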
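
If the original parameter order matters, the same idea can be wrapped in a function that only uses sets for the comparison. This is a sketch under that assumption, not part of the answer above; the name `maximal_param_sets` is made up for illustration.

```python
def maximal_param_sets(param_lists):
    """Drop every list whose names all appear in another, already-kept list.

    Exact duplicates are dropped too, since a set is a subset of itself.
    """
    kept = []
    # Largest first; sorted() is stable, so ties keep their input order.
    for names in sorted(param_lists, key=lambda p: len(set(p)), reverse=True):
        if not any(set(names) <= set(other) for other in kept):
            kept.append(list(names))  # copy, preserving the original order
    return kept

# The list from the question
sort = [['id', 'title'], ['id', 'c', 'title'], ['b', 'id']]
new = maximal_param_sets(sort)
print(new)  # [['id', 'c', 'title'], ['b', 'id']]
```

Since the question only wants the larger list preferred for the same link, one option is to call this once per page rather than on the parameters of every URL at once.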
    
© 2021 The Archive Base. All Rights Reserved