I’ll try to explain in detail what I need: I’m parsing an RSS feed

Asked: May 31, 2026 (2026-05-31T10:40:58+00:00)

I’ll try to explain in detail what I need:

I’m parsing an RSS feed in Python using feedparser. This feed has, of course, a list of items, with title, link and description just like a common RSS feed.

On the other hand, I have a list of strings with some keywords I need to find in each item’s description.

What I need to do is find the item which has the most keyword matches.

Example:

RSS feed

<channel>
    <item>
        <title>Lion</title>
        <link>...</link>
        <description>
            The lion (Panthera leo) is one of the four big cats in the genus 
            Panthera, and a member of the family Felidae.
        </description>
    </item>
    <item>
        <title>Panthera</title>
        <link>...</link>
        <description>
            Panthera is a genus of the Felidae (cats), which contains 
            four well-known living species: the tiger, the lion, the jaguar, and the leopard.
        </description>
    </item>
    <item>
        <title>Cat</title>
        <link>...</link>
        <description>
            The domestic cat is a small, usually furry, domesticated, 
            carnivorous mammal. It is often called the housecat, or simply the 
            cat when there is no need to distinguish it from other felids and felines.
        </description>
    </item>
</channel>

Keyword list

['cat', 'lion', 'panthera', 'family']

So in this case, the item with the most (unique) matches is the first one, because it contains all 4 keywords. It doesn’t matter that it says ‘cats’ instead of just ‘cat’; I just need to find the literal keyword inside the string.

Let me clarify that even if some description contained the ‘cat’ keyword 100 times (and none of the other keywords), it would not be the winner, because I’m looking for the most distinct keywords contained, not the most times a keyword appears.

Right now, I’m looping over the rss items and doing it “manually”, counting the times a keyword appears (but I’m having the problem mentioned in the above paragraph).

I’m very new to Python and I come from a different kind of language (C#), so I’m sorry if this is pretty trivial.

How would you approach this problem?

1 Answer

Editorial Team · Answer 1 · 2026-05-31T10:41:00+00:00

texts = ["The lion (Panthera leo) ...", "Panthera ...", "..."]
keywords = ['cat', 'lion', 'panthera', 'family']

# counts how many distinct keywords occur in the text
# (case-insensitive substring test, so 'cat' also matches 'cats')
def matches(text):
    lowered = text.lower()
    return sum(word in lowered for word in keywords)

# or inline that helper function as a lambda:
# matches = lambda text: sum(word in text.lower() for word in keywords)

# print the text with the highest number of matching keywords
# (print is a function in Python 3, so the parentheses are required)
print(max(texts, key=matches))
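
The snippet above works on bare strings, while the question is about feed items. Here is a minimal sketch of the same idea applied to whole items, so that the winning item's title and link stay available. The `entries` list is a hypothetical stand-in built from plain dicts; with feedparser it would presumably come from `feedparser.parse(url).entries`, whose items should allow the same key-based access. The name `count_keywords` is made up for this example.

```python
# Hypothetical stand-ins for feed items (assumption: real feedparser entries
# expose 'title' and 'description' through the same dict-style access).
entries = [
    {"title": "Lion",
     "description": "The lion (Panthera leo) is one of the four big cats in "
                    "the genus Panthera, and a member of the family Felidae."},
    {"title": "Panthera",
     "description": "Panthera is a genus of the Felidae (cats), which contains "
                    "four well-known living species: the tiger, the lion, "
                    "the jaguar, and the leopard."},
    {"title": "Cat",
     "description": "The domestic cat is a small, usually furry, domesticated, "
                    "carnivorous mammal."},
]
keywords = ['cat', 'lion', 'panthera', 'family']


def count_keywords(entry):
    """Return how many distinct keywords occur in the entry's description."""
    description = entry.get("description", "").lower()
    # Each keyword adds at most 1, no matter how often it is repeated.
    return sum(word in description for word in keywords)


# max() returns the whole item, not just its description
best = max(entries, key=count_keywords)
print(best["title"], count_keywords(best))
```

Because each keyword contributes at most 1 to the sum, a description that repeats 'cat' 100 times still scores 1. If two items tie, `max` returns the first one it encounters.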
