I have seen in opensource projects by big companies on github that they do the simple programs in as small steps as possible.
They try to minimize the lines of code by using advanced methods, which make even the simple function calls appear complex.
How can I understand these things. It is not about advanced concepts, but doing simple things in efficient way, and that makes the code look hard to understand.
As an example this is the code of crawlerSpider from scrapy
def identity(x):
return x
class Rule(object):
def __init__(self, link_extractor, callback=None, cb_kwargs=None, follow=None, process_links=None, process_request=identity):
self.link_extractor = link_extractor
self.callback = callback
self.cb_kwargs = cb_kwargs or {}
self.process_links = process_links
self.process_request = process_request
if follow is None:
self.follow = False if callback else True
else:
self.follow = follow
This is where they used the rule
rules = ()
for link in links:
r = Request(url=link.url, callback=self._response_downloaded)
r.meta.update(rule=n, link_text=link.text)
yield rule.process_request(r)
Now I am not able to understand what is the use of identity function there its doing nothing
and process_request = identity so it means
rule.process_request(r) = identity(r) whcih returns r so basically that line equals to
yield r. why can’t he write that simple line instead of identity.
There are many things like this, which I am not able to comprehend.
Is there any book which explains writing programs that way. I don’t like books where they start with data types go chapter by chapter and in the end I still don’t see even a single program like we have on github.
The identity function is a sort of technicality, that lets you put your data through a process that expects a transformation when you really don’t want one.
For instance, if you have a function that expects an array and another function to map over the array, but you don’t want the mapping (but everything else the function does), then you pass it the identity function.
This pattern is very common in languages with good support for higher order abstractions, especially functional languages. It’s not limited to Python.