I code in Python and know very little about HTML, MySQL, JavaScript, or other web and database languages.
I’m using Python’s urllib module to retrieve web page source code, and I would like to know if there is a way to identify whether a webpage has dynamic content. By dynamic content I mean any autonomous change to the source code that does not derive from user input, for example an advert on the page that changes every 10 minutes. Even if I load the page twice and compare the source code, that will not necessarily reveal that the page is in fact dynamic. I’m interested in knowing whether there are any ‘keywords’ I can look out for in the source code that would indicate that the webpage uses dynamic content.
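For reference, the load-twice comparison mentioned above could be sketched roughly as follows. This is a minimal sketch assuming Python 3's urllib.request; the function names fetch_source and sources_differ are illustrative, not from any library.

```python
import hashlib
import urllib.request


def fetch_source(url, timeout=10):
    """Download the raw bytes of a page (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def sources_differ(first, second):
    """Return True if two downloaded sources (bytes) are not identical."""
    # Comparing digests is equivalent to comparing the bytes directly,
    # but a digest is convenient to store between runs.
    return hashlib.sha256(first).digest() != hashlib.sha256(second).digest()
```

Calling fetch_source twice on the same URL and passing both results to sources_differ gives the comparison. A False result does not show that the page is static: an advert that changes every 10 minutes will look identical in two fetches made a few seconds apart.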
Thanks
update:
I don’t claim to know anything about JavaScript, but I found the following code in a page which I know is dynamic, although its source often does not reveal it:
document.write('<script language="JavaScript" src="http://ad.doubleclick.net...
Could document.write be a good keyword for identifying dynamic pages?
This is a very hard thing to do. Basically, you would look for Ajax requests in the page’s scripts and see where they lead. If you want to parse that dynamic content, you would have to use a JavaScript interpreter or something that loads the page the way a browser does. I can’t see any other solution.
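As a rough starting point, a keyword scan of the downloaded source might look like the sketch below. The list of markers and the name find_dynamic_hints are my own assumptions, not an established method. A match only suggests that scripts may change the page after it loads, and a page with no matches can still be dynamic on the server side.

```python
import re

# Heuristic markers only (an assumed, incomplete list). Their presence
# suggests, but does not prove, that the page changes after loading.
DYNAMIC_HINTS = {
    "document.write": re.compile(r"document\.write\s*\("),
    "XMLHttpRequest": re.compile(r"XMLHttpRequest"),
    "external script": re.compile(r"<script[^>]+src\s*=", re.IGNORECASE),
    "iframe": re.compile(r"<iframe\b", re.IGNORECASE),
    "timer": re.compile(r"\bset(?:Timeout|Interval)\s*\("),
}


def find_dynamic_hints(html):
    """Return the sorted names of all markers found in the page source."""
    return sorted(
        name for name, pattern in DYNAMIC_HINTS.items() if pattern.search(html)
    )
```

The function takes the page source as a string, so bytes returned by urllib need decoding first. It cannot see content that an external script loads later; for that you would still need a JavaScript interpreter or a real browser.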
Good luck.