I have a problem. My customers (and potential customers) are asking me whether my crawler-based software (sitemapper, website cloner and similar) can access their SharePoint websites.
However, I am not using SharePoint myself. All I think I know is that it is a general-purpose collaboration / document-sharing platform that can also host intranet/internet websites? I also believe SharePoint runs on top of IIS, right? So there ought to be HTTP access?
Currently my software supports HTTP, HTTPS and normal disk/network paths. It can log in to most HTTPS websites with no problems through POST forms and cookies. It also supports "old-school" basic authentication.
But SharePoint seems to be something different. How can my software access intranet SharePoint websites over HTTP(S)? I would be happy if I could provide my users with a guideline.
(I would think it would just be a matter of logging in to the computer running my software...? And then possibly giving it the correct address to crawl?)
Agreed. I think the easiest solution is to run your software under an Active Directory (AD) account that has the required access privileges to the sites you want to crawl. That way the credentials of the currently logged-in user are passed through.
This will only work if the intranet sites are indeed using Windows Authentication (most probably they are), but some intranet/extranet sites might be using forms-based authentication (FBA) or other authentication methods, so keep this in mind.
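To tell which of these cases a given site falls into, the crawler could look at the first unauthenticated response before it decides how to log in. Below is a minimal sketch in Python, not a definitive implementation. The function name `classify_auth` and the login-page markers are my own assumptions for illustration, and the redirect check is only a heuristic; real sites may use other login URLs. It also assumes each `WWW-Authenticate` header value carries a single challenge.

```python
from typing import Sequence

# Hypothetical markers for a forms login redirect; adjust to the sites you see.
LOGIN_MARKERS = ("login", "authenticate", "_forms")


def classify_auth(status: int,
                  www_authenticate: Sequence[str],
                  location: str = "") -> str:
    """Guess the authentication method from an unauthenticated response.

    status           -- HTTP status code of the response
    www_authenticate -- all WWW-Authenticate header values (one per header)
    location         -- value of the Location header, if any
    """
    # The scheme is the first token of each challenge, e.g. 'Basic realm="x"'.
    schemes = {
        value.split(None, 1)[0].rstrip(",").lower()
        for value in www_authenticate
        if value.strip()
    }

    if status == 401:
        # Negotiate (Kerberos/NTLM) or NTLM means Windows Authentication.
        if schemes & {"negotiate", "ntlm"}:
            return "windows"
        if "basic" in schemes:
            return "basic"
        return "unknown"

    # Forms-based authentication usually answers with a redirect to a login page.
    if status in (301, 302, 303, 307, 308):
        if any(marker in location.lower() for marker in LOGIN_MARKERS):
            return "forms"
        return "unknown"

    if 200 <= status < 300:
        return "anonymous"

    return "unknown"
```

With a result of "windows" the crawler would rely on the account it runs under, with "basic" or "forms" it could fall back to the login methods it already supports, and "unknown" is a signal to ask the user how the site is set up.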