I want to crawl some data out of a phpBB forum I’m a member of

Asked: May 17, 2026 (2026-05-17T02:56:12+00:00)

I want to crawl some data out of a phpBB forum I’m a member of, but that requires being logged in. I can log in using cURL, but if I then try to crawl the data, the forum still says I need to log in before viewing the page. Is it possible to log in using cURL and retain that session for further requests?

Another thing: that forum usually shows a confirmation page after logging in and then, after 5 seconds, automatically redirects to the index page. If I log in using cURL, my script also follows that header location and shows me that page.

Is there any workaround for this?

1 Answer

Editorial Team · Answer 1 · 2026-05-17T02:56:13+00:00

This is what usually works for me:


$timeout = 5;
$file = 'cookies.jar';
$this->handle = curl_init('');
curl_setopt($this->handle, CURLOPT_COOKIEFILE, $file);     // read cookies saved by earlier runs
curl_setopt($this->handle, CURLOPT_COOKIEJAR, $file);      // write cookies back when the handle is closed
curl_setopt($this->handle, CURLOPT_RETURNTRANSFER, 1);     // return the page as a string instead of printing it
curl_setopt($this->handle, CURLOPT_FOLLOWLOCATION, 1);     // follow HTTP redirects
curl_setopt($this->handle, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.6) Gecko/20100625 Firefox/3.6.6 (.NET CLR 3.5.30729)");
curl_setopt($this->handle, CURLOPT_TIMEOUT, (int) round($timeout));         // limit for the whole request, in seconds
curl_setopt($this->handle, CURLOPT_CONNECTTIMEOUT, (int) round($timeout));  // limit for establishing the connection

And I generally use it like this:


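// grab_first_page(), not_logged_in() and the others are placeholders for your own functions
$now = grab_first_page();
if (not_logged_in($now)) {
    send_login_info();
    $now = grab_first_page(); // fetch the page again to see whether the login worked
}
if (not_logged_in($now)) { end_of_script_with_error(); }
// rest of script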

This way the cookies are kept across sessions and the script does not have to log in every time it does something.
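The flow above could be filled in roughly as follows. This is only a sketch under assumptions: the helper names (make_handle, fetch, crawl_index), the cookie file location, the "mode=logout" marker used to detect a logged-in page, and the login field names (modelled on the stock phpBB 3 login form) are all illustrative and may not match your forum. Some phpBB versions also expect hidden token fields copied from the login page.

```php
<?php
// Hypothetical location of the shared cookie file.
const COOKIE_FILE = __DIR__ . '/cookies.jar';

/** Create a curl handle that loads and stores cookies in $cookieFile. */
function make_handle(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_COOKIEFILE     => $cookieFile, // load cookies from earlier runs
        CURLOPT_COOKIEJAR      => $cookieFile, // written when the handle is closed or destroyed
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 10,
    ]);
    return $ch;
}

/** GET $url, or POST to it when $post is given, reusing the same handle. */
function fetch($ch, string $url, ?array $post = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($post !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));
    } else {
        curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back to GET after a POST
    }
    $body = curl_exec($ch);
    return $body === false ? '' : $body;
}

/** Assumed heuristic: a logged-in phpBB page contains a logout link. */
function not_logged_in(string $html): bool
{
    return stripos($html, 'mode=logout') === false;
}

/** The whole flow; $base is the forum root URL and must end with a slash. */
function crawl_index(string $base, string $user, string $pass): string
{
    $ch   = make_handle(COOKIE_FILE);
    $page = fetch($ch, $base . 'index.php');
    if (not_logged_in($page)) {
        // Assumed field names; check the real login form of your forum.
        fetch($ch, $base . 'ucp.php?mode=login', [
            'username' => $user,
            'password' => $pass,
            'login'    => 'Login',
        ]);
        $page = fetch($ch, $base . 'index.php'); // re-check with the new cookies
    }
    if (not_logged_in($page)) {
        throw new RuntimeException('Login failed');
    }
    return $page;
}
```

The important point is that the same handle (and therefore the same cookies) is used for the login request and for every request after it.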

--- Explanation ---

I’m using an object, but you can replace $this->handle with a plain variable such as $mycurl. The lines would then look like this:


$mycurl = curl_init('');
curl_setopt($mycurl, CURLOPT_COOKIEFILE, $file);

What the first code block above does:
– 3rd line: initializes “a curl instance” (to keep it simple).
– 4th and 5th lines: save cookies to a file. Curl works much like a browser, so when you log in to a page with curl it keeps the cookies holding the authentication data in memory. I’m telling it to save them to a file so that the next time I run the script it has the same cookies and does not need to authenticate again. You can also have several scripts share the same cookie file, plus one login script that you run every 24 hours or whenever you’re logged out.
– other settings:
* followlocation – when curl receives an HTTP redirect, it returns the page it was redirected to, not the redirect response itself. Set it to 0 if you do not want the redirect followed.
* useragent – curl presents itself as Firefox
* timeout – the maximum time the whole request may take; connecttimeout is how long to wait for the connection to be established. 5 or 10 seconds is usually more than enough.
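To make these settings easier to toggle, they can be collected in one array and applied with curl_setopt_array. The following is a minimal sketch; the function name build_curl_options and its parameters are hypothetical, and the default values are just examples.

```php
<?php
/**
 * Build a curl option array (hypothetical helper, not part of any library).
 *
 * $followRedirects = false makes curl return the redirect response itself
 * instead of the page it points to.
 */
function build_curl_options(
    string $cookieFile,
    bool $followRedirects = true,
    int $connectTimeout = 5,
    int $totalTimeout = 20
): array {
    return [
        CURLOPT_COOKIEFILE     => $cookieFile,      // cookies to send
        CURLOPT_COOKIEJAR      => $cookieFile,      // where received cookies are saved
        CURLOPT_RETURNTRANSFER => true,             // return the body as a string
        CURLOPT_FOLLOWLOCATION => $followRedirects, // follow Location headers or not
        CURLOPT_MAXREDIRS      => 5,                // avoid endless redirect loops
        CURLOPT_CONNECTTIMEOUT => $connectTimeout,  // seconds to establish the connection
        CURLOPT_TIMEOUT        => $totalTimeout,    // seconds for the whole request
    ];
}
```

It could be used as curl_setopt_array($mycurl, build_curl_options('cookies.jar', false)); to stop curl from following the redirect after login. With redirects disabled, curl_getinfo($mycurl, CURLINFO_REDIRECT_URL) reports where the server wanted to send you, so the script can decide for itself whether to request that page.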

I have put a simple class I use here: http://pastebin.com/Rfpc103X

You can use it like this:



// -- initialize curl
$ec=new easyCurl;

// -- set some options
//if the file you are in right now is named file_a.php it will create a file_a.jar cookie file
$ec->start(str_replace('.php','.jar',__FILE__));
$ec->headersPrepare(false);
$ec->prepareTimeOut(20);

$url='http://www.google.com/';

// --- set url
$ec->curlPrepare($url);

// --- get the actual data
$page=$ec->grab();

echo $page;

// to send GET data
$get_data=array('id'=>10);
$ec->curlPrepare($url,$get_data);

// and to post data
$post_data=array('user'=>'blue','password'=>'black');
$ec->curlPrepare($url,array(),$post_data);

It automatically handles the settings for POST/GET and the other options I usually encounter. I hope the examples above are useful to you. Good luck.
