I want to parse a site with PHP DOMDocument

Asked: May 22, 2026

I want to parse a site using PHP's DOMDocument, since it is faster and easier to use. Some of you have convinced me! One question, since I am a PHP newbie: can I apply the XPath expressions I collected (see below)?

Example: http://buergerstiftungen.de/cps/rde/xchg/SID-F8780E81-ABF20567/buergerstiftungen/hs.xsl/db.htm

Goal: fetch all results (approx. 213 different records) and parse them, so that I can store the data in a local MySQL database.

By the way, here are two result pages:

http://buergerstiftungen.de/cps/rde/xchg/SID-F8780E81-ABF20567/buergerstiftungen/hs.xsl/db_20302.htm
http://buergerstiftungen.de/cps/rde/xchg/SID-F8780E81-ABF20567/buergerstiftungen/hs.xsl/db_20289.htm

As you can see, a lot of information is stored there.

I first tried to write a scraper in Perl, but I had no luck; Perl is very hard for newbies. Afterwards I tried to write a parser in PHP, which is a bit easier. But the site (see the detail result pages) is a bit complex. How can I parse these pages to get the dataset into a local MySQL database? That would give me more options for retrieval.
I want to have the data locally (on my openSUSE Linux system, version 11.3) in a MySQL database.

So there are three parts:

fetching
parsing
storing (in MySQL: that is creating a MySQL-dump)
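
For the fetching part, the detail pages first have to be discovered. The following is only a sketch: it assumes that the overview page links directly to detail pages named like db_20289.htm (as the two example URLs suggest), which is not verified here. The function name collectDetailLinks and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: collect links to detail pages (db_<number>.htm)
// from the HTML of an overview page.
function collectDetailLinks(string $html): array
{
    $dom = new DOMDocument();
    // Collect warnings about invalid markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = [];
    foreach ($xpath->query("//a[contains(@href, 'db_')]") as $anchor) {
        $href = $anchor->getAttribute('href');
        // Keep only links that end in db_<digits>.htm.
        if (preg_match('/db_\d+\.htm$/', $href)) {
            $links[$href] = true; // array keys de-duplicate repeated links
        }
    }
    return array_keys($links);
}

// Made-up sample markup standing in for the downloaded overview page.
$overviewHtml = '<html><body>'
    . '<a href="db_20302.htm">A</a>'
    . '<a href="db_20289.htm">B</a>'
    . '<a href="db_20302.htm">A again</a>'
    . '<a href="db.htm">Overview</a>'
    . '</body></html>';

print_r(collectDetailLinks($overviewHtml));
```

Each collected link could then be fetched with curl and handed to the parsing step. Relative links would still need to be combined with the site's base URL.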

Since I have very little experience with XPath, I use an XPather tool in my Mozilla browser. But I am not sure how to apply the expressions it gives me (see the data I gathered below).
Perhaps some of you can help me here and show me how to use them in parser code.

I would love to hear from you.

Here are some details:
For the results (from the approx. 213 different records), I gathered some XPath expressions from one of the two result pages:

Example: Bürgerstiftung Wiesloch
http://buergerstiftungen.de/cps/rde/xchg/SID-A7DCD0D1-702CE0FA/buergerstiftungen/hs.xsl/db_20289.htm

/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='marginalblock']/div[1]/p

1. Gründungsgeschichte
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='contentblock']/div/p[1]/strong

2. Kurzvorstellung/Ziele
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='contentblock']/div/p[2]/span[2]/span/b

3. Projekte
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='contentblock']/div/p[3]/span[2]/span/strong

Kontakt:
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='marginalblock']/div[1]/h6

Question: how do I apply these XPath expressions with libxml in order to get the parser part up and running? I am an XPath beginner!

Looking forward to hearing from you!
zero

PS: if I need to add more information, or ask more properly, please let me know! Sorry for being the newbie! ;-)

PPS, an update: I have the MySQL part. It can look like this:

CREATE TABLE IF NOT EXISTS `address` (
`id` int(4) NOT NULL auto_increment,
`name` varchar(30) default NULL,
`contact-details` varchar(30) default NULL,
`street` varchar(30) default NULL,
`postal-code` varchar(30) default NULL,
`town` varchar(30) default NULL,
`phone` varchar(30) default NULL,
`email` varchar(30) default NULL,
`homepage` varchar(30) default NULL,
`summary` varchar(30) default NULL,
`projects` varchar(30) default NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=9 ;

Something like this would fit my needs.
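
For the storing part, a prepared statement is one common way to insert the scraped values into a table like the one above. This is a minimal sketch, not a definitive implementation: the helper name saveAddress, the keys of the record array and the connection details in the comments are assumptions, and only four of the columns are filled.

```php
<?php
// Hypothetical helper: insert one scraped record into the `address` table.
function saveAddress(PDO $pdo, array $record): void
{
    // Hyphenated column names need backticks in SQL;
    // PDO placeholder names must not contain hyphens.
    $sql = 'INSERT INTO `address` (`name`, `contact-details`, `summary`, `projects`)
            VALUES (:name, :contact, :summary, :projects)';
    $stmt = $pdo->prepare($sql);
    $stmt->execute([
        ':name'     => $record['name'] ?? null,
        ':contact'  => $record['contact'] ?? null,
        ':summary'  => $record['summary'] ?? null,
        ':projects' => $record['projects'] ?? null,
    ]);
}

// Assumed connection details; replace them with your own.
// $pdo = new PDO('mysql:host=localhost;dbname=test;charset=utf8mb4', 'user', 'password');
// $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// saveAddress($pdo, ['name' => 'Example foundation', 'summary' => 'Example text']);
```

Note that varchar(30) is likely too short for fields such as summary and projects; a TEXT column may fit longer passages better.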

Update: many thanks, Lenzai, for the quick answer.

You suggest trying something like this:

$url="http://...";
$xpath_query="/html/body/...";

/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='marginalblock']/div[1]/p
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='contentblock']/div/p[1]/strong
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='contentblock']/div/p[2]/span[2]/span/b
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='contentblock']/div/p[3]/span[2]/span/strong
/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']/div[@id='marginalblock']/div[1]/h6

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the page instead of printing it
$res = curl_exec($ch);
$dom = new DOMDocument();
$dom->loadHTML($res);
$xpath = new DOMXPath($dom);
$node = $xpath->query($xpath_query)->item(0);
echo $node->nodeValue;

I have curl enabled here, so that is no problem. And I should enter the XPath expressions in this line:

$xpath_query="/html/body/...";

Question: should I enter all the XPath expressions mentioned above, from 1. to 3. and so forth? What does the final code look like? Can you help me here? I am very new to PHP.

Looking forward to hearing from you! Many thanks for any help!

zero

1 Answer

Editorial Team · Answer 1 · 2026-05-22T21:15:32+00:00

Try something like this:

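$url = "http://...";
$xpath_query = "/html/body/...";

$ch = curl_init($url);
// Without this option curl_exec() prints the page and returns true instead of the HTML.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$res = curl_exec($ch);

$dom = new DOMDocument();
// Collect warnings about invalid markup instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($res);
$xpath = new DOMXPath($dom);
// item(0) is null if the query matches nothing.
$node = $xpath->query($xpath_query)->item(0);
echo $node ? $node->nodeValue : '';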

You just need to enable the curl extension in your php.ini.
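
To use several XPath expressions rather than a single one, one option is to keep them in an array and run each against the same document. The following is a sketch under assumptions: the helper name extractFields, the field names and the sample markup are made up, and only the XPath expressions come from the question. Whether they match the live pages is not verified here.

```php
<?php
// Hypothetical helper: run several XPath queries against one HTML page
// and return the text of the first match for each (or null).
function extractFields(string $html, array $queries): array
{
    $dom = new DOMDocument();
    // Collect warnings about invalid markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $fields = [];
    foreach ($queries as $name => $query) {
        $nodes = $xpath->query($query);
        // query() returns false for an invalid expression
        // and an empty list when nothing matches.
        $node = ($nodes !== false) ? $nodes->item(0) : null;
        $fields[$name] = ($node !== null) ? trim($node->nodeValue) : null;
    }
    return $fields;
}

// Field names are assumptions; the expressions are the ones from the question.
$base = "/html/body/div[@id='main']/div[@id='wrapper']/div[@id='inner']";
$queries = [
    'contact'         => $base . "/div[@id='marginalblock']/div[1]/p",
    'contact_heading' => $base . "/div[@id='marginalblock']/div[1]/h6",
    'history'         => $base . "/div[@id='contentblock']/div/p[1]/strong",
    'summary'         => $base . "/div[@id='contentblock']/div/p[2]/span[2]/span/b",
    'projects'        => $base . "/div[@id='contentblock']/div/p[3]/span[2]/span/strong",
];

// Made-up sample markup standing in for a downloaded detail page.
$sampleHtml = '<html><body><div id="main"><div id="wrapper"><div id="inner">'
    . '<div id="marginalblock"><div><h6>Kontakt:</h6>'
    . '<p>Sample Street 1, 12345 Sample Town</p></div></div>'
    . '<div id="contentblock"><div><p><strong>1. History</strong></p></div></div>'
    . '</div></div></div></body></html>';

print_r(extractFields($sampleHtml, $queries));
```

Fields whose expression matches nothing come back as null, so a page with a slightly different layout does not stop the run. In a real run, $sampleHtml would be replaced by the string returned from curl_exec().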
