I want to extract all the URLs from an XML file, excluding the tracking code in each URL.
Here’s an example URL; they all follow the same format:
http://www.domain.com.au/category/pXXXXXX?uni_id=XXXXXX&cid=1_demo_1
So the only thing that changes between the URLs is XXXXXX, which is a numerical value.
The end result I want is:
http://www.domain.com.au/category/pXXXXXX
I have tried to use preg_replace in the code below, but it ended up replacing the whole URL with a random (I think) number:
$data = preg_replace('/http\:\/\/www\.domain\.com.au\/[^\?]+([^.]+)/','',$data);
Match URLs in the XML with preg_match(). Then you should use preg_replace(), which should only match the part of the string that needs to be removed.
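A minimal sketch of that two-step approach follows. The sample XML, the `<link>` element name, and the variable names are assumptions made for illustration, since the real file is not shown. It uses preg_match_all() rather than preg_match() so that every URL is collected, and it allows for `&` being written as `&amp;` inside XML.

```php
<?php
// Hypothetical sample input; the structure of the real XML file is not shown in the question.
$data = '<items>'
      . '<link>http://www.domain.com.au/category/p123456?uni_id=123456&amp;cid=1_demo_1</link>'
      . '<link>http://www.domain.com.au/category/p654321?uni_id=654321&amp;cid=1_demo_1</link>'
      . '</items>';

// Step 1: collect every URL, including its query string.
// The character class stops at whitespace, '<' or a quote, i.e. where the URL ends in the XML.
preg_match_all(
    '~http://www\.domain\.com\.au/category/p\d+\?[^\s<"\']*~',
    $data,
    $matches
);

// Step 2: match only the tracking part and replace it with nothing.
// preg_replace() accepts an array subject and returns an array of cleaned strings.
$urls = preg_replace('/\?uni_id=\d+&(?:amp;)?cid=\w+/', '', $matches[0]);

print_r($urls);
```

Because the second pattern covers only the query string, the path in front of it is left untouched. The pattern in the question matched from `http://` onwards, so replacing it with an empty string removed the URL itself.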