I want to parse a robots.txt file and extract the sitemap reference. Assuming that

Asked: May 29, 2026

I want to parse a robots.txt file and extract the sitemap reference. Assuming that the file is something like this;

stuff
foobar
Sitemap: http://www.cgdomestics.co.uk/sitemap.xml
hello world
more stuff

I’m trying to use regex to extract exactly this;

http://www.cgdomestics.co.uk/sitemap.xml

So far I have this PHP code;

<?php
  $robots_url = "http://www.cgdomestics.co.uk/robots.txt";
  $robots_file = file_get_contents($robots_url);
  $pattern = "/Sitemap: .*/";
  $i = preg_match($pattern, $robots_file, $match);
  echo $match[0];
?>

The output of the above is;

Sitemap: http://www.cgdomestics.co.uk/sitemap.xml

but I want it to output only;

http://www.cgdomestics.co.uk/sitemap.xml

Can I use regex to return exactly what I want or do I need to do another step to remove the “Sitemap: ” part? Or is there a better way to do this?

As you can probably tell I’m an infrequent user of PHP and regex.

Thanks.

Nigel


1 Answer

Editorial Team · Answer 1 · Added an answer on May 29, 2026 at 11:39 pm

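Yes, regex alone can do it. Wrap the part you want in a capturing group (a subpattern) and read it from the matches array: $match[0] holds the whole match, $match[1] holds the first group.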

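<?php
  $robots_url = "http://www.cgdomestics.co.uk/robots.txt";
  $robots_file = file_get_contents($robots_url);
  // The parentheses capture everything after "Sitemap: " up to the end of the line.
  $pattern = "/Sitemap: ([^\r\n]*)/";
  // file_get_contents() returns false on failure; preg_match() returns 1 on a match.
  if ($robots_file !== false && preg_match($pattern, $robots_file, $match)) {
    echo $match[1];
  }
?>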
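
On the "is there a better way" part of the question, here is one possible variant, offered as a sketch rather than a definitive parser. It assumes a robots.txt file may contain several Sitemap lines, may write the directive in any letter case, and may vary the whitespace around the colon. The function name extract_sitemaps and the $sample string are made up for illustration; preg_match_all() is the standard PHP function for collecting every match.

```php
<?php
// Hypothetical helper (not part of the original answer): returns every
// sitemap URL found in the text of a robots.txt file.
function extract_sitemaps(string $robots_txt): array
{
    // m: ^ matches at the start of each line; i: "Sitemap" matches in any case.
    // [ \t]* allows optional spaces or tabs; (\S+) captures the URL itself.
    $pattern = '/^[ \t]*Sitemap[ \t]*:[ \t]*(\S+)/mi';
    preg_match_all($pattern, $robots_txt, $matches);
    // $matches[1] lists the captured group for every match (empty if none).
    return $matches[1];
}

// Sample input modelled on the file shown in the question.
$sample = "stuff\nfoobar\nSitemap: http://www.cgdomestics.co.uk/sitemap.xml\nhello world\nmore stuff\n";
print_r(extract_sitemaps($sample));
```

Because \S+ stops at the first whitespace character, a trailing carriage return from Windows-style line endings is not included in the captured URL.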
