The Archive Base Latest Questions

Editorial Team
Asked: May 26, 2026 (2026-05-26T20:41:07+00:00)

I’m having an issue properly detecting my user agent against a string of agents,

I’m having an issue properly detecting my user agent against a string of agents.
Even though the agent is listed and run through preg_match, no matter what bot I try
to emulate I can never get a positive. My current HTTP_USER_AGENT is

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) 

My $str variable looks like this:

/(abcdatos botlink|ariadne|aspider|atn worldwide|auresys|acme.spider|ahoy!|alkaline|alkalinebot|anthill|arachnophilia|arale|araneo|araybot|architextspider|aretha|ask jeeves|askjeeves|atomz|bspider|backrub|bay spider|bayspider|big brother|bjaaland|blackwidow|bloodhound|borg-bot|botlink|boxseabot|cactvs chemistry spider|cmc|calif|cassandra|checkbot|christcrawler.com|collective|combine system|computingsite robi|conceptbot|confuzzledbot|coolbot|cusco|cyberspyder|cydralspider|diibot|dnabot|dwcp|deweb|desert realm spider|die blinde kuh|dienstspider|digger|digimarc|digital integrity robot|direct hit grabber|download express|dragonbot|eit link verifier robot|elfinbot|esi|esirover|esismartspider|ebiness|emacs-w3|esther|evliya celebi|fdse|fastcrawler|felix ide|fetchrover|fish search|fluid dynamics robot|fouineur|freecrawl|funnelweb|gcreep|geneva|getbot|geturl|getterrobo-plus|getterroboplus puu|golem|googlebot|grapnel|griffon|gromit|gulliver|gulper|hi (html index) search|hku www octopus|htmlgobble|hambot|harvest|hometown spider pro|hulud|hyper-decontextualizer|hämähäkki|i, robot|ibm_planetwide|ingrid|ilse|imagelock|incywincy|infoseek robot 1.0|infospiders|informant|infoseek sidewinder|ingrid|inktomi slurp|inspector web|intelliagent|internet cruiser robot|internet shinchakubin|iron33|israeli-search|jbot|jbot java web robot|jcrawler|javabee|jeeves|jobo|jobo java web robot|jobot|joebot|jumpstation|kdd-explorer|kit-fireball|ko_yappo_robot|katipo|kilroy|lwp|labelgrab|labelgrabber|link validator|linkscan|linkscan server|linkscan workstation|linkwalker|linkidator|lockon|lycos|momspider|msnbot|msnbot|mac wwwworm|magpie|mattie|mediafox|merzscope|mindcrawler|monster|motor|mozilla 3.01 pbwf|mozilla|muncher|muninn|muscat ferret|muscatferret|mwd.search|mwdsearch|ndspider|nec-meshexplorer|nhse web forager|nederland.zoek|netcarta webmap engine|netmechanic|netscoop|nomad|northern light gulliver|objectssearch|occam|ontospider|open text index robot|openfind data gatherer|orb 
search|orbsearch|pgp key agent|pack rat|packrat|pageboy|parasite|patric|perlcrawler 1.0|perlcrawler|phantom|phpdig|piltdownman|pimptrain|pimptrain.com's robot|pioneer|plumtreewebaccessor|poppi|popular iconoclast|portal juice spider|portalb spider|portalbspider|portaljuice.com|puu|rbse spider|rhcs|raven|raven search|raven-v2|resume robot|rixbot|road runner: imagescape robot|road runner: the imagescape robot|roadhouse crawling system|robbie|robbie the robot|robocrawl|robocrawl spider|robofox|robofox v2.0|robot francoroute|robozilla|roverbot|rules|sg-scout|slcrawler|safetynet robot|scooter|search-au|search.aus-au.com|searchprocess|senrigan|shagseeker|shagseeker|shai|shai'hulud|sift|simbot|simmany robot ver1.0|site searcher|site valet|sitetech-rover|skymob.com|sleek|slurp|smart spider|snooper|solbot|spanner|speedy spider|spiderbot|spiderman|spiderman 1.0|spiderview(tm)|spiderline crawler|spry wizard robot|suke|sven|sygol|t-h-u-n-d-e-r-s-t-o-n-e|tach black widow|titan|tlspider|tarantula|tcl w3 robot|techbot|templeton|teoma|teomatechnologies|the jubii indexing robot|the nwi robot|the northstar robot|the peregrinator|the python robot|the tkwww robot|the web moose|the web wombat|the webfoot robot|the world wide web worm|titin|ucsd crawl|url check|url spider pro|udmsearch|ukonline|uptimebot|user-agent: mozilla|vwbot|vwbot_k|valkyrie|verticrawl|verticrawlbot|victoria|voyager|w3m2|wwwc|wwwc ver 0.2.5|walhello appie|wallpaper (alias crawlpaper)|web core |webbandit web spider|webbandit|webcatcher|webcopy|weblinker|webmechanic|webmirror|webmoose|webquest|webreaper|webspider|webstolperer|webvac|webwalker|webwatch|webzinger|webinator|weblog monitor|websnarf|wget|whowhere robot|wild ferret web hopper #1, #2, #3|wired digital|xget|xyleme robot|xavatoria|zilla"|awapclient|abcdatos|ahoy|ananzi|anthill|appie|arale|araneo|araybot|ariadne|arks|askjeeves|atn|atomz|auresys|bigbrother|bjaaland|blindekuh|borg-bot|boxseabot|bright.net caching 
robot|brightnet|bspider|cienciaficcion.net|cienciaficcion.net spider|calif|cassandra|cgireader|christcrawler|churl|cienciaficcion|cmc|combine|confuzzledbot|coolbot|cosmos|crawlpaper|cruiser|cusco|cyberspyder|cydralspider|desert realm|desertrealm|dienstspider|digger|diibot|directhit|dnabot|download_express|downloadexpress|dragonbot|dwcp|e-collector|ebiness|ecollector|elfinbot|esculapio|esther|evliyacelebi|fastcrawler|fetchrover|fido|fireball|fouineur|freecrawl|gammaspider|gammaspider, focusedcrawler|gazz|gcreep|gestalticonoclast|golem|googlebot|grabber|grapnel|griffon|gromit|gulliver|gulper|gulperbot|hambot|havindex|hometown|hotwired|ht:|htdig|html_analyzer|iajabot|iajabot|iconoclast|image.kapsi.net|imagelock|informant|infoseek|infospider|inspectorwww|irobot|javabee|jcrawler|jobo|kapsi|ko_yappo_robot|label-grabber|labelgrabber.txt|larbin|legs|linkidator|linkwalker|logo.gif|logo.gif crawler|logo_gif_crawler|magpie|marvin|mattie|mediafox|mnogosearch software|mnogosearch|moget|mouse.house|msnbot|muncher|muninn|muscatferret|myweb|netmechanic|netscoop|newscan-online|nil|nzexplorer|occam|orb_search|packrat|pageboy|parasite|patric|pegasus|perlcrawler|phpdig|piltdownman|pimptrain|pjspider|poppi|portalb|psbot|raven|rhcs|rixbot|roadrunner|robbie|robi|robocrawl|robofox|robozillaob o|rules|scooter|search-info|search_au|searchprocess|shaihulud|sharp-info-agent|sift|skymob|slurp|smartspider|snooper|solbot|speedy|spider_monkey|spiderbot|spiderline|spiderview|ssearcher|ssearcher100|straight flash!! 
getterroboplus 1.5|suke|suntek|sven|tach_bw|tarspider|techbot|templeton|the world wide web wanderer|titin|tlspider|topiclink|udmsearch|uptimebot|urlck|us|valkyrie|verticrawl|victoria|vision-search|void-bot|voidbot|voyager|vwbot|w3mir|w@pspider by wap4.com|w@pspider|wallpaper|wapspider|webcatcher|webfetcher|webinator|weblayers|webquest|webreader|webreaper|webs|webspider|webwalk|webwalker|wget|whatuseek winona|whatuseek_winona|whatuseek|whowhere|winona|wired-digital|wired-digital-newsbot|wlm|wlm-1.1|wolp|wwwc|wz101|xget)^$/

It should (theoretically) see googlebot in my $str variable, preg_match that against strtolower($_SERVER['HTTP_USER_AGENT']), count that as a match, and send the headers, but it never seems to. Here’s the code I am working with currently; maybe someone can shed some light on it for me?

    //looking for this
    $query = 'klat-badge'; 

    //if not found, continue
    if(strpos($content, $query) === false) { 

        //require banlist
        require('botlist.php'); 

        //compact banlist
        $str = strtolower('/(' . implode('|', $list) .')^$/');
        $matches = array();

        //can we find a match in user agent versus banlist?
        $numMatches = preg_match($str, strtolower($_SERVER['HTTP_USER_AGENT']), $matches, 'i');

            if($numMatches > 0 || $_GET['botban'] == 'true') {

                //so tell bots we're broken
                header("Status: 503");
                header($_SERVER["SERVER_PROTOCOL"].' 503 Service Temporarily Unavailable');

                exit;

            }
    }
1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 8:41 pm

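    It won’t see “Googlebot” from the raw User-Agent, since your list specifies that as |googlebot|, and your regex doesn’t have the /i case-insensitivity modifier. Lowercasing the agent with strtolower() works around that, but the modifier is the cleaner fix.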

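    And the ^$ at the end is certainly also wrong. Placed after the group, ^ demands the start of the string at a point where the alternation has already consumed characters, so the pattern can never match.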
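
    To illustrate the effect of the trailing ^$, here is a minimal sketch. The three-name alternation is a shortened, hypothetical stand-in for the full list in the question; the agent string is the one quoted there.

```php
<?php
// Sample agent from the question, lowercased as in the original code.
$agent = strtolower('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)');

// Shortened, hypothetical stand-in for the full bot list.
$names = 'googlebot|msnbot|slurp';

// Original shape: the anchors trail the group, so "^" is tested
// only after the alternation has already consumed characters.
$withAnchors = preg_match('/(' . $names . ')^$/', $agent);

// Same group without the trailing anchors.
$withoutAnchors = preg_match('/(' . $names . ')/', $agent);

var_dump($withAnchors);    // int(0)
var_dump($withoutAnchors); // int(1)
```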

    Edit: Just noticed, you do pass 'i' as a parameter to your preg_match call. That won’t work. The flags parameter only accepts integers (such as PREG_OFFSET_CAPTURE), and those are intended for the PHP wrapper function; they are not passed on to the PCRE regex library. Regex modifiers such as i belong after the closing delimiter of the pattern itself.
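
    Putting these points together, one possible shape for the check is sketched below. This is not the asker's final code: isListedBot() is a hypothetical helper name, and the short $list array is a stand-in for whatever botlist.php is assumed to define. The sketch also runs each name through preg_quote(), since several entries in the list contain regex metacharacters such as parentheses and dots.

```php
<?php
// Hypothetical helper; the name and signature are illustrative, not from the question.
function isListedBot(string $userAgent, array $list): bool
{
    // Escape regex metacharacters in each name, including the "/" delimiter.
    $escaped = array_map(
        function ($name) {
            return preg_quote($name, '/');
        },
        $list
    );

    // The "i" modifier goes after the closing delimiter, not in the flags argument.
    $pattern = '/(' . implode('|', $escaped) . ')/i';

    return preg_match($pattern, $userAgent) === 1;
}

// Shortened stand-in for the $list array that botlist.php is assumed to define.
$list = array('googlebot', 'msnbot', 'slurp', 'hi (html index) search');

$agent = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
var_dump(isListedBot($agent, $list));               // bool(true)
var_dump(isListedBot('ExampleBrowser/1.0', $list)); // bool(false)
```

    With the i modifier in place, the strtolower() calls on the pattern and the agent are no longer needed. Note that the list in the question also contains very generic entries such as mozilla, which would match most ordinary browsers once the pattern works, so the list itself may need pruning.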
