The Archive Base

Editorial Team
Asked: June 6, 2026

I am working on a project which lists file sharing urls from the likes


I am working on a project that lists file-sharing URLs from hosts such as Oron, filespost and depositfiles, and reports sharing of copyrighted material to identified content owners and rights holders in my network.

To improve the service, which is currently a table populated from a MySQL database with some filters built into the PHP, I want to be able to identify links that have stopped working.

My idea is that when the data is retrieved from the MySQL database, each entry in the download URL column (the URL of the file or file host page) is checked to see whether it still leads to the file-sharing page where users can start the download. If the link works and the file can be downloaded, it should be left as is, with the link text or cell colour turned green. If the site displays "file not found" or similar, the link text or cell background colour should turn red.

At present there is no quick and easy visual representation of active or inactive links.

I have a simple validation on the URL based on whether a 404 error is received, but I quickly realised that won’t work: these sites don’t return a 404 or even redirect. Instead, they change the dynamically generated page to say that the file is not available, has been removed, etc.

I have also incorporated a link checker script that uses a third-party file-share link checking service, but this would require manual checks and manual updating of the database.

I have also tried looking for specific fields or words on the page, but given the range of sites and the even broader range of terms they use, this too has proven inaccurate and difficult to implement for all links.

It would also be helpful if URLs could then be filtered by their active status. I’m guessing that if the colour change were managed by a link or cell class, I could filter the column based on that class, e.g. link-dead or link-active. I think I can do this myself, so help with this last bit on filtering by class is not necessarily required.

Any help would be greatly appreciated.

1 Answer

  1. Editorial Team
    Added an answer on June 6, 2026 at 5:38 am

    Because the sites you want to check are built by different people, there is unlikely to be a one-liner that detects broken links across a large number of sites.

    I suggest that you create a simple function for each site that detects if the link is broken for that particular site. When you want to check a link, you would decide which function to run on the external site’s HTML based on the domain name.

    You can use parse_url() to extract the domain/host from the file links:

    // Get your url from the database. Here I'll just set it:
    $file_url_from_database = 'http://example.com/link/to/file?var=1&hello=world#file';
    
    $parsed_link = parse_url($file_url_from_database);
    $domain = $parsed_link['host']; // $domain now equals 'example.com'
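
    Stored URLs often differ in case or carry a "www." prefix, so it may help to normalise the host before using it as a lookup key. A minimal sketch, assuming a hypothetical helper named link_host() (not part of the original answer):

```php
<?php
// Hypothetical helper: returns a lowercase host without a leading "www.",
// or null when the URL has no usable host.
function link_host($url)
{
    // With PHP_URL_HOST, parse_url() returns the host string,
    // null if the URL has no host, or false if it is seriously malformed.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }

    $host = strtolower($host);

    // Treat "www.example.com" and "example.com" as the same site.
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }

    return $host;
}
```

    Returning null for URLs without a host lets the caller skip or flag bad rows instead of triggering an undefined-index notice.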
    

    You could store the function names in an associative array and call them that way:

    function check_domain_com(){ ... }
    function check_example_com(){ ... }
    
    $link_checkers = array();
    $link_checkers['domain.com'] = 'check_domain_com';
    $link_checkers['example.com'] = 'check_example_com';
    

    or store the functions in the array (PHP >=5.3).

    $link_checkers = array();
    $link_checkers['domain.com'] = function(){ ... };
    $link_checkers['example.com'] = function(){ ... };
    

    and call these with

    if (isset($link_checkers[$domain])) {
        // call the function stored under the index 'example.com'
        call_user_func($link_checkers[$domain]);
    } else {
        throw new Exception("I don't know how to check the domain $domain");
    }
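
    Putting the lookup and the call together might look roughly like the sketch below. It assumes each checker receives the page's HTML as an argument and returns true while the file still looks available. The check_link() name and the phrases being searched for are made up for illustration; the real wording has to be taken from each site.

```php
<?php
// Hypothetical per-site checkers: each receives the page HTML and returns
// true when the file still appears to be available.
$link_checkers = array(
    'domain.com' => function ($html) {
        return stripos($html, 'File Not Found') === false;
    },
    'example.com' => function ($html) {
        return stripos($html, 'has been removed') === false;
    },
);

// Looks up the checker for $domain and runs it on $html.
function check_link($domain, $html, array $link_checkers)
{
    if (!isset($link_checkers[$domain])) {
        throw new Exception("I don't know how to check the domain $domain");
    }

    return (bool) call_user_func($link_checkers[$domain], $html);
}
```

    Passing the array in as a parameter avoids relying on a global and makes it easy to add a site by adding one entry.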
    

    Alternatively you could just use a bunch of if statements

    if ($domain === 'domain.com') {
        check_domain_com();
    } elseif ($domain === 'example.com') {
        check_example_com(); // this function is called
    }
    

    The functions could return a boolean (true or false; 0 or 1) to use, or call another function themselves if needed (for example to add an extra CSS class to broken links).
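
    For the CSS class idea, a small sketch of turning that boolean into markup. The class names link-active and link-dead come from the question; the helper names are hypothetical.

```php
<?php
// Hypothetical helper: maps a checker's boolean result to a CSS class.
function link_css_class($is_active)
{
    return $is_active ? 'link-active' : 'link-dead';
}

// Builds the anchor tag for one table cell, escaping the URL for HTML output.
function render_link($url, $is_active)
{
    $safe_url = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

    return '<a class="' . link_css_class($is_active) . '" href="' . $safe_url . '">'
        . $safe_url . '</a>';
}
```

    The green and red colours can then live in the stylesheet, and the same class can be used for filtering.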

    I did something similar recently, though I was fetching metadata for stock photography from multiple sites. I used an abstract class because I had a few functions to run for each site.
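
    An abstract class version could look something like this. It is only a sketch of the general shape, not the code I actually used: the class names and the marker phrases are invented, and each real site needs its own phrases.

```php
<?php
abstract class LinkChecker
{
    // Site-specific: phrases the host shows when a file is gone.
    abstract protected function deadMarkers();

    // Shared logic: the link counts as active when no marker appears
    // in the page HTML (case-insensitive match).
    public function isActive($html)
    {
        foreach ($this->deadMarkers() as $marker) {
            if (stripos($html, $marker) !== false) {
                return false;
            }
        }

        return true;
    }
}

// Hypothetical checker for one host.
class ExampleComChecker extends LinkChecker
{
    protected function deadMarkers()
    {
        return array('file not found', 'has been removed');
    }
}
```

    Shared behaviour stays in the base class, so supporting a new site mostly means writing one small subclass.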

    As a side note, it would be wise to store the last checked date in your database and limit the checking rate to something like 24 or 48 hours (or further apart depending on your needs).
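
    The rate limit can be a simple timestamp comparison. A minimal sketch, assuming the last checked date is stored as a Unix timestamp (or NULL if the link has never been checked); needs_check() is a hypothetical name.

```php
<?php
// Returns true when a link is due for another check.
// $last_checked: Unix timestamp of the previous check, or null if never checked.
// $now:          current Unix timestamp, e.g. time().
function needs_check($last_checked, $now, $max_age_hours = 24)
{
    if ($last_checked === null) {
        return true; // never checked before
    }

    return ($now - $last_checked) >= $max_age_hours * 3600;
}
```

    Passing $now in, rather than calling time() inside the function, keeps the function easy to test.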


    Edit to clarify implementation a little:

    As making HTTP requests to other websites is potentially very slow, you will want to check and update link statuses independently of page loads. One way to do this:

    • A script could run every 12 hours and check all links from the database that were last checked more than 24 hours ago. For each ‘old’ link, it would update the active and last_checked columns in your database appropriately.
    • When someone requests a page, your script would read from the active column in your database instead of downloading the remote page to check every time.
    • (extra thought) When a new link is submitted, it is checked immediately in the script, or added to a queue to be checked by the server as soon as possible.
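
    The scheduled script described in the list above could be sketched as follows. To keep the example self-contained it works on an array of rows instead of real database queries, and the actual page fetch and site check is passed in as a callable. The function name, the row layout and the $check callable are assumptions; the active and last_checked keys mirror the column names mentioned above.

```php
<?php
// $links:   rows such as array('url' => ..., 'active' => ..., 'last_checked' => ...),
//           where last_checked is a Unix timestamp or null.
// $check:   callable that takes a URL and returns true if the link is alive
//           (in a real script it would download the page and run the site's checker).
// $max_age: seconds after which a stored status counts as 'old' (default 24 hours).
function refresh_stale_links(array $links, $check, $now, $max_age = 86400)
{
    foreach ($links as $i => $row) {
        $last = $row['last_checked'];

        if ($last !== null && ($now - $last) < $max_age) {
            continue; // checked recently, keep the stored status
        }

        $links[$i]['active'] = (bool) call_user_func($check, $row['url']);
        $links[$i]['last_checked'] = $now;
    }

    return $links;
}
```

    In a real script the rows would come from a SELECT on the links table and each updated row would be written back with an UPDATE; page requests would only ever read the stored active value.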

    As people can easily click a link to check its current state, it would be redundant to allow them to click a button to check from your page (nothing against the idea though).

    Note that the potentially resource-heavy update-all script should not be executable (accessible) via web.
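
    Keeping the script outside the web root is the simplest protection. As an extra safeguard, the script can refuse to run unless it was started from the command line, for example by cron. A minimal sketch, with a hypothetical helper name:

```php
<?php
// Returns true only for the command-line SAPI.
// PHP_SAPI is a built-in constant, e.g. 'cli' or 'apache2handler'.
function is_cli_sapi($sapi = PHP_SAPI)
{
    return $sapi === 'cli';
}

// Guard at the top of the update-all script: stop early on web requests.
if (!is_cli_sapi()) {
    http_response_code(403);
    exit('This script can only be run from the command line.');
}
```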

© 2021 The Archive Base. All Rights Reserved