The Archive Base


Editorial Team
Asked: May 12, 2026

Is there a common way of dealing with tags in databases?

I’m thinking of using tinytext with pipes.
I think adding another table and using IDs might make it more complicated for little gain.

What’s your preferred way of doing this?

And what is the right way to query a table for rows matching a single tag or multiple tags?

Thanks

1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 11:29 pm

    I’ll spread a little heresy here.

    The big sites, including this one, use denormalised schemas for tags for scalability reasons: each row stores its tags as a comma-, pipe- or space-delimited string in a text field, and a separate table holds each tag with its count. When inserting or updating an item, just check which tags were added or dropped and update the counts accordingly (explode the old and new tag strings into arrays and run array_diff() on them).
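
    The explode-and-array_diff() step could look roughly like the sketch below. It assumes space-delimited tag strings (as in the related-tags code further down); parse_tags() and diff_tags() are hypothetical names, not part of any framework.

```php
<?php
// Hypothetical helper: split a space-delimited tag string into
// unique, non-empty tags. A null string means "no tags".
function parse_tags(?string $tag_str): array
{
    $tags = preg_split('/\s+/', trim((string) $tag_str), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($tags));
}

// Compare the stored tag string with the new one and report which
// tags need their count incremented or decremented.
function diff_tags(?string $old_str, ?string $new_str): array
{
    $old = parse_tags($old_str);
    $new = parse_tags($new_str);

    return [
        'added'   => array_values(array_diff($new, $old)), // tag_count + 1
        'removed' => array_values(array_diff($old, $new)), // tag_count - 1
    ];
}

$changes = diff_tags('php mysql tags', 'php tags kohana');
// $changes['added'] is ['kohana'], $changes['removed'] is ['mysql']
```

    Passing null as the old string covers a fresh insert (every tag is "added"), and null as the new string covers a delete (every tag is "removed").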

    Now you have a cheap way to display a tag cloud with counts using a simple SELECT * FROM tags, no fancy queries. To find items tagged with a given name, just use LIKE '%TAG%'. This works well for a low-traffic website (say, fewer than 100k page views per day) and small data sets (again, say, fewer than 100k records). Beyond that you could use Fulltext Search to speed things up, and ultimately a proper search engine like Lucene or Sphinx.
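
    One caveat with a bare LIKE '%TAG%' is that it also matches substrings, so "java" would find rows tagged "javascript". A possible workaround, sketched below, pads both the column and the pattern with the delimiter. It assumes MySQL, space-delimited tags in a `tags` column, and a driver that supports `?` placeholders (PDO or mysqli, for example); build_tag_filter() is a hypothetical helper.

```php
<?php
// Hypothetical helper: build a WHERE fragment plus bound values that
// match whole tags only. Assumes space-delimited tags and MySQL.
function build_tag_filter(array $tags): array
{
    $clauses = [];
    $params  = [];

    foreach ($tags as $tag) {
        // Escape LIKE wildcards so a tag such as "50%" is matched literally
        // (backslash is MySQL's default LIKE escape character).
        $escaped = strtr($tag, ['\\' => '\\\\', '%' => '\\%', '_' => '\\_']);

        // Pad both the column and the pattern with the delimiter so that
        // "java" matches the tag "java" but not "javascript".
        $clauses[] = "CONCAT(' ', tags, ' ') LIKE ?";
        $params[]  = '% ' . $escaped . ' %';
    }

    return [implode(' AND ', $clauses), $params];
}

list($where, $params) = build_tag_filter(['java', 'c_sharp']);
// $where  : CONCAT(' ', tags, ' ') LIKE ? AND CONCAT(' ', tags, ' ') LIKE ?
// $params : ['% java %', '% c\_sharp %']
```

    Binding the values instead of interpolating them into the SQL string also keeps user-supplied tags from breaking the query.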

    Finding related tags, like here on SO, is easy too (Kohana specific code, LIKE based, MySQL specific):

    $tags = array('foo', 'bar');
    
    private function get_related_tags( $tags )
    {
        ## Get db entries with specific tags and build array with counts
    
        ## is it cached already? ------------------------------------------------
        $this->cache = Cache::instance();
        $tags = array_filter( array_flip(array_flip($tags)) );
        sort($tags);
        $cache_name = implode('', $tags);
        $cache = $this->cache->get( $cache_name );
    
        if( $cache )
         return $cache;
    
        ## not cached, fire up ---------------------------------------------------
    
        $db = Database::instance();
    
        ## count tagged items ----------------------------------------------------
    
        // build like string
        // NOTE: $tag is interpolated as-is, so escape it first if it can come from user input
        $like = array();
        foreach( $tags as $tag )
           $like[] = "tags LIKE '%$tag%'";
    
        $like = implode(' AND ', $like);
    
        // get counts
        $count = $db->query("SELECT count(id) AS count FROM `articles` WHERE $like")->current()->count;
    
        ## check what tags are related ------------------------------------------
    
        $offset = 0;
        $step = 300;
    
        $related_tags = array();
    
        while( $offset < $count )
        {
            $assets = $db->query("SELECT tags FROM `articles` WHERE $like ORDER BY id ASC LIMIT $step OFFSET $offset");
    
            foreach($assets as $asset)
            {
                // tags 
                $input = explode( ' ', trim($asset->tags) );
                foreach( $input as $k => $v )
                {
                     if( $v == ''){
                         //do nothing, shouldnt be here anyway
                     }
                     elseif( array_key_exists($v, $related_tags) ){
                         $related_tags[$v]++;
                     }
                     else{
                        $related_tags[$v] = 1;
                     }        
                }
            }
            $offset += $step;
        }
    
        // remove already displayed from list
        foreach( $tags as $tag )
            unset( $related_tags[$tag] );
    
        ksort($related_tags);
    
        // set cache 
        $this->cache->set( $cache_name, array($related_tags, $count), 'related_tags_counts', 0);
    
        return array($related_tags, $count);
    }
    

    This is not really cheap, so I keep the counts cached for a given set of tags until I make changes to tags in the articles table.

    This setup is not perfect by any means, but it certainly has some advantages: the schema is simple, getting the tag cloud is straightforward, and you get articles along with their tags in one simple query (i.e. without subqueries). The main disadvantage I see is the inability to rename or drop a tag system-wide without amending every single row where it occurs, but hey, how often do you do that anyway?
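
    To give an idea of what that rename costs: every row containing the tag has to be loaded, rewritten and saved. The per-row rewrite might look like the sketch below, which assumes space-delimited tag strings; rename_tag() is a hypothetical helper.

```php
<?php
// Hypothetical helper: rename one tag inside a space-delimited tag string.
function rename_tag(string $tags_str, string $old, string $new): string
{
    $tags = preg_split('/\s+/', $tags_str, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($tags as $i => $tag) {
        if ($tag === $old) {
            $tags[$i] = $new;
        }
    }

    // array_unique() avoids a duplicate if the row already had the new tag.
    return implode(' ', array_unique($tags));
}

$renamed = rename_tag('php mysql tags', 'mysql', 'mariadb');
// $renamed is 'php mariadb tags'
```

    The matching row in the tags table would need its name (and possibly its count) updated as well.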

    Currently I’m using this setup for a few projects of mine and it works like a dream, but I must admit these are not high-traffic websites (hence I get away with LIKE). Next year I will be able to test it on a busy site, but I’m pretty sure it will do. Normalization purists will perhaps vote me down, but I just love the simplicity of it and I’m happy to trade CPU cycles for that.

    Actually, I was going to post this tag system on SO a while ago and ask the experts what they think of it, so feel free to leave comments.

    As usual, sorry for my English; I believe it’s funny =)

    EDIT

    Since you’ve provided your requirements in the comments, I think this setup is perfect for you. I’ve posted the full Tag Model in pastie here, with methods to handle counts. It is Kohana-specific, but if you know CodeIgniter you’ll feel at home. Just use it this way:

    // table TAGS: id, tag_name, tag_count
    
    // insert new item/article
    $tag_model->update_tags( $tags_str, null );
    
    // update existing item 
    $tag_model->update_tags( $new_tags_str, $old_tags_str ); // $old_tags as stored in db
    
    // delete item, you'll have to get item from db before deletion
    $tag_model->update_tags( null, $old_tags_str ); 
    

    I’ve amended the code, as Markdown had mangled it. Also, the queries are MySQL flavour, not SQLite.
