The Archive Base

Editorial Team
Asked: May 14, 2026


I have a large database of resumes (CVs), with a skills table that groups all users’ skills.

Inside that table there’s a field, skill_text, that describes the skill in full text.

I’m looking for an algorithm, software, or method to extract significant terms and phrases from that table in order to build a new table of standardized skills.

Here are some example skills extracted from the DB:

  • Sectoral and competitive analysis
  • Business Development (incl. in international settings)
  • Specific structure and road design software – Microstation, Macao, AutoCAD (basic knowledge)
  • Creative work (Photoshop, In-Design, Illustrator)
  • checking and reporting back on campaign progress
  • organising and attending events and exhibitions
  • Development : Aptana Studio, PHP, HTML, CSS, JavaScript, SQL, AJAX
  • Discipline: One to one marketing, E-marketing (SEO & SEA, display, emailing, affiliate program) Mix marketing, Viral Marketing, Social network marketing.

The output should be something like:

  • Sectoral and competitive analysis
  • Business Development
  • Specific structure and road design software –
  • Macao
  • AutoCAD
  • Photoshop
  • In-Design
  • Illustrator
  • organising events
  • Development
  • Aptana Studio
  • PHP
  • HTML
  • CSS
  • JavaScript
  • SQL
  • AJAX
  • Mix marketing
  • Viral Marketing
  • Social network marketing
  • emailing
  • SEO
  • One to one marketing

As you can see, only the skills remain, with none of the surrounding descriptive text.

I know this is possible using text mining techniques, but how do I do it?
The database is really large, which is a good thing because we can calculate term frequencies and use them to decide whether something is a real skill or just meaningless text.
The big problem is: how do I determine that “blablabla” is a skill?

Edit:
Please don’t tell me to use standard things like a text tokenizer or regex, because users enter skills in a very arbitrary way!

Thanks

1 Answer

  1. Editorial Team
    Added an answer on May 14, 2026 at 5:36 am

    If I were doing this programmatically, I would:

    Extract all punctuation-delimited data (or perhaps just split on brackets and commas) into a new table (with no primary key, just a skill column), so that Creative work (Photoshop, In-Design, Illustrator) becomes:

     Skill            
     -------------
     Creative work    
     Photoshop        
     In-Design        
     Illustrator      
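
The splitting step above could be sketched in Python roughly as follows. This is only an illustration: the function name `split_skills` and the exact delimiter set (brackets, commas, semicolons, colons) are assumptions, not something the answer specifies, and real CV data will probably need a wider or narrower set.

```python
import re

# Assumed delimiter set: round/square brackets, commas, semicolons and colons.
# Hyphens are deliberately left alone so "In-Design" stays in one piece.
_DELIMITERS = re.compile(r"[()\[\],;:]")

def split_skills(skill_text):
    """Split one free-text skill entry into candidate skill fragments."""
    fragments = []
    for part in _DELIMITERS.split(skill_text):
        cleaned = part.strip(" .")  # drop surrounding spaces and trailing full stops
        if cleaned:
            fragments.append(cleaned)
    return fragments

if __name__ == "__main__":
    for fragment in split_skills("Creative work (Photoshop, In-Design, Illustrator)"):
        print(fragment)
```

Each returned fragment would become one row in the new single-column table.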
    

    Then, after you’ve processed all CVs, query for the most common skills (this is MySQL):

    SELECT skill, COUNT(1) cnt FROM newTable GROUP BY skill ORDER BY cnt DESC;
    

    Which may look like this contrived example:

     Skill            Cnt
     ---------------------
     Photoshop        3293
     Illustrator      2134
     Creative work     932
     In-Design         123
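
To try the frequency query without a MySQL server, a small sketch using Python's built-in sqlite3 module might look like this. The table name `newTable` and the `skill` column come from the answer; the in-memory SQLite database, the `count_skills` helper, and the sample fragments are assumptions made for illustration.

```python
import sqlite3

def count_skills(fragments):
    """Load fragments into a one-column table and return (skill, count) rows."""
    conn = sqlite3.connect(":memory:")  # throwaway database, stands in for MySQL
    conn.execute("CREATE TABLE newTable (skill TEXT)")
    conn.executemany(
        "INSERT INTO newTable (skill) VALUES (?)",
        [(fragment,) for fragment in fragments],
    )
    rows = conn.execute(
        "SELECT skill, COUNT(*) AS cnt FROM newTable "
        "GROUP BY skill ORDER BY cnt DESC, skill"  # skill breaks ties predictably
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    sample = ["Photoshop", "Illustrator", "Photoshop",
              "Creative work", "Photoshop", "Illustrator"]
    for skill, cnt in count_skills(sample):
        print(skill, cnt)
```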
    

    Then decide, from the top X skills, which ones you want to capture, which must map to other skills (Indesign and In-design should map to the same skill, for example), and which to discard. Then script the process using a data map.
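
One possible shape for such a data map is a dictionary from a normalised key to a canonical skill name. The sketch below is hypothetical: `DATA_MAP`, `normalise`, `map_skill`, and the sample entries are all invented for illustration, and in practice the map would be filled in by hand from the top of the frequency list.

```python
import re

def normalise(fragment):
    """Lower-case and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", fragment.lower())

# Hypothetical, hand-curated map: normalised key -> canonical skill.
# A value of None marks a fragment that was reviewed and discarded.
DATA_MAP = {
    "indesign": "InDesign",
    "photoshop": "Photoshop",
    "illustrator": "Illustrator",
    "creativework": None,
}

def map_skill(fragment):
    """Return the canonical skill, or None if discarded or not yet reviewed."""
    return DATA_MAP.get(normalise(fragment))

if __name__ == "__main__":
    for raw in ["Indesign", "In-design", "PHOTOSHOP", "Creative work"]:
        print(raw, "->", map_skill(raw))
```

Note that a key which strips punctuation this aggressively would also collapse "C++" and "C#" into "c", so a real map needs a more careful normalisation or explicit exceptions.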

    Use the data map to write a new word frequency table (this time with skill_id, skill, frequency), and on this second pass over the data also write to a lookup table (cv_id, skill_id). Your data will then be in a state where each CV is mapped to a number of skills, and each skill to a number of CVs. You can then query for the most popular skills, CVs matching certain criteria, and so on.
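
A minimal in-memory sketch of those two tables is shown below. The column layout (skill_id, skill, frequency) and (cv_id, skill_id) follows the answer; the `build_tables` function, the alphabetical assignment of skill_id values, and the choice to count each skill once per CV are assumptions.

```python
from collections import Counter

def build_tables(cv_skills):
    """Build the skill table and the CV/skill lookup table.

    cv_skills: iterable of (cv_id, canonical_skill) pairs that have
    already been passed through the data map.
    """
    pairs = sorted(set(cv_skills))  # one link per CV/skill combination
    frequency = Counter(skill for _, skill in pairs)  # number of CVs per skill
    skill_ids = {skill: i for i, skill in enumerate(sorted(frequency), start=1)}
    skill_table = [(skill_ids[s], s, frequency[s]) for s in sorted(frequency)]
    lookup_table = [(cv_id, skill_ids[s]) for cv_id, s in pairs]
    return skill_table, lookup_table

if __name__ == "__main__":
    mapped = [(1, "Photoshop"), (1, "InDesign"), (2, "Photoshop"), (2, "Photoshop")]
    skills, lookup = build_tables(mapped)
    print(skills)  # rows of (skill_id, skill, frequency)
    print(lookup)  # rows of (cv_id, skill_id)
```

In a real setup both lists would be written to database tables, with the lookup table giving the many-to-many link between CVs and skills.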
