Editorial Team
Asked: June 11, 2026 (2026-06-11T14:15:28+00:00)

This is similar to Compute dates and durations in mysql query, except that I don’t have a unique ID column to work with, and I have samples, not start/end points.

As an interesting experiment, I set cron to run ps aux > `date +%Y-%m-%d_%H-%M`.txt. I now have around 250,000 samples of “what the machine was running”.

I would like to turn this into a list of “process | cmd | start | stop”. The assumption is that a ‘start’ event is the first sample in which a given pid/cmd pair exists, and a ‘stop’ event is the first sample in which it no longer exists. There is no chance of a sample going “missing” or anything like that.

That said, what ways exist for doing this transformation, preferably using SQL (on the grounds that I like SQL, and this seems like a nice challenge)? If pids were never reused, this would be a trivial task (put everything in a table t, then SELECT pid, MIN(time), MAX(time) FROM t GROUP BY pid). However, since pid/cmd pairs are repeated (I checked, there are duplicates), I need a method that does a true “find all contiguous segments” search.

If necessary I can do something of the form

Load file0 -> oldList
ForEach fileN:
    Load fileN -> newList
    closedN = oldList - newList
    openedN = newList - oldList
    oldList = newList

But that is not SQL and not interesting. And who knows, I might end up having to deal with real SQL data that has this property at some point.
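For reference, the file-by-file diff loop above can be sketched in Python roughly as follows. This is only an illustration: the `Interval` type, the function name `intervals_from_samples`, and the input shape (a chronological list of `(timestamp, set of (pid, cmd))` pairs, already parsed from the ps output) are assumptions, not part of the original setup.

```python
from typing import Iterable, List, NamedTuple, Optional, Set, Tuple

Key = Tuple[int, str]  # (pid, cmd)


class Interval(NamedTuple):
    pid: int
    cmd: str
    start: str            # first sample in which the pair exists
    stop: Optional[str]   # first sample in which it is gone; None if never seen stopping


def intervals_from_samples(samples: Iterable[Tuple[str, Set[Key]]]) -> List[Interval]:
    """Turn chronological samples into start/stop intervals per (pid, cmd)."""
    open_since = {}  # (pid, cmd) -> timestamp of the sample where it first appeared
    result = []
    for timestamp, current in samples:
        # oldList - newList: pairs that were running and are now gone
        for key in sorted(set(open_since) - current):
            result.append(Interval(key[0], key[1], open_since.pop(key), timestamp))
        # newList - oldList: pairs that were not running in the previous sample
        for key in current - set(open_since):
            open_since[key] = timestamp
    # Pairs still present in the last sample have no stop event in the data.
    for key in sorted(open_since):
        result.append(Interval(key[0], key[1], open_since[key], None))
    return result


if __name__ == "__main__":
    demo = [
        ("t1", {(1, "a"), (2, "b")}),
        ("t2", {(2, "b")}),
        ("t3", {(1, "a"), (2, "b")}),
        ("t4", set()),
    ]
    for interval in intervals_from_samples(demo):
        print(interval)
```

Because a pair is removed from `open_since` as soon as it disappears, a pid/cmd pair that comes back later starts a fresh interval, which is the “contiguous segments” behaviour asked for.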

I’m thinking of something where one first constructs a table of diffs, then joins all closes against all opens and pulls the minimum-distance close after each open, but I’m wondering if there’s a better way.

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 2:15 pm (2026-06-11T14:15:29+00:00)

    You don’t mention what database you are using. Let me assume that you are using a database that supports ranking functions, since that simplifies the solution.

    The key to solving this is to assign an id to each run of a pid, so that a reused pid can be told apart from the earlier process. I am going to assume that a pid represents a new process whenever it did not appear in the previous timestamped output.

    Now, the idea is:

    1. Assign a sequential number to each set of output. The first call to ps gets 1, the next 2, and so on, based on date.
    2. Assign a sequential number to each appearance of a pid, based on date. The first appearance of that pid gets 1, the next 2, and so on.
    3. For a pid that appears in consecutive samples, the difference between these two numbers is constant. We can call this the groupid for that run.
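To make step 3 concrete, here is a minimal sketch of the same arithmetic in plain Python, so that each number is visible. The function name `group_ids` and the sample data are made up for illustration, and it assumes at most one row per (time, pid).

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_ids(rows: List[Tuple[int, int]]) -> Dict[Tuple[int, int], int]:
    """rows: (time, pid) pairs. Returns {(time, pid): groupid}."""
    # Step 1: number the distinct sample times 1, 2, 3, ... (a dense rank).
    sample_no = {t: i for i, t in enumerate(sorted({t for t, _ in rows}), start=1)}
    # Step 2: number each pid's appearances 1, 2, 3, ... in time order.
    appearances = defaultdict(int)
    result = {}
    for t, pid in sorted(rows):
        appearances[pid] += 1
        # Step 3: the difference stays constant while the pid is seen
        # in consecutive samples and jumps when a sample is skipped.
        result[(t, pid)] = sample_no[t] - appearances[pid]
    return result


if __name__ == "__main__":
    # pid 7 is seen in samples 1-2 and again in 4-5; pid 9 in samples 2-4.
    rows = [(1, 7), (2, 7), (2, 9), (3, 9), (4, 7), (4, 9), (5, 7)]
    for (t, pid), gid in sorted(group_ids(rows).items(), key=lambda kv: (kv[0][1], kv[0][0])):
        print(f"pid={pid} sample={t} groupid={gid}")
```

In this data, pid 7 gets groupid 0 for samples 1-2 and groupid 1 for samples 4-5, while pid 9 gets groupid 1 throughout. The same groupid value can therefore occur for different pids, which is why the grouping has to be on the groupid and the pid together.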

    So, this is the query in action:

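    select groupid, pid, min(time), max(time)
    from (select t.*,
                 -- sample number (1, 2, 3, ...) minus this pid's appearance number:
                 -- constant while the pid shows up in consecutive samples
                 (dense_rank() over (order by time) -
                  row_number() over (partition by pid order by time)
                 ) as groupid
          from t
         ) t
    -- the same groupid can occur for different pids, so group by both
    group by groupid, pid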
    

    This works in most databases (SQL Server, Oracle, DB2, Postgres, Teradata, among others). It does not work in MySQL versions before 8.0, because those versions do not support window/analytic functions.
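As a quick way to try the query, here is a sketch that runs it against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answer: the table layout `t(time, pid, cmd)`, the sample rows, the use of SQLite (which needs version 3.25 or later for window functions), and the extension of the partition to `pid, cmd` so that it matches the pid/cmd pairs from the question.

```python
import sqlite3

# Hypothetical layout: one row per (sample time, pid, cmd) taken from the ps output.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (time TEXT, pid INTEGER, cmd TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("00:01", 7, "vim"),
        ("00:02", 7, "vim"),
        ("00:02", 9, "cron"),
        ("00:03", 9, "cron"),
        ("00:04", 7, "vim"),   # same pid/cmd pair shows up again after a gap
        ("00:04", 9, "cron"),
        ("00:05", 7, "vim"),
    ],
)

QUERY = """
SELECT pid, cmd, MIN(time) AS first_seen, MAX(time) AS last_seen
FROM (SELECT t.*,
             DENSE_RANK() OVER (ORDER BY time)
             - ROW_NUMBER() OVER (PARTITION BY pid, cmd ORDER BY time) AS groupid
      FROM t) AS ranked
GROUP BY groupid, pid, cmd
ORDER BY pid, first_seen
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this data the pair (7, vim) comes back as two separate runs, 00:01 to 00:02 and 00:04 to 00:05, instead of one run spanning the gap. Note that `MAX(time)` is the last sample in which the pair was present; the ‘stop’ event as defined in the question would be the next sample time after it.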
