The Archive Base

Editorial Team
Asked: June 5, 2026 (2026-06-05T21:54:48+00:00)

Current Process: I have a tar.gz file. (Actually, I have about 2000 of them,


Current Process:

  1. I have a tar.gz file. (Actually, I have about 2000 of them, but that’s another story).
  2. I make a temporary directory, extract the tar.gz file, revealing 100,000 tiny files (around 600 bytes each).
  3. For each file, I cat it into a processing program, pipe that loop into another analysis program, and save the result.

The temporary space on the machines I’m using can barely handle one of these processes at a time, let alone the 16 concurrent jobs (hyperthreaded dual quad-core) that each machine is sent by default.
I’m looking for a way to do this without saving the extracted files to disk. I believe the performance penalty for pulling files out one at a time with tar -xf $file -O <targetname> would be prohibitive, but it might be what I’m stuck with.

Is there any way of doing this?

EDIT: Since two people have already made this mistake, I’m going to clarify:

  • Each file represents one point in time.
  • Each file is processed separately.
  • Once processed (in this case a variant on Fourier analysis), each gives one line of output.
  • This output can be combined to do things like autocorrelation across time.

EDIT2: Actual code:

for f in posns/*; do
    ~/data_analysis/intermediate_scattering_function < "$f"
done | ~/data_analysis/complex_autocorrelation.awk limit=1000 > inter_autocorr.txt

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 9:54 pm

    This sounds like a case where the right tool for the job is probably not a shell script. Python has a tarfile module that can operate in streaming mode, letting you make a single pass through the large archive and process its files one by one without extracting them to disk. Unlike the tar --to-stdout approach, which concatenates every member into one undifferentiated stream, it still lets you tell the individual files apart.
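
A minimal sketch of that idea, assuming the archive is gzip-compressed and that each member can be handed to the processing program on its standard input, as in the question's loop. The helper names (`stream_members`, `process_archive`, `run_filter`) are made up for illustration and are not part of the tarfile module.

```python
import subprocess
import tarfile


def stream_members(path=None, fileobj=None):
    """Yield (name, data) for each regular file, in archive order."""
    # Mode "r|gz" reads the archive as a forward-only gzip stream:
    # nothing is written to disk and there is no seeking back, so each
    # member has to be read before moving on to the next one.
    with tarfile.open(name=path, fileobj=fileobj, mode="r|gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and so on
            yield member.name, tar.extractfile(member).read()


def process_archive(archive_path, process, out, fileobj=None):
    """Apply process(bytes) -> bytes to every member and write the results."""
    for _name, data in stream_members(archive_path, fileobj):
        out.write(process(data))


def run_filter(command):
    """Wrap an external program (argument list) as a bytes -> bytes callable."""
    def process(data):
        # Feed one member on stdin and capture the program's stdout.
        done = subprocess.run(command, input=data,
                              stdout=subprocess.PIPE, check=True)
        return done.stdout
    return process
```

With this sketch, something like `process_archive("posns.tar.gz", run_filter([program]), sys.stdout.buffer)` would stand in for the `for f in posns/*` loop, where `posns.tar.gz` is a hypothetical archive name and `program` is the full path to the processing program; the script's output can then be piped into the awk step as before. One thing to check: members arrive in the order they were stored in the archive, which is not necessarily the sorted order a shell glob produces, and that matters if the later autocorrelation assumes time order.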


© 2021 The Archive Base. All Rights Reserved