I am currently starting a project titled Cloud computing for time series mining algorithms

Question


Asked: June 9, 2026 (2026-06-09T04:48:33+00:00)


I am starting a project titled “Cloud computing for time series mining algorithms using Hadoop”.
The data I have consists of HDF files, over a terabyte in size. As far as I know, Hadoop expects text files as input for further processing (MapReduce tasks). So one option is to convert all my .hdf files to text files, which is going to take a lot of time.

The other option is to find a way to use the raw HDF files in MapReduce programs.
So far I have not found any Java code that reads HDF files and extracts data from them.
If somebody has a better idea of how to work with HDF files, I would really appreciate the help.

Thanks
Ayush


1 Answer

Editorial Team · Answer 1 · 2026-06-09T04:48:35+00:00

For your first option, you could use a dump tool (for example, h5dump for HDF5 files) to convert each HDF file to text format. Alternatively, you can write a program that uses a Java library for reading HDF files and writes the data out as text.
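As a rough illustration of the dump-tool route, the sketch below calls the h5dump command-line tool from Java and redirects its output to a text file. It assumes the files are HDF5 and that h5dump is installed and on the PATH. The class name HdfTextDump, the method names, and the dataset path are made up for this example. Note that h5dump prints the data together with structural metadata, so the output will probably need further parsing before it is usable as MapReduce input.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of any library): wraps the h5dump CLI.
// Assumption: the input is HDF5 and h5dump is available on the PATH.
final class HdfTextDump {

    // Build the command line "h5dump -d <dataset> <file>".
    static List<String> buildCommand(String hdfFile, String datasetPath) {
        return Arrays.asList("h5dump", "-d", datasetPath, hdfFile);
    }

    // Run h5dump, send its standard output to textOut, and return the exit code.
    static int dumpToText(String hdfFile, String datasetPath, File textOut)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(hdfFile, datasetPath));
        pb.redirectOutput(textOut);                          // dump goes to the text file
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);   // errors stay on the console
        Process process = pb.start();
        return process.waitFor();
    }
}
```

A call such as HdfTextDump.dumpToText("data.h5", "/series/temperature", new File("temperature.txt")) would dump one dataset. For a terabyte of input you would likely want to run many such conversions in parallel, which is where most of the conversion time goes.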

For your second option, SciHadoop is a good example of how to read scientific datasets from Hadoop. It uses the NetCDF-Java library to read NetCDF files. Hadoop (HDFS) does not expose a POSIX API for file I/O, so SciHadoop uses an extra software layer to translate the POSIX calls made by the NetCDF-Java library into HDFS API calls. If SciHadoop does not already support HDF files, you might have to take the somewhat harder path and develop a similar solution yourself.
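To give an idea of what such a translation layer looks like, here is a minimal sketch. This is not SciHadoop's actual code. The interface PositionedSource and the class LocalFileSource are hypothetical names invented for this example. The idea is that a format reader is written against a small "read N bytes at offset X" interface instead of against local file calls. The sketch includes only a local-disk implementation so that it runs without Hadoop. An HDFS-backed implementation would, as an assumption about how you might build it, delegate the same method to the positioned reads of Hadoop's FSDataInputStream.

```java
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical interface: the random-access reads a binary format reader needs.
interface PositionedSource extends Closeable {
    long length() throws IOException;

    // Fill the whole buffer with bytes starting at the given file offset.
    void readFully(long position, byte[] buffer) throws IOException;
}

// Local-disk implementation backed by a FileChannel.
// An HDFS variant (not shown) would implement the same interface.
final class LocalFileSource implements PositionedSource {
    private final FileChannel channel;

    LocalFileSource(Path path) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    @Override
    public long length() throws IOException {
        return channel.size();
    }

    @Override
    public void readFully(long position, byte[] buffer) throws IOException {
        ByteBuffer target = ByteBuffer.wrap(buffer);
        long offset = position;
        // A single read may return fewer bytes than requested, so loop.
        while (target.hasRemaining()) {
            int n = channel.read(target, offset);
            if (n < 0) {
                throw new EOFException("read past end of file at offset " + offset);
            }
            offset += n;
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

The hard part in practice is that existing HDF libraries are written against local file I/O, so they would have to be adapted to go through an interface like this one.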
