The Archive Base

Editorial Team
Asked: June 5, 2026

I posted a related but still different question regarding Protobuf-Net before, so here goes:

I wonder whether someone (esp. Marc) could comment on which of the following would most likely be faster:

(a) I currently store serialized built-in data types in a binary file: specifically, a long (8 bytes) and two floats (2 x 4 bytes). Each group of three later makes up one object once deserialized. The long holds DateTime ticks for lookup purposes. I use a binary search to find the start and end locations of a data request. A method then retrieves the data in one chunk (from start to end location), knowing that each chunk consists of many of the triplets described above (1 long, 1 float, 1 float) and that each triplet is always 16 bytes long. Thus the number of triplets retrieved is always (endLocation - startLocation) / 16. I then iterate over the retrieved byte array, deserialize each built-in type using BitConverter, instantiate a new object from each triplet, and store the objects in a list for further processing.
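To make approach (a) concrete, here is a minimal sketch of the fixed-size record layout, the binary search on the ticks field, and the single-chunk decode. It is written in Python rather than the .NET code the question refers to. An in-memory byte string stands in for the file, and the names `RECORD`, `lower_bound` and `read_range` are hypothetical, not taken from the original code.

```python
import struct

# One record: int64 ticks + two float32 values, little-endian, no padding (16 bytes).
RECORD = struct.Struct("<qff")


def pack_records(rows):
    """Serialize (ticks, a, b) rows into one contiguous byte string."""
    return b"".join(RECORD.pack(t, a, b) for t, a, b in rows)


def ticks_at(buf, index):
    """Read only the ticks field of the record at the given record index."""
    return struct.unpack_from("<q", buf, index * RECORD.size)[0]


def lower_bound(buf, ticks):
    """Index of the first record whose ticks value is >= the given ticks."""
    lo, hi = 0, len(buf) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        if ticks_at(buf, mid) < ticks:
            lo = mid + 1
        else:
            hi = mid
    return lo


def read_range(buf, start_ticks, end_ticks):
    """Decode every record with start_ticks <= ticks < end_ticks in one pass."""
    start = lower_bound(buf, start_ticks) * RECORD.size
    end = lower_bound(buf, end_ticks) * RECORD.size
    chunk = buf[start:end]  # (end - start) / 16 records
    return list(RECORD.iter_unpack(chunk))


rows = [(10, 1.0, 2.0), (20, 1.5, 2.5), (30, 3.0, 4.0), (40, 0.5, 0.25)]
buf = pack_records(rows)
selected = read_range(buf, 20, 40)
```

Because every record is exactly 16 bytes, a record index converts to a byte offset by multiplication alone, which is what keeps both the search and the chunk read cheap.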

(b) Would it be faster to do the following? Build a separate file (or implement a header) that functions as an index for lookup purposes. I would then no longer store individual binary versions of the built-in types but would instead use protobuf-net to serialize a List of the objects described above (each object built from a triplet of long, float, float). Each List would always contain exactly one day's worth of data (remember, the long represents DateTime ticks). Obviously each List would vary in size, hence my idea of generating another file or header for index lookup, because each data read request would only ever ask for a whole number of days. To retrieve the serialized list for one day, I would simply look up the index, read the byte array, deserialize it using protobuf-net, and already have my List of objects. I guess I am asking because I do not fully understand how deserialization of collections in protobuf-net works.
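The index idea in (b) can be sketched independently of the serializer. In the Python sketch below, `encode_day` and `decode_day` are hypothetical stand-ins for whatever serializer would be used (protobuf-net in the question). They simply pack fixed-size records, so only the index mechanics are illustrated, not protobuf-net's behaviour or speed.

```python
import struct

RECORD = struct.Struct("<qff")  # int64 ticks + two float32 values


def encode_day(rows):
    """Stand-in serializer for one day's list of (ticks, a, b) rows."""
    return b"".join(RECORD.pack(*row) for row in rows)


def decode_day(blob):
    """Stand-in deserializer matching encode_day."""
    return list(RECORD.iter_unpack(blob))


def build_file(days):
    """Concatenate per-day blobs and record (offset, length) for each day key."""
    data = bytearray()
    index = {}
    for day in sorted(days):
        blob = encode_day(days[day])
        index[day] = (len(data), len(blob))
        data += blob
    return bytes(data), index


def read_days(data, index, wanted):
    """Look up each requested day in the index and decode only its block."""
    out = []
    for day in wanted:
        offset, length = index[day]
        out.extend(decode_day(data[offset:offset + length]))
    return out


days = {1: [(10, 1.0, 2.0)], 2: [(20, 1.5, 2.5), (30, 3.0, 4.0)]}
data, index = build_file(days)
day_two = read_days(data, index, [2])
```

The index replaces the binary search with a direct lookup, but each block still has to be decoded record by record, so the saving applies to locating the data rather than to deserializing it.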

To give a better idea of the magnitude of the data: each binary file is about 3 GB and thus contains many millions of serialized objects. Each file contains about 1,000 days' worth of data, and a data request may ask for any number of days' worth.

Which, in your opinion, is faster in raw processing time? I wanted to gather some input before potentially writing a lot of code to implement (b). I currently have (a) and can process about 1.5 million objects per second on my machine (process = from data request to the returned List of deserialized objects).

Summary: I am asking whether the binary data can be read (I/O) and deserialized faster using approach (a) or approach (b).

1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 9:43 pm

    I currently store serialized built-in datatypes in a binary file. Specifically, a long(8 bytes), and 2 floats (2x 4 bytes).

    What you have is (and no offence intended) some very simple data. If you’re happy dealing with raw data (and it sounds like you are) then it sounds to me like the optimum way to treat this is: as you are. Offsets are a nice clean multiple of 16, etc.

    Protocol buffers in general (not just protobuf-net, which is one implementation of the protobuf specification) are intended for a more complex set of scenarios:

    • nested/structured data (think: xml i.e. complex records, rather than csv i.e. simple records)
    • optional fields (some data may not be present at all in the data)
    • extensible / version tolerant (unexpected or only semi-expected values may be present)
      • in particular, can add/deprecate fields without it breaking
    • cross-platform / schema-based
    • and where the end-user doesn’t need to get involved in any serialization details

    It is a bit of a different use case! As part of this, protocol buffers use a small but necessary field-header notation (usually one byte per field), and you would need a mechanism to separate records, since they aren’t fixed-size, which is typically another 2 bytes per record. And, ultimately, the protocol buffers handling of float is IEEE-754, so you would be storing the exact same 2 x 4 bytes, but with the field headers added on top. Within the protocol buffers specification, a long integer can be stored as either a fixed-size or a variable-size value.
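The per-record overhead described above can be checked with a small hand-rolled encoder that follows the protocol buffers wire format (base-128 varints, and tags built as field number shifted left by 3 plus the wire type). This is a sketch under assumptions, not protobuf-net's actual output: the field numbers 1 to 3, the choice to wrap each record as a length-prefixed element of field 1, and the tick value are all illustrative.

```python
import struct

WIRE_VARINT, WIRE_FIXED64, WIRE_LEN, WIRE_FIXED32 = 0, 1, 2, 5


def varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def tag(field_number, wire_type):
    """Field header: (field_number << 3) | wire_type, itself a varint."""
    return varint((field_number << 3) | wire_type)


def encode_record(ticks, a, b, fixed_ticks=True):
    """Encode one (ticks, a, b) record; field numbers 1..3 are assumed."""
    if fixed_ticks:
        body = tag(1, WIRE_FIXED64) + struct.pack("<q", ticks)
    else:
        body = tag(1, WIRE_VARINT) + varint(ticks)
    body += tag(2, WIRE_FIXED32) + struct.pack("<f", a)
    body += tag(3, WIRE_FIXED32) + struct.pack("<f", b)
    return body


def encode_list_item(record_bytes):
    """Wrap a record as one length-prefixed element of a repeated field."""
    return tag(1, WIRE_LEN) + varint(len(record_bytes)) + record_bytes


ticks = 634_600_000_000_000_000  # illustrative tick count, not real data
fixed = encode_list_item(encode_record(ticks, 1.5, 2.5, fixed_ticks=True))
variable = encode_list_item(encode_record(ticks, 1.5, 2.5, fixed_ticks=False))
print(len(fixed), len(variable))
```

Under these assumptions each record takes 21 bytes with a fixed-size long and 22 bytes with a varint, against 16 bytes in the raw layout. A tick count of this magnitude needs 60 bits, so the varint form spends 9 bytes on it rather than 8.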

    For what you are doing, and since you care about fastest raw processing time, simple seems best. I’d leave it “as is”.
