Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8185167
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 7, 20262026-06-07T01:45:09+00:00 2026-06-07T01:45:09+00:00

I wondered if unordered_map is implemented using type erasure, since an unordered_map<Key, A*> and

  • 0

I wondered if unordered_map is implemented using type erasure, since an unordered_map<Key, A*> and unordered_map<Key, B*> can use exactly the same code (apart from casting, which is a no-op in machine code). That is, the implementation of both could be based on unordered_map<Key, void*> to save code size.

Update: This technique is commonly referred to as the Thin Template Idiom (Thanks to the commenters below for pointing that out).

Update 2: I would be particlarly interested in Howard Hinnant‘s opinion. Let’s hope he reads this.

So I wrote this small test:

#include <iostream>

#if BOOST
# include <boost/unordered_map.hpp>
  using boost::unordered_map;
#else
# include <unordered_map>
  using std::unordered_map;
#endif

struct A { A(int x) : x(x) {} int x; };
struct B { B(int x) : x(x) {} int x; };

int main()
{
#if SMALL
    unordered_map<std::string, void*> ma, mb;
#else
    unordered_map<std::string, A*> ma;
    unordered_map<std::string, B*> mb;
#endif

    ma["foo"] = new A(1);
    mb["bar"] = new B(2);

    std::cout << ((A*) ma["foo"])->x << std::endl;
    std::cout << ((B*) mb["bar"])->x << std::endl;

    // yes, it leaks.
}

And determined the size of the compiled output with various settings:

#!/bin/sh

for BOOST in 0 1 ; do
    for OPT in 2 3 s ; do
        for SMALL in 0 1 ; do
            clang++ -stdlib=libc++ -O${OPT} -DSMALL=${SMALL} -DBOOST=${BOOST} map_test.cpp -o map_test
            strip map_test
            SIZE=$(echo "scale=1;$(stat -f "%z" map_test)/1024" | bc)
            echo boost=$BOOST opt=$OPT small=$SMALL size=${SIZE}K
        done
    done
done

It turns out, that with all settings I tried, lots of inner code of unordered_map seems to be instantiated twice:

With Clang and libc++:
          |   -O2   |   -O3   |   -Os
-DSMALL=0 |  24.7K  |  23.5K  |  28.2K
-DSMALL=1 |  17.9K  |  17.2K  |  19.8K


With Clang and Boost:
          |   -O2   |   -O3   |   -Os
-DSMALL=0 |  23.9K  |  23.9K  |  32.5K
-DSMALL=1 |  17.4K  |  17.4K  |  22.3K


With GCC and Boost:
          |   -O2   |   -O3   |   -Os
-DSMALL=0 |  21.8K  |  21.8K  |  35.5K
-DSMALL=1 |  16.4K  |  16.4K  |  26.2K

(With the compilers from Apple’s Xcode)

Now to the question: Is there some convincing technical reason due to which the implementers have chosen to omit this simple optimization?

Also: why the hell is the effect of -Os exactly the opposite of what is advertised?

Update 3:

As suggested by Nicol Bolas, I have repeated the measurements with shared_ptr<void/A/B> instead of naked pointers (created with make_shared and cast with static_pointer_cast). The tendency in the results is the same:

With Clang and libc++:
          |   -O2   |   -O3   |   -Os
-DSMALL=0 |  27.9K  |  26.7K  |  30.9K
-DSMALL=1 |  25.0K  |  20.3K  |  26.8K


With Clang and Boost:
          |   -O2   |   -O3   |   -Os
-DSMALL=0 |  35.3K  |  34.3K  |  43.1K
-DSMALL=1 |  27.8K  |  26.8K  |  32.6K
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-07T01:45:12+00:00Added an answer on June 7, 2026 at 1:45 am

    Since I’ve been specifically asked to comment, I will, though I’m not sure I have much more to add than has already been said. (sorry it took me 8 days to get here)

    I’ve implemented the thin template idiom before, for some containers, namely vector, deque and list. I don’t currently have it implemented for any container in libc++. And I’ve never implemented it for the unordered containers.

    It does save on code size. It also adds complexity, much more so than the referenced wikibooks link implies. One can also do it for more than just pointers. You can do it for all scalars which have the same size. For example why have different instantiations for int and unsigned? Even ptrdiff_t can be stored in the same instantiation as T*. After all, it is all just a bag bits at the bottom. But it is extremely tricky to get the member templates which take a range of iterators correct when playing these tricks.

    There are disadvantages though (besides difficulty of implementation). It doesn’t play nearly as nicely with the debugger. At the very least it makes it much more difficult for the debugger to display container innards. And while the code size savings can be significant, I would stop short of calling the code size savings dramatic. Especially when compared to the memory required to store the photographs, animations, audio clips, street maps, years of email with all of the attachments from your best friends and family, etc. I.e. optimizing code size is important. But you should take into account that in many apps today (even on embedded devices), if you cut your code size in half, you might cut your app size by 5% (statistics admittedly pulled from thin air).

    My current position is that this particular optimization is one best paid for and implemented in the linker instead of in the template container. Though I know this isn’t easy to implement in the linker, I have heard of successful implementations.

    That being said, I still do try to make code size optimizations in templates. For example in the libc++ helper structures such as __hash_map_node_destructor are templated on as few parameters as possible, so if any of their code gets outlined, it is more likely that one instantiation of the helper can serve more than one instantiation of unordered_map. This technique is debugger friendly, and not that hard to get right. And can even have some positive side effects for the client when applied to iterators (N2980).

    In summary, I wouldn’t hold it against code for going the extra mile and implementing this optimization. But I also wouldn’t classify it as high a priority as I did a decade ago, both because linker technology has progressed, and the ratio of code size to application size has tended to fairly dramatically decrease.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I wondered if anybody can come up with a shortened version for this code:
I wondered if/how I can override the default [Authorize] behavior in ASP.NET MVC. I
I wondered if there is a programming language which compiles to machine code/binary that
I've always wondered, lets say you have two asp.net websites running in the same
I wondered if I should write my code clean and readable or rather small
Just wondered if someone could clarify the use of BeginInvoke on an instance of
I have always wondered when to use identifiers (for example, functions) with capital first
wondered if you can help me with a php/query question. I want to format
I wondered if anyone can help me out - I'm attempting to load a
Just wondered if someone could point me some simple code which shows multi threaded

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.