Based on the experiments I've done, I think the answer is no. However, I'm not sure I was doing things correctly.
My query is:
select buyer_key,
       DBMS_UTILITY.get_hash_value(
         buyer_key||'|'||buyer_entity_id||'|'||buyer_io_id||'|'||buyer_line_item_id||'|'||is_billing_enabled||'|'||currency_id_b_trgt||'|'||currency_id_b_prfrd||'|'||ymdh_max,
         1,
         POWER(2,16)-1) as hashvalue
from network_buyer_dim
order by hashvalue asc;
When I run it, it returns numerous rows with duplicate hashvalue values. But when I go to the database and look at those rows (BTW, each buyer_key is unique), I see that the rows DO NOT contain the same values.
Am I calling the function correctly?
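Obviously not! A hash function does not guarantee unique values. It maps a large set of possible inputs onto a fixed, smaller set of output values (here the values 1 to 65535, since the hash size is POWER(2,16)-1).
This means that if the input domain is larger than the output domain, there must be duplicates. Different rows getting the same hash value is expected behaviour, not a sign that the rows are equal.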
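To put rough numbers on this, here is a back-of-the-envelope estimate. It assumes the hash spreads inputs uniformly and independently over the output values, which is an idealisation and not something documented for DBMS_UTILITY.get_hash_value. The symbols $n$ (number of distinct rows) and $m$ (number of possible hash values) are introduced here only for illustration.

```latex
% Pigeonhole: more rows than hash values forces a collision.
\[
n > m \;\Longrightarrow\; \text{at least two rows share a hash value}
\]
% Birthday estimate under the uniformity assumption.
\[
\mathbb{E}[\text{colliding pairs}] = \binom{n}{2}\frac{1}{m} = \frac{n(n-1)}{2m},
\qquad
P(\text{at least one collision}) \approx 1 - e^{-n(n-1)/(2m)}
\]
% With m = 2^{16} - 1 = 65535, the probability passes 1/2 near
\[
n \approx 1.18\sqrt{m} \approx 300 .
\]
```

So under that assumption, even a table with a few hundred rows is more likely than not to show at least one duplicate hash value when the hash size is 65535.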
In addition, the best hash functions are generally considered to be the ones that spread collisions evenly, i.e. each possible output value is produced by roughly the same number of possible input values.
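If you want to see how many collisions you actually have, something along these lines should work. It is only a sketch: it reuses the table and column names from the question and has not been run against your schema.

```sql
-- Sketch: list the hash values that are shared by more than one row.
-- Table and column names are taken from the question, untested against the real schema.
SELECT hashvalue,
       COUNT(*) AS rows_sharing_value
FROM (
  SELECT buyer_key,
         DBMS_UTILITY.get_hash_value(
           buyer_key||'|'||buyer_entity_id||'|'||buyer_io_id||'|'||
           buyer_line_item_id||'|'||is_billing_enabled||'|'||
           currency_id_b_trgt||'|'||currency_id_b_prfrd||'|'||ymdh_max,
           1,
           POWER(2,16)-1) AS hashvalue
  FROM network_buyer_dim
)
GROUP BY hashvalue
HAVING COUNT(*) > 1          -- keep only hash values that collide
ORDER BY rows_sharing_value DESC;

-- Possible alternative (an assumption about what fits your case):
-- ORA_HASH defaults to a much larger bucket range (0 to 4294967295),
-- which makes collisions rarer but still does not rule them out.
SELECT buyer_key,
       ORA_HASH(buyer_key||'|'||buyer_entity_id||'|'||buyer_io_id) AS hashvalue
FROM network_buyer_dim;
```

The second query shortens the column list just to keep the example readable. Whichever function you use, treat the hash as a quick way to spot rows that have probably changed, and compare the real column values whenever you need a guarantee.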