Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6221035
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 24, 20262026-05-24T08:03:52+00:00 2026-05-24T08:03:52+00:00

When you join tables which are distributed on the same key and used these

  • 0

When you join tables which are distributed on the same key and used these key columns in the join condition, then each SPU (machine) in netezza works 100% independent of the other (see nz-interview).

In hive, there’s bucketed map join, but the distribution of the files representing the tables to datanode is the responsibility of HDFS, it’s not done according to hive CLUSTERED BY key!

so suppose I have 2 tables, CLUSTERED BY the same key, and I join by that key – can hive get a guarantee from HDFS that matching buckets will sit on the same node? or will it always have to move the matching bucket of the small table to the datanode containing the big table bucket?

Thanks, ido

(note: this is a better phrasing of my previous question: How does hive/hadoop assures that each mapper works on data that is local for it?)

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-24T08:03:53+00:00Added an answer on May 24, 2026 at 8:03 am

    I think it is not possible to tell to HDFS where to store blocks of data.
    I can consider the following trick, which will do for small clusters – to increase replication factor for one of the tables to the number close or equal to the number of nodes in the cluster.
    As a result – during join process appropriate data will be almost always (or always) present on the required node.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I want to join two tables (below) which i get. But then I want
I have two tables which I want to join together using a left outer
I have a MySQL Left Join problem. I have three tables which I'm trying
I want to join 2 tables 'addresses' and 'user_info' on user_id and app_id (which
A legacy database contains a join table which links tables table1 and table2, and
I'm porting a process which creates a MASSIVE CROSS JOIN of two tables. The
I have two tables which get updated at almost the exact same time -
I have 2 tables which doesn't have any references to each other and I'm
I have two tables which I would like to join by ID field. I
I want to LEFT JOIN two tables with the same column name. I have

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.