Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8033909
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 5, 20262026-06-05T01:51:04+00:00 2026-06-05T01:51:04+00:00

Junior python programmer here and I’ve been beating my head against a brick wall

  • 0

Junior python programmer here and I’ve been beating my head against a brick wall on unexpected for loop and dictionary behavior. I’m looping through a CSV file of log entries and parsing the data into a categories dict. When I initialize the categories dict each time through the loop, it works as expected..

Like so:

log_entries = AutoVivification()
# http://stackoverflow.com/questions/635483/what-is-the-best-way-to-implement-nested-dictionaries-in-python

def scrublooper(log_file):

    for ll in log_file:
    # Initialize  categories dict every round through the loop
    categories = {'requests': {'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 0, 'Pages': 0, 'Content_Files': 0}, 'filter_action': {'re': 0, 'pl': 0, 'bs': 0}}
    lld = LogDomain(ll)
    domain, hostname, lan_host = lld.domain, lld.hostname, lld.lan_host


    mimetypes = url_searcher(Settings.mimetypes, lld.mime_type)

    if mimetypes:
        category = mimetypes[2]

        if not log_entries[lan_host].has_key(domain): 
            log_entries[lan_host][domain]= categories

        log_entries[lan_host][domain]['requests'][category] += 1 

print log_entries['192.168.5.210']['google.com']['requests']
print log_entries['192.168.5.210']['webtrendslive.com']['requests']
print log_entries['192.168.5.210']['osnews.com']['requests']
print log_entries['192.168.5.210']['question-defense.com']['requests']
print log_entries['192.168.5.210']['optimost.com']['requests']

Output from this look is what I would expect:

{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 95, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 1, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 2, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 18, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 3, 'Pages': 0, 'Content_Files': 0}

HOWEVER! Here is my problem. I don’t want to initialize the categories dict every time through the loop. In this simplified example case it doesn’t matter, but down the road for this program, it’ll cause significant performance degradation (30%).

I need to initialize the categories dict ONCE:

log_entries = AutoVivification()
categories = {'requests': {'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 0, 'Pages': 0, 'Content_Files': 0}, 'filter_action': {'re': 0, 'pl': 0, 'bs': 0}}

def scrublooper(log_file):

    for ll in log_file:
    lld = LogDomain(ll)
    # etc, etc, etc

However, when I initialize the categories dict ANYWHERE outside the for loop (whether in the scrublooper function or simply right after the log_entries variable), the output is:

{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 685, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 685, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 685, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 685, 'Pages': 0, 'Content_Files': 0}
{'Content_Visual': 0, 'Content_ProgramsUpdates': 0, 'Content_Text': 685, 'Pages': 0, 'Content_Files': 0}

All ‘Conent_Text’ values have incremented equally! What is happening here? I’m sure I’ve violating some python principle but don’t know what or how to find out. It took me hours simply to figure out the problem was connected to the categories dict.

Much obliged for any explanation.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-05T01:51:05+00:00Added an answer on June 5, 2026 at 1:51 am

    I’m not familiar with the tools you’re using, but when you create the dictionary outside of the loop, you’re just creating one dictionary.

    if not log_entries[lan_host].has_key(domain): 
            log_entries[lan_host][domain]= categories
    

    This code just makes log_entries[lan_host][domain] point to that single dictionary. Python doesn’t copy the values or anything like that. So these lines refer to the same dictionary.

    log_entries['192.168.5.210']['google.com']
    log_entries['192.168.5.210']['webtrendslive.com']
    

    P.S. I can’t say for sure, but my gut says that not wanting to initialize a new dictionary for performance is probably excessive.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

A junior programmer in our office has an unfortunate (but understandable) habit of using
I'm a junior VB.net developer with little application design knowledge. I've been reading a
It's me your junior programmer extraodinaire I'm trying to display an image from my
I'm been programming with Python in AppEngine for the last months, and I need
My first question on here so be nice! I am a junior developer with
I'm a junior programmer, i do not get the concept of MVC! My method
So lately I've been catching a lot of crap from a junior developer whenever
Let me set the stage here. I'm a very junior developer who's recently made
I'm using Windows, and trying to install html5lib-0.90 library on python C:\>python C:\Users\Junior\Downloads\Python\html5lib-0.90\setup.py install
We have a junior programmer that simply doesn't write enough tests. I have to

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.