I am making a Python script that parses an Excel file using the xlrd

Question

0

Asked: June 15, 20262026-06-15T23:52:17+00:00 2026-06-15T23:52:17+00:00

I am making a Python script that parses an Excel file using the xlrd

0

I am making a Python script that parses an Excel file using the xlrd library.
What I would like is to do calculations on different columns if the cells contain a certain value. Otherwise, skip those values. Then store the output in a dictionary.
Here’s what I tried to do :

import xlrd


workbook = xlrd.open_workbook('filter_data.xlsx')
worksheet = workbook.sheet_by_name('filter_data')

num_rows = worksheet.nrows -1
num_cells = worksheet.ncols - 1

first_col = 0
scnd_col = 1
third_col = 2

# Read Data into double level dictionary
celldict = dict()
for curr_row in range(num_rows)  :

    cell0_val = int(worksheet.cell_value(curr_row+1,first_col))
    cell1_val = worksheet.cell_value(curr_row,scnd_col)
    cell2_val = worksheet.cell_value(curr_row,third_col)

    if cell1_val[:3] == 'BL1' :
        if cell2_val=='toSkip' :
        continue
    elif cell1_val[:3] == 'OUT' :
        if cell2_val == 'toSkip' :
        continue
    if not cell0_val in celldict :
        celldict[cell0_val] = dict()
# if the entry isn't in the second level dictionary then add it, with count 1
    if not cell1_val in celldict[cell0_val] :
        celldict[cell0_val][cell1_val] = 1
        # Otherwise increase the count
    else :
        celldict[cell0_val][cell1_val] += 1

So here as you can see, I count the number of “cell1_val” values for each “cell0_val”. But I would like to skip those values which have “toSkip” in the adjacent column’s cell before doing the sum and storing it in the dict.
I am doing something wrong here, and I feel like the solution is much more simple.
Any help would be appreciated. Thanks.

Here’s an example of my sheet :

cell0 cell1  cell2
12    BL1    toSkip
12    BL1    doNotSkip
12    OUT3   doNotSkip
12    OUT3   toSkip
13    BL1    doNotSkip
13    BL1    toSkip
13    OUT3   doNotSkip

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-15T23:52:18+00:00

Use collections.defaultdict with collections.Counter for your nested dictionary.

Here it is in action:

>>> from collections import defaultdict, Counter
>>> d = defaultdict(Counter)
>>> d['red']['blue'] += 1
>>> d['green']['brown'] += 1
>>> d['red']['blue'] += 1
>>> pprint.pprint(d)
{'green': Counter({'brown': 1}),
 'red': Counter({'blue': 2})}

Here it is integrated into your code:

from collections import defaultdict, Counter
import xlrd

workbook = xlrd.open_workbook('filter_data.xlsx')
worksheet = workbook.sheet_by_name('filter_data')

first_col = 0
scnd_col = 1
third_col = 2

celldict = defaultdict(Counter)
for curr_row in range(1, worksheet.nrows): # start at 1 skips header row

    cell0_val = int(worksheet.cell_value(curr_row, first_col))
    cell1_val = worksheet.cell_value(curr_row, scnd_col)
    cell2_val = worksheet.cell_value(curr_row, third_col)

    if cell2_val == 'toSkip' and cell1_val[:3] in ('BL1', 'OUT'):
        continue

    celldict[cell0_val][cell1_val] += 1

I also combined your if-statments and changed the calculation of curr_row to be simpler.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I am making a Python script that parses an Excel file using the xlrd

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply