This relates to a project to convert a 2-way ANOVA program in SAS to

Question


Asked: May 16, 2026 (2026-05-16T16:13:45+00:00)


This relates to a project to convert a 2-way ANOVA program in SAS to Python.

I only started learning the language on Thursday, so I know I have a lot of room for improvement. If I’m missing something blatantly obvious, by all means, let me know. I don’t have Sage or numpy up and running yet, so for now this is all plain Python 2.6.1 (portable).

Primary query: I need a good set of list comprehensions that extract the data as lists of samples grouped by factor A, by factor B, overall, and by each combination of levels of factors A and B (AxB).

After some work, the data is in the following form (3 layers of nested lists):

response[a][b][n]

(meaning [a1 [b1 [n1, …, nN], …, bB [n1, …, nN]], …, aA [b1 [n1, …, nN], …, bB [n1, …, nN]]].
Hopefully that’s clear.)

Factor levels in my example case: A=3 (0-2), B=8 (0-7), N=8 (0-7)

byA= [[a[i] for i in range(b)] for a[b] in response]

(Can someone explain why this syntax works? I stumbled into it trying to see what the parser would accept. I haven’t seen that syntax attached to that behavior elsewhere, but it’s really nice. Any good links on sites or books on the topic would be appreciated. Edit: Persistence of variables between runs explained this oddity. It doesn’t work.)

byB=lstcrunch([[Bs[i] for i in range(len(Bs)) ]for Bs in response])

(It bears noting that zip(*response) almost does what I want. The above version isn’t actually working, as I recall. I haven’t run it through a careful test yet.)
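To make the "almost" concrete, here is a small sketch on a made-up 2x3x2 table (the values and the names grouped and by_b are illustrative only). zip(*response) regroups the cells by level of B, but each group is still a tuple holding one list per level of A, so one more flattening step is needed. The list() call is only there so the same lines also run on Python 3.

```python
# Hypothetical toy data: A=2, B=3, N=2, indexed as response[a][b][n]
response = [
    [[1, 2], [3, 4], [5, 6]],       # a = 0
    [[7, 8], [9, 10], [11, 12]],    # a = 1
]

# zip(*response) pairs up the b-th cell of every level of A.
# list() is needed on Python 3, where zip returns an iterator.
grouped = list(zip(*response))
# grouped[0] is ([1, 2], [7, 8]): still one list per level of A

# Flatten each group so every level of B becomes one list of samples.
by_b = [[x for cell in group for x in cell] for group in grouped]
```

The samples within each level of B come out ordered by level of A first, which should not matter for sums, means, or sums of squares.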

byAxB= [item for sublist in response for item in sublist]

(Stolen from a response by Alex Martelli on this site. Again could someone explain why? List comprehension syntax is not very well explained in the texts I’ve been reading.)

ByO= [item for sublist in byAxB for item in sublist]

(Obviously, I simply reused the former comprehension here, ’cause it did what I need. Edit:)

I’d like these to end up as the same datatype, at least when looped through by the factor in question, so that the same average/sum/SS/etc. functions can be applied to each of them.

This could easily be replaced by something cleaner:

def lstcrunch(Dlist):
    """Returns a list containing the entire
    contents of whatever is imported,
    reduced by one level.

    If a rectangular array, it reduces a dimension by one.
    lstcrunch(DataSet[a][b]) -> DataOutput[a]
    [[1, 2], [[2, 3], [2, 4]]] -> [1, 2, [2, 3], [2, 4]]
    """
    flat=[]
    if islist(Dlist):#1D top level list
        for i in Dlist:
            if islist(i):
                flat+= i
            else:
                flat.append(i)
        return flat
    else:
        return [Dlist]

Oh, while I’m on the topic, what’s the preferred way of identifying a variable as a list?
I have been using:

def islist(a):
    "Returns 'True' if input is a list and 'False' otherwise"
    return type(a)==type([])
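One commonly suggested alternative is the built-in isinstance, which also accepts subclasses of list, whereas comparing type() results does not. The sketch below is a possible rewrite of islist; the Samples subclass is hypothetical and exists only to show the difference.

```python
def islist(a):
    """Return True if a is a list or an instance of a list subclass."""
    return isinstance(a, list)


class Samples(list):
    """Hypothetical list subclass, for illustration only."""
    pass


plain_ok = islist([1, 2, 3])                      # a real list
tuple_ok = islist((1, 2, 3))                      # a tuple is not a list
subclass_ok = islist(Samples())                   # subclass still counts
type_check = type(Samples()) == type([])          # the stricter comparison
```

Whether subclasses should count is a design choice; if only exact lists are wanted, the type() comparison remains the stricter test.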

Parting query:
Is there a way to explicitly force a shallow copy to become a deep copy? Or, similarly, when assigning to a variable, is there a way of declaring that the assignment should replace the reference too, and not merely the value (so that the assignment won’t propagate to other shallow copies)? Using that propagation deliberately might also be useful from time to time, so being able to control when it does or doesn’t occur sounds really nice.
(I really stepped all over myself when I prepared my table for inserting by calling:
response=[[[0]*N]*B]*A
)
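A minimal sketch of what goes wrong with that call and two standard ways around it, using small made-up sizes. The * operator on a list repeats references to the same inner list, nested comprehensions build a fresh list at every level, and copy.deepcopy from the standard library copies an existing nested structure recursively.

```python
import copy

A, B, N = 2, 2, 3   # small hypothetical sizes

# Repeating a list with * copies references, not the inner lists.
aliased = [[[0] * N] * B] * A
aliased[0][0][0] = 99
# Every cell now shows 99 in position 0: all rows share one inner list.

# Nested comprehensions create a separate list at each level.
independent = [[[0] * N for _ in range(B)] for _ in range(A)]
independent[0][0][0] = 99
# Only the cell that was assigned to has changed.

# copy.deepcopy makes a fully independent copy of an existing structure.
clone = copy.deepcopy(independent)
clone[1][1][2] = -1
# independent is unaffected by the change to clone.
```

The innermost [0] * N is safe here because integers are immutable; the sharing problem only appears when the repeated element is itself a mutable object such as a list.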

Edit:
Further investigation led to most of this working fine. I’ve since made the class and tested it; it works fine. I’ll leave the list comprehension forms intact for reference.

def byB(array_a_b_c):
    y=range(len(array_a_b_c))
    x=range(len(array_a_b_c[0]))
    return [[array_a_b_c[i][j][k]
    for k in range(len(array_a_b_c[0][0]))
    for i in y]
    for j in x]


def byA(array_a_b_c):
    return [[repn for rowB in rowA for repn in rowB] 
    for rowA in array_a_b_c]

def byAxB(array_a_b_c):
    return [rowB for rowA in array_a_b_c 
    for rowB in rowA]

def byO(array_a_b_c):
    return [rep
    for rowA in array_a_b_c
    for rowB in rowA
    for rep in rowB]


def gen3d(row, col, inner):
    """Produces a 3d nested array without any naughty shallow copies.

    [row[col[inner]]] named s.t. the outer can be split on, per lprn for easy display"""
    # A fresh list is built at every level, so no two cells share storage.
    return [[[k for k in range(inner)]
             for i in range(col)]
            for j in range(row)]

def lprn(X):
    """This prints a list by lines.

    Not fancy, but works"""
    if isiterable(X):
        for line in X: print line
    else:
        print X

def isiterable(a):
    return hasattr(a, "__iter__")
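As a quick check that the four groupings can feed the same summary function, here is a sketch on a made-up 2x2x2 table. The comprehensions restate byA, byB, byAxB and byO inline so the block runs on its own; the mean helper and all variable names are illustrative, not part of the original code.

```python
# Toy data (made up): A=2, B=2, N=2, indexed as data[a][b][n]
data = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

# Same groupings as the functions above, restated so this runs alone.
by_a = [[rep for row_b in row_a for rep in row_b] for row_a in data]
by_axb = [row_b for row_a in data for row_b in row_a]
by_o = [rep for row_a in data for row_b in row_a for rep in row_b]
by_b = [[data[i][j][k]
         for k in range(len(data[0][0]))
         for i in range(len(data))]
        for j in range(len(data[0]))]


def mean(values):
    """Arithmetic mean of a flat list of numbers."""
    return sum(values) / float(len(values))


means_a = [mean(group) for group in by_a]      # one mean per level of A
means_b = [mean(group) for group in by_b]      # one mean per level of B
means_axb = [mean(group) for group in by_axb]  # one mean per A x B cell
grand_mean = mean(by_o)                        # overall mean
```

Because by_a, by_b and by_axb are all lists of flat lists, the same loop works for each; by_o is a single flat list and is passed to mean directly.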

Thanks to everyone who responded. Already see a noticeable improvement in code quality due to improvements in my gnosis. Further thoughts are still appreciated, of course.

1 Answer

Editorial Team · Answer 1 · 2026-05-16T16:13:46+00:00

byAxB= [item for sublist in response for item in sublist] Again could someone explain why?

I am sure A.M. will be able to give you a good explanation. Here is my stab at it while waiting for him to turn up.

I would approach this from left to right. Take these four words:

for sublist in response

I hope you can see the resemblance to a regular for loop. These four words are doing the ground work for performing some action on each sublist in response. It appears that response is a list of lists. In that case sublist would be a list for each iteration through response.

for item in sublist

This is again another for loop in the making. Given that we first heard about sublist in the previous “loop” this would indicate that we are now traversing through sublist, one item at a time. If I were to write these loops out without comprehensions it would look like this:

for sublist in response:
    for item in sublist:

Next, we look at the remaining pieces: [, item and ]. Together they mean: collect each item into a list and return the resulting list.

Whenever you have trouble creating or understanding list comprehensions, write the relevant for loops out and then compress them:

result = []

for sublist in response:
    for item in sublist:
        result.append(item)

This will compress to:

[
    item 
    for sublist in response
    for item in sublist
]
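To confirm that the two forms agree, here is the same pair run on a small made-up list of lists (the data and the name flat are illustrative). The for clauses in the comprehension appear in the same order as the nested loops, outermost first.

```python
response = [[1, 2], [3], [4, 5, 6]]   # made-up list of lists

# Explicit nested loops.
result = []
for sublist in response:
    for item in sublist:
        result.append(item)

# Equivalent comprehension: same clauses, same order.
flat = [item for sublist in response for item in sublist]
```

Both result and flat end up holding the items in their original left-to-right order, with one level of nesting removed.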

List comprehension syntax is not very well explained in the texts I’ve been reading

Dive Into Python has a section dedicated to list comprehensions. There is also this nice tutorial to read through.

Update

I forgot to say something. List comprehensions are another way of achieving what has been traditionally done using map and filter. It would be a good idea to understand how map and filter work if you want to improve your comprehension-fu.
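As a small illustration of that correspondence, the sketch below computes the squares of the even numbers both ways. The data and the helper names square and is_even are made up for the example; the list() wrapper is only needed on Python 3, where map and filter return iterators.

```python
numbers = [1, 2, 3, 4, 5, 6]   # made-up data


def square(x):
    return x * x


def is_even(x):
    return x % 2 == 0


# filter keeps the items that pass the test, map transforms each one.
with_map_filter = list(map(square, filter(is_even, numbers)))

# The comprehension puts the transform first and the test in an if clause.
with_comprehension = [square(x) for x in numbers if is_even(x)]
```

The expression before for plays the role of map, and the trailing if clause plays the role of filter.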
