This is the script that has very kindly been given to me as a

Question


Asked: June 3, 2026 (2026-06-03T05:48:15+00:00)


This is the script that has very kindly been given to me as a starter:

#!/usr/bin/python
# -*- coding: utf-8 -*-
from __future__ import with_statement    # needed for Python 2.5
from itertools import chain

def chunk(s):
    """Split a string on whitespace or hyphens"""
    return chain(*(c.split("-") for c in s.split()))

def process(latin, gloss, trans):
    chunks = zip(chunk(latin), chunk(gloss))
    # now you have to DO SOMETHING with the chunks!

def main():
    with open("examples.txt") as inf:
        try:
            while True:
                latin = inf.next().strip()
                gloss = inf.next().strip()
                trans = inf.next().strip()
                process(latin, gloss, trans)
                inf.next()    # skip blank line
        except StopIteration:
            # reached end of file
            pass

if __name__=="__main__":
    main()

I’m not sure if I’m missing anything, but the script prints nothing at all; it simply returns me to the shell prompt ($).

I’m trying to do the following:
I have a text with a language other than English, broken up into morphemes (parts of each word) using hyphens, with the English gloss (linguistic translation of each morpheme) and a direct translation below.

e.g.

Itali-am fat-o profug-us Lavini-a-que ven-it

Italy-Fem:Sg:Acc fate-Neut:Sg:Abl fleeing-Masc:Sg:Nom Lavinian-Neut:Pl:Acc come:Perf-3-Sg:Indic:Act

‘in flight [driven] by fate came to Italy and the Lavinian [shores]’

I’ll have several texts such as the above in one file – i.e.

blank line
a line of Latin broken up with hyphens
a line of gloss broken up with corresponding hyphens, using colons to join elements
a line of translation
blank line
Latin
gloss
translation

ad infinitum.

What I need to do is write a file that gives me the following output:

Itali:     1    Italy
am:        1    Fem:Sg:Acc
fat:       1    fate
o:         1    Neut:Sg:Abl
profug:    1    fleeing
us:        1    Masc:Sg:Nom
Lavini:    1    Lavinian
a:         1    Neut:Pl:Acc
que:       1    come:Perf
ven:       1    3
it:        1    Sg:Indic:Act

where the first column represents the first line of text without hyphens; the second column indicates the number of occurrences (it’s only 1 each in this example), and the third column is the English translation of the first column, as written in the text.

If there’s a Latin morpheme with no corresponding English gloss/translation, the Latin column will be as normal but the English column will print [unknown], like:

a:  1   [unknown]

In the opposite case, i.e. an English gloss with no corresponding Latin morpheme, it should print:

[unknown]:  1   kitten

Finally, the program needs to be able to deal with homophonous morphemes (i.e. two identically spelled Latin morphemes with different meanings), e.g.

a:  16  Neuter:Plural
a:  28  Feminine:Singular

Again, it’s homework, and any pointers would be wonderful. Working on putting together some script now to upload here for critique!


1 Answer

Editorial Team · Answer 1 · 2026-06-03T05:48:16+00:00

Processing the file is a bit tricky because of the multiline structure; rather than the usual line-by-line iteration, I suggest something like this (I presume the file does not actually begin with a blank line, per your example, but that it uses blank lines as separators):

with open("input.txt") as inf:
    try:
        while True:
            latin = inf.next().strip()
            gloss = inf.next().strip()
            trans = inf.next().strip()
            process(latin, gloss, trans)
            inf.next()    # skip blank line
    except StopIteration:
        # reached end of file
        pass
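If the file does start with a blank line, as the question describes, the fixed four-line stride above would drift out of step. A minimal alternative sketch, assuming blank lines carry no meaning other than separating records, is to skip them wherever they occur and collect the non-blank lines three at a time. The name read_records is hypothetical, not part of the starter script:

```python
def read_records(lines):
    """Yield (latin, gloss, trans) tuples from an iterable of lines.

    Blank lines are treated purely as separators, so a leading blank
    line or several blank lines in a row are harmless.
    """
    record = []
    for line in lines:
        line = line.strip()
        if not line:
            continue              # separator line: ignore it
        record.append(line)
        if len(record) == 3:      # latin, gloss and translation collected
            yield tuple(record)
            record = []
    # Assumption: a trailing group of fewer than three lines is dropped.


# The example from the question, with a leading and a trailing blank line.
sample = [
    "\n",
    "Itali-am fat-o profug-us Lavini-a-que ven-it\n",
    "Italy-Fem:Sg:Acc fate-Neut:Sg:Abl fleeing-Masc:Sg:Nom "
    "Lavinian-Neut:Pl:Acc come:Perf-3-Sg:Indic:Act\n",
    "'in flight [driven] by fate came to Italy and the Lavinian [shores]'\n",
    "\n",
]
records = list(read_records(sample))
```

Because an open file is itself an iterable of lines, the same function could be fed the file object directly, e.g. for latin, gloss, trans in read_records(inf).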

process must then split latin and gloss into chunks and pair them appropriately:

from itertools import chain

def chunk(s):
    """Split a string on whitespace or hyphens"""
    return chain(*(c.split("-") for c in s.split()))

def process(latin, gloss, trans):
    chunks = zip(chunk(latin), chunk(gloss))

Calling this like

process(
    "Itali-am fat-o profug-us Lavini-a-que ven-it",
    "Italy-Fem:Sg:Acc fate-Neut:Sg:Abl fleeing-Masc:Sg:Nom Lavinian-Neut:Pl:Acc come:Perf-3-Sg:Indic:Act",
    "in flight [driven] by fate came to Italy and the Lavinian [shores]")

leaves chunks containing

[('Itali', 'Italy'),
 ('am', 'Fem:Sg:Acc'),
 ('fat', 'fate'),
 ('o', 'Neut:Sg:Abl'),
 ('profug', 'fleeing'),
 ('us', 'Masc:Sg:Nom'),
 ('Lavini', 'Lavinian'),
 ('a', 'Neut:Pl:Acc'),
 ('que', 'come:Perf'),
 ('ven', '3'),
 ('it', 'Sg:Indic:Act')]
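Note that 'que' ends up paired with 'come:Perf': the Latin word Lavini-a-que has three morphemes but its gloss has only two, so every later pair slides along by one. One way to get the [unknown] entries the question asks for is to pair word by word first and pad the shorter side within each word. This is only a sketch, and it assumes the words of the two lines correspond one-to-one; pair_chunks and UNKNOWN are hypothetical names:

```python
try:
    from itertools import zip_longest                    # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest    # Python 2.6+

UNKNOWN = "[unknown]"


def pair_chunks(latin, gloss):
    """Pair morphemes with glosses one word at a time.

    Within each word, whichever side has fewer hyphen-separated parts
    is padded with UNKNOWN, so a mismatch cannot shift later words.
    """
    pairs = []
    words = zip_longest(latin.split(), gloss.split(), fillvalue="")
    for lat_word, gl_word in words:
        # "".split("-") would give [""], so treat a missing word as no parts
        lat_parts = lat_word.split("-") if lat_word else []
        gl_parts = gl_word.split("-") if gl_word else []
        pairs.extend(zip_longest(lat_parts, gl_parts, fillvalue=UNKNOWN))
    return pairs


pairs = pair_chunks("Lavini-a-que ven-it",
                    "Lavinian-Neut:Pl:Acc come:Perf-3-Sg:Indic:Act")
# pairs is now:
# [('Lavini', 'Lavinian'), ('a', 'Neut:Pl:Acc'), ('que', '[unknown]'),
#  ('ven', 'come:Perf'), ('it', '3'), ('[unknown]', 'Sg:Indic:Act')]
```

Whether padding at the end of each word is the right alignment is a linguistic judgment; the code can only tell that the counts differ, not which morpheme is the unglossed one.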

The rest is an exercise for the student – keeping a running count of the chunks, then sorting and displaying it appropriately. Hope that helps!
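As for the blank output: the starter builds chunks inside process and then discards them, and nothing in the script ever prints, so a silent return to the prompt is exactly what it should do. A rough sketch of the missing step, assuming Python 2.7 or later (collections.Counter does not exist in 2.5), is to count each (morpheme, gloss) pair so that homophonous morphemes with different glosses stay on separate rows. The names tally and format_counts and the sample pairs are made up for illustration:

```python
from collections import Counter


def tally(pairs, counts=None):
    """Add (morpheme, gloss) pairs to a running Counter and return it."""
    if counts is None:
        counts = Counter()
    counts.update(pairs)          # each distinct pair is its own key
    return counts


def format_counts(counts):
    """Return one 'morpheme:<TAB>count<TAB>gloss' line per distinct pair."""
    lines = []
    for (morph, gloss), n in sorted(counts.items()):
        lines.append("%s:\t%d\t%s" % (morph, n, gloss))
    return lines


# Illustrative input only: 'a' appears with two different glosses.
counts = tally([
    ("a", "Neut:Pl:Acc"),
    ("a", "Fem:Sg:Acc"),
    ("a", "Neut:Pl:Acc"),
    ("fat", "fate"),
])
for line in format_counts(counts):
    print(line)
```

In the full script, process would pass its pairs to tally with a Counter shared across all records, and the printing would happen once after the whole file has been read.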
