Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 700719
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 14, 20262026-05-14T03:31:46+00:00 2026-05-14T03:31:46+00:00

I’m looking for an easily maintainable and extendable design model for a script to

  • 0

I’m looking for an easily maintainable and extendable design model for a script to parse an excel workbook into two separate workbooks after pulling data from other locations like the command line, and a database. The high level details are as follows.

I need to parse an excel workbook containing a sheet that lists unique question names, the only reliable information that can be parsed from the question name is the book code that identifies the title and edition of the textbook the question is associated with, the rest of the question name is not standardized well enough to be reliably parsed by computer. The general form of the question name is best described by the following regular expression.

'^(\w+)\s(\w{1,2})\.(\w{1,2})\.(\w{1,3})\.(\w{1,3}\.)*$'

The first sub-pattern is the book code, the second sub-pattern is 90% of the time the chapter, and the rest of the sub-patterns could be section, problem type, problem number, or question type information. There is no simple logic, at least not one I can find.

There will be a minimum of three other columns in this spreadsheet; one column will be the chapter the question is associated with, the second will be the section within the chapter the question is associated with, and the third will be some kind of asset indicated by a uniform resource locator.

  • 1 | 1 | qname1 | url | description | url | description …
  • 1 | 1 | qname2 | url | description
  • 1 | 1 | qname3 | url | description | url | description | url |

The asset can be indicated by a full or partial uniform resource locator, the partial url will need to be completed before it can be fed into the application. There theoretically could be no limit to the number of asset columns, the assets will be grouped in columns by type. Some times additional data will have to be retrieved from a database or combined with the book code before the asset url is complete and can be understood by the application that will be using the asset.

The type is an abstraction, there are eight types right now, each with their own logic in how the uniform resource locator is handled and or completed, and I have to add a new type and its logic every three or four months.

For each asset url there is the possibility of a description column, a character string for display in the application, but not always. (I’ve already worked out validating the description text, and squashing MSs obscure code page down to something 7-bit ascii can handle.)

Now that all the details are filled-in I can get to the actual problem of parsing the file. I need to split the information in this excel workbook into two separate workbooks. The first workbook will group all the questions by section in rows. With the first cell being the section doublet and the rest of the cells in the row are the question names.

  • 1.1 | qname1 | qname2 | qname3 | qname4 |
  • 1.2 | qname1 | qname2 | qname3 |
  • 1.3 | qname1 | qname2 | qname3 | qname4 | qname5

There is no set number of questions for each section as you can see from the above example.

The second workbook is more complicated, there is one row per asset, and question names that have more than one asset will be duplicated. There will be four or five columns on this sheet. The first is the question name for the asset, the second is a media type used to select the correct icon for the asset in the application, the third is string representing the asset type, the four is the full and complete uniform resource locator for the asset, and the fifth columns is the optional text description for the asset.

  • q1 | mtype1 | atype1 | url | description
  • q1 | mtype2 | atype2 | url | description
  • q1 | mtype2 | atype3 | url | description
  • q2 | mtype1 | atype1 | url | description
  • q2 | mtype2 | atype3 | url | description

For the original six types I did have a script that parsed the source excel workbook into the other two excel workbooks, and I was able to add two more types until I ran aground on the implementation of the ninth type and tenth types.

What broke my script was the fact that the ninth type is actually a sub-type of one of the original six, but with entirely different logic, and my mostly procedural script could not accommodate without duplicating a lot of code. I also had a lot of bugs in the script and will be writing the test first on this time around.

I’m stuck with the format for the resulting two workbooks, this script is glue code, development went ahead with the project without bothering to get a complete spec from the sponsor. I work for the same company as the developers but in the editorial department, editorial is co-sponsor of the project, and am expected to fix pesky details like this (I’m foaming at the mouth as I type this).

I’ve tried factories, I’ve tried different object models, but each resulting workbook is so different when I find a design that works for generating one workbook the code is not really usable for generating the other. What I would really like are ideas about a maintainable and extensible design for parsing the source workbook into both workbooks with maximum code reuse, and or sympathy.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-14T03:31:47+00:00Added an answer on May 14, 2026 at 3:31 am

    Not sure if I can help, but at the least, sympathy for you I do have 🙂

    Have you tried using Strategies? If haven’t check out the link, there is even a simple Python example. If your types differ only in the way they handle the URLs, you could encapsulate the different logics into strategy subclasses. In the worst case, there may be duplicated logic between some subclasses, but at least the rest of your app can be happily oblivious about it, and adding new types should be simple. But you might even be able to reuse part of the duplicated logic by e.g. parameterizing strategies differently… I am entirely speculating here, without knowing the concrete details of your problem.

    Hope this helps…

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Ask A Question

Stats

  • Questions 419k
  • Answers 419k
  • Best Answers 0
  • User 1
  • Popular
  • Answers
  • Editorial Team

    How to approach applying for a job at a company ...

    • 7 Answers
  • Editorial Team

    What is a programmer’s life like?

    • 5 Answers
  • Editorial Team

    How to handle personal stress caused by utterly incompetent and ...

    • 5 Answers
  • Editorial Team
    Editorial Team added an answer Create a subclass that has a single property LogLevel and… May 15, 2026 at 10:11 am
  • Editorial Team
    Editorial Team added an answer Thanks Thomas! For anyone else who might want to get… May 15, 2026 at 10:11 am
  • Editorial Team
    Editorial Team added an answer You can use a DOMParser, like so: var xmlString =… May 15, 2026 at 10:11 am

Trending Tags

analytics british company computer developers django employee employer english facebook french google interview javascript language life php programmer programs salary

Top Members

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.