I’m writing a module that involves parsing html for data and creating an object

Asked: May 24, 2026

I’m writing a module that parses HTML for data and creates an object from it. I want to build a set of test cases in which each case is an HTML file paired with a golden (expected) pickled object file.

As I make changes to the parser, I would like to run this test suite to ensure that each HTML page still parses to an object equal to its ‘golden’ file (essentially a regression suite).

I can see how to code this as a single test case that loads all file pairs from some directory and iterates through them. But I believe that would be reported as a single test case, pass or fail, whereas I want a report that says, for example, 45/47 pages parsed successfully.

How do I arrange this?


1 Answer

Editorial Team · Answer 1 · 2026-05-24T18:21:21+00:00

I’ve done similar things with the unittest framework by writing a function which creates and returns a test class. This function can then take in whatever parameters you want and customise the test class accordingly. You can also customise the __doc__ attribute of the test function(s) to get customised messages when running the tests.
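As a quick check of the docstring mechanism on its own, here is a minimal sketch (the empty test body is just a placeholder). unittest labels a test using shortDescription(), which returns the first line of the test method's docstring, so setting __doc__ inside the factory is enough to get a per-file message:

```python
import unittest


def make_test(filename):
    class TestClass(unittest.TestCase):
        def test_file(self):
            pass  # placeholder body; the real check would go here

        # The docstring is set per generated class, using the file name.
        test_file.__doc__ = 'Test parsing of %s' % filename

    return TestClass


# Instantiate the generated class for its single test method.
case = make_test('file1.html')('test_file')
print(case.shortDescription())  # Test parsing of file1.html
```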

I quickly knocked up the following example code to illustrate this. Instead of doing any actual testing, it uses the random module to fail some tests for demonstration purposes. When created, the classes are inserted into the global namespace so that a call to unittest.main() will pick them up. Depending on how you run your tests, you may wish to do something different with the generated classes.

import os
import random
import unittest

# Generate a test class for an individual file.
def make_test(filename):
    class TestClass(unittest.TestCase):
        def test_file(self):
            # Do the actual testing here.
            # parsed = do_my_parsing(filename)
            # golden = load_golden(filename)
            # self.assertEqual(parsed, golden, 'Parsing failed.')

            # Randomly fail some tests (about 1 run in 11) for demonstration.
            if not random.randint(0, 10):
                self.assertEqual(0, 1, 'Parsing failed.')

        # Set the docstring so we get nice test messages.
        test_file.__doc__ = 'Test parsing of %s' % filename

    return TestClass

# Create a single file test.
Test1 = make_test('file1.html')

# Create several tests from a list.
for i in range(2, 5):
    globals()['Test%d' % i] = make_test('file%d.html' % i)

# Create them from a directory listing.
for dirname, subdirs, filenames in os.walk('tests'):
    for f in filenames:
        globals()['Test%s' % f] = make_test(os.path.join(dirname, f))

# If this file is being run, run all the tests.
if __name__ == '__main__':
    unittest.main()

A sample run:

$ python tests.py -v
Test parsing of file1.html ... ok
Test parsing of file2.html ... ok
Test parsing of file3.html ... ok
Test parsing of file4.html ... ok
Test parsing of tests/file5.html ... ok
Test parsing of tests/file6.html ... FAIL
Test parsing of tests/file7.html ... ok
Test parsing of tests/file8.html ... ok

======================================================================
FAIL: Test parsing of tests/file6.html
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests.py", line 16, in test_file
    self.assertEqual(0, 1, 'Parsing failed.')
AssertionError: Parsing failed.

----------------------------------------------------------------------
Ran 8 tests in 0.004s

FAILED (failures=1)
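
If you want the "45/47 pages parsed successfully" style of summary from the question, one option is to collect the generated tests into a suite yourself and read the counts off the result object. The following is only a sketch under some assumptions of mine: parse_html is a hypothetical stand-in for your parser, each golden file is assumed to sit next to its page as <name>.pickle, and the demo files at the bottom are made up so that the script runs on its own.

```python
import io
import os
import pickle
import tempfile
import unittest


def parse_html(path):
    # Hypothetical stand-in for the real parser: wraps the stripped file text.
    with open(path) as fh:
        return {'text': fh.read().strip()}


def make_test(html_path, golden_path):
    class GoldenTest(unittest.TestCase):
        def test_file(self):
            with open(golden_path, 'rb') as fh:
                golden = pickle.load(fh)
            self.assertEqual(parse_html(html_path), golden, 'Parsing failed.')

        test_file.__doc__ = 'Test parsing of %s' % html_path

    return GoldenTest


def build_suite(directory):
    # One generated test per .html file that has a matching .pickle file
    # (the naming scheme is an assumption, adjust it to your layout).
    suite = unittest.TestSuite()
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        golden_path = os.path.join(directory, base + '.pickle')
        if ext == '.html' and os.path.exists(golden_path):
            cls = make_test(os.path.join(directory, name), golden_path)
            suite.addTest(cls('test_file'))
    return suite


def run_and_summarise(directory):
    # Send the normal runner output to a buffer; only the counts are used here.
    result = unittest.TextTestRunner(stream=io.StringIO()).run(build_suite(directory))
    failed = len(result.failures) + len(result.errors)
    return result.testsRun - failed, result.testsRun


# Demo data: three made-up pages, the third with a deliberately stale golden file.
with tempfile.TemporaryDirectory() as demo_dir:
    pages = [('page1', '<p>one</p>', '<p>one</p>'),
             ('page2', '<p>two</p>', '<p>two</p>'),
             ('page3', '<p>three</p>', 'stale')]
    for base, html, golden_text in pages:
        with open(os.path.join(demo_dir, base + '.html'), 'w') as fh:
            fh.write(html + '\n')
        with open(os.path.join(demo_dir, base + '.pickle'), 'wb') as fh:
            pickle.dump({'text': golden_text}, fh)

    passed, total = run_and_summarise(demo_dir)

print('%d/%d pages parsed successfully' % (passed, total))  # 2/3 pages parsed successfully
```

Because each page is still its own test, you can pass sys.stderr as the stream instead of the buffer to keep the per-file ok/FAIL lines and tracebacks alongside the summary.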
