Regression Testing

Before any changes are committed to trunk or vivified to the live site, developers must verify the correctness of the system by running regression tests. The tests consist of saved choices files, test suites, and gold-standard parsing profiles. The testing framework uses the customization system to create a grammar from the choices file, then uses [incr tsdb()] and the LKB to parse the test suite with that grammar and verify that the results match those in the gold-standard profile.

See below for information on how to run, update, add, or maintain regression tests, as well as a description of the directory structure of the regression test framework.

Running Regression Tests

Before you can run the regression tests, you must make sure the environment variable CUSTOMIZATIONROOT is set.
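A quick pre-flight check can catch an unset variable before a long test run. The following is only a minimal sketch: the helper name customization_root is hypothetical and not part of matrix.py, and the path in the usage lines is a placeholder borrowed from the Lisp example on this page, not a confirmed value for the variable.

```python
import os


def customization_root(environ=os.environ):
    """Return the value of CUSTOMIZATIONROOT, or None if it is unset or blank.

    Hypothetical helper for illustration; matrix.py does its own checking.
    """
    value = environ.get("CUSTOMIZATIONROOT", "").strip()
    return value or None


# Usage with an explicit mapping (placeholder path, not a required location).
example_env = {"CUSTOMIZATIONROOT": "~/matrix/gmcs/"}
print(customization_root(example_env))
print(customization_root({}))
```

Called with no argument, the helper reads the real environment of the current process.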

To run all tests, run the "regression-test" (short form: "r") command of matrix.py with no arguments:

python matrix.py regression-test

To run a single test, run the "regression-test" command with the test name as the argument:

python matrix.py regression-test infl-q-aux-verb

You may also run a single test from a Lisp buffer:

:pa lkb
(setf *customization-root* "~/matrix/gmcs/")
(load "~/matrix/lisp/matrix-regression-tests.lsp")

(run-matrix-regression-test "lg-name")

For each test, a result will be printed to STDOUT. A correct grammar will result in a "Success!" message, while a faulty one will result in a "DIFFS!" message. If you ran a single test, the items causing error messages will also be printed to STDOUT; when running all tests, they are suppressed. The results will also be written to a log file.

After running the tests, check the logs for differences, and either fix the regressions or update any progressions.
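When many tests are run at once, it can help to separate the passing results from the differing ones. The sketch below assumes only that each result line contains either "Success!" or "DIFFS!", as described above. The layout of the sample lines is made up for illustration and may not match the real output, and summarize_results is a hypothetical name.

```python
def summarize_results(output):
    """Split captured test output into passing and differing result lines.

    Assumption: a result line contains "Success!" or "DIFFS!" somewhere;
    nothing else about the line format is relied on.
    """
    passed, diffs = [], []
    for line in output.splitlines():
        if "DIFFS!" in line:
            diffs.append(line.strip())
        elif "Success!" in line:
            passed.append(line.strip())
    return passed, diffs


# Made-up sample output; the real line layout may differ.
sample = "infl-q-aux-verb ... Success!\nsome-other-test ... DIFFS!\n"
passed, diffs = summarize_results(sample)
print(f"{len(passed)} passed, {len(diffs)} with diffs")
```

Any test that lands in the second list is a candidate for either a fix or a gold-standard update.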

Checking Logs

The results of a run of the regression tests are saved in several log files. These files are:

    gmcs/
        regression_tests/
            logs/
                regression-tests.date
                test-name.date
                tsdb.date

Note: The date is granular only to the day, so multiple runs on the same day are written to the same file.
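Because several runs can share one dated file, the modification time is a more reliable guide than the file name to which logs the latest run touched. The following is a minimal sketch that makes no assumption about the date format in the file names. The function name newest_logs is hypothetical.

```python
import os


def newest_logs(log_dir, limit=3):
    """Return up to `limit` file names in log_dir, most recently modified first."""
    paths = [os.path.join(log_dir, name) for name in os.listdir(log_dir)]
    files = [p for p in paths if os.path.isfile(p)]
    files.sort(key=os.path.getmtime, reverse=True)
    return [os.path.basename(p) for p in files[:limit]]
```

For example, newest_logs("gmcs/regression_tests/logs") would list the three files written to most recently, assuming the command is run from the directory that contains gmcs/.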

Updating Regression Tests

Sometimes, changes to the customization system yield grammars that perform better than the gold standard. Such changes produce a "DIFFS!" result, but this is not necessarily a bad thing. After opening the current and gold profiles in [incr tsdb()] and verifying (by hand) that the current results are more desirable, you can update the gold standard to use the current results. This is done with the "regression-test-update" (short form: "ru") command of matrix.py, with the name of the test as its argument:

python matrix.py ru test-to-update

Be sure to run svn commit afterwards so that any new gold standards are committed.

Adding Regression Tests

:pa lkb
(setf *customization-root* "~/matrix/gmcs/")
(load "~/matrix/lisp/matrix-regression-tests.lsp")

(create-matrix-regression-test "choices-file" "txt-suite")

(clean-up-regression-test "lg-name")

python matrix.py regression-test-add choices-file txt-suite

svn commit

Maintaining Regression Tests

DO avoid editing regression-test-index directly or changing the contents of choices/, txt-suites/, skeletons/, or home/gold by hand. The code is probably fairly brittle with respect to files not being where it expects them.

Otherwise, there is no need to do anything here: we want the choices files to stay in old versions so that we are routinely testing the up-rev code.

Directory Structure

regression_tests/               
        add_regression_test.py
        call-customize
        run_regression_tests.sh
        regression-test-index
        regressiontestindex.py
        update-gold-standard.sh
        choices/
        txt-suites/
        skeletons/      [tsdb skeletons]
                Index.lisp
        home/           [tsdb home]
                gold/
                current/
        grammars/
        logs/
        scratch/
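Since the scripts are sensitive to files not being where they expect them, a quick layout check before a run may save some debugging. The sketch below looks only for the four hand-off-limits directories named in the maintenance advice (choices/, txt-suites/, skeletons/, home/gold). The others are left out on the assumption that they are created by the scripts or are local only. The function name missing_dirs is hypothetical.

```python
import os

# Directories from the listing above that the scripts expect to find in place.
EXPECTED_DIRS = ["choices", "txt-suites", "skeletons", "home/gold"]


def missing_dirs(root):
    """Return the expected subdirectories that do not exist under root."""
    return [
        d for d in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(root, *d.split("/")))
    ]
```

An empty list means all four directories are present under the given regression_tests/ root.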

Notes

Everything in the home/current/, grammars/, and logs/ subdirectories is placed there and named by the scripts.

The language name in the choices file is used as the basis for the naming. We need a convention for these names :)

There is a one-to-one mapping between choices files, txt-suites, and skeletons. We might end up with multiple txt-suites for the same "language", but each would still require its own choices file, with the same choices except for the language name.

Scratch is the default place for putting choices files and txt-suites to play with and then eventually add to the system. The contents of scratch should be local only, not under svn.

After creating a regression test, be sure to close [incr tsdb()] before running regression tests. Ongoing processes in [incr tsdb()] can block actions needed during the regression tests.

MatrixRegressionTesting (last edited 2011-10-08 21:12:08 by localhost)
