Regression Test Manager

The test manager for PFLOTRAN is a python program that is responsible for reading a configuration file, identifying the tests declared in the file, running PFLOTRAN on the appropriate input files, and then comparing the results to a known gold standard output file.

Running the Test Manager

The test manager can be run in two ways, either as part of the build system using make or manually.

There are two options for calling the test manager through make: make check and make test. The check target runs a small set of tests that verify that PFLOTRAN is built and running on a given system. This would be run by user to verify that their installation of PFLOTRAN is working. The test target runs a fuller set of regression tests intended to identify when changes to the code cause significant changes to PFLOTRAN’s results.

$ cd $PFLOTRAN_DIR/regression_tests
$ make check

or

$ cd $PFLOTRAN_DIR/regression_tests
$ make test

When finished, it is useful to remove all of the outfiles generated by running the regression tests with the command make clean-tests:

$ cd $PFLOTRAN_DIR/regression_tests
$ make clean-tests

Calling the test manager through make relies on make variables from PETSc to determine the correct version of python to use, if PFLOTRAN was build with MPI, and optional configurations such as unstructured meshes. The version of python used to call the test manager can be changed from the command line by specifying python:

$ cd ${PFLOTRAN_DEV}/src/pflotran
$ make PYTHON=/opt/local/bin/python3.3 check

To call the test manager manually:

$ cd ${PFLOTRAN_DEV}/regression_tests
$ python regression-tests.py \
    --executable ../src/pflotran/pflotran \
    --config-file shortcourse/copper_leaching/cu_leaching.cfg \
    --tests cu_leaching

Some important command line arguments when running manually are:

  • executable: the path to the PFLOTRAN executable
  • mpiexec: the name of the executable for launching parallel jobs, (mpiexec, mpirun, aprun, etc).
  • config-file: the path to the configuration file containing the tests you want to run
  • recursive-search: the path to a directory. The test manager searches the directory and all its sub-directories for configuration files.
  • tests: a list of test names that should be run
  • suites: a list of test suites that should be run
  • update: indicate that the the gold standard test file for a given test should be updated to the current output.
  • new-tests: indicate that the test is new and current output should be used for gold standard test file.
  • check-performance: include the performance metrics (SOLUTION blocks) in regression checks.

The full list of command line options and a brief description can be found by running with the --help flag:

$ python regression-tests.py --help

Test output

The test manager produces terse screen output, only printing critical warnings, a progress bar as each test is run, and a summary of the overall test results. Example results from running make test are:

  Test log file : pflotran-tests-2013-05-06_13-07-05.testlog

** WARNING ** : mpiexec was not provided on the command line.
                All parallel tests will be skipped!

Running pflotran regression tests :
..........S........SSSS.......F..........SSS
---------------------------------------------------------------------
Regression test summary:
    Total run time: 179.259 [s]
    Total tests : 44
    Skipped : 8
    Tests run : 36
    Failed : 1

The progress bar records a period, ., for each successful test, an S if a test was skipped, an F for failed tests, a W for a test that generated a warning, and an E for a test that generated an internal error.

Each time the test suite is run, a log file is generated in the regression test directory. The log file contains a detailed record of every test run, including: the directory containing the test, the command line call to PFLOTRAN used to run the test, a diff command to compare the regression files, and a list of failures. The log file can quickly be searched for skip, fail, error to identify the tests that generated the message.

The test directories contain any files generated by PFLOTRAN during the run. Screen output for each test is contained in the file \${TEST\_NAME}.stdout.

Configuration Files

The regression test manager reads tests specified in a series of configuration files in standard cfg (or windows ini file) format. They consist of a series of sections with key-value pairs:

[section-name]
key = value

Section names should be all lower case, and spaces must be replaced by a hyphen or underscore. Comments are specified by a \# character.

A test is declared as a section in the configuration file. It is assumed that there will be a PFLOTRAN input file with the same name as the test section. The key-value pairs in a test section define how the test is run and the output is compared to the gold standard file.

[calcite-kinetics]
#look for an input file named `calcite-kinetics.in'
np = 2
timeout = 30.0
concentration = 1.0e-10 absolute
  • np = N, (optional), indicates a parallel test run with N processors. Default is serial. If mpiexec in not provided on the command line, then parallel tests are skipped.
  • timeout = N, (optional), indicates that the test should be allowed to run for N seconds before it is killed. Default is 60.0 seconds.
  • TYPE = TOLERANCE COMPARISON, indicates that data in the regression file of type TYPE should be compared using a tolerance of TOLERANCE. Know data types are listed below.

The data types and default tolerances are:

  • time = 5 percent
  • concentration = \(1\times 10^{-12}\) absolute
  • generic = \(1\times 10^{-12}\) absolute
  • discrete = 0 absolute
  • rate = \(1\times 10^{-12}\) absolute
  • volume_fraction = \(1\times 10^{-12}\) absolute
  • pressure = \(1\times 10^{-12}\) absolute
  • saturation = \(1\times 10^{-12}\) absolute
  • residual = \(1\times 10^{-12}\) absolute

The default tolerances are deliberately set very tight, and are expected to be overridden on a per-test or per configuration file basis. There are three known comparisons: “absolute”, for absolute differences (\(\delta=|c-g|\)), “relative” for relative differences (\(\delta={|c-g|}/{g}\)), and “percent” for specifying a percent difference (\(\delta=100\cdot{|c-g|}/{g}\)).

In addition there are two optional sections in configuration files. The section “default-test-criteria” specifies the default criteria to be used for all tests in the current file. Criteria specified in a test section override these value. A section name “suites“ defines aliases for a group of tests.

[suites]
serial = test-1 test-2 test-3
parallel = test-4 test-5 test-6

Common test suites are standard and standard_parallel, used by make test, and domain specific test suites, geochemistry, flow, transport, mesh, et cetra.

Creating New Tests

We want running tests to become a habit for developers so that make pflotran is always followed by make test. With that in mind, ideal test cases are small and fast, and operate on a small subsection of the code so it is easier to diagnose where a problem has occurred. While it may (will) be necessary to create some platform specific tests, we want as many tests as possible to be platform independent and widely used. There is a real danger in having test output become stale if it requires special access to a particular piece of hardware, operating system or compiler to run.

The steps for creating new regression tests are:

  • Create the PFLOTRAN input file, and get the simulation running correctly.

  • Tell PFLOTRAN to generate a regression file by adding a regression block to the input file, e.g.:

    REGRESSION
      CELLS
        1
        3978
      /
      CELLS_PER_PROCESS 4
    END
    
  • Add the test to the configuration file

  • Refine the tolerances so that they will be tight enough to identify problems, but loose enough that they do not create a lot of false positives and discourage users and developers from running the tests.

  • Add the test to the appropriate test suite.

  • Add the configuration file, input file and “gold” file to revision control.

Guidelines for setting tolerances go here, once we figure out what to recommend.

Updating Test Results

The output from PFLOTRAN should be fairly stable, and we consider the current output to be “correct”. Changes to regression output should be rare, and primarily done for bug fixes. Updating the test results is simply a matter of replacing the gold standard file with a new file. This can be done with a simple rename in the file system:

mv test_1.regression test_1.regression.gold

Or using the regression test manager:

$ python regression-tests.py --executable ../src/pflotran/pflotran \
    --config-file my_test.cfg --tests test_1 --update

Updating through the regression test manager ensures that the output is from your current executable rather than a stale file.

Please document why you updated gold standard files in your revision control commit message.