whyqd has a single objective: transforming messy input data into one standardised schema for further validation and analysis in other software. Anything that goes further than that is out of scope.

That still leaves a fair amount to do, including improving the documentation and tests:

  • pandas supports multiple fruity CSV formats (#*-separated, etc.) - need a config to support these wilder cases
  • Zip the data files and method, and produce a citation report, as a single step to aid distribution
  • Validate a zipped output file and produce a validation report
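
As a rough illustration of the last two items, here is a minimal sketch using only Python's standard-library zipfile module. It is not whyqd's API: the function names (bundle, validate), the file names, and the shape of the report are all hypothetical, and a real validation report would presumably also check the data against the schema.

```python
import io
import zipfile


def bundle(files):
    """Zip a mapping of archive name -> bytes into an in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def validate(bundle_bytes, expected_names):
    """Report expected members that are missing and any member failing its CRC check."""
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as archive:
        present = set(archive.namelist())
        corrupt = archive.testzip()  # name of the first bad member, or None
    return {
        "missing": sorted(set(expected_names) - present),
        "corrupt": corrupt,
    }


# Hypothetical contents: data, method and citation bundled in one step.
files = {
    "data.csv": b"name,value\napple,1\n",
    "method.json": b'{"actions": []}',
    "citation.txt": b"Hypothetical citation text",
}
report = validate(bundle(files), files.keys())
```

Keeping the archive in memory makes the sketch easy to test; writing to disk would only mean passing a file path to zipfile.ZipFile instead of a BytesIO buffer.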