whyqd has the single objective of transforming messy input data into a single standardised schema for further validation and analysis in other software. Anything that goes further than that is out of scope.
That still leaves a fair amount to do, including improving the documentation and tests:
- pandas supports multiple fruity CSV formats (#*-separated, etc.); a config is needed to support these wilder problems
- Zip the data files and method, and produce a citation report, as a single step to aid distribution
- Validate a zipped output file and produce a validation report
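
As a rough illustration of the last two items, here is a minimal sketch using only the Python standard library. It is not whyqd's API: the function names `bundle` and `validate`, the `manifest.json` file, and the choice of SHA-256 checksums are all assumptions made for this example, as is the decision to place the citation inside the archive.

```python
import hashlib
import io
import json
import zipfile

# Hypothetical manifest file name; not part of whyqd.
MANIFEST_NAME = "manifest.json"


def bundle(files: dict[str, bytes]) -> bytes:
    """Zip the given files together with a manifest of SHA-256 checksums."""
    manifest = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
        archive.writestr(MANIFEST_NAME, json.dumps(manifest, indent=2))
    return buffer.getvalue()


def validate(bundle_bytes: bytes) -> dict[str, bool]:
    """Report, for each file listed in the manifest, whether its checksum matches."""
    report = {}
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as archive:
        manifest = json.loads(archive.read(MANIFEST_NAME))
        names = set(archive.namelist())
        for name, expected in manifest.items():
            actual = hashlib.sha256(archive.read(name)).hexdigest() if name in names else None
            report[name] = actual == expected
    return report


# Placeholder contents standing in for real data, method, and citation files.
files = {
    "data.csv": b"name,value\nalpha,1\n",
    "method.json": b'{"steps": []}',
    "citation.txt": b"Example citation text",
}
report = validate(bundle(files))
```

Here `report` maps each bundled file name to `True` when its contents still match the checksum recorded at zip time. A real validation report would probably also need to check the output data against the schema, which this sketch does not attempt.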