Pathoplexus processes submitted data to validate, harmonize, and standardize it. To give users maximum flexibility in accessing the data most useful to them, we reject only submissions that lack essential metadata values or whose sequences cannot be identified as the specified pathogen.
We use Nextclade for alignment, mutation calling, quality checks and clade assignment.
The data preprocessing steps encompass:
The preprocessing pipeline rejects submissions that fail to meet these requirements and returns a detailed error message explaining the reason for rejection.
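As a rough illustration of this rejection rule, the following Python sketch collects one error message per problem found in a submission. It is not the Pathoplexus pipeline code: the `Submission` class, the `check_submission` function, the field names in `REQUIRED_METADATA`, and the `matches_pathogen` flag (assumed to be set by an upstream alignment step) are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical required fields; the real set is defined by Pathoplexus.
REQUIRED_METADATA = ("collection_date", "collection_country")


@dataclass
class Submission:
    accession: str
    metadata: dict
    # Assumption: an upstream alignment step decided whether the sequence
    # could be identified as the specified pathogen.
    matches_pathogen: bool


def check_submission(sub: Submission) -> list[str]:
    """Return rejection reasons for a submission (empty list if accepted)."""
    errors = []
    for name in REQUIRED_METADATA:
        value = sub.metadata.get(name)
        # Treat absent, None, and whitespace-only values as missing.
        if value is None or str(value).strip() == "":
            errors.append(
                f"{sub.accession}: missing required metadata field '{name}'"
            )
    if not sub.matches_pathogen:
        errors.append(
            f"{sub.accession}: sequence could not be identified "
            "as the specified pathogen"
        )
    return errors


if __name__ == "__main__":
    bad = Submission("S2", {"collection_date": ""}, matches_pathogen=False)
    for message in check_submission(bad):
        print(message)
```

Returning every reason at once, rather than stopping at the first failure, lets a submitter fix all problems in a single revision.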