Collapse Multiallelic Genotypes
The Collapse Multiallelic Genotypes process takes an input data set that contains one column per marker allele and creates an output data set that contains one column per marker. The process assumes a diploid organism, so each individual can carry no more than two alleles at a marker.
What do I need?
Two SAS data sets are required: an Input Data Set with one column per marker allele, and an Annotation Data Set containing information about the markers in the input data set. Each row in the annotation data set corresponds to a column containing marker alleles in the input data set. The annotation data set must contain two columns: one with the names of the marker allele variables present in the input data set, and one with the names of the markers. The marker allele variable names must appear in the same order in the input and annotation data sets.
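To make the relationship between the two data sets concrete, here is a minimal Python sketch of the collapsing idea. It is not the process's actual implementation. It assumes that each allele column holds 1 when the individual carries that allele and 0 otherwise, and that a collapsed genotype is written as two allele names joined by a slash. Both the encoding and the output format are assumptions made for illustration, as are the function name collapse_genotypes and the variable names m1_a120, M1, and so on.

```python
from collections import defaultdict


def collapse_genotypes(rows, annotation):
    """Collapse allele indicator columns into one genotype per marker.

    rows: list of dicts, one per individual, keyed by allele column name
          (assumed encoding: 1 = allele present, 0 = absent).
    annotation: list of (allele_column, marker) pairs, in the same order
                as the allele columns of the input.
    """
    # Group the allele columns by marker, keeping annotation order.
    marker_columns = defaultdict(list)
    for allele_column, marker in annotation:
        marker_columns[marker].append(allele_column)

    collapsed = []
    for row in rows:
        out = {}
        for marker, columns in marker_columns.items():
            present = [c for c in columns if row.get(c) == 1]
            if len(present) == 1:
                # One allele observed: treated here as homozygous.
                out[marker] = f"{present[0]}/{present[0]}"
            elif len(present) == 2:
                # Two alleles observed: heterozygous.
                out[marker] = f"{present[0]}/{present[1]}"
            else:
                # No alleles, or more than two (violates the diploid
                # assumption): recorded as missing in this sketch.
                out[marker] = None
        collapsed.append(out)
    return collapsed


# Hypothetical annotation and input for two markers with two alleles each.
annotation = [
    ("m1_a120", "M1"),
    ("m1_a124", "M1"),
    ("m2_a200", "M2"),
    ("m2_a204", "M2"),
]
rows = [{"m1_a120": 1, "m1_a124": 1, "m2_a200": 0, "m2_a204": 1}]

result = collapse_genotypes(rows, annotation)
# result == [{"M1": "m1_a120/m1_a124", "M2": "m2_a204/m2_a204"}]
```

In this sketch the annotation pairs play the role of the two required annotation columns: the first element names an allele variable in the input, and the second names the marker that the allele belongs to.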
The ssr_alleles.sas7bdat data set (shown below) is an example of the input data set.
The ssr_alleles_anno data set (shown below) is an example of the annotation data set.
For detailed information about the files and data sets used or created by JMP Life Sciences software, see Files and Data Sets.
Output/Results
Output from this process is accessed from a Results window. Refer to the Collapse Multiallelic Genotypes output documentation for detailed descriptions and guides to interpreting your results.