The Imputed SNP (Tall Format) Input Engine imports a set of files created by a SNP imputation program, such as IMPUTE (Marchini et al., 2007) or BEAGLE (Browning and Browning, 2009). This process outputs three different SAS genotype data sets that can be used for subsequent analyses.
The output Annotation Data Set lists the map position and alleles for each of the SNP markers.
What do I need?
At least one genotype probability file is required for this process. This file must be in the tall format, where SNPs are in rows and each individual is represented by a set of genotype probability columns. With the options provided, files from imputation programs such as IMPUTE or BEAGLE can be imported and analyzed.
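To make the tall layout concrete, the following Python sketch splits one row of such a file into its SNP annotation and per-individual probability sets. It is only an illustration, not part of the input engine. It assumes the IMPUTE-style layout of five leading annotation columns (SNP ID, rs ID, position, and two alleles) followed by three genotype probabilities per individual; the function name parse_gen_line and the example row are hypothetical.

```python
def parse_gen_line(line, n_leading=5, probs_per_sample=3):
    """Split one tall-format row into annotation fields and probability sets.

    Assumes (IMPUTE-style layout) that the row starts with n_leading
    annotation columns and that each individual contributes
    probs_per_sample genotype probabilities.
    """
    fields = line.split()
    annotation = fields[:n_leading]
    values = [float(x) for x in fields[n_leading:]]
    if len(values) % probs_per_sample != 0:
        raise ValueError("probability columns do not divide evenly among individuals")
    # Group the flat list of probabilities into one tuple per individual.
    probs = [
        tuple(values[i:i + probs_per_sample])
        for i in range(0, len(values), probs_per_sample)
    ]
    return annotation, probs


# Hypothetical row: one SNP, two individuals.
example_row = "SNP1 rs1 1000 A C 1 0 0 0 1 0"
annotation, probs = parse_gen_line(example_row)
```

Here annotation holds the five SNP fields as strings and probs holds one tuple of three probabilities for each individual, in column order.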
A second, optional file is the sample file. This text file, which contains information about the samples in the genotype probability file(s), must be space-delimited, with column names in the first row and data beginning on the third row. The rows of samples must be in the same order as the columns of samples in the genotype file(s). During the input process, columns from this file are merged with the genotype columns.
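The sample file layout described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the engine's own reader: the function name read_sample_file, the column names, and the sample IDs are hypothetical, and the second row is simply skipped because data begin on the third row.

```python
def read_sample_file(text):
    """Read a space-delimited sample file given as a string.

    Row 1 holds the column names, row 2 is skipped, and data
    begin on row 3, matching the layout described in the text.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    # Each data row becomes a dict keyed by column name; row order is
    # preserved so it can be matched to genotype column order.
    rows = [dict(zip(header, ln.split())) for ln in lines[2:]]
    return header, rows


# Hypothetical file contents with two samples.
sample_text = """ID_1 ID_2 missing
0 0 0
s1 s1 0.0
s2 s2 0.0
"""
header, samples = read_sample_file(sample_text)
```

Because the sample rows must follow the same order as the sample columns in the genotype file(s), the number of entries in samples should equal the number of probability sets in each genotype row.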
The following example uses the example.gen and example.sample files included in the Sample Data folder, which are example files from the IMPUTE program. They are provided courtesy of Jonathan Marchini at the University of Oxford.
