Created VARICARTA_Ref column with the following changes:
    T vs. Y for location 2:230632309-230632309
    T vs. Y for location 2:230660028-230660028
    C vs. S for location 2:230723547-230723547
    G vs. R for location 2:230723679-230723679
    G vs. S for location 2:230725206-230725206
    C vs. Y for location 2:230650514-230650514
    A vs. R for location 2:230661473-230661473
    G vs. R for location 2:230667106-230667106
    C vs. Y for location 2:230679005-230679005
    G vs. R for location 2:230724096-230724096
    T vs. K for location 2:230724204-230724204

Removed cohort from sample ID (VARICARTA_sampleID).

Created VARICARTA_Alt column, populated as follows: if the ReferenceBase is used by VARICARTA_Ref, then use the other allele provided in the SampleAlleles column.

The following IDs were changed in the newer Zhao2019 paper:
    M8079
    M8437
    M8871
    M8512
    M8828
The difference is the addition of a 0 after the M. We changed these IDs in this paper to match; the difference is presumably because the SKLMG cohort now uses the padded 0. Other samples in the Wang2016 paper already seem to have the leading 0, but for some reason these IDs and some others do not. We therefore fixed only the specific cases we were able to flag. This change resolves the overlap with Zhao2019.
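
The ID padding and the VARICARTA_Alt rule described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the function names pad_sample_id and pick_alt are hypothetical, and the "/" separator for SampleAlleles is an assumption, since the notes do not say how that column is encoded.

```python
from typing import Optional

# The five Wang2016 sample IDs flagged in the notes as lacking the padded 0.
IDS_TO_PAD = {"M8079", "M8437", "M8871", "M8512", "M8828"}


def pad_sample_id(sample_id: str) -> str:
    """Insert a 0 after the leading M, but only for the flagged IDs."""
    if sample_id in IDS_TO_PAD:
        return sample_id[0] + "0" + sample_id[1:]
    return sample_id


def pick_alt(ref: str, sample_alleles: str, sep: str = "/") -> Optional[str]:
    """Return the first allele in sample_alleles that differs from ref.

    ASSUMPTION: SampleAlleles is encoded as two alleles joined by `sep`
    (for example "T/C"); the real encoding may differ.
    Returns None when no allele differs from ref.
    """
    others = [allele for allele in sample_alleles.split(sep) if allele != ref]
    return others[0] if others else None
```

Only the flagged IDs are rewritten, so an ID that already carries the leading 0, or any ID not in the list, passes through unchanged.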