Data was extracted from the PDF and massaged into a spreadsheet. The "Gene" and "Base change" columns were combined to obtain an HGVS mutation notation. Gene 1ITIH2 was changed to ITIH2. Gene DDX26B/INTS6L was trimmed to DDX26B.

The following variants were manually modified because we could not map them back to a transcript:

CXorf30/CFAP47 c.8943G>A was changed to NM_152632.3:c.8943G>A, on the basis of CFAP47 being at chrX:35937851-36008269, described as NM_152632, cilia- and flagella-associated protein 47 isoform 2. Mutalyzer converts the new notation to NC_000023.10:g.36013665G>A.

AVPR2 c.739-750delCGCCGCAGGGGA was changed to NM_000054.4:c.739-750delCGCCGCAGGGGA. We chose this transcript because it is variant 1 of the protein-coding isoforms. Both transcript variants resulted in the same genomic coordinates:
    AVPR2 at chrX:153170529-153172620 - (NM_000054) vasopressin V2 receptor isoform 1
    AVPR2 at chrX:153170529-153172620 - (NM_001146151) vasopressin V2 receptor isoform 2
Mutalyzer converts the new notation to NC_000023.10:g.153170949delCGCCGCAGGGGA.

ITIH2:c.1863_1864insTATT: an anchor base was manually added for the insertion. The position was shifted from 7776960 to 7776959 and the inserted sequence was changed from TATT to TTATT, which allows T to be set as the reference anchor.

Sample IDs are recorded as Al-Mubarak2017:ASD_*, where ASD_* represents the original sample ID. This was done to prevent conflicts with other publications that use a similar subject ID format.
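The manual steps described above could be sketched roughly as follows. This is only an illustration, not the procedure that was actually run: the function names, the lookup tables, and the idea of passing the anchor base in by hand are all assumptions made for the example. The values themselves are taken from the notes.

```python
# Hypothetical helpers illustrating the curation steps; all names are assumptions.

# Gene symbols that were cleaned up before building the HGVS string.
GENE_FIXES = {
    "1ITIH2": "ITIH2",
    "DDX26B/INTS6L": "DDX26B",
}

# Variants that were rewritten by hand onto an explicit transcript.
TRANSCRIPT_OVERRIDES = {
    ("CXorf30/CFAP47", "c.8943G>A"): "NM_152632.3:c.8943G>A",
    ("AVPR2", "c.739-750delCGCCGCAGGGGA"): "NM_000054.4:c.739-750delCGCCGCAGGGGA",
}


def to_hgvs(gene, base_change):
    """Combine the "Gene" and "Base change" columns into one notation string."""
    key = (gene.strip(), base_change.strip())
    if key in TRANSCRIPT_OVERRIDES:
        return TRANSCRIPT_OVERRIDES[key]
    symbol = GENE_FIXES.get(key[0], key[0])
    return f"{symbol}:{key[1]}"


def anchor_insertion(pos, inserted, anchor_base):
    """Move an insertion one base to the left and prepend the anchor base.

    Returns (new_pos, ref, alt). The anchor base is supplied by the caller;
    looking it up in a reference genome is outside this sketch.
    """
    return pos - 1, anchor_base, anchor_base + inserted


def namespaced_sample_id(original_id, prefix="Al-Mubarak2017"):
    """Prefix the original sample ID with the publication label."""
    return f"{prefix}:{original_id}"
```

For example, `anchor_insertion(7776960, "TATT", "T")` reproduces the ITIH2 adjustment (position 7776959, reference T, inserted sequence TTATT), and `namespaced_sample_id("ASD_1")` gives an ID in the Al-Mubarak2017:ASD_* form, where "ASD_1" is a made-up ID used only for illustration.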