Bayesian inference on SNP arrays.
Create an algorithm using Bayesian inference to analyze SNP arrays.
- Week 1:
- This week I chose a topic and began reading background material. I also prepared the first presentation.
- Week 2:
- I continued reading background material to get an overview
of the current algorithms that analyze SNP arrays. I searched for ways
to read the raw data from .cel and .cdf files with R. I also looked into
the GraphLab API as a possible platform for implementing our Gibbs sampling algorithm.
- Week 3:
This week I learned the basics of R and the Bioconductor packages affy and affxparser to read the raw data files. Once the raw data had been extracted from the .cel and .cdf files, we joined it with an annotation file and created a CSV. I then wrote a script to make various plots. After plotting the data, we found discrepancies between our intensity plots and the plots in the paper "ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations". After further investigation, we found that Bioconductor was not outputting some values, perhaps as a preprocessing step.
- Week 4:
This week we used Affymetrix Power Tools instead of the Bioconductor packages to extract the data, both because of last week's difficulties in extracting data without preprocessing and because the script was not efficient. I also had to rewrite the plotting script, since the data was in a different format. The plots matched what we expected, and we began discussing the model we are going to use to analyze the data and how to implement it. I also continued the background reading.
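
As a rough illustration of the Week 3 step of joining the extracted intensities with the annotation file and writing a CSV, here is a minimal sketch. It is in Python rather than the R we actually used, and everything specific in it is an assumption for illustration only: the column names (probeset_id, allele_a, allele_b, chromosome, position), the probe set IDs, and the toy values are hypothetical and do not come from our data or from the real file formats.

```python
import csv
import io

# Hypothetical stand-in for the table of extracted probe intensities.
INTENSITIES_CSV = """probeset_id,allele_a,allele_b
SNP_A-0001,1520.5,310.2
SNP_A-0002,280.0,1975.4
SNP_A-0003,990.1,1012.7
"""

# Hypothetical stand-in for the annotation file.
ANNOTATION_CSV = """probeset_id,chromosome,position
SNP_A-0001,1,10583
SNP_A-0002,1,13302
"""


def join_with_annotation(intensity_file, annotation_file):
    """Inner-join intensity rows with annotation rows on probeset_id."""
    annotation = {row["probeset_id"]: row
                  for row in csv.DictReader(annotation_file)}
    joined = []
    for row in csv.DictReader(intensity_file):
        info = annotation.get(row["probeset_id"])
        if info is None:
            continue  # skip probe sets that have no annotation entry
        joined.append({**info, **row})
    return joined


def write_csv(rows, out_file):
    """Write the joined rows with a fixed column order."""
    fieldnames = ["probeset_id", "chromosome", "position",
                  "allele_a", "allele_b"]
    writer = csv.DictWriter(out_file, fieldnames=fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


rows = join_with_annotation(io.StringIO(INTENSITIES_CSV),
                            io.StringIO(ANNOTATION_CSV))
out = io.StringIO()
write_csv(rows, out)
print(out.getvalue())
```

The sketch uses an inner join, so probe sets without an annotation entry are dropped. Whether that is the right choice depends on how the plots should treat unannotated probes.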