Mining EEG Data to Diagnose Epilepsy

Rutgers University

There is an urgent need for a quick screening process that can determine whether a patient is epileptic or is merely showing symptoms associated with epilepsy that actually stem from a different illness, since an inaccurate diagnosis could have fatal consequences. This research therefore focuses on developing a model based on the C-SVM (connectivity support vector machine), a data mining algorithm that integrates brain-connectivity network modeling, to predict from EEG readings whether a person is epileptic or normal.

The EEG data was collected from 10 subjects, 5 with epilepsy and 5 normal. The data was recorded in the standard fashion, using 18 scalp electrodes placed according to the 10-20 system, with a Nicolet BMSI 6000. The sample consisted of a random and uniform selection of two 30-second epochs of EEG recordings from each subject.

Because scalp EEG recordings are severely contaminated by outside noise, eye movements, cardiac signals, etc., various ICA (independent component analysis) algorithms were used to separate and remove the contamination, as has been done in many studies. After the ICA algorithms had filtered out the contamination, the C-SVM was used to classify the EEG readings as normal or epileptic. Rather than simply using the data from each individual electrode, the dependence between two electrode data samples was calculated using the Euclidean distance. These distances were then used as input for the SVM.
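The pairwise-distance step can be illustrated with a minimal sketch. It assumes that each epoch is stored as a list of channels of equal length and that one Euclidean distance is taken per pair of electrodes; the function name `connectivity_features` and the toy numbers are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations


def connectivity_features(epoch):
    """Return the Euclidean distance between every pair of channels.

    `epoch` is a list of channels; each channel is a list of samples of
    equal length. This data layout is an assumption made for illustration.
    """
    return [math.dist(a, b) for a, b in combinations(epoch, 2)]


# Toy epoch: 3 channels with 4 samples each (made-up values, not real EEG).
toy_epoch = [
    [0.0, 0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
features = connectivity_features(toy_epoch)
```

Under this assumption, 18 electrodes would give 18 × 17 / 2 = 153 distance features per epoch.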

Ultimately, a Gaussian kernel was used to map the data such that the separation between the epileptic and normal readings was maximized. Five-fold cross-validation was then used to train and test the model, resulting in an average accuracy of 94.8% for the C-SVM with the UNICA algorithm, compared to an average accuracy of 69.4% for the regular SVM, also using the Gaussian kernel and the UNICA algorithm.
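As a rough illustration of the two ingredients named above, the sketch below shows a Gaussian kernel and a five-fold split of sample indices. It does not train an SVM. The parameter `gamma`, the random seed, the helper names, and the count of 20 samples (10 subjects with two epochs each) are assumptions for illustration; the abstract does not state how the folds were formed.

```python
import math
import random


def gaussian_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed free parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and yield (train, test) index lists for k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Assumed sample count: 10 subjects x 2 epochs = 20 feature vectors.
splits = list(k_fold_indices(20, k=5))
```

Each sample appears in exactly one test fold, and the reported figure would be the mean of the five per-fold accuracies.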

There are a few elements of this research that we plan to explore in more depth. Kernels other than the linear and Gaussian could be used to map the data, for further comparison. Similarly, the dependence between electrode data samples could be computed with other measures, such as the t-statistic or dynamic time warping, and the results compared with those obtained using the Euclidean distance. Additionally, algorithms other than the SVM could be employed, such as naïve Bayes or others from the WEKA workbench.




REU Student: Rebecca Sorla Pottenger

DIMACS Mentor: Dr. W. Art Chaovalitwongse

PhD Student: Ya-Ju Fan


Contact Information:

96 Frelinghuysen Road

Piscataway, NJ, 08854






Current Research