Flowpeaks analysis for flow cytometry data using MATLAB

In flow cytometry analysis, data clustering is used extensively to partition cells into distinct groups based on measured fluorescence intensity. Traditional manual identification of cell populations has been challenged by the increasing number of cell surface markers (data set dimensions), which has created a growing need for automated, high-dimensional analytical methods. The most common approaches to automated clustering are the finite mixture model and spatial exploration of the histograms. However, both methods have their own limitations. This project focuses on implementing, in MATLAB, a flow cytometric data clustering algorithm that overcomes these limitations and clusters the data efficiently. The algorithm, called flowPeaks, combines the finite mixture model with spatial exploration of the histograms to address the complexities of flow cytometric analysis and to handle high-dimensional data without the need for transformations. The algorithm first uses K-means to partition the data set into many small clusters. Then, a finite mixture model is used to generate a smoothed density function. Next, the local peaks of this density are found using the steepest-descent algorithm and associated with each cluster. Finally, each point in the flow cytometric data set is classified into one of the clusters. The algorithm is automatic, fast, and robust, and it can reliably model clusters of any shape, even in the presence of outliers. The flowPeaks implementation was tested on two sets of white blood cell data provided by the Center for Biophotonics Science and Technology (CBST). It successfully partitioned both data sets into the three desired clusters. However, the algorithm cannot distinguish a cluster that lies within another cluster; further studies should address this case.
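
The four steps described above can be illustrated with a minimal sketch. It is not the project's MATLAB implementation: it is written in Python, works on one-dimensional data only, and replaces the steepest-descent search with a simple fixed-step hill climb on the density. All function names, the smoothing term h, the step size, and the merge tolerance are assumptions introduced here for illustration.

```python
import math
import random


def kmeans_1d(xs, k, iters=50):
    """Lloyd's K-means on 1-D data; centers start at evenly spaced quantiles."""
    s = sorted(xs)
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]

    def nearest(x):
        return min(range(k), key=lambda j: abs(x - centers[j]))

    for _ in range(iters):
        assign = [nearest(x) for x in xs]
        for j in range(k):
            members = [x for x, a in zip(xs, assign) if a == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = sum(members) / len(members)
    return [nearest(x) for x in xs]  # assignment against the final centers


def fit_components(xs, assign, k, h):
    """One Gaussian per non-empty cluster, stored as (weight, mean, variance)."""
    comps = {}
    for j in range(k):
        members = [x for x, a in zip(xs, assign) if a == j]
        if not members:
            continue
        m = sum(members) / len(members)
        var = sum((x - m) ** 2 for x in members) / len(members)
        # h*h is an assumed smoothing term that widens every component
        comps[j] = (len(members) / len(xs), m, var + h * h)
    return comps


def density(x, comps):
    """Smoothed density: weighted sum of the per-cluster Gaussians."""
    return sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
               for w, m, v in comps.values())


def climb(x, comps, step=0.05, max_steps=10000):
    """Fixed-step hill climb: stop when neither neighbor has higher density."""
    for _ in range(max_steps):
        best = max((x, x - step, x + step), key=lambda t: density(t, comps))
        if best == x:
            break
        x = best
    return x


def flowpeaks_1d(xs, k=8, h=1.0, merge_tol=0.5):
    """K-means -> mixture density -> peak search -> label each point by its peak."""
    assign = kmeans_1d(xs, k)
    comps = fit_components(xs, assign, k, h)
    peaks, peak_of = [], {}
    for j, (_, m, _) in comps.items():
        p = climb(m, comps)
        for i, q in enumerate(peaks):
            if abs(p - q) < merge_tol:  # same local peak: merge the small clusters
                peak_of[j] = i
                break
        else:
            peaks.append(p)
            peak_of[j] = len(peaks) - 1
    return peaks, [peak_of[a] for a in assign]


# Synthetic example: two well-separated groups (not flow cytometry data).
random.seed(0)
data = ([random.gauss(0.0, 0.5) for _ in range(200)]
        + [random.gauss(10.0, 0.5) for _ in range(200)])
peaks, labels = flowpeaks_1d(data)
print(len(peaks), [round(p, 2) for p in sorted(peaks)])
```

In this sketch the eight small K-means clusters are merged whenever their hill climbs end at the same local peak, so the final number of clusters is determined by the smoothed density rather than by k. A multi-dimensional version would need full covariance matrices and a gradient-based search, which are beyond this illustration.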

Project (M.S., Electrical and Electronic Engineering)--California State University, Sacramento, 2013.