Most microarray studies focus on differences in gene expression between specimens or conditions. However, the causes of this variability, from one cancer to another, from one sample to another, and from one gene to another, often remain unknown. In this study, we present a systematic procedure for finding genes whose expression levels are altered by an intrinsic or extrinsic explanatory phenomenon. The procedure consists of three stages: preprocessing, data integration, and statistical analysis. We tested the utility of this approach in a case study in which expression and copy number levels of 13,824 genes were measured in 14 breast cancer cell lines. The procedure identified 92 genes whose expression levels could be explained by variability in gene copy number. This set includes several genes that are known to be both overexpressed and amplified in breast cancer. Thus, these genes may represent an important set of primary, genetically altered genes that drive cancer progression.
- data analysis
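To make the statistical analysis stage more concrete, the following is a minimal sketch of one way to flag genes whose expression tracks copy number across samples. It is not the procedure used in this study: the choice of statistic (a per-gene Pearson correlation with a one-sided permutation p-value), the function name `copy_number_associated_genes`, and the parameters `n_perm` and `alpha` are all assumptions made for illustration. It also assumes that preprocessing and data integration have already produced two aligned matrices with one row per gene and one column per sample.

```python
import numpy as np


def copy_number_associated_genes(expr, cn, n_perm=1000, alpha=0.05, seed=0):
    """Illustrative sketch only (hypothetical helper, not the study's method).

    expr, cn: arrays of shape (n_genes, n_samples), with rows (genes) and
    columns (samples) already aligned by the data integration stage.
    Returns per-gene correlations, permutation p-values, and the indices
    of genes with p < alpha.
    """
    expr = np.asarray(expr, dtype=float)
    cn = np.asarray(cn, dtype=float)
    if expr.shape != cn.shape:
        raise ValueError("expr and cn must have the same shape")

    def rowwise_corr(a, b):
        # Pearson correlation computed independently for each gene (row).
        a = a - a.mean(axis=1, keepdims=True)
        b = b - b.mean(axis=1, keepdims=True)
        denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
        with np.errstate(invalid="ignore", divide="ignore"):
            r = (a * b).sum(axis=1) / denom
        # Genes with no variance give 0/0; treat them as uncorrelated.
        return np.nan_to_num(r)

    rng = np.random.default_rng(seed)
    observed = rowwise_corr(expr, cn)

    # Null distribution: shuffle sample labels of the copy number data,
    # which breaks the pairing between expression and copy number.
    exceed = np.zeros(expr.shape[0])
    for _ in range(n_perm):
        perm = rng.permutation(expr.shape[1])
        exceed += rowwise_corr(expr, cn[:, perm]) >= observed

    # One-sided p-value (positive association), with the usual +1 correction.
    pvals = (exceed + 1) / (n_perm + 1)
    selected = np.flatnonzero(pvals < alpha)
    return observed, pvals, selected
```

With thousands of genes tested at once, the raw p-values from a sketch like this would still need a multiple-testing correction before any gene list is reported; that step is omitted here for brevity.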