The cDNA arrays and Gene Chips discussed in this study have not been approved by the FDA for use in patients.
Presenter: Lajos Pusztai
Presenter's Affiliation: M.D. Anderson Cancer Center, Houston, TX
Type of Session: Scientific
Gene profiling is increasingly used in attempts to classify cancer into clinically relevant subgroups
Most studies reporting gene profiles have come from single institutions and used single microarray platforms
Comparison of results across multiple institutions and platforms should be performed to improve the development of accurate gene profiles that may one day be used in clinical tests
Materials and Methods
RNA was isolated from needle samples from 33 breast cancer patients, and RNA from each sample was hybridized to two different platforms: Affymetrix GeneChip (an oligonucleotide array) and Millennium cDNA arrays
A gene expression signature that predicted pathologic complete response to neoadjuvant paclitaxel followed by 5-FU, doxorubicin, and cyclophosphamide was determined on each platform
Each resulting signature was then tested for predictive value on the other platform
Using a genetic algorithm and linear discriminant analysis, the top 100 5-gene sets from each platform were determined
These gene sets were also compared across platforms to determine their predictiveness
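The cross-platform test described above (derive a predictor on one platform, then score it on the other) can be sketched as follows. This is a minimal illustration, not the study's method: a nearest-centroid classifier is swapped in for linear discriminant analysis to keep the example dependency-free, and the expression values, the 2-gene set, and all function names are hypothetical.

```python
from statistics import mean

def centroids(samples, labels):
    """Return the mean expression vector of each class (one entry per gene)."""
    out = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        out[label] = [mean(col) for col in zip(*rows)]
    return out

def predict(cents, sample):
    """Assign a sample to the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(cents, key=lambda label: dist(cents[label]))

def misclassification_rate(cents, samples, labels):
    """Fraction of samples whose predicted class differs from the true class."""
    wrong = sum(predict(cents, s) != l for s, l in zip(samples, labels))
    return wrong / len(labels)

# Hypothetical toy data: rows are the same four patients measured on two
# platforms, columns are a 2-gene set (the study used 5-gene sets).
platform_a = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.9], [0.1, 2.2]]
platform_b = [[1.5, 0.4], [0.3, 1.1], [0.4, 1.6], [0.2, 1.8]]
response = ["pCR", "pCR", "residual", "residual"]

# Train on platform A only, then score on both platforms.
model = centroids(platform_a, response)
same_platform_error = misclassification_rate(model, platform_a, response)
cross_platform_error = misclassification_rate(model, platform_b, response)
```

Scoring a predictor on the data it was trained on, as in `same_platform_error`, gives an optimistic estimate; the cross-platform figure is the more informative one. In practice the two platforms report values on different scales, so some per-gene normalization would likely be needed before a comparison like this is meaningful.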
30% of all corresponding genes derived from both platforms showed a Pearson correlation coefficient of at least 0.7
54% of clones from the cDNA chip matched at least one probe set from the Affymetrix chip
Between the two platforms, 9402 genes overlapped, with only modest correlation in individual gene expression. Part of this variation is accounted for by the fact that more than one Affymetrix oligonucleotide probe can correspond to a single cDNA gene (higher correlation is seen with probes closer to the 3' end of the gene because transcription fidelity decreases as the replication product becomes longer)
Hierarchical clustering revealed 45 genes from the cDNA chip and 182 genes from the Affymetrix chip that were highly predictive for response to treatment with 91% accuracy of prediction for both chips.
The same 45 genes from the cDNA chip, when used for prediction on the Affymetrix data, were 79% accurate, and the 182 genes from the Affymetrix chip, when used for prediction on the cDNA data, were 45% accurate.
Only 17 genes overlapped between the top discriminating genes of the two platforms
When only the overlapping 17 genes were used for clustering, 67% of cases in the Affymetrix platform and 64% in the cDNA platform clustered correctly.
When the 100 best 5-gene sets from the cDNA data were tested on the cDNA data, the average misclassification rate was 2% compared to 33% when the same gene sets were tested on the Affymetrix data
When the 100 best 5-gene sets from the Affymetrix data were tested on the Affymetrix data, the average misclassification rate was 20% compared to 33% when the same gene sets were tested on the cDNA data
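The per-gene agreement between platforms reported above (the share of corresponding genes with a Pearson correlation of at least 0.7) can be computed along these lines. This is a sketch under stated assumptions: the gene names and expression values are made up, `fraction_concordant` is a hypothetical helper, and each gene is assumed to map to a single measurement per platform, which the study notes is not always the case for Affymetrix probe sets.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return cov / sqrt(vx * vy)

def fraction_concordant(platform_a, platform_b, threshold=0.7):
    """Share of genes present on both platforms whose correlation meets the threshold."""
    shared = platform_a.keys() & platform_b.keys()
    if not shared:
        return 0.0
    hits = sum(pearson(platform_a[g], platform_b[g]) >= threshold for g in shared)
    return hits / len(shared)

# Hypothetical data: gene -> expression across the same four patients.
cdna = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0],
    "GENE_C": [0.5, 0.7, 0.6, 0.9],  # no counterpart on the other platform
}
oligo = {
    "GENE_A": [2.1, 3.9, 6.2, 8.0],  # tracks the cDNA measurement closely
    "GENE_B": [4.0, 1.0, 3.0, 2.0],  # poorly correlated with the cDNA measurement
}

concordant_share = fraction_concordant(cdna, oligo)
```

Only genes measured on both platforms enter the calculation, so `GENE_C` is ignored here.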
Gene expression measurements have only modest correlation between platforms
Genes identified as predictive on one platform often lose predictiveness when applied to data from the other platform
Multigene predictors also lose predictiveness when compared across platforms
This study compared two of the most different platforms (cDNA vs oligonucleotide) and better correlation may be seen if more similar platforms are compared; however, loss of accuracy should still be expected
Clinical/Scientific Implications
This study demonstrates one of the greatest obstacles in genetic profiling studies, namely the difficulty of reproducing gene signatures, particularly across platforms. The small number of genes found to be predictive on both platforms calls into question the overall predictiveness of gene profiles derived from any individual platform. Currently, most gene profiling studies using microarray technology originate from one institution and use only one platform. To improve the accuracy of the gene sets derived from these studies, multiple platforms should be used and gene sets should be targeted towards the genes with the highest correlation between platforms. Published data that fail to test genetic signatures across multiple platforms should be viewed cautiously, as the generalizability of those signatures may be limited.
Oncolink's ASCO Coverage made possible by an unrestricted Educational Grant from Bristol-Myers Squibb Oncology.