The leukemia dataset (learning set) contains gene expression levels for
3051 genes in 38 patient samples, from Golub et al. (1999). The dataset
has been pre-processed in three steps: thresholding with a floor of 100
and a ceiling of 16000; filtering out genes with max/min <= 5 or
max - min <= 500, where max and min refer respectively to the maximum
and minimum intensities of a particular gene across mRNA samples; and
base-2 logarithmic transformation.
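The pre-processing steps described above can be sketched in R as follows. This is only an illustration on a small made-up matrix, not the code used to build the dataset: the function name `preprocess_expression`, its argument names, and the toy values are all assumptions introduced here.

```r
# Hypothetical helper (not part of any package): applies the three
# pre-processing steps to a raw intensity matrix with genes in rows
# and samples in columns.
preprocess_expression <- function(x, lower = 100, upper = 16000,
                                  min_ratio = 5, min_diff = 500) {
  # Step 1: thresholding, i.e. clamp intensities to [lower, upper].
  x[x < lower] <- lower
  x[x > upper] <- upper

  # Step 2: filtering, i.e. exclude genes with max/min <= min_ratio
  # or max - min <= min_diff across samples.
  gene_max <- apply(x, 1, max)
  gene_min <- apply(x, 1, min)
  keep <- (gene_max / gene_min > min_ratio) & (gene_max - gene_min > min_diff)
  x <- x[keep, , drop = FALSE]

  # Step 3: base-2 logarithmic transformation.
  log2(x)
}

# Toy input (made-up values, 3 genes x 3 samples).
raw <- rbind(
  geneA = c(50, 800, 20000),  # clamped to 100, 800, 16000; passes both filters
  geneB = c(200, 300, 400),   # max/min = 2, excluded
  geneC = c(100, 550, 590)    # max - min = 490, excluded
)
processed <- preprocess_expression(raw)
```

In this toy example only `geneA` survives the filter, so `processed` is a 1 x 3 matrix of log2 intensities. Because thresholding runs first, the minimum is never below 100 and the ratio max/min is always well defined.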
data(Golub)
Golub: a gene expression matrix of 3051 genes x 38 samples. The samples
comprise 11 acute myeloid leukemia (AML) samples and 27 acute
lymphoblastic leukemia (ALL) samples; the ALL samples can be further
subtyped into 19 B-cell ALL and 8 T-cell ALL.
Golub et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science, 286:531-537.