| xval-methods {MLInterfaces} | R Documentation |
support for cross-validatory machine learning with exprSets
xval( data, classLab, proc, xvalMethod, group, indFun, niter, fsFun=NULL, fsNum=NULL, decreasing=TRUE, ... )
balKfold(K)
data |
instance of class exprSet |
classLab |
character string identifying the phenoData variable to be regarded as the class label |
proc |
an MLInterfaces method that returns an instance of classifOutput |
xvalMethod |
character string identifying cross-validation procedure to use: default is "LOO" (leave one out), alternatives are "LOG" (leave group out) and "FUN" (user-supplied partition extraction function, see Details below) |
group |
a vector (length equal to number of samples) enumerating groups for LOG xval method |
indFun |
a function that returns the indices of the training set for a given iteration (the remaining samples are held out as the test set);
this function must have parameters data, clab, iternum; see Details |
niter |
number of iterations for user-specified partition function to be run |
K |
number of partitions to be used if balKfold is used as indFun |
fsFun |
function computing ranks of features for feature selection |
fsNum |
number of features to be kept for learning in each iteration |
decreasing |
logical; should be TRUE if fsFun assigns high scores to high-performing features
(e.g., the absolute value of a test statistic) and FALSE if it assigns low scores
to high-performing features (e.g., the p-value of a test). |
... |
arguments passed to the MLInterfaces generic proc |
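The role of the group argument can be illustrated with a small base R sketch. It assumes that the "LOG" method holds out, in turn, all samples sharing one value of group; the internal implementation in xval may differ, and the list layout used here is purely illustrative.

```r
# Grouping vector with one entry per sample (72 samples in 8 groups of 9),
# matching the vector used in the LOG example on this page.
group <- as.integer(rep(1:8, each = 9))

# Assumed leave-group-out scheme: each distinct group value defines one fold.
folds <- lapply(sort(unique(group)), function(g) {
  list(test  = which(group == g),   # samples held out in this fold
       train = which(group != g))   # samples used for fitting
})

length(folds)               # number of folds
lengths(folds[[1]])         # sizes of the test and training sets in fold 1
```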
If xvalMethod is "FUN", then indFun must be a function
with parameters data, clab, and iternum.
This function returns the
indices that identify the training set for the
cross-validation iteration passed as the value of iternum. An example of
such a function is printed when the examples on this page are run.
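A partition function of this shape might look like the following sketch. The name stratifiedKfold is hypothetical and not part of MLInterfaces, and a plain data.frame stands in for the exprSet so that the code runs with base R alone; the sketch assumes that clab names the label variable and that data[[clab]] extracts it.

```r
# Hypothetical generator (not an MLInterfaces function): returns a partition
# function with the (data, clab, iternum) signature described above.
stratifiedKfold <- function(K) {
  function(data, clab, iternum) {
    labels <- as.character(data[[clab]])   # class label of each sample
    fold <- integer(length(labels))
    # Deal the samples of each class round-robin into folds 1..K so that
    # every fold has roughly the same class proportions.
    for (cl in unique(labels)) {
      idx <- which(labels == cl)
      fold[idx] <- rep_len(seq_len(K), length(idx))
    }
    # Training set: every sample that is not in fold number iternum.
    which(fold != iternum)
  }
}

# Toy stand-in for the sample annotation of an exprSet (an assumption).
toy <- data.frame(cls = rep(c("ALL", "AML"), times = c(12, 8)))
part <- stratifiedKfold(4)
train1 <- part(toy, "cls", 1)
test1 <- setdiff(seq_len(nrow(toy)), train1)
table(toy$cls[test1])   # class composition of the held-out fold
```

Calling part with iternum running from 1 to K holds out each sample exactly once, which is why niter would be set equal to K for a partition of this kind.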
If fsFun is not NULL, then it must be a function with two
arguments: the first can be transformed to a feature matrix (rows are objects,
columns are features) and the second is a vector of class labels.
The function returns a vector of scores, one for each feature. The
scores will be interpreted according to the value of decreasing,
to select fsNum features. Thanks to Stephen Henderson of University
College London for this functionality.
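The interplay of fsFun, fsNum and decreasing can be sketched in base R as follows. The function absT and the simulated data are illustrative assumptions, and the final ranking step is only a guess at how the scores might be used to pick features, not a copy of the xval internals.

```r
# Hypothetical scoring function: first argument is a feature matrix
# (rows are objects, columns are features), second is a vector of labels.
absT <- function(x, labels) {
  x <- as.matrix(x)
  labels <- factor(labels)
  stopifnot(nlevels(labels) == 2, nrow(x) == length(labels))
  g1 <- labels == levels(labels)[1]
  # Absolute Welch t statistic, one score per feature (column).
  apply(x, 2, function(v) {
    a <- v[g1]; b <- v[!g1]
    abs(mean(a) - mean(b)) / sqrt(var(a) / length(a) + var(b) / length(b))
  })
}

# Simulated data: 20 objects, 6 features, feature 3 made informative.
set.seed(1)
x <- matrix(rnorm(20 * 6), nrow = 20)
labels <- rep(c("ALL", "AML"), each = 10)
x[labels == "AML", 3] <- x[labels == "AML", 3] + 5

fsNum <- 2

# High score = good feature, so rank with decreasing = TRUE.
scores <- absT(x, labels)
keep <- order(scores, decreasing = TRUE)[seq_len(fsNum)]

# Low score = good feature (p-values), so rank with decreasing = FALSE.
isALL <- labels == "ALL"
pvals <- apply(x, 2, function(v) t.test(v[isALL], v[!isALL])$p.value)
keepP <- order(pvals, decreasing = FALSE)[seq_len(fsNum)]
```

Both rankings put the informative feature first; only the direction of the sort changes with the meaning of the score.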
library(golubEsets)
data(golubMerge)
smallG <- golubMerge[200:250,]
lk1 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", group=as.integer(0))
table(lk1,smallG$ALL.AML)
lk2 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOG", group=as.integer(
rep(1:8,each=9)))
table(lk2,smallG$ALL.AML)
balKfold
lk3 <- xval(smallG, "ALL.AML", knnB, xvalMethod="FUN", 0:0, indFun=balKfold(5), niter=5)
table(lk3, smallG$ALL.AML)
#
# illustrate the xval FUN method in comparison to LOO
#
LOO2 <- xval(smallG, "ALL.AML", knnB, "FUN", 0:0, function(x,y,i) {
(1:ncol(exprs(x)))[-i] }, niter=72 )
table(lk1, LOO2)
#
# use Stephen Henderson's feature selection extensions
#
t.fun <- function(data, fac)
{
  require(genefilter)
  # the exprs values may be stored as integers; convert to double for rowttests
  xd <- matrix(as.double(exprs(data)), nrow=nrow(exprs(data)))
  # absolute t statistic for each gene (row), grouped by the fac variable
  return(abs(rowttests(xd,data[[fac]], tstatOnly=FALSE)$statistic))
}
lk3f <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", 0:0, fsFun=t.fun)
table(lk3f$out, smallG$ALL.AML)