ChromBackend
classes for ChromatogramsPackage: Chromatograms
Authors: Laurent Gatto [aut] (ORCID: https://orcid.org/0000-0002-1520-2268),
Johannes Rainer [aut] (ORCID: https://orcid.org/0000-0002-6977-7147),
Philippine Louail [aut, cre] (ORCID:
https://orcid.org/0009-0007-5429-6846, fnd: European Union
HORIZON-MSCA-2021 project Grant No. 101073062: HUMAN – Harmonising
and Unifying Blood Metabolic Analysis Networks)
Compiled: Tue May 27 09:58:49 2025
Similar to the Spectra package, the Chromatograms
also separates the user-faced functionality to process and analyze
chromatographic mass spectrometry (MS) data from the code for storage and
representation of the data. The latter functionality is provided by
implementations of the ChromBackend
class, further on called backends. This
vignette describes the ChromBackend
class and illustrates on a simple example
how a backend extending this class could be implemented.
Contributions to this vignette (content or correction of typos) or requests for additional details and information are highly welcome, ideally via pull requests or issues on the package’s github repository.
ChromBackend
?The purpose of a backend class extending the virtual ChromBackend
is to
provide the chromatographic MS data to the Chromatograms
object, which is
used by the user to interact with - and analyze the data. The ChromBackend
defines the API that new backends need to provide so that they can be used with
Chromatograms
. This API defines a set of methods to access the data. For many
functions default implementations exist and a dedicated implementation for a
new backend is only needed if necessary (e.g. if the data is stored in a way
that a different access to it would be better). In addition, a core set of
variables (data fields), the so called core chromatogram variables, is
defined to describe the chromatographic data. Each backend needs to provide
these, but can also define additional data fields. Before implementing a new
backend it is highly suggested to carefully read the following
Conventions and definitions section.
General conventions for chromatographic MS data of a Chromatograms
are:
Chromatograms
object is designed to contain multiple chromatographic
data (not data from a single chromatogram).NA
) for retention time values are not supported.coreChromVariables()
function.dataOrigin
defines for each chromatogram where the data is from expected to
be of typecharacter
. Missing values should be NA_character_
ChromBackend
implementations can also represent purely read-only data
resources. In this case only data accessor methods need to be implemented but
not data replacement methods (i.e. <-
methods that would allow to add or
set variables. Read-only backends should implement the isReadOnly()
method, that should then return TRUE
.For parallel processing, Chromatograms
splits the backend based on a defined
factor
and processes each in parallel (or in serial if a SerialParam
is
used). The splitting factor
can be defined for Chromatograms
by setting the
parameter processingChunkSize
. Alternatively, through the
backendParallelFactor()
method the backend can also suggest a factor
that
should/could be used for splitting and parallel processing. The default
implementation for backendParallelFactor()
is to return an empty factor
(factor()
) hence not suggesting any preferred splitting.
Besides parallel processing, for on-disk backends (i.e., backends that don’t keep all of the data in memory), this chunk-wise processing can also reduce the memory demand for operations, because only the peak data of the current chunk needs to be realized in memory.
The ChromBackend
class defines core methods that have to be implemented by a
MS backend as well as optional methods for which a default implementation
is already available. These functions are described in sections
Required methods and Optional methods, respectively.
To create a new backend a class extending the virtual ChromBackend
needs to
be implemented. In the following example we define a simple class that uses a
data.frame
to store general properties (chromatogram variables) and a
list of data.frame
for the retention time and intensity values of each
chromatograms, which represent the actual chromatographic MS data. These values
are store in alist
, where each element correspond to one chromatogram, as
the number of values (peaks) can vary between chromatograms. We also provide
a basic constructor function that returns an empty instance of the new class.
library(Chromatograms)
#' Definition of the backend class extending ChromBackend
setClass("ChromBackendTest",
contains = "ChromBackend",
slots = c(
chromData = "data.frame",
peaksData = "list"
),
prototype = prototype(
chromData = data.frame(),
peaksData = list()
)
)
#' Simple constructor function
ChromBackendTest <- function() {
new("ChromBackendTest")
}
The 2 slots @chromData
and @peaksData
will be used to store the general
properties of the chromatograms and the actual chromatographic data,
respectively. each row in chromData
will contain data for one chromatogram
with the columns being the different chromatogram variables (i.e. additional
properties of a chromatogram such as its m/z value or MS level) and each
element in @peaksData
a data.frame
with the retention time and intensity
values representing thus the peaks data of the respective chromatogram. This
is only one of the possibly many ways chromatographic data might be
represented.
We should ideally also add some basic validity function that ensures the data
to be correct (valid). The function below simply checks that the number of
rows of the @chromData
slot matches the length of the @peaksData
slots.
#' Basic validation function
setValidity("ChromBackendTest", function(object) {
if (length(object@peaksData) != nrow(object@chromData)) {
return(
"length of 'peaksData' has to match the number of rows of ",
"'chromData'"
)
}
NULL
})
## Class "ChromBackendTest" [in ".GlobalEnv"]
##
## Slots:
##
## Name: chromData peaksData version
## Class: data.frame list character
##
## Extends:
## Class "ChromBackend", directly
## Class "ChromBackendOrMissing", by class "ChromBackend", distance 2
We can now create an instance of our new class with the ChromBackendTest()
function.
#' Create an empty instance of ChromBackendTest
be <- ChromBackendTest()
be
## An object of class "ChromBackendTest"
## Slot "chromData":
## data frame with 0 columns and 0 rows
##
## Slot "peaksData":
## list()
##
## Slot "version":
## [1] "0.1"
A show()
method would allow for a more convenient way how general information
of our object is displayed. Below we add an implementation of the show()
method.
#' implementation of show for ChromBackendTest
setMethod("show", "ChromBackendTest", function(object) {
cd <- object@chromData
cat(class(object), "with", nrow(cd), "chromatograms\n")
})
be
## ChromBackendTest with 0 chromatograms
Methods listed in this section must be implemented for a new class
extending ChromBackend
. Methods should ideally also be implemented in the
order they are listed here. Also, it is strongly advised to write dedicated
unit tests for each newly implemented method or function already during
the development.
dataStorage()
The dataStorage
chromatogram variable provides information how or where the
data is stored. The dataStorage()
method should therefore return a
character
vector of length equal to the number of chromatograms that are
represented by the object. The values for dataStorage
can be any character
value, except NA
. For our example backend we define a simple dataStorage()
method that simply returns the column "dataStorage"
from the @chromData
(as a character
).
#' dataStorage method to provide information *where* data is stored
setMethod("dataStorage", "ChromBackendTest", function(object) {
as.character(object@chromData$dataStorage)
})
Calling dataStorage()
on our example backend will thus return an empty
character
(since the object created above does not contain any data).
dataStorage(be)
## character(0)
length()
length()
is expected to return an integer
of length 1 with the total number
of chromatograms that are represented by the backend. For our example backend
we simply return the number of rows of the data.frame
stored in the
@chromData
slot.
#' length to provide information on the number of chromatograms
setMethod("length", "ChromBackendTest", function(x) {
nrow(x@chromData)
})
length(be)
## [1] 0
backendInitialize()
The backendInitialize()
method should be called after creating an instance of
the backend class and is responsible for preparing (initializing) the backend
with data. This method can accept any parameters required by the backend to
load or initialize the data, such as file names, a database connection, or
objects containing the data. It is also recommended that the the special
chromatogram variables dataStorage
and dataOrigin
are set during
backendInitialize()
.
It is strongly recommended to validate the input data within the initialize
method. The advantage of performing these validity checks in
backendInitialize()
rather than using setValidity()
is that computationally
expensive operations/checks would only be performed once,during
initialization, instead of each time values within the object are modified
(e.g., through subsetting or similar operations), which would occur with
setValidity()
.
We also use the validChromData()
and validPeaksData()
functions to ensure
that core chromatogram variables and core peaks variables have the correct data
type. These checks verify that thepeaksData
contains only numeric values and
that the number of retention time and intensity values matches for each
chromatogram.
Below we define a backendInitialize()
method that accepts a data.frame
containing chromatogram variables and a list
with retention time and
intensity values for each chromatogram.
#' backendInitialize method to fill the backend with data.
setMethod(
"backendInitialize", "ChromBackendTest",
function(object, chromData, peaksData) {
if (!is.data.frame(chromData)) {
stop(
"'chromData' needs to be a 'data.frame' with the general",
"chromatogram variables"
)
}
## Defining dataStorage and dataOrigin, if not available
if (is.null(chromData$dataOrigin)) {
chromData$dataOrigin <- NA_character_
}
## Validate the provided data
validChromData(chromData)
validPeaksData(peaksData)
## Fill the object with data
object@chromData <- chromData
object@peaksData <- peaksData
object
}
)
In addition to adding the data to object, the function also define the
dataOrigin
chromatographic variables. This variable is expected to provide
information on where the data is originating.
We can now create an instance of our backend class and fill it with data. We
thus first define our MS data and pass this to the backendInitialize()
method.
# A data.frame with chromatogram variables.
cdata <- data.frame(
msLevel = c(1L, 1L),
mz = c(112.2, 123.3)
)
# Retention time and intensity values for each chromatogram.
pdata <- list(
data.frame(
rtime = c(12.4, 12.8, 13.2, 14.6),
intensity = c(123.3, 153.6, 2354.3, 243.4)
),
data.frame(
rtime = c(45.1, 46.2),
intensity = c(100, 80.1)
)
)
#' Create and initialize the backend
be <- backendInitialize(ChromBackendTest(),
chromData = cdata, peaksData = pdata
)
be
## ChromBackendTest with 2 chromatograms
This backendInitialize()
implementation should assure data
validity and integrity. Below we use this function again to create our backend
instance.
The backendInitialize()
method that we implemented for our backend class
expects the user to provide the full MS data. It would alternatively also be
possible to implement a method that takes data file names as input from which
the function can then import the data. The purpose of the backendInitialize()
method is to initialize and prepare the data in a way that it can be accessed
by a Chromatograms
object. Whether the data is actually loaded into memory or
simply referenced and loaded upon request does not matter as long as the
backend is able to provide the data though its accessor methods when requested
by the Chromatograms
object.
chromVariables()
The chromVariables()
method should return a character
vector with the names
of all available chromatogram variables of the backend. While a backend class
should support defining and providing their own variables, each ChromBackend
class must provide also the core chromatogram variables (in the correct
data type). These can be listed by the coreChromVariables()
function:
#' List core chromatogram variables along with data types.
coreChromVariables()
## chromIndex collisionEnergy dataOrigin msLevel mz
## "integer" "numeric" "character" "integer" "numeric"
## mzMin mzMax precursorMz precursorMzMin precursorMzMax
## "numeric" "numeric" "numeric" "numeric" "numeric"
## productMz productMzMin productMzMax
## "numeric" "numeric" "numeric"
A typical chromVariables()
method for a ChromBackend
class will thus be
implemented similarly to the one for our ChromBackendTest
test backend: it
will return the names for all available chromatogram variables that can be
called by chromData()
within the backend object. There is a default
implementation for chromVariables()
that will return the core chromatogram
variables. However if a backend class defines additional chromatogram
variables, the chromVariables()
method should be implemented to return the
names of these additional variables as well.
#' Accessor for available chromatogram variables
setMethod("chromVariables", "ChromBackendTest", function(object) {
union(names(object@chromData), names(coreChromVariables()))
})
chromVariables(be)
## [1] "msLevel" "mz" "dataOrigin" "chromIndex"
## [5] "collisionEnergy" "mzMin" "mzMax" "precursorMz"
## [9] "precursorMzMin" "precursorMzMax" "productMz" "productMzMin"
## [13] "productMzMax"
chromData()
The chromData
method should return the full chromatogram data within a
backend as a data.frame
object. A parameter columns
should allow to define
the names of the variables that should be returned. A parameter drop
should
also be implemented to allow for the calling of one column while still
controlling the return type.
Each row in this data frame should represent one chromatogram, each column a
chromatogram variable. The data.frame
must provide values (even if they
are NA
) for all requested chromatogram variables of the backend
(including the core chromatogram variables). The
fillCoreChromVariables()
function from the Chromatograms package allows
to complete (fill) a provided data.frame
with eventually missing core
chromatogram variables:
#' Get the data.frame with the available chrom variables
be@chromData
## msLevel mz dataOrigin
## 1 1 112.2 <NA>
## 2 1 123.3 <NA>
#' Complete this data.frame with missing core variables
fillCoreChromVariables(be@chromData)
## msLevel mz dataOrigin chromIndex collisionEnergy mzMin mzMax precursorMz
## 1 1 112.2 <NA> NA NA NA NA NA
## 2 1 123.3 <NA> NA NA NA NA NA
## precursorMzMin precursorMzMax productMz productMzMin productMzMax
## 1 NA NA NA NA NA
## 2 NA NA NA NA NA
We can thus use this function to add eventually missing core chromatogram
variables in the chromData
implementation for our backend:
#' function to extract the full chromData
setMethod(
"chromData", "ChromBackendTest",
function(object, columns = chromVariables(object),
drop = FALSE) {
if (!any(chromVariables(object) %in% columns)) {
stop(
"Some of the requested Chromatogram variables are not ",
"available"
)
}
res <- fillCoreChromVariables(object@chromData)
res <- res[, columns, drop = drop]
res
}
)
We can now use chromData()
to either extract the full chromatogram data from
the backend, or only the data for selected variables.
#' Extract the full data
chromData(be)
## msLevel mz dataOrigin chromIndex collisionEnergy mzMin mzMax precursorMz
## 1 1 112.2 <NA> NA NA NA NA NA
## 2 1 123.3 <NA> NA NA NA NA NA
## precursorMzMin precursorMzMax productMz productMzMin productMzMax
## 1 NA NA NA NA NA
## 2 NA NA NA NA NA
#' Selected variables
chromData(be, c("mz", "msLevel"))
## mz msLevel
## 1 112.2 1
## 2 123.3 1
#' Only missing core spectra variables
chromData(be, c("collisionEnergy", "mzMin"))
## collisionEnergy mzMin
## 1 NA NA
## 2 NA NA
peaksVariables()
The peaksVariables()
function is supposed to provide the names of the
available peaks variables. If additional peaks variables would be available,
these could also be listed by the peaksVariables()
method.
There is a default implementation for peaksVaraibles()
that will return the
core peaks variables. However if a backend class defines additional
peaks variables, the peaksVariables()
method should be implemented to
return the names of these additional variables as well.
setMethod("peaksVariables", "ChromBackendTest", function(object) {
union(names(corePeaksVariables()), names(object@peaksData[[1]]))
})
We can now see what peaks variables are present in our object:
peaksVariables(be)
## [1] "rtime" "intensity"
peaksData()
The peaksData()
method extracts the chromatographic data (peaks), i.e., the
chromatograms’ retention time and intensity values. This data is returned as a
list
of data.frame
, with one array per chromatogram with columns being the
peaks variables (retention time and intensity values) and rows the individual
data pairs. Each backend must provide retention times and intensity values with
this method, but additional peaks variables (columns) are also supported.
In a similar way as for the chromatogram variables, a backend should support
defining and providing their own variables and each ChromBackend
class must provide also the core peaks variables (in the correct
data type). These can be listed by the corePeaksVariables()
function:
corePeaksVariables()
## rtime intensity
## "numeric" "numeric"
Below we implement the peaksData()
method for our backend.
#' method to extract the full chromatographic data as list of arrays
setMethod(
"peaksData", "ChromBackendTest",
function(object, columns = peaksVariables(object), drop = FALSE) {
if (!all(columns %in% peaksVariables(object))) {
stop("Some of the requested peaks variables are not available")
}
res <- lapply(object@peaksData, function(x) x[, columns, drop = drop])
res
}
)
And with this method we can now extract the peaks data from our backend.
#' Extract the *peaks* data (i.e. intensity and retention times)
peaksData(be)
## [[1]]
## rtime intensity
## 1 12.4 123.3
## 2 12.8 153.6
## 3 13.2 2354.3
## 4 14.6 243.4
##
## [[2]]
## rtime intensity
## 1 45.1 100.0
## 2 46.2 80.1
Since the peaksData()
method is the main function used by a Chromatograms
to retrieve data from the backend (and further process the values), this method
should be implemented in an efficient way.
[
The [
method allows to subset ChromBackend
objects. This operation is
expected to reduce a ChromBackend
object to the selected chromatograms
without changing values for the subset chromatograms. The method should
support to subset by indices or logical vectors and should also support
duplicating elements (i.e., when duplicated indices are used) as well as to
subset in arbitrary order. An error should be thrown if indices are out of
bounds, but the method should also support returning an empty backend with
[integer()]
. The MsCoreUtils::i2index
function can be used to check and
convert the provided parameter i
(defining the subset) to an integer vector.
Below we implement a possible [
for our test backend class. We ignore the
parameters j
from the definition of the [
generic, since we treat our data
to be one-dimensional (with each chromatogram being one element).
#' Main subset method.
setMethod("[", "ChromBackendTest", function(x, i, j, ..., drop = FALSE) {
i <- MsCoreUtils::i2index(i, length = length(x))
x@chromData <- x@chromData[i, ]
x@peaksData <- x@peaksData[i]
x
})
We can now subset our backend to the last two chromatograms.
a <- be[1]
chromData(a)
## msLevel mz dataOrigin chromIndex collisionEnergy mzMin mzMax precursorMz
## 1 1 112.2 <NA> NA NA NA NA NA
## precursorMzMin precursorMzMax productMz productMzMin productMzMax
## 1 NA NA NA NA NA
Or extracting the second chromatogram multiple times.
a <- be[c(1, 1, 1)]
chromData(a)
## msLevel mz dataOrigin chromIndex collisionEnergy mzMin mzMax precursorMz
## 1 1 112.2 <NA> NA NA NA NA NA
## 1.1 1 112.2 <NA> NA NA NA NA NA
## 1.2 1 112.2 <NA> NA NA NA NA NA
## precursorMzMin precursorMzMax productMz productMzMin productMzMax
## 1 NA NA NA NA NA
## 1.1 NA NA NA NA NA
## 1.2 NA NA NA NA NA
$
The $
method is expected to extract a single chromatogram or peaks variable
from a backend. Parameter name
should allow to name the variable to return.
Each ChromBackend
must support extracting the core chromatogram and core
peaks variables with this method (even if no data might be available for that
variable). In our example implementation below we make use of the chromData()
method, but more efficient implementations might be possible as well.
Also, the $
method should check if the requested variable is available and
should throw an error otherwise.
#' Access a single chromatogram variable
setMethod("$", "ChromBackendTest", function(x, name) {
if (name %in% union(chromVariables(x), names(coreChromVariables()))) {
res <- chromData(x, columns = name, drop = TRUE)
} else if (name %in% peaksVariables(x)) {
res <- peaksData(x, columns = name, drop = TRUE)
} else {
stop("The requested variable '", name, "' is not available")
}
res
})
With this we can now extract the MS levels
be$msLevel
## [1] 1 1
or a core chromatogram variable without values in our example backend.
be$precursorMz
## [1] NA NA
or also the intensity values
be$intensity
## [[1]]
## [1] 123.3 153.6 2354.3 243.4
##
## [[2]]
## [1] 100.0 80.1
backendMerge()
The backendMerge()
method merges (combines) ChromBackend
objects (of the
same type!) into a single instance. For our test backend we thus need to
combine the values in the @chromData
, @peaksData
slots. To support also
merging of data.frame
s with different sets of columns we use the
MsCoreUtils::rbindFill
function instead of a simple rbind
(this function
joins data frames making an union of all available columns filling eventually
missing columns with NA
).
#' Method allowing to join (concatenate) backends
setMethod("backendMerge", "ChromBackendTest", function(object, ...) {
res <- object
object <- unname(c(list(object), list(...)))
res@peaksData <- do.call(c, lapply(object, function(z) z@peaksData))
res@chromData <- do.call(
MsCoreUtils::rbindFill,
lapply(object, function(z) z@chromData)
)
validObject(res)
res
})
Testing the function by merging the example backend instance with itself.
a <- backendMerge(be, be[2], be)
a
## ChromBackendTest with 5 chromatograms
As stated in the general description, ChromBackend
implementations can also
be purely read-only resources allowing to just access, but not to replace
data. For these backends isReadOnly()
should return FALSE
. Data replacement
methods listed in this section would not need to be implemented. Our example
backend stores the full data in memory, within the object, and hence we can
easily change and replace values.
Since we support replacing values we also implement the isReadOnly()
method
for our example implementation to return FALSE
(instead of the default
TRUE
).
#' Default for backends:
isReadOnly(be)
## [1] FALSE
#' Implementation of isReadOnly for ChromBackendTest
setMethod("isReadOnly", "ChromBackendTest", function(object) FALSE)
isReadOnly(be)
## [1] FALSE
All data replacement function are expected to return an instance of the same backend class that was used as input.
chromData<-
The main replacement method is chromData<-
which should allow to replace the
chormtaogram variables content of a backend with new data. This data is
expected to be provided as a data.frame
(similar to the one returned by
chromData()
). While values can be replaced, the number of chromatograms
before and after a call to chromData<-
has to be the same.
#' Replacement method for the full chromatogram data
setReplaceMethod("chromData", "ChromBackendTest", function(object, value) {
if (is(value, "DataFrame")) {
value <- as(value, "data.frame")
}
if (!inherits(value, "data.frame")) {
stop("'value' is expected to be a 'data.frame'")
}
if (length(object) && length(object) != nrow(value)) {
stop("'value' has to be a 'data.frame' with ", length(object), " rows")
}
validChromData(value)
object@chromData <- value
object
})
To test this new method we extract the full chromatogram data from our example
data set, add an additional column (chromatogram variable) and use
chromData<-
to replace the data of the backend.
d <- chromData(be)
d$new_col <- c("a", "b")
chromData(be) <- d
Check that we have now also the new column available.
be$new_col
## [1] "a" "b"
$<-
The $<-
method should allow to replace values for an existing chromatogram
variable or to add an additional variable to the backend. As with all
replacement methods, the length
of value
has to match the number of
chromatograms represented by the backend. For replacement of retention time or
intensity values we need also to ensure that the data would be correct after
the operation, i.e., that the number of retention time and intensity values per
chromatogram are the identical and that all retention time and intensity values
are numeric. Finally, we use the validChromData()
function to ensure that,
after replacement, all core chromatogram variables have the correct data type.
#' Replace or add a single chromatogram variable.
setReplaceMethod("$", "ChromBackendTest", function(x, name, value) {
if (length(x) && length(value) != length(x)) {
stop(
"length of 'value' needs to match the number of chromatograms ",
"in object."
)
}
if (name %in% peaksVariables(x)) {
if (!is.list(value)) {
stop("The value for peaksData should be a list")
}
for (i in seq_along(value)) {
x@peaksData[[i]][[name]] <- value[[i]]
validPeaksData(x@peaksData)
}
} else {
x@chromData[, name] <- value
validChromData(x@chromData)
}
x
})
We can thus replace an existing chromatogram variable, such as msLevel
:
#' Values before replacement
be$msLevel
## [1] 1 1
#' Replace MS levels
be$msLevel <- c(3L, 2L)
#' Values after replacement
be$msLevel
## [1] 3 2
We can also add a new chromatogram variables:
#' Add a new chromatogram variable
be$name <- c("A", "B")
be$name
## [1] "A" "B"
Or also replace intensity values. Below we replace the intensity values by adding a value of +3 to each.
#' Replace intensity values
be$msLevel3 <- be$msLevel + 3
be$msLevel3
## [1] 6 5
peaksData<-
The peaksData<-
method should allow to replace the full peaks data (retention
time and intensity value pairs) of all chromatograms in a backend. As value
,
a list
of data.frame
should be provided with columns names "rtime"
and
"intensity"
. Because the full peaks data is provided at once, this method can
(and should) support changing also the number of peaks per chromatogram (while
the methods like rtime<-
or $rtime
would not allow).
#' replacement method for peaks data
setReplaceMethod("peaksData", "ChromBackendTest", function(object, value) {
if (!is.list(value)) {
stop("'value' is expected to be a list")
}
if (length(object) && length(object) != length(value)) {
stop("'value' has to be a list with ", length(object), " elements")
}
validPeaksData(value)
object@peaksData <- value
object
})
With this method we can now replace the peaks data of a backend:
#' Create a list with peaks matrices; our backend has 3 chromatograms
#' thus our `list` has to be of length 3
tmp <- list(
data.frame(
rtime = c(12.3, 14.4, 15.4, 16.4),
intensity = c(200, 312, 354.1, 232)
),
data.frame(
rtime = c(14.4),
intensity = c(13.4)
)
)
be_2 <- be
#' Assign this peaks data to one of our test backends
peaksData(be_2) <- tmp
#' Evaluate that we properly added the peaks data
peaksData(be_2)
## [[1]]
## rtime intensity
## 1 12.3 200.0
## 2 14.4 312.0
## 3 15.4 354.1
## 4 16.4 232.0
##
## [[2]]
## rtime intensity
## 1 14.4 13.4
Default implementations for the ChromBackend
class are available for a large
number of methods. Thus, any backend extending this class will automatically
inherit these default implementations. Alternative, class-specific, versions
can, but don’t need to be developed. The default versions are defined in the
R/ChromBackend.R file, and also listed in this section. If alternative
versions are implemented it should be ensured that the expected data type is
always used for core chromatogram variables. Use coreChromVariables()
and
corePeaksVariables()
to list these mandatory data types.
backendParallelFactor()
The backendParallelFactor()
function allows a backend to suggest a preferred
way it could be split for parallel processing. The default implementation
returns factor()
(i.e. a factor
of length 0) hence not suggesting any
specific splitting setup.
#' Is there a specific way how the object could be best split for
#' parallel processing?
setMethod("backendParallelFactor", "ChromBackend", function(object, ...) {
factor()
})
backendParallelFactor(be)
## factor()
## Levels:
chromIndex()
The chromIndex()
function should return the value for the "chromIndex"
chromatogram variable. As a result, an integer
of length equal to the number
of chromatograms in object
needs to be returned. The default implementation
is:
#' get the values for the chromIndex chromatogram variable
setMethod(
"chromIndex", "ChromBackend",
function(object, columns = chromVariables(object)) {
chromData(object, columns = "chromIndex", drop = TRUE)
}
)
The result of calling this method on our test backend:
chromIndex(be)
## [1] NA NA
collisionEnergy()
The collisionEnergy()
function should return the value for the
"collisionEnergy"
chromatogram variable. As a result, a numeric
of length
equal to the number of chromatograms has to be returned. The default
implementation is:
#' get the values for the collisionEnergy chromatogram variable
setMethod("collisionEnergy", "ChromBackend", function(object) {
chromData(object, columns = "collisionEnergy", drop = TRUE)
})
The result of calling this method on our test backend:
collisionEnergy(be)
## [1] NA NA
The default replacement method for the collisionEnergy
chromatogram variable
is:
#' Default replacement method for collisionEnergy
setReplaceMethod(
"collisionEnergy", "ChromBackend", function(object, value) {
object$collisionEnergy <- value
object
}
)
This method thus makes use of the $<-
replacement method we implemented
above. To test this function we replace the collision energy below.
#' Replace the collision energy
collisionEnergy(be) <- c(20, 30)
collisionEnergy(be)
## [1] 20 30
dataOrigin()
, dataOrigin<-
The dataOrigin()
and dataOrigin<-
methods return or set the value(s) for
the "dataOrigin"
chromatogram variable. The values for this chromatogram
variable need to be of type character
(the length equal to the number of
chromatograms). The default implementation for dataOrigin()
is:
#' Default implementation to access dataOrigin
setMethod("dataOrigin", "ChromBackend", function(object) {
chromData(object, columns = "dataOrigin", drop = TRUE)
})
Below we use this method to access the values of the dataOrigin
chromatogram
variable.
#' Access the dataOrigin values
dataOrigin(be)
## [1] NA NA
The default implementation for dataOrigin<-
uses, like all defaults for
replacement methods, the $<-
method:
#' Default implementation of the `dataOrigin<-` replacement method
setReplaceMethod("dataOrigin", "ChromBackend", function(object, value) {
object$dataOrigin <- value
object
})
For our backend we can change the values of the dataOrigin
variable:
#' Replace the backend's dataOrigin values
dataOrigin(be) <- rep("from somewhere", 2)
dataOrigin(be)
## [1] "from somewhere" "from somewhere"
intensity()
, intensity<-
The intensity()
and intensity<-
methods allow to extract or set the
intensity values of the individual chromatograms represented by the backend.
The default for the intensity()
function, which is expected to return a
list
of numeric
values with the intensity values of each chromatogram,
uses the peaksData()
method:
#' Default method to extract intensity values
setMethod("intensity", "ChromBackend", function(object) {
if (length(object)) {
peaksData(object, column = "intensity", drop = TRUE)
} else {
list()
}
})
The default replacement method for intensity values uses the $<-
method:
#' Default implementation of the replacement method for intensity values
setReplaceMethod("intensity", "ChromBackend", function(object, value) {
pd <- peaksData(object)
if (!is.list(value) || length(pd) != length(value)) {
stop("'value' should be a list of the same length as 'object'")
}
for (i in seq_along(pd)) {
if (length(value[[i]]) != nrow(pd[[i]])) {
stop(paste0(
"Length of 'value[[", i, "]]' does not match ",
"the number of rows in the intensity of chromatogram: ",
i, "'"
))
}
}
peaksData(object) <- lapply(seq_along(pd), function(i) {
pd[[i]]$intensity <- value[[i]]
return(pd[[i]])
})
object
})
#' Replace intensity values
intensity(be)[[1]] <- intensity(be)[[1]] + 10
intensity(be)
## [[1]]
## [1] 133.3 163.6 2364.3 253.4
##
## [[2]]
## [1] 100.0 80.1
isEmpty()
The isEmpty()
is a simple helper function to evaluate whether chromatograms
are empty, i.e. have no peaks (retention time and intensity values). It
should return a logical vector of length equal to the number of chromatograms
in the backend with TRUE
if a chromatogram is empty and FALSE
otherwise.
The default implementation uses the lengths()
method (defined further below)
that returns for each chromatogram the number of available data points (peaks).
#' Default implementation for `isEmpty()`
setMethod("isEmpty", "ChromBackend", function(x) {
lengths(x) == 0L
})
isEmpty(be)
## [1] FALSE FALSE
isReadOnly()
As discussed above, backends can also be read-only, hence only allowing to
access, but not to change any values (e.g. if the data is stored in a data base
and the connection to this data base does not support updating or replacing
data). In such cases, the default isReadOnly()
method can be used, which
returns always TRUE
:
#' Default implementation of `isReadOnly()`
setMethod("isReadOnly", "ChromBackend", function(object) {
TRUE
})
Backends that support changing data values should implement their own version
(like we did above) to return FALSE
instead:
isReadOnly(be)
## [1] FALSE
length()
The length()
method should return a single integer
with the total number of
chromatograms available through the backend. The default implementation for
this function is:
#' Default implementation for `length()`
setMethod("length", "ChromBackend", function(x) {
nrow(chromData(x, columns = "dataStorage"))
})
length(be)
## [1] 2
lengths()
The lengths()
function should return the number of data pairs (peaks;
retention time or intensity values) per chromatogram. The result should be an
integer
vector (of length equal to the number of chromatograms in the
backend) with these counts. The default implementation uses the intensity()
function.
#' Default implementation for `lengths()`
setMethod("lengths", "ChromBackend", function(x) {
lengths(intensity(x))
})
The number of peaks for our test backend:
lengths(be)
## [1] 4 2
msLevel()
, msLevel<-
The msLevel()
and msLevel<-
methods should allow extracting and setting the
MS level for the individual chromatograms. MS levels are encoded as integer
,
thus, msLevel()
must return an integer
vector of length equal to the number
of chromatograms of the backend and msLevel<-
should take/accept such a
vector as input. The default implementations for both methods are shown below.
#' Default methods to get or set MS levels
setMethod("msLevel", "ChromBackend", function(object) {
chromData(object, columns = "msLevel", drop = TRUE)
})
setReplaceMethod("msLevel", "ChromBackend", function(object, value) {
object$msLevel <- value
object
})
To test these we below replace the MS levels for our test data set and extract these values again.
msLevel(be) <- c(1L, 2L)
msLevel(be)
## [1] 1 2
mz()
, mz<-
The mz()
and mz<-
methods should allow to extract or set the m/z value for
each chromatogram. The m/z value of a chromatogram is encoded as numeric
,
thus, the methods are expected to return or accept a numeric
vector of length
equal to the number of chromatograms. The default implementations are shown
below.
#' Default implementations to get or set m/z value(s)
setMethod("mz", "ChromBackend", function(object) {
chromData(object, columns = "mz", drop = TRUE)
})
setReplaceMethod("mz", "ChromBackend", function(object, value) {
object$mz <- value
object
})
We below set and extract these target m/z values.
mz(be) <- c(314.3, 312.5)
mz(be)
## [1] 314.3 312.5
mzMax()
, mzMax<-
The mzMax()
and mzMax<-
methods should allow to extract or set the upper
m/z boundary for each chromatogram. m/z values are encoded as numeric
, thus,
the methods are expected to return or accept a numeric
vector of length equal
to the number of chromatograms. The default implementations are shown below.
#' Default implementations to get or set upper m/z limits
setMethod("mzMax", "ChromBackend", function(object) {
chromData(object, columns = "mzMax", drop = TRUE)
})
setReplaceMethod("mzMax", "ChromBackend", function(object, value) {
object$mzMax <- value
object
})
Testing these functions by replacing the upper m/z boundary with new values.
mzMax(be) <- mz(be) + 0.01
mzMax(be)
## [1] 314.31 312.51
mzMin(),
mzMin<-`The mzMin()
and mzMin<-
methods should allow to extract or set the lower
m/z boundary for each chromatogram. m/z values are encoded as numeric
, thus,
the methods are expected to return or accept a numeric
vector of length equal
to the number of chromatograms. The default implementations are shown below.
#' Default methods to get or set the lower m/z boundary
setMethod("mzMin", "ChromBackend", function(object) {
chromData(object, columns = "mzMin", drop = TRUE)
})
setReplaceMethod("mzMin", "ChromBackend", function(object, value) {
object$mzMin <- value
object
})
Testing these functions by replacing the lower m/z boundary with new values.
mzMin(be) <- mz(be) - 0.01
mzMin(be)
## [1] 314.29 312.49
precursorMz()
, precursorMz<-
The precursorMz()
and precursorMz<-
methods are expected to get or set the
values for the precursor m/z of each chromatogram (if available). These are
encoded as numeric
(one value per chromatogram) - and if a value is not
available NA_real_
should be returned. The default implementations are:
#' Default implementations to get or set the precursorMz chrom variable
setMethod("precursorMz", "ChromBackend", function(object) {
chromData(object, columns = "precursorMz", drop = TRUE)
})
setReplaceMethod("precursorMz", "ChromBackend", function(object, value) {
object$precursorMz <- value
object
})
Below we set and get the precursorMz
chromatogram variable for our backend.
precursorMz(be) <- c(NA_real_, 123.3)
precursorMz(be)
## [1] NA 123.3
precursorMzMax()
, precursorMzMax<-
These methods are supposed to allow to get and set the precursorMzMax
chromatogram variable. The default implementations are:
#' Default implementations for `precursorMzMax`
setMethod("precursorMzMax", "ChromBackend", function(object) {
chromData(object, columns = "precursorMzMax", drop = FALSE)
})
setReplaceMethod("precursorMzMax", "ChromBackend", function(object, value) {
object$precursorMzMax <- value
object
})
Below we test these functions by setting and extracting the values for this chromatogram variable.
precursorMzMax(be) <- precursorMz(be) + 0.1
precursorMzMax(be)
## [1] NA 123.4
precursorMzMin()
, precursorMzMin<-
These methods are supposed to allow to get and set the precursorMzMin
chromatogram variable. The default implementations are:
#' Default implementations for `precursorMzMin`
setMethod("precursorMzMin", "ChromBackend", function(object) {
chromData(object, columns = "precursorMzMin", drop = FALSE)
})
setReplaceMethod("precursorMzMin", "ChromBackend", function(object, value) {
object$precursorMzMin <- value
object
})
Below we test these functions by setting and extracting the values for this chromatogram variable.
precursorMzMin(be) <- precursorMz(be) - 0.1
precursorMzMin(be)
## [1] NA 123.2
productMz()
, productMz<-
These methods are supposed to allow to get and set the productMz
chromatogram
variable. The default implementations are:
#' Default implementations for `productMz`
setMethod("productMz", "ChromBackend", function(object) {
chromData(object, columns = "productMz", drop = TRUE)
})
setReplaceMethod("productMz", "ChromBackend", function(object, value) {
object$productMz <- value
object
})
Below we test these functions by setting and extracting the values for this chromatogram variable.
productMz(be) <- c(123.2, NA_real_)
productMz(be)
## [1] 123.2 NA
productMzMax()
, productMzMax<-
These methods are supposed to allow to get and set the productMzMax
chromatogram variable. The default implementations are:
#' Default implementations for `productMzMax`
setMethod("productMzMax", "ChromBackend", function(object) {
chromData(object, columns = "productMzMax", drop = FALSE)
})
setReplaceMethod("productMzMax", "ChromBackend", function(object, value) {
object$productMzMax <- value
object
})
Below we test these functions by setting and extracting the values for this chromatogram variable.
productMzMax(be) <- productMz(be) + 0.02
productMzMax(be)
## [1] 123.22 NA
productMzMin()
, productMzMin<-
These methods are supposed to allow to get and set the productMzMin
chromatogram variable. The default implementations are:
#' Default implementations for `productMzMin`
setMethod("productMzMin", "ChromBackend", function(object) {
chromData(object, columns = "productMzMin", drop = FALSE)
})
setReplaceMethod("productMzMin", "ChromBackend", function(object, value) {
object$productMzMin <- value
object
})
Below we test these functions by setting and extracting the values for this chromatogram variable.
productMzMin(be) <- productMz(be) - 0.2
productMzMin(be)
## [1] 123 NA
rtime()
, rtime<-
The rtime()
and rtime<-
methods allow to get and set the retention times of
the individual chromatograms of the backend. Similar to the method for the
intensity values described above they should return or accept a NumericList
,
each element being a numeric
vector with the retention time values of one
chromatogram. The default implementations of these methods are shown below.
#' Default methods for `rtime()` and `rtime<-`
setMethod("rtime", "ChromBackend", function(object) {
if (length(object)) {
peaksData(object, column = "rtime", drop = TRUE)
} else {
list()
}
})
setReplaceMethod("rtime", "ChromBackend", function(object, value) {
pd <- peaksData(object)
if (!is.list(value) || length(pd) != length(value)) {
stop("'value' should be a list of the same length as 'object'")
}
for (i in seq_along(pd)) {
if (length(value[[i]]) != nrow(pd[[i]])) {
stop(paste0(
"Length of 'value[[", i, "]]' does not match ",
"the number of rows in 'the rtime of chromatogram: ", i, "'"
))
}
}
peaksData(object) <- lapply(seq_along(pd), function(i) {
pd[[i]]$rtime <- value[[i]]
return(pd[[i]])
})
object
})
We below test this implementation replacing the retention times of our example backend by shifting all values by 2 seconds.
rtime(be)[[1]] <- rtime(be)[[1]] + 2
rtime(be)
## [[1]]
## [1] 14.4 14.8 15.2 16.6
##
## [[2]]
## [1] 45.1 46.2
split()
The split()
method should split the backend into a list
of backends
containing subsets of the original backend. The default implementation uses the
default implementation of split()
from R and should work in most cases. This
function uses the [
method to subset/split the object.
#' Default method to split a backend
setMethod("split", "ChromBackend", function(x, f, drop = FALSE, ...) {
split.default(x, f, drop = drop, ...)
})
We below test this by splitting the backend into two subsets.
split(be, f = c(1, 2, 1))
## Warning in split.default(x, f, drop = drop, ...): data length is not a multiple
## of split variable
## $`1`
## ChromBackendTest with 1 chromatograms
##
## $`2`
## ChromBackendTest with 1 chromatograms
sessionInfo()
## R version 4.5.0 (2025-04-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Chromatograms_0.99.1 ProtGenerics_1.41.0 BiocParallel_1.43.2
## [4] BiocStyle_2.37.0
##
## loaded via a namespace (and not attached):
## [1] Spectra_1.19.2 cli_3.6.5 knitr_1.50
## [4] rlang_1.1.6 xfun_0.52 generics_0.1.4
## [7] jsonlite_2.0.0 clue_0.3-66 S4Vectors_0.47.0
## [10] htmltools_0.5.8.1 stats4_4.5.0 sass_0.4.10
## [13] rmarkdown_2.29 evaluate_1.0.3 jquerylib_0.1.4
## [16] MASS_7.3-65 fastmap_1.2.0 IRanges_2.43.0
## [19] yaml_2.3.10 lifecycle_1.0.4 bookdown_0.43
## [22] BiocManager_1.30.25 MsCoreUtils_1.21.0 cluster_2.1.8.1
## [25] compiler_4.5.0 codetools_0.2-20 fs_1.6.6
## [28] MetaboCoreUtils_1.17.1 digest_0.6.37 R6_2.6.1
## [31] parallel_4.5.0 bslib_0.9.0 tools_4.5.0
## [34] BiocGenerics_0.55.0 cachem_1.1.0