%
% NOTE -- ONLY EDIT THE .Rnw FILE!!!  The .tex file is
% likely to be overwritten.
%
% \VignetteIndexEntry{goTools overview}
% \VignetteDepends{tools, GO.db}
% \VignetteSuggest{hgu133a.db}
% \VignetteKeywords{Gene Ontology}
% \VignettePackage{goTools}
\documentclass[11pt]{article}

\usepackage{amsmath,epsfig,fullpage}
%\usepackage[authoryear,round]{natbib}
\usepackage{hyperref}

\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}

\parindent 0in

%\bibliographystyle{abbrvnat}

\begin{document}

\title{\bf Getting started with the \Rpackage{goTools} package}

\author{Agnes Paquet$^1$ and (Jean) Yee Hwa Yang$^2$}

\maketitle

\begin{center}
1. {\tt paquetagnes@yahoo.com}\\
2. Department of Medicine, University of California, San Francisco,
   \url{http://www.biostat.ucsf.edu/jean}
\end{center}

% library(tools)
% setwd("c:/MyDoc/Projects/Jean/CVS/goTools/inst/doc")
% setwd("c:/MyDoc/Projects/madman/Rpacks-devel/goTools/inst/doc")
% Rnwfile<- file.path("c:/MyDoc/Projects/madman/Rpacks-devel/goTools/inst/doc","goTools.Rnw")
% Sweave(Rnwfile,pdf=TRUE,eps=TRUE,stylepath=TRUE,driver=RweaveLatex())

\tableofcontents

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Getting started}

This document provides a tutorial for the \Rpackage{goTools} package, which allows graphical comparisons of functional groups between two sets of genes.

\paragraph{Installing the package:} For instructions on installing the \Rpackage{goTools} package, see the Bioconductor web site at \url{http://www.bioconductor.org/help/faq}.

\paragraph{Help files:} As with any R package, detailed information on functions, classes and methods can be obtained in the help files. For instance, to view the help file for the function {\tt ontoCompare} in a browser, use {\tt help.start()} followed by {\tt ?ontoCompare}.
\\
We demonstrate the functionality with a randomly selected set of probe IDs from the Affymetrix hgu133a chip (\Robject{affylist}). To load the \Robject{probeID} dataset, use {\tt data(probeID)}, and to view a description of the experiments and data, type {\tt ?probeID}.

\paragraph{Sweave:} This document was generated using the \Rfunction{Sweave} function from the R \Rpackage{tools} package. The source file is in the {\tt /inst/doc} directory of the package \Rpackage{goTools}.

\paragraph{} To begin, let's load the package and the \Robject{probeID} dataset into your R session.

<<>>=
library("goTools", verbose=FALSE)
data(probeID)
@

As shown below, \Robject{affylist} is a list of 3 vectors of probe ids from the Affymetrix hgu133a chip.

<<>>=
class(affylist)
length(affylist)
affylist[[1]][1:5]
@

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Graphical comparisons of two sets of genes}

The Gene Ontology (GO) is structured as a Directed Acyclic Graph (DAG) and provides three structured networks of defined terms to describe gene product attributes: Molecular Function (MF), Biological Process (BP) and Cellular Component (CC). A gene product has one or more molecular functions and is used in one or more biological processes; it might be associated with one or more cellular components. To learn more about GO and DAGs, please refer to the Gene Ontology web site \url{http://www.geneontology.org/}.\\

We have created a set of R functions that use the GO structure to describe and compare the composition of sets of genes (or probes). We use the following algorithm:\\

\begin{enumerate}
\item{Read in a {\bf list} of the sets of probe ids you want to compare.}
\item{Map each probe id to the corresponding ontologies in the GO tree, if any.}
\item{Create the set of GO ids of interest used to compare your datasets (\Robject{endnode}).
The function \Rfunction{EndNodeList()} will create a set of nodes of the DAG located one level under MF, BP or CC, but you can use any set of GO ids.}
\item{For each GO id, go up the GO tree until reaching the nodes in \Robject{endnode}. The search may be limited to MF, BP or CC if specified in \Robject{goType}.}
\item{Compute the percentage of direct children found under each node in \Robject{endnode}.}
\item{Return the results. Plot them if \Robject{plot=TRUE}.}
\end{enumerate}

\subsection{How to use goTools}

The main function that we provide is \Rfunction{ontoCompare}. It takes as its argument {\bf a list} of sets of probe ids. Their type must be specified in the argument \Robject{probeType}. For more details, refer to the corresponding help file by typing \Rfunction{?ontoCompare}.

%\subsubsection{Examples}
%The following examples demonstrate how to use \Rfunction{ontoCompare} on the
%\Robject{probeID} dataset.

<<>>=
library(GO.db)
# Named list of two probe sets: the first five probe ids of two affylist entries
subset=c(L1=list(affylist[[1]][1:5]),L2=list(affylist[[2]][1:5]))
# Compare the two sets using the default end nodes and method
res <- ontoCompare(subset, probeType="hgu133a")
@

\subsubsection{Methods for computing percentages}

\Rfunction{ontoCompare} allows you to choose from three different methods to estimate the percentage of probes under each element of \Robject{endnode}. The default method is \Robject{TGenes}.

\begin{enumerate}
\item{\Robject{TGenes}: for each end node, return the number of direct children found / total number of probe ids.\\
This total includes oligos which do not have GO annotations.}
\item{\Robject{TIDS}: for each end node, return the number of direct children found / total number of GO ids describing the list.}
\item{\Robject{none}: for each end node, return the number of direct children found.}
\end{enumerate}

\subsection{Plotting the results}

The plots are produced using the function \Rfunction{ontoPlot}. It is called by \Rfunction{ontoCompare} when you set \Robject{plot=TRUE}. You can also call it directly, passing the result of \Rfunction{ontoCompare} as its argument.
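
A minimal sketch of such a direct call, reusing the two five-probe sets from the example above and assuming that \Rfunction{ontoPlot} needs no argument other than the object returned by \Rfunction{ontoCompare}:

<<eval=FALSE>>=
library("goTools")
library(GO.db)
data(probeID)
# Same two probe sets as in the earlier example
subset <- c(L1=list(affylist[[1]][1:5]), L2=list(affylist[[2]][1:5]))
# Compute the comparison without plotting ...
res <- ontoCompare(subset, probeType="hgu133a", plot=FALSE)
# ... then plot the stored result in a separate step
ontoPlot(res)
@
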
If only one set of genes is passed to \Rfunction{ontoCompare}, \Rfunction{ontoPlot} will produce a pie chart. Otherwise, it will produce a bar plot. You can modify the layout of \Rfunction{ontoPlot} using the usual R graphics parameters. For more details, type \Rfunction{?par}.

<<>>=
library(GO.db)
# Same two probe sets as before; plot=TRUE draws the comparison
subset=c(L1=list(affylist[[1]][1:5]),L2=list(affylist[[2]][1:5]))
res <- ontoCompare(subset, probeType="hgu133a", plot=TRUE)
@

%See Figure \ref{fig:ontoPlotEg}

\section{How to set up the ``end nodes''}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Default list}

The default end node list is defined by a call to the function \Rfunction{EndNodeList}. It contains all children of MF (GO:0003674), BP (GO:0008150) and CC (GO:0005575).

<<>>=
EndNodeList()
@

\subsection{Customized end node list}

If you want to use more ontologies to describe your set of genes, you can use the function\\
\mbox{\Rfunction{CustomEndNodeList(id,rank)}} to create a bigger set of end nodes. It returns the GO ids of all children of \Robject{id}, down to \Robject{rank} levels below \Robject{id}.

<<>>=
MFendnode <- CustomEndNodeList("GO:0003674", rank=2)
@

Finally, the code below shows how to use a custom end node list, and also how to set the \Robject{goType} argument to select only Molecular Function (MF) ontologies.

<<>>=
res <- ontoCompare(subset, probeType="hgu133a", endnode=MFendnode, goType="MF")
@

You can also create a list of GO ids of nodes of interest and pass it directly to the \Robject{endnode} argument in \Rfunction{ontoCompare}.
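
As a sketch, assuming that \Robject{endnode} accepts a plain character vector of GO ids, a hand-picked pair of Molecular Function terms could be passed as follows. The two ids (catalytic activity and binding) and the name \Robject{myEndnode} are chosen for illustration only.

<<eval=FALSE>>=
library("goTools")
library(GO.db)
data(probeID)
# Same two probe sets as in the earlier examples
subset <- c(L1=list(affylist[[1]][1:5]), L2=list(affylist[[2]][1:5]))
# Hypothetical hand-built end node list: two Molecular Function terms
myEndnode <- c("GO:0003824",  # catalytic activity
               "GO:0005488")  # binding
res <- ontoCompare(subset, probeType="hgu133a",
                   endnode=myEndnode, goType="MF")
@
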
GO ids must be in the following format: \mbox{``GO:{\it XXXXXXX}''}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\begin{figure}
%[htpb]
%\begin{center}
%\begin{tabular}{cc}
%\includegraphics[width=3in,height=3in,angle=0]{ontoPlotEg1}
%&
%\includegraphics[width=3in,height=3in,angle=0]{ontoPlotEg2} \\
%(a) & (b)\\
%\end{tabular}
%\end{center}
%\caption{Plots obtained using \Rfunction{ontoCompare} on \Robject{probeID} dataset}
%\protect\label{fig:ontoPlotEg}
%\end{figure}

%\bibliography{marrayPacks}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\end{document}