This package provides a method for determining the P-values of a set of GO annotations for a set of genes. The calculation is based on the number of genes in the genome (or in a selected background distribution from the genome) and their annotations, and on the frequency with which the GO nodes are annotated across the provided set of genes.
The P-value is calculated using the hypergeometric distribution as the probability of x or more out of n genes having a given annotation, given that G of N have that annotation in the genome in general. We chose the hypergeometric distribution (sampling without replacement) because it is more accurate, though slower to calculate, than the binomial distribution (sampling with replacement).
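
As an illustration of the calculation described above, the following is a minimal Python sketch of the hypergeometric upper tail, P(X >= x). It is not the package's own code: the function name `hypergeom_pvalue` and its argument order are assumptions made for this example, and only the symbols x, n, G and N come from the text.

```python
from math import comb


def hypergeom_pvalue(x, n, G, N):
    """Probability that x or more of n genes carry an annotation,
    given that G of the N background genes carry it
    (sampling without replacement).

    Hypothetical helper for illustration only.
    """
    if not (0 <= G <= N and 0 <= n <= N):
        raise ValueError("require 0 <= G <= N and 0 <= n <= N")

    # Number of ways to draw any n genes from the background.
    total = comb(N, n)

    # Sum the ways to draw exactly k annotated genes, for k = x .. min(n, G).
    # comb() returns 0 when the draw is impossible, so no lower bound
    # beyond 0 is needed.
    upper = min(n, G)
    tail = sum(comb(G, k) * comb(N - G, n - k)
               for k in range(max(x, 0), upper + 1))

    return tail / total
```

For example, `hypergeom_pvalue(1, 1, 10, 50)` is the chance that a single randomly drawn gene has an annotation carried by 10 of 50 background genes. The sketch uses exact integer arithmetic, which keeps it simple but, as the text notes for the hypergeometric distribution in general, slower than an approximation for large inputs.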
In addition, a corrected P-value can be calculated to account for multiple hypothesis testing. The correction factor is the total number of nodes to which the provided list of genes is annotated, excluding any nodes that have only a single annotation in the background, since we know a priori that these cannot be significantly enriched.
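
The correction factor described above can be sketched as follows. This is a hedged illustration, not the package's implementation: the function name `correct_pvalues`, the dictionary inputs, and the node labels in the usage note are all hypothetical. Multiplying each P-value by the factor (a Bonferroni-style correction) and capping the result at 1 are assumptions of this sketch.

```python
def correct_pvalues(pvalues, background_counts):
    """Apply a Bonferroni-style correction to per-node P-values.

    pvalues:           node -> uncorrected P-value, for every node to which
                       the provided genes are annotated
    background_counts: node -> number of annotations in the background

    Hypothetical helper for illustration only.
    """
    # Nodes with a single background annotation cannot be significantly
    # enriched, so they are left out of the hypothesis count.
    testable = [node for node in pvalues
                if background_counts.get(node, 0) > 1]
    factor = len(testable)

    # Assumption: corrected value = P-value * factor, capped at 1.
    return {node: min(1.0, pvalues[node] * factor) for node in testable}
```

With the hypothetical input `correct_pvalues({"a": 0.01, "b": 0.5, "c": 0.2}, {"a": 10, "b": 40, "c": 1})`, node "c" is excluded because it has a single background annotation, so the factor is 2 rather than 3.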
The client has access to both the corrected and uncorrected values. It is also possible to correct the P-value using 1000 simulations, which control the Family-Wise Error Rate. Using this option suggests that the Bonferroni correction is in fact somewhat liberal, rather than conservative as might be expected. Finally, the False Discovery Rate can also be calculated.
The general idea is that a list of genes may have been identified for some reason, e.g. because they are coregulated, and TermFinder can be used to find out whether any nodes annotate the set of genes at a frequency that would be extremely improbable had the genes simply been picked at random.
