TuMult: a tool for multiple tumor analysis

 

TuMult requires four arguments. The input should be formatted as follows:


- Profile table (example: Profiles_P3.csv)

A csv table containing the copy number profiles of the tumors from the same patient. Each sample (e.g. Sample1) is described by two columns: one for the exact log ratio (Sample1.value) and one for the discretized copy number status (Sample1.status), encoded as follows: -2 homozygous deletion, -1 deletion, 0 normal, 1 gain, 2 amplification. Missing values should be indicated by ‘NA’. The lines should be in the same order as in the probe table, so that line x in the profile table corresponds to line x in the probe table.
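The layout described above can be checked with a short R sketch before running the tool. The sample names, values and the checks themselves are illustrative assumptions, not part of TuMult:

```r
# Hypothetical two-sample profile table; names and values are made up.
profiles <- data.frame(
  Sample1.value  = c(0.02, -0.45, NA, 0.61),
  Sample1.status = c(0L, -1L, NA, 1L),
  Sample2.value  = c(-0.01, -0.52, 0.03, 1.20),
  Sample2.status = c(0L, -1L, 0L, 2L)
)

# Write the table to a temporary csv file and read it back,
# with missing values written and parsed as NA.
prof_path <- tempfile(fileext = ".csv")
write.csv(profiles, prof_path, row.names = FALSE, na = "NA")
reread <- read.csv(prof_path, na.strings = "NA")

# Every sample needs both a .value and a .status column.
samples <- sub("\\.value$", "", grep("\\.value$", names(reread), value = TRUE))
has_status <- paste0(samples, ".status") %in% names(reread)

# Status codes must be one of -2, -1, 0, 1, 2, or NA for missing values.
status_cols <- reread[paste0(samples, ".status")]
codes_ok <- all(unlist(status_cols) %in% c(-2, -1, 0, 1, 2, NA))
```

If `all(has_status)` and `codes_ok` are both TRUE, the table follows the column and encoding conventions listed above.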


- Probe table (example: Probe_table_bladder_CGH.csv)

A csv table providing information about the probes on the array: an identifier (Name), the chromosome targeted (Chromosome), the position of the probe on the chromosome (either a single column named Position, or two columns named StartPosition and EndPosition), and the cytoband targeted (either a single column named Cytoband, or two columns named StartCytoband and EndCytoband).
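As a rough illustration, the column requirements above can be expressed as a small R check. The helper name `check_probe_table` and the example probes are hypothetical:

```r
# Hypothetical helper: TRUE if the data frame has the columns described above.
check_probe_table <- function(probes) {
  cols <- names(probes)
  has_id <- all(c("Name", "Chromosome") %in% cols)
  # Position: one column, or a start/end pair.
  has_pos <- "Position" %in% cols ||
    all(c("StartPosition", "EndPosition") %in% cols)
  # Cytoband: one column, or a start/end pair.
  has_cyto <- "Cytoband" %in% cols ||
    all(c("StartCytoband", "EndCytoband") %in% cols)
  has_id && has_pos && has_cyto
}

# Made-up probes, for illustration only.
probes <- data.frame(
  Name = c("probe_1", "probe_2", "probe_3"),
  Chromosome = c(1, 1, 2),
  Position = c(1500000, 3200000, 800000),
  Cytoband = c("1p36.33", "1p36.32", "2p25.3"),
  stringsAsFactors = FALSE
)

ok <- check_probe_table(probes)
# Dropping the cytoband column makes the check fail.
bad <- check_probe_table(probes[, c("Name", "Chromosome", "Position")])
```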


- Reference data table (example: Reference_dataset_bladder_CGH.Rdata)

A table containing the discretized profiles (values in -2, -1, 0, 1, 2) of the samples in the reference data set. As this may be a large file for high-definition arrays, it should be provided in Rdata format, which can be read more quickly than a csv table.
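A minimal sketch of producing such a file with R's `save()` is given here. The object name `reference`, the matrix layout (probes in rows, reference samples in columns) and the values are assumptions; the object name and layout that TuMult expects are not specified in this document:

```r
# Made-up discretized profiles: 4 probes (rows) x 3 reference samples (columns).
reference <- matrix(
  c(0L, -1L, 1L, 0L,
    0L, 2L, -2L, 0L,
    1L, 0L, 0L, -1L),
  nrow = 4,
  dimnames = list(NULL, c("RefA", "RefB", "RefC"))
)

# Save the object in Rdata format.
ref_path <- tempfile(fileext = ".Rdata")
save(reference, file = ref_path)

# Reload into a separate environment to confirm what the file contains.
env <- new.env()
loaded_names <- load(ref_path, envir = env)
```

`load()` returns the names of the restored objects, so `loaded_names` shows what the file will provide when it is read.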


- th.bkp

Integer giving the distance, in number of probes, below which two breakpoints found in two samples from the patient are considered identical and merged.
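The effect of this threshold can be illustrated with a small R sketch. It is not TuMult's own code: the function name `same_breakpoint` is hypothetical, breakpoints are represented by probe indices, and "below" is read as a strict inequality:

```r
# Illustrative only: TRUE when two breakpoints, given as probe indices,
# lie fewer than th.bkp probes apart.
same_breakpoint <- function(bkp_a, bkp_b, th.bkp) {
  abs(bkp_a - bkp_b) < th.bkp
}

same_breakpoint(120, 121, th.bkp = 2)  # distance 1: TRUE
same_breakpoint(120, 123, th.bkp = 2)  # distance 3: FALSE
```

Under this reading, a larger th.bkp merges breakpoints that are further apart, which may suit arrays with denser probe coverage.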

INPUT

OUTPUT

TuMult returns five files to the working directory:


- Patient_segments.csv

Table describing the ‘homogeneous segments’ delimited by the algorithm.


- Patient_tree_segments.dot

The tree in .dot format. This format can be rendered as a picture with the open source Graphviz program (http://www.graphviz.org). In this file, edges are labeled in terms of segments which, together with the segment table, give the exact boundaries of each aberration.


- Patient_tree_cytobands.dot

The tree in .dot format. In this file, edges are labeled in terms of cytobands, giving a more intuitive reading of the tree.
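Either .dot file can be turned into an image with Graphviz's `dot` command. The sketch below calls it from R; the helper `dot_args` and the choice of PNG output are assumptions, and rendering is skipped if Graphviz or the file is missing:

```r
# Hypothetical helper: builds the argument vector for Graphviz's `dot` command.
dot_args <- function(dot_file, format = "png") {
  out_file <- sub("\\.dot$", paste0(".", format), dot_file)
  c(paste0("-T", format), dot_file, "-o", out_file)
}

render_args <- dot_args("Patient_tree_cytobands.dot")

# Render only if Graphviz is installed and the tree file exists.
if (nzchar(Sys.which("dot")) && file.exists("Patient_tree_cytobands.dot")) {
  system2("dot", render_args)
}
```

The equivalent shell command is `dot -Tpng Patient_tree_cytobands.dot -o Patient_tree_cytobands.png`.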


- Patient_tree.Rdata

An R object fully describing the tree as a list of nodes, edges and events. This object is useful for further analysis of the trees within R.


- Patient_common_breakpoints.pdf

A .pdf image displaying the tumor profiles and the common breakpoints in the common precursor of all tumors. Breakpoints still present in all samples are represented by dashed lines, whereas breakpoints that were inferred by the algorithm to balance the chromosome profile are indicated by dotted lines.

HOW TO RUN TuMult?

R must be installed on the computer to run the TuMult script. The script can be launched either within R or directly from the shell.


Running TuMult within R

To run TuMult within R, complete the ARGUMENTS section in the script as in the following example:

            Prof.file = "P3.csv"

            Probes.file = "Probe_table_bladder_CGH.csv"

            Reference.file = "Reference_dataset_bladder_CGH.Rdata"

            th.bkp = 2

Then run the script.


Running TuMult directly from the shell

TuMult can be run directly from the shell. To do so, first make it executable:

            chmod +x TuMult.R

Then launch it with its 4 arguments:

            ./TuMult.R  P3.csv  Probe_table_bladder_CGH.csv  Reference_dataset_bladder_CGH.Rdata  2
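For readers curious how an R script can receive these four values, one common approach is `commandArgs(trailingOnly = TRUE)`. The sketch below is an assumption about how such parsing could look, not the actual code of TuMult.R; the function name `parse_tumult_args` is hypothetical, and the list names follow the ARGUMENTS section:

```r
# Hypothetical parser for the four positional arguments.
parse_tumult_args <- function(args) {
  if (length(args) != 4) {
    stop("expected 4 arguments: profile table, probe table, reference file, th.bkp")
  }
  # Command-line arguments arrive as strings; th.bkp must be an integer.
  th <- suppressWarnings(as.integer(args[4]))
  if (is.na(th)) {
    stop("th.bkp must be an integer")
  }
  list(
    Prof.file = args[1],
    Probes.file = args[2],
    Reference.file = args[3],
    th.bkp = th
  )
}

# Inside a script this would be: parse_tumult_args(commandArgs(trailingOnly = TRUE))
parsed <- parse_tumult_args(c(
  "P3.csv",
  "Probe_table_bladder_CGH.csv",
  "Reference_dataset_bladder_CGH.Rdata",
  "2"
))
```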

TuMult was developed for the analysis of several tumors from the same patient. Using the chromosome breakpoints these tumors have in common, TuMult reconstructs the tumor lineage and the sequence of chromosome aberrations occurring during tumorigenesis. TuMult may be applied to any kind of copy number data. The R script of the algorithm and examples of formatted data can be downloaded from this page, which also includes a brief tutorial on using the algorithm.

DOWNLOADS

  1. Annotated R script of the algorithm: TuMult.R

  2. Formatted data of four bladder tumors from the same patient analyzed on CGH arrays:
            Profile table: P3.csv
            Probe table: Probe_table_bladder_CGH.csv
            Reference data table: Reference_dataset_bladder_CGH.Rdata
            Recommended th.bkp: 2
            Shell command: ./TuMult.R  P3.csv  Probe_table_bladder_CGH.csv  Reference_dataset_bladder_CGH.Rdata  2

  3. Formatted data for a breast primary tumor / ipsilateral recurrence pair from the same patient, analyzed with SNP arrays:
            Profile table: Pair13.csv
            Probe table: Probe_table_breast_SNP.csv
            Reference data table: Reference_dataset_breast_SNP.Rdata
            Recommended th.bkp: 10
            Shell command: ./TuMult.R  Pair13.csv  Probe_table_breast_SNP.csv  Reference_dataset_breast_SNP.Rdata  10

CONTACT