Historically, CISs were defined by Southern analysis, which meant that the retroviral integration sites (RISs) in a CIS were usually located within ~30 kb of each other. However, with the mouse genome sequence it is now possible to identify retroviral integrations located much further apart. Since a retrovirus can affect gene expression over distances much larger than 30 kb, we decided to use a larger genomic window to define a CIS, keeping in mind that the window size used should not lead to a large number of false CISs.
Based on Monte Carlo computer simulations, we decided to use window sizes of 100 kb, 50 kb, and 30 kb for CISs with 4 (or more), 3, or 2 insertions, respectively, in each model (approximately 1,000 insertion spots).
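The window rule above can be illustrated with a small simulation. The following is only a minimal sketch, not the procedure used in the publication: it assumes a single linear genome of about 2.5 Gb, uniformly random insertions, and a simple non-overlapping way of counting clusters. The function names, the genome size, and the number of trials are all illustrative assumptions.

```python
import random

# Window sizes in bp, taken from the text: a group of k insertions
# (4 stands for "4 or more") must span no more than this distance.
WINDOWS = {2: 30_000, 3: 50_000, 4: 100_000}


def count_clusters(positions, k, window):
    """Count non-overlapping groups of k insertions spanning <= window bp."""
    pos = sorted(positions)
    count, i = 0, 0
    while i + k - 1 < len(pos):
        if pos[i + k - 1] - pos[i] <= window:
            count += 1
            i += k  # skip this group so insertions are not counted twice
        else:
            i += 1
    return count


def simulate_false_cis(n_insertions=1000, genome_size=2_500_000_000,
                       trials=200, seed=0):
    """Average number of chance clusters per trial for each window rule.

    Assumption: insertions are uniform over one linear sequence of
    genome_size bp; chromosome boundaries are ignored.
    """
    rng = random.Random(seed)
    totals = {k: 0 for k in WINDOWS}
    for _ in range(trials):
        pos = [rng.randrange(genome_size) for _ in range(n_insertions)]
        for k, window in WINDOWS.items():
            totals[k] += count_clusters(pos, k, window)
    return {k: total / trials for k, total in totals.items()}
```

Calling `simulate_false_cis()` returns the mean number of clusters expected by chance for each rule, which gives a rough feel for how many false CISs a given window size would produce under these assumptions.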
For more detail, please see our publication, Suzuki et al. (2002) Nat. Genet. Mikkers et al. (2002) Nat. Genet. calculated ideal window sizes for various numbers of screening sets based on the Poisson distribution. Other references on retroviral insertional mutagenesis are also listed in our reference section.
The candidate gene is usually the nearest neighboring RefSeq gene to each integration site. However, we do not apply this rule strictly to common integration sites; for those, we choose CIS candidate genes based on the directions of the insertions and on gene function. In addition, we may have missed the latest additions and modifications to gene definitions.
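The default nearest-gene rule can be sketched as follows. This is an illustration only: the gene names and coordinates are hypothetical, all genes are assumed to lie on the same chromosome as the site, and the manual curation applied to CISs (insertion direction, gene function) is not modeled.

```python
def distance_to_gene(site, start, end):
    """Distance in bp from a site to a gene interval (0 if the site is inside)."""
    if start <= site <= end:
        return 0
    return min(abs(site - start), abs(site - end))


def nearest_gene(site, genes):
    """Return the name of the gene closest to the integration site.

    genes is a list of (name, start, end) tuples on one chromosome.
    """
    return min(genes, key=lambda g: distance_to_gene(site, g[1], g[2]))[0]


# Hypothetical gene annotations, for illustration only.
example_genes = [("GeneA", 1_000, 5_000), ("GeneB", 20_000, 30_000)]
```

For example, `nearest_gene(6_000, example_genes)` picks GeneA, because the site lies 1 kb from its end and 14 kb from the start of GeneB.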
If you notice a mistake, please contact our help desk by e-mail.
Please visit the UCSC Genome Bioinformatics homepage (http://genome.ucsc.edu), which provides step-by-step instructions on how to create user custom tracks.
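As a rough illustration of what a custom track looks like, the sketch below builds a small BED-format track from a list of integration sites. The track name, description, chromosome names, positions, and labels are hypothetical, and the input positions are assumed to be 1-based; the UCSC instructions remain the authoritative guide to the supported options.

```python
def make_custom_track(sites, name="RIS",
                      description="Retroviral integration sites"):
    """Build custom-track text in BED format from (chrom, pos, label) tuples.

    Assumption: pos is a 1-based coordinate. BED uses 0-based, half-open
    intervals, so a single base at pos becomes [pos - 1, pos).
    """
    lines = [f'track name="{name}" description="{description}"']
    for chrom, pos, label in sites:
        lines.append(f"{chrom}\t{pos - 1}\t{pos}\t{label}")
    return "\n".join(lines) + "\n"


# Hypothetical sites, for illustration only.
example_track = make_custom_track([("chr1", 1000, "site1"),
                                   ("chr7", 250000, "site2")])
```

The resulting text can be pasted into the custom track form on the UCSC site or saved to a file and uploaded there.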