Difference between revisions of "Xenopus Genome Project"

From Marcotte Lab
(Assembled transcripts)
 
If you are looking for the assembled transcripts per dataset (which I posted before releasing the Oktoberfest gene models), see http://daudin.icmb.utexas.edu/pub/final.2012sep/ ([[User:TaejoonKwon|TaejoonKwon]] 14:33, 10 October 2013 (CDT))
 
  
+ * [[Taira2013_XENLA]]
 
* [[XENLA_Oktoberfest]] - released in October 2012 (code name "Oktoberfest")
 
* [[XENLA_Mayball]] - released in May 2013 (code name "MayBall")
 
= Assembled transcripts from J-strain <i>X. laevis</i> RNA-seq data (190,591) =
 
You can see more detailed info at [[Taira201203_XENLA]]
 
 
== J-strain stage data (78,546) ==
 
Courtesy of Masanori Taira (m_taira at biol dot s dot u-tokyo dot ac dot jp), University of Tokyo, Japan
 
* cDNA: [[xdata:/final.2012sep/Taira201203_XENLA_stage_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Taira201203_XENLA_stage_pep_final.fa]]
 
 
== J-strain tissue data (112,045) ==
 
Courtesy of Masanori Taira (m_taira at biol dot s dot u-tokyo dot ac dot jp), University of Tokyo, Japan
 
* cDNA: [[xdata:/final.2012sep/Taira201203_XENLA_tissue_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Taira201203_XENLA_tissue_pep_final.fa]]
 
 
= Assembled transcripts from wild-type <i>X. laevis</i> RNA-seq data (632,791) =
 
 
== Amin201106_XENLA (53,369) ==
 
Courtesy of Nirav Amin (nmamin at email dot unc dot edu), Frank Conlon (frank_conlon at med dot unc dot edu), University of North Carolina at Chapel Hill, USA
 
<b>Only control samples are used for this assembly.</b>
 
 
* cDNA: [[xdata:/final.2012sep/Amin201106_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Amin201106_XENLA_pep_final.fa]]
 
 
== Park201106_XENLA (76,455) ==
 
Courtesy of Tae Joo Park (tjpark01 at gmail dot com), UNIST, Republic of Korea
 
 
* cDNA: [[xdata:/final.2012sep/Park201106_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Park201106_XENLA_pep_final.fa]]
 
 
== Kwon201107_XENLA (57,315) ==
 
Courtesy of Edward Marcotte (marcotte at icmb dot utexas dot edu), Taejoon Kwon (taejoon dot kwon at marcottelab dot org), Meii Chung (meii at utexas dot edu), John Wallingford (wallingford at mail dot utexas dot edu), University of Texas at Austin, USA
 
 
* cDNA: [[xdata:/final.2012sep/Kwon201107_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Kwon201107_XENLA_pep_final.fa]]
 
 
== Chung201110_XENLA (48,184) ==
 
Courtesy of Meii Chung (meii at utexas dot edu), John Wallingford (wallingford at mail dot utexas dot edu), University of Texas at Austin, USA
 
 
* cDNA: [[xdata:/final.2012sep/Chung201110_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Chung201110_XENLA_pep_final.fa]]
 
 
== Quigley201112_XENLA (57,315) ==
 
Courtesy of Ian Quigley (iquigley at salk dot edu), Chris Kintner (kintner at salk dot edu), Salk Institute, USA
 
 
* cDNA: [[xdata:/final.2012sep/Quigley201112_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Quigley201112_XENLA_pep_final.fa]]
 
 
== Horb201201_XENLA (124,086) ==
 
Courtesy of Marko Horb (mhorb at mbl dot edu), Marine Biological Laboratory, USA
 
 
* cDNA: [[xdata:/final.2012sep/Horb201201_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Horb201201_XENLA_pep_final.fa]]
 
 
== TeperekTkacz201202_XENLA (66,806) ==
 
Courtesy of Marta Teperek-Tkacz (mt446 at cam dot ac dot uk), University of Cambridge/Gurdon Institute, UK
 
 
* cDNA: [[xdata:/final.2012sep/TeperekTkacz201202_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/TeperekTkacz201202_XENLA_pep_final.fa]]
 
 
== Ismailoglu201203_XENLA (40,339) ==
 
Courtesy of Ismail Ismailoglu (iismailoglu at rockefeller dot edu), Ali Brivanlou (brvnlou at mail dot rockefeller dot edu), Rockefeller University, USA
 
 
* cDNA: [[xdata:/final.2012sep/Ismailoglu201203_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Ismailoglu201203_XENLA_pep_final.fa]]
 
 
== Veenstra201204_XENLA (22,399) ==
 
Courtesy of Gert J Veenstra (G.Veenstra at ncmls dot ru dot nl), Radboud University Nijmegen, The Netherlands
 
 
* cDNA: [[xdata:/final.2012sep/Veenstra201204_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Veenstra201204_XENLA_pep_final.fa]]
 
 
== Audic201207_XENLA (6,830) ==
 
Courtesy of Yann Audic (yann dot audic at univ-rennes1 dot fr), Université de RENNES I, France
 
 
* cDNA: [[xdata:/final.2012sep/Audic201207_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Audic201207_XENLA_pep_final.fa]]
 
 
== Quigley201207_XENLA (77,916) ==
 
Courtesy of Ian Quigley (iquigley at salk dot edu), Chris Kintner (kintner at salk dot edu), Salk Institute, USA
 
 
* cDNA: [[xdata:/final.2012sep/Quigley201207_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Quigley201207_XENLA_pep_final.fa]]
 
 
== TeperekTkacz201206_XENLA (70,920) ==
 
Courtesy of Marta Teperek-Tkacz (mt446 at cam dot ac dot uk), University of Cambridge/Gurdon Institute, UK & Taejoon Kwon (taejoon dot kwon at marcottelab dot org), University of Texas at Austin, USA
 
 
* cDNA: [[xdata:/final.2012sep/TeperekTkacz201206_XENLA_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/TeperekTkacz201206_XENLA_pep_final.fa]]
 
 
= Assembled transcripts from wild-type <i>S. tropicalis</i> RNA-seq data (96,654) =
 
== Tan201204_XENTR (96,654) ==
 
Courtesy of Meng How Tan (menghow at stanford dot edu), Stanford University, USA
 
Now published in Genome Research [http://genome.cshlp.org/content/early/2012/09/07/gr.141424.112.abstract Early Access]
 
 
* cDNA: [[xdata:/final.2012sep/Tan201204_XENTR_cdna_final.fa]]
 
* Protein: [[xdata:/final.2012sep/Tan201204_XENTR_pep_final.fa]]
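
The file links in the dataset sections above all follow one naming pattern, <code>final.2012sep/&lt;dataset&gt;_cdna_final.fa</code> and <code>final.2012sep/&lt;dataset&gt;_pep_final.fa</code>. The following is a minimal Python sketch for building a download URL and counting the sequences in a downloaded file. It rests on two assumptions that this page does not confirm: that the <code>xdata:</code> links resolve under the directory URL given in the note at the top, and that the number in parentheses after each dataset name is its transcript count. The helper names are hypothetical.

```python
from typing import Iterable

# Directory URL quoted in the note on this page. Assumption: the xdata:
# links resolve to files directly under this directory.
BASE_URL = "http://daudin.icmb.utexas.edu/pub/final.2012sep/"


def dataset_url(dataset: str, kind: str) -> str:
    """Build a file URL; kind is "cdna" or "pep" (hypothetical helper)."""
    if kind not in ("cdna", "pep"):
        raise ValueError("kind must be 'cdna' or 'pep'")
    return f"{BASE_URL}{dataset}_{kind}_final.fa"


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count sequences by counting FASTA header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))
```

Passing an open file handle for a downloaded <code>.fa</code> file to <code>count_fasta_records</code> gives a number that can be compared with the count in the corresponding heading, if the second assumption holds.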
 
  
 
= CHORI-219 BAC sequencing =
 

Revision as of 14:33, 10 October 2013


Xenopus laevis is an essential model organism in several areas of biology. In addition to the key attributes of these embryos for in vivo imaging, cell-free extracts from Xenopus provide some of the most powerful in vitro systems for studies of cell and molecular biology. A complete sequence of the X. laevis genome is an essential resource for accurate identification of peptides in mass-spec analyses, for cloning of an ORFeome, for identifying evolutionarily conserved regulatory regions, and for designing morpholino oligonucleotides for gene knockdowns.

The Wallingford and Marcotte labs obtained funding from the Texas Institute for Drug and Diagnostic Development (TI3D), in conjunction with projects funded by the National Institutes of Health, to begin sequencing the X. laevis genome. We began the project with Scott Hunicke-Smith at the University of Texas Genome Sequencing and Analysis Facility, with funding sufficient for ~20x coverage of the X. laevis genome using ABI SOLiD next-generation sequencing. The project rapidly expanded to include de novo reconstruction of X. laevis transcripts, in collaboration with groups around the world who donated Illumina HiSeq RNA sequencing datasets. We are coordinating these efforts with genome sequencing by the Harland and Rokhsar groups at UC Berkeley and by Taira and collaborators at the University of Tokyo, Japan. We're posting our intermediate datasets here in advance of publication for use by the wider community.

Disclaimer

  • Data users may freely download and analyze data. They may use data in publications focused on individual genes.
  • Data users may use data to analyze their own data, e.g. as a reference database for MS/MS proteomics and/or RNA-seq data.
  • Publication or presentation of global analyses using these sequences is not allowed until the data owner has published their paper. As soon as a paper is accepted, we will post that information on this website.

If you have any questions about this data in general, please contact Taejoon Kwon.


Web server

Assembled transcripts

If you are looking for the assembled transcripts per dataset (which I posted before releasing the Oktoberfest gene models), see http://daudin.icmb.utexas.edu/pub/final.2012sep/ (TaejoonKwon 14:33, 10 October 2013 (CDT))

CHORI-219 BAC sequencing

We started the first runs by sequencing 96 BACs from the CHORI-219 library (vector: pBACGK1.1) at ~100X coverage. The selected BACs include ~70 genes of interest (Shroom3, Wnt5a, Glypican-4, Noggin, Gremlin, Pax6, Formin, etc., as initially identified by the group of Jan-Fang Cheng via probing the CHORI-219 library), as well as 10 BACs that have already been sequenced by the DOE Joint Genome Institute/HudsonAlpha Genome Sequencing Center to serve as positive controls for the sequencing and assembly pipeline.

See /XENLA_SA09023 for more details. Three mate-pair libraries were sequenced:

  • X_laevis_WG - the X. laevis whole genome library, 5 kb insert size - about 4.4 GB raw data, 0.4 GB high-quality data
  • X_laevis_2kb - the set of 96 BACs, with 2 kb insert size - about 3.6 GB raw data, 0.3 GB high-quality data
  • X_laevis_5kb - the set of 96 BACs, with 5 kb insert size - about 2.8 GB raw data, 0.2 GB high-quality data

Very roughly, this corresponds to >600X coverage of the BAC set by raw data and ~50X coverage by high-quality data.
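
The coverage estimate can be reproduced approximately as total sequenced bases divided by total target size. The sketch below is a minimal Python illustration under two assumptions that this page does not state: "GB" is read as gigabases (1e9 bases), and the average BAC insert is taken to be about 100 kb, a hypothetical round figure. Only the two BAC libraries are counted.

```python
# Assumptions (not stated on this page): "GB" means 1e9 bases, and the
# average BAC insert is ~100 kb, a hypothetical round figure.
N_BACS = 96
AVG_BAC_SIZE_BP = 100_000

# Yields of the two BAC libraries in gigabases: (raw, high quality).
BAC_LIBRARIES_GB = {
    "X_laevis_2kb": (3.6, 0.3),
    "X_laevis_5kb": (2.8, 0.2),
}


def fold_coverage(total_gb: float, target_bp: int) -> float:
    """Fold coverage = sequenced bases / target size."""
    return total_gb * 1e9 / target_bp


# Total target size: all 96 BACs at the assumed average insert size.
target_bp = N_BACS * AVG_BAC_SIZE_BP
raw_gb = sum(raw for raw, _ in BAC_LIBRARIES_GB.values())
hq_gb = sum(hq for _, hq in BAC_LIBRARIES_GB.values())

raw_cov = fold_coverage(raw_gb, target_bp)
hq_cov = fold_coverage(hq_gb, target_bp)
print(f"raw: ~{raw_cov:.0f}X, high quality: ~{hq_cov:.0f}X")
```

Under these assumptions the result lands in the same range as the ">600X" and "~50X" figures quoted above. A different average insert size would scale both numbers proportionally.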

  • Given that we currently see better mapping of the shotgun SA09023 reads to X. tropicalis than to X. laevis (both to BACs and mRNAs), we're confirming the sample identity before continuing with whole genome sequencing. See the 'sanity check' /Species_Identification for details.

J-strain whole genome sequencing

In addition, we are generating several mate-pair libraries of different insert sizes from genomic DNA prepared by Mustafa Khokha from J-strain frogs obtained from Jacques Robert, sequencing each to multiple-fold coverage of the genome.

The primary data from this project will be made available as soon as possible for use by the community. We plan to periodically post reports on our progress below.

See also

References

Protocol