Siggia Lab
Evolution
Fly Patterning
-
There are now upwards of 12 fully sequenced species of flies, covering a range of
divergence times from D.melanogaster of 2 to over 50 million years. To explore the
molecular evolution of regulatory sequence, we have found that D.yakuba is an optimal
comparison for D.melanogaster, with D.pseudoobscura as an outgroup to distinguish
insertions from deletions. Tandem duplications are an important source of indels,
which account for more base pairs of change than do point mutations.
In another project with Ulrike Gaul, we have examined ~100 known and predicted
anterior-posterior blastoderm patterning modules in D.melanogaster for those that
show interesting variation with D.pseudoobscura. We find that modules with duplicate
expression domains are liable to change, and we see instances where homologous regulatory
regions (e.g. eve stripe 3-7) give different patterns, as well as cases of large
(hundreds of bp) deletions in known modules.
It is only within the academy that gene regulatory analysis and protein structure
modeling are distinct enterprises; within the cell they are not. Do the regulatory
proteins change their specificities between D.melanogaster and D.pseudoobscura, and
can one predict the binding preferences of the mosquito homologues to fly proteins?
-
Conservation of Regulatory Elements between two species of Drosophila
E. Emberly, N. Rajewsky, and E. Siggia BMC Bioinformatics, 4 57, 2003.
-
A probabilistic method to detect regulatory modules
S. Sinha, E. Van Nimwegen, E.D. Siggia, Bioinformatics. Jul;19 Suppl 1:I292-I301 2003
-
Cross-species comparison significantly improves genome-wide prediction of cis-regulatory modules in Drosophila
Saurabh Sinha, Mark Schroeder, Ulrich Unnerstall, Ulrike Gaul, and Eric D. Siggia, BMC Bioinformatics, 5 129, 2004.
-
Sequence Turnover and Tandem Repeats in cis-Regulatory Modules in Drosophila
S. Sinha and E.D. Siggia, Mol. Bio. Evol. 22, 874-85 2005.
Antibiotic resistance
-
The Kansas farmers who do not believe in evolution have only to consult their own records of pesticide usage to conclude that insects change to become resistant to the chemicals that kill them, and that these changes are heritable. Evolution happens even more rapidly in hospitals, in bacteria such as Staphylococcus aureus, which is a benign inhabitant of the skin and mucosal surfaces but deadly when it infects wounds. Many antibiotics target a specific gene (e.g. rifampicin a component of RNA polymerase, .. DNA gyrase etc.). The bacteria then develop resistance by point mutations in the affected genes. Alternatively, a substitute gene is acquired via a mobile element, as happened with mecA in S.aureus. Resistance to beta-lactams incurs a growth penalty, so hospital isolates are 99.9..% susceptible but mutate at a low, strain-dependent rate to the resistant phenotype. Resistance also requires tens of auxiliary genes. These bacteria (so-called MRSA) have also infected the popular press.
The next line of defense against MRSA is another cell wall inhibitor, vancomycin, and resistance to it develops via a series of unknown genetic changes.
With the Tomasz lab at Rockefeller we are utilizing a number of approaches to sequence the entire genomes of closely matched pairs of susceptible and resistant bacteria. The comparison of mRNA expression data has not pointed to the relevant genetic changes, and we believe genes with pleiotropic effects are involved. The appeal of antibiotic resistance for evolutionary studies is the availability of money, a historical record, and a semi-natural context.
Gene Regulation
Mobydick
-
Motifs in biological sequence data can be defined as strings whose probability of
occurrence greatly exceeds that expected for background. The problem is to
decide what constitutes background and the natural limits on a motif since
large enough pieces of a motif will themselves show up in a list of improbable
strings. An algorithm to resolve both issues has been constructed by analogy with
the statistical mechanics of disordered systems and has been usefully applied to
decode all the regulatory sequence in yeast. Some of the output is given here.
The algorithm was tested on the eponymous novel by Melville. Random letters
were inserted between the words and the result reduced to a string of lower case letters.
The code was then asked to recover the English dictionary (or the subset used
by Melville, which was substantial).
A sampling of the dictionaries that were
created as longer and longer strings were searched is shown as plain text files.
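The test on the novel can be mimicked in a few lines. The sketch below is not the Mobydick algorithm: it only builds the scrambled text (random letters between words, lower case, no spaces) and ranks fixed-length strings by how far their observed count exceeds the count expected from independent single-letter frequencies, which is a deliberately naive choice of background. The function names `scramble` and `overrepresented`, the padding range and the z-score are assumptions made for illustration.

```python
import math
import random
from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def scramble(text, min_pad=0, max_pad=3, seed=0):
    """Lower-case the words and join them with short runs of random letters."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        word = "".join(c for c in word.lower() if c in LETTERS)
        if not word:
            continue
        out.append(word)
        pad = rng.randint(min_pad, max_pad)  # hypothetical padding length
        out.append("".join(rng.choice(LETTERS) for _ in range(pad)))
    return "".join(out)


def overrepresented(seq, k, top=5):
    """Rank k-letter strings by a z-score against a single-letter background."""
    n = len(seq)
    freq = {c: m / n for c, m in Counter(seq).items()}
    positions = n - k + 1
    counts = Counter(seq[i:i + k] for i in range(positions))
    scored = []
    for s, obs in counts.items():
        p = math.prod(freq[c] for c in s)  # background probability of s
        if p >= 1.0:
            continue  # degenerate one-letter text, nothing to score
        expected = positions * p
        z = (obs - expected) / math.sqrt(expected * (1.0 - p))
        scored.append((z, s, obs))
    scored.sort(reverse=True)
    return scored[:top]


if __name__ == "__main__":
    demo = scramble("call me ishmael " * 200)
    for z, s, obs in overrepresented(demo, 7):
        print(f"{s}  count={obs}  z={z:.1f}")
```

A fixed word length and a letter-frequency background are exactly the two simplifications the paragraph above warns about: fragments of a frequent word also score highly, which is why a real method has to settle the background and the natural limits of a motif together.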
Enteric Bacteria (E.coli and relatives)
-
There are intrinsic limits to what can be inferred from a single genome by
probabilistic methods. The cell classifies sequence motifs with proteins whose
DNA binding specificity we cannot calculate.
Given only sequence, we have to cluster similar
patterns together, which for sparse data is much harder. To circumvent this
limitation, we do what the cell cannot do, namely compare the regulation of
homologous genes from related organisms. Mathematically this provides more
samples from the same distribution and thus makes clusters visible.
Here is a compilation of Inferred E.coli Regulons.
There are
approximately 10 sequenced species of enteric bacteria that are close enough to
E.coli to share regulatory motifs. We have designed algorithms to measure how
fast minimally constrained regulatory sequence evolves and then with respect to
this rate quantified the significance of motifs that evolved less rapidly. The
transcription factors themselves evolve at a rate determined by the number of
genes they regulate. The results from our Genome Research paper are displayed here:
E.coli Regulatory Comparisons
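The idea of judging conservation against a measured background rate can be sketched as follows. This is a minimal illustration under strong assumptions, not the method of the Genome Research paper: the two sequences are taken to be already aligned and of equal length, sites are treated as independent, and the background mismatch rate is a single number estimated from sequence believed to be minimally constrained. All function names are hypothetical.

```python
from math import comb


def mismatch_rate(a, b):
    """Fraction of aligned, ungapped positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def conservation_pvalue(mismatches, width, background_rate):
    """Binomial probability of at most `mismatches` differences in `width` sites."""
    q = background_rate
    return sum(
        comb(width, k) * q**k * (1 - q) ** (width - k)
        for k in range(mismatches + 1)
    )


def conserved_windows(a, b, width, background_rate, alpha=1e-3):
    """Yield (start, mismatches, p) for ungapped windows with p below alpha."""
    for start in range(len(a) - width + 1):
        wa, wb = a[start:start + width], b[start:start + width]
        if "-" in wa or "-" in wb:
            continue  # this sketch simply skips windows containing gaps
        m = sum(x != y for x, y in zip(wa, wb))
        p = conservation_pvalue(m, width, background_rate)
        if p < alpha:
            yield start, m, p
```

In use, `mismatch_rate` would be applied to the presumed unconstrained sequence to obtain `background_rate`, and `conserved_windows` then scans an aligned upstream region. Overlapping windows are not independent tests, so the threshold `alpha` is only a rough filter.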
-
The Evolution of DNA Regulatory Regions for Proteo-gamma Bacteria by Interspecies Comparisons N. Rajewsky, N. Socci, M. Zapotocky and E.D. Siggia, Genome Research 12 298-308 (2002).
-
Probabilistic Clustering of Sequences: Inferring new bacterial regulons by comparative genomics E. van Nimwegen, M. Zavolan, N. Rajewsky, E.D. Siggia, PNAS 99 7323-8 (2002).
-
Identification of the binding sites of Regulatory Proteins in Bacterial Genomes H. Li, V. Rhodius, C. Gross, and E.D. Siggia, Proc Natl Acad Sci (US) 99 11772-7 2002.
Gram Positive Bacteria
-
B.subtilis is the second most intensively studied bacterium, and it was of interest
to apply the algorithms we developed for E.coli to it. Because of its proximity to
B.anthracis, there is now a cluster of related genomes on which to explore comparative
algorithms. More distant species such as the Streptococcaceae and Staphylococcus aureus have become antibiotic resistant and are thus a serious medical problem, but
provide interesting data for evolutionary studies.
Patterning Fly Embryos
-
There has been a very productive convergence between evolutionary biology and
development around the idea that most evolutionary novelty is due to changes in
the regulation of existing genes rather than production of new genes. Our
understanding of regulatory evolution will progress in tandem with better
algorithms to recognize and parse regulatory sequence. In collaboration with
Ulrike Gaul's lab, we are testing algorithms that enable us to identify
cis-regulatory modules (~500 bp regions with multiple-factor binding sites) in the fly
genome using collections of known binding sites. Alternatively, binding motifs
can be found from intervals of sequences that are known to be functional. One
key test is for the segmentation gene hierarchy, a prototype of combinatorial
control where we have been quite successful in finding new blastoderm patterned
genes and new binding motifs:
Ahab.
More recent work uses both sequenced Drosophila genomes in the
search and as a byproduct can screen for homologous regulatory modules that
have changed between the two species. A more challenging task will be to dissect
the regulatory cascade that gives rise to glial cells, a case where there is a known
master regulator (Gcm) but with very few direct targets.
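The notion of a module as a ~500 bp region dense in binding sites can be illustrated with a simple window count. The sketch below is not Ahab; it is a cruder hit-counting scheme, shown only to make the idea concrete. It builds a log-odds weight matrix from aligned example sites, marks positions scoring above a threshold, and reports windows holding several such positions. The uniform background, the pseudocount, the threshold, the forward-strand-only scan and all function names are assumptions.

```python
import math


def log_odds_matrix(sites, pseudocount=0.5, background=0.25):
    """Build a log-odds weight matrix from aligned, equal-length binding sites."""
    length = len(sites[0])
    matrix = []
    for i in range(length):
        column = [s[i] for s in sites]
        total = len(column) + 4 * pseudocount
        matrix.append({
            base: math.log(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def site_hits(seq, matrix, threshold):
    """Return forward-strand start positions whose score reaches the threshold."""
    w = len(matrix)
    hits = []
    for i in range(len(seq) - w + 1):
        # Bases outside ACGT (e.g. N) contribute a neutral score of zero.
        score = sum(matrix[j].get(seq[i + j], 0.0) for j in range(w))
        if score >= threshold:
            hits.append(i)
    return hits


def dense_windows(seq, matrices, threshold, window=500, min_hits=3):
    """Report (start, count) for windows with at least min_hits predicted sites."""
    positions = sorted(p for m in matrices for p in site_hits(seq, m, threshold))
    found = []
    for start in range(0, max(1, len(seq) - window + 1)):
        n = sum(start <= p < start + window for p in positions)
        if n >= min_hits:
            found.append((start, n))
    return found
```

Counting thresholded hits discards weak sites and treats all factors alike, which is the main reason to prefer a model that weighs every site by its score.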
-
Computational detection of genomic cis-regulatory modules, applied to body patterning in the early Drosophila embryo N. Rajewsky, M. Vergassola, U. Gaul and E.D. Siggia, BMC Bioinformatics. 3 30, 2002.
-
Transcriptional Control in the Segmentation Gene Network of Drosophila
Mark D. Schroeder, Michael Pearce, John Fak, HongQing Fan, Ulrich Unnerstall, Eldon Emberly, Nikolaus Rajewsky, Eric D. Siggia, and Ulrike Gaul, PLoS Biol. 2, e271, 2004.
Budding Yeast
-
With Frederick Cross, we are using our analysis of gene expression in yeast to
design experiments to probe the "grammar" of regulatory elements. It is seldom
stated, but true in all cases we have examined, that most (75 percent) of the sites
of the best-characterized factors in yeast do not imply expression for their cognate
genes. Analysis of chip and more recently comparative sequence data
is still far from providing a clear list of regulatory elements as detailed in a recent review.
An orthogonal approach to examining natural sequence is to construct random libraries from a restricted class of sequences and then assay them for function. The yeast "oracle" then pronounces which sentences are meaningful. One strategy for doing this, using a single sporulation-specific activator and random linkers, was recently published. Firm rules still did not emerge after sequencing several hundred random constructs, equally divided between functional and dead sequences.
Another project examines the regulation of 20 genes by the well characterized cell cycle
factors MBF and SBF using Northerns, immunoprecipitations, and deletion of the factors.
There is a minimal correlation between the presence of the factor on the promoter and response
to deletion.
-
Regulatory Element Detection Using Correlation with Expression H.J. Bussemaker, H. Li, and E.D. Siggia, Nature Genetics 27, 167-174 (2001).
-
Computational methods for transcriptional regulation
E.D. Siggia, Curr. Opin. Genet. Dev. 15, 214-21, (2005)
-
High functional overlap between MluI cell-cycle box binding factor and Swi4/6 cell-cycle box binding factor in the G1/S transcriptional program in Saccharomyces cerevisiae
Bean JM, Siggia ED, Cross FR, Genetics 171, 49-61 (2005).
-
Gene Expression from Random Libraries of Yeast Promoters
Ligr M, Siddharthan R, Cross F, Siggia E, Genetics 172 2113-22 2006.
Protein structure modeling
-
The dream of computational protein structure prediction has created a field that, if its goals were realized, could revolutionize the prediction of gene regulation. Models of protein-DNA interactions fall roughly into two categories: empirical (relying on many examples of one family to fit the matrix of base-residue interactions, e.g. the Zn fingers) and biophysical (still data intensive, since some comparison 3D structure is required). With a postdoc schooled in the trade, we have looked at the reliability of potential-based predictions of DNA binding specificity. Based on multiple factors, the most favorable ratio of output per effort was achieved by a careful structure-based definition of the protein-DNA interface and a simple counting of contacts to define specificity. Potentials were very useful in cases where the specificity derived in part from the variable ability of DNA to bend to fit the shape defined by the protein. An extreme form of 'induced specificity' is nucleosome positioning. Merely knowing which interface residues are changed relative to a reference structure (and which bases are contacted) furnishes a very informative prior on motif searches.
Biophysics
Cell Cycle and variability
-
On the list of basic biological processes, the cell cycle must rank in
importance just below basic metabolism. As a 'network', to use an atavistic term,
the cell cycle presents a realistic mixture of transcriptional and post-translational
regulation (mostly the latter) that does not fall within any existing bioinformatic category.
A neglected aspect of cell cycle research over the past two decades is its variation at the
single cell level, which makes contact with earlier work on stochasticity in gene regulation.
With Fred Cross, we are making movies as a single yeast cell grows to a colony
of ~40 cells, using both phase and fluorescence microscopy. There are many markers
available that record defining events in the cell division process. Yeast is a
versatile system for 'noise' studies since it can be grown with variable ploidy.
Part of this project involves custom image analysis and annotation software.
We are also contemplating the feasibility of imposing time dependent perturbations on the cell cycle.
-
Stochastic gene expression in a single cell
M.B. Elowitz, A.J. Levine, E.D. Siggia and P.S. Swain, Science 297 1183-1186 (2002).
-
Intrinsic and Extrinsic contributions to stochasticity in gene expression
P.S. Swain, M.B. Elowitz and E.D. Siggia, Proc Natl Acad Sci 99 12795-800 (2002).
-
Coherence and timing of cell cycle start examined at single-cell resolution.
Bean JM, Siggia ED, Cross FR, Mol Cell 21, 3-14 (2006).
-
Mode locking the cell cycle
Cross FR, Siggia ED. Phys. Rev. E. 72, , (2005).
Cellular biophysics
-
Although the basic physics applicable to the cellular domain was understood early
in the twentieth century, its utility in addressing "messy" problems has advanced considerably in
the last few decades. Physics applied to cell biology is less reductionist than
biochemistry. The challenge for the theorist is to deduce novel and quantitative
conclusions from less than full chemical detail. The opportunities for doing so
are enhanced when physics contributes to the experimental design rather than being
added at the end to fit curves.
One very productive collaboration along these lines is with the laboratory of
Jennifer Lippincott-Schwartz
(U.S. National Institutes of Health), which uses
green fluorescent chimeric proteins to follow various steps in protein trafficking
and the maintenance of organelles during the cell cycle. Given a particular cell,
we can simulate diffusion of any marker, such as a membrane protein in the
endoplasmic reticulum (ER), and compare with photobleaching experiments on
cultured cells in vivo. The quantitative agreement between theory and experiment
has been used to argue, for instance, that both an inner nuclear envelope marker
and a Golgi marker reside in a continuous membrane system throughout mitosis.
By contrast, the time course seen during the Brefeldin A-induced dissolution of
the Golgi is not diffusive, and we speculate that it may involve a tension-driven
flow, such as occurs during wetting or the spreading of surfactant on an interface.
Our code for simulating diffusion in an inhomogeneous two-dimensional system has been
distributed to other labs. Ref 98.
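To make the comparison with photobleaching concrete, here is a minimal sketch of diffusion on a masked two-dimensional grid with a position-dependent diffusion coefficient. It is not the lab's distributed code. It assumes an explicit finite-volume update, no-flux walls at the edge of the mask, and an arithmetic mean of the two cell coefficients on each shared edge. The functions `diffuse`, `bleach` and `mean_in` are illustrative names.

```python
def diffuse(conc, dcoef, mask, dt, dx=1.0, steps=1):
    """Explicit diffusion on a masked grid with no-flux walls.

    conc, dcoef and mask are equal-sized 2D lists; mask[i][j] is True inside
    the membrane system. Marker is exchanged only between neighbouring inside
    cells, so the total amount is conserved. The scheme is stable when
    max(dcoef) * dt / dx**2 <= 0.25.
    """
    rows, cols = len(conc), len(conc[0])
    for _ in range(steps):
        new = [row[:] for row in conc]
        for i in range(rows):
            for j in range(cols):
                if not mask[i][j]:
                    continue
                # Visit each edge once, through the lower and right neighbours.
                for ni, nj in ((i + 1, j), (i, j + 1)):
                    if ni < rows and nj < cols and mask[ni][nj]:
                        d = 0.5 * (dcoef[i][j] + dcoef[ni][nj])
                        flux = d * dt / dx**2 * (conc[ni][nj] - conc[i][j])
                        new[i][j] += flux
                        new[ni][nj] -= flux
        conc = new
    return conc


def bleach(conc, region):
    """Return a copy with the marker zeroed in the listed (i, j) cells."""
    out = [row[:] for row in conc]
    for i, j in region:
        out[i][j] = 0.0
    return out


def mean_in(conc, region):
    """Mean concentration over the listed cells: the simulated recovery signal."""
    return sum(conc[i][j] for i, j in region) / len(region)
```

Calling `mean_in` after successive calls to `diffuse` gives a simulated recovery curve for a bleached region. A measured curve that cannot be matched for any reasonable `dcoef` suggests transport that is not purely diffusive.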
-
Nuclear Membrane Dynamics and Reassembly In Living Cells: Targeting of an Inner Nuclear Membrane Protein in Interphase and Mitosis Jan Ellenberg, Eric D. Siggia, Jorge E. Moreira, Carolyn L. Smith, John Presley, Howard J. Worman and Jennifer Lippincott-Schwartz, J Cell Biology 138, 1193-1206 1997.
-
Golgi Tubule Traffic and the Effects of Brefeldin A Visualized in Living Cells by N. Sciaky, J. Presley, C. Smith, C. Zaal, N. Cole, J. Moreira, M. Terasaki, E.D. Siggia, and J. Lippincott-Schwartz, J. Cell Bio. 139, 1137-1155 1997.
-
Kinetic Analysis of Secretory Protein Traffic and Characterization of Golgi to Plasma Membrane Transport Intermediates in Living Cells K. Hirschberg, J. Ellenberg, J. Presley, C. Miller, K. Zaal, N. Cole, E. Siggia, R Phair, and J. Lippincott-Schwartz J. Cell Bio. 143, 1485-1503 1998.
-
Golgi membranes are absorbed into and re-emerge from the ER during mitosis Kristien Zaal, C. Smith, R. Polishchuk, N. Altan, N. Cole, J. Presley, K. Hirschberg, J. Ellenberg, T. Roberts, R. Phair, E. Siggia, and J. Lippincott-Schwartz, Cell 99, 589-601 1999.
-
Dynamics and retention of misfolded proteins in native ER membranes S. Nehls, E.L. Snapp, N. B. Cole, K.J.M. Zaal, A.K. Kenworthy, T. H. Roberts, J. Ellenberg, J.F. Presley, E.D. Siggia, J. Lippincott-Schwartz, Nature Cell Biology 2000.
-
Dissection of COPI and Arf1 dynamics in vivo and role in Golgi membrane transport J.F. Presley, T. H. Ward, E.D. Siggia, R.D. Phair, and J. Lippincott-Schwartz, Nature, 417 187-93 (2002).
Polymers/nucleic acid
-
Phenomenological theories of polymers have proved very successful in explaining
the mechanical properties of DNA, the shapes of supercoiled plasmids and the
kinetics of reactions on these substrates. The morphology and mechanics of the
brush-like chromosomes characteristic of meiotic prophase are also amenable to
treatment. Another problem in polymer physics is the kinetics of RNA folding at
the level of secondary structure. We derived expressions for the energies of
pseudoknots (which cannot be treated by existing codes) in terms of known
parameters, allowed overlapping stems and optimized them so as to calculate
plausible saddle points between various topologies. Structures as large as the
400 bp group I introns can be folded with plausible kinetics.
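As one concrete instance of such a phenomenological description, the worm-like chain interpolation formula widely used for the force-extension curve of a single DNA molecule can be written as below. The symbols are notation chosen here and do not appear in the text above: $F$ is the applied force, $x$ the end-to-end extension, $L_0$ the contour length, $P$ the persistence length, and $k_B T$ the thermal energy.

```latex
\frac{F P}{k_B T} \;=\; \frac{1}{4\left(1 - x/L_0\right)^{2}} \;-\; \frac{1}{4} \;+\; \frac{x}{L_0}
```

For small extension the right-hand side reduces to $\tfrac{3}{2}\,x/L_0$, a linear entropic spring, while the force diverges as $x \to L_0$ because the chain cannot be stretched beyond its contour length.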
-
Modeling RNA Folding Paths With Pseudoknots: Application to Hepatitis Delta Virus Ribozyme H. Isambert and E.D. Siggia, Proc Natl Acad Sci (US) 97, 6515-6520 (2000).
-
Driving Proteins off DNA Using Applied Tension J.F. Marko and E.D. Siggia, Biophysical Journal, 73, 2173-2178 1997.
-
Polymer Models of Meiotic and Mitotic Chromosomes J.F. Marko and E.D. Siggia, Mol. Bio. Cell. 8, 2217-2231 1997.
-
"Stretching DNA", J. Marko and E.D. Siggia, Macromolecules, 28, 8759 (1995).
-
"Physical Limits on the Mechanical Measurements of the Secondary Structure of Bio-molecules", R. Thompson and E.D. Siggia, Europhysics Letters, 31, 335 (1995).
-
Statistical Mechanics of Supercoiled DNA J.F. Marko and E.D. Siggia, Phys. Rev. E, 52, 2912 (1995).
-
Entropy Elasticity of Lambda-Phage DNA C. Bustamante, J.F. Marko, E.D. Siggia, and S. Smith, Science, (technical comment) 265, 1599 (1994).
-
Fluctuations and Supercoiling of DNA J.F. Marko and E.D. Siggia, Science, 265, 506-508 (1994).
-
"Bending and Twisting Elasticity of DNA", J. Marko and E.D. Siggia, Macromolecules, 27, 981 (1994).